DOE Office of Scientific and Technical Information (OSTI.GOV)
Onda, M.; Kudo, S.; Fukuda, M.
Human glycophorin A, B, and E (GPA, GPB, and GPE) genes belong to a gene family located on the long arm of chromosome 4. These three genes are homologous from the 5'-flanking sequence to the Alu sequence, which is 1 kb downstream from the exon encoding the transmembrane domain. Analysis of the Alu sequence and flanking direct repeat sequences suggested that the GPA gene most closely resembles the ancestral gene, whereas the GPB and GPE genes arose by homologous recombination within the Alu sequence, acquiring 3' sequences from an unrelated precursor genomic segment. Here the authors describe the identification of this putative precursor genomic segment. A human genomic library was screened by using the sequence of the 3' region of the GPB gene as a probe. The genomic clones isolated were found to contain an Alu sequence that appeared to be involved in the recombination. Downstream from the Alu sequence, the nucleotide sequence of the precursor genomic segment is almost identical to that of the GPB or GPE gene. In contrast, the upstream sequence of the genomic segment differs entirely from that of the GPA, GPB, and GPE genes. Conservation of the direct repeats flanking the Alu sequence of the genomic segment strongly suggests that the sequence of this genomic segment has been maintained during evolution. This identified genomic segment was found to reside downstream from the GPA gene by both gene mapping and in situ chromosomal localization. The precursor genomic segment was also identified in the orangutan genome, which is known to lack GPB and GPE genes. These results indicate that one of the duplicated ancestral glycophorin genes acquired a unique 3' sequence by unequal crossing-over through its Alu sequence and the further downstream Alu sequence present in the duplicated gene. Further duplication and divergence of this gene yielded the GPB and GPE genes. 37 refs., 5 figs.
Detection of a new bat gammaherpesvirus in the Philippines.
Watanabe, Shumpei; Ueda, Naoya; Iha, Koichiro; Masangkay, Joseph S; Fujii, Hikaru; Alviola, Phillip; Mizutani, Tetsuya; Maeda, Ken; Yamane, Daisuke; Walid, Azab; Kato, Kentaro; Kyuwa, Shigeru; Tohya, Yukinobu; Yoshikawa, Yasuhiro; Akashi, Hiroomi
2009-08-01
A new bat herpesvirus was detected in the spleen of an insectivorous bat (Hipposideros diadema, family Hipposideridae) collected on Panay Island, the Philippines. PCR analyses were performed using COnsensus-DEgenerate Hybrid Oligonucleotide Primers (CODEHOPs) targeting the herpesvirus DNA polymerase (DPOL) gene. Although we obtained PCR products with CODEHOPs, direct sequencing using the primers was not possible because of their high degree of degeneracy. Direct sequencing technology developed in our rapid determination system of viral RNA sequences (RDV) was applied in this study, and a partial DPOL nucleotide sequence was determined. In addition, a partial gB gene nucleotide sequence was determined using the same strategy. We connected the partial gB and DPOL sequences with long-distance PCR, and a 3741-bp nucleotide fragment, including the 3' part of the gB gene and the 5' part of the DPOL gene, was finally determined. Phylogenetic analysis showed that the sequence was novel and most similar to those of the subfamily Gammaherpesvirinae.
Rowe, Will; Baker, Kate S; Verner-Jeffreys, David; Baker-Austin, Craig; Ryan, Jim J; Maskell, Duncan; Pearce, Gareth
2015-01-01
Antimicrobial resistance remains a growing and significant concern in human and veterinary medicine. Current laboratory methods for the detection and surveillance of antimicrobial resistant bacteria are limited in their effectiveness and scope. With the rapidly developing field of whole genome sequencing beginning to be utilised in clinical practice, the ability to interrogate sequencing data quickly and easily for the presence of antimicrobial resistance genes will become increasingly important and useful for informing clinical decisions. Additionally, use of such tools will provide insight into the dynamics of antimicrobial resistance genes in metagenomic samples such as those used in environmental monitoring. Here we present the Search Engine for Antimicrobial Resistance (SEAR), a pipeline and web interface for detection of horizontally acquired antimicrobial resistance genes in raw sequencing data. The pipeline provides gene information, abundance estimation and the reconstructed sequence of antimicrobial resistance genes; it also provides web links to additional information on each gene. The pipeline utilises clustering and read mapping to annotate full-length genes relative to a user-defined database. It also uses local alignment of annotated genes to a range of online databases to provide additional information. We demonstrate SEAR's application in the detection and abundance estimation of antimicrobial resistance genes in two novel environmental metagenomes, 32 human faecal microbiome datasets and 126 clinical isolates of Shigella sonnei. We have developed a pipeline that contributes to the improved capacity for antimicrobial resistance detection afforded by next generation sequencing technologies, allowing for rapid detection of antimicrobial resistance genes directly from sequencing data. SEAR uses raw sequencing data via an intuitive interface so can be run rapidly without requiring advanced bioinformatic skills or resources. 
Finally, we show that SEAR is effective in detecting antimicrobial resistance genes in sequencing data from both environmental metagenomes and clinical isolates.
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.
Wimmer, Katharina; Wernstedt, Annekatrin
2014-01-01
The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method has the advantage of effectively identifying deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.
Dodds, Peter N.; Lawrence, Gregory J.; Catanzariti, Ann-Maree; Teh, Trazel; Wang, Ching-I. A.; Ayliffe, Michael A.; Kobe, Bostjan; Ellis, Jeffrey G.
2006-01-01
Plant resistance proteins (R proteins) recognize corresponding pathogen avirulence (Avr) proteins either indirectly through detection of changes in their host protein targets or through direct R–Avr protein interaction. Although indirect recognition imposes selection against Avr effector function, pathogen effector molecules recognized through direct interaction may overcome resistance through sequence diversification rather than loss of function. Here we show that the flax rust fungus AvrL567 genes, whose products are recognized by the L5, L6, and L7 R proteins of flax, are highly diverse, with 12 sequence variants identified from six rust strains. Seven AvrL567 variants derived from Avr alleles induce necrotic responses when expressed in flax plants containing corresponding resistance genes (R genes), whereas five variants from avr alleles do not. Differences in recognition specificity between AvrL567 variants and evidence for diversifying selection acting on these genes suggest they have been involved in a gene-specific arms race with the corresponding flax R genes. Yeast two-hybrid assays indicate that recognition is based on direct R–Avr protein interaction and recapitulate the interaction specificity observed in planta. Biochemical analysis of Escherichia coli-produced AvrL567 proteins shows that variants that escape recognition nevertheless maintain a conserved structure and stability, suggesting that the amino acid sequence differences directly affect the R–Avr protein interaction. We suggest that direct recognition associated with high genetic diversity at corresponding R and Avr gene loci represents an alternative outcome of plant–pathogen coevolution to indirect recognition associated with simple balanced polymorphisms for functional and nonfunctional R and Avr genes. PMID:16731621
Silar, Philippe; Barreau, Christian; Debuchy, Robert; Kicka, Sébastien; Turcq, Béatrice; Sainsard-Chanet, Annie; Sellem, Carole H; Billault, Alain; Cattolico, Laurence; Duprat, Simone; Weissenbach, Jean
2003-08-01
A Podospora anserina BAC library of 4800 clones has been constructed in the vector pBHYG, allowing direct selection in fungi. Screening of the BAC collection for centromeric sequences of chromosome V allowed the recovery of clones localized on either side of the centromere, but no BAC clone was found to contain the centromere. Seven BAC clones containing 322,195 and 156,244 bp from either side of the centromeric region were sequenced and annotated. One 5S rRNA gene, 5 tRNA genes, and 163 putative coding sequences (CDS) were identified. Among these, only six CDS seem specific to P. anserina. The gene density in the centromeric region is approximately one gene every 2.8 kb. Extrapolation of this gene density to the whole genome of P. anserina suggests that the genome contains about 11,000 genes. Synteny analyses between P. anserina and Neurospora crassa show that co-linearity extends at most to a few genes, suggesting rapid genome rearrangements between these two species.
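The density figure quoted above can be checked with a few lines of arithmetic. The sketch below uses only numbers from the abstract. It assumes that the density counts all annotated genes (rRNA, tRNA and CDS), which the abstract does not state explicitly. The implied genome size is simply 11,000 genes times 2.8 kb and is not a value reported in the source.

```python
# Figures taken from the abstract above.
sequenced_bp = 322_195 + 156_244   # sequence from both sides of the centromeric region
annotated_genes = 1 + 5 + 163      # 5S rRNA gene + tRNA genes + putative CDS (assumed basis)

kb_per_gene = sequenced_bp / annotated_genes / 1000
print(f"about one gene every {kb_per_gene:.1f} kb")

# Assumption: the rounded regional density (2.8 kb per gene) holds genome-wide.
# The genome size below is only what the abstract's own estimate implies.
implied_genome_mb = 11_000 * 2.8 / 1000
print(f"implied genome size: about {implied_genome_mb:.1f} Mb")
```

Counting only the 163 putative CDS instead gives roughly 2.9 kb per gene, so the choice of denominator matters at this precision.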
Simulation of gene evolution under directional mutational pressure
NASA Astrophysics Data System (ADS)
Dudkiewicz, Małgorzata; Mackiewicz, Paweł; Kowalczuk, Maria; Mackiewicz, Dorota; Nowicka, Aleksandra; Polak, Natalia; Smolarczyk, Kamila; Banaszak, Joanna; R. Dudek, Mirosław; Cebrat, Stanisław
2004-05-01
The two main mechanisms generating genetic diversity, mutation and recombination, are random in character but biased, which generates asymmetry in bacterial chromosome structure and in protein-coding sequences. Thus, as in the case of two chiral molecules, the two possible orientations of a gene in relation to the topology of a chromosome are not equivalent. If the sequence of a gene may oscillate only between certain limits of its structural composition, the gene could be forced out of these limits by directional mutation pressure in the course of evolution. The probability of this event depends on the time the gene stays under the same mutation pressure. Inversion of the gene changes the directional mutational pressure to the reciprocal one and hence changes the distance of the gene to the lower and upper bounds of its structural tolerance. Using Monte Carlo methods, we were able to simulate the evolution of genes under experimentally found mutational pressure, assuming simple mechanisms of selection. We found that mutation and recombination should work in concert to lower their negative effects on the function of the products of coding sequences.
Hamond, C; Pestana, C P; Medeiros, M A; Lilenbaum, W
2016-01-01
The aim of this study was to identify Leptospira in urine samples of cattle by direct sequencing of the secY gene. The validity of this approach was assessed using ten Leptospira strains obtained from cattle in Brazil and 77 DNA samples previously extracted from cattle urine that were positive by PCR for the genus-specific lipL32 gene of Leptospira. Direct sequencing identified 24 (31.1%) interpretable secY sequences and these were identical to those obtained from direct DNA sequencing of the urine samples from which they were recovered. Phylogenetic analyses identified four species: L. interrogans, L. borgpetersenii, L. noguchii, and L. santarosai, with the most prevalent genotypes being associated with L. borgpetersenii. While direct sequencing cannot, as yet, replace culturing of leptospires, it is a valid additional tool for epidemiological studies. An unexpected finding from this study was the genetic diversity of Leptospira infecting Brazilian cattle.
Evaluation of microbial community in hydrothermal field by direct DNA sequencing
NASA Astrophysics Data System (ADS)
Kawarabayasi, Y.; Maruyama, A.
2002-12-01
Many extremophiles have been discovered from terrestrial and marine hydrothermal fields. Some thermophiles can grow beyond 90°C in culture, while direct microscopic analysis occasionally indicates that microbes may survive in much hotter hydrothermal fluids. However, it is very difficult to isolate and cultivate such microbes from these environments, i.e., over 99% of total microbes remain undiscovered. Based on experience in entire microbial genome analysis (Y.K.) and microbial community analysis (A.M.), we set out to find unique microbes/genes in hydrothermal fields through direct sequencing of environmental DNA fragments. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected by an in situ filtration system from low-temperature fluids at RM24 in the Southern East Pacific Rise (S-EPR). Gene amplification (PCR) was not used, to avoid introducing mutations in the process. None of the nucleotide sequences of 285 clones had an identical match in public databases. Among the 27 clones whose entire sequences were determined, no ORF was identified on 14 clones, as in eukaryotic introns. On four clones, multiple tandem repeats of tetranucleotide units were identified. This type of sequence has been identified in some familial diseases in humans. The result indicates that living/dead materials with eukaryotic features may exist in this low-temperature field. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. In 143 randomly selected clones used for sequencing, no known sequence was identified. Unlike the clones in the S-EPR library, clear ORFs were identified on all nine clones whose entire sequences were determined. One clone, H4052, was found to contain the complete aspartyl-tRNA synthetase gene.
Phylogenetic analysis using amino acid sequences of this gene indicated that this gene was separated from other Euryarchaea before the differentiation of species. Thus, some novel archaeal species are expected to be present in this field. The present direct cloning and sequencing technique is now opening a window to the new world in hydrothermal microbial community analysis.
Dehghanian, Fatemeh; Silawi, Mohammad; Tabei, Seyed M B
2017-02-01
Deficiency of phenylalanine hydroxylase (PAH) enzyme and elevation of phenylalanine in body fluids cause phenylketonuria (PKU). The gold standard for confirming PKU and PAH deficiency is detecting causal mutations by direct sequencing of the coding exons and splicing-involved sequences of the PAH gene. Furthermore, haplotype analysis could be considered as an auxiliary approach for detecting PKU causative mutations before direct sequencing of the PAH gene, by comparing previously detected mutation-linked haplotypes with the haplotypes of new PKU cases with undetermined mutations. In this study, 13 unrelated classical PKU patients were examined for causative mutations. Mutations were identified by polymerase chain reaction (PCR) and direct sequencing in all patients. After that, haplotype analysis was performed by studying VNTR and PAHSTR markers (linked genetic markers of the PAH gene) through application of PCR and capillary electrophoresis (CE). Mutation analysis was performed successfully and the detected mutations were as follows: c.782G>A, c.754C>T, c.842C>G, c.113-115delTCT, c.688G>A, and c.696A>G. Additionally, PAHSTR/VNTR haplotypes were determined to discover haplotypes linked to each mutation. Mutation detection is the best approach for confirming PAH enzyme deficiency in PKU patients. Due to the relatively large size of the PAH gene and the high cost of direct sequencing in developing countries, haplotype analysis could be used before DNA sequencing and mutation detection as a faster and cheaper way of identifying probable mutated exons.
Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong
2010-02-19
Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.
Schiex, Thomas; Gouzy, Jérôme; Moisan, Annick; de Oliveira, Yannick
2003-07-01
We describe FrameD, a program that predicts coding regions in prokaryotic and matured eukaryotic sequences. Initially targeted at gene prediction in bacterial GC-rich genomes, the gene model used in FrameD also allows prediction of genes in the presence of frameshifts and partially undetermined sequences, which makes it very suitable for gene prediction and frameshift correction in unfinished sequences such as EST and EST cluster sequences. Like recent eukaryotic gene prediction programs, FrameD also includes the ability to take into account protein similarity information both in its prediction and its graphical output. Its performance is evaluated on different bacterial genomes. The web site (http://genopole.toulouse.inra.fr/bioinfo/FrameD/FD) allows direct prediction, sequence correction and translation, and the ability to learn new models for new organisms.
Mismer, D.; Rubin, G. M.
1989-01-01
We have analyzed the cis-acting regulatory sequences of the Rh1 (ninaE) gene in Drosophila melanogaster by P-element-mediated germline transformation of indicator genes transcribed from mutant ninaE promoter sequences. We have previously shown that a 200-bp region extending from -120 to +67 relative to the transcription start site is sufficient to obtain eye-specific expression from the ninaE promoter. In the present study, 22 different 4-13-bp sequences in the -120/+67 promoter region were altered by oligonucleotide-directed mutagenesis. Several of these sequences were found to be required for proper promoter function; two of these are conserved in the promoter of the homologous gene isolated from the related species Drosophila virilis. Alteration of a conserved 9-bp sequence results in aberrant, low level expression in the body. Alteration of a separate 11-bp sequence, found in the promoter regions of several photoreceptor-specific genes of Drosophila, results in an approximately 15-fold reduction in promoter efficiency but without apparent alteration of tissue-specificity. A protein factor capable of interacting with this 11-bp sequence has been detected by DNaseI footprinting in embryonic nuclear extracts. Finally, we have further characterized two separable enhancer sequences previously shown to be required for normal levels of expression from this promoter. PMID:2521839
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.
Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad
2016-09-01
Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
Parrish, R Ryley; Day, Jeremy J; Lubin, Farah D
2012-07-01
DNA methylation is an epigenetic modification that is essential for the development and mature function of the central nervous system. Due to the relevance of this modification to the transcriptional control of gene expression, it is often necessary to examine changes in DNA methylation patterns with both gene and single-nucleotide resolution. Here, we describe an in-depth basic protocol for direct bisulfite sequencing of DNA isolated from brain tissue, which will permit direct assessment of methylation status at individual genes as well as individual cytosine molecules/nucleotides within a genomic region. This method yields analysis of DNA methylation patterns that is robust, accurate, and reproducible, thereby allowing insights into the role of alterations in DNA methylation in brain tissue.
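The read-out principle behind bisulfite sequencing can be sketched in a few lines: bisulfite treatment converts unmethylated cytosines so that they read as T after amplification, while methylated cytosines still read as C. The function below is a minimal illustration and not the protocol's analysis software. It assumes a hypothetical pre-aligned, equal-length, forward-strand reference and read, and the name `call_methylation` is invented for this example.

```python
def call_methylation(reference: str, bisulfite_read: str):
    """Return (position, status) for every cytosine in the reference.

    Assumes both strings are already aligned, gap-free and on the same strand.
    """
    if len(reference) != len(bisulfite_read):
        raise ValueError("reference and read must be aligned and of equal length")
    calls = []
    pairs = zip(reference.upper(), bisulfite_read.upper())
    for pos, (ref_base, read_base) in enumerate(pairs):
        if ref_base != "C":
            continue  # only cytosines are informative
        if read_base == "C":
            calls.append((pos, "methylated"))    # protected from conversion
        elif read_base == "T":
            calls.append((pos, "unmethylated"))  # converted C reads as T
        else:
            calls.append((pos, "ambiguous"))     # sequencing error or variant
    return calls


# Toy sequences, not data from the protocol.
example_calls = call_methylation("ACGTCCGA", "ACGTTTGA")
print(example_calls)
```

Real analyses also have to handle the reverse strand, incomplete conversion and mixed templates, all of which this sketch ignores.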
2009-01-01
Background: One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler, making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process, resulting in reduced cost and increased throughput. Results: An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other patients).
Methods and assay sequences are reported in this paper. Conclusion: This automated process allows laboratories to discover DNA variations in a short time and at low cost. PMID:19835634
Bennett, Richard R; Schneider, Hal E; Estrella, Elicia; Burgess, Stephanie; Cheng, Andrew S; Barrett, Caitlin; Lip, Va; Lai, Poh San; Shen, Yiping; Wu, Bai-Lin; Darras, Basil T; Beggs, Alan H; Kunkel, Louis M
2009-10-18
One of the most common and efficient methods for detecting mutations in genes is PCR amplification followed by direct sequencing. Until recently, the process of designing PCR assays has been to focus on individual assay parameters rather than concentrating on matching conditions for a set of assays. Primers for each individual assay were selected based on location and sequence concerns. The two primer sequences were then iteratively adjusted to make the individual assays work properly. This generally resulted in groups of assays with different annealing temperatures that required the use of multiple thermal cyclers or multiple passes in a single thermal cycler, making diagnostic testing time-consuming, laborious and expensive. These factors have severely hampered diagnostic testing services, leaving many families without an answer for the exact cause of a familial genetic disease. A search of GeneTests for sequencing analysis of the entire coding sequence for genes that are known to cause muscular dystrophies returns only a small list of laboratories that perform comprehensive gene panels. The hypothesis for the study was that a complete set of universal assays can be designed to amplify and sequence any gene or family of genes using computer aided design tools. If true, this would allow automation and optimization of the mutation detection process, resulting in reduced cost and increased throughput. An automated process has been developed for the detection of deletions, duplications/insertions and point mutations in any gene or family of genes and has been applied to ten genes known to bear mutations that cause muscular dystrophy: DMD; CAV3; CAPN3; FKRP; TRIM32; LMNA; SGCA; SGCB; SGCG; SGCD. Using this process, mutations have been found in five DMD patients and four LGMD patients (one in the FKRP gene, one in the CAV3 gene, and two likely causative heterozygous pairs of variations in the CAPN3 gene of two other patients).
Methods and assay sequences are reported in this paper. This automated process allows laboratories to discover DNA variations in a short time and at low cost.
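The idea of matching conditions across a set of assays can be illustrated with a rough melting-temperature check. The sketch below is not the authors' design tool, which the abstract does not describe. It uses the Wallace rule, a common rule of thumb for short oligonucleotides (2 °C per A/T plus 4 °C per G/C), and the primer sequences, function names and tolerance are all hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def share_conditions(primers, tolerance_c=2):
    """True if all primers fall within tolerance_c of one another.

    A set that passes could, in principle, run at one annealing temperature.
    """
    tms = [wallace_tm(p) for p in primers]
    return max(tms) - min(tms) <= tolerance_c


# Hypothetical primers, not assay sequences from the paper.
hypothetical_primers = ["ATGCATGCATGCATGCATGC", "GGATCCATTAGCATTAGCAT"]
for primer in hypothetical_primers:
    print(primer, wallace_tm(primer))
print(share_conditions(hypothetical_primers))
```

A production design tool would use nearest-neighbor thermodynamics and salt corrections. The Wallace rule is used here only because it keeps the matching step easy to follow.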
Multiple copies of a bile acid-inducible gene in Eubacterium sp. strain VPI 12708.
Gopal-Srivastava, R; Mallonee, D H; White, W B; Hylemon, P B
1990-01-01
Eubacterium sp. strain VPI 12708 is an anaerobic intestinal bacterium which possesses inducible bile acid 7-dehydroxylation activity. Several new polypeptides are produced in this strain following induction with cholic acid. Genes coding for two copies of a bile acid-inducible 27,000-dalton polypeptide (baiA1 and baiA2) have been previously cloned and sequenced. We now report on a gene coding for a third copy of this 27,000-dalton polypeptide (baiA3). The baiA3 gene has been cloned in lambda DASH on an 11.2-kilobase DNA fragment from a partial Sau3A digest of the Eubacterium DNA. DNA sequence analysis of the baiA3 gene revealed 100% homology with the baiA1 gene within the coding region of the 27,000-dalton polypeptides. The baiA2 gene shares 81% sequence identity with the other two genes at the nucleotide level. The flanking nucleotide sequences associated with the baiA1 and baiA3 genes are identical for 930 bases in the 5' direction from the initiation codon and for at least 325 bases in the 3' direction from the stop codon, including the putative promoter regions for the genes. An additional open reading frame (occupying from 621 to 648 bases, depending on the correct start codon) was found in the identical 5' regions associated with the baiA1 and baiA3 clones. The 5' sequence 930 bases upstream from the baiA1 and baiA3 genes was totally divergent. The baiA2 gene, which is part of a large bile acid-inducible operon, showed no homology with the other two genes either in the 5' or 3' direction from the polypeptide coding region, except for a 15-base-pair presumed ribosome-binding site in the 5' region. These studies strongly suggest that a gene duplication (baiA1 and baiA3) has occurred and is stably maintained in this bacterium. PMID:2376563
The full mitochondrial genome sequence of Raillietina tetragona from chicken (Cestoda: Davaineidae).
Liang, Jian-Ying; Lin, Rui-Qing
2016-11-01
In the present study, the complete mitochondrial DNA (mtDNA) sequence of Raillietina tetragona was determined and its gene content and genome organization were compared with those of other tapeworms. The complete mt genome sequence of R. tetragona is 14,444 bp in length. It contains 12 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and two non-coding regions. All genes are transcribed in the same direction and have a nucleotide composition high in A and T. The A + T content of the complete mt genome of R. tetragona is 71.4%. The R. tetragona mt genome sequence provides novel mtDNA markers for studying the molecular epidemiology and population genetics of Raillietina and has implications for the molecular diagnosis of chicken cestodosis caused by Raillietina.
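The A + T content reported above is a simple base-composition ratio. The sketch below shows how such a figure is typically computed. It runs on a toy string rather than the R. tetragona genome, which is not reproduced in the abstract, and the function name `at_content` is chosen for this example.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in sequence.upper() if b in "ACGT"]  # skip N and other codes
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# Toy sequence for illustration only.
toy_sequence = "ATTAGCATAT"
print(f"A + T content: {at_content(toy_sequence):.1f}%")
```

Applied to the full 14,444-bp sequence, the same calculation would be expected to give the 71.4% stated in the abstract.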
Khulordava, Irakli; Miller, Geraldine; Haas, David; Li, Haijing; McKinsey, Joel; Vanderende, Daniel; Tang, Yi-Wei
2003-05-01
We report two cases of culture-negative bacterial endocarditis in which the organisms were identified by amplification and sequencing of the bacterial 16S rRNA gene. These results support an important role for polymerase chain reaction followed by direct sequencing to determine the etiology of culture-negative bacterial endocarditis and to guide appropriate antimicrobial therapy.
Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).
Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E
2005-12-02
cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.
Massively Parallel DNA Sequencing Facilitates Diagnosis of Patients with Usher Syndrome Type 1
Yoshimura, Hidekane; Iwasaki, Satoshi; Nishio, Shin-ya; Kumakawa, Kozo; Tono, Tetsuya; Kobayashi, Yumiko; Sato, Hiroaki; Nagai, Kyoko; Ishikawa, Kotaro; Ikezono, Tetsuo; Naito, Yasushi; Fukushima, Kunihiro; Oshikawa, Chie; Kimitsuki, Takashi; Nakanishi, Hiroshi; Usami, Shin-ichi
2014-01-01
Usher syndrome is an autosomal recessive disorder manifesting hearing loss, retinitis pigmentosa and vestibular dysfunction, and having three clinical subtypes. Usher syndrome type 1 is the most severe subtype due to its profound hearing loss, lack of vestibular responses, and retinitis pigmentosa that appears in prepuberty. Six of the corresponding genes have been identified, making early diagnosis through DNA testing possible, with many immediate and several long-term advantages for patients and their families. However, the conventional genetic techniques, such as direct sequence analysis, are both time-consuming and expensive. Targeted exon sequencing of selected genes using massively parallel DNA sequencing technology will potentially enable us to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using this technique combined with direct sequence analysis, we screened 17 unrelated Usher syndrome type 1 patients and detected probable pathogenic variants in 16 of them (94.1%), each of whom carried at least one mutation. Seven patients had MYO7A mutations (41.2%), the most common type in Japanese patients. Most of the mutations were detected only by massively parallel DNA sequencing. We report here four patients who had probable pathogenic mutations in two different Usher syndrome type 1 genes, and one case of MYO7A/PCDH15 digenic inheritance. This is the first report of Usher syndrome mutation analysis using massively parallel DNA sequencing and of the frequency of Usher syndrome type 1 genes in Japanese patients. Mutation screening using this technique has the power to quickly identify mutations of many causative genes while maintaining cost-benefit performance. In addition, the simultaneous mutation analysis of large numbers of genes is useful for detecting mutations in different genes that are possibly disease modifiers or of digenic inheritance. PMID:24618850
De Novo Protein Structure Prediction
NASA Astrophysics Data System (ADS)
Hung, Ling-Hong; Ngan, Shing-Chung; Samudrala, Ram
An unparalleled amount of sequence data is being made available from large-scale genome sequencing efforts. The data provide a shortcut to the determination of the function of a gene of interest, as long as there is an existing sequenced gene with similar sequence and of known function. This has spurred structural genomic initiatives with the goal of determining as many protein folds as possible (Brenner and Levitt, 2000; Burley, 2000; Brenner, 2001; Heinemann et al., 2001). The purpose of this is twofold: First, the structure of a gene product can often lead to direct inference of its function. Second, since the function of a protein is dependent on its structure, direct comparison of the structures of gene products can be more sensitive than the comparison of sequences of genes for detecting homology. Presently, structural determination by crystallography and NMR techniques is still slow and expensive in terms of manpower and resources, despite attempts to automate the processes. Computer structure prediction algorithms, while not providing the accuracy of the traditional techniques, are extremely quick and inexpensive and can provide useful low-resolution data for structure comparisons (Bonneau and Baker, 2001). Given the immense number of structures which the structural genomic projects are attempting to solve, there would be a considerable gain even if the computer structure prediction approach were applicable to a subset of proteins.
Lee, Seung-Bum; Kaittanis, Charalambos; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry
2006-01-01
Background Cotton (Gossypium hirsutum) is the most important fiber crop grown in 90 countries. In 2004–2005, US farmers planted 79% of the 5.7-million hectares of nuclear transgenic cotton. Unfortunately, genetically modified cotton has the potential to hybridize with other cultivated and wild relatives, resulting in geographical restrictions to cultivation. However, chloroplast genetic engineering offers the possibility of containment because of maternal inheritance of transgenes. The complete chloroplast genome of cotton provides essential information required for genetic engineering. In addition, the sequence data were used to assess phylogenetic relationships among the major clades of rosids using cotton and 25 other completely sequenced angiosperm chloroplast genomes. Results The complete cotton chloroplast genome is 160,301 bp in length, with 112 unique genes and 19 duplicated genes within the IR, containing a total of 131 genes. There are four ribosomal RNAs, 30 distinct tRNA genes and 17 intron-containing genes. The gene order in cotton is identical to that of tobacco but lacks rpl22 and infA. There are 30 direct and 24 inverted repeats 30 bp or longer with a sequence identity ≥ 90%. Most of the direct repeats are within intergenic spacer regions, introns and a 72 bp-long direct repeat is within the psaA and psaB genes. Comparison of protein coding sequences with expressed sequence tags (ESTs) revealed nucleotide substitutions resulting in amino acid changes in ndhC, rpl23, rpl20, rps3 and clpP. Phylogenetic analysis of a data set including 61 protein-coding genes using both maximum likelihood and maximum parsimony were performed for 28 taxa, including cotton and five other angiosperm chloroplast genomes that were not included in any previous phylogenies. Conclusion Cotton chloroplast genome lacks rpl22 and infA and contains a number of dispersed direct and inverted repeats. 
RNA editing resulted in amino acid changes with significant impact on their hydropathy. Phylogenetic analysis provides strong support for the position of cotton in the Malvales in the eurosids II clade sister to Arabidopsis in the Brassicales. Furthermore, there is strong support for the placement of the Myrtales sister to the eurosid I clade, although expanded taxon sampling is needed to further test this relationship. PMID:16553962
Structure-Function Analysis of Chloroplast Proteins via Random Mutagenesis Using Error-Prone PCR.
Dumas, Louis; Zito, Francesca; Auroy, Pascaline; Johnson, Xenie; Peltier, Gilles; Alric, Jean
2018-06-01
Site-directed mutagenesis of chloroplast genes was developed three decades ago and has greatly advanced the field of photosynthesis research. Here, we describe a new approach for generating random chloroplast gene mutants that combines error-prone polymerase chain reaction of a gene of interest with chloroplast complementation of the knockout Chlamydomonas reinhardtii mutant. As a proof of concept, we targeted a 300-bp sequence of the petD gene that encodes subunit IV of the thylakoid membrane-bound cytochrome b6f complex. By sequencing chloroplast transformants, we revealed 149 mutations in the 300-bp target petD sequence that resulted in 92 amino acid substitutions in the 100-residue target subunit IV sequence. Our results show that this method is suited to the study of highly hydrophobic, multisubunit, and chloroplast-encoded proteins containing cofactors such as hemes, iron-sulfur clusters, and chlorophyll pigments. Moreover, we show that mutant screening and sequencing can be used to study photosynthetic mechanisms or to probe the mutational robustness of chloroplast-encoded proteins, and we propose that this method is a valuable tool for the directed evolution of enzymes in the chloroplast.
Microbial Characterization of Qatari Barchan Sand Dunes
Chatziefthimiou, Aspassia D.; Nguyen, Hanh; Richer, Renee; Louge, Michel; Sultan, Ali A.; Schloss, Patrick; Hay, Anthony G.
2016-01-01
This study represents the first characterization of sand microbiota in migrating barchan sand dunes. Bacterial communities were studied through direct counts and cultivation, as well as 16S rRNA gene and metagenomic sequence analysis to gain an understanding of microbial abundance, diversity, and potential metabolic capabilities. Direct on-grain cell counts gave an average of 5.3 ± 0.4 × 10^5 cells g^-1 of sand. Cultured isolates (N = 64) selected for 16S rRNA gene sequencing belonged to the phyla Actinobacteria (58%), Firmicutes (27%) and Proteobacteria (15%). Deep-sequencing of 16S rRNA gene amplicons from 18 dunes demonstrated a high relative abundance of Proteobacteria, particularly enteric bacteria, and a dune-specific pattern of bacterial community composition that correlated with dune size. Shotgun metagenome sequences of two representative dunes were analyzed and found to have similar relative bacterial abundance, though the relative abundances of eukaryotic, viral and enterobacterial sequences were greater in sand from the dune closer to a camel-pen. Functional analysis revealed patterns similar to those observed in desert soils; however, the increased relative abundance of genes encoding sporulation and dormancy is consistent with the dune microbiome being well-adapted to the exceptionally hyper-arid Qatari desert. PMID:27655399
Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C
2003-01-01
Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
rrndb: the Ribosomal RNA Operon Copy Number Database
Klappenbach, Joel A.; Saxman, Paul R.; Cole, James R.; Schmidt, Thomas M.
2001-01-01
The Ribosomal RNA Operon Copy Number Database (rrndb) is an Internet-accessible database containing annotated information on rRNA operon copy number among prokaryotes. Gene redundancy is uncommon in prokaryotic genomes, yet the rRNA genes can vary from one to as many as 15 copies. Despite the widespread use of 16S rRNA gene sequences for identification of prokaryotes, information on the number and sequence of individual rRNA genes in a genome is not readily accessible. In an attempt to understand the evolutionary implications of rRNA operon redundancy, we have created a phylogenetically arranged report on rRNA gene copy number for a diverse collection of prokaryotic microorganisms. Each entry (organism) in the rrndb contains detailed information linked directly to external websites including the Ribosomal Database Project, GenBank, PubMed and several culture collections. Data contained in the rrndb will be valuable to researchers investigating microbial ecology and evolution using 16S rRNA gene sequences. The rrndb web site is directly accessible on the WWW at http://rrndb.cme.msu.edu. PMID:11125085
Subburaj, Saminathan; Chung, Sung Jin; Lee, Choongil; Ryu, Seuk-Min; Kim, Duk Hyoung; Kim, Jin-Soo; Bae, Sangsu; Lee, Geung-Joo
2016-07-01
Site-directed mutagenesis of nitrate reductase genes using direct delivery of purified Cas9 protein preassembled with guide RNA produces mutations efficiently in Petunia × hybrida protoplast system. The clustered, regularly interspaced, short palindromic repeat (CRISPR)-CRISPR associated endonuclease 9 (CRISPR/Cas9) system has been recently announced as a powerful molecular breeding tool for site-directed mutagenesis in higher plants. Here, we report a site-directed mutagenesis method targeting Petunia nitrate reductase (NR) gene locus. This method could create mutations efficiently using direct delivery of purified Cas9 protein and single guide RNA (sgRNA) into protoplast cells. After transient introduction of RNA-guided endonuclease (RGEN) ribonucleoproteins (RNPs) with different sgRNAs targeting NR genes, mutagenesis at the targeted loci was detected by T7E1 assay and confirmed by targeted deep sequencing. T7E1 assay showed that RGEN RNPs induced site-specific mutations at frequencies ranging from 2.4 to 21 % at four different sites (NR1, 2, 4 and 6) in the PhNR gene locus with average mutation efficiency of 14.9 ± 2.2 %. Targeted deep DNA sequencing revealed mutation rates of 5.3-17.8 % with average mutation rate of 11.5 ± 2 % at the same NR gene target sites in DNA fragments of analyzed protoplast transfectants. Further analysis from targeted deep sequencing showed that the average ratio of deletion to insertion produced collectively by the four NR-RGEN target sites (NR1, 2, 4, and 6) was about 63:37. Our results demonstrated that direct delivery of RGEN RNPs into protoplast cells of Petunia can be exploited as an efficient tool for site-directed mutagenesis of genes or genome editing in plant systems.
Transposon Tn10 contains two structural genes with opposite polarity between tetA and IS10R.
Schollmeier, K; Hillen, W
1984-01-01
The nucleotide sequence of the central part of Tn10 has been determined from the rightmost HindIII site to IS10R. This sequence contains two open reading frames with opposite polarity. The in vivo transcription start points in this sequence have been determined by S1 mapping. These results define one minor and two major promoters. The transcription starts of the two major promoters are only 18 base pairs apart, and the transcripts show different polarity and overlap by 18 base pairs. The nucleotide sequence reveals two regions with palindromic symmetry which may serve as operators. Their possible involvement in the regulation of transcription of both genes is discussed. Taken together these results allow for a maximal coding capacity of 138 amino acids directed toward IS10R and 197 amino acids directed toward tetA. The possible function of these gene products is discussed. The accompanying article (Braus et al., J. Bacteriol. 160:504-509, 1984) presents evidence that these genes are expressed. PMID:6094471
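To make the idea of "two open reading frames with opposite polarity" concrete, the following is a minimal sketch of how coding capacity on each strand could be estimated from a nucleotide sequence. It is not the procedure used by the authors; the function names (`reverse_complement`, `longest_orf_codons`, `coding_capacity`) and the toy sequence are assumptions made for illustration, and the scan simply looks for the longest ATG-to-stop stretch in each of the three frames of a strand.

```python
# Illustrative sketch only: a naive ORF scan on both strands.
# All names and the toy sequence are hypothetical, not taken from the abstract.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def longest_orf_codons(seq):
    """Length in codons (start included, stop excluded) of the longest
    ATG...stop ORF found in any of the three frames of this strand."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this frame
            elif start is not None and codon in STOP_CODONS:
                best = max(best, (i - start) // 3)
                start = None                   # look for the next ORF
    return best


def coding_capacity(seq):
    """Longest ORF (in codons) on the given strand and on its reverse complement."""
    return {
        "forward": longest_orf_codons(seq),
        "reverse": longest_orf_codons(reverse_complement(seq)),
    }


# Toy example: one 5-codon ORF on the forward strand and one 3-codon ORF
# encoded on the opposite strand.
forward_orf = "ATG" + "GCT" * 4 + "TAA"
reverse_orf = "ATG" + "AAA" * 2 + "TGA"
toy = forward_orf + "CC" + reverse_complement(reverse_orf)
print(coding_capacity(toy))
```

A real analysis would also need to consider alternative start codons and the experimentally mapped transcription start points, which this sketch ignores.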
Schaschl, Helmut; Huber, Susanne; Schaefer, Katrin; Windhager, Sonja; Wallner, Bernard; Fieder, Martin
2015-05-13
The evolutionarily highly conserved neurohypophyseal hormones oxytocin and arginine vasopressin play key roles in regulating social cognition and behaviours. The effects of these two peptides are mediated by their specific receptors, which are encoded by the oxytocin receptor (OXTR) and arginine vasopressin receptor 1a (AVPR1A) genes, respectively. In several species, polymorphisms in these genes have been linked to various behavioural traits. Little, however, is known about whether positive selection acts on sequence variants in genes influencing variation in human behaviours. We identified, in both neuroreceptor genes, signatures of balancing selection in cis-acting regulatory sequences such as transcription factor binding and enhancer sequences, as well as in a transcriptional repressor sequence motif. Additionally, in intron 3 of the OXTR gene, the SNP rs59190448 appears to be under positive directional selection. For rs59190448, only one phenotypical association is known so far, but it is in high LD' (>0.8) with loci of known association; i.e., variants associated with key pro-social behaviours and mental disorders in humans. Only for one SNP on the OXTR gene (rs59190448) was a sign of positive directional selection detected with all three methods of selection detection. For rs59190448, however, only one phenotypical association is known, but rs59190448 is in high LD' (>0.8) with variants associated with important pro-social behaviours and mental disorders in humans. We also detected various signatures of balancing selection on both neuroreceptor genes.
Anwar, R; Booth, A; Churchill, A J; Markham, A F
1996-01-01
The determination of nucleotide sequence is fundamental to the identification and molecular analysis of genes. Direct sequencing of PCR products is now becoming a commonplace procedure for haplotype analysis, and for defining mutations and polymorphism within genes, particularly for diagnostic purposes. A previously unrecognised phenomenon, primer related variability, observed in sequence data generated using Taq cycle sequencing and T7 Sequenase sequencing, is reported. This suggests that caution is necessary when interpreting DNA sequence data. This is particularly important in situations where treatment may be dependent on the accuracy of the molecular diagnosis. PMID:16696096
Gene encoding plant asparagine synthetase
Coruzzi, Gloria M.; Tsai, Fong-Ying
1993-10-26
The identification and cloning of the gene(s) for plant asparagine synthetase (AS), an important enzyme involved in the formation of asparagine, a major nitrogen transport compound of higher plants is described. Expression vectors constructed with the AS coding sequence may be utilized to produce plant AS; to engineer herbicide resistant plants, salt/drought tolerant plants or pathogen resistant plants; as a dominant selectable marker; or to select for novel herbicides or compounds useful as agents that synchronize plant cells in culture. The promoter for plant AS, which directs high levels of gene expression and is induced in an organ specific manner and by darkness, is also described. The AS promoter may be used to direct the expression of heterologous coding sequences in appropriate hosts.
Genomics approach to the environmental community of microorganisms
NASA Astrophysics Data System (ADS)
Kawarabayasi, Y.; Maruyama, A.
2004-12-01
Microscopic observation and comparison of 16S rDNA sequences have indicated that many extremophiles survive in hydrothermal environments. However, it is generally said that over 99% of all microbes cannot currently be cultivated. We therefore set out to identify uncultivable microbes through direct sequencing of environmental DNA. First, shotgun plasmid libraries were constructed directly from DNA prepared from mixed microbes collected from low-temperature hydrothermal water at RM24 on the Southern East Pacific Rise (S-EPR). The sequences of a number of clones showed features similar to eukaryotic introns or to the tandem repetitive sequences identified in some human familial diseases. These results indicated that microorganisms with eukaryotic features were dominant in the low-temperature water of the S-EPR. Second, shotgun plasmid libraries were constructed from environmental DNA prepared from Beppu hot springs. ORFs were easily identified in all clones whose entire sequence was determined; hot springs are therefore a good resource for finding novel genes. Finally, mixed microbes isolated from Suiyo seamount were used to construct a shotgun library, and the clones in this library also contained ORFs. From some clones in the hot spring and Suiyo samples, aminoacyl-tRNA synthetase genes, which are generally present in all organisms, were identified by sequence similarity. Phylogenetic analysis of the aminoacyl-tRNA synthetases identified indicated that novel, unidentified microorganisms should be present in the hot spring and at Suiyo seamount. The novel genes identified from Suiyo seamount were also expressed in E. coli, and some gene products were successfully obtained from the E. coli cells as soluble proteins. Some proteins were thermostable up to 70 °C, suggesting that the original host cell of the gene should be stable up to the same temperature. Our work indicates that environmental genomics, including direct cloning and sequencing of environmental DNA and expression of the genes identified, is a powerful approach for accessing novel uncultivable microbes and novel active genes.
Pitfalls and caveats in BRCA sequencing.
Bellosillo, Beatriz; Tusquets, Ignacio
2006-01-01
Between 5 and 10% of breast cancer cases are considered to result from hereditary predisposition. Germ-line mutations in BRCA1 and BRCA2 are responsible for an inherited predisposition to breast and ovarian cancer. Direct nucleotide sequencing is considered the gold standard technique for mutation detection for genes such as BRCA1 and BRCA2. In many laboratories that analyze BRCA1 and BRCA2, screening techniques to identify sequence variants in the PCR amplicons are performed prior to direct sequencing. The mutations detected in these genes may be frameshift mutations (insertions or deletions), nonsense mutations, or missense mutations. The clinical interpretation of the mutation as the cause of the disease may be difficult to establish in the case of missense mutations. A mutation in BRCA1 and/or BRCA2 is detected in only 30-70% of the families in which a hereditary component is suspected. Negative results may be due to: wrong selection of the proband; mutations in the regulatory portion of the genes; gene silencing due to epigenetic phenomena; or large genomic rearrangements that produce deletions of whole exons. Another possible explanation for the lack of detection of alterations in BRCA1 or BRCA2 is the presence of mutations in undiscovered genes or in genes that interact with BRCA1 and/or BRCA2, which may be low-penetrance genes, like CHEK2.
Jühling, Frank; Pütz, Joern; Bernt, Matthias; Donath, Alexander; Middendorf, Martin; Florentz, Catherine; Stadler, Peter F.
2012-01-01
Transfer RNAs (tRNAs) are present in all types of cells as well as in organelles. tRNAs of animal mitochondria show a low level of primary sequence conservation and exhibit ‘bizarre’ secondary structures, lacking complete domains of the common cloverleaf. Such sequences are hard to detect and hence frequently missed in computational analyses and mitochondrial genome annotation. Here, we introduce an automatic annotation procedure for mitochondrial tRNA genes in Metazoa based on sequence and structural information in manually curated covariance models. The method, applied to re-annotate 1876 available metazoan mitochondrial RefSeq genomes, makes it possible to distinguish between remaining functional genes and degrading ‘pseudogenes’, even at early stages of divergence. The subsequent analysis of a comprehensive set of mitochondrial tRNA genes gives new insights into the evolution of structures of mitochondrial tRNA sequences as well as into the mechanisms of genome rearrangements. We find frequent losses of tRNA genes concentrated in basal Metazoa, frequent independent losses of individual parts of tRNA genes, particularly in Arthropoda, and widespread conserved overlaps of tRNAs in opposite reading direction. Direct evidence for several recent Tandem Duplication-Random Loss events is gained, demonstrating that this mechanism has an impact on the appearance of new mitochondrial gene orders. PMID:22139921
Whole-exome sequencing identifies USH2A mutations in a pseudo-dominant Usher syndrome family.
Zheng, Sui-Lian; Zhang, Hong-Liang; Lin, Zhen-Lang; Kang, Qian-Yan
2015-10-01
Usher syndrome (USH) is an autosomal recessive (AR) multi-sensory degenerative disorder leading to deaf-blindness. USH is clinically subdivided into three subclasses, and 10 genes have been identified thus far. Clinical and genetic heterogeneities in USH make a precise diagnosis difficult. A dominant‑like USH family in successive generations was identified, and the present study aimed to determine the genetic predisposition of this family. Whole‑exome sequencing was performed in two affected patients and an unaffected relative. Systematic data were analyzed by bioinformatic analysis to remove the candidate mutations via step‑wise filtering. Direct Sanger sequencing and co‑segregation analysis were performed in the pedigree. One novel and two known mutations in the USH2A gene were identified, and were further confirmed by direct sequencing and co‑segregation analysis. The affected mother carried compound mutations in the USH2A gene, while the unaffected father carried a heterozygous mutation. The present study demonstrates that whole‑exome sequencing is a robust approach for the molecular diagnosis of disorders with high levels of genetic heterogeneity.
Terrados, Gloria; Finkernagel, Florian; Stielow, Bastian; Sadic, Dennis; Neubert, Juliane; Herdt, Olga; Krause, Michael; Scharfe, Maren; Jarek, Michael; Suske, Guntram
2012-01-01
The transcription factor Sp2 is essential for early mouse development and for proliferation of mouse embryonic fibroblasts in culture. Yet its mechanisms of action and its target genes are largely unknown. In this study, we have combined RNA interference, in vitro DNA binding, chromatin immunoprecipitation sequencing and global gene-expression profiling to investigate the role of Sp2 for cellular functions, to define target sites and to identify genes regulated by Sp2. We show that Sp2 is important for cellular proliferation, that it binds to GC-boxes, and that it occupies proximal promoters of genes essential for vital cellular processes including gene expression, replication, metabolism and signalling. Moreover, we identified key target genes and cellular pathways that are directly regulated by Sp2. Most significantly, Sp2 binds and activates numerous sequence-specific transcription factor and co-activator genes, and represses the whole battery of cholesterol synthesis genes. Our results establish Sp2 as a sequence-specific regulator of vitally important genes. PMID:22684502
Gene Identification Algorithms Using Exploratory Statistical Analysis of Periodicity
NASA Astrophysics Data System (ADS)
Mukherjee, Shashi Bajaj; Sen, Pradip Kumar
2010-10-01
Studying periodic patterns would be expected to be a standard line of attack for recognizing DNA sequence features in gene identification and similar problems, yet surprisingly little significant work has been done in this direction. This paper studies statistical properties of DNA sequences of complete genomes using a new technique. A DNA sequence is converted to a numeric sequence using various types of mappings, and a standard Fourier technique is applied to study the periodicity. Distinct statistical behaviour of periodicity parameters is found in coding and non-coding sequences, which can be used to distinguish between these parts. Here DNA sequences of Drosophila melanogaster were analyzed with significant accuracy.
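The mapping-plus-Fourier step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses one common mapping (four binary indicator sequences, one per base) and measures spectral power at period 3, the periodicity usually associated with codon structure. The abstract does not say which mappings or periodicity parameters the authors used, so the function names (`indicator`, `power_at_period`) and the choice of period 3 are hypothetical.

```python
# Illustrative sketch only: indicator-sequence mapping and DFT power at one period.
# The mapping and the period-3 statistic are assumptions, not the authors' method.
import cmath
import random


def indicator(seq, base):
    """Map a DNA string to a 0/1 sequence marking the positions of `base`."""
    return [1.0 if c == base else 0.0 for c in seq.upper()]


def power_at_period(seq, period):
    """Sum over A, C, G, T of |DFT|^2 at frequency 1/period, divided by length."""
    total = 0.0
    for base in "ACGT":
        x = indicator(seq, base)
        coeff = sum(
            v * cmath.exp(-2j * cmath.pi * k / period) for k, v in enumerate(x)
        )
        total += abs(coeff) ** 2
    return total / len(seq)


# A perfectly codon-periodic toy sequence versus a random one of equal length.
periodic = "GCA" * 100
rng = random.Random(0)
shuffled = "".join(rng.choice("ACGT") for _ in range(300))

print(power_at_period(periodic, 3))   # large: strong 3-base periodicity
print(power_at_period(shuffled, 3))   # small: no preferred period
```

Real coding regions show a much weaker period-3 signal than this toy repeat, so in practice the statistic would be computed in sliding windows and compared against a background distribution.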
Ivanov, E. L.; Sugawara, N.; Fishman-Lobell, J.; Haber, J. E.
1996-01-01
HO endonuclease-induced double-strand breaks (DSBs) within a direct duplication of Escherichia coli lacZ genes are repaired either by gene conversion or by single-strand annealing (SSA), with >80% being SSA. Previously it was demonstrated that the RAD52 gene is required for DSB-induced SSA. In the present study, the effects of other genes belonging to the RAD52 epistasis group were analyzed. We show that RAD51, RAD54, RAD55, and RAD57 genes are not required for SSA irrespective of whether recombination occurred in plasmid or chromosomal DNA. In both plasmid and chromosomal constructs with homologous sequences in direct orientation, the proportion of SSA events over gene conversion was significantly elevated in the mutant strains. However, gene conversion was not affected when the two lacZ sequences were in inverted orientation. These results suggest that there is a competition between SSA and gene conversion processes that favors SSA in the absence of RAD51, RAD54, RAD55 and RAD57. Mutations in RAD50 and XRS2 genes do not prevent the completion, but markedly retard the kinetics, of DSB repair by both mechanisms in the lacZ direct repeat plasmid, a result resembling the effects of these genes during mating-type (MAT) switching. PMID:8849880
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Trivedi, G.; Sreedasyam, A.
2010-07-06
Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
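The percentages quoted in the abstract above can be re-derived from its two counts. The following is only a small arithmetic check; the variable and function names are illustrative and do not come from the paper.

```python
# Counts quoted in the abstract; names are hypothetical.
TOTAL_MODELS = 1501      # differentially expressed gene models analysed
RESOLVED_MODELS = 1439   # models whose error flags were all resolved


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


unresolved_models = TOTAL_MODELS - RESOLVED_MODELS
resolved_pct = round(percent(RESOLVED_MODELS, TOTAL_MODELS))
unresolved_pct = round(percent(unresolved_models, TOTAL_MODELS))

print(unresolved_models, resolved_pct, unresolved_pct)
```

The rounded values agree with the 62 models, 96% and 4% reported in the abstract.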
Kammermeier, Jochen; Drury, Suzanne; James, Chela T; Dziubak, Robert; Ocaka, Louise; Elawad, Mamoun; Beales, Philip; Lench, Nicholas; Uhlig, Holm H; Bacchelli, Chiara; Shah, Neil
2014-11-01
Multiple monogenetic conditions with partially overlapping phenotypes can present with inflammatory bowel disease (IBD)-like intestinal inflammation. With novel genotype-specific therapies emerging, establishing a molecular diagnosis is becoming increasingly important. We have introduced targeted next-generation sequencing (NGS) technology as a prospective screening tool in children with very early onset IBD (VEOIBD). We evaluated the coverage of 40 VEOIBD genes in two separate cohorts undergoing targeted gene panel sequencing (TGPS) (n=25) and whole exome sequencing (WES) (n=20). TGPS revealed causative mutations in four genes (IL10RA, EPCAM, TTC37 and SKIV2L), discovered unexpected phenotypes and directly influenced clinical decision making by supporting as well as avoiding haematopoietic stem cell transplantation. TGPS resulted in significantly higher median coverage when compared with WES, fewer coverage deficiencies and improved variant detection across established VEOIBD genes. Excluding or confirming known VEOIBD genotypes should be considered early in the disease course in all cases of therapy-refractory VEOIBD, as it can have a direct impact on patient management. Combining both described NGS technologies would compensate for the limitations of WES for disease-specific application while offering the opportunity for novel gene discovery in the research setting. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Isolation and characterization of a water stress-specific genomic gene, pwsi 18, from rice.
Joshee, N; Kisaka, H; Kitagawa, Y
1998-01-01
One of the water stress-specific cDNA clones of rice characterised previously, wsi18, was selected for further study. The wsi18 gene can be induced by water stress conditions such as mannitol, NaCl, and dryness, but not by ABA, cold, or heat. A genomic clone for wsi18, pwsi18, contained about 1.7 kbp of the 5' upstream sequence, two introns, and the full coding sequence. The 5'-upstream sequence of pwsi18 contained putative cis-acting elements, namely an ABA-responsive element (ABRE), three G-boxes, three E-boxes, a MEF-2 sequence, four direct and two inverted repeats, and four sequences similar to DRE, which is involved in the dehydration response of Arabidopsis genes. The gusA reporter gene under the control of the pwsi18 promoter showed transient expression in response to water stress. Deletion of the downstream DRE-like sequence between the distal G-boxes-2 and -3 resulted in rather low GUS expression.
GeneChip™ screening assay for cystic fibrosis mutations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cronn, M.T.; Miyada, C.G.; Fucini, R.V.
1994-09-01
GeneChip™ assays are based on high density, carefully designed arrays of short oligonucleotide probes (13-16 bases) built directly on derivatized silica substrates. DNA target sequence analysis is achieved by hybridizing fluorescently labeled amplification products to these arrays. Fluorescent hybridization signals located within the probe array are translated into target sequence information using the known probe sequence at each array feature. The mutation screening assay for cystic fibrosis includes sets of oligonucleotide probes designed to detect numerous different mutations that have been described in 14 exons and one intron of the CFTR gene. Each mutation site is addressed by a sub-array of at least 40 probe sequences, half designed to detect the wild type gene sequence and half designed to detect the reported mutant sequence. Hybridization with homozygous mutant, homozygous wild type or heterozygous targets results in distinctive hybridization patterns within a sub-array, permitting specific discrimination of each mutation. The GeneChip probe arrays are very small (approximately 1 cm²). Their miniature size coupled with their high information content makes GeneChip probe arrays a useful and practical means for providing CF mutation analysis in a clinical setting.
Gene: a gene-centered information resource at NCBI.
Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D
2015-01-01
The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
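The abstract above notes that Gene records carry integer identifiers and can be retrieved through the Entrez programming utilities. As a minimal sketch, the snippet below only builds an ESummary request URL for one record and makes no network call. The helper name `gene_summary_url` and the identifier 1080 are illustrative assumptions, not taken from the abstract; the base address and the `db`, `id` and `retmode` parameters follow the public E-utilities conventions.

```python
from urllib.parse import urlencode

# Public E-utilities base address.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_summary_url(gene_id: int) -> str:
    """Build an ESummary URL for a single Gene record (no request is sent)."""
    if gene_id <= 0:
        raise ValueError("Gene identifiers are positive integers")
    query = urlencode({"db": "gene", "id": gene_id, "retmode": "json"})
    return f"{EUTILS_BASE}/esummary.fcgi?{query}"


# 1080 is used here purely as an example integer identifier.
url = gene_summary_url(1080)
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return the record summary as JSON; NCBI's usage guidelines on request rates and API keys apply to any real client.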
Lenarduzzi, S; Morgutti, M; Crovella, S; Coiana, A; Rosatelli, M C
2014-11-14
Cystic fibrosis (CF) is a common recessive genetic disease caused by mutations in the gene encoding the cystic fibrosis transmembrane conductance regulator (CFTR) protein. More than 1800 different mutations have been described to date. Here, we report 3 novel mutations in CFTR in 3 Italian CF patients. To detect and identify 36 frequent mutations in Caucasians, we used the INNO-LiPA CFTR19 and INNO-LiPA CFTR17+Tn Update kits (Innogenetics; Ghent, Belgium). Our first analysis did not reveal both of the responsible mutations; thus, direct sequencing of the CFTR gene coding region was performed. The 3 patients were compound heterozygous. In one allele, the F508del (c.1521_1523delCTT, p.Phe508del) mutation in exon 11 was observed in each case. For the second allele, in patient No. 1, direct sequencing revealed an 11-base pair deletion (GAGGCGATACT) in exon 14 (c.2236_2246del; p.Glu746Alafs*29). In patient No. 2, direct sequencing revealed a nonsense mutation at nucleotide 3892 (c.3892G>T) in exon 24. In patient No. 3, direct sequencing revealed a deletion of cytosine in exon 27 (c.4296delC; p.Asn1432Lysfs*16). These 3 novel mutations indicate the production of a truncated protein, which consequently results in a non-functional polypeptide.
Lagares, Antonio; Hozbor, Daniela F.; Niehaus, Karsten; Otero, Augusto J. L. Pich; Lorenzen, Jens; Arnold, Walter; Pühler, Alfred
2001-01-01
The genetic characterization of a 5.5-kb chromosomal region of Sinorhizobium meliloti 2011 that contains lpsB, a gene required for the normal development of symbiosis with Medicago spp., is presented. The nucleotide sequence of this DNA fragment revealed the presence of six genes: greA and lpsB, transcribed in the forward direction; and lpsE, lpsD, lpsC, and lrp, transcribed in the reverse direction. Except for lpsB, none of the lps genes were relevant for nodulation and nitrogen fixation. Analysis of the transcriptional organization of lpsB showed that greA and lpsB are part of separate transcriptional units, which is in agreement with the finding of a DNA stretch homologous to a “nonnitrogen” promoter consensus sequence between greA and lpsB. The opposite orientation of lpsB with respect to its first downstream coding sequence, lpsE, indicated that the altered LPS and the defective symbiosis of lpsB mutants are both consequences of a primary nonpolar defect in a single gene. Global sequence comparisons revealed that the greA-lpsB and lrp genes of S. meliloti have a genetic organization similar to that of their homologous loci in R. leguminosarum bv. viciae. In particular, high sequence similarity was found between the translation product of lpsB and a core-related biosynthetic mannosyltransferase of R. leguminosarum bv. viciae encoded by the lpcC gene. The functional relationship between these two genes was demonstrated in genetic complementation experiments in which the S. meliloti lpsB gene restored the wild-type LPS phenotype when introduced into lpcC mutants of R. leguminosarum. These results support the view that S. meliloti lpsB also encodes a mannosyltransferase that participates in the biosynthesis of the LPS core. Evidence is provided for the presence of other lpsB-homologous sequences in several members of the family Rhizobiaceae. PMID:11157937
Fukumori, F; Saint, C P
1997-01-01
A 9,233-bp HindIII fragment of the aromatic amine catabolic plasmid pTDN1, isolated from a derivative of Pseudomonas putida mt-2 (UCC22), confers the ability to degrade aniline on P. putida KT2442. The fragment encodes six open reading frames which are arranged in the same direction. Their 5' upstream region is part of the direct-repeat sequence of pTDN1. Nucleotide sequence of 1.8 kb of the repeat sequence revealed only a single base pair change compared to the known sequence of IS1071 which is involved in the transposition of the chlorobenzoate genes (C. Nakatsu, J. Ng, R. Singh, N. Straus, and C. Wyndham, Proc. Natl. Acad. Sci. USA 88:8312-8316, 1991). Four open reading frames encode proteins with considerable homology to proteins found in other aromatic-compound degradation pathways. On the basis of sequence similarity, these genes are proposed to encode the large and small subunits of aniline oxygenase (tdnA1 and tdnA2, respectively), a reductase (tdnB), and a LysR-type regulatory gene (tdnR). The putative large subunit has a conserved [2Fe-2S]R Rieske-type ligand center. Two genes, tdnQ and tdnT, which may be involved in amino group transfer, are localized upstream of the putative oxygenase genes. The tdnQ gene product shares about 30% similarity with glutamine synthetases; however, a pUC-based plasmid carrying tdnQ did not support the growth of an Escherichia coli glnA strain in the absence of glutamine. TdnT possesses domains that are conserved among amidotransferases. The tdnQ, tdnA1, tdnA2, tdnB, and tdnR genes are essential for the conversion of aniline to catechol. PMID:8990291
2011-01-01
Background: Rust fungi are biotrophic basidiomycete plant pathogens that cause major diseases on plants and trees world-wide, affecting agriculture and forestry. Their biotrophic nature precludes many established molecular genetic manipulations and lines of research. The generation of genomic resources for these microbes is leading to novel insights into biology such as interactions with the hosts and guiding directions for breakthrough research in plant pathology. Results: To support gene discovery and gene model verification in the genome of the wheat leaf rust fungus, Puccinia triticina (Pt), we have generated Expressed Sequence Tags (ESTs) by sampling several life cycle stages. We focused on several spore stages and isolated haustorial structures from infected wheat, generating 17,684 ESTs. We produced sequences from both the sexual (pycniospores, aeciospores and teliospores) and asexual (germinated urediniospores) stages of the life cycle. From pycniospores and aeciospores, produced by infecting the alternate host, meadow rue (Thalictrum speciosissimum), 4,869 and 1,292 reads were generated, respectively. We generated 3,703 ESTs from teliospores produced on the senescent primary wheat host. Finally, we generated 6,817 reads from haustoria isolated from infected wheat as well as 1,003 sequences from germinated urediniospores. Along with 25,558 previously generated ESTs, we compiled a database of 13,328 non-redundant sequences (4,506 singlets and 8,822 contigs). Fungal genes were predicted using the EST version of the self-training GeneMarkS algorithm. To refine the EST database, we compared EST sequences by BLASTN to a set of 454 pyrosequencing-generated contigs and Sanger BAC-end sequences derived both from the Pt genome, and to ESTs and genome reads from wheat. A collection of 6,308 fungal genes was identified and compared to sequences of the cereal rusts, Puccinia graminis f. sp. tritici (Pgt) and stripe rust, P. striiformis f. sp. tritici (Pst), and poplar leaf rust Melampsora species, and the corn smut fungus, Ustilago maydis (Um). While extensive homologies were found, many genes appeared novel and species-specific; over 40% of genes did not match any known sequence in existing databases. Focusing on spore stages, direct comparison to Um identified potential functional homologs, possibly allowing heterologous functional analysis in that model fungus. Many potentially secreted protein genes were identified by similarity searches against genes and proteins of Pgt and Melampsora spp., revealing apparent orthologs. Conclusions: The current set of Pt unigenes contributes to gene discovery in this major cereal pathogen and will be invaluable for gene model verification in the genome sequence. PMID:21435244
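The read counts given per life-cycle stage in the abstract above can be tallied to confirm that they add up to the stated totals. This is a bookkeeping sketch only; the dictionary keys and variable names are illustrative labels rather than identifiers from the study.

```python
# Newly generated ESTs per sampled stage, as quoted in the abstract.
new_ests_by_stage = {
    "pycniospores": 4869,
    "aeciospores": 1292,
    "teliospores": 3703,
    "haustoria": 6817,
    "germinated urediniospores": 1003,
}
PREVIOUS_ESTS = 25558          # previously generated ESTs
SINGLETS, CONTIGS = 4506, 8822  # non-redundant sequence classes

total_new_ests = sum(new_ests_by_stage.values())
pooled_ests = total_new_ests + PREVIOUS_ESTS
nonredundant = SINGLETS + CONTIGS

print(total_new_ests, pooled_ests, nonredundant)
```

The sums reproduce the 17,684 new ESTs and the 13,328 non-redundant sequences reported; the pooled figure is simply the new and previous ESTs added together and is not itself stated in the abstract.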
Brugnara, Milena; Gaudino, Rossella; Tedeschi, Silvana; Syrèn, Marie-Louise; Perrotta, Silverio; Maines, Evelina; Zaffanello, Marco
2014-09-01
We report the case of an infant boy with polyuria and a familial history of central diabetes insipidus. Laboratory blood tests disclosed hypokalemia, metabolic alkalosis, hyperreninemia, and hyperaldosteronism. Plasma magnesium concentration was slightly low. Urine analysis showed hypercalciuria, hyposthenuria, and high excretion of potassium. Such findings oriented toward type III Bartter syndrome (BSIII). Direct sequencing of the CLCNKB gene revealed no disease-causing mutations. The water deprivation test was positive. Magnetic resonance imaging showed a lack of posterior pituitary hyperintensity. Finally, direct sequencing of the AVP-NPII gene showed a point mutation (c.1884G>A) in a heterozygous state, confirming an autosomal dominant familial neurohypophyseal diabetes insipidus (adFNDI). This condition did not explain the patient's phenotype; thus, we investigated for Gitelman syndrome (GS). A direct sequencing of the SLC12A3 gene showed c.269A>C and c.1205C>A new mutations. In conclusion, the patient had a genetic combination of GS and adFNDI with a BSIII-like phenotype.
Evidence for Horizontal Gene Transfer in Evolution of Elongation Factor Tu in Enterococci
Ke, Danbing; Boissinot, Maurice; Huletsky, Ann; Picard, François J.; Frenette, Johanne; Ouellette, Marc; Roy, Paul H.; Bergeron, Michel G.
2000-01-01
The elongation factor Tu, encoded by tuf genes, is a GTP binding protein that plays a central role in protein synthesis. One to three tuf genes per genome are present, depending on the bacterial species. Most low-G+C-content gram-positive bacteria carry only one tuf gene. We have designed degenerate PCR primers derived from consensus sequences of the tuf gene to amplify partial tuf sequences from 17 enterococcal species and other phylogenetically related species. The amplified DNA fragments were sequenced either by direct sequencing or by sequencing cloned inserts containing putative amplicons. Two different tuf genes (tufA and tufB) were found in 11 enterococcal species, including Enterococcus avium, Enterococcus casseliflavus, Enterococcus dispar, Enterococcus durans, Enterococcus faecium, Enterococcus gallinarum, Enterococcus hirae, Enterococcus malodoratus, Enterococcus mundtii, Enterococcus pseudoavium, and Enterococcus raffinosus. For the other six enterococcal species (Enterococcus cecorum, Enterococcus columbae, Enterococcus faecalis, Enterococcus sulfureus, Enterococcus saccharolyticus, and Enterococcus solitarius), only the tufA gene was present. Based on 16S rRNA gene sequence analysis, the 11 species having two tuf genes all have a common ancestor, while the six species having only one copy diverged from the enterococcal lineage before that common ancestor. The presence of one or two copies of the tuf gene in enterococci was confirmed by Southern hybridization. Phylogenetic analysis of tuf sequences demonstrated that the enterococcal tufA gene branches with the Bacillus, Listeria, and Staphylococcus genera, while the enterococcal tufB gene clusters with the genera Streptococcus and Lactococcus. Primary structure analysis showed that four amino acid residues encoded within the sequenced regions are conserved and unique to the enterococcal tufB genes and the tuf genes of streptococci and Lactococcus lactis. 
The data suggest that an ancestral streptococcus or a streptococcus-related species may have horizontally transferred a tuf gene to the common ancestor of the 11 enterococcal species which now carry two tuf genes. PMID:11092850
Polycomb repressive complex 1 modifies transcription of active genes
Pherson, Michelle; Misulovin, Ziva; Gause, Maria; Mihindukulasuriya, Kathie; Swain, Amanda; Dorsett, Dale
2017-01-01
This study examines the role of Polycomb repressive complex 1 (PRC1) at active genes. The PRC1 and PRC2 complexes are crucial for epigenetic silencing during development of an organism. They are recruited to Polycomb response elements (PREs) and establish silenced domains over several kilobases. Recent studies show that PRC1 is also directly recruited to active genes by the cohesin complex. Cohesin participates broadly in control of gene transcription, but it is unknown whether cohesin-recruited PRC1 also plays a role in transcriptional control of active genes. We address this question using genome-wide RNA sequencing (RNA-seq) and chromatin immunoprecipitation sequencing (ChIP-seq). The results show that PRC1 influences transcription of active genes, and a significant fraction of its effects are likely direct. The roles of different PRC1 subunits can also vary depending on the gene. Depletion of PRC1 subunits by RNA interference alters phosphorylation of RNA polymerase II (Pol II) and occupancy by the Spt5 pausing-elongation factor at most active genes. These effects on Pol II phosphorylation and Spt5 are likely linked to changes in elongation and RNA processing detected by nascent RNA-seq, although the mechanisms remain unresolved. The experiments also reveal that PRC1 facilitates association of Spt5 with enhancers and PREs. Reduced Spt5 levels at these regulatory sequences upon PRC1 depletion coincide with changes in Pol II occupancy and phosphorylation. Our findings indicate that, in addition to its repressive roles in epigenetic gene silencing, PRC1 broadly influences transcription of active genes and may suppress transcription of nonpromoter regulatory sequences. PMID:28782042
Thauvin-Robinet, Christel; Franco, Brunella; Saugier-Veber, Pascale; Aral, Bernard; Gigot, Nadège; Donzel, Anne; Van Maldergem, Lionel; Bieth, Eric; Layet, Valérie; Mathieu, Michèle; Teebi, Ahmad; Lespinasse, James; Callier, Patrick; Mugneret, Francine; Masurel-Paulet, Alice; Gautier, Elodie; Huet, Frédéric; Teyssier, Jean-Raymond; Tosi, Mario; Frébourg, Thierry; Faivre, Laurence
2009-02-01
Oral-facial-digital type I syndrome (OFDI) is characterised by an X-linked dominant mode of inheritance with lethality in males. Clinical features include facial dysmorphism with oral, dental and distal abnormalities, polycystic kidney disease and central nervous system malformations. Considerable allelic heterogeneity has been reported within the OFD1 gene, but DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene remains negative in more than 20% of cases. We hypothesized that genomic rearrangements could account for the majority of the remaining undiagnosed cases. Thus, we took advantage of two independent available series of patients with OFDI syndrome and negative DNA bi-directional sequencing of the exons and intron-exon boundaries of the OFD1 gene from two different European labs: 13/36 cases from the French lab; 13/95 from the Italian lab. All patients were screened by a semiquantitative fluorescent multiplex method (QFMPSF) and relative quantification by real-time PCR (qPCR). Six OFD1 genomic deletions (exon 5, exons 1-8, exons 1-14, exons 10-11, exons 13-23 and exon 17) were identified, accounting for 5% of OFDI patients and for 23% of patients with negative mutation screening by DNA sequencing. The association of DNA direct sequencing, QFMPSF and qPCR detects OFD1 alteration in up to 85% of patients with a phenotype suggestive of OFDI syndrome. Given the average percentage of large genomic rearrangements (5%), we suggest that dosage methods should be performed in addition to DNA direct sequencing analysis to exclude the involvement of the OFD1 transcript when there are genetic counselling issues. (c) 2008 Wiley-Liss, Inc.
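The percentages in the abstract above follow from the cohort counts if "13/36" and "13/95" are read as sequencing-negative cases out of all cases in each series. That reading is an assumption made for this sketch, and all variable names are illustrative.

```python
# Cohort counts quoted in the abstract (assumed: negatives / total per lab).
french_negative, french_total = 13, 36
italian_negative, italian_total = 13, 95
deletions_found = 6  # OFD1 genomic deletions identified by dosage methods

total_patients = french_total + italian_total
sequencing_negative = french_negative + italian_negative
sequencing_positive = total_patients - sequencing_negative


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer."""
    return round(100 * part / whole)


deletion_share_all = pct(deletions_found, total_patients)
deletion_share_negative = pct(deletions_found, sequencing_negative)
combined_detection = pct(sequencing_positive + deletions_found, total_patients)

print(deletion_share_all, deletion_share_negative, combined_detection)
```

Under this reading the three values match the 5%, 23% and "up to 85%" figures given in the abstract.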
Rooting Gene Trees without Outgroups: EP Rooting
Sinsheimer, Janet S.; Little, Roderick J. A.; Lake, James A.
2012-01-01
Gene sequences are routinely used to determine the topologies of unrooted phylogenetic trees, but many of the most important questions in evolution require knowing both the topologies and the roots of trees. However, general algorithms for calculating rooted trees from gene and genomic sequences in the absence of gene paralogs are few. Using the principles of evolutionary parsimony (EP) (Lake JA. 1987a. A rate-independent technique for analysis of nucleic acid sequences: evolutionary parsimony. Mol Biol Evol. 4:167–181) and its extensions (Cavender, J. 1989. Mechanized derivation of linear invariants. Mol Biol Evol. 6:301–316; Nguyen T, Speed TP. 1992. A derivation of all linear invariants for a nonbalanced transversion model. J Mol Evol. 35:60–76), we explicitly enumerate all linear invariants that solely contain rooting information and derive algorithms for rooting gene trees directly from gene and genomic sequences. These new EP linear rooting invariants allow one to determine rooted trees, even in the complete absence of outgroups and gene paralogs. EP rooting invariants are explicitly derived for three taxon trees, and rules for their extension to four or more taxa are provided. The method is demonstrated using 18S ribosomal DNA to illustrate how the new animal phylogeny (Aguinaldo AMA et al. 1997. Evidence for a clade of nematodes, arthropods, and other moulting animals. Nature 387:489–493; Lake JA. 1990. Origin of the metazoa. Proc Natl Acad Sci USA 87:763–766) may be rooted directly from sequences, even when they are short and paralogs are unavailable. These results are consistent with the current root (Philippe H et al. 2011. Acoelomorph flatworms are deuterostomes related to Xenoturbella. Nature 470:255–260). PMID:22593551
Ernst, J F; Stewart, J W; Sherman, F
1981-01-01
DNA sequence analysis of a cloned fragment directly established that the cyc1-11 mutation of iso-1-cytochrome c in the yeast Saccharomyces cerevisiae is a two-base-pair substitution that changes the CCA proline codon at amino acid position 76 to a UAA nonsense codon. Analysis of 11 revertant proteins and one cloned revertant gene showed that reversion of the cyc1-11 mutation can occur in three ways: a single base-pair substitution, which produces a serine replacement at position 76; recombination with the nonallelic CYC7 gene of iso-2-cytochrome c, which causes replacement of a segment in the cyc1-11 gene by the corresponding segment of the CYC7 gene; and either a two-base-pair substitution or recombination with the CYC7 gene, which causes the formation of the normal iso-1-cytochrome c sequence. These results demonstrate the occurrence of low frequencies of recombination between nonallelic genes having extensive but not complete homology. The formation of composite genes that share sequences from nonallelic genes may be an evolutionary mechanism for producing protein diversities and for maintaining identical sequences at different loci. Images PMID:6273865
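The codon changes described in the abstract above can be checked against the standard genetic code. The sketch below counts the base differences between the proline codon CCA and the nonsense codon UAA, then enumerates every single-base substitution of UAA to see which ones encode serine. The function names are illustrative; the serine codon set is the standard one.

```python
# Standard-code serine codons (RNA alphabet).
SERINE_CODONS = {"UCU", "UCC", "UCA", "UCG", "AGU", "AGC"}
BASES = "UCAG"


def base_differences(codon_a: str, codon_b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(a != b for a, b in zip(codon_a, codon_b))


def single_substitutions(codon: str) -> list[str]:
    """All codons reachable from `codon` by changing exactly one base."""
    return [
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    ]


mutation_steps = base_differences("CCA", "UAA")
serine_revertants = [c for c in single_substitutions("UAA") if c in SERINE_CODONS]

print(mutation_steps, serine_revertants)
```

Consistent with the abstract, CCA and UAA differ at two positions, so restoring the normal proline codon needs a two-base change, while a single substitution can yield serine (the only such codon under the standard code is UCA).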
Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry
2006-08-31
Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats > or = 30 bp with a sequence identity > or = 90%. Phylogenetic analysis of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. 
These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements.
Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)
Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing
1998-01-01
The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hiraiwa, Akikazu; Yamanaka, Katsuo; Kwok, W.W.
Although HLA genes have been shown to be associated with certain diseases, the basis for this association is unknown. Recent studies, however, have documented patterns of nucleotide sequence variation among some HLA genes associated with a particular disease. For rheumatoid arthritis, HLA genes in most patients have a shared nucleotide sequence encoding a key structural element of an HLA class II polypeptide; this sequence element is critical for the interaction of the HLA molecule with antigenic peptides and with responding T cells, suggestive of a direct role for this sequence element in disease susceptibility. The authors describe the serological and cellular immunologic characteristics encoded by this rheumatoid arthritis-associated sequence element. Site-directed mutagenesis of the DRB1 gene was used to define amino acids critical for antibody and T-cell recognition of this structural element, focusing on residues that distinguish the rheumatoid arthritis-associated alleles Dw4 and Dw14 from a closely related allele, Dw10, not associated with disease. Both the gain and loss of rheumatoid arthritis-associated epitopes were highly dependent on three residues within a discrete domain of the HLA-DR molecule. Recognition was most strongly influenced by the following amino acids (in order): 70 > 71 > 67. Some alloreactive T-cell clones were also influenced by amino acid variation in portions of the DR molecule lying outside the shared sequence element.
Arthropod genomic resources for the 21st century
USDA-ARS's Scientific Manuscript database
Genome references are foundational for high quality entomological research today. Species, sub populations and taxonomy are defined by gene flow and genome sequences. Gene content in arthropods is often directly reflective of life history, for example, diet and symbiont related gene loss is observed...
Lescat, Mathilde; Hoede, Claire; Clermont, Olivier; Garry, Louis; Darlu, Pierre; Tuffery, Pierre; Denamur, Erick; Picard, Bertrand
2009-12-29
Previous studies have established a correlation between electrophoretic polymorphism of esterase B, and virulence and phylogeny of Escherichia coli. Strains belonging to the phylogenetic group B2 are more frequently implicated in extraintestinal infections and include esterase B2 variants, whereas phylogenetic groups A, B1 and D contain less virulent strains and include esterase B1 variants. We investigated esterase B as a marker of phylogeny and/or virulence, in a thorough analysis of the esterase B-encoding gene. We identified the gene encoding esterase B as the acetyl-esterase gene (aes) using gene disruption. The analysis of aes nucleotide sequences in a panel of 78 reference strains, including the E. coli reference (ECOR) strains, demonstrated that the gene is under purifying selection. The phylogenetic tree reconstructed from aes sequences showed a strong correlation with the species phylogenetic history, based on multi-locus sequence typing using six housekeeping genes. The unambiguous distinction between variants B1 and B2 by electrophoresis was consistent with Aes amino-acid sequence analysis and protein modelling, which showed that substituted amino acids in the two esterase B variants occurred mostly at different sites on the protein surface. Studies in an experimental mouse model of septicaemia using mutant strains did not reveal a direct link between aes and extraintestinal virulence. Moreover, we did not find any genes in the chromosomal region of aes to be associated with virulence. Our findings suggest that aes does not play a direct role in the virulence of E. coli extraintestinal infection. However, this gene acts as a powerful marker of phylogeny, illustrating the extensive divergence of B2 phylogenetic group strains from the rest of the species.
Identification of presumed ancestral DNA sequences of phaseolin in Phaseolus vulgaris.
Kami, J; Velásquez, V B; Debouck, D G; Gepts, P
1995-01-01
Common bean (Phaseolus vulgaris) consists of two major geographic gene pools, one distributed in Mexico, Central America, and Colombia and the other in the southern Andes (southern Peru, Bolivia, and Argentina). Amplification and sequencing of members of the multigene family coding for phaseolin, the major seed storage protein of the common bean, provide evidence for accumulation of tandem direct repeats in both introns and exons during evolution of the multigene family in this species. The presumed ancestral phaseolin sequences, without tandem repeats, were found in recently discovered but nearly extinct wild common bean populations of Ecuador and northern Peru that are intermediate between the two major gene pools of the species based on geographical and molecular arguments. Our results illustrate the usefulness of tandem direct repeats in establishing the polarity of DNA sequence divergence and therefore in proposing phylogenies. PMID:7862642
Internal control regions for transcription of eukaryotic tRNA genes.
Sharp, S; DeFranco, D; Dingermann, T; Farrell, P; Söll, D
1981-01-01
We have identified the region within a eukaryotic tRNA gene required for initiation of transcription. These results were obtained by systematically constructing deletions extending from the 5' or the 3' flanking regions into a cloned Drosophila tRNAArg gene by using nuclease BAL 31. The ability of the newly generated deletion clones to direct the in vitro synthesis of tRNA precursors was measured in transcription systems from Xenopus laevis oocytes, Drosophila Kc cells, and HeLa cells. Two control regions within the coding sequence were identified. The first was essential for transcription and was contained between nucleotides 8 and 25 of the mature tRNA sequence. Genes devoid of the second control region, which was contained between nucleotides 50 and 58 of the mature tRNA sequence, could be transcribed but with reduced efficiency. Thus, the promoter regions within a tRNA gene encode the tRNA sequences of the D stem and D loop, the invariant uridine at position 8, and the semi-invariant G-T-psi-C sequence. PMID:6947245
Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua
2018-01-01
Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophylies of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.
Gritz, L; Davies, J
1983-11-01
The plasmid-borne gene hph coding for hygromycin B phosphotransferase (HPH) in Escherichia coli has been identified and its nucleotide sequence determined. The hph gene is 1026 nucleotides long, coding for a protein with a predicted Mr of 39 000. The hph gene was placed in a shuttle plasmid vector, downstream from the promoter region of the cyc 1 gene of Saccharomyces cerevisiae, and an hph construction containing a single AUG in the 5' noncoding region allowed direct selection following transformation in yeast and in E. coli. Thus the hph gene can be used in cloning vectors for both pro- and eukaryotes.
Oishi, M; Gohma, H; Lejukole, H Y; Taniguchi, Y; Yamada, T; Suzuki, K; Shinkai, H; Uenishi, H; Yasue, H; Sasaki, Y
2004-05-01
Expressed sequence tags (ESTs) generated based on characterization of clones isolated randomly from cDNA libraries are used to study gene expression profiles in specific tissues and to provide useful information for characterizing tissue physiology. In this study, two directionally cloned cDNA libraries were constructed from 60 day-old bovine whole fetus and fetal placenta. We have characterized 5357 and 1126 clones, and then identified 3464 and 795 unique sequences for the fetus and placenta cDNA libraries: 1851 and 504 showed homology to already identified genes, and 1613 and 291 showed no significant matches to any of the sequences in DNA databases, respectively. Further, we found 94 unique sequences overlapping in both the fetus and the placenta, leading to a catalog of 4165 genes expressed in 60 day-old fetus and placenta. The catalog is used to examine expression profile of genes in 60 day-old bovine fetus and placenta.
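The catalog size quoted in this abstract follows from inclusion-exclusion over the two libraries' unique sequences. A minimal sketch of that arithmetic is given below; the function name is hypothetical and is not taken from the paper.

```python
def merged_catalog_size(unique_a: int, unique_b: int, shared: int) -> int:
    """Size of the union of two sets, given their sizes and their overlap."""
    if shared > min(unique_a, unique_b):
        raise ValueError("overlap cannot exceed the smaller set")
    return unique_a + unique_b - shared


# Figures from the abstract: fetus library, placenta library, sequences in both.
print(merged_catalog_size(3464, 795, 94))  # 4165
```

The same bookkeeping confirms the per-library breakdown: 1851 + 1613 = 3464 for the fetus library and 504 + 291 = 795 for the placenta library.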
Wang, Qiuyan; Wu, Huili; Wang, Anming; Du, Pengfei; Pei, Xiaolin; Li, Haifeng; Yin, Xiaopu; Huang, Lifeng; Xiong, Xiaolong
2010-01-01
DNA family shuffling is a powerful method for enzyme engineering, which utilizes recombination of naturally occurring functional diversity to accelerate laboratory-directed evolution. However, the use of this technique has been hindered by the scarcity of family genes with the required level of sequence identity in the genome database. We describe here a strategy for collecting metagenomic homologous genes for DNA shuffling from environmental samples by truncated metagenomic gene-specific PCR (TMGS-PCR). Using identified metagenomic gene-specific primers, twenty-three 921-bp truncated lipase gene fragments, which shared 64–99% identity with each other and formed a distinct subfamily of lipases, were retrieved from 60 metagenomic samples. These lipase genes were shuffled, and selected active clones were characterized. The chimeric clones show extensive functional and genetic diversity, as demonstrated by functional characterization and sequence analysis. Our results indicate that homologous sequences of genes captured by TMGS-PCR can be used as suitable genetic material for DNA family shuffling with broad applications in enzyme engineering. PMID:20962349
Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling
Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien
2012-01-01
The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697
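To illustrate what resolving a heterozygous base call against a reference can look like, here is a minimal sketch. It is not the MSR algorithm, which the abstract does not specify. It assumes the mixed trace has already been reduced to a string of IUPAC ambiguity codes, that only substitutions are present, and that the reference is the same length. The function name is hypothetical.

```python
# Standard IUPAC codes for two-base mixtures.
IUPAC_PAIRS = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def split_heterozygous(mixed: str, reference: str) -> tuple:
    """Split mixed base calls into a reference-like and an alternate sequence.

    Substitutions only: indels shift one trace against the other and
    would need a real alignment step, which this sketch omits.
    """
    if len(mixed) != len(reference):
        raise ValueError("mixed and reference must be the same length")
    ref_like, alt = [], []
    for m, r in zip(mixed.upper(), reference.upper()):
        pair = IUPAC_PAIRS.get(m)
        if pair is None:        # unambiguous call: both sequences share it
            ref_like.append(m)
            alt.append(m)
        elif r in pair:         # one of the two bases matches the reference
            ref_like.append(r)
            alt.append(pair.replace(r, ""))
        else:                   # neither base matches; order is arbitrary
            ref_like.append(pair[0])
            alt.append(pair[1])
    return "".join(ref_like), "".join(alt)


print(split_heterozygous("ACRTY", "ACGTC"))  # ('ACGTC', 'ACATT')
```

Note that the two returned strings are not phased haplotypes: with more than one heterozygous site, the chromatogram alone does not say which alternate bases travel together.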
Nield, Blair S.; Willows, Robert D.; Torda, Andrew E.; Gillings, Michael R.; Holmes, Andrew J.; Nevalainen, K.M. Helena; Stokes, H.W.; Mabbutt, Bridget C.
2004-01-01
By targeting gene cassettes by polymerase chain reaction (PCR) directly from environmentally derived DNA, we are able to amplify entire open reading frames (ORFs) independently of prior sequence knowledge. Approximately 10% of the mobile genes recovered by these means can be attributed to known protein families. Here we describe the characterization of two ORFs which show moderate homology to known proteins: (1) an aminoglycoside phosphotransferase displaying 25% sequence identity with APH(7″) from Streptomyces hygroscopicus, and (2) an RNA methyltransferase sharing 25%–28% identity with a group of recently defined bacterial RNA methyltransferases distinct from the SpoU enzyme family. Our novel genes were expressed as recombinant products and assayed for appropriate enzyme activity. The aminoglycoside phosphotransferase displayed ATPase activity, consistent with the presence of characteristic Mg2+-binding residues. Unlike related APH(4) or APH(7″) enzymes, however, this activity was not enhanced by hygromycin B or kanamycin, suggesting the normal substrate to be a different aminoglycoside. The RNA methyltransferase contains sequence motifs of the RNA methyltransferase superfamily, and our recombinant version showed methyltransferase activity with RNA. Our data confirm that gene cassettes present in the environment encode folded enzymes with novel sequence variation and demonstrable catalytic activity. Our PCR approach (cassette PCR) may be used to identify a diverse range of ORFs from any environmental sample, as well as to directly access the gene pool found in mobile gene cassettes commonly associated with integrons. This gene pool can be accessed from both cultured and uncultured microbial samples as a source of new enzymes and proteins. PMID:15152095
DISSECTING THE GENETICS OF HUMAN HIGH MYOPIA: A MOLECULAR BIOLOGIC APPROACH
Young, Terri L
2004-01-01
Purpose: Despite the plethora of experimental myopia animal studies that demonstrate biochemical factor changes in various eye tissues, and limited human studies utilizing pharmacologic agents to thwart axial elongation, we have little knowledge of the basic physiology that drives myopic development. Identifying the implicated genes for myopia susceptibility will provide a fundamental molecular understanding of how myopia occurs and may lead to directed physiologic (ie, pharmacologic, gene therapy) interventions. The purpose of this proposal is to describe the results of positional candidate gene screening of selected genes within the autosomal dominant high-grade myopia-2 locus (MYP2) on chromosome 18p11.31. Methods: A physical map of a contracted MYP2 interval was compiled, and gene expression studies in ocular tissues using complementary DNA library screens, microarray matches, and reverse-transcription techniques aided in prioritizing gene selection for screening. The TGIF, EMLIN-2, MLCB, and CLUL1 genes were screened in DNA samples from unrelated controls and in high-myopia affected and unaffected family members from the original seven MYP2 pedigrees. All candidate genes were screened by direct base pair sequence analysis. Results: Consistent segregation of a gene sequence alteration (polymorphism) with myopia was not demonstrated in any of the seven families. Novel single nucleotide polymorphisms were found. Conclusion: The positional candidate genes TGIF, EMLIN-2, MLCB, and CLUL1 are not associated with MYP2-linked high-grade myopia. Base change polymorphisms discovered with base sequence screening of these genes were submitted to an Internet database. Other genes that also map within the interval are currently undergoing mutation screening. PMID:15747770
FunGene: the functional gene pipeline and repository.
Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R
2013-01-01
Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
González-Escalona, Narjol; Blackstone, George M.; DePaola, Angelo
2006-01-01
A Vibrio strain isolated from Alaskan oysters and classified by its biochemical characteristics as Vibrio alginolyticus possessed a thermostable direct hemolysin-related hemolysin (trh) gene previously reported only in Vibrio parahaemolyticus. This trh-like gene was cloned and sequenced and was 98% identical to the trh2 gene of V. parahaemolyticus. This gene seems to be functional since it was transcriptionally active in early-stationary-phase growing cells. To our knowledge, this is the first report of V. alginolyticus possessing a trh gene. PMID:17056701
Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.
2016-01-01
MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3 fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769
Richert, Kathrin; Brambilla, Evelyne; Stackebrandt, Erko
2005-01-01
PCR primer sets were developed for the specific amplification and sequence analysis of the gene encoding gyrase subunit B (gyrB) in members of the family Microbacteriaceae, class Actinobacteria. The family contains species highly related by 16S rRNA gene sequence analyses. In order to test whether gyrB gene sequence analysis is appropriate to discriminate between closely related species, we evaluated the 16S rRNA gene phylogeny of its members. As the published universal primer set for gyrB failed to amplify the corresponding gene in the majority of the 80 type strains of the family, three new primer sets were identified that generated fragments with a composite sequence length of about 900 nt. However, the amplification of all three fragments was successful in only 25% of the 80 type strains. In this study, the substitution frequencies in genes encoding gyrase and 16S rDNA were compared for 10 strains of nine genera. The frequency of gyrB nucleotide substitution is significantly higher than that of the 16S rDNA, and no linear correlation exists between the similarities of both molecules among members of the Microbacteriaceae. The phylogenetic analyses using the gyrB sequences provide higher resolution than those using 16S rDNA sequences and seem able to discriminate between closely related species.
The LAM-PCR Method to Sequence LV Integration Sites.
Wang, Wei; Bartholomae, Cynthia C; Gabriel, Richard; Deichmann, Annette; Schmidt, Manfred
2016-01-01
Integrating viral gene transfer vectors are commonly used gene delivery tools in clinical gene therapy trials providing stable integration and continuous gene expression of the transgene in the treated host cell. However, integration of the reverse-transcribed vector DNA into the host genome is a potentially mutagenic event that may directly contribute to unwanted side effects. A comprehensive and accurate analysis of the integration site (IS) repertoire is indispensable to study clonality in transduced cells obtained from patients undergoing gene therapy and to identify potential in vivo selection of affected cell clones. To date, next-generation sequencing (NGS) of vector-genome junctions allows sophisticated studies on the integration repertoire in vitro and in vivo. We have explored the use of the Illumina MiSeq Personal Sequencer platform to sequence vector ISs amplified by non-restrictive linear amplification-mediated PCR (nrLAM-PCR) and LAM-PCR. MiSeq-based high-quality IS sequence retrieval is accomplished by the introduction of a double-barcode strategy that substantially minimizes the frequency of IS sequence collisions compared to the conventionally used single-barcode protocol. Here, we present an updated protocol of (nr)LAM-PCR for the analysis of lentiviral IS using a double-barcode system and followed by deep sequencing using the MiSeq device.
Comparative RNA sequencing reveals substantial genetic variation in endangered primates
Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav
2012-01-01
Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615
Identification of the genomic locus for the human Rieske Fe-S Protein gene on Chromosome 19q12
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pennacchio, L.A.
1994-05-06
We have identified the chromosomal location of the human Rieske Iron-Sulfur Protein (UQCRFS1) gene. Mapping by hybridization to a panel of monochromosomal hybrid cell lines indicated that the gene was either on chromosome 19 or 22. By screening a human chromosome 19 specific genomic cosmid library with an oligonucleotide probe made from the published Rieske cDNA sequence, we identified a corresponding cosmid. Portions of this cosmid were sequenced directly. The exon, exon:intron junction, and flanking sequences verified that this cosmid contains the genomic locus. Fluorescent in situ hybridization (FISH) was performed to localize this cosmid to chromosome band 19q12.
Parallel gene analysis with allele-specific padlock probes and tag microarrays
Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats
2003-01-01
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Louis, Ed
2011-01-01
In the early days of the yeast genome sequencing project, gene annotation was in its infancy and suffered the problem of many false positive annotations as well as missed genes. The lack of other sequences for comparison also prevented the annotation of conserved, functional sequences that were not coding. We are now in an era of comparative genomics where many closely related as well as more distantly related genomes are available for direct sequence and synteny comparisons allowing for more probable predictions of genes and other functional sequences due to conservation. We also have a plethora of functional genomics data which helps inform gene annotation for previously uncharacterised open reading frames (ORFs)/genes. For Saccharomyces cerevisiae this has resulted in a continuous updating of the gene and functional sequence annotations in the reference genome helping it retain its position as the best characterized eukaryotic organism's genome. A single reference genome for a species does not accurately describe the species and this is quite clear in the case of S. cerevisiae where the reference strain is not ideal for brewing or baking due to missing genes. Recent surveys of numerous isolates, from a variety of sources, using a variety of technologies have revealed a great deal of variation amongst isolates with genome sequence surveys providing information on novel genes, undetectable by other means. We now have a better understanding of the extant variation in S. cerevisiae as a species as well as some idea of how much we are missing from this understanding. As with gene annotation, comparative genomics enhances the discovery and description of genome variation and is providing us with the tools for understanding genome evolution, adaptation and selection, and underlying genetics of complex traits.
Yu, Ron X.; Liu, Jie; True, Nick; Wang, Wei
2008-01-01
A major challenge in the post-genome era is to reconstruct regulatory networks from the biological knowledge accumulated up to date. The development of tools for identifying direct target genes of transcription factors (TFs) is critical to this endeavor. Given a set of microarray experiments, a probabilistic model called TRANSMODIS has been developed which can infer the direct targets of a TF by integrating sequence motif, gene expression and ChIP-chip data. The performance of TRANSMODIS was first validated on a set of transcription factor perturbation experiments (TFPEs) involving Pho4p, a well studied TF in Saccharomyces cerevisiae. TRANSMODIS removed elements of arbitrariness in manual target gene selection process and produced results that concur with one's intuition. TRANSMODIS was further validated on a genome-wide scale by comparing it with two other methods in Saccharomyces cerevisiae. The usefulness of TRANSMODIS was then demonstrated by applying it to the identification of direct targets of DAF-16, a critical TF regulating ageing in Caenorhabditis elegans. We found that 189 genes were tightly regulated by DAF-16. In addition, DAF-16 has differential preference for motifs when acting as an activator or repressor, which awaits experimental verification. TRANSMODIS is computationally efficient and robust, making it a useful probabilistic framework for finding immediate targets. PMID:18350157
The complete mitochondrial genome sequence of Eimeria magna (Apicomplexa: Coccidia).
Tian, Si-Qin; Cui, Ping; Fang, Su-Fang; Liu, Guo-Hua; Wang, Chun-Ren; Zhu, Xing-Quan
2015-01-01
In the present study, we determined the complete mitochondrial DNA (mtDNA) sequence of Eimeria magna from rabbits for the first time, and compared its gene content and genome organization with those of seven Eimeria spp. from domestic chickens. The size of the complete mt genome sequence of E. magna is 6249 bp, which consists of 3 protein-coding genes (cytb, cox1 and cox3), 12 gene fragments for the large subunit (LSU) rRNA, and 7 gene fragments for the small subunit (SSU) rRNA, without transfer RNA genes, in accordance with that of Eimeria spp. from chickens. The putative direction of translation for the three genes (cytb, cox1 and cox3) was the same as that in Eimeria species from domestic chickens. The A + T content of the E. magna mt genome is 65.16% (29.73% A, 35.43% T, 17.09% G and 17.75% C). The E. magna mt genome sequence provides novel mtDNA markers for studying the molecular epidemiology and population genetics of Eimeria spp. and has implications for the molecular diagnosis and control of rabbit coccidiosis.
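The base-composition figures in this abstract are simple per-base percentages, and the A + T content is their sum (29.73% + 35.43% = 65.16%). A minimal sketch of how such figures can be computed from a sequence string follows. The function names are hypothetical and the example sequence is a toy string, not E. magna data.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(seq: str) -> float:
    """A + T content as a percentage."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Toy sequence: 3 A, 3 T, 1 G, 1 C.
print(round(at_content("ATTAGCAT"), 2))  # 75.0
```

Ambiguous symbols such as N are ignored in the denominator here; whether the paper counted them is not stated, so that choice is an assumption.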
Ab initio gene identification in metagenomic sequences
Zhu, Wenhan; Lomsadze, Alexandre; Borodovsky, Mark
2010-01-01
We describe an algorithm for gene identification in DNA sequences derived from shotgun sequencing of microbial communities. Accurate ab initio gene prediction in a short nucleotide sequence of anonymous origin is hampered by uncertainty in model parameters. While several machine learning approaches could be proposed to bypass this difficulty, one effective method is to estimate parameters from dependencies, formed in evolution, between frequencies of oligonucleotides in protein-coding regions and genome nucleotide composition. Original version of the method was proposed in 1999 and has been used since for (i) reconstructing codon frequency vector needed for gene finding in viral genomes and (ii) initializing parameters of self-training gene finding algorithms. With advent of new prokaryotic genomes en masse it became possible to enhance the original approach by using direct polynomial and logistic approximations of oligonucleotide frequencies, as well as by separating models for bacteria and archaea. These advances have increased the accuracy of model reconstruction and, subsequently, gene prediction. We describe the refined method and assess its accuracy on known prokaryotic genomes split into short sequences. Also, we show that as a result of application of the new method, several thousands of new genes could be added to existing annotations of several human and mouse gut metagenomes. PMID:20403810
Digital gene expression for non-model organisms
Hong, Lewis Z.; Li, Jun; Schmidt-Küntzel, Anne; Warren, Wesley C.; Barsh, Gregory S.
2011-01-01
Next-generation sequencing technologies offer new approaches for global measurements of gene expression but are mostly limited to organisms for which a high-quality assembled reference genome sequence is available. We present a method for gene expression profiling called EDGE, or EcoP15I-tagged Digital Gene Expression, based on ultra-high-throughput sequencing of 27-bp cDNA fragments that uniquely tag the corresponding gene, thereby allowing direct quantification of transcript abundance. We show that EDGE is capable of assaying for expression in >99% of genes in the genome and achieves saturation after 6–8 million reads. EDGE exhibits very little technical noise, reveals a large (10^6) dynamic range of gene expression, and is particularly suited for quantification of transcript abundance in non-model organisms where a high-quality annotated genome is not available. In a direct comparison with RNA-seq, both methods provide similar assessments of relative transcript abundance, but EDGE does better at detecting gene expression differences for poorly expressed genes and does not exhibit transcript length bias. Applying EDGE to laboratory mice, we show that a loss-of-function mutation in the melanocortin 1 receptor (Mc1r), recognized as a Mendelian determinant of yellow hair color in many different mammals, also causes reduced expression of genes involved in the interferon response. To illustrate the application of EDGE to a non-model organism, we examine skin biopsy samples from a cheetah (Acinonyx jubatus) and identify genes likely to control differences in the color of spotted versus non-spotted regions. PMID:21844123
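The idea of quantifying transcripts by counting gene-specific tags can be sketched in a few lines. This is an illustration only, not the EDGE software: the function names, the `tag_to_gene` lookup table, and the per-million normalization are assumptions introduced here; only the 27-bp tag length comes from the abstract.

```python
from collections import Counter

TAG_LENGTH = 27  # tag length stated in the abstract


def count_tags(reads, tag_to_gene):
    """Count reads per gene.

    Reads that are not exactly TAG_LENGTH bases long, or whose tag is
    absent from the (hypothetical) tag-to-gene lookup, are skipped.
    """
    counts = Counter()
    for read in reads:
        tag = read.upper()
        if len(tag) != TAG_LENGTH:
            continue
        gene = tag_to_gene.get(tag)
        if gene is not None:
            counts[gene] += 1
    return counts


def tags_per_million(counts):
    """Scale raw tag counts to tags per million assigned reads."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: 1e6 * n / total for gene, n in counts.items()}
```

Because each transcript contributes one tag regardless of its length, a count of this kind does not need a length correction, which is consistent with the abstract's remark that EDGE shows no transcript length bias.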
Primer in Genetics and Genomics, Article 6: Basics of Epigenetic Control.
Fessele, Kristen L; Wright, Fay
2018-01-01
The epigenome is a collection of chemical compounds that attach to and overlay the DNA sequence to direct gene expression. Epigenetic marks do not alter DNA sequence but instead allow or silence gene activity and the subsequent production of proteins that guide the growth and development of an organism, direct and maintain cell identity, and allow for the production of primordial germ cells (PGCs; ova and spermatozoa). The three main epigenetic marks are (1) histone modification, (2) DNA methylation, and (3) noncoding RNA, and each works in a different way to regulate gene expression. This article reviews these concepts and discusses their role in normal functions such as X-chromosome inactivation, epigenetic reprogramming during embryonic development and PGC production, and the clinical example of the imprinting disorders Angelman and Prader-Willi syndromes.
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.
1988-08-01
The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two closely linked eggshell protein genes, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5' end of the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotides. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.
Metatranscriptomics of Soil Eukaryotic Communities.
Yadav, Rajiv K; Bragalini, Claudia; Fraissinet-Tachet, Laurence; Marmeisse, Roland; Luis, Patricia
2016-01-01
Functions expressed by eukaryotic organisms in soil can be specifically studied by analyzing the pool of eukaryotic-specific polyadenylated mRNA directly extracted from environmental samples. In this chapter, we describe two alternative protocols for the extraction of high-quality RNA from soil samples. Total soil RNA or mRNA can be converted to cDNA for direct high-throughput sequencing. Polyadenylated mRNA-derived full-length cDNAs can also be cloned in expression plasmid vectors to constitute soil cDNA libraries, which can be subsequently screened for functional gene categories. Alternatively, the diversity of specific gene families can also be explored following cDNA sequence capture using exploratory oligonucleotide probes.
Nakahata, Yasukazu; Yoshida, Mayumi; Takano, Atsuko; Soma, Haruhiko; Yamamoto, Takuro; Yasuda, Akio; Nakatsu, Toru; Takumi, Toru
2008-01-01
Background The circadian expression of the mammalian clock genes is based on transcriptional feedback loops. Two basic helix-loop-helix (bHLH) PAS (for Period-Arnt-Sim) domain-containing transcriptional activators, CLOCK and BMAL1, are known to regulate gene expression by interacting with a promoter element termed the E-box (CACGTG). The non-canonical E-boxes or E-box-like sequences have also been reported to be necessary for circadian oscillation. Results We report a new cis-element required for cell-autonomous circadian transcription of clock genes. This new element consists of a canonical E-box or a non-canonical E-box and an E-box-like sequence in tandem with the latter with a short interval, 6 base pairs, between them. We demonstrate that both E-box or E-box-like sequences are needed to generate cell-autonomous oscillation. We also verify that the spacing nucleotides with constant length between these 2 E-elements are crucial for robust oscillation. Furthermore, by in silico analysis we conclude that several clock and clock-controlled genes possess a direct repeat of the E-box-like elements in their promoter region. Conclusion We propose a novel possible mechanism regulated by double E-box-like elements, not to a single E-box, for circadian transcriptional oscillation. The direct repeat of the E-box-like elements identified in this study is the minimal required element for the generation of cell-autonomous transcriptional oscillation of clock and clock-controlled genes. PMID:18177499
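The in silico search for two E-elements separated by a fixed spacer can be illustrated with a regular expression. This is a minimal sketch, not the authors' analysis: the function name is hypothetical, and because the abstract does not list the non-canonical or E-box-like sequences, the caller must supply them; only the canonical E-box (CACGTG) and the 6-bp interval come from the abstract.

```python
import re

CANONICAL_EBOX = "CACGTG"  # canonical E-box given in the abstract


def find_tandem_e_elements(seq, first_elements, second_elements, spacer=6):
    """Find a first E-element followed, after `spacer` bases, by a second one.

    `first_elements` and `second_elements` are caller-supplied sets of
    hexamers. Returns a list of (start, first, second) tuples; a lookahead
    is used so that overlapping occurrences are also reported.
    """
    first = "|".join(sorted(set(first_elements)))
    second = "|".join(sorted(set(second_elements)))
    pattern = re.compile(rf"(?=({first})[ACGT]{{{spacer}}}({second}))")
    return [
        (m.start(), m.group(1), m.group(2))
        for m in pattern.finditer(seq.upper())
    ]
```

For example, with `first_elements={CANONICAL_EBOX}` and a placeholder E-box-like hexamer such as `"CACGTT"` (an assumption chosen purely for illustration), the call scans a promoter sequence for the tandem arrangement; changing `spacer` shows how strictly the match depends on the interval length.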
Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S
1999-01-01
A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne 1926) PMID:10471707
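The densities and percentages quoted in this abstract can be cross-checked against one another. The sketch below is only an arithmetic consistency check under the assumption that the rounded figures in the abstract are used as given; the helper names are hypothetical and the exact length of the region is not stated in the abstract.

```python
def implied_region_kb(feature_count: int, kb_per_feature: float) -> float:
    """Region length implied by a feature count and a 'one every N kb' density."""
    return feature_count * kb_per_feature


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Figures taken from the abstract.
PROTEIN_CODING_GENES = 218
TRANSPOSABLE_ELEMENTS = 17

from_genes_kb = implied_region_kb(PROTEIN_CODING_GENES, 13)     # 2834 kb
from_elements_kb = implied_region_kb(TRANSPOSABLE_ELEMENTS, 171)  # 2907 kb
est_match_pct = percent(95, PROTEIN_CODING_GENES)
protein_similarity_pct = percent(144, PROTEIN_CODING_GENES)
```

Both implied lengths fall between 2.8 and 2.95 Mb, in line with the "nearly 3 Mb" stated at the start of the abstract, and the two rounded percentages reproduce the 44% and 66% reported.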
Zhou, Fusheng; Fu, Hongyang; Liu, Linghua; Cui, Yong; Zhang, Zhengzhong; Chang, Ruixue; Yue, Zhen; Yang, Sen; Zhang, Xuejun
2014-09-01
Progressive symmetric erythrokeratodermia (PSEK) is characterized by symmetric and growing erythematous hyperkeratotic patches that appear over the body shortly after birth, particularly on the trunk and limbs, the buttocks, and the face, sometimes together with palmoplantar keratoderma (PPK). Mutations in the GJB2, GJB3, GJB4, GJB6, ARS (Component B), and LOR genes might contribute to PSEK manifestation. This study aimed to identify sequence alterations of these genes in a Chinese PSEK patient with pseudoainhum. Genomic DNA was purified from the patient's peripheral blood. Mutation analysis of target genes was performed by direct sequencing using an ABI 3730 sequencer. No exonic mutations were identified in the aforementioned genes. The result underlines the genetic heterogeneity of PSEK and other related erythrokeratodermas. © 2014 The International Society of Dermatology.
Evaluation of the norrie disease gene in a family with incontinentia pigmenti.
Shastry, B S; Trese, M T
2000-01-01
Incontinentia pigmenti (IP) is an ectodermal multisystem disorder which can affect dental, ocular, cardiac and neurologic structures. The ocular changes of IP can have a very similar appearance to the retinal detachment of X-linked familial exudative vitreoretinopathy, which has been shown to be caused by the mutations in the Norrie disease gene. Therefore, it is of interest to determine whether similar mutations in the gene can account for the retinal pathology in patients with IP. To test our hypothesis, we have analyzed the entire Norrie disease gene for a family with IP, by single strand conformational polymorphism followed by DNA sequencing. The sequencing data revealed no disease-specific sequence alterations. These data suggest that ocular findings of IP are perhaps associated with different genes and there is no direct relationship between the genotype and phenotype. Copyright 2000 S. Karger AG, Basel
Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth; Welch, Kenneth C; Timp, Winston
2018-03-01
Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, the liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and the whole organism moderate energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6 Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism.
USDA-ARS's Scientific Manuscript database
The Brachyspira hyodysenteriae B204 genome sequence revealed three VSH-1 tail genes hvp31, hvp60, and hvp37, in a 3.6 kb cluster. The location and transcription direction of these genes relative to the previously described VSH-1 16.3 kb gene operon indicate that the gene transfer agent VSH-1 has a ...
McKinney, Nancy
2002-01-01
PCR (polymerase chain reaction) primers for the detection of certain Bacillus species, such as Bacillus anthracis. The primers specifically amplify only DNA found in the target species and can distinguish closely related species. Species-specific PCR primers for Bacillus anthracis, Bacillus globigii and Clostridium perfringens are disclosed. The primers are directed to unique sequences within sasp (small acid soluble protein) genes.
Harnessing Whole Genome Sequencing in Medical Mycology.
Cuomo, Christina A
2017-01-01
Comparative genome sequencing studies of human fungal pathogens enable identification of genes and variants associated with virulence and drug resistance. This review describes current approaches, resources, and advances in applying whole genome sequencing to study clinically important fungal pathogens. Genomes for some important fungal pathogens were only recently assembled, revealing gene family expansions in many species and extreme gene loss in one obligate species. The scale and scope of species sequenced is rapidly expanding, leveraging technological advances to assemble and annotate genomes with higher precision. By using iteratively improved reference assemblies or those generated de novo for new species, recent studies have compared the sequence of isolates representing populations or clinical cohorts. Whole genome approaches provide the resolution necessary for comparison of closely related isolates, for example, in the analysis of outbreaks or sampled across time within a single host. Genomic analysis of fungal pathogens has enabled both basic research and diagnostic studies. The increased scale of sequencing can be applied across populations, and new metagenomic methods allow direct analysis of complex samples.
Mining biological databases for candidate disease genes
NASA Astrophysics Data System (ADS)
Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.
2001-07-01
The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has so far produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of these data identified approximately 30,000 - 40,000 human 'genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequences and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.
Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G
1993-01-01
The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. Images PMID:7692231
Dron, M; Hartmann, C; Rode, A; Sevignac, M
1985-01-01
We have characterized a 1.7 kb sequence, containing a tRNA Leu2 gene shared by the ct and mt genomes of Brassica oleracea. The two sequences are completely homologous except in two short regions where two distinct gene conversion events have occurred between two sets of direct repeats leading to the insertion of 5 bp in the T loop of the mt copy of the ct gene. This is the first evidence that gene conversion represents the initial evolutionary step in inactivation of transferred ct genes in the mt genome. We also indicate that organelle DNA transfer by organelle fusion is an ongoing process which could be useful in genetic engineering. PMID:4080548
A novel large deletion mutation of FERMT1 gene in a Chinese patient with Kindler syndrome.
Gao, Ying; Bai, Jin-li; Liu, Xiao-yan; Qu, Yu-jin; Cao, Yan-yan; Wang, Jian-cai; Jin, Yu-wei; Wang, Hong; Song, Fang
2015-11-01
Kindler syndrome (KS; OMIM 173650) is a rare autosomal recessive skin disorder, which results in symptoms including blistering, epidermal atrophy, increased risk of cancer, and poor wound healing. The majority of mutations of the disease-determining gene (FERMT1 gene) are single nucleotide substitutions, including missense mutations, nonsense mutations, etc. Large deletion mutations are seldom reported. To determine the mutation in the FERMT1 gene associated with a 7-year-old Chinese patient who presented clinical manifestation of KS, we performed direct sequencing of all the exons of FERMT1 gene. For the exons 2-6 without amplicons, we analyzed the copy numbers using quantitative real-time polymerase chain reaction (qRT-PCR) with specific primers. The deletion breakpoints were sublocalized and the range of deletion was confirmed by PCR and direct sequencing. In this study, we identified a new 17-kb deletion mutation spanning the introns 1-6 of FERMT1 gene in a Chinese patient with severe KS phenotypes. Her parents were carriers of the same mutation. Our study reported a newly identified large deletion mutation of FERMT1 gene involved in KS, which further enriched the mutation spectrum of the FERMT1 gene.
Zhang, Lihua; Chen, Xianzhong; Chen, Zhen; Wang, Zezheng; Jiang, Shan; Li, Li; Pötter, Markus; Shen, Wei; Fan, You
2016-11-01
The diploid yeast Candida tropicalis, which can utilize n-alkane as a carbon and energy source, is an attractive strain for both physiological studies and practical applications. However, it presents some characteristics, such as rare codon usage, difficulty in sequential gene disruption, and inefficiency in foreign gene expression, that hamper strain improvement through genetic engineering. In this work, we present a simple and effective method for sequential gene disruption in C. tropicalis based on the use of an auxotrophic mutant host defective in orotidine monophosphate decarboxylase (URA3). The disruption cassette, which consists of a functional yeast URA3 gene flanked by a 0.3 kb gene disruption auxiliary sequence (gda) direct repeat derived from downstream or upstream of the URA3 gene and of homologous arms of the target gene, was constructed and introduced into the yeast genome by integrative transformation. Stable integrants were isolated by selection for Ura+ and identified by PCR and sequencing. The important feature of this construct, which makes it very attractive, is that recombination between the flanking direct gda repeats occurs at a high frequency (10^-8) during mitosis. After excision of the URA3 marker, only one copy of the gda sequence remains at the recombinant locus. Thus, the resulting ura3 strain can be used again to disrupt a second allelic gene in a similar manner. In addition to this effective sequential gene disruption method, a codon-optimized green fluorescent protein-encoding gene (GFP) was functionally expressed in C. tropicalis. Thus, we propose a simple and reliable method to improve C. tropicalis by genetic manipulation.
Study of cnidarian-algal symbiosis in the "omics" age.
Meyer, Eli; Weis, Virginia M
2012-08-01
The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.
Ruhlman, Tracey; Lee, Seung-Bum; Jansen, Robert K; Hostetler, Jessica B; Tallon, Luke J; Town, Christopher D; Daniell, Henry
2006-01-01
Background Carrot (Daucus carota) is a major food crop in the US and worldwide. Its capacity for storage and its lifecycle as a biennial make it an attractive species for the introduction of foreign genes, especially for oral delivery of vaccines and other therapeutic proteins. Until recently efforts to express recombinant proteins in carrot have had limited success in terms of protein accumulation in the edible tap roots. Plastid genetic engineering offers the potential to overcome this limitation, as demonstrated by the accumulation of BADH in chromoplasts of carrot taproots to confer exceedingly high levels of salt resistance. The complete plastid genome of carrot provides essential information required for genetic engineering. Additionally, the sequence data add to the rapidly growing database of plastid genomes for assessing phylogenetic relationships among angiosperms. Results The complete carrot plastid genome is 155,911 bp in length, with 115 unique genes and 21 duplicated genes within the IR. There are four ribosomal RNAs, 30 distinct tRNA genes and 18 intron-containing genes. Repeat analysis reveals 12 direct and 2 inverted repeats ≥ 30 bp with a sequence identity ≥ 90%. Phylogenetic analysis of nucleotide sequences for 61 protein-coding genes using both maximum parsimony (MP) and maximum likelihood (ML) were performed for 29 angiosperms. Phylogenies from both methods provide strong support for the monophyly of several major angiosperm clades, including monocots, eudicots, rosids, asterids, eurosids II, euasterids I, and euasterids II. Conclusion The carrot plastid genome contains a number of dispersed direct and inverted repeats scattered throughout coding and non-coding regions. This is the first sequenced plastid genome of the family Apiaceae and only the second published genome sequence of the species-rich euasterid II clade. 
Both MP and ML trees provide very strong support (100% bootstrap) for the sister relationship of Daucus with Panax in the euasterid II clade. These results provide the best taxon sampling of complete chloroplast genomes and the strongest support yet for the sister relationship of Caryophyllales to the asterids. The availability of the complete plastid genome sequence should facilitate improved transformation efficiency and foreign gene expression in carrot through utilization of endogenous flanking sequences and regulatory elements. PMID:16945140
Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi
2008-01-01
Background Iron-storage protein, ferritin plays a central role in iron metabolism. Ferritin has dual function to store iron and segregate iron for protection of iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at translational level in iron-dependent manner or at transcriptional level in iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and demonstrate the possibility of expression of these subunits, especially H subunit, by iron. Methods Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429
Intact coding region of the serotonin transporter gene in obsessive-compulsive disorder
DOE Office of Scientific and Technical Information (OSTI.GOV)
Altemus, M.; Murphy, D.L.; Greenberg, B.
1996-07-26
Epidemiologic studies indicate that obsessive-compulsive disorder is genetically transmitted in some families, although no genetic abnormalities have been identified in individuals with this disorder. The selective response of obsessive-compulsive disorder to treatment with agents which block serotonin reuptake suggests the gene coding for the serotonin transporter as a candidate gene. The primary structure of the serotonin-transporter coding region was sequenced in 22 patients with obsessive-compulsive disorder, using direct PCR sequencing of cDNA synthesized from platelet serotonin-transporter mRNA. No variations in amino acid sequence were found among the obsessive-compulsive disorder patients or healthy controls. These results do not support a role for alteration in the primary structure of the coding region of the serotonin-transporter gene in the pathogenesis of obsessive-compulsive disorder. 27 refs.
Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw
2017-01-01
Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. 
Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes.
Chwialkowska, Karolina; Korotko, Urszula; Kosinska, Joanna; Szarejko, Iwona; Kwasniewski, Miroslaw
2017-01-01
Epigenetic mechanisms, including histone modifications and DNA methylation, mutually regulate chromatin structure, maintain genome integrity, and affect gene expression and transposon mobility. Variations in DNA methylation within plant populations, as well as methylation in response to internal and external factors, are of increasing interest, especially in the crop research field. Methylation Sensitive Amplification Polymorphism (MSAP) is one of the most commonly used methods for assessing DNA methylation changes in plants. This method involves gel-based visualization of PCR fragments from selectively amplified DNA that are cleaved using methylation-sensitive restriction enzymes. In this study, we developed and validated a new method based on the conventional MSAP approach called Methylation Sensitive Amplification Polymorphism Sequencing (MSAP-Seq). We improved the MSAP-based approach by replacing the conventional separation of amplicons on polyacrylamide gels with direct, high-throughput sequencing using Next Generation Sequencing (NGS) and automated data analysis. MSAP-Seq allows for global sequence-based identification of changes in DNA methylation. This technique was validated in Hordeum vulgare. However, MSAP-Seq can be straightforwardly implemented in different plant species, including crops with large, complex and highly repetitive genomes. The incorporation of high-throughput sequencing into MSAP-Seq enables parallel and direct analysis of DNA methylation in hundreds of thousands of sites across the genome. MSAP-Seq provides direct genomic localization of changes and enables quantitative evaluation. We have shown that the MSAP-Seq method specifically targets gene-containing regions and that a single analysis can cover three-quarters of all genes in large genomes. Moreover, MSAP-Seq's simplicity, cost effectiveness, and high-multiplexing capability make this method highly affordable. 
Therefore, MSAP-Seq can be used for DNA methylation analysis in crop plants with large and complex genomes. PMID:29250096
Paugh, Steven W.; Coss, David R.; Bao, Ju; ...
2016-02-04
MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNAs). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex. We show physical evidence that microRNAs form triple-helical structures with duplex DNA and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3-fold, p < 2.2 x 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.
Irie, S; Doi, S; Yorifuji, T; Takagi, M; Yano, K
1987-01-01
The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed. PMID:3667527
On construction of stochastic genetic networks based on gene expression sequences.
Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya
2005-08-01
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes, and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs while reducing the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient estimation methods for solving for the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
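To make the idea of a multivariate Markov model for binary expression sequences more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequences, and the fixed mixing weights are all assumptions chosen for illustration. It estimates one transition matrix per ordered gene pair by counting, then mixes them to predict the next-state distribution of a gene, which is the general shape of such a model (one weight per gene pair, hence a parameter count that grows as n^2).

```python
from collections import Counter

def transition_matrix(source, target, states=(0, 1)):
    """Column-stochastic matrix P with P[i][j] = Pr(target[t+1]=states[i] | source[t]=states[j])."""
    counts = Counter(zip(target[1:], source[:-1]))
    matrix = []
    for i in states:
        row = []
        for j in states:
            column_total = sum(counts[(k, j)] for k in states)
            # Fall back to a uniform column when state j was never observed.
            row.append(counts[(i, j)] / column_total if column_total else 1 / len(states))
        matrix.append(row)
    return matrix

def predict(matrices, weights, current, states=(0, 1)):
    """Next-state distribution of one gene: sum over k of weights[k] * P_k * onehot(current[k])."""
    dist = [0.0] * len(states)
    for P, w, s in zip(matrices, weights, current):
        j = states.index(s)
        for i in range(len(states)):
            dist[i] += w * P[i][j]
    return dist

# Toy data (made up): gene_b follows gene_a with a one-step delay.
gene_a = [0, 1, 1, 0, 1, 0, 0, 1]
gene_b = [0, 0, 1, 1, 0, 1, 0, 0]

P_ab = transition_matrix(gene_a, gene_b)  # influence of gene_a on gene_b
P_bb = transition_matrix(gene_b, gene_b)  # influence of gene_b on itself

# Assumed, hand-picked weights; in practice they would be estimated from data.
dist = predict([P_ab, P_bb], [0.5, 0.5], current=[gene_a[-1], gene_b[-1]])
print(dist)
```

In this sketch the weights are simply fixed; estimating them (and checking that they are non-negative and sum to one for each gene) is the part a real method would have to solve.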
Pyrin gene and mutants thereof, which cause familial Mediterranean fever
Kastner, Daniel L [Bethesda, MD]; Aksentijevichh, Ivona [Bethesda, MD]; Centola, Michael [Tacoma Park, MD]; Deng, Zuoming [Gaithersburg, MD]; Sood, Ramen [Rockville, MD]; Collins, Francis S [Rockville, MD]; Blake, Trevor [Laytonsville, MD]; Liu, P Paul [Ellicott City, MD]; Fischel-Ghodsian, Nathan [Los Angeles, CA]; Gumucio, Deborah L [Ann Arbor, MI]; Richards, Robert I [North Adelaide, AU]; Ricke, Darrell O [San Diego, CA]; Doggett, Norman A [Santa Cruz, NM]; Pras, Mordechai [Tel-Hashomer, IL]
2003-09-30
The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards the full-length amino acid sequence, fusion proteins containing the amino acid sequence, and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome
Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. 
R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.
2001-01-01
Open reading frame expressed sequence tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that corresponds to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022
Grove, J R; Deutsch, P J; Price, D J; Habener, J F; Avruch, J
1989-11-25
Plasmids that encode a bioactive amino-terminal fragment of the heat-stable inhibitor of the cAMP-dependent protein kinase, PKI(1-31), were employed to characterize the role of this protein kinase in the control of transcriptional activity mediated by three DNA regulatory elements in the JEG-3 human placental cell line. The 5'-flanking sequence of the human collagenase gene contains the heptameric sequence, 5'-TGAGTCA-3', previously identified as a "phorbol ester" response element. Reporter genes containing either the intact 1.2-kilobase 5'-flanking sequence from the human collagenase gene or just the 7-base pair (bp) response element, when coupled to an enhancerless promoter, each exhibit both cAMP and phorbol ester-stimulated expression in JEG-3 cells. Cotransfection of either construct with plasmids encoding PKI(1-31) inhibits cAMP-stimulated but not basal- or phorbol ester-stimulated expression. Pretreatment of cells with phorbol ester for 1 or 2 days abrogates completely the response to rechallenge with phorbol ester but does not alter the basal expression of either construct; cAMP-stimulated expression, while modestly inhibited, remains vigorous. The 5'-flanking sequence of the human chorionic gonadotropin-alpha subunit (HCG alpha) gene has two copies of the sequence, 5'-TGACGTCA-3', contained in directly adjacent identical 18-bp segments, previously identified as a cAMP-response element. Reporter genes containing either the intact 1.5 kilobase of 5'-flanking sequence from the HCG alpha gene, or just the 36-bp tandem repeat cAMP response element, when coupled to an enhancerless promoter, both exhibit a vigorous cAMP stimulation of expression but no response to phorbol ester in JEG-3 cells. Cotransfection with plasmids encoding PKI(1-31) inhibits both basal and cAMP-stimulated expression in a parallel fashion. The 5'-flanking sequence of the human enkephalin gene mediates cAMP-stimulated expression of reporter genes in both JEG-3 and CV-1 cells. 
Plasmids encoding PKI(1-31) inhibit the expression that is stimulated by the addition of cAMP analogs in both cell lines; basal expression, however, is inhibited by PKI(1-31) only in the JEG-3 cell line and not in the CV-1 cells. These observations indicate that, in JEG-3 cells, PKI(1-31) is a specific inhibitor of kinase A-mediated gene transcription, but it does not modify kinase C-directed transcription.(ABSTRACT TRUNCATED AT 400 WORDS)
Severson, Eric; Arnett, Kelly L.; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S.; Liu, X. Shirley; Blacklow, Stephen C.; Aster, Jon C.
2018-01-01
Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and are linked to the Notch-responsiveness of a few genes, but their overall contribution to Notch-dependent gene regulation is unknown. To address this issue, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay, and applied insights from these in vitro studies to Notch-“addicted” leukemia cells. We find that SPSs contribute to the regulation of approximately a third of direct Notch target genes. While originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5. Our work provides a general method for identifying sequence-paired sites in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. PMID:28465412
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.
2003-06-01
OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.
Detection and characterization of Pasteuria 16S rRNA gene sequences from nematodes and soils.
Duan, Y P; Castro, H F; Hewlett, T E; White, J H; Ogram, A V
2003-01-01
Various bacterial species in the genus Pasteuria have great potential as biocontrol agents against plant-parasitic nematodes, although study of this important genus is hampered by the current inability to cultivate Pasteuria species outside their host. To aid in the study of this genus, an extensive 16S rRNA gene sequence phylogeny was constructed and this information was used to develop cultivation-independent methods for detection of Pasteuria in soils and nematodes. Thirty new clones of Pasteuria 16S rRNA genes were obtained directly from nematodes and soil samples. These were sequenced and used to construct an extensive phylogeny of this genus. These sequences were divided into two deeply branching clades within the low-G+C, Gram-positive division; some sequences appear to represent novel species within the genus Pasteuria. In addition, a surprising degree of 16S rRNA gene sequence diversity was observed within what had previously been designated a single strain of Pasteuria penetrans (P-20). PCR primers specific to Pasteuria 16S rRNA for detection of Pasteuria in soils were also designed and evaluated. Detection limits for soil DNA were 100-10,000 Pasteuria endospores per gram of soil.
A fungal mock community control for amplicon sequencing experiments
USDA-ARS's Scientific Manuscript database
The field of microbial ecology has been profoundly advanced by the ability to profile the composition of complex microbial communities by means of high throughput amplicon sequencing of marker genes amplified directly from environmental genomic DNA extracts. However, it has become increasingly clear...
New mutations in the NHS gene in Nance-Horan Syndrome families from the Netherlands.
Florijn, Ralph J; Loves, Willem; Maillette de Buy Wenniger-Prick, Liesbeth J J M; Mannens, Marcel M A M; Tijmes, Nel; Brooks, Simon P; Hardcastle, Alison J; Bergen, Arthur A B
2006-09-01
Mutations in the NHS gene cause Nance-Horan Syndrome (NHS), a rare X-chromosomal recessive disorder with variable features, including congenital cataract, microphthalmia, a peculiar form of the ear and dental anomalies. We investigated the NHS gene in four additional families with NHS from the Netherlands, by dHPLC and direct sequencing. We identified a unique mutation in each family. Three of these four mutations had not been reported before. We report here the first splice site sequence alteration mutation and three protein truncating mutations. Our results suggest that X-linked cataract and NHS are allelic disorders.
Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg
2009-08-06
Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. The results of the present work provide important new information about the structure and content of conifer genomic DNA that will guide future efforts to sequence and assemble conifer genomes. PMID:19656416
Camilo, Cesar M; Lima, Gustavo M A; Maluf, Fernando V; Guido, Rafael V C; Polikarpov, Igor
2016-01-01
Following burgeoning genomic and transcriptomic sequencing data, biochemical and molecular biology groups worldwide are implementing high-throughput cloning and mutagenesis facilities in order to obtain a large number of soluble proteins for structural and functional characterization. Since manual primer design can be a time-consuming and error-prone step, particularly when working with hundreds of targets, automation of the primer design process becomes highly desirable. HTP-OligoDesigner was created to provide the scientific community with a simple and intuitive online primer design tool for both laboratory-scale and high-throughput projects of sequence-independent gene cloning and site-directed mutagenesis, together with a Tm calculator for quick queries.
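As a rough illustration of what a quick Tm query involves, the Python sketch below applies the Wallace rule, Tm = 2(A+T) + 4(G+C), a common rule of thumb for short oligonucleotides. The abstract does not say which formula HTP-OligoDesigner uses, so this is an assumption made purely for illustration, and the function name and example primer are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in deg C via the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# Hypothetical 16-nt primer with 8 A/T and 8 G/C bases.
example_tm = wallace_tm("ATGCATGCATGCATGC")
print(example_tm)
```

The rule ignores salt concentration and nearest-neighbour effects, so it is only a first approximation suitable for short primers.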
Structure and expression of the attacin genes in Hyalophora cecropia.
Sun, S C; Lindström, I; Lee, J Y; Faye, I
1991-02-26
To study the regulation of the immune genes in insects, we have cloned and sequenced the attacin gene locus of the giant silk moth Hyalophora cecropia. The locus contains one acidic and one basic attacin gene as well as two pseudogenes, which are remnants of basic attacin genes. A small insertion element was found within the locus. The two functional attacin genes are transcribed in opposite directions and have two introns inserted at homologous positions. A common sequence, GGGGATTCCT, is found at nucleotide position -48 in the acidic gene and at nucleotide position -58 in the basic gene. Interestingly, this decanucleotide is similar to the consensus of the NF-κB-binding site. Expression studies revealed that both attacins are strongly induced by phorbol 12-myristate 13-acetate, lipopolysaccharide and bacteria. However, only the acidic attacin gene showed a clear response to injury.
Genes for cytochrome c oxidase subunit I, URF2, and three tRNAs in Drosophila mitochondrial DNA.
Clary, D O; Wolstenholme, D R
1983-01-01
Genes for URF2, tRNAtrp, tRNAcys, tRNAtyr and cytochrome c oxidase subunit I (COI) have been identified within a sequenced segment of the Drosophila yakuba mtDNA molecule. The five genes are arranged in the order given. Transcription of the tRNAcys and tRNAtyr genes is in the same direction as replication, while transcription of the URF2, tRNAtrp and COI genes is in the opposite direction. A similar arrangement of these genes is found in mammalian mtDNA except that in the latter, the tRNAala and tRNAasn genes are located between the tRNAtrp and tRNAcys genes. Also, a sequence found between the tRNAasn and tRNAcys genes in mammalian mtDNA, which is associated with the initiation of second strand DNA synthesis, is not found in this region of the D. yakuba mtDNA molecule. As the D. yakuba COI gene lacks a standard translation initiation codon, we consider the possibility that the quadruplet ATAA may serve this function. As in other D. yakuba mitochondrial polypeptide genes, AGA codons in the URF2 and COI genes do not correspond in position to arginine-specifying codons in the equivalent genes of mouse and yeast mtDNAs, but do most frequently correspond to serine-specifying codons. PMID:6314262
The Essential Genome of Escherichia coli K-12
2018-01-01
Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
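The gene-length correction mentioned above can be pictured with a small Python sketch. It computes a per-gene insertion index (unique insertion sites divided by gene length) and flags genes below a cutoff as candidate essential genes. This is an assumed, simplified scheme and not the statistical analysis used in the study: the gene coordinates, insertion positions, function names, and the threshold value are all hypothetical.

```python
def insertion_index(insertion_sites, gene_start, gene_end):
    """Unique insertion sites inside [gene_start, gene_end], divided by gene length in bp."""
    length = gene_end - gene_start + 1
    hits = sum(1 for pos in set(insertion_sites) if gene_start <= pos <= gene_end)
    return hits / length

def candidate_essential(genes, insertion_sites, threshold):
    """Names of genes whose insertion index falls below the (assumed) threshold."""
    return [
        name
        for name, (start, end) in genes.items()
        if insertion_index(insertion_sites, start, end) < threshold
    ]

# Hypothetical coordinates and insertion positions, for illustration only.
genes = {"geneA": (1, 100), "geneB": (201, 300)}
sites = [5, 20, 20, 47, 63, 88, 250]

candidates = candidate_essential(genes, sites, threshold=0.02)
print(candidates)
```

A fixed cutoff like this would miss the features the abstract attributes to manual inspection, such as short essential regions inside an otherwise dispensable gene, which is one reason a single per-gene number is only a starting point.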
Song, Junfang; Duc, Céline; Storey, Kate G.; McLean, W. H. Irwin; Brown, Sara J.; Simpson, Gordon G.; Barton, Geoffrey J.
2014-01-01
The reference annotations made for a genome sequence provide the framework for all subsequent analyses of the genome. Correct and complete annotation in addition to the underlying genomic sequence is particularly important when interpreting the results of RNA-seq experiments where short sequence reads are mapped against the genome and assigned to genes according to the annotation. Inconsistencies in annotations between the reference and the experimental system can lead to incorrect interpretation of the effect on RNA expression of an experimental treatment or mutation in the system under study. Until recently, the genome-wide annotation of 3′ untranslated regions received less attention than coding regions and the delineation of intron/exon boundaries. In this paper, data produced for samples in Human, Chicken and A. thaliana by the novel single-molecule, strand-specific, Direct RNA Sequencing technology from Helicos Biosciences, which locates 3′ polyadenylation sites to within ±2 nt, were combined with archival EST and RNA-Seq data. Nine examples are illustrated where this combination of data allowed: (1) gene and 3′ UTR re-annotation (including extension of one 3′ UTR by 5.9 kb); (2) disentangling of gene expression in complex regions; (3) clearer interpretation of small RNA expression and (4) identification of novel genes. While the specific examples displayed here may become obsolete as genome sequences and their annotations are refined, the principles laid out in this paper will be of general use both to those annotating genomes and those seeking to interpret existing publicly available annotations in the context of their own experimental data. PMID:24722185
Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J
2000-12-01
The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter-region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element, that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant, therefore Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, the activity of this GA binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest, that in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.
Rasmussen, L. D.; Zawadsky, C.; Binnerup, S. J.; Øregaard, G.; Sørensen, S. J.; Kroer, N.
2008-01-01
Mercury-resistant bacteria may be important players in mercury biogeochemistry. To assess the potential for mercury reduction by two subsurface microbial communities, resistant subpopulations and their merA genes were characterized by a combined molecular and cultivation-dependent approach. The cultivation method simulated natural conditions by using polycarbonate membranes as a growth support and a nonsterile soil slurry as a culture medium. Resistant bacteria were pregrown to microcolony-forming units (mCFU) before being plated on standard medium. Compared to direct plating, culturability was increased up to 2,800 times and numbers of mCFU were similar to the total number of mercury-resistant bacteria in the soils. Denaturing gradient gel electrophoresis analysis of DNA extracted from membranes suggested stimulation of growth of hard-to-culture bacteria during the preincubation. A total of 25 different 16S rRNA gene sequences were observed, including Alpha-, Beta-, and Gammaproteobacteria; Actinobacteria; Firmicutes; and Bacteroidetes. The diversity of isolates obtained by direct plating included eight different 16S rRNA gene sequences (Alpha- and Betaproteobacteria and Actinobacteria). Partial sequencing of merA of selected isolates led to the discovery of new merA sequences. With phylum-specific merA primers, PCR products were obtained for Alpha- and Betaproteobacteria and Actinobacteria but not for Bacteroidetes and Firmicutes. The similarity to known sequences ranged between 89 and 95%. One of the sequences did not result in a match in the BLAST search. The results illustrate the power of integrating advanced cultivation methodology with molecular techniques for the characterization of the diversity of mercury-resistant populations and assessing the potential for mercury reduction in contaminated environments. PMID:18441111
Migration pattern of hepatitis A virus genotype IA in North-Central Tunisia.
Beji-Hamza, Abir; Taffon, Stefania; Mhalla, Salma; Lo Presti, Alessandra; Equestre, Michele; Chionne, Paola; Madonna, Elisabetta; Cella, Eleonora; Bruni, Roberto; Ciccozzi, Massimo; Aouni, Mahjoub; Ciccaglione, Anna Rita
2015-02-08
Hepatitis A virus (HAV) epidemiology in Tunisia has changed from high to intermediate endemicity in the last decades. However, several outbreaks continue to occur. The last reported sequences from Tunisian HAV strains date back to 2006. In order to provide an updated overview of the strains currently circulating in Tunisia, a large-scale molecular analysis of samples from hepatitis A cases was performed, the first in Tunisia. Biological samples were collected from patients with laboratory confirmed hepatitis A: 145 sera samples in Tunis, Monastir, Sousse and Kairouan from 2008 to 2013 and 45 stool samples in Mahdia in 2009. HAV isolates were characterised by nested RT-PCR (VP1/2A region) and sequencing. The sequences finally obtained from 81 samples showed 78 genotype IA and 3 genotype IB isolates. A Tunisian genotype IA sequence dataset, including both the 78 newly obtained IA sequences and 51 sequences retrieved from GenBank, was used for phylogenetic investigation, including analysis of migration pattern among six towns. Virus gene flow from Sfax and Monastir was directed to all other towns; in contrast, the gene flows from Sousse, Tunis, Mahdia and Kairouan were directed to three, two, one and no towns, respectively. Several different HAV strains co-circulate in Tunisia, but the predominant genotype still continues to be IA (78/81, 96% isolates). A complex gene flow (migration) of HAV genotype IA was observed, with Sfax and Monastir showing gene flows to all other investigated towns. This approach coupled to a wider sampling can prove useful to investigate the factors underlying the spread of HAV in Tunisia and, thus, to implement appropriate preventing measures.
Workman, Rachael E; Myrka, Alexander M; Wong, G William; Tseng, Elizabeth
2018-01-01
Background: Hummingbirds oxidize ingested nectar sugars directly to fuel foraging but cannot sustain this fuel use during fasting periods, such as during the night or during long-distance migratory flights. Instead, fasting hummingbirds switch to oxidizing stored lipids that are derived from ingested sugars. The hummingbird liver plays a key role in moderating energy homeostasis and this remarkable capacity for fuel switching. Additionally, the liver is the principal location of de novo lipogenesis, which can occur at exceptionally high rates, such as during premigratory fattening. Yet understanding how this tissue and the whole organism moderate energy turnover is hampered by a lack of information regarding how relevant enzymes differ in sequence, expression, and regulation. Findings: We generated a de novo transcriptome of the hummingbird liver using PacBio full-length cDNA sequencing (Iso-Seq), yielding 8.6Gb of sequencing data, or 2.6M reads from 4 different size fractions. We analyzed data using the SMRTAnalysis v3.1 Iso-Seq pipeline, then clustered isoforms into gene families to generate de novo gene contigs using Cogent. We performed orthology analysis to identify closely related sequences between our transcriptome and other avian and human gene sets. Finally, we closely examined homology of critical lipid metabolism genes between our transcriptome data and avian and human genomes. Conclusions: We confirmed high levels of sequence divergence within hummingbird lipogenic enzymes, suggesting a high probability of adaptive divergent function in the hepatic lipogenic pathways. Our results leverage cutting-edge technology and a novel bioinformatics pipeline to provide a first direct look at the transcriptome of this incredible organism. PMID:29618047
Effect of regulatory peptides on gene transcription.
Khavinson, V Kh; Shataeva, L K; Chernova, A A
2003-09-01
Experimental studies of geroprotective activity of synthetic oligopeptides and conformational analysis of the tetrapeptide Epithalon allowed us to hypothesize that regulatory oligopeptides directly initiate transcription of genes for vitally important proteins. Sequences of nucleotide pairs that can serve as binding sites for tetrapeptide Epithalon were identified in the promoter regions of retinal genes F379, telomerase, and RNA polymerase II.
Myelin protein zero gene sequencing diagnoses Charcot-Marie-Tooth Type 1B disease
DOE Office of Scientific and Technical Information (OSTI.GOV)
Su, Y.; Zhang, H.; Madrid, R.
1994-09-01
Charcot-Marie-Tooth disease (CMT), the most common genetic neuropathy, affects about 1 in 2600 people in Norway and is found worldwide. CMT Type 1 (CMT1) has slow nerve conduction with demyelinated Schwann cells. Autosomal dominant CMT Type 1B (CMT1B) results from mutations in the myelin protein zero gene which directs the synthesis of more than half of all Schwann cell protein. This gene was mapped to the chromosome 1q22-1q23.1 borderline by fluorescence in situ hybridization. The first 7 of 7 reported CMT1B mutations are unique. Thus the most effective means to identify CMT1B mutations in at-risk family members and fetuses is to sequence the entire coding sequence in dominant or sporadic CMT patients without the CMT1A duplication. Of the 19 primers used in 16 pairs to uniquely amplify the entire MPZ coding sequence, 6 primer pairs were used to amplify and sequence the 6 exons. The DyeDeoxy Terminator cycle sequencing method used with four different color fluorescent labels was superior to manual sequencing because it sequences more bases unambiguously from extracted genomic DNA samples within 24 hours. This protocol was used to test 28 CMT and Dejerine-Sottas patients without CMT1A gene duplication. Sequencing MPZ gene-specific amplified fragments identified 9 polymorphic sites within the 6 exons that encode the 248 amino acid MPZ protein. The large number of major CMT1B mutations identified by single strand sequencing are being verified by reverse strand sequencing and when possible, by restriction enzyme analysis. This protocol can be used to distinguish CMT1B patients from other CMT phenotypes and to determine the CMT1B status of relatives both presymptomatically and prenatally.
Isoform Sequencing and State-of-Art Applications for Unravelling Complexity of Plant Transcriptomes
An, Dong; Li, Changsheng; Humbeck, Klaus
2018-01-01
Single-molecule real-time (SMRT) sequencing developed by PacBio, also called third-generation sequencing (TGS), offers longer reads than the second-generation sequencing (SGS). Given its ability to obtain full-length transcripts without assembly, isoform sequencing (Iso-Seq) of transcriptomes by PacBio is advantageous for genome annotation, identification of novel genes and isoforms, as well as the discovery of long non-coding RNA (lncRNA). In addition, Iso-Seq gives access to the direct detection of alternative splicing, alternative polyadenylation (APA), gene fusion, and DNA modifications. Such applications of Iso-Seq facilitate the understanding of gene structure, post-transcriptional regulatory networks, and subsequently proteomic diversity. In this review, we summarize its applications in plant transcriptome study, specifically pointing out challenges associated with each step in the experimental design and highlight the development of bioinformatic pipelines. We aim to provide the community with an integrative overview and a comprehensive guidance to Iso-Seq, and thus to promote its applications in plant research. PMID:29346292
Saga, Yukika; Inamura, Tomoka; Shimada, Nao; Kawata, Takefumi
2016-05-01
STATa, a Dictyostelium homologue of metazoan signal transducer and activator of transcription, is important for the organizer function in the tip region of the migrating Dictyostelium slug. We previously showed that ecmF gene expression depends on STATa in prestalk A (pstA) cells, where STATa is activated. Deletion and site-directed mutagenesis analysis of the ecmF/lacZ fusion gene in wild-type and STATa null strains identified an imperfect inverted repeat sequence, ACAAATANTATTTGT, as a STATa-responsive element. An upstream sequence element was required for efficient expression in the rear region of pstA zone; an element downstream of the inverted repeat was necessary for sufficient prestalk expression during culmination. Band shift analyses using purified STATa protein detected no sequence-specific binding to those ecmF elements. The only verified upregulated target gene of STATa is cudA gene; CudA directly activates expL7 gene expression in prestalk cells. However, ecmF gene expression was almost unaffected in a cudA null mutant. Several previously reported putative STATa target genes were also expressed in cudA null mutant but were downregulated in STATa null mutant. Moreover, mybC, which encodes another transcription factor, belonged to this category, and ecmF expression was downregulated in a mybC null mutant. These findings demonstrate the existence of a genetic hierarchy for pstA-specific genes, which can be classified into two distinct STATa downstream pathways, CudA dependent and independent. The ecmF expression is indirectly upregulated by STATa in a CudA-independent activation manner but dependent on MybC, whose expression is positively regulated by STATa. © 2016 Japanese Society of Developmental Biologists.
Mutation of domain III and domain VI in L gene conserved domain of Nipah virus
NASA Astrophysics Data System (ADS)
Jalani, Siti Aishah; Ibrahim, Nazlina
2016-11-01
Nipah virus (NiV) is the etiologic agent of a respiratory illness and causes fatal encephalitis in humans. The NiV L protein subunit is thought to be responsible for the majority of enzymatic activities involved in viral transcription and replication. The L protein, which is the viral RNA-dependent RNA polymerase, has high sequence homology among negative-sense RNA viruses. In negative-stranded RNA viruses, six conserved domains (domains I-VI) have been determined on the basis of sequence alignment. The domains are separated by variable regions, which suggests that the protein consists of concatenated functional domains. To directly address the roles of domains III and VI, site-directed mutations were constructed by the substitution of bases at sequence positions 2497, 2500, 5528 and 5532. Each mutated L gene can be used in future studies to test its expression by in vitro translation.
Firth, A E; Jagger, B W; Wise, H M; Nelson, C C; Parsawar, K; Wills, N M; Napthine, S; Taubenberger, J K; Digard, P; Atkins, J F
2012-10-01
Programmed ribosomal frameshifting is used in the expression of many virus genes and some cellular genes. In eukaryotic systems, the most well-characterized mechanism involves -1 tandem tRNA slippage on an X_XXY_YYZ motif. By contrast, the mechanisms involved in programmed +1 (or -2) slippage are more varied and often poorly characterized. Recently, a novel gene, PA-X, was discovered in influenza A virus and found to be expressed via a shift to the +1 reading frame. Here, we identify, by mass spectrometric analysis, both the site (UCC_UUU_CGU) and direction (+1) of the frameshifting that is involved in PA-X expression. Related sites are identified in other virus genes that have previously been proposed to be expressed via +1 frameshifting. As these viruses infect insects (chronic bee paralysis virus), plants (fijiviruses and amalgamaviruses) and vertebrates (influenza A virus), such motifs may form a new class of +1 frameshift-inducing sequences that are active in diverse eukaryotes.
Polymorphism at codon 36 of the p53 gene.
Felix, C A; Brown, D L; Mitsudomi, T; Ikagaki, N; Wong, A; Wasserman, R; Womer, R B; Biegel, J A
1994-01-01
A polymorphism at codon 36 in exon 4 of the p53 gene was identified by single strand conformation polymorphism (SSCP) analysis and direct sequencing of genomic DNA PCR products. The polymorphic allele, present in the heterozygous state in genomic DNAs of four of 100 individuals (4%), changes the codon 36 CCG to CCA, eliminates a FinI restriction site and creates a BccI site. Including this polymorphism there are four known polymorphisms in the p53 coding sequence.
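Two consequences of this abstract can be checked with a few lines of code. In the standard genetic code both CCG and CCA specify proline, so the change is silent at the protein level. Four heterozygous carriers among 100 individuals also correspond to a variant allele frequency of 2%, assuming no homozygous carriers were present (the abstract reports only heterozygotes). A minimal sketch, with helper names that are illustrative only:

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def is_synonymous(codon_a, codon_b):
    """True if both DNA codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]

def variant_allele_frequency(heterozygotes, homozygotes, individuals):
    """Variant alleles divided by total alleles in a diploid sample."""
    return (heterozygotes + 2 * homozygotes) / (2 * individuals)

print(CODON_TABLE["CCG"], CODON_TABLE["CCA"], is_synonymous("CCG", "CCA"))
print(variant_allele_frequency(4, 0, 100))  # homozygote count of 0 is an assumption
```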
Genes Involved in Anaerobic Metabolism of Phenol in the Bacterium Thauera aromatica
Breinig, Sabine; Schiltz, Emile; Fuchs, Georg
2000-01-01
Genes involved in the anaerobic metabolism of phenol in the denitrifying bacterium Thauera aromatica have been studied. The first two committed steps in this metabolism appear to be phosphorylation of phenol to phenylphosphate by an unknown phosphoryl donor (“phenylphosphate synthase”) and subsequent carboxylation of phenylphosphate to 4-hydroxybenzoate under release of phosphate (“phenylphosphate carboxylase”). Both enzyme activities are strictly phenol induced. Two-dimensional gel electrophoresis allowed identification of several phenol-induced proteins. Based on N-terminal and internal amino acid sequences of such proteins, degenerate oligonucleotides were designed to identify the corresponding genes. A chromosomal DNA segment of about 14 kbp was sequenced which contained 10 genes transcribed in the same direction. These are organized in two adjacent gene clusters and include the genes coding for five identified phenol-induced proteins. Comparison with sequences in the databases revealed the following similarities: the gene products of two open reading frames (ORFs) are each similar to either the central part or the N-terminal part of phosphoenolpyruvate synthases. We propose that these ORFs are components of the phenylphosphate synthase system. Three ORFs showed similarity to the ubiD gene product, 3-octaprenyl-4-hydroxybenzoate carboxy-lyase; UbiD catalyzes the decarboxylation of a 4-hydroxybenzoate analogue in ubiquinone biosynthesis. Another ORF was similar to the ubiX gene product, an isoenzyme of UbiD. We propose that (some of) these four proteins are involved in the carboxylation of phenylphosphate. A 700-bp PCR product derived from one of these ORFs cross-hybridized with DNA from different Thauera and Azoarcus strains, even from those which have not been reported to grow with phenol. One ORF showed similarity to the mutT gene product, and three ORFs showed no strong similarities to sequences in the databases.
Upstream of the first gene cluster, an ORF which is transcribed in the opposite direction codes for a protein highly similar to the DmpR regulatory protein of Pseudomonas putida. DmpR controls transcription of the genes of aerobic phenol metabolism, suggesting a similar regulation of anaerobic phenol metabolism by the putative regulator. PMID:11004186
A comparative analysis of soft computing techniques for gene prediction.
Goel, Neelam; Singh, Shailendra; Aseri, Trilok Chand
2013-07-01
The rapid growth of genomic sequence data for both human and nonhuman species has made analyzing these sequences, especially predicting genes in them, very important and is currently the focus of many research efforts. Beside its scientific interest in the molecular biology and genomics community, gene prediction is of considerable importance in human health and medicine. A variety of gene prediction techniques have been developed for eukaryotes over the past few years. This article reviews and analyzes the application of certain soft computing techniques in gene prediction. First, the problem of gene prediction and its challenges are described. These are followed by different soft computing techniques along with their application to gene prediction. In addition, a comparative analysis of different soft computing techniques for gene prediction is given. Finally some limitations of the current research activities and future research directions are provided. Copyright © 2013 Elsevier Inc. All rights reserved.
Narad, Priyanka; Kumar, Abhishek; Chakraborty, Amlan; Patni, Pranav; Sengupta, Abhishek; Wadhwa, Gulshan; Upadhyaya, K C
2017-09-01
Transcription factors are trans-acting proteins that interact with specific nucleotide sequences known as transcription factor binding sites (TFBSs), and these interactions are implicated in the regulation of gene expression. Regulation of transcriptional activation of a gene often involves multiple interactions of transcription factors with various sequence elements. Identification of these sequence elements is the first step in understanding the underlying molecular mechanism(s) that regulate gene expression. For in silico identification of these sequence elements, we have developed an online computational tool named transcription factor information system (TFIS) for detecting TFBSs. It is built from a collection of Java programs and is mainly based on TFBS detection using position weight matrices (PWMs). The databases used for obtaining position frequency matrices (PFMs) are JASPAR and HOCOMOCO, which are open-access databases of transcription factor binding profiles. Pseudo-counts are used while converting a PFM to a PWM, and TFBS detection is carried out on the basis of a percent score taken as the threshold value. TFIS is equipped with advanced features such as direct sequence retrieval from the NCBI database using a gene identification number or accession number, detection of binding sites for a common TF in a batch of gene sequences, and TFBS detection after generating a PWM from known raw binding sequences, in addition to general detection methods. TFIS can detect the presence of potential TFBSs in both directions at the same time, which increases its efficiency. The results of this dual detection are presented in different colors specific to the orientation of the binding site.
Results obtained by TFIS are more detailed and specific to the detected TFs because the result pages integrate informative links to related web servers such as Gene Ontology, the PAZAR database and the Transcription Factor Encyclopedia, in addition to NCBI and UniProt. Common TFs such as SP1, AP1 and NF-KB of the Amyloid beta precursor gene are easily detected using TFIS, along with multiple binding sites. In another scenario, the embryonic developmental process, TFs of the FOX family (FOXL1 and FOXC1) were also identified. TFIS is platform-independent and publicly available, along with its support and documentation, at http://tfistool.appspot.com and http://www.bioinfoplus.com/tfis/ . TFIS is licensed under the GNU General Public License, version 3 (GPL-3.0).
Luo, Shengzhan D.; Baker, Bruce S.
2015-01-01
“Regulatory evolution,” that is, changes in a gene’s expression pattern through changes at its regulatory sequence, rather than changes at the coding sequence of the gene or changes of the upstream transcription factors, has been increasingly recognized as a pervasive evolution mechanism. Many somatic sexually dimorphic features of Drosophila melanogaster are the results of gene expression regulated by the doublesex (dsx) gene, which encodes sex-specific transcription factors (DSXF in females and DSXM in males). Rapid changes in such sexually dimorphic features are likely a result of changes at the regulatory sequence of the target genes. We focused on the Flavin-containing monooxygenase-2 (Fmo-2) gene, a likely direct dsx target, to elucidate how sexually dimorphic expression and its evolution are brought about. We found that dsx is deployed to regulate the Fmo-2 transcription both in the midgut and in fat body cells of the spermatheca (a female-specific tissue), through a canonical DSX-binding site in the Fmo-2 regulatory sequence. In the melanogaster group, Fmo-2 transcription in the midgut has evolved rapidly, in contrast to the conserved spermathecal transcription. We identified two cis-regulatory modules (CRM-p and CRM-d) that direct sexually monomorphic or dimorphic Fmo-2 transcription, respectively, in the midguts of these species. Changes of Fmo-2 transcription in the midgut from sexually dimorphic to sexually monomorphic in some species are caused by the loss of CRM-d function, but not the loss of the canonical DSX-binding site. Thus, conferring transcriptional regulation on a CRM level allows the regulation to evolve rapidly in one tissue while evading evolutionary constraints posed by other tissues. PMID:25675536
Analysis of expressed sequence tags for Frankliniella occidentalis, the western flower thrips.
Rotenberg, D; Whitfield, A E
2010-08-01
Thrips are members of the insect order Thysanoptera and Frankliniella occidentalis (the western flower thrips) is the most economically important pest within this order. F. occidentalis is both a direct pest of crops and an efficient vector of plant viruses, including Tomato spotted wilt virus (TSWV). Despite the world-wide importance of thrips in agriculture, there is little knowledge of the F. occidentalis genome or gene functions at this time. A normalized cDNA library was constructed from first instar thrips and 13 839 expressed sequence tags (ESTs) were obtained. Our EST data assembled into 894 contigs and 11 806 singletons (12 700 nonredundant sequences). We found that 31% of these sequences had significant similarity (E ≤ 10^-10) to protein sequences in the National Center for Biotechnology Information nonredundant (nr) protein database, and 25% were functionally annotated using Blast2GO. We identified 74 sequences with putative homology to proteins associated with insect innate immunity. Sixteen sequences had significant similarity to proteins associated with small RNA-mediated gene silencing pathways (RNA interference; RNAi), including the antiviral pathway (short interfering RNA-mediated pathway). Our EST collection provides new sequence resources for characterizing gene functions in F. occidentalis and other thrips species with regard to vital biological processes, studying the mechanism of interactions with the viruses harboured and transmitted by the vector, and identifying new insect gene-centred targets for plant disease and insect control.
Hao, Weilong; Palmer, Jeffrey D
2009-09-29
The mitochondrial genomes of flowering plants possess a promiscuous proclivity for taking up sequences from the chloroplast genome. All characterized chloroplast integrants exist apart from native mitochondrial genes, and only a few, involving chloroplast tRNA genes that have functionally supplanted their mitochondrial counterparts, appear to be of functional consequence. We developed a novel computational approach to search for homologous recombination (gene conversion) in a large number of sequences and applied it to 22 mitochondrial and chloroplast gene pairs, which last shared common ancestry some 2 billion years ago. We found evidence of recurrent conversion of short patches of mitochondrial genes by chloroplast homologs during angiosperm evolution, but no evidence of gene conversion in the opposite direction. All 9 putative conversion events involve the atp1/atpA gene encoding the alpha subunit of ATP synthase, which is unusually well conserved between the 2 organelles and the only shared gene that is widely sequenced across plant mitochondria. Moreover, all conversions were limited to the 2 regions of greatest nucleotide and amino acid conservation of atp1/atpA. These observations probably reflect constraints operating on both the occurrence and fixation of recombination between ancient homologs. These findings indicate that recombination between anciently related sequences is more frequent than previously appreciated and creates functional mitochondrial genes of chimeric origin. These results also have implications for the widespread use of mitochondrial atp1 in phylogeny reconstruction.
Vioque, A
1997-01-01
The RNase P RNA gene (rnpB) from 10 cyanobacteria has been characterized. These new RNAs, together with the previously available ones, provide a comprehensive data set of RNase P RNA from diverse cyanobacterial lineages. All heterocystous cyanobacteria, but none of the non-heterocystous strains analyzed, contain short tandemly repeated repetitive (STRR) sequences that increase the length of helix P12. Site-directed mutagenesis experiments indicate that the STRR sequences are not required for catalytic activity in vitro. STRR sequences seem to have recently and independently invaded the RNase P RNA genes in heterocyst-forming cyanobacteria because closely related strains contain unrelated STRR sequences. Most cyanobacteria RNase P RNAs lack the sequence GGU in the loop connecting helices P15 and P16 that has been established to interact with the 3'-end CCA in precursor tRNA substrates in other bacteria. This character is shared with plastid RNase P RNA. Helix P6 is longer than usual in most cyanobacteria as well as in plastid RNase P RNA. PMID:9254706
Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.
Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi
2016-03-01
Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine-and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.
Rybarczyk-Mydłowska, Katarzyna; Maboreke, Hazel Ruvimbo; van Megen, Hanny; van den Elsen, Sven; Mooyman, Paul; Smant, Geert; Bakker, Jaap; Helder, Johannes
2012-11-21
Plant parasitic nematodes are unusual Metazoans as they are equipped with genes that allow for symbiont-independent degradation of plant cell walls. Among the cell wall-degrading enzymes, glycoside hydrolase family 5 (GHF5) cellulases are relatively well characterized, especially for high impact parasites such as root-knot and cyst nematodes. Interestingly, ancestors of extant nematodes most likely acquired these GHF5 cellulases from a prokaryote donor by one or multiple lateral gene transfer events. To obtain insight into the origin of GHF5 cellulases among evolutionary advanced members of the order Tylenchida, cellulase biodiversity data from less distal family members were collected and analyzed. Single nematodes were used to obtain (partial) genomic sequences of cellulases from representatives of the genera Meloidogyne, Pratylenchus, Hirschmanniella and Globodera. Combined Bayesian analysis of ≈ 100 cellulase sequences revealed three types of catalytic domains (A, B, and C). Represented by 84 sequences, type B is numerically dominant, and the overall topology of the catalytic domain type shows remarkable resemblance with trees based on neutral (= pathogenicity-unrelated) small subunit ribosomal DNA sequences. Bayesian analysis further suggested a sister relationship between the lesion nematode Pratylenchus thornei and all type B cellulases from root-knot nematodes. Yet, the relationship between the three catalytic domain types remained unclear. Superposition of intron data onto the cellulase tree suggests that types B and C are related, and together distinct from type A that is characterized by two unique introns. All Tylenchida members investigated here harbored one or multiple GHF5 cellulases. Three types of catalytic domains are distinguished, and the presence of at least two types is relatively common among plant parasitic Tylenchida. 
Analysis of coding sequences of cellulases suggests that root-knot and cyst nematodes did not acquire this gene directly by lateral gene transfer. More likely, these genes were passed on by ancestors of a family nowadays known as the Pratylenchidae.
Production of Functional Proteins: Balance of Shear Stress and Gravity
NASA Technical Reports Server (NTRS)
Goodwin, Thomas John (Inventor); Hammond, Timothy Grant (Inventor); Haysen, James Howard (Inventor)
2005-01-01
The present invention provides for a method of culturing cells and inducing the expression of at least one gene in the cell culture. The method provides for contacting the cell with a transcription factor decoy oligonucleotide sequence directed against a nucleotide sequence encoding a shear stress response element.
ACTG: novel peptide mapping onto gene models.
Choi, Seunghyuk; Kim, Hyunwoo; Paek, Eunok
2017-04-15
In many proteogenomic applications, mapping peptide sequences onto genome sequences can be very useful, because it allows us to understand the origins of the gene products. Existing software tools either take the genomic position of a peptide start site as an input or assume that the peptide sequence exactly matches the coding sequence of a given gene model. In the case of novel peptides resulting from genomic variations, especially structural variations such as alternative splicing, these existing tools cannot be directly applied unless users supply information about the variant, either its genomic position or its transcription model. Mapping potentially novel peptides to genome sequences, while allowing certain genomic variations, requires introducing novel gene models when aligning peptide sequences to gene structures. We have developed a new tool called ACTG (Amino aCids To Genome), which maps peptides to the genome, assuming all possible single exon skipping, junction variation allowing three edit distances from the original splice sites, exon extension and frame shift. In addition, it can also consider SNVs (single nucleotide variations) during the mapping phase if a user provides a VCF (variant call format) file as an input. Available at http://prix.hanyang.ac.kr/ACTG/search.jsp .
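The simplest form of the mapping problem described above, an exact peptide match on the forward strand, can be sketched as follows. This is only an illustration of translating a genomic sequence in three reading frames and reporting where a peptide occurs; it is not the ACTG algorithm, it ignores splicing, the reverse strand and variants, and the function names `translate` and `map_peptide` are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame):
    """Translate the forward strand of `dna` starting at offset `frame` (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    # Codons containing non-ACGT characters are translated as 'X'.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def map_peptide(peptide, dna):
    """Return (frame, nucleotide_start) for every exact forward-strand match."""
    dna = dna.upper()
    hits = []
    for frame in range(3):
        protein = translate(dna, frame)
        pos = protein.find(peptide)
        while pos != -1:
            # Convert the amino-acid offset back to a 0-based nucleotide coordinate.
            hits.append((frame, frame + 3 * pos))
            pos = protein.find(peptide, pos + 1)
    return hits
```

For a toy sequence such as `CCATGGCTAAATGA`, the peptide `MAK` is found in frame 2 starting at nucleotide 2. Handling exon skipping or shifted splice junctions, as the abstract describes, would require building alternative gene models before translation.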
Transcriptome analysis by strand-specific sequencing of complementary DNA
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-01-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.
Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey
2009-10-01
High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
Norman, Paul J.; Norberg, Steven J.; Guethlein, Lisbeth A.; Nemat-Gorgani, Neda; Royce, Thomas; Wroblewski, Emily E.; Dunn, Tamsen; Mann, Tobias; Alicata, Claudia; Hollenbach, Jill A.; Chang, Weihua; Shults Won, Melissa; Gunderson, Kevin L.; Abi-Rached, Laurent; Ronaghi, Mostafa; Parham, Peter
2017-01-01
The most polymorphic part of the human genome, the MHC, encodes over 160 proteins of diverse function. Half of them, including the HLA class I and II genes, are directly involved in immune responses. Consequently, the MHC region strongly associates with numerous diseases and clinical therapies. Notoriously, the MHC region has been intractable to high-throughput analysis at complete sequence resolution, and current reference haplotypes are inadequate for large-scale studies. To address these challenges, we developed a method that specifically captures and sequences the 4.8-Mbp MHC region from genomic DNA. For 95 MHC homozygous cell lines we assembled, de novo, a set of high-fidelity contigs and a sequence scaffold, representing a mean 98% of the target region. Included are six alternative MHC reference sequences of the human genome that we completed and refined. Characterization of the sequence and structural diversity of the MHC region shows the approach accurately determines the sequences of the highly polymorphic HLA class I and HLA class II genes and the complex structural diversity of complement factor C4A/C4B. It has also uncovered extensive and unexpected diversity in other MHC genes; an example is MUC22, which encodes a lung mucin and exhibits more coding sequence alleles than any HLA class I or II gene studied here. More than 60% of the coding sequence alleles analyzed were previously uncharacterized. We have created a substantial database of robust reference MHC haplotype sequences that will enable future population scale studies of this complicated and clinically important region of the human genome. PMID:28360230
Christiaens, H; Leer, R J; Pouwels, P H; Verstraete, W
1992-12-01
The conjugated bile acid hydrolase gene from the silage isolate Lactobacillus plantarum 80 was cloned and expressed in Escherichia coli MC1061. For the screening of this hydrolase gene within the gene bank, a direct plate assay developed by Dashkevicz and Feighner (M. P. Dashkevicz and S. D. Feighner, Appl. Environ. Microbiol. 53:331-336, 1989) was adapted to the growth requirements of E. coli. Because of hydrolysis and medium acidification, hydrolase-active colonies were surrounded with big halos of precipitated, free bile acids. This phenomenon was also obtained when the gene was cloned into a multicopy shuttle vector and subsequently reintroduced into the parental Lactobacillus strain. The cbh gene and surrounding regions were characterized by nucleotide sequence analysis. The deduced amino acid sequence was shown to have 52% similarity with a penicillin V amidase from Bacillus sphaericus. Preliminary characterization of the gene product showed that it is a cholylglycine hydrolase (EC 3.5.1.24) with only slight activity against taurine conjugates. The optimum pH was between 4.7 and 5.5. Optimum temperature ranged from 30 to 45 degrees C. Southern blot analysis indicated that the cloned gene has similarity with genomic DNA of bile acid hydrolase-active Lactobacillus spp. of intestinal origin.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wheeler, E.F.; Roussel, M.F.; Hampe, A.
1986-08-01
The nucleotide sequence of a 5' segment of the human genomic c-fms proto-oncogene suggested that recombination between feline leukemia virus and feline c-fms sequences might have occurred in a region encoding the 5' untranslated portion of c-fms mRNA. The polyprotein precursor gP180/sup gag-fms/ encoded by the McDonough strain of feline sarcoma virus was therefore predicted to contain 34 v-fms-coded amino acids derived from sequences of the c-fms gene that are not ordinarily translated from the proto-oncogene mRNA. The (gP180/sup gag-fms/) polyprotein was cotranslationally cleaved near the gag-fms junction to remove its gag gene-coded portion. Determination of the amino-terminal sequence of the resulting v-fms-coded glycoprotein, gp120/sup v-fms/, showed that the site of proteolysis corresponded to a predicted signal peptidase cleavage site within the c-fms gene product. Together, these analyses suggested that the linked gag sequences may not be necessary for expression of a biologically active v-fms gene product. The gag-fms sequences of feline sarcoma virus strain McDonough and the v-fms sequences alone were inserted into a murine retroviral vector containing a neomycin resistance gene. The authors conclude that a cryptic hydrophobic signal peptide sequence in v-fms was unmasked by gag deletion, thereby allowing the correct orientation and transport of the v-fms gene product within membranous organelles. It seems likely that the proteolytic cleavage of gP180/sup gag-fms/ is mediated by signal peptidase and that the amino termini of gp140/sup v-fms/ and the c-fms gene product are identical.
Arai, Yuuki; Maeda, Akiko; Hirami, Yasuhiko; Ishigami, Chie; Kosugi, Shinji; Mandai, Michiko; Kurimoto, Yasuo; Takahashi, Masayo
2015-01-01
The aim of this study was to gain information about disease prevalence and to identify the responsible genes for inherited retinal dystrophies (IRD) in Japanese populations. Clinical and molecular evaluations were performed on 349 patients with IRD. For segregation analyses, 63 of their family members were included. Bioinformatics data from 1,208 Japanese individuals were used as controls. Molecular diagnosis was obtained by direct sequencing in a stepwise fashion utilizing one or two panels of 15 and 27 genes for retinitis pigmentosa patients. If a specific clinical diagnosis was suspected, direct sequencing of disease-specific genes, for example ABCA4 for Stargardt disease, was conducted. Limited availability of intrafamily information and decreasing family size hampered the identification of inheritance patterns. The disease profile differed from those of European and North American populations, with a lower prevalence of Stargardt disease. We found 205 sequence variants in 159 of 349 probands, an identification rate of 45.6%. This study found 43 novel sequence variants. In silico analysis suggests that 20 of 25 novel missense variants are pathogenic. EYS mutations had the highest prevalence at 23.5%. c.4957_4958insA and c.8868C>A were the two major EYS mutations identified in this cohort. EYS mutations are the most prevalent among Japanese patients with IRD.
Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma
Moore, Lynette M.; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N.; Zhang, Wei; Nykter, Matti
2013-01-01
Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression. PMID:23007860
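The mature-to-precursor comparison described above can be sketched as a log ratio of read counts. This is a minimal illustration only: the abstract does not give the normalization used, so the pseudocount, the log2 scale and the function name `maturation_ratio` are assumptions, and any counts passed in are the caller's own data.

```python
import math


def maturation_ratio(mature_reads, precursor_reads, pseudocount=1.0):
    """Log2 ratio of mature to precursor miRNA read counts.

    A pseudocount (assumed here, not taken from the study) avoids division
    by zero when a precursor is not detected.
    """
    if mature_reads < 0 or precursor_reads < 0:
        raise ValueError("read counts must be non-negative")
    return math.log2((mature_reads + pseudocount) / (precursor_reads + pseudocount))
```

A positive value means the mature form dominates. Comparing this value for the same miRNA across normal brain, low-grade and high-grade samples would be one way to look for the trend the abstract reports, provided the counts are normalized for sequencing depth first.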
Transcriptional insulation of the human keratin 18 gene in transgenic mice.
Neznanov, N; Thorey, I S; Ceceña, G; Oshima, R G
1993-01-01
Expression of the 10-kb human keratin 18 (K18) gene in transgenic mice results in efficient and appropriate tissue-specific expression in a variety of internal epithelial organs, including liver, lung, intestine, kidney, and the ependymal epithelium of brain, but not in spleen, heart, or skeletal muscle. Expression at the RNA level is directly proportional to the number of integrated K18 transgenes. These results indicate that the K18 gene is able to insulate itself both from the commonly observed cis-acting effects of the sites of integration and from the potential complications of duplicated copies of the gene arranged in head-to-tail fashion. To begin to identify the K18 gene sequences responsible for this property of transcriptional insulation, additional transgenic mouse lines containing deletions of either the 5' or 3' distal end of the K18 gene have been characterized. Deletion of 1.5 kb of the distal 5' flanking sequence has no effect upon either the tissue specificity or the copy number-dependent behavior of the transgene. In contrast, deletion of the 3.5-kb 3' flanking sequence of the gene results in the loss of the copy number-dependent behavior of the gene in liver and intestine. However, expression in kidney, lung, and brain remains efficient and copy number dependent in these transgenic mice. Furthermore, herpes simplex virus thymidine kinase gene expression is copy number dependent in transgenic mice when the gene is located between the distal 5'- and 3'-flanking sequences of the K18 gene. Each adult transgenic male expressed the thymidine kinase gene in testes and brain and proportionally to the number of integrated transgenes. We conclude that the characteristic of copy number-dependent expression of the K18 gene is tissue specific because the sequence requirements for transcriptional insulation in adult liver and intestine are different from those for lung and kidney. 
In addition, the behavior of the transgenic thymidine kinase gene in testes and brain suggests that the property of transcriptional insulation of the K18 gene may be conferred by the distal flanking sequences of the K18 gene and, additionally, may function for other genes. PMID:7681143
Sequence analysis of 16S rRNA gene clone libraries is a popular tool used to describe the composition of natural microbial communities. Commonly, clone libraries are developed by direct cloning of 16S rRNA gene PCR products. Different primers are often employed in the initial amp...
USDA-ARS's Scientific Manuscript database
Mounting evidence shows microRNAs (miRNAs) directly regulate gene expression post-transcriptionally through base-pairing with regions in the 3’-untranslated sequences of target gene mRNAs, which results in dysregulation of gene expression/translation and subsequently modulates cellular processes. We...
NASA Astrophysics Data System (ADS)
Gong, Qianhong; Yu, Wengong; Dai, Jixun; Liu, Hongquan; Xu, Rifu; Guan, Huashi; Pan, Kehou
2007-01-01
The endogenous tubulin promoter has been widely used for expressing foreign genes in green algae, but its efficiency and feasibility in the economically important Porphyra yezoensis (Rhodophyta) are unknown. In this study, the flanking sequences of the beta-tubulin gene from P. yezoensis were amplified and two transient expression vectors were constructed to test their ability to drive transcription of the foreign gene gusA. The testing vector pATubGUS was constructed by inserting the 5'- and 3'-flanking regions (Tub5' and Tub3') upstream and downstream of the β-glucuronidase (GUS) gene (gusA), respectively, into pA, a derivative of the pCAT®3-enhancer vector. The control construct, pAGUSTub3, contains only gusA and Tub3'. These constructs were electroporated into P. yezoensis protoplasts and the GUS activities were quantitatively analyzed by spectrometry. The results demonstrated that the gusA gene was efficiently expressed in P. yezoensis protoplasts under the regulation of the 5'-flanking sequence of the beta-tubulin gene. More interestingly, pATubGUS produced stronger GUS activity in P. yezoensis protoplasts than pBI221, in which the gusA gene is driven by the constitutive CaMV 35S promoter. The data suggest that the combination of P. yezoensis protoplasts and the endogenous beta-tubulin flanking sequences is a potential novel system for foreign gene expression.
Progress of targeted genome modification approaches in higher plants.
Cardi, Teodoro; Neal Stewart, C
2016-07-01
Transgene integration in plants is based on illegitimate recombination between non-homologous sequences. The low control of integration site and number of (trans/cis)gene copies might have negative consequences on the expression of transferred genes and their insertion within endogenous coding sequences. The first experiments conducted to use precise homologous recombination for gene integration commenced soon after the first demonstration that transgenic plants could be produced. Modern transgene targeting categories used in plant biology are: (a) homologous recombination-dependent gene targeting; (b) recombinase-mediated site-specific gene integration; (c) oligonucleotide-directed mutagenesis; (d) nuclease-mediated site-specific genome modifications. New tools enable precise gene replacement or stacking with exogenous sequences and targeted mutagenesis of endogenous sequences. The possibility to engineer chimeric designer nucleases, which are able to target virtually any genomic site, and to use them for inducing double-strand breaks in host DNA creates new opportunities for both applied plant breeding and functional genomics. CRISPR is the most recent technology available for precise genome editing. Its rapid adoption in biological research is based on its inherent simplicity and efficacy. Its utilization, however, depends on available sequence information, especially for genome-wide analysis. We will review the approaches used for genome modification, specifically those for affecting gene integration and modification in higher plants. For each approach, the advantages and limitations will be noted. We also will speculate on how their actual commercial development and implementation in plant breeding will be affected by governmental regulations.
Kim, Taeho; Kim, Jiyeon; Nadler, Steven A; Park, Joong-Ki
2016-05-01
Testing hypotheses of monophyly for different nematode groups in the context of broad representation of nematode diversity is central to understanding the patterns and processes of nematode evolution. Herein sequence information from mitochondrial genomes is used to test the monophyly of diplogasterids, which include an important nematode model organism. The complete mitochondrial genome sequence of Koerneria sudhausi, a representative of Diplogasteromorpha, was determined and used for phylogenetic analyses along with 60 other nematode species. The mtDNA of K. sudhausi comprises 16,005 bp and includes 36 genes (12 protein-coding genes, 2 ribosomal RNA genes and 22 transfer RNA genes) encoded in the same direction. Phylogenetic trees inferred from amino acid and nucleotide sequence data for the 12 protein-coding genes strongly supported the sister relationship of K. sudhausi with Pristionchus pacificus, supporting Diplogasteromorpha. The gene order of K. sudhausi is identical to that most commonly found in members of the Rhabditomorpha + Ascaridomorpha + Diplogasteromorpha clade, with the exception of some tRNA translocations. Both the gene order pattern and sequence-based phylogenetic analyses support a close relationship between the diplogasterid species and Rhabditomorpha. The nesting of the two diplogasteromorph species within Rhabditomorpha is consistent with most molecular phylogenies for the group, but inconsistent with certain morphology-based hypotheses that asserted phylogenetic affinity between diplogasteromorphs and tylenchomorphs. Phylogenetic analysis of mitochondrial genome sequences strongly supports monophyly of the Diplogasteromorpha.
Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar
2017-06-01
Homologous nucleotide sequences are commonly retrieved online by searching a given sequence database with existing alignment techniques. These techniques depend on local alignment and scoring matrices, whose reliability is limited by computational complexity and accuracy. To address this, the present work offers a novel numerical representation of genes that can help divide the data space into smaller partitions and so support the construction of a search tree. Specifically, the paper introduces a 36-dimensional Periodicity Count Value (PCV) that represents a particular nucleotide sequence and is adapted from the stochastic model of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320). The PCV construct uses information on the physicochemical properties of nucleotides and their positional distribution pattern within a gene. The PCV representation of a gene reduces the computational cost of calculating the distance between a pair of genes while remaining consistent with existing methods. The validity of the PCV-based method was further tested by using it to construct molecular phylogenies and comparing these with phylogenies built using existing sequence alignment methods.
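The general idea of comparing genes through fixed-length numerical vectors rather than alignments can be sketched as below. The abstract does not define how the 36 PCV components are computed, so this example deliberately substitutes a different and much simpler representation, a 16-dimensional dinucleotide frequency vector; the names `kmer_vector` and `euclidean` are hypothetical.

```python
import math
from itertools import product

# All 16 dinucleotides in a fixed order; this is a stand-in for the PCV,
# whose 36 components are not specified in the abstract.
KMERS = ["".join(p) for p in product("ACGT", repeat=2)]


def kmer_vector(seq):
    """Return the dinucleotide frequency vector of a nucleotide sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 1):
        kmer = seq[i:i + 2]
        if kmer in counts:  # skip windows containing ambiguous bases
            counts[kmer] += 1
    total = sum(counts.values())
    return [counts[k] / total if total else 0.0 for k in KMERS]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Once each gene is reduced to such a vector, the cost of a pairwise comparison depends on the vector length rather than on the sequence lengths, which is the kind of saving the abstract attributes to the PCV representation.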
Phylogenetic Analysis of Theileria annulata Infected Cell Line S15 Iran Vaccine Strain.
Habibi, Gh
2012-01-01
Bovine theileriosis results from infection with obligate intracellular protozoa of the genus Theileria. The phylogenetic relationships between two isolates of Theileria annulata and 36 Theileria spp., as well as 6 outgroup taxa including Babesia spp. and coccidian protozoa, were analyzed using the 18S rRNA gene sequence. The target DNA segment was amplified by PCR. The PCR product was used for direct sequencing. The length of the 18S rRNA gene of all Theileria spp. involved in this study was around 1,400 bp. A phylogenetic tree was inferred based on the 18S rRNA gene sequence of the Iran and Iraq isolates, and other species of Theileria available in GenBank. In the constructed tree, Theileria annulata (Iran vaccine strain) was closely related to other T. annulata from Europe and Asia, as well as T. lestoquardi, T. parva and T. taurotragi, all in one clade. Phylogenetic analyses based on the small subunit ribosomal RNA gene showed that the sequence of the Iran vaccine strain was 100% identical to the Iraq sequence, whereas its identity with other T. annulata sequences reported from China, Spain and Italy ranged from 97.9 to 99.9%.
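Percent identity figures such as those quoted above are typically computed from a pairwise alignment. A minimal sketch follows, assuming the two sequences are already aligned to equal length with `-` as the gap character and that gapped columns are excluded from the denominator; other conventions exist, the abstract does not say which was used, and `percent_identity` is a hypothetical name.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != gap and y != gap
    ]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)
```

Under this convention, one mismatch in a ten-column alignment gives 90% identity, and two sequences that agree at every ungapped column give 100%.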
Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M
1982-01-01
The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. PMID:6954460
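A base-composition figure like the 75% G and C reported above is a simple count over the sequence. The sketch below assumes a plain string of nucleotides and ignores any character other than A, C, G or T; the function name `gc_fraction` is hypothetical and the 1,211-nucleotide sequence itself is not reproduced here.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(b in "GC" for b in bases)
    return gc / len(bases)
```

Applied in a sliding window, the same count would also locate the short A + T-rich stretches the abstract mentions upstream of the initiation site.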
Imbert, J; Zafarullah, M; Culotta, V C; Gedamu, L; Hamer, D
1989-01-01
Metallothionein (MT) gene promoters in higher eucaryotes contain multiple metal regulatory elements (MREs) that are responsible for the metal induction of MT gene transcription. We identified and purified to near homogeneity a 74-kilodalton mouse nuclear protein that specifically binds to certain MRE sequences. This protein, MBF-I, was purified employing as an affinity reagent a trout MRE that is shown to be functional in mouse cells but which lacks the G+C-rich and SP1-like sequences found in many mammalian MT gene promoters. Using point-mutated MREs, we showed that there is a strong correlation between DNA binding in vitro and MT gene regulation in vivo, suggesting a direct role of MBF-I in MT gene transcription. We also showed that MBF-I can induce MT gene transcription in vitro in a mouse extract and that this stimulation requires zinc. PMID:2586522
van Koningsbruggen, Silvana; Gierliński, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J.; Ariyurek, Yavuz; den Dunnen, Johan T.
2010-01-01
The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope. PMID:20826608
van Koningsbruggen, Silvana; Gierlinski, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J; Ariyurek, Yavuz; den Dunnen, Johan T; Lamond, Angus I
2010-11-01
The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope.
Selective DNA demethylation by fusion of TDG with a sequence-specific DNA-binding domain
Gregory, David J.; Mikhaylova, Lyudmila; Fedulov, Alexey V.
2012-01-01
Our ability to selectively manipulate gene expression by epigenetic means is limited, as there is no approach for targeted reactivation of epigenetically silenced genes, in contrast to what is available for selective gene silencing. We aimed to develop a tool for selective transcriptional activation by DNA demethylation. Here we present evidence that direct targeting of thymine-DNA-glycosylase (TDG) to specific sequences in the DNA can result in local DNA demethylation at potential regulatory sequences and lead to enhanced gene induction. When TDG was fused to a well-characterized DNA-binding domain [the Rel-homology domain (RHD) of NFκB], we observed decreased DNA methylation and increased transcriptional response to unrelated stimulus of inducible nitric oxide synthase (NOS2). The effect was not seen for control genes lacking either RHD-binding sites or high levels of methylation, nor in control mock-transduced cells. Specific reactivation of epigenetically silenced genes may thus be achievable by this approach, which provides a broadly useful strategy to further our exploration of biological mechanisms and to improve control over the epigenome. PMID:22419066
Stevens, Todd M; Morlote, Diana; Swensen, Jeff; Ellis, Michelle; Harada, Shuko; Spencer, Sharon; Prieto-Granada, Carlos N; Folpe, Andrew L; Gatalica, Zoran
2018-05-07
Spindle epithelial tumor with thymus-like differentiation (SETTLE) is a malignant biphasic neoplasm of the thyroid or neck with propensity for late metastasis. Unlike synovial sarcoma, its main morphologic mimic, SETTLE lacks synovial sarcoma-associated translocations. A single case of SETTLE has shown a KRAS mutation but to date no comprehensive next generation sequencing studies of this rare neoplasm have been undertaken. Herein, we subjected 5 well defined cases of SETTLE to direct sequence analysis of 592 genes and fusion gene analysis of 52 genes frequently rearranged in human cancers. We identified one case with two pathogenic variants in the KMT2D gene, one being in an intron splice site (c.674-1A>G) and the other being a frameshift variant (p.M2829fs). This same case also had a pathogenic nonsense variant in the KMT2C gene (p.R1237*). A second case of SETTLE carried a pathogenic NRAS missense variant, Q61R. No other molecular alterations, microsatellite instability, gene fusions or amplifications were identified.
Tooley, Paul W; Bandyopadhyay, Ranajit; Carras, Marie M; Pazoutová, Sylvie
2006-04-01
Isolates of Claviceps causing ergot on sorghum in India were analysed by AFLP analysis, and by analysis of DNA sequences of the EF-1alpha gene intron 4 and beta-tubulin gene intron 3 region. Of 89 isolates assayed from six states in India, four were determined to be C. sorghi, and the rest C. africana. A relatively low level of genetic diversity was observed within the Indian C. africana population. No evidence of genetic exchange between C. africana and C. sorghi was observed in either AFLP or DNA sequence analysis. Phylogenetic analysis was conducted using DNA sequences from 14 different Claviceps species. A multigene phylogeny based on the EF-1alpha gene intron 4, the beta-tubulin gene intron 3 region, and rDNA showed that C. sorghi grouped most closely with C. gigantea and C. africana. Although the Claviceps species we analysed were closely related, they colonize hosts that are taxonomically very distinct suggesting that there is no direct coevolution of Claviceps with its hosts.
Identification of a p53-response element in the promoter of the proline oxidase gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maxwell, Steve A.; Kochevar, Gerald J.
2008-05-02
Proline oxidase (POX) is a p53-induced proapoptotic gene. We investigated whether p53 could bind directly to the POX gene promoter. Chromatin immunoprecipitation (ChIP) assays detected p53 bound to POX upstream gene sequences. In support of the ChIP results, sequence analysis of the POX gene and its 5' flanking sequences revealed a potential p53-binding site, GGGCTTGTCTTCGTGTGACTTCTGTCT, located at 1161 base pairs (bp) upstream of the transcriptional start site. A 711-bp DNA fragment containing the candidate p53-binding site exhibited reporter gene activity that was induced by p53. In contrast, the same DNA region lacking the candidate p53-binding site did not show significant p53-response activity. Electrophoretic mobility shift assay (EMSA) in ACHN renal carcinoma cell nuclear lysates confirmed that p53 could bind to the 711-bp POX DNA fragment. We concluded from these experiments that a p53-binding site is positioned at -1161 to -1188 bp upstream of the POX transcriptional start site.
Sequence variants in four genes underlying Bardet-Biedl syndrome in consanguineous families
Ullah, Asmat; Umair, Muhammad; Yousaf, Maryam; Khan, Sher Alam; Nazim-ud-din, Muhammad; Shah, Khadim; Ahmad, Farooq; Azeem, Zahid; Ali, Ghazanfar; Alhaddad, Bader; Rafique, Afzal; Jan, Abid; Haack, Tobias B.; Strom, Tim M.; Meitinger, Thomas; Ghous, Tahseen
2017-01-01
Purpose To investigate the molecular basis of Bardet-Biedl syndrome (BBS) in five consanguineous families of Pakistani origin. Methods Linkage in two families (A and B) was established to BBS7 on chromosome 4q27, in family C to BBS8 on chromosome 14q32.1, and in family D to BBS10 on chromosome 12q21.2. Family E was investigated directly with exome sequence analysis. Results Sanger sequencing revealed two novel mutations and three previously reported mutations in the BBS genes. These mutations include two deletions (c.580_582delGCA, c.1592_1597delTTCCAG) in the BBS7 gene, a missense mutation (p.Gln449His) in the BBS8 gene, a frameshift mutation (c.271_272insT) in the BBS10 gene, and a nonsense mutation (p.Ser40*) in the MKKS (BBS6) gene. Conclusions Two novel mutations and three previously reported variants, identified in the present study, further extend the body of evidence implicating BBS6, BBS7, BBS8, and BBS10 in causing BBS. PMID:28761321
Li, Jie; Overall, Christopher C.; Johnson, Rudd C.; ...
2015-09-21
The alternative sigma factor σE functions to maintain bacterial homeostasis and membrane integrity in response to extracytoplasmic stress by regulating thousands of genes both directly and indirectly. The transcriptional regulatory network governed by σE in Salmonella and E. coli has been examined using microarrays; however, a genome-wide analysis of σE-binding sites in Salmonella has not yet been reported. We infected macrophages with Salmonella Typhimurium over a select time course. Using chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), 31 σE-binding sites were identified. Seventeen sites were new, which included outer membrane proteins, a quorum-sensing protein, a cell division factor, and a signal transduction modulator. The consensus sequence identified for σE in vivo binding was similar to the one previously reported, except for a conserved G and A between the -35 and -10 regions. One third of the σE-binding sites did not contain the consensus sequence, suggesting there may be alternative mechanisms by which σE modulates transcription. By dissecting direct and indirect modes of σE-mediated regulation, we found that σE activates gene expression through recognition of both canonical and reversed consensus sequences. Lastly, new σE-regulated genes (greA, luxS, ompA and ompX) are shown to be involved in heat shock and oxidative stress responses.
Adeno-associated virus inverted terminal repeats stimulate gene editing.
Hirsch, M L
2015-02-01
Advancements in genome editing have relied on technologies to specifically damage DNA which, in turn, stimulates DNA repair including homologous recombination (HR). As off-target concerns complicate the therapeutic translation of site-specific DNA endonucleases, an alternative strategy to stimulate gene editing based on fragile DNA was investigated. To do this, an episomal gene-editing reporter was generated by a disruptive insertion of the adeno-associated virus (AAV) inverted terminal repeat (ITR) into the egfp gene. Compared with a non-structured DNA control sequence, the ITR induced DNA damage as evidenced by increased gamma-H2AX and Mre11 foci formation. As local DNA damage stimulates HR, ITR-mediated gene editing was investigated using DNA oligonucleotides as repair substrates. The AAV ITR stimulated gene editing >1000-fold in a replication-independent manner and was not biased by the polarity of the repair oligonucleotide. Analysis of additional human DNA sequences demonstrated stimulation of gene editing to varying degrees. In particular, inverted yet not direct, Alu repeats induced gene editing, suggesting a role for DNA structure in the repair event. Collectively, the results demonstrate that inverted DNA repeats stimulate gene editing via double-strand break repair in an episomal context and allude to efficient gene editing of the human chromosome using fragile DNA sequences.
Cooper, David N.; Bacolla, Albino; Férec, Claude; Vasquez, Karen M.; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min
2011-01-01
Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher-order features of the genomic architecture. The human genome is now recognized to contain ‘pervasive architectural flaws’ in that certain DNA sequences are inherently mutation-prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of non-canonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair, and may serve to increase mutation frequencies in generalized fashion (i.e. both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. PMID:21853507
Kim, Heon Seok; Lee, Kyungjin; Bae, Sangsu; Park, Jeongbin; Lee, Chong-Kyo; Kim, Meehyein; Kim, Eunji; Kim, Minju; Kim, Seokjoong; Kim, Chonsaeng; Kim, Jin-Soo
2017-06-23
Several groups have used genome-wide libraries of lentiviruses encoding small guide RNAs (sgRNAs) for genetic screens. In most cases, sgRNA expression cassettes are integrated into cells by using lentiviruses, and target genes are statistically estimated by the readout of sgRNA sequences after targeted sequencing. We present a new virus-free method for human gene knockout screens using a genome-wide library of CRISPR/Cas9 sgRNAs based on plasmids and target gene identification via whole-genome sequencing (WGS) confirmation of authentic mutations rather than statistical estimation through targeted amplicon sequencing. We used 30,840 pairs of individually synthesized oligonucleotides to construct the genome-scale sgRNA library, collectively targeting 10,280 human genes (i.e., three sgRNAs per gene). These plasmid libraries were co-transfected with a Cas9-expression plasmid into human cells, which were then treated with cytotoxic drugs or viruses. Only cells lacking key factors essential for cytotoxic drug metabolism or viral infection were able to survive. Genomic DNA isolated from cells that survived these challenges was subjected to WGS to directly identify CRISPR/Cas9-mediated causal mutations essential for cell survival. With this approach, we were able to identify known and novel genes essential for viral infection in human cells. We propose that genome-wide sgRNA screens based on plasmids coupled with WGS are powerful tools for forward genetics studies and drug target discovery. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.
Evolution of meiotic recombination genes in maize and teosinte.
Sidhu, Gaganpreet K; Warzecha, Tomasz; Pawlowski, Wojciech P
2017-01-25
Meiotic recombination is a major source of genetic variation in eukaryotes. The role of recombination in evolution is recognized but little is known about how evolutionary forces affect the recombination pathway itself. Although the recombination pathway is fundamentally conserved across different species, genetic variation in recombination components and outcomes has been observed. Theoretical predictions and empirical studies suggest that changes in the recombination pathway are likely to provide adaptive abilities to populations experiencing directional or strong selection pressures, such as those occurring during species domestication. We hypothesized that adaptive changes in recombination may be associated with adaptive evolution patterns of genes involved in meiotic recombination. To examine how maize evolution and domestication affected meiotic recombination genes, we studied patterns of sequence polymorphism and divergence in eleven genes controlling key steps in the meiotic recombination pathway in a diverse set of maize inbred lines and several accessions of teosinte, the wild ancestor of maize. We discovered that, even though the recombination genes generally exhibited high sequence conservation expected in a pathway controlling a key cellular process, they showed substantial levels and diverse patterns of sequence polymorphism. Among others, we found differences in sequence polymorphism patterns between tropical and temperate maize germplasms. Several recombination genes displayed patterns of polymorphism indicative of adaptive evolution. Despite their ancient origin and overall sequence conservation, meiotic recombination genes can exhibit extensive and complex patterns of molecular evolution. Changes in these genes could affect the functioning of the recombination pathway, and may have contributed to the successful domestication of maize and its expansion to new cultivation areas.
Phylogeny of sipunculan worms: A combined analysis of four gene regions and morphology.
Schulze, Anja; Cutler, Edward B; Giribet, Gonzalo
2007-01-01
The intra-phyletic relationships of sipunculan worms were analyzed based on DNA sequence data from four gene regions and 58 morphological characters. Initially we analyzed the data under direct optimization using parsimony as optimality criterion. An implied alignment resulting from the direct optimization analysis was subsequently utilized to perform a Bayesian analysis with mixed models for the different data partitions. For this we applied a doublet model for the stem regions of the 18S rRNA. Both analyses support monophyly of Sipuncula and most of the same clades within the phylum. The analyses differ with respect to the relationships among the major groups but whereas the deep nodes in the direct optimization analysis generally show low jackknife support, they are supported by 100% posterior probability in the Bayesian analysis. Direct optimization has been useful for handling sequences of unequal length and generating conservative phylogenetic hypotheses whereas the Bayesian analysis under mixed models provided high resolution in the basal nodes of the tree.
Zabeau, M; Stanley, K K
1982-01-01
Hybrid plasmids carrying cro-lacZ gene fusions have been constructed by joining DNA segments carrying the PR promoter and the start of the cro gene of bacteriophage lambda to the lacZ gene fragment carried by plasmid pLG400. Plasmids in which the translational reading frames of the cro and lacZ genes are joined in-register (type I) direct the synthesis of elevated levels of cro-beta-galactosidase fusion protein amounting to 30% of the total cellular protein, while plasmids in which the genes are fused out-of-register (type II) produce a low level of beta-galactosidase protein. Sequence rearrangements downstream of the cro initiator AUG were found to influence the efficiency of translation, and have been correlated with alterations in the RNA secondary structure of the ribosome-binding site. Plasmids which direct the synthesis of high levels of beta-galactosidase are conditionally lethal and can only be propagated when the PR promoter is repressed. Deletion of sequences downstream of the lacZ gene restored viability, indicating that this region of the plasmid encodes a function which inhibits the growth of the cells. The different applications of these plasmids for expression of cloned genes are discussed. PMID:6327257
Ahmad, Aftab; Niwa, Yasuo; Goto, Shingo; Ogawa, Takeshi; Shimizu, Masanori; Suzuki, Akane; Kobayashi, Kyoko; Kobayashi, Hirokazu
2015-01-01
An activation-tagging methodology was applied to dedifferentiated calli of Arabidopsis to identify new genes involved in salt tolerance. This identified salt tolerant callus 8 (stc8) as a gene encoding the basic helix-loop-helix transcription factor bHLH106. bHLH106-knockout (KO) lines were more sensitive to NaCl, KCl, LiCl, ABA, and low temperatures than the wild-type. Back-transformation of the KO line rescued its phenotype, and over-expression (OX) of bHLH106 in differentiated plants conferred tolerance to NaCl. Green fluorescent protein (GFP) fused with bHLH106 revealed that it was localized to the nucleus. Prepared bHLH106 protein was subjected to electrophoretic mobility shift assays against E-box sequences (5'-CANNTG-3'). The G-box sequence 5'-CACGTG-3' had the strongest interaction with bHLH106. bHLH106-OX lines were transcriptomically analyzed, and the resultant up- and down-regulated genes were selected on the criterion of the presence of a G-box sequence. There were 198 genes positively regulated by bHLH106 and 36 genes negatively regulated; these genes possessed one or more G-box sequences in their promoter regions. Many of these genes are known to be involved in abiotic stress response. It is concluded that bHLH106 is located at a branching point in the abiotic stress response network, interacting directly with the G-box in genes conferring salt tolerance on plants.
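The G-box selection criterion described in this abstract can be illustrated with a small motif scan. The following is a minimal Python sketch, not the authors' pipeline: the function name `find_boxes` and the example sequence are hypothetical and made up for illustration. It relies only on the two motifs given in the abstract (E-box 5'-CANNTG-3', G-box 5'-CACGTG-3'). Because CANNTG is its own reverse complement, scanning one strand is enough.

```python
import re

# E-box consensus 5'-CANNTG-3'; the G-box 5'-CACGTG-3' is one specific E-box.
# A lookahead is used so that overlapping matches would also be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
G_BOX = "CACGTG"


def find_boxes(promoter):
    """Return (position, motif, is_g_box) for every E-box in the sequence.

    Positions are 0-based offsets from the start of the given string.
    """
    seq = promoter.upper()
    hits = []
    for match in E_BOX.finditer(seq):
        motif = match.group(1)
        hits.append((match.start(), motif, motif == G_BOX))
    return hits


# Hypothetical promoter fragment, invented for illustration only.
example = "TTACACGTGGCATCAGCTGAA"
for pos, motif, is_g in find_boxes(example):
    print(pos, motif, "G-box" if is_g else "other E-box")
```

A gene would pass a G-box filter of this kind if at least one hit in its promoter region has `is_g_box` set; how the promoter region is delimited is left open here, since the abstract does not state it.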
Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees
1998-01-01
The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. Of the 137 clones analyzed, 123 (89.8%) carried intact vif open reading frames. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004
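The intact-reading-frame tally reported above (123 of 137 clones) can be sketched in Python. The abstract does not state the authors' exact criterion for an "intact" open reading frame, so the definition below is an assumption: the sequence starts with ATG, its length is a multiple of three, it ends with a stop codon, and it has no earlier in-frame stop. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_intact_orf(seq):
    """Assumed criterion: ATG start, whole codons, terminal stop, no internal stop."""
    seq = seq.upper()
    if len(seq) < 6 or len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(c in STOP_CODONS for c in codons[:-1])
    return has_terminal_stop and not has_internal_stop


def intact_fraction(clones):
    """Return (number intact, number of clones, percent intact)."""
    intact = sum(is_intact_orf(c) for c in clones)
    return intact, len(clones), 100.0 * intact / len(clones)


# The percentage in the abstract follows directly from the two counts it gives.
print(round(100.0 * 123 / 137, 1))
```

The toy check applies to any list of clone sequences; real clone data would be needed to reproduce the reported counts.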
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.
Eernisse, D J
1992-04-01
DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
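Translation with alternate genetic codes and ambiguous nucleotide symbols, as mentioned in this abstract, can be sketched as follows. This is not DNA Translator's own code (which is a HyperCard stack); it is a minimal Python illustration. The names `STANDARD`, `VERT_MITO` and `translate` are chosen here for illustration. The codon tables follow the standard genetic code and the vertebrate mitochondrial code; an ambiguous codon is translated only when every possible expansion gives the same amino acid, and is otherwise reported as X.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}
# Vertebrate mitochondrial code: four codons differ from the standard code.
VERT_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}

# IUPAC nucleotide ambiguity symbols and the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna, code=STANDARD):
    """Translate in frame 1; trailing bases that do not fill a codon are ignored."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        # Expand ambiguity symbols and collect every amino acid they could encode.
        options = {code["".join(p)] for p in product(*(IUPAC[b] for b in codon))}
        protein.append(options.pop() if len(options) == 1 else "X")
    return "".join(protein)


print(translate("ATGTGAAGA"))             # standard code
print(translate("ATGTGAAGA", VERT_MITO))  # vertebrate mitochondrial code
```

Passing the code table as a parameter keeps the translation loop unchanged when another genetic code is added; characters outside the IUPAC set raise a `KeyError` in this sketch.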
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid
Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.
2016-01-01
High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617
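For readers unfamiliar with the term, a gene expression matrix (GEM) is a table with one row per gene and one column per sample. The following minimal Python sketch shows only that final data structure; it is not the OSG-GEM Pegasus workflow, which also handles read mapping and quantification. The function name `build_gem`, the sample and gene names, and the choice to fill missing values with 0.0 are all assumptions made for illustration.

```python
def build_gem(sample_counts):
    """Combine per-sample expression values into a gene-by-sample matrix.

    sample_counts maps sample name -> {gene name -> expression value}.
    Returns (genes, samples, matrix) where matrix[i][j] is the value of
    genes[i] in samples[j]; genes absent from a sample are filled with 0.0.
    """
    samples = sorted(sample_counts)
    genes = sorted({g for counts in sample_counts.values() for g in counts})
    matrix = [[sample_counts[s].get(g, 0.0) for s in samples] for g in genes]
    return genes, samples, matrix


# Hypothetical values, invented for illustration only.
genes, samples, matrix = build_gem({
    "sample1": {"geneA": 10.0, "geneB": 2.5},
    "sample2": {"geneA": 7.0},
})
for gene, row in zip(genes, matrix):
    print(gene, row)
```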
Cis-acting elements in the promoter region of the human aldolase C gene.
Buono, P; de Conciliis, L; Olivetta, E; Izzo, P; Salvatore, F
1993-08-16
We investigated the cis-acting sequences involved in the expression of the human aldolase C gene by transient transfections into human neuroblastoma cells (SKNBE). We demonstrate that 420 bp of the 5'-flanking DNA direct at high efficiency the transcription of the CAT reporter gene. A deletion between -420 bp and -164 bp causes a 60% decrease of CAT activity. Gel shift and DNase I footprinting analyses revealed four protected elements: A, B, C and D. Competition analyses indicate that Sp1 or factors sharing a similar sequence specificity bind to elements A and B, but not to elements C and D. Sequence analysis shows a half palindromic ERE motif (GGTCA), in elements B and D. Region D binds a transactivating factor which appears also essential to stabilize the initiation complex.
Ni, Xiangyang; Westpheling, Janet
1997-01-01
The chi63 promoter directs glucose-sensitive, chitin-dependent transcription of a gene involved in the utilization of chitin as carbon source. Analysis of 5′ and 3′ deletions of the promoter region revealed that a 350-bp segment is sufficient for wild-type levels of expression and regulation. The analysis of single base changes throughout the promoter region, introduced by random and site-directed mutagenesis, identified several sequences to be important for activity and regulation. Single base changes at −10, −12, −32, −33, −35, and −37 upstream of the transcription start site resulted in loss of activity from the promoter, suggesting that bases in these positions are important for RNA polymerase interaction. The sequences centered around −10 (TATTCT) and −35 (TTGACC) in this promoter are, in fact, prototypical of eubacterial promoters. Overlapping the RNA polymerase binding site is a perfect 12-bp direct repeat sequence. Some base changes within this direct repeat resulted in constitutive expression, suggesting that this sequence is an operator for negative regulation. Other base changes resulted in loss of glucose repression while retaining the requirement for chitin induction, suggesting that this sequence is also involved in glucose repression. The fact that cis-acting mutations resulted in glucose resistance but not inducer independence rules out the possibility that glucose repression acts exclusively by inducer exclusion. The fact that mutations that affect glucose repression and chitin induction fall within the same direct repeat sequence module suggests that the direct repeat sequence facilitates both chitin induction and glucose repression. PMID:9371809
Watanabe, Kazuya; Teramoto, Maki; Futamata, Hiroyuki; Harayama, Shigeaki
1998-01-01
DNA was isolated from phenol-digesting activated sludge, and partial fragments of the 16S ribosomal DNA (rDNA) and the gene encoding the largest subunit of multicomponent phenol hydroxylase (LmPH) were amplified by PCR. An analysis of the amplified fragments by temperature gradient gel electrophoresis (TGGE) demonstrated that two major 16S rDNA bands (bands R2 and R3) and two major LmPH gene bands (bands P2 and P3) appeared after the activated sludge became acclimated to phenol. The nucleotide sequences of these major bands were determined. In parallel, bacteria were isolated from the activated sludge by direct plating or by plating after enrichment either in batch cultures or in a chemostat culture. The bacteria isolated were classified into 27 distinct groups by a repetitive extragenic palindromic sequence PCR analysis. The partial nucleotide sequences of 16S rDNAs and LmPH genes of members of these 27 groups were then determined. A comparison of these nucleotide sequences with the sequences of the major TGGE bands indicated that the major bacterial populations, R2 and R3, possessed major LmPH genes P2 and P3, respectively. The dominant populations could be isolated either by direct plating or by chemostat culture enrichment but not by batch culture enrichment. One of the dominant strains (R3), which contained a novel type of LmPH (P3), was closely related to Variovorax paradoxus, and the result of a kinetic analysis of its phenol-oxygenating activity suggested that this strain was the principal phenol digester in the activated sludge. PMID:9797297
Blixt, Maria K E; Hallböök, Finn
2016-01-01
Combining techniques of episomal vector gene-specific Cre expression and genomic integration using the piggyBac transposon system enables studies of gene expression-specific cell lineage tracing in the chicken retina. In this work, we aimed to target the retinal horizontal cell progenitors. A 208 bp gene regulatory sequence from the chicken retinoid X receptor γ gene (RXRγ208) was used to drive Cre expression. RXRγ is expressed in progenitors and photoreceptors during development. The vector was combined with a piggyBac "donor" vector containing a floxed STOP sequence followed by enhanced green fluorescent protein (EGFP), as well as a piggyBac helper vector for efficient integration into the host cell genome. The vectors were introduced into the embryonic chicken retina with in ovo electroporation. Tissue electroporation allows targeting at specific developmental time points and in specific structures. Cells that drove Cre expression from the regulatory RXRγ208 sequence excised the floxed STOP sequence and expressed GFP. The approach generated a stable lineage with robust expression of GFP in retinal cells that have activated transcription from the RXRγ208 sequence. Furthermore, GFP was expressed in cells that express horizontal or photoreceptor markers when electroporation was performed between developmental stages 22 and 28. Electroporation of a stage 12 optic cup gave multiple cell types in accordance with RXRγ gene expression in the early retina. In this study, we describe an easy, cost-effective, and time-efficient method for testing regulatory sequences in general. More specifically, our results open up the possibility for further studies of the RXRγ-gene regulatory network governing the formation of photoreceptor and horizontal cells. In addition, the method presents approaches to target the expression of effector genes, such as regulators of cell fate or cell cycle progression, to these cells and their progenitors.
Jin, Ke; Xue, Chenyi; Wu, Xiaoli; Qian, Jinyi; Zhu, Yong; Yang, Zhen; Yonezawa, Takahiro; Crabbe, M. James C.; Cao, Ying; Hasegawa, Masami; Zhong, Yang; Zheng, Yufang
2011-01-01
Background The giant panda has an interesting bamboo diet unlike the other species in the order of Carnivora. The umami taste receptor gene T1R1 has been identified as a pseudogene during its genome sequencing project and confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. Such mutation coincided with the giant panda's dietary change and also reinforced its herbivorous life style. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda's diet switch. Methodology/Principal Findings Since taste is part of the reward properties of food related to its energy and nutrition contents, we did a systematic analysis on those genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared with the human sequence first and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Conclusions/Significance Our results revealed an interesting dopamine metabolic involvement in the panda's food choice. This finding suggests a new direction for molecular evolution studies behind the panda's dietary switch. PMID:21818345
Torrent, C; Gabus, C; Darlix, J L
1994-02-01
Retroviral genomes consist of two identical RNA molecules associated at their 5' ends by the dimer linkage structure located in the packaging element (Psi or E) necessary for RNA dimerization in vitro and packaging in vivo. In murine leukemia virus (MLV)-derived vectors designed for gene transfer, the Psi + sequence of 600 nucleotides directs the packaging of recombinant RNAs into MLV virions produced by helper cells. By using in vitro RNA dimerization as a screening system, a sequence of rat VL30 RNA located next to the 5' end of the Harvey mouse sarcoma virus genome and as small as 67 nucleotides was found to form stable dimeric RNA. In addition, a purine-rich sequence located at the 5' end of this VL30 RNA seems to be critical for RNA dimerization. When this VL30 element was extended by 107 nucleotides at its 3' end and inserted into an MLV-derived vector lacking MLV Psi +, it directed the efficient encapsidation of recombinant RNAs into MLV virions. Because this VL30 packaging signal is smaller and more efficient in packaging recombinant RNAs than the MLV Psi + and does not contain gag or glyco-gag coding sequences, its use in MLV-derived vectors should render even more unlikely recombinations which could generate replication-competent viruses. Therefore, utilization of the rat VL30 packaging sequence should improve the biological safety of MLV vectors for human gene transfer.
Guillet-Claude, Carine; Isabel, Nathalie; Pelgas, Betty; Bousquet, Jean
2004-12-01
Class I knox genes code for transcription factors that play an essential role in plant growth and development as central regulators of meristem cell identity. Based on the analysis of new cDNA sequences from various tissues and genomic DNA sequences, we identified a highly diversified group of class I knox genes in conifers. Phylogenetic analyses of complete amino acid sequences from various seed plants indicated that all conifer sequences formed a monophyletic group. Within conifers, four subgroups here named genes KN1 to KN4 were well delineated, each regrouping pine and spruce sequences. KN4 was sister group to KN3, which was sister group to KN1 and KN2. Genetic mapping on the genomes of two divergent Picea species indicated that KN1 and KN2 are located close to each other on the same linkage group, whereas KN3 and KN4 mapped on different linkage groups, correlating the more ancient divergence of these two genes. The proportion of synonymous and nonsynonymous substitutions suggested intense purifying selection for the four genes. However, rates of substitution per year indicated an evolution in two steps: faster rates were noted after gene duplications, followed subsequently by lower rates. Positive directional selection was detected for most of the internal branches harboring an accelerated rate of evolution. In addition, many sites with highly significant amino acid rate shift were identified between these branches. However, the tightly linked KN1 and KN2 did not diverge as much from each other. The implications of the correlation between phylogenetic, structural, and functional information are discussed in relation to the diversification of the knox-I gene family in conifers.
Phenotypic and genotypic analysis of Borrelia burgdorferi isolates from various sources.
Adam, T; Gassmann, G S; Rasiah, C; Göbel, U B
1991-01-01
A total of 17 B. burgdorferi isolates from various sources were characterized by sodium dodecyl sulfate-polyacrylamide gel electrophoresis of whole-cell proteins, restriction enzyme analysis, Southern hybridization with probes complementary to unique regions of evolutionarily conserved genes (16S rRNA and fla), and direct sequencing of in vitro polymerase chain reaction-amplified fragments of the 16S rRNA gene. Three groups were distinguished on the basis of phenotypic and genotypic traits, the latter traced to the nucleotide sequence level. PMID:1649797
Rapid detection of Mannheimia haemolytica in lung tissues of sheep and from bacterial culture.
Kumar, Jyoti; Dixit, Shivendra Kumar; Kumar, Rajiv
2015-09-01
This study aimed to detect Mannheimia haemolytica in lung tissues of sheep and from a bacterial culture. M. haemolytica is one of the most important and well-established etiological agents of pneumonia in sheep and other ruminants throughout the world. Accurate diagnosis of M. haemolytica primarily relies on bacteriological examination, biochemical characteristics, and biotyping and serotyping of the isolates. In an effort to facilitate rapid M. haemolytica detection, polymerase chain reaction assays targeting the Pasteurella haemolytica serotype-1 specific antigens (PHSSA), Rpt2 and 12S ribosomal RNA (rRNA) genes were used to detect M. haemolytica directly from lung tissues and from bacterial culture. A total of 12 archived lung tissues from sheep that died of pneumonia on an organized farm were used. A multiplex polymerase chain reaction (mPCR) based on two amplicons targeting the PHSSA and Rpt2 genes of M. haemolytica was used for identification of M. haemolytica isolates in culture from the lung samples. All 12 lung tissue samples were tested for the presence of M. haemolytica by PCR targeting the PHSSA and Rpt2 genes, with confirmation by sequencing of the amplicons. All 12 lung tissue samples tested for the presence of the PHSSA and Rpt2 genes of M. haemolytica by mPCR were found to be positive. Amplification of a 12S rRNA gene fragment as an internal amplification control was obtained with each mPCR reaction performed on DNA extracted directly from lung tissue samples. All the M. haemolytica isolates were also positive by mPCR. No amplified DNA bands were observed for negative control reactions. All three nucleotide sequences were deposited in NCBI GenBank (Accession No. KJ534629, KJ534630 and KJ534631). Sequencing of the amplified products revealed 99-100% identity with published sequences of the PHSSA and Rpt2 genes of M. haemolytica available in the NCBI database. The sheep-specific mitochondrial 12S rRNA gene sequence also showed 98% identity with published sequences in the NCBI database. The present study emphasized PCR as a valuable tool for rapid detection of M. haemolytica in clinical samples from animals. In addition, it offers the opportunity to perform large-scale epidemiological studies regarding the role of M. haemolytica in clinical cases of pneumonia and other disease manifestations in sheep and other ruminants, thereby providing the basis for effective preventive strategies.
Zhou, Zhenxing; Xu, Qingqing; Bu, Qingting; Guo, Yuanyang; Liu, Shuiping; Liu, Yu; Du, Yiling; Li, Yongquan
2015-02-09
Genomic sequencing of actinomycetes has revealed the presence of numerous gene clusters seemingly capable of natural product biosynthesis, yet most clusters are cryptic under laboratory conditions. Bioinformatics analysis of the completely sequenced genome of Streptomyces chattanoogensis L10 (CGMCC 2644) revealed a silent angucycline biosynthetic gene cluster. The overexpression of a pathway-specific activator gene under the constitutive ermE* promoter successfully triggered the expression of the angucycline biosynthetic genes. Two novel members of the angucycline antibiotic family, chattamycins A and B, were further isolated and elucidated. Biological activity assays demonstrated that chattamycin B possesses good antitumor activities against human cancer cell lines and moderate antibacterial activities. The results presented here provide a feasible method to activate silent angucycline biosynthetic gene clusters to discover potential new drug leads. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Identifying transposon insertions and their effects from RNA-sequencing data.
de Ruiter, Julian R; Kas, Sjors M; Schut, Eva; Adams, David J; Koudijs, Marco J; Wessels, Lodewyk F A; Jonkers, Jos
2017-07-07
Insertional mutagenesis using engineered transposons is a potent forward genetic screening technique used to identify cancer genes in mouse model systems. In the analysis of these screens, transposon insertion sites are typically identified by targeted DNA-sequencing and subsequently assigned to predicted target genes using heuristics. As such, these approaches provide no direct evidence that insertions actually affect their predicted targets or how transcripts of these genes are affected. To address this, we developed IM-Fusion, an approach that identifies insertion sites from gene-transposon fusions in standard single- and paired-end RNA-sequencing data. We demonstrate IM-Fusion on two separate transposon screens of 123 mammary tumors and 20 B-cell acute lymphoblastic leukemias, respectively. We show that IM-Fusion accurately identifies transposon insertions and their true target genes. Furthermore, by combining the identified insertion sites with expression quantification, we show that we can determine the effect of a transposon insertion on its target gene(s) and prioritize insertions that have a significant effect on expression. We expect that IM-Fusion will significantly enhance the accuracy of cancer gene discovery in forward genetic screens and provide initial insight into the biological effects of insertions on candidate cancer genes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
ICEPmu1, an integrative conjugative element (ICE) of Pasteurella multocida: structure and transfer.
Michael, Geovana Brenner; Kadlec, Kristina; Sweeney, Michael T; Brzuszkiewicz, Elzbieta; Liesegang, Heiko; Daniel, Rolf; Murray, Robert W; Watts, Jeffrey L; Schwarz, Stefan
2012-01-01
Integrative and conjugative elements (ICEs) have not been detected in Pasteurella multocida. In this study the multiresistance ICEPmu1 from bovine P. multocida was analysed for its core genes and its ability to conjugatively transfer into strains of the same and different genera. ICEPmu1 was identified during whole genome sequencing. Coding sequences were predicted by bioinformatic tools and manually curated using the annotation software ERGO. Conjugation into P. multocida, Mannheimia haemolytica and Escherichia coli recipients was performed by mating assays. The presence of ICEPmu1 and its circular intermediate in the recipient strains was confirmed by PCR and sequence analysis. Integration sites were sequenced. Susceptibility testing of the ICEPmu1-carrying recipients was conducted by broth microdilution. The 82 214 bp ICEPmu1 harbours 88 genes. The core genes of ICEPmu1, which are involved in excision/integration and conjugative transfer, resemble those found in a 66 641 bp ICE from Histophilus somni. ICEPmu1 integrates into a tRNA(Leu) and is flanked by 13 bp direct repeats. It is able to conjugatively transfer to P. multocida, M. haemolytica and E. coli, where it also uses a tRNA(Leu) for integration and produces closely related 13 bp direct repeats. PCR assays and susceptibility testing confirmed the presence and the functional activity of the ICEPmu1-associated resistance genes in the recipient strains. The observation that the multiresistance ICEPmu1 is present in a bovine P. multocida and can easily spread across strain and genus boundaries underlines the risk of a rapid dissemination of multiple resistance genes, which will distinctly decrease the therapeutic options.
Transcription Factor Map Alignment of Promoter Regions
Blanco, Enrique; Messeguer, Xavier; Smith, Temple F; Guigó, Roderic
2006-01-01
We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments. PMID:16733547
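As a rough illustration of aligning sequences of binding-factor labels rather than nucleotides, the sketch below runs a plain Needleman-Wunsch global alignment over two lists of labels. It is not the authors' algorithm: the abstract says their method adapts a restriction-map alignment algorithm, which this simplified version does not reproduce (for example, it ignores site positions). The function name `align_tf_maps`, the scoring values, and the factor labels in the usage note are all assumptions chosen for illustration.

```python
def align_tf_maps(map_a, map_b, match=2, mismatch=-1, gap=-1):
    """Globally align two lists of TF labels (simplified Needleman-Wunsch).

    Returns (score, pairs) where pairs is a list of (label_a, label_b)
    tuples and None marks a gap. Scoring values are illustrative assumptions.
    """
    n, m = len(map_a), len(map_b)

    def sub(i, j):
        # Substitution score for aligning map_a[i-1] with map_b[j-1].
        return match if map_a[i - 1] == map_b[j - 1] else mismatch

    # Dynamic-programming table with gap-only first row and column.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two labels
                score[i - 1][j] + gap,            # gap in map_b
                score[i][j - 1] + gap,            # gap in map_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((map_a[i - 1], map_b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((map_a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, map_b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

For example, `align_tf_maps(["SP1", "TBP", "CREB"], ["SP1", "CREB"])` pairs the two shared labels and leaves the middle label of the first map opposite a gap. The labels here are placeholders, not sites reported in the paper.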
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zehr, J.P.; Mellon, M.T.; Zani, S.
1998-09-01
Oligotrophic oceanic waters of the central ocean gyres typically have extremely low dissolved fixed inorganic nitrogen concentrations, but few nitrogen-fixing microorganisms from the oceanic environment have been cultivated. Nitrogenase gene (nifH) sequences amplified directly from oceanic waters showed that the open ocean contains more diverse diazotrophic microbial populations and more diverse habitats for nitrogen fixers than previously observed by classical microbiological techniques. Nitrogenase genes derived from unicellular and filamentous cyanobacteria, as well as from the α and γ subdivisions of the class Proteobacteria, were found in both the Atlantic and Pacific oceans. nifH sequences that cluster phylogenetically with sequences from sulfate reducers or clostridia were found associated with planktonic crustaceans. Nitrogenase sequence types obtained from invertebrates represented phylotypes distinct from the phylotypes detected in the picoplankton size fraction. The results indicate that there are in the oceanic environment several distinct potentially nitrogen-fixing microbial assemblages that include representatives of diverse phylotypes.
Conifer genomics and adaptation: at the crossroads of genetic diversity and genome function.
Prunier, Julien; Verta, Jukka-Pekka; MacKay, John J
2016-01-01
Conifers have been understudied at the genomic level despite their worldwide ecological and economic importance but the situation is rapidly changing with the development of next generation sequencing (NGS) technologies. With NGS, genomics research has simultaneously gained in speed, magnitude and scope. In just a few years, genomes of 20-24 gigabases have been sequenced for several conifers, with several others expected in the near future. Biological insights have resulted from recent sequencing initiatives as well as genetic mapping, gene expression profiling and gene discovery research over nearly two decades. We review the knowledge arising from conifer genomics research emphasizing genome evolution and the genomic basis of adaptation, and outline emerging questions and knowledge gaps. We discuss future directions in three areas with potential inputs from NGS technologies: the evolutionary impacts of adaptation in conifers based on the adaptation-by-speciation model; the contributions of genetic variability of gene expression in adaptation; and the development of a broader understanding of genetic diversity and its impacts on genome function. These research directions promise to sustain research aimed at addressing the emerging challenges of adaptation that face conifer trees. © 2015 The Authors. New Phytologist © 2015 New Phytologist Trust.
De Iudicibus, Sara; Lucafò, Marianna; Vitulo, Nicola; Martelossi, Stefano; Zimbello, Rosanna; De Pascale, Fabio; Forcato, Claudio; Naviglio, Samuele; Di Silvestre, Alessia; Gerdol, Marco; Stocco, Gabriele; Valle, Giorgio; Ventura, Alessandro; Bramuzzo, Matteo; Decorti, Giuliana
2018-05-08
The aim of this research was the identification of novel pharmacogenomic biomarkers for better understanding the complex gene regulation mechanisms underpinning glucocorticoid (GC) action in paediatric inflammatory bowel disease (IBD). This goal was achieved by evaluating high-throughput microRNA (miRNA) profiles during GC treatment, integrated with the assessment of expression changes in GC receptor (GR) heterocomplex genes. Furthermore, we tested the hypothesis that differentially expressed miRNAs could be directly regulated by GCs through investigating the presence of GC responsive elements (GREs) in their gene promoters. Ten IBD paediatric patients responding to GCs were enrolled. Peripheral blood was obtained at diagnosis (T0) and after four weeks of steroid treatment (T4). MicroRNA profiles were analyzed using next generation sequencing, and selected significantly differentially expressed miRNAs were validated by quantitative reverse transcription-polymerase chain reaction. In detail, 18 miRNAs were differentially expressed from T0 to T4, 16 of which were upregulated and 2 of which were downregulated. Out of these, three miRNAs (miR-144, miR-142, and miR-96) could putatively recognize the 3' UTR of the GR gene and three miRNAs (miR-363, miR-96, miR-142) contained GRE sequences, thereby potentially enabling direct regulation by the GR. In conclusion, we identified miRNAs differentially expressed during GC treatment and miRNAs which could be directly regulated by GCs in blood cells of young IBD patients. These results could represent a first step towards their translation as pharmacogenomic biomarkers.
Sequences in the intergenic spacer influence RNA Pol I transcription from the human rRNA promoter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, W.M.; Sylvester, J.E.
1994-09-01
In most eucaryotic species, ribosomal genes are tandemly repeated about 100-5000 times per haploid genome. The 43 Kb human rDNA repeat consists of a 13 Kb coding region for the 18S, 5.8S, 28S ribosomal RNAs (rRNAs) and transcribed spacers separated by a 30 Kb intergenic spacer. For species such as frog, mouse and rat, sequences in the intergenic spacer other than the gene promoter have been shown to modulate transcription of the ribosomal gene. These sequences are spacer promoters, enhancers and the terminator for spacer transcription. We are addressing whether the human ribosomal gene promoter is similarly influenced. In-vitro transcription run-off assays have revealed that the 4.5 kb region (CBE), directly upstream of the gene promoter, has cis-stimulation and trans-competition properties. This suggests that the CBE fragment contains an enhancer(s) for ribosomal gene transcription. Further experiments have shown that a fragment (approximately 1.6 kb) within the CBE fragment also has trans-competition function. Deletion subclones of this region are being tested to delineate the exact sequences responsible for these modulating activities. Previous sequence analysis and functional studies have revealed that CBE contains regions of DNA capable of adopting alternative structures such as bent DNA, Z-DNA, and triple-stranded DNA. Whether these structures are required for modulating transcription remains to be determined as does the specific DNA-protein interaction involved.
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.
Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A
1999-12-20
The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact with a novel type of DNA sequences in vitro, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
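The heptameric triple-A consensus quoted above, G/CAAACA/TC, can be read as a simple pattern: G or C, then AAAC, then A or T, then C. The sketch below scans both strands of a DNA string for that pattern. It is only a literal string match on the consensus as written in the abstract; the function names are assumptions, and a match says nothing about binding affinity or biological relevance.

```python
import re

# Consensus from the abstract: [G/C] A A A C [A/T] C (seven nucleotides).
# The lookahead lets overlapping occurrences be reported as well.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")
SITE_LEN = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_triple_a_sites(seq: str):
    """Return sorted (start, strand, site) tuples for consensus matches.

    start is 0-based on the forward strand; for '-' hits it is the
    forward-strand coordinate of the leftmost base covered by the site.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in TRIPLE_A.finditer(seq)]
    rc = reverse_complement(seq)
    for m in TRIPLE_A.finditer(rc):
        # Map the reverse-strand match back to forward-strand coordinates.
        start = len(seq) - (m.start() + SITE_LEN)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)
```

Calling `find_triple_a_sites("TTGAAACACTT")` reports a single forward-strand hit starting at position 2. The input here is a made-up sequence, not one of the natural target sites described in the paper.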
Yomano, L P; Scopes, R K; Ingram, L O
1993-01-01
Phosphoglycerate mutase is an essential glycolytic enzyme for Zymomonas mobilis, catalyzing the reversible interconversion of 3-phosphoglycerate and 2-phosphoglycerate. The pgm gene encoding this enzyme was cloned on a 5.2-kbp DNA fragment and expressed in Escherichia coli. Recombinants were identified by using antibodies directed against purified Z. mobilis phosphoglycerate mutase. The pgm gene contains a canonical ribosome-binding site, a biased pattern of codon usage, a long upstream untranslated region, and four promoters which share sequence homology. Interestingly, adhA and a D-specific 2-hydroxyacid dehydrogenase were found on the same DNA fragment and appear to form a cluster of genes which function in central metabolism. The translated sequence for Z. mobilis pgm was in full agreement with the 40 N-terminal amino acid residues determined by protein sequencing. The primary structure of the translated sequence is highly conserved (52 to 60% identity with other phosphoglycerate mutases) and also shares extensive homology with bisphosphoglycerate mutases (51 to 59% identity). Since Southern blots indicated the presence of only a single copy of pgm in the Z. mobilis chromosome, it is likely that the cloned pgm gene functions to provide both activities. Z. mobilis phosphoglycerate mutase is unusual in that it lacks the flexible tail and lysines at the carboxy terminus which are present in the enzyme isolated from all other organisms examined. PMID:8320209
USDA-ARS's Scientific Manuscript database
Changes in gene regulation that underlie phenotypic evolution can be encoded directly in the DNA sequence or mediated by chromatin modifications such as DNA methylation. It has been hypothesized that the evolution of social behavior is associated with enhanced gene regulatory potential, which may in...
Prospects and challenges for fungal metatranscriptomics of complex communities
Cheryl R. Kuske; Cedar N. Hesse; Jean F. Challacombe; Daniel Cullen; Joshua R. Herr; Rebecca C. Mueller; Adrian Tsang; Rytas Vilgalys
2015-01-01
The ability to extract and purify messenger RNA directly from plants, decomposing organic matter and soil, followed by high-throughput sequencing of the pool of expressed genes, has spawned the emerging research area of metatranscriptomics. Each metatranscriptome provides a snapshot of the composition and relative abundance of actively transcribed genes, and thus...
Danno, Hiroki; Michiue, Tatsuo; Hitachi, Keisuke; Yukita, Akira; Ishiura, Shoichi; Asashima, Makoto
2008-04-08
The neural-related genes Sox2, Pax6, Otx2, and Rax have been associated with severe ocular malformations such as anophthalmia and microphthalmia, but it remains unclear as to how these genes are linked functionally. We analyzed the upstream signaling of Xenopus Rax (also known as Rx1) and identified the Otx2 and Sox2 proteins as direct upstream regulators of Rax. We revealed that endogenous Otx2 and Sox2 proteins bound to the conserved noncoding sequence (CNS1) located approximately 2 kb upstream of the Rax promoter. This sequence is conserved among vertebrates and is required for potent transcriptional activity. Reporter assays showed that Otx2 and Sox2 synergistically activated transcription via CNS1. Furthermore, the Otx2 and Sox2 proteins physically interacted with each other, and this interaction was affected by the Sox2-missense mutations identified in these ocular disorders. These results demonstrate that the direct interaction and interdependence between the Otx2 and Sox2 proteins coordinate Rax expression in eye development, providing molecular linkages among the genes responsible for ocular malformation.
Differences in selection drive olfactory receptor genes in different directions in dogs and wolf.
Chen, Rui; Irwin, David M; Zhang, Ya-Ping
2012-11-01
The olfactory receptor (OR) gene family is the largest gene family found in mammalian genomes. It is known to evolve through a birth-and-death process. Here, we characterized the sequences of 16 segregating OR pseudogenes in the samples of the wolf and the Chinese village dog (CVD) and compared them with the sequences from dogs of different breeds. Our results show that the segregating OR pseudogenes in breed dogs are under strong purifying selection, while evolving neutrally in the CVD, and show a more complicated pattern in the wolf. In the wolf, we found a trend to remove deleterious polymorphisms and accumulate nondeleterious polymorphisms. On the basis of protein structure of the ORs, we found that the distribution of different types of polymorphisms (synonymous, nonsynonymous, tolerated, and untolerated) varied greatly between the wolf and the breed dogs. In summary, our results suggest that different forms of selection have acted on the segregating OR pseudogenes in the CVD since domestication, breed dogs after breed formation, and ancestral wolf population, which has driven the evolution of these genes in different directions.
A draft annotation and overview of the human genome
Wright, Fred A; Lemon, William J; Zhao, Wei D; Sears, Russell; Zhuo, Degen; Wang, Jian-Ping; Yang, Hee-Yung; Baer, Troy; Stredney, Don; Spitzner, Joe; Stutz, Al; Krahe, Ralf; Yuan, Bo
2001-01-01
Background: The recent draft assembly of the human genome provides a unified basis for describing genomic structure and function. The draft is sufficiently accurate to provide useful annotation, enabling direct observations of previously inferred biological phenomena. Results: We report here a functionally annotated human gene index placed directly on the genome. The index is based on the integration of public transcript, protein, and mapping information, supplemented with computational prediction. We describe numerous global features of the genome and examine the relationship of various genetic maps with the assembly. In addition, initial sequence analysis reveals highly ordered chromosomal landscapes associated with paralogous gene clusters and distinct functional compartments. Finally, these annotation data were synthesized to produce observations of gene density and number that accord well with historical estimates. Such a global approach had previously been described only for chromosomes 21 and 22, which together account for 2.2% of the genome. Conclusions: We estimate that the genome contains 65,000-75,000 transcriptional units, with exon sequences comprising 4%. The creation of a comprehensive gene index requires the synthesis of all available computational and experimental evidence. PMID:11516338
Firth, A. E.; Jagger, B. W.; Wise, H. M.; Nelson, C. C.; Parsawar, K.; Wills, N. M.; Napthine, S.; Taubenberger, J. K.; Digard, P.; Atkins, J. F.
2012-01-01
Programmed ribosomal frameshifting is used in the expression of many virus genes and some cellular genes. In eukaryotic systems, the most well-characterized mechanism involves –1 tandem tRNA slippage on an X_XXY_YYZ motif. By contrast, the mechanisms involved in programmed +1 (or −2) slippage are more varied and often poorly characterized. Recently, a novel gene, PA-X, was discovered in influenza A virus and found to be expressed via a shift to the +1 reading frame. Here, we identify, by mass spectrometric analysis, both the site (UCC_UUU_CGU) and direction (+1) of the frameshifting that is involved in PA-X expression. Related sites are identified in other virus genes that have previously been proposed to be expressed via +1 frameshifting. As these viruses infect insects (chronic bee paralysis virus), plants (fijiviruses and amalgamaviruses) and vertebrates (influenza A virus), such motifs may form a new class of +1 frameshift-inducing sequences that are active in diverse eukaryotes. PMID:23155484
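To make the X_XXY_YYZ notation concrete, the sketch below scans an RNA string for heptamers in which the first three bases are identical (XXX), the next three are identical (YYY), and the underscores fall on zero-frame codon boundaries. It is a minimal pattern check under stated assumptions: the abstract does not restrict which bases X, Y and Z may be, so none are restricted here, and the function name and `frame` parameter are hypothetical. It does not model downstream stimulatory structures or predict frameshifting efficiency.

```python
def find_slippery_sites(rna: str, frame: int = 0):
    """Return (index, heptamer) pairs matching X_XXY_YYZ in the given frame.

    index is the 0-based position of the first X. The zero-frame codons
    are read as ..X | XXY | YYZ, so the first X must be the third base
    of a zero-frame codon, i.e. (index - frame) % 3 == 2.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    sites = []
    for i in range(len(rna) - 6):
        hept = rna[i:i + 7]
        xxx_ok = hept[0] == hept[1] == hept[2]
        yyy_ok = hept[3] == hept[4] == hept[5]
        in_frame = (i - frame) % 3 == 2
        if xxx_ok and yyy_ok and in_frame:
            sites.append((i, hept))
    return sites
```

On the made-up input `"AUGGGAAACUAA"` (codons AUG GGA AAC UAA) the function reports the heptamer `GGGAAAC` at index 2, which reads as G_GGA_AAC in the zero frame. The +1 site identified in the abstract, UCC_UUU_CGU, does not fit this −1 pattern and is not reported, consistent with the abstract's point that +1 sites follow different rules.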
Ogembo, Javier Gordon; Caoili, Barbara L; Shikata, Masamitsu; Chaeychomsri, Sudawan; Kobayashi, Michihiro; Ikeda, Motoko
2009-10-01
A newly cloned Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from Kenya, HearNPV-NNg1, has higher insecticidal activity than HearNPV-G4, which in turn exhibits lower insecticidal activity than HearNPV-C1. In the search for genes and/or nucleotide sequences that might be involved in the observed virulence differences among Helicoverpa spp. NPVs, the entire genome of NNg1 was sequenced and compared with previously sequenced genomes of G4, C1 and Helicoverpa zea single-nucleocapsid NPV (Hz). The NNg1 genome was 132,425 bp in length, with a total of 143 putative open reading frames (ORFs), and shared high levels of overall amino acid and nucleotide sequence identities with G4, C1 and Hz. Three NNg1 ORFs, ORF5, ORF100 and ORF124, which were shared with C1, were absent in G4 and Hz, while NNg1 and C1 were missing a homologue of G4/Hz ORF5. Another three ORFs, ORF60 (bro-b), ORF119 and ORF120, and one direct repeat sequence (dr) were unique to NNg1. Relative to the overall nucleotide sequence identity, lower sequence identities were observed between NNg1 hrs and the homologous hrs in the other three Helicoverpa spp. NPVs, despite containing the same number of hrs located at essentially the same positions on the genomes. Differences were also observed between NNg1 and each of the other three Helicoverpa spp. NPVs in the diversity of bro genes encoded on the genomes. These results indicate several putative genes and nucleotide sequences that may be responsible for the virulence differences observed among Helicoverpa spp. NPVs, yet the specific genes and/or nucleotide sequences responsible have not been identified.
Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.
2015-01-01
Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease found worldwide. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere, achieved by a combination of PacBio long-read and Illumina short-read sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of an estimated 130 copies. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNA-seq data were also used to support gene predictions. PMID:26065709
Analysis of protein-coding genetic variation in 60,706 humans.
Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G
2016-08-18
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
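The candidate-variant filtering mentioned in this abstract can be illustrated with a minimal sketch: variants that are common in a reference panel are discarded, and variants that are rare or absent are kept. The variant identifiers, the frequencies, the `filter_rare` helper and the 0.1% cutoff are all hypothetical and are not taken from ExAC.

```python
from typing import Dict, List


def filter_rare(candidates: List[str],
                ref_af: Dict[str, float],
                max_af: float = 0.001) -> List[str]:
    """Keep candidates that are absent from the reference panel
    or whose reference allele frequency is at most max_af."""
    return [v for v in candidates if ref_af.get(v, 0.0) <= max_af]


# Hypothetical reference allele frequencies (not real ExAC values).
reference_af = {"1:1000:A>G": 0.12, "2:2000:C>T": 0.0004}
# Hypothetical variants observed in one patient.
patient_variants = ["1:1000:A>G", "2:2000:C>T", "3:3000:G>A"]

rare = filter_rare(patient_variants, reference_af)
print(rare)
```

A variant missing from the panel is treated as having frequency 0 here; a real pipeline would also need to check that the site was adequately covered in the reference data.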
Cho, Anna; Seong, Moon-Woo; Lim, Byung Chan; Lee, Hwa Jeen; Byeon, Jung Hye; Kim, Seung Soo; Kim, Soo Yeon; Choi, Sun Ah; Wong, Ai-Lynn; Lee, Jeongho; Kim, Jon Soo; Ryu, Hye Won; Lee, Jin Sook; Kim, Hunmin; Hwang, Hee; Choi, Ji Eun; Kim, Ki Joong; Hwang, Young Seung; Hong, Ki Ho; Park, Seungman; Cho, Sung Im; Lee, Seung Jun; Park, Hyunwoong; Seo, Soo Hyun; Park, Sung Sup; Chae, Jong Hee
2017-05-01
Duchenne and Becker muscular dystrophies (DMD and BMD) are allelic X-linked recessive muscle diseases caused by mutations in the large and complex dystrophin gene. We analyzed the dystrophin gene in 507 Korean DMD/BMD patients by multiplex ligation-dependent probe amplification and direct sequencing. Overall, 117 different deletions, 48 duplications, and 90 pathogenic sequence variations, including 30 novel variations, were identified. Deletions and duplications accounted for 65.4% and 13.3% of Korean dystrophinopathy, respectively, suggesting that the incidence of large rearrangements in dystrophin is similar among different ethnic groups. We also detected sequence variations in >100 probands. The small variations were dispersed across the whole gene, and 12.3% were nonsense mutations. Precise genetic characterization in patients with DMD/BMD is timely and important for implementing nationwide registration systems and future molecular therapeutic trials in Korea and globally. Muscle Nerve 55: 727-734, 2017. © 2016 Wiley Periodicals, Inc.
Sarkar, F H; Kupsky, W J; Li, Y W; Sreepathi, P
1994-03-01
Mutations in the p53 gene have been recognized in brain tumors, and clonal expansion of p53 mutant cells has been shown to be associated with glioma progression. However, studies on the p53 gene have been limited by the need for frozen tissues. We have developed a method utilizing polymerase chain reaction (PCR) for the direct analysis of p53 mutation by single-strand conformation polymorphism (SSCP) and by direct DNA sequencing of the p53 gene using a single 10-µm paraffin-embedded tissue section. We applied this method to screen for p53 gene mutations in exons 5-8 in human gliomas utilizing paraffin-embedded tissues. Twenty paraffin blocks containing tumor were selected from surgical specimens from 17 different adult patients. Tumors included six anaplastic astrocytomas (AAs), nine glioblastomas (GBs), and two mixed malignant gliomas (MMGs). A stained tissue section on a glass slide was used to guide microdissection of an unstained adjacent section, to ensure a tumor cell population of > 90% for p53 mutational analysis. Normal tissue from adjacent areas was microdissected at the same time as a control. Mutations in the p53 gene were identified in 3 of 17 (18%) patients by PCR-SSCP analysis and subsequently confirmed by PCR-based DNA sequencing. Mutations in exon 5 resulting in amino acid substitution were found in one thalamic AA (codon 158, CGC > CTT: Arg > Leu) and one cerebral hemispheric GB (codon 151, CCG > CTG: Pro > Leu). (ABSTRACT TRUNCATED AT 250 WORDS)
Factors affecting expression of the recF gene of Escherichia coli K-12.
Sandler, S J; Clark, A J
1990-01-31
This report describes four factors which affect expression of the recF gene from strong upstream lambda promoters under temperature-sensitive cIAt2-encoded repressor control. The first factor was the long mRNA leader sequence consisting of the Escherichia coli dnaN gene and 95% of the dnaA gene and lambda bet, N (double amber) and 40% of the exo gene. When most of this DNA was deleted, RecF became detectable in maxicells. The second factor was the vector, pBEU28, a runaway replication plasmid. When we substituted pUC118 for pBEU28, RecF became detectable in whole cells by the Coomassie blue staining technique. The third factor was the efficiency of initiation of translation. We used site-directed mutagenesis to change the mRNA leader, ribosome-binding site and the 3 bp before and after the translational start codon. Monitoring the effect of these mutational changes by translational fusion to lacZ, we discovered that the efficiency of initiation of translation was increased 30-fold. Only an estimated two- or threefold increase in accumulated levels of RecF occurred, however. This led us to discover the fourth factor, namely sequences in the recF gene itself. These sequences reduce expression of the recF-lacZ fusion genes 100-fold. The sequences responsible for this decrease in expression occur in four regions in the N-terminal half of recF. Expression is reduced by some sequences at the transcriptional level and by others at the translational level.
Ducote, Matthew J.; Prakash, Shubha; Pettis, Gregg S.
2000-01-01
Efficient interbacterial transfer of streptomycete plasmid pIJ101 requires the pIJ101 tra gene, as well as a cis-acting plasmid function known as clt. Here we show that the minimal pIJ101 clt locus consists of a sequence no greater than 54 bp in size that includes essential inverted-repeat and direct-repeat sequences and is located in close proximity to the 3′ end of the korB regulatory gene. Evidence that sequences extending beyond the minimal locus and into the korB open reading frame influence clt transfer function and demonstration that clt-korB sequences are intrinsically curved raise the possibility that higher-order structuring of DNA and protein within this plasmid region may be an inherent feature of efficient pIJ101 transfer. PMID:11073933
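As a hedged illustration of what an inverted repeat is (a sequence followed, possibly after a short spacer, by its own reverse complement on the same strand), the sketch below scans a toy sequence for such pairs. The sequence, arm length and spacer limit are hypothetical and unrelated to the actual pIJ101 clt locus.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def find_inverted_repeats(seq: str, arm: int = 4, max_gap: int = 6):
    """Return (left_start, right_start) pairs where the right arm is the
    reverse complement of the left arm, separated by 0..max_gap bases."""
    hits = []
    n = len(seq)
    for i in range(n - 2 * arm + 1):
        target = reverse_complement(seq[i:i + arm])
        for gap in range(max_gap + 1):
            j = i + arm + gap
            if j + arm > n:
                break
            if seq[j:j + arm] == target:
                hits.append((i, j))
    return hits


# Toy sequence: GACC ... GGTC form the two arms of an inverted repeat.
toy = "TTGACCAAAGGTCTT"
pairs = find_inverted_repeats(toy)
print(pairs)
```

Exact matching keeps the sketch short; imperfect repeats would need a mismatch tolerance.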
Hussey, Richard S; Huang, Guozhong; Allen, Rex
2011-01-01
Identifying parasitism genes encoding proteins secreted from a plant-parasitic nematode's esophageal gland cells and injected through its stylet into plant tissue is the key to understanding the molecular basis of nematode parasitism of plants. Parasitism genes have been cloned by directly microaspirating the cytoplasm from the esophageal gland cells of different parasitic stages of cyst or root-knot nematodes to provide mRNA to create a gland cell-specific cDNA library by long-distance reverse-transcriptase polymerase chain reaction. cDNA clones are sequenced and deduced protein sequences with a signal peptide for secretion are identified for high-throughput in situ hybridization to confirm gland-specific expression.
Booher, Nicholas J.; Carpenter, Sara C. D.; Sebra, Robert P.; Wang, Li; Salzberg, Steven L.; Leach, Jan E.
2015-01-01
Pathogen-injected, direct transcriptional activators of host genes, TAL (transcription activator-like) effectors play determinative roles in plant diseases caused by Xanthomonas spp. A large domain of nearly identical, 33–35 aa repeats in each protein mediates DNA recognition. This modularity makes TAL effectors customizable and thus important also in biotechnology. However, the repeats render TAL effector (tal) genes nearly impossible to assemble using next-generation, short reads. Here, we demonstrate that long-read, single molecule real-time (SMRT) sequencing solves this problem. Taking an ensemble approach to first generate local, tal gene contigs, we correctly assembled de novo the genomes of two strains of the rice pathogen X. oryzae completed previously using the Sanger method and even identified errors in those references. Sequencing two more strains revealed a dynamic genome structure and a striking plasticity in tal gene content. Our results pave the way for population-level studies to inform resistance breeding, improve biotechnology and probe TAL effector evolution. PMID:27148456
Sánchez-García, Ana Belén; Ibáñez, Sergio; Cano, Antonio; Acosta, Manuel; Pérez-Pérez, José Manuel
2018-01-01
Understanding the functional basis of auxin homeostasis requires knowledge about auxin biosynthesis, auxin transport and auxin catabolism genes, which is not always directly available despite the recent whole-genome sequencing of many plant species. Through sequence homology searches and phylogenetic analyses on a selection of 11 plant species with high-quality genome annotation, we identified the putative gene homologs involved in auxin biosynthesis, auxin catabolism and auxin transport pathways in carnation (Dianthus caryophyllus L.). To deepen our knowledge of the regulatory events underlying auxin-mediated adventitious root formation in carnation stem cuttings, we used RNA-sequencing data to confirm the expression profiles of some auxin homeostasis genes during the rooting of two carnation cultivars with different rooting behaviors. We also confirmed the presence of several auxin-related metabolites in the stem cutting tissues. Our findings offer a comprehensive overview of auxin homeostasis genes in carnation and provide a solid foundation for further experiments investigating the role of auxin homeostasis in the regulation of adventitious root formation in carnation.
PMID:29709027
Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin
2013-05-01
The western flower thrips (WFT), Frankliniella occidentalis, a worldwide invasive insect, causes agricultural damage directly by feeding and indirectly by vectoring tospoviruses such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed the global gene expression response of WFT to TSWV infection using the Illumina sequencing platform. We compiled 59,932 unigenes and identified 36,339 of them by similarity analysis against public databases; most of these were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Among the annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of genes differentially expressed between TSWV-infected and non-infected WFT populations revealed that TSWV can regulate cellular processes and the immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data set not only enriches the genomic resources for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.
Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun
1997-01-01
ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063
Rapid and accurate synthesis of TALE genes from synthetic oligonucleotides.
Wang, Fenghua; Zhang, Hefei; Gao, Jingxia; Chen, Fengjiao; Chen, Sijie; Zhang, Cuizhen; Peng, Gang
2016-01-01
Custom synthesis of transcription activator-like effector (TALE) genes has relied upon plasmid libraries of pre-fabricated TALE-repeat monomers or oligomers. Here we describe a novel synthesis method that directly incorporates annealed synthetic oligonucleotides into the TALE-repeat units. Our approach utilizes iterative sets of oligonucleotides and a translational frame check strategy to ensure the high efficiency and accuracy of TALE-gene synthesis. TALE arrays of more than 20 repeats can be constructed, and the majority of the synthesized constructs have perfect sequences. In addition, this novel oligonucleotide-based method can readily accommodate design changes to the TALE repeats. We demonstrated an increased gene targeting efficiency against a genomic site containing a potentially methylated cytosine by incorporating non-conventional repeat variable di-residue (RVD) sequences.
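The abstract does not describe how its translational frame check works, so the following is only a generic computational analogue, not the authors' method: a construct passes if its length is a multiple of three and no in-frame stop codon occurs. The `in_frame` helper and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame(seq: str) -> bool:
    """Return True if seq is a whole number of codons and contains
    no stop codon in reading frame 0."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return not any(c in STOP_CODONS for c in codons)


# Hypothetical constructs, not real TALE-repeat sequences.
print(in_frame("ATGGCTCTGACC"))  # intact frame
print(in_frame("ATGGCTCTGAC"))   # one base lost: frame broken
print(in_frame("ATGTAAGCT"))     # premature in-frame stop
```

A single-base insertion or deletion introduced during oligonucleotide assembly shifts the downstream frame, which is why a frame-based check can catch most synthesis errors.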
Ahmed, Ikhlak; Sarazin, Alexis; Bowler, Chris; Colot, Vincent; Quesneville, Hadi
2011-09-01
Transposable elements (TEs) and their relics play major roles in genome evolution. However, mobilization of TEs is usually deleterious and strongly repressed. In plants and mammals, this repression is typically associated with DNA methylation, but the relationship between this epigenetic mark and TE sequences has not been investigated systematically. Here, we present an improved annotation of TE sequences and use it to analyze genome-wide DNA methylation maps obtained at single-nucleotide resolution in Arabidopsis. We show that although the majority of TE sequences are methylated, ∼26% are not. Moreover, a significant fraction of TE sequences densely methylated at CG, CHG and CHH sites (where H = A, T or C) have no or few matching small interfering RNA (siRNAs) and are therefore unlikely to be targeted by the RNA-directed DNA methylation (RdDM) machinery. We provide evidence that these TE sequences acquire DNA methylation through spreading from adjacent siRNA-targeted regions. Further, we show that although both methylated and unmethylated TE sequences located in euchromatin tend to be more abundant closer to genes, this trend is least pronounced for methylated, siRNA-targeted TE sequences located 5' to genes. Based on these and other findings, we propose that spreading of DNA methylation through promoter regions explains at least in part the negative impact of siRNA-targeted TE sequences on neighboring gene expression.
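The three methylation contexts named in the abstract (CG, CHG and CHH, where H = A, T or C) can be told apart from the two bases following a cytosine. The sketch below is a minimal forward-strand classifier; the function name and the toy sequence are assumptions made for illustration.

```python
from typing import Optional


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Classify the cytosine at seq[pos] as 'CG', 'CHG' or 'CHH'
    (H = A, T or C), looking only at the forward strand.
    Returns None when the context cannot be determined."""
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    nxt = seq[pos + 1:pos + 3]
    if len(nxt) >= 1 and nxt[0] == "G":
        return "CG"
    if len(nxt) == 2 and nxt[0] in "ATC":
        if nxt[1] == "G":
            return "CHG"
        if nxt[1] in "ATC":
            return "CHH"
    return None  # too close to the end, or a non-ACGT base follows


toy = "ACGTCAGCTTA"
contexts = {i: cytosine_context(toy, i) for i, b in enumerate(toy) if b == "C"}
print(contexts)
```

Cytosines on the reverse strand can be handled by running the same function on the reverse complement.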
[Application of MALDI-TOF-MS in gene testing for non-syndromic hearing loss].
Zeng, Yun; Jiang, Dan; Feng, Da-fei; Jin, Dong-dong; Wu, Xiao-hui; Ding, Yan-li; Zou, Jing
2013-12-01
This study investigated the feasibility of matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF-MS) for genetic testing of non-syndromic hearing loss (NSHL), with the results checked by direct sequencing. Peripheral blood was collected from 454 NSHL patients. DNA samples were extracted and 20 loci of the four common disease-causing genes were analysed by MALDI-TOF-MS, including GJB2 (35delG, 167delT, 176_191del16, 235delC, 299_300delAT), GJB3 (538C→T, 547G→A), SLC26A4 (281C→T, 589G→A, IVS7-2A→G, 1174A→T, 1226G→A, 1229C→T, IVS15+5G→A, 1975G→C, 2027T→A, 2162C→T, 2168A→G), and mitochondrial 12S rRNA (1494C→T, 1555A→G). Direct sequencing was also used to analyse the same 20 loci in order to validate the accuracy of MALDI-TOF-MS. Among the 454 patients, 166 cases (36.56%) of disease-causing mutations were detected, which included 69 cases (21.15%) of GJB2 gene mutation, four cases (0.88%) of GJB3 gene mutation, 64 cases (14.10%) of SLC26A4 gene mutation, and three cases (0.66%) of mitochondrial 12S rRNA gene mutation. The results obtained from direct sequencing and MALDI-TOF-MS were consistent. The MALDI-TOF-MS detection method was designed based on the hearing loss-related mutation hotspots seen in the Chinese population, and it has a high detection rate for NSHL-related mutations. In comparison with conventional detection methods, MALDI-TOF-MS has the following advantages: more detection sites, greater coverage, accuracy, high throughput and low cost. Therefore, this method is capable of satisfying the needs of clinical detection for hearing impairment and is suitable for large-scale implementation.
Garcia, J A; Harrich, D; Soultanakis, E; Wu, F; Mitsuyasu, R; Gaynor, R B
1989-01-01
The human immunodeficiency virus (HIV) type 1 LTR is regulated at the transcriptional level by both cellular and viral proteins. Using HeLa cell extracts, multiple regions of the HIV LTR were found to serve as binding sites for cellular proteins. An untranslated region binding protein UBP-1 has been purified and fractions containing this protein bind to both the TAR and TATA regions. To investigate the role of cellular proteins binding to both the TATA and TAR regions and their potential interaction with other HIV DNA binding proteins, oligonucleotide-directed mutagenesis of both these regions was performed followed by DNase I footprinting and transient expression assays. In the TATA region, two direct repeats TC/AAGC/AT/AGCTGC surround the TATA sequence. Mutagenesis of both of these direct repeats or of the TATA sequence interrupted binding over the TATA region on the coding strand, but only a mutation of the TATA sequence affected in vivo assays for tat-activation. In addition to TAR serving as the site of binding of cellular proteins, RNA transcribed from TAR is capable of forming a stable stem-loop structure. To determine the relative importance of DNA binding proteins as compared to secondary structure, oligonucleotide-directed mutations in the TAR region were studied. Local mutations that disrupted either the stem or loop structure were defective in gene expression. However, compensatory mutations which restored base pairing in the stem resulted in complete tat-activation. This indicated a significant role for the stem-loop structure in HIV gene expression. To determine the role of TAR binding proteins, mutations were constructed which extensively changed the primary structure of the TAR region, yet left stem base pairing, stem energy and the loop sequence intact. These mutations resulted in decreased protein binding to TAR DNA and defects in tat-activation, and revealed factor binding specifically to the loop DNA sequence. 
Further mutagenesis which inverted this stem and loop mutation relative to the HIV LTR mRNA start site resulted in even larger decreases in tat-activation. This suggests that multiple determinants, including protein binding, the loop sequence, and RNA or DNA secondary structure, are important in tat-activation and suggests that tat may interact with cellular proteins binding to DNA to increase HIV gene expression. Images PMID:2721501
Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G
1993-08-05
The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.
Characterization of interleukin-8 receptors in non-human primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alvarez, V.; Coto, E.; Gonzalez-Roces, S.
Interleukin-8 is a chemokine with potent neutrophil chemoattractant activity. In humans, two different cDNAs encoding human IL8 receptors, designated IL8RA and IL8RB, have been cloned. IL8RA binds IL8, while IL8RB binds IL8 as well as other α-chemokines. The two human IL8Rs are encoded by two genes physically linked on chromosome 2. The IL8RA and IL8RB genes have open reading frames (ORFs) lacking introns. By direct sequencing of the polymerase chain reaction products, we sequenced the IL8R genes of cell lines from four non-human primates: chimpanzee, gorilla, orangutan, and macaca. The IL8RB encodes an ORF in the four non-human primates, showing 95%-99% similarity to the human IL8RB sequence. The IL8RA homologue in gorilla and chimpanzee consisted of two ORFs 98%-99% identical to the human sequence. The macaca and orangutan IL8RA homologues are pseudogenes: a 2 base pair insertion generated a sequence with several stop codons. In addition, we describe the physical linkage of these genes in the four non-human primates and discuss the evolutionary implications of these findings. 25 refs., 5 figs., 3 tabs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobell, J.L.; Lind, T.J.; Sommer, S.S.
To determine whether mutations in the D5 dopamine receptor (D5DR) gene are associated with schizophrenia, the gene was examined in 78 unrelated schizophrenic individuals. After amplification by the polymerase chain reaction, products were examined by dideoxy fingerprinting (ddF), a highly sensitive screening method related to single strand conformational polymorphism analysis. All samples with unusual ddF patterns were sequenced to precisely identify the sequence change. In the 156 D5DR alleles examined, nine sequence changes were identified. Four of the nine did not affect protein structure; of these, three were silent changes and one was a transition in the 3' untranslated region. The remaining five sequence changes result in protein alterations: of these, one is a missense change in a non-conserved amino acid, three are missense changes in amino acids that are conserved in some dopamine D5 receptors, and the last is a nonsense mutation. To investigate whether the nonsense mutation was associated with schizophrenia, 400 additional schizophrenic cases of western European descent and 1914 ethnically similar controls were screened for the change. One additional schizophrenic carrier was identified and verified by direct genomic sequencing (allele frequency: .0013), but eight carriers also were found and confirmed among the non-schizophrenics (allele frequency: .0021) (p > .25). The gene was re-examined in all newly identified carriers of the nonsense mutation by direct sequencing and/or ddF in search of additional mutations. None were identified. Family studies also were conducted to investigate possible cosegregation of the mutation with other neuropsychiatric diseases, but this was not demonstrated. Thus, the mutation does not appear to be associated with an increased risk of schizophrenia, nor does an initial analysis suggest cosegregation with other neuropsychiatric disorders or symptom complexes.
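The allele frequencies quoted in this abstract can be reproduced with a short calculation, assuming (this is not stated in the abstract) that every carrier is heterozygous and that each individual contributes two alleles of this autosomal gene. The helper name is hypothetical.

```python
def allele_frequency(carriers: int, individuals: int, ploidy: int = 2) -> float:
    """Allele frequency when each carrier holds exactly one mutant copy."""
    return carriers / (ploidy * individuals)


# Counts taken from the abstract: 1 carrier among 400 additional cases,
# 8 carriers among 1914 controls.
cases = allele_frequency(1, 400)      # 1 / 800
controls = allele_frequency(8, 1914)  # 8 / 3828

print(f"cases:    {cases:.5f}")
print(f"controls: {controls:.5f}")
```

Under these assumptions the values come to about 0.00125 and 0.00209, in line with the reported .0013 and .0021, and the mutant allele is no more frequent in cases than in controls.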
Tang, Xiaoyu; Li, Jie; Millán-Aguiñaga, Natalie; Zhang, Jia Jia; O'Neill, Ellis C; Ugalde, Juan A; Jensen, Paul R; Mantovani, Simone M; Moore, Bradley S
2015-12-18
Recent genome sequencing efforts have led to the rapid accumulation of uncharacterized or "orphaned" secondary metabolic biosynthesis gene clusters (BGCs) in public databases. This increase in DNA-sequenced big data has given rise to significant challenges in the applied field of natural product genome mining, including (i) how to prioritize the characterization of orphan BGCs and (ii) how to rapidly connect genes to biosynthesized small molecules. Here, we show that by correlating putative antibiotic resistance genes that encode target-modified proteins with orphan BGCs, we predict the biological function of pathway specific small molecules before they have been revealed in a process we call target-directed genome mining. By querying the pan-genome of 86 Salinispora bacterial genomes for duplicated house-keeping genes colocalized with natural product BGCs, we prioritized an orphan polyketide synthase-nonribosomal peptide synthetase hybrid BGC (tlm) with a putative fatty acid synthase resistance gene. We employed a new synthetic double-stranded DNA-mediated cloning strategy based on transformation-associated recombination to efficiently capture tlm and the related ttm BGCs directly from genomic DNA and to heterologously express them in Streptomyces hosts. We show the production of a group of unusual thiotetronic acid natural products, including the well-known fatty acid synthase inhibitor thiolactomycin that was first described over 30 years ago, yet never at the genetic level in regards to biosynthesis and autoresistance. This finding not only validates the target-directed genome mining strategy for the discovery of antibiotic producing gene clusters without a priori knowledge of the molecule synthesized but also paves the way for the investigation of novel enzymology involved in thiotetronic acid natural product biosynthesis.
Gibreel, Amera; Sköld, Ola
1999-01-01
The characterization of the genetic basis of sulfonamide resistance in Campylobacter jejuni was attempted. The resistance determinant from a sulfonamide-resistant strain of C. jejuni was cloned and was found to show 42% identity with the folP gene (which codes for dihydropteroate synthase, the target of sulfonamides) of the related bacterium Helicobacter pylori. The sequences of the areas surrounding the folP gene in C. jejuni showed similarity to those of the areas surrounding the corresponding gene in H. pylori. The folP gene of C. jejuni, which mediates the resistance, was observed to show particular features when it was compared to other known folP genes. One of these features is the presence of two pairs of direct repeats (15 and 27 bp) within the coding sequence of the gene. Comparison of the C. jejuni folP genes that mediate susceptibility and resistance revealed the occurrence of mutations that changed four amino acid residues. Resistance of C. jejuni to sulfonamides could be associated with one or several of these four mutational substitutions, which all occurred in the five different resistant isolates studied. The codon for one of these changed amino acids was found to be located in the second direct repeat within the coding sequence of the gene. The change made the repeat perfect. The transformation of both the resistance and the susceptibility variants of the gene into an Escherichia coli folP knockout mutant was found to complement the dihydropteroate synthase deficiency, confirming that the characterized sulfonamide resistance determinant codes for the C. jejuni dihydropteroate synthase enzyme. Kinetic measurements established different affinities of sulfonamide for the dihydropteroate synthase enzyme isolated from the resistant and susceptible strains. In conclusion, sulfonamide resistance in C. jejuni was shown to be associated with mutational changes in the chromosomally located gene for dihydropteroate synthase, the target of sulfonamides. 
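Direct repeats such as the 15 bp and 27 bp pairs described above are stretches that recur in the same orientation within one sequence. A minimal way to find exact direct repeats of a given length is to index every k-mer by position, as sketched below; the toy sequence and the `find_direct_repeats` helper are hypothetical and do not represent the C. jejuni folP gene.

```python
from collections import defaultdict
from typing import Dict, List


def find_direct_repeats(seq: str, k: int) -> Dict[str, List[int]]:
    """Return every k-mer that occurs more than once in seq,
    mapped to the list of its start positions (0-based)."""
    positions: Dict[str, List[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy sequence containing one exact 8 bp direct repeat.
toy = "ATGCCGTAAGCTTGCCGTAAC"
repeats = find_direct_repeats(toy, 8)
print(repeats)
```

This only reports perfect repeats; an imperfect repeat that a single substitution would make perfect, as in the resistance allele described in the abstract, would require allowing mismatches.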
PMID:10471557
mazF, a novel counter-selectable marker for unmarked chromosomal manipulation in Bacillus subtilis.
Zhang, Xiao-Zhou; Yan, Xin; Cui, Zhong-Li; Hong, Qing; Li, Shun-Peng
2006-05-19
Here, we present a novel method for the directed genetic manipulation of the Bacillus subtilis chromosome free of any selection marker. Our new approach employed the Escherichia coli toxin gene mazF as a counter-selectable marker. The mazF gene was placed under the control of an isopropyl-beta-D-thiogalactopyranoside (IPTG)-inducible expression system and associated with a spectinomycin-resistance gene to form the MazF cassette, which was flanked by two directly-repeated (DR) sequences. A double-crossover event between the linearized delivery vector and the chromosome integrated the MazF cassette into a target locus and yielded an IPTG-sensitive strain with spectinomycin resistance, in which the wild-type chromosome copy had been replaced by the modified copy at the targeted locus. Another single-crossover event between the two DR sequences led to the excision of the MazF cassette and generated a strain with IPTG resistance, thereby realizing the desired alteration to the chromosome without introducing any unwanted selection markers. We used this method repeatedly and successfully to inactivate a specific gene, to introduce a gene of interest and to realize the in-frame deletion of a target gene in the same strain. As there is no prerequisite strain for this method, it will be a powerful and universal tool.
Beiter, Thomas; Zimmermann, Martina; Fragasso, Annunziata; Armeanu, Sorin; Lauer, Ulrich M; Bitzer, Michael; Su, Hua; Young, William L; Niess, Andreas M; Simon, Perikles
2008-01-01
So far, the abuse of gene transfer technology in sport, so-called gene doping, is undetectable. However, recent studies in somatic gene therapy indicate that long-term presence of transgenic DNA (tDNA) following various gene transfer protocols can be found in DNA isolated from whole blood using conventional PCR protocols. Application of these protocols for the direct detection of gene doping would require almost complete knowledge about the sequence of the genetic information that has been transferred. Here, we develop and describe the novel single-copy primer-internal intron-spanning PCR (spiPCR) procedure that overcomes this difficulty. Apart from the interesting perspectives that this spiPCR procedure offers in the fight against gene doping, this technology could also be of interest in biodistribution and biosafety studies for gene therapeutic applications.
Two Different Rickettsial Bacteria Invading Volvox carteri
Kawafune, Kaoru; Hongoh, Yuichi; Hamaji, Takashi; Sakamoto, Tomoaki; Kurata, Tetsuya; Hirooka, Shunsuke; Miyagishima, Shin-ya; Nozaki, Hisayoshi
2015-01-01
Background Bacteria of the family Rickettsiaceae are principally associated with arthropods. Recently, endosymbionts of the Rickettsiaceae have been found in non-phagotrophic cells of the volvocalean green algae Carteria cerasiformis, Pleodorina japonica, and Volvox carteri. Such endosymbionts were present in only C. cerasiformis strain NIES-425 and V. carteri strain UTEX 2180, of various strains of Carteria and V. carteri examined, suggesting that rickettsial endosymbionts may have been transmitted to only a few algal strains very recently. However, in preliminary work, we detected a sequence similar to that of a rickettsial gene in the nuclear genome of V. carteri strain EVE. Methodology/Principal Findings Here we explored the origin of the rickettsial gene-like sequences in the endosymbiont-lacking V. carteri strain EVE, by performing comparative analyses on 13 strains of V. carteri. By reference to our ongoing genomic sequence of rickettsial endosymbionts in C. cerasiformis strain NIES-425 cells, we confirmed that an approximately 9-kbp DNA sequence encompassing a region similar to that of four rickettsial genes was present in the nuclear genome of V. carteri strain EVE. Phylogenetic analyses, and comparisons of the synteny of rickettsial gene-like sequences from various strains of V. carteri, indicated that the rickettsial gene-like sequences in the nuclear genome of V. carteri strain EVE were closely related to rickettsial gene sequences of P. japonica, rather than those of V. carteri strain UTEX 2180. Conclusion/Significance At least two different rickettsial organisms may have invaded the V. carteri lineage, one of which may be the direct ancestor of the endosymbiont of V. carteri strain UTEX 2180, whereas the other may be closely related to the endosymbiont of P. japonica. Endosymbiotic gene transfer from the latter rickettsial organism may have occurred in an ancestor of V. carteri. Thus, the rickettsiae may be widely associated with V. carteri, and likely have often been lost during host evolution. PMID:25671568
Gilley, D; Preer, J R; Aufderheide, K J; Polisky, B
1988-01-01
Paramecium tetraurelia can be transformed by microinjection of cloned serotype A gene sequences into the macronucleus. Transformants are detected by their ability to express serotype A surface antigen from the injected templates. After injection, the DNA is converted from a supercoiled form to a linear form by cleavage at nonrandom sites. The linear form appears to replicate autonomously as a unit-length molecule and is present in transformants at high copy number. The injected DNA is further processed by the addition of Paramecium-type telomeric sequences to the termini of the linear DNA. To examine the fate of injected linear DNA molecules, plasmid pSA14SB DNA containing the A gene was cleaved into two linear pieces, a 14-kilobase (kb) piece containing the A gene and flanking sequences and a 2.2-kb piece consisting of the procaryotic vector. In transformants expressing the A gene, we observed that two linear DNA species were present which correspond to the two species injected. Both species had Paramecium telomerelike sequences added to their termini. For the 2.2-kb DNA, we show that the site of addition of the telomerelike sequences is directly at one terminus and within one nucleotide of the other terminus. These results indicate that injected procaryotic DNA is capable of autonomous replication in Paramecium macronuclei and that telomeric addition in the macronucleus does not require specific recognition sequences. PMID:3211128
Ries, David; Holtgräwe, Daniela; Viehöver, Prisca; Weisshaar, Bernd
2016-03-15
The combination of bulk segregant analysis (BSA) and next generation sequencing (NGS), also known as mapping by sequencing (MBS), has been shown to significantly accelerate the identification of causal mutations for species with a reference genome sequence. The usual approach is to cross homozygous parents that differ for the monogenic trait to address, to perform deep sequencing of DNA from F2 plants pooled according to their phenotype, and subsequently to analyze the allele frequency distribution based on a marker table for the parents studied. The method has been successfully applied for EMS induced mutations as well as natural variation. Here, we show that pooling genetically diverse breeding lines according to a contrasting phenotype also allows high resolution mapping of the causal gene in a crop species. The test case was the monogenic locus causing red vs. green hypocotyl color in Beta vulgaris (R locus). We determined the allele frequencies of polymorphic sequences using sequence data from two diverging phenotypic pools of 180 B. vulgaris accessions each. A single interval of about 31 kbp among the nine chromosomes was identified which indeed contained the causative mutation. By applying a variation of the mapping by sequencing approach, we demonstrated that phenotype-based pooling of diverse accessions from breeding panels and subsequent direct determination of the allele frequency distribution can be successfully applied for gene identification in a crop species. Our approach made it possible to identify a small interval around the causative gene. Sequencing of parents or individual lines was not necessary. Whenever the appropriate plant material is available, the approach described saves time compared to the generation of an F2 population. In addition, we provide clues for planning similar experiments with regard to pool size and the sequencing depth required.
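The pool-based allele-frequency comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the input layout (per-site reference and alternate read counts for each phenotypic pool), the threshold, and the example numbers are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical input layout: one tuple per polymorphic site,
# (position, ref_reads_pool_a, alt_reads_pool_a, ref_reads_pool_b, alt_reads_pool_b)
Site = Tuple[int, int, int, int, int]


def alt_frequency(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads supporting the alternate allele (0.0 if no coverage)."""
    total = ref_reads + alt_reads
    return alt_reads / total if total else 0.0


def delta_allele_frequency(sites: List[Site]) -> List[Tuple[int, float]]:
    """Absolute allele-frequency difference between the two pools at each site."""
    return [
        (pos, abs(alt_frequency(ra, aa) - alt_frequency(rb, ab)))
        for pos, ra, aa, rb, ab in sites
    ]


def candidate_sites(sites: List[Site], threshold: float = 0.8) -> List[int]:
    """Positions where the pools differ by at least `threshold` (arbitrary cut-off)."""
    return [pos for pos, delta in delta_allele_frequency(sites) if delta >= threshold]


# Made-up read counts, for illustration only.
example = [
    (1000, 50, 50, 48, 52),  # both pools near 0.5: no association with phenotype
    (2000, 5, 95, 90, 10),   # pools dominated by opposite alleles: candidate
    (3000, 30, 70, 60, 40),  # intermediate difference
]
print(candidate_sites(example))
```

In practice the per-site differences would be smoothed along each chromosome before an interval is called; the sketch only shows the per-site comparison.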
Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R
1991-01-01
Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M. incognita, M. hapla and M. arenaria, respectively. Nucleotide sequences of the M. incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M. javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M. javanica and the different host races of M. incognita, M. hapla and M. arenaria are sufficient to distinguish the different host races of each species. PMID:2027769
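As a quick consistency check on the figures quoted in this abstract, the repeat sets and unique segments can be summed to see whether they account for the stated 7 kb. The sketch below uses only the numbers given in the abstract; the variable and function names are illustrative, the copy number of the 102 ntp repeat is approximate in the source, and the bounding tRNA genes are left out.

```python
# Lengths in nucleotide pairs (ntp), taken from the abstract.
# Each repeat set is (copy number, unit length); 36 copies is approximate.
repeat_sets = {
    "102 ntp repeat": (36, 102),
    "63 ntp repeat": (11, 63),
    "8 ntp repeat": (5, 8),
}
unique_segments = {
    "unique segment with hairpins": 1528,
    "second unique segment": 1068,
}


def segment_length(repeats: dict, uniques: dict) -> int:
    """Sum of all repeat arrays plus the unique segments between them."""
    repeat_total = sum(copies * unit for copies, unit in repeats.values())
    return repeat_total + sum(uniques.values())


total = segment_length(repeat_sets, unique_segments)
print(total)  # 7001, consistent with the "7 kb" segment described
```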
Sirakova, T D; Markaryan, A; Kolattukudy, P E
1994-01-01
An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. PMID:7927676
Stam, Remco; Scheikl, Daniela; Tellier, Aurélien
2016-01-01
Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeats containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling ten individuals, to specifically sequence NLR genes in a resource and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are accurately verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. PMID:27189991
EuroPineDB: a high-coverage web database for maritime pine transcriptome
2011-01-01
Background Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases. Description EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be retrieved by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided.
Conclusions The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome. PMID:21762488
Shin, Jeong Hong; Jung, Soobin; Ramakrishna, Suresh; Kim, Hyongbum Henry; Lee, Junwon
2018-07-07
Genome editing technology using programmable nucleases has rapidly evolved in recent years. The primary mechanism to achieve precise integration of a transgene is mainly based on homology-directed repair (HDR). However, an HDR-based genome-editing approach is less efficient than non-homologous end-joining (NHEJ). Recently, a microhomology-mediated end-joining (MMEJ)-based transgene integration approach was developed, showing feasibility both in vitro and in vivo. We expanded this method to achieve targeted sequence substitution (TSS) of mutated sequences with normal sequences using double-guide RNAs (gRNAs), and a donor template flanking the microhomologies and target sequence of the gRNAs in vitro and in vivo. Our method could realize more efficient sequence substitution than the HDR-based method in vitro using a reporter cell line, and led to the survival of a hereditary tyrosinemia mouse model in vivo. The proposed MMEJ-based TSS approach could provide a novel therapeutic strategy, in addition to HDR, to achieve gene correction from a mutated sequence to a normal sequence. Copyright © 2018 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren
2011-01-01
Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
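The recommendation to equalize read numbers before comparing libraries can be sketched as follows. This is a minimal illustration under stated assumptions, not the tools the authors point to: the function names and the toy OTU table are hypothetical, Chao1 is written in its commonly used bias-corrected form, and rarefaction is done by simple random subsampling without replacement.

```python
import math
import random
from collections import Counter
from typing import Dict


def chao1(counts: Dict[str, int]) -> float:
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    observed = sum(1 for c in counts.values() if c > 0)
    singletons = sum(1 for c in counts.values() if c == 1)
    doubletons = sum(1 for c in counts.values() if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


def shannon(counts: Dict[str, int]) -> float:
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values() if c > 0)


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Subsample `depth` reads without replacement so libraries share one size."""
    reads = [taxon for taxon, c in counts.items() for _ in range(c)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the library")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical OTU count table for one library.
library = {"otu1": 10, "otu2": 5, "otu3": 2, "otu4": 1, "otu5": 1}
print(chao1(library))  # 5.5: 5 observed taxa, 2 singletons, 1 doubleton
subsample = rarefy(library, depth=10)
print(sum(subsample.values()), chao1(subsample), shannon(subsample))
```

Comparing estimates only between subsamples drawn to the same depth is the point of the exercise; the estimates from the full and rarefied library will generally differ.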
Ravi, Anuradha; Avershina, Ekaterina; Angell, Inga Leena; Ludvigsen, Jane; Manohar, Prasanth; Padmanaban, Sumathi; Nachimuthu, Ramesh; Snipen, Lars; Rudi, Knut
2018-06-01
Use of the 16S rRNA gene in microbiota studies is limited by the lack of taxonomic and functional resolution. High resolution analyses are particularly important for understanding transmission and persistence of bacteria. The aim of our work was therefore to compare a novel reduced metagenome sequencing (RMS) approach with 16S rRNA gene sequencing to determine both the metagenome genetic diversity and the mother-to-child sharing of the microbiota in a cohort of 17 mother-child pairs. We found that although both approaches gave comparable results with respect to sample separation and taxonomy, RMS gave higher resolution and the potential for genomic-/functional assignment. Using RMS we estimated that the metagenome size increased from about 60 Mbp for 4-day-old children to about 225 Mbp for mothers. The 4-day-old children shared 7% of the metagenome sequences with the mothers, while the metagenome sequence sharing was >30% among the mothers. We found 15 genomes shared across >50% of the mothers, of which 10 belonged to Clostridia. Only Bacteroides showed a direct mother-child association, with B. vulgatus being abundant in both 4-day-old children and mothers. For the functional assignments, we identified a significant association between antibiotic usage during labor, and quantity of Fosfomycin resistance genes. In conclusion, our results show a higher functional and taxonomic resolution for RMS compared to 16S rRNA gene sequencing, where RMS enabled a detailed description of mother to child gut microbiota transmission - supporting a late recruitment of most gut bacteria and an effect of antibiotic treatment during labor on infant antibiotic resistance gene patterns. Copyright © 2018. Published by Elsevier B.V.
An intron within the 16S ribosomal RNA gene of the archaeon Pyrobaculum aerophilum
NASA Technical Reports Server (NTRS)
Burggraf, S.; Larsen, N.; Woese, C. R.; Stetter, K. O.
1993-01-01
The 16S rRNA genes of Pyrobaculum aerophilum and Pyrobaculum islandicum were amplified by the polymerase chain reaction, and the resulting products were sequenced directly. The two organisms are closely related by this measure (over 98% similar). However, they differ in that the (lone) 16S rRNA gene of Pyrobaculum aerophilum contains a 713-bp intron not seen in the corresponding gene of Pyrobaculum islandicum. To our knowledge, this is the only intron so far reported in the small subunit rRNA gene of a prokaryote. Upon excision the intron is circularized. A secondary structure model of the intron-containing rRNA suggests a splicing mechanism of the same type as that invoked for the tRNA introns of the Archaea and Eucarya and 23S rRNAs of the Archaea. The intron contains an open reading frame whose protein translation shows no certain homology with any known protein sequence.
Quarello, Paola; Garelli, Emanuela; Brusco, Alfredo; Carando, Adriana; Mancini, Cecilia; Pappi, Patrizia; Vinti, Luciana; Svahn, Johanna; Dianzani, Irma; Ramenghi, Ugo
2012-01-01
Diamond-Blackfan anemia is an autosomal dominant disease due to mutations in nine ribosomal protein encoding genes. Because most mutations are loss of function and detected by direct sequencing of coding exons, we reasoned that part of the approximately 50% mutation negative patients may have carried a copy number variant of ribosomal protein genes. As a proof of concept, we designed a multiplex ligation-dependent probe amplification assay targeted to screen the six genes that are most frequently mutated in Diamond-Blackfan anemia patients: RPS17, RPS19, RPS26, RPL5, RPL11, and RPL35A. Using this assay we showed that deletions represent approximately 20% of all mutations. The combination of sequencing and multiplex ligation-dependent probe amplification analysis of these six genes allows the genetic characterization of approximately 65% of patients, showing that Diamond-Blackfan anemia is indisputably a ribosomopathy. PMID:22689679
Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.
Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L
2005-01-01
The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.
Chen, Jiazhen; Miao, Xinyu; Xu, Meng; He, Junlin; Xie, Yi; Wu, Xingwen; Chen, Gang; Yu, Liying; Zhang, Wenhong
2015-01-01
Background Members of the genera Prevotella, Veillonella and Fusobacterium are the predominant culturable obligate anaerobic bacteria isolated from periodontal abscesses. When determining the cumulative number of clinical anaerobic isolates from periodontal abscesses, ambiguous or overlapping signals were frequently encountered in 16S rRNA gene sequencing chromatograms, resulting in ambiguous identifications. With the exception of the genus Veillonella, the high intra-chromosomal heterogeneity of rrs genes has not been reported. Methods The 16S rRNA genes of 138 clinical, strictly anaerobic isolates and one reference strain were directly sequenced, and the chromatograms were carefully examined. Gene cloning was performed for 22 typical isolates with doublet sequencing signals for the 16S rRNA genes, and four copies of the rrs-ITS genes of 9 Prevotella intermedia isolates were separately amplified by PCR, sequenced and compared. Five conserved housekeeping genes, hsp60, recA, dnaJ, gyrB1 and rpoB from 89 clinical isolates of Prevotella were also amplified by PCR and sequenced for identification and phylogenetic analysis along with 18 Prevotella reference strains. Results Heterogeneity of 16S rRNA genes was apparent in clinical, strictly anaerobic oral bacteria, particularly in the genera Prevotella and Veillonella. One hundred out of 138 anaerobic strains (72%) had intragenomic nucleotide polymorphisms (SNPs) in multiple locations, and 13 strains (9.4%) had intragenomic insertions or deletions in the 16S rRNA gene. In the genera Prevotella and Veillonella, 75% (67/89) and 100% (19/19) of the strains had SNPs in the 16S rRNA gene, respectively. Gene cloning and separate amplifications of four copies of the rrs-ITS genes confirmed that 2 to 4 heterogeneous 16S rRNA copies existed. Conclusion Sequence alignment of five housekeeping genes revealed that intra-species nucleotide similarities were very high in the genera Prevotella, ranging from 94.3–100%. 
However, the inter-species similarities were relatively low, ranging from 68.7–97.9%. The housekeeping genes rpoB and gyrB1 were demonstrated to be alternative classification markers to the species level based on intra- and inter-species comparisons, whereas, based on the phylogenetic tree, rpoB proved to be a reliable phylogenetic marker for the genus Prevotella. PMID:26103050
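The intra- and inter-species similarity ranges reported here are the kind of figure obtained from pairwise comparison of aligned sequences. A minimal sketch of that calculation follows; it assumes the sequences are already aligned to equal length, skips any column containing a gap, and uses made-up sequences and hypothetical function names rather than anything from the study.

```python
from itertools import combinations
from typing import Dict, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identical positions between two aligned sequences.
    Columns where either sequence has a gap ('-') are ignored."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_range(aligned: Dict[str, str]) -> Tuple[float, float]:
    """Lowest and highest pairwise identity; needs at least two sequences."""
    values = [
        percent_identity(aligned[p], aligned[q])
        for p, q in combinations(aligned, 2)
    ]
    return min(values), max(values)


# Toy alignment, for illustration only.
toy_alignment = {
    "strain_1": "ACGTACGT",
    "strain_2": "ACGTACGA",
    "strain_3": "ACGTTCGA",
}
print(identity_range(toy_alignment))
```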
Bender, Kelly S; Rice, Melissa R; Fugate, William H; Coates, John D; Achenbach, Laurie A
2004-09-01
Natural attenuation of the environmental contaminant perchlorate is a cost-effective alternative to current removal methods. The success of natural perchlorate remediation is dependent on the presence and activity of dissimilatory (per)chlorate-reducing bacteria (DPRB) within a target site. To detect DPRB in the environment, two degenerate primer sets targeting the chlorite dismutase (cld) gene were developed and optimized. A nested PCR approach was used in conjunction with these primer sets to increase the sensitivity of the molecular detection method. Screening of environmental samples indicated that all products amplified by this method were cld gene sequences. These sequences were obtained from pristine sites as well as contaminated sites from which DPRB were isolated. More than one cld phylotype was also identified from some samples, indicating the presence of more than one DPRB strain at those sites. The use of these primer sets represents a direct and sensitive molecular method for the qualitative detection of (per)chlorate-reducing bacteria in the environment, thus offering another tool for monitoring natural attenuation. Sequences of cld genes isolated in the course of this project were also generated from various DPRB and provided the first opportunity for a phylogenetic treatment of this metabolic gene. Comparisons of the cld and 16S ribosomal DNA (rDNA) gene trees indicated that the cld gene does not track 16S rDNA phylogeny, further implicating the possible role of horizontal transfer in the evolution of (per)chlorate respiration.
Liu, Junyan; Li, Lin; Peters, Brian M; Li, Bing; Deng, Yang; Xu, Zhenbo; Shirtliff, Mark E
2016-09-01
Lactobacillus acetotolerans is a hard-to-culture beer-spoilage bacterium capable of entering into the viable putative nonculturable (VPNC) state. As part of an initial strategy to investigate the phenotypic behavior of L. acetotolerans, draft genome sequencing was performed. Results demonstrated a total of 1824 predicted annotated genes, with several potential VPNC- and beer-spoilage-associated genes identified. Importantly, this is the first genome sequence of L. acetotolerans as a beer-spoilage bacterium, and it may aid in further analysis of L. acetotolerans and other beer-spoilage bacteria, with direct implications for food safety control in the beer brewing industry. © FEMS 2016. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
The expanding universe of p53 targets.
Menendez, Daniel; Inga, Alberto; Resnick, Michael A
2009-10-01
The p53 tumour suppressor is modified through mutation or changes in expression in most cancers, leading to the altered regulation of hundreds of genes that are directly influenced by this sequence-specific transcription factor. Central to the p53 master regulatory network are the target response element (RE) sequences. The extent of p53 transactivation and transcriptional repression is influenced by many factors, including p53 levels, cofactors and the specific RE sequences, all of which contribute to the role that p53 has in the aetiology of cancer. This Review describes the identification and functionality of REs and highlights the inclusion of non-canonical REs that expand the universe of genes and regulation mechanisms in the p53 tumour suppressor network.
Strøman, Per; Reinert, William; Case, Mary E.; Giles, Norman H.
1979-01-01
In Neurospora crassa, the enzyme quinate (shikimate) dehydrogenase catalyzes the first reaction in the inducible quinic acid catabolic pathway and is encoded in the qa-3 gene of the qa cluster. In this cluster, the order of genes has been established as qa-1 qa-3 qa-4 qa-2. Amino-terminal sequences have been determined for purified quinate dehydrogenase from wild type and from UV-induced revertants in two different qa-3 mutants. These two mutants (M16 and M45) map at opposite ends of the qa-3 locus. In addition, mapping data (Case et al. 1978) indicate that the end of the qa-3 gene specified by M45 is closer to the adjacent qa-1 gene than is the end specified by the M16 mutant site. In one of the revertants (R45 from qa-3 mutant M45), the amino-terminal sequence for the first ten amino acids is identical to that of wild type. The other revertant (R1 from qa-3 mutant M16) differs from wild type at the amino-terminal end by a single altered residue at position three in the sequence. The observed change involves the substitution of an isoleucine in M16-R1 for a proline in wild type. This substitution requires a two-nucleotide change in the corresponding wild-type codon. The combined genetic and biochemical data indicate that the qa-3 mutants M16 and M45 carry amino acid substitutions near the amino-terminal and carboxyl-terminal ends of the quinate dehydrogenase enzyme, respectively. On this basis we conclude that transcription of the qa-3 gene proceeds from the end specified by the M16 mutant site in the direction of the qa-1 gene. It appears probable that transcription is initiated from a promoter site within the qa cluster, possibly immediately adjacent to the qa-3 gene. PMID:159203
Torrent, C; Gabus, C; Darlix, J L
1994-01-01
Retroviral genomes consist of two identical RNA molecules associated at their 5' ends by the dimer linkage structure located in the packaging element (Psi or E) necessary for RNA dimerization in vitro and packaging in vivo. In murine leukemia virus (MLV)-derived vectors designed for gene transfer, the Psi + sequence of 600 nucleotides directs the packaging of recombinant RNAs into MLV virions produced by helper cells. By using in vitro RNA dimerization as a screening system, a sequence of rat VL30 RNA located next to the 5' end of the Harvey mouse sarcoma virus genome and as small as 67 nucleotides was found to form stable dimeric RNA. In addition, a purine-rich sequence located at the 5' end of this VL30 RNA seems to be critical for RNA dimerization. When this VL30 element was extended by 107 nucleotides at its 3' end and inserted into an MLV-derived vector lacking MLV Psi +, it directed the efficient encapsidation of recombinant RNAs into MLV virions. Because this VL30 packaging signal is smaller and more efficient in packaging recombinant RNAs than the MLV Psi + and does not contain gag or glyco-gag coding sequences, its use in MLV-derived vectors should render even more unlikely recombinations which could generate replication-competent viruses. Therefore, utilization of the rat VL30 packaging sequence should improve the biological safety of MLV vectors for human gene transfer. PMID:8289369
The Inference of Gene Trees with Species Trees
Szöllősi, Gergely J.; Tannier, Eric; Daubin, Vincent; Boussau, Bastien
2015-01-01
This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree–species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree–species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. PMID:25070970
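As a concrete, deliberately simplified illustration of relating a gene tree to a known species tree, the sketch below applies the classic lowest-common-ancestor (LCA) mapping to count gene duplications. It is an assumption-laden toy: trees are nested tuples, gene-tree leaves are labeled only by the species they come from, and losses, transfers and incomplete lineage sorting are ignored. It is not one of the probabilistic joint-inference models the review discusses.

```python
def clades(tree):
    """Leaf sets of every node in a nested-tuple tree (the node's own set is last)."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    result = []
    leaves = frozenset()
    for child in tree:
        sub = clades(child)
        result.extend(sub)
        leaves |= sub[-1]  # the last entry is the child's own clade
    result.append(leaves)
    return result

def lca(species_clades, species_set):
    """Smallest species-tree clade that contains every species in species_set."""
    return min((c for c in species_clades if species_set <= c), key=len)

def count_duplications(gene_tree, species_clades):
    """Return (mapped species clade, number of duplication nodes)."""
    if isinstance(gene_tree, str):
        return frozenset([gene_tree]), 0
    dups = 0
    child_maps = []
    species = frozenset()
    for child in gene_tree:
        mapped, d = count_duplications(child, species_clades)
        child_maps.append(mapped)
        dups += d
        species |= mapped
    here = lca(species_clades, species)
    # A node is a duplication if it maps to the same clade as one of its children.
    if here in child_maps:
        dups += 1
    return here, dups

# Hypothetical three-species example.
species_tree = (("A", "B"), "C")
sp = clades(species_tree)
print(count_duplications((("A", "B"), "C"), sp)[1])          # 0
print(count_duplications((("A", "B"), ("A", "C")), sp)[1])   # 1
```

The second gene tree disagrees with the species tree, and under this mapping the disagreement is explained by a single duplication at the root; a fuller model would also weigh loss, transfer and lineage sorting as alternative explanations.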
Lin, Ying-Chung; Li, Wei; Sun, Ying-Hsuan; Kumari, Sapna; Wei, Hairong; Li, Quanzi; Tunlaya-Anukit, Sermsawat; Sederoff, Ronald R.; Chiang, Vincent L.
2013-01-01
Wood is an essential renewable raw material for industrial products and energy. However, knowledge of the genetic regulation of wood formation is limited. We developed a genome-wide high-throughput system for the discovery and validation of specific transcription factor (TF)–directed hierarchical gene regulatory networks (hGRNs) in wood formation. This system depends on a new robust procedure for isolation and transfection of Populus trichocarpa stem differentiating xylem protoplasts. We overexpressed Secondary Wall-Associated NAC Domain 1s (Ptr-SND1-B1), a TF gene affecting wood formation, in these protoplasts and identified differentially expressed genes by RNA sequencing. Direct Ptr-SND1-B1–DNA interactions were then inferred by integration of time-course RNA sequencing data and top-down Graphical Gaussian Modeling–based algorithms. These Ptr-SND1-B1-DNA interactions were verified to function in differentiating xylem by anti-PtrSND1-B1 antibody-based chromatin immunoprecipitation (97% accuracy) and in stable transgenic P. trichocarpa (90% accuracy). In this way, we established a Ptr-SND1-B1–directed quantitative hGRN involving 76 direct targets, including eight TF and 61 enzyme-coding genes previously unidentified as targets. The network can be extended to the third layer from the second-layer TFs by computation or by overexpression of a second-layer TF to identify a new group of direct targets (third layer). This approach would allow the sequential establishment, one two-layered hGRN at a time, of all layers involved in a more comprehensive hGRN. Our approach may be particularly useful to study hGRNs in complex processes in plant species resistant to stable genetic transformation and where mutants are unavailable. PMID:24280390
Identifying metabolic enzymes with multiple types of association evidence
Kharchenko, Peter; Chen, Lifeng; Freund, Yoav; Vitkup, Dennis; Church, George M
2006-01-01
Background Existing large-scale metabolic models of sequenced organisms commonly include enzymatic functions which cannot be attributed to any gene in that organism. Existing computational strategies for identifying such missing genes rely primarily on sequence homology to known enzyme-encoding genes. Results We present a novel method for identifying genes encoding a specific metabolic function based on the local structure of the metabolic network and multiple types of functional association evidence, including clustering of genes on the chromosome, similarity of phylogenetic profiles, gene expression, protein fusion events and others. Using E. coli and S. cerevisiae metabolic networks, we illustrate the predictive ability of each individual type of association evidence and show that significantly better predictions can be obtained based on the combination of all data. In this way our method is able to predict 60% of enzyme-encoding genes of E. coli metabolism within the top 10 (out of 3551) candidates for their enzymatic function, and as the top candidate in 43% of the cases. Conclusion We illustrate that a combination of genome context and other functional association evidence is effective in predicting genes encoding metabolic enzymes. Our approach does not rely on direct sequence homology to known enzyme-encoding genes, and can be used in conjunction with traditional homology-based metabolic reconstruction methods. The method can also be used to target orphan metabolic activities. PMID:16571130
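The idea of ranking candidate genes by combining several kinds of association evidence can be sketched as follows. This is only an illustration under stated assumptions: the gene names and scores are invented, and the combination rule (an unweighted mean of rank-normalized scores) is a stand-in chosen for simplicity, not the method used in the paper.

```python
def rank_normalize(scores, genes):
    """Map raw scores to [0, 1] by the fraction of candidates scoring lower.

    Genes with no score for this evidence type are treated as the weakest.
    """
    vals = {g: scores.get(g, float("-inf")) for g in genes}
    denom = max(len(genes) - 1, 1)
    return {g: sum(v < vals[g] for v in vals.values()) / denom for g in genes}

def rank_candidates(evidence):
    """Rank genes by the mean of their rank-normalized scores.

    evidence: {evidence_type: {gene: score}}, where higher means stronger
    association with the metabolic function being filled.
    """
    genes = sorted(set().union(*(s.keys() for s in evidence.values())))
    combined = {g: 0.0 for g in genes}
    for scores in evidence.values():
        normalized = rank_normalize(scores, genes)
        for g in genes:
            combined[g] += normalized[g] / len(evidence)
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Hypothetical scores for three candidate genes.
evidence = {
    "chromosomal_clustering": {"geneA": 0.9, "geneB": 0.2, "geneC": 0.5},
    "phylogenetic_profile": {"geneA": 0.4, "geneB": 0.1, "geneC": 0.8},
    "coexpression": {"geneA": 0.7, "geneB": 0.3, "geneC": 0.6},
}
for gene, score in rank_candidates(evidence):
    print(gene, round(score, 3))
```

Rank normalization is used here so that evidence types measured on different scales contribute comparably; a learned weighting would be the natural next step if labeled examples were available.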
Haygood, M G
1990-01-01
Flashlight fishes (family Anomalopidae) have light organs that contain luminous bacterial symbionts. Although the symbionts have not yet been successfully cultured, the luciferase genes have been cloned directly from the light organ of the Caribbean species, Kryptophanaron alfredi. The goal of this project was to evaluate the relationship of the symbiont to free-living luminous bacteria by comparison of genes coding for bacterial luciferase (lux genes). Hybridization of a lux AB probe from the Kryptophanaron alfredi symbiont to DNAs from 9 strains (8 species) of luminous bacteria showed that none of the strains tested had lux genes highly similar to the symbiont. The most similar were a group consisting of Vibrio harveyi, Vibrio splendidus and Vibrio orientalis. The nucleotide sequence of the luciferase alpha subunit gene (luxA) of the Kryptophanaron alfredi symbiont was determined in order to do a more detailed comparison with published luxA sequences from Vibrio harveyi, Vibrio fischeri and Photobacterium leiognathi. The hybridization results, sequence comparisons and the mol% G + C of the Kryptophanaron alfredi symbiont luxA gene suggest that the symbiont may be considered as a new species of luminous Vibrio related to Vibrio harveyi.
Hase, Ryota; Hirooka, Takuya; Itabashi, Takashi; Endo, Yasunobu; Otsuka, Yoshihito
2018-05-15
A 65-year-old man presented with gradually exacerbating low back pain. Magnetic resonance imaging revealed vertebral osteomyelitis in the Th11-L2 vertebral bodies and discs. The patient showed negative findings on conventional cultures. Direct broad-range polymerase chain reaction (PCR) with sequencing of the biopsied specimen had the highest similarity to the 16S rRNA gene of Helicobacter cinaedi. This case suggests that direct broad-range PCR with sequencing should be considered when conventional cultures cannot identify the causative organism of vertebral osteomyelitis, and that this method may be particularly useful when the pathogen is a fastidious organism, such as H. cinaedi.
Vouille, V; Amiche, M; Nicolas, P
1997-09-01
We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, in which exon 1 encodes a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contains the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibians.
MYO7A and USH2A gene sequence variants in Italian patients with Usher syndrome.
Sodi, Andrea; Mariottini, Alessandro; Passerini, Ilaria; Murro, Vittoria; Tachyla, Iryna; Bianchi, Benedetta; Menchini, Ugo; Torricelli, Francesca
2014-01-01
To analyze the spectrum of sequence variants in the MYO7A and USH2A genes in a group of Italian patients affected by Usher syndrome (USH). Thirty-six Italian patients with a diagnosis of USH were recruited. They received a standard ophthalmologic examination, visual field testing, optical coherence tomography (OCT) scan, and electrophysiological tests. Fluorescein angiography and fundus autofluorescence imaging were performed in selected cases. All the patients underwent an audiologic examination for the 0.25-8,000 Hz frequencies. Vestibular function was evaluated with specific tests. DNA samples were analyzed for sequence variants of the MYO7A gene (for USH1) and the USH2A gene (for USH2) with direct sequencing techniques. A few patients were analyzed for both genes. In the MYO7A gene, ten missense variants were found; three patients were compound heterozygous, and two were homozygous. Thirty-four USH2A gene variants were detected, including eight missense variants, nine nonsense variants, six splicing variants, and 11 duplications/deletions; 19 patients were compound heterozygous, and three were homozygous. Four MYO7A and 17 USH2A variants have already been described in the literature. Among the novel mutations there are four USH2A large deletions, detected with multiplex ligation dependent probe amplification (MLPA) technology. Two potentially pathogenic variants were found in 27 patients (75%). Affected patients showed variable clinical pictures without a clear genotype-phenotype correlation. Ten variants in the MYO7A gene and 34 variants in the USH2A gene were detected in Italian patients with USH at a high detection rate. A selective analysis of these genes may be valuable for molecular analysis, combining diagnostic efficiency with little time wastage and less resource consumption.
Seamless editing of the chloroplast genome in plants.
Martin Avila, Elena; Gisby, Martin F; Day, Anil
2016-07-29
Gene editing technologies enable the precise insertion of favourable mutations and performance enhancing trait genes into chromosomes whilst excluding all excess DNA from modified genomes. The technology gives rise to a new class of biotech crops which is likely to have widespread applications in agriculture. Despite progress in the nucleus, the seamless insertions of point mutations and non-selectable foreign genes into the organelle genomes of crops have not been described. The chloroplast genome is an attractive target to improve photosynthesis and crop performance. Current chloroplast genome engineering technologies for introducing point mutations into native chloroplast genes leave DNA scars, such as the target sites for recombination enzymes. Seamless editing methods to modify chloroplast genes need to address reversal of site-directed point mutations by template mediated repair with the vast excess of wild type chloroplast genomes that are present early in the transformation process. Using tobacco, we developed an efficient two-step method to edit a chloroplast gene by replacing the wild type sequence with a transient intermediate. This was resolved to the final edited gene by recombination between imperfect direct repeats. Six out of 11 transplastomic plants isolated contained the desired intermediate and at the second step this was resolved to the edited chloroplast gene in five of six plants tested. Maintenance of a single base deletion mutation in an imperfect direct repeat of the native chloroplast rbcL gene showed the limited influence of biased repair back to the wild type sequence. The deletion caused a frameshift, which replaced the five C-terminal amino acids of the Rubisco large subunit with 16 alternative residues resulting in a ~30-fold reduction in its accumulation. We monitored the process in vivo by engineering an overlapping gusA gene downstream of the edited rbcL gene. 
Translational coupling between the overlapping rbcL and gusA genes resulted in relatively high GUS accumulation (~0.5 % of leaf protein). Editing chloroplast genomes using transient imperfect direct repeats provides an efficient method for introducing point mutations into chloroplast genes. Moreover, we describe the first synthetic operon allowing expression of a downstream overlapping gene by translational coupling in chloroplasts. Overlapping genes provide a new mechanism for co-ordinating the translation of foreign proteins in chloroplasts.
Regulation of Bacteria-Induced Intercellular Adhesion Molecule-1 by CCAAT/Enhancer Binding Proteins
Manzel, Lori J.; Chin, Cecilia L.; Behlke, Mark A.; Look, Dwight C.
2009-01-01
Direct interaction between bacteria and epithelial cells may initiate or amplify the airway response through induction of epithelial defense gene expression by nuclear factor-κB (NF-κB). However, multiple signaling pathways modify NF-κB effects to modulate gene expression. In this study, the effects of CCAAT/enhancer binding protein (C/EBP) family members on induction of the leukocyte adhesion glycoprotein intercellular adhesion molecule-1 (ICAM-1) were examined in primary cultures of human tracheobronchial epithelial cells incubated with nontypeable Haemophilus influenzae. Increased ICAM-1 gene transcription in response to H. influenzae required gene sequences located at −200 to −135 in the 5′-flanking region that contain a C/EBP-binding sequence immediately upstream of the NF-κB enhancer site. Constitutive C/EBPβ was found to have an important role in epithelial cell ICAM-1 regulation, while the adjacent NF-κB sequence binds the RelA/p65 and NF-κB1/p50 members of the NF-κB family to induce ICAM-1 expression in response to H. influenzae. The expression of C/EBP proteins is not regulated by p38 mitogen-activated protein kinase activation, but p38 affects gene transcription by increasing the binding of TATA-binding protein to TATA-box–containing gene sequences. Epithelial cell ICAM-1 expression in response to H. influenzae was decreased by expressing dominant-negative protein or RNA interference against C/EBPβ, confirming its role in ICAM-1 regulation. Although airway epithelial cells express multiple constitutive and inducible C/EBP family members that bind C/EBP sequences, the results indicate that C/EBPβ plays a central role in modulation of NF-κB–dependent defense gene expression in human airway epithelial cells after exposure to H. influenzae. PMID:18703796
Kalay, Gizem; Lusk, Richard; Dome, Mackenzie; Hens, Korneel; Deplancke, Bart; Wittkopp, Patricia J.
2016-01-01
The regulation of gene expression controls development, and changes in this regulation often contribute to phenotypic evolution. Drosophila pigmentation is a model system for studying evolutionary changes in gene regulation, with differences in expression of pigmentation genes such as yellow that correlate with divergent pigment patterns among species shown to be caused by changes in cis- and trans-regulation. Currently, much more is known about the cis-regulatory component of divergent yellow expression than the trans-regulatory component, in part because very few trans-acting regulators of yellow expression have been identified. This study aims to improve our understanding of the trans-acting control of yellow expression by combining yeast-one-hybrid and RNAi screens for transcription factors binding to yellow cis-regulatory sequences and affecting abdominal pigmentation in adults, respectively. Of the 670 transcription factors included in the yeast-one-hybrid screen, 45 showed evidence of binding to one or more sequence fragments tested from the 5′ intergenic and intronic yellow sequences from D. melanogaster, D. pseudoobscura, and D. willistoni, suggesting that they might be direct regulators of yellow expression. Of the 670 transcription factors included in the yeast-one-hybrid screen, plus another TF previously shown to be genetically upstream of yellow, 125 were also tested using RNAi, and 32 showed altered abdominal pigmentation. Nine transcription factors were identified in both screens, including four nuclear receptors related to ecdysone signaling (Hr78, Hr38, Hr46, and Eip78C). This finding suggests that yellow expression might be directly controlled by nuclear receptors influenced by ecdysone during early pupal development when adult pigmentation is forming. PMID:27527791
Chen, Letian; Wang, Fengpin; Wang, Xiaoyu; Liu, Yao-Guang
2013-01-01
Functional genomics requires vector construction for protein expression and functional characterization of target genes; therefore, a simple, flexible and low-cost molecular manipulation strategy will be highly advantageous for genomics approaches. Here, we describe an Ω-PCR strategy that enables multiple types of sequence modification, including precise insertion, deletion and substitution, in any position of a circular plasmid. Ω-PCR is based on an overlap extension site-directed mutagenesis technique, and is named for its characteristic Ω-shaped secondary structure during PCR. Ω-PCR can be performed either in two steps, or in one tube in combination with exonuclease I treatment. These strategies have wide applications for protein engineering, gene function analysis and in vitro gene splicing. PMID:23335613
Metagenomics: Probing pollutant fate in natural and engineered ecosystems.
Bouhajja, Emna; Agathos, Spiros N; George, Isabelle F
2016-12-01
Polluted environments are a reservoir of microbial species able to degrade or to convert pollutants to harmless compounds. The proper management of microbial resources requires a comprehensive characterization of their genetic pool to assess the fate of contaminants and increase the efficiency of bioremediation processes. Metagenomics offers appropriate tools to describe microbial communities in their whole complexity without lab-based cultivation of individual strains. After a decade of use of metagenomics to study microbiomes, the scientific community has made significant progress in this field. In this review, we survey the main steps of metagenomics applied to environments contaminated with organic compounds or heavy metals. We emphasize technical solutions proposed to overcome encountered obstacles. We then compare two metagenomic approaches, i.e. library-based targeted metagenomics and direct sequencing of metagenomes. In the former, environmental DNA is cloned inside a host, and then clones of interest are selected based on (i) their expression of biodegradative functions or (ii) sequence homology with probes and primers designed from relevant, already known sequences. The highest score for the discovery of novel genes and degradation pathways has been achieved so far by functional screening of large clone libraries. On the other hand, direct sequencing of metagenomes without a cloning step has been more often applied to polluted environments for characterization of the taxonomic and functional composition of microbial communities and their dynamics. In this case, the analysis has focused on 16S rRNA genes and marker genes of biodegradation. Advances in next generation sequencing and in bioinformatic analysis of sequencing data have opened up new opportunities for assessing the potential of biodegradation by microbes, but annotation of collected genes is still hampered by a limited number of available reference sequences in databases. 
Although metagenomics is still facing technical and computational challenges, our review of the recent literature highlights its value as an aid to efficiently monitor the clean-up of contaminated environments and develop successful strategies to mitigate the impact of pollutants on ecosystems. Copyright © 2016 Elsevier Inc. All rights reserved.
Park, Jeong-Hoon; Park, Jong-Hun; Je Seong, Hoon; Sul, Woo Jun; Jin, Kang-Hyun; Park, Hee-Deung
2018-07-01
To provide insight into direct interspecies electron transfer via granular activated carbon (GAC), the effect of GAC supplementation on anaerobic digestion was evaluated. Compared to control samples, the GAC supplementation increased the total amount of methane production and its production rate by 31% and 72%, respectively. 16S rDNA sequencing analysis revealed a shift in the archaeal community composition; the Methanosarcina proportion decreased 17%, while the Methanosaeta proportion increased 5.6%. Metagenomic analyses based on shotgun sequencing demonstrated that the abundance of pilA and omcS genes belonging to Geobacter species decreased 69.4% and 29.4%, respectively. Furthermore, the analyses suggested a carbon dioxide reduction pathway rather than an acetate decarboxylation pathway for methane formation. Taken together, these results suggest that GAC improved methane production performance by shifting the microbial community and altering functional genes associated with direct interspecies electron transfer via conductive materials. Copyright © 2018 Elsevier Ltd. All rights reserved.
A genomic approach to the understanding of Xylella fastidiosa pathogenicity.
Lambais, M R; Goldman, M H; Camargo, L E; Goldman, G H
2000-10-01
Xylella fastidiosa is a fastidious, xylem-limited bacterium that causes several economically important plant diseases, including citrus variegated chlorosis (CVC). X. fastidiosa is the first plant pathogen to have its genome completely sequenced. In addition, it is probably the least previously studied of any organism for which the complete genome sequence is available. Several pathogenicity-related genes have been identified in the X. fastidiosa genome by similarity with other bacterial genes involved in pathogenesis in plants, as well as in animals. The X. fastidiosa genome encodes different classes of proteins directly or indirectly involved in cell-cell interactions, degradation of plant cell walls, iron homeostasis, anti-oxidant responses, synthesis of toxins, and regulation of pathogenicity. Neither genes encoding members of the type III protein secretion system nor avirulence-like genes have been identified in X. fastidiosa.
Fingerprinting of HLA class I genes for improved selection of unrelated bone marrow donors.
Martinelli, G; Farabegoli, P; Buzzi, M; Panzica, G; Zaccaria, A; Bandini, G; Calori, E; Testoni, N; Rosti, G; Conte, R; Remiddi, C; Salvucci, M; De Vivo, A; Tura, S
1996-02-01
The degree of matching of HLA genes between the selected donor and recipient is an important aspect of the selection of unrelated donors for allogeneic bone marrow transplantation (UBMT). The most sensitive methods currently used are serological typing of HLA class I genes, mixed lymphocyte culture (MLC), IEF and molecular genotyping of HLA class II genes by direct sequencing of PCR products. Serological typing of class I antigens (A, B and C) fails to detect minor differences demonstrated by direct sequencing of DNA polymorphic regions. Molecular genotyping of HLA class I genes by DNA analysis is costly and work-intensive. To improve compatibility between donor and recipient, we have set up a new rapid and non-radioisotopic application of the 'fingerprinting PCR' technique for the analysis of the polymorphic second exon of the HLA class I A, B and C genes. This technique is based on the formation of specific patterns (PCR fingerprints) of homoduplexes and heteroduplexes between heterologous amplified DNA sequences. After an electrophoretic run on non-denaturing polyacrylamide gel, different HLA class I types give allele-specific banding patterns. HLA class I matching is performed, after the gel has been soaked in ethidium bromide or silver-stained, by visual comparison of patients' fingerprints with those of donors. Identity can be confirmed by mixing donor and recipient DNAs in an amplification cross-match. To assess the technique, 10 normal samples, 22 related allogeneic bone marrow transplanted pairs and 10 unrelated HLA-A and HLA-B serologically matched patient-donor pairs were analysed for HLA class I polymorphic regions. In all the related pairs and in 1/10 unrelated pairs, matched donor-recipient patterns were identified. This new application of PCR fingerprinting may confirm the HLA class I serological selection of unrelated marrow donors.
2014-01-01
Background Wheat glutenin polymers are made up of two main subunit types, the high- (HMW-GS) and low- (LMW-GS) molecular weight subunits. These latter are represented by heterogeneous proteins. The most common, based on the first amino acid of the mature sequence, are known as LMW-m and LMW-s types. The mature sequences differ as a consequence of three extra amino acids (MET-) at the N-terminus of LMW-m types. The nucleotide sequences of their encoding genes are, however, nearly identical, so that the relationship between gene and protein sequences is difficult to ascertain. It has been hypothesized that the presence of an asparagine residue in position 23 of the complete coding sequence for the LMW-s type might account for the observed three-residue shortened sequence, as a consequence of cleavage at the asparagine by an asparaginyl endopeptidase. Results We performed site-directed mutagenesis of a LMW-s gene to replace asparagine at position 23 with threonine and thus convert it to a candidate LMW-m type gene. Similarly, a candidate LMW-m type gene was mutated at position 23 to replace threonine with asparagine. Next, we produced transgenic durum wheat (cultivar Svevo) lines by introducing the mutated versions of the LMW-m and LMW-s genes, along with the wild type counterpart of the LMW-m gene. Proteomic comparisons between the transgenic and null segregant plants enabled identification of transgenic proteins by mass spectrometry analyses and Edman N-terminal sequencing. Conclusions Our results show that the formation of LMW-s type relies on the presence of an asparagine residue close to the N-terminus generated by signal peptide cleavage, and that LMW-GS can be quantitatively processed most likely by vacuolar asparaginyl endoproteases, suggesting that those accumulated in the vacuole are not sequestered into stable aggregates that would hinder the action of proteolytic enzymes. 
Rather, whatever is the mechanism of glutenin polymer transport to the vacuole, the proteins remain available for proteolytic processing, and can be converted to the mature form by the removal of a short N-terminal sequence. PMID:24629124
Selection of homeotic proteins for binding to a human DNA replication origin.
de Stanchina, E; Gabellini, D; Norio, P; Giacca, M; Peverali, F A; Riva, S; Falaschi, A; Biamonti, G
2000-06-09
We have previously shown that a cell cycle-dependent nucleoprotein complex assembles in vivo on a 74 bp sequence within the human DNA replication origin associated with the Lamin B2 gene. Here, we report the identification, using a one-hybrid screen in yeast, of three proteins interacting with the 74 bp sequence. All of them, namely HOXA13, HOXC10 and HOXC13, are orthologues of the Abdominal-B gene of Drosophila melanogaster and are members of the homeogene family of developmental regulators. We describe the complete open reading frame sequence of HOXC10 and HOXC13 along with the structure of the HoxC13 gene. The specificity of binding of these two proteins to the Lamin B2 origin is confirmed by both band-shift and in vitro footprinting assays. In addition, the ability of HOXC10 and HOXC13 to increase the activity of a promoter containing the 74 bp sequence, as assayed by CAT-assay experiments, demonstrates a direct interaction of these homeoproteins with the origin sequence in mammalian cells. We also show that HOXC10 expression is cell-type-dependent and positively correlates with cell proliferation. Copyright 2000 Academic Press.
Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M
2005-08-01
The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. (c) 2005 Wiley-Liss, Inc.
Chassaing, Nicolas; Vigouroux, Adeline; Calvas, Patrick
2009-06-01
Microphthalmia and anophthalmia are at the severe end of the spectrum of abnormalities in ocular development. A few genes (SOX2, OTX2, RAX, and CHX10) have been implicated in isolated micro/anophthalmia, but causative mutations of these genes explain less than a quarter of these developmental defects. A specifically conserved SOX2/OTX2-mediated RAX expression regulatory sequence has recently been identified. We postulated that mutations in this sequence could lead to micro/anophthalmia, and thus we performed molecular screening of this regulatory element in patients suffering from micro/anophthalmia. Fifty-one patients suffering from nonsyndromic microphthalmia (n = 40) or anophthalmia (n = 11) were included in this study after negative molecular screening for SOX2, OTX2, RAX, and CHX10 mutations. Mutation screening of the RAX regulatory sequence was performed by direct sequencing for these patients. No mutations were identified in the highly conserved RAX regulatory sequence in any of the 51 patients. Mutations in the newly identified RAX regulatory sequence do not represent a frequent cause of nonsyndromic micro/anophthalmia.
Lin, Min; Dan, Hanhong; Li, Yijing
2004-02-01
Leptospira borgpetersenii, one of the causative agents of leptospirosis in both animals and humans, is a bacterial pathogen with characteristic motility that is mediated by the rotation of two periplasmic flagella (PF). The flaB gene coding for a core polypeptide subunit of PF was previously characterized by sequence analysis of its open reading frame (ORF) (M. Lin, J Biochem Mol Biol Biophys 2:181-187, 1999). The present study was undertaken to isolate and clone the uncharacterized sequence upstream of the flaB gene by using a PCR-based genome walking procedure. This has resulted in a 1470-bp genomic DNA sequence in which an 846-bp ORF coding for a 281-amino acid polypeptide (31.3 kDa) is identified 455 bp upstream from the flaB start codon. The encoded protein exhibits 72% amino acid identity to the deduced FlaB protein sequence of L. borgpetersenii and a high degree of sequence homology to the FlaB proteins of other spirochaetes. This has demonstrated for the first time that a second flaB gene homolog is present in a Leptospira species. The newly identified gene is designated flaB1, and the previously cloned flaB renamed flaB2. Within the intergenic sequence between flaB1 and flaB2, a potential stem-loop structure (12-bp inverted repeats) was identified 25 bp downstream of the flaB1 stop codon; this could serve as a transcription terminator for the flaB1 mRNA. Three E. coli-like promoter regions (I, II, and III) for binding Esigma(70), a regulatory sequence uncommonly found in flagellar genes, were predicted upstream of the flaB2 ORF. Only promoter region II contains a promoter that is functional in E. coli, as revealed at phenotypic and transcriptional levels by its capability of directing the expression of the chloramphenicol acetyltransferase (CAT) gene in the promoter probe vector pKK232-8. These observations may suggest that flaB1 and flaB2 are transcribed separately and do not form a transcriptional operon controlled by a single promoter.
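The figures quoted for flaB1 are internally consistent, which a short calculation can confirm. The sketch below assumes that the 846-bp ORF length includes the stop codon (the abstract does not say so explicitly); the function name is hypothetical and chosen only for illustration.

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


residues = orf_protein_length(846)
# Mean residue mass implied by the reported 31.3 kDa polypeptide.
mean_residue_mass_da = 31_300 / residues
print(residues, round(mean_residue_mass_da, 1))
```

Under that assumption, 846 bp corresponds to 282 codons and 281 residues, and the implied mean residue mass of about 111 Da is in the range typical of proteins.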
Angsuthanasombat, C; Chungjatupornchai, W; Kertbundit, S; Luxananil, P; Settasatian, C; Wilairat, P; Panyim, S
1987-07-01
Five recombinant E. coli clones exhibiting toxicity to Aedes aegypti larvae were obtained from a library of 800 clones containing XbaI DNA fragments of the 110-kb plasmid from B. thuringiensis var. israelensis. All five clones (pMU 14/258/303/388/679) had the same 3.8-kb insert and encoded a major protein of 130 kDa which was highly toxic to A. aegypti larvae. Three clones (pMU 258/303/388) transcribed the 130-kDa gene in the same direction as that of the lacZ promoter of the pUC12 vector, whereas the transcription of the other two (pMU 14/679) was in the opposite direction. A 1.9-kb fragment of the 3.8-kb insert coded for a protein of 65 kDa. Partial DNA sequence of the 3.8-kb insert, corresponding to the 5'-terminus of the 130-kDa gene, revealed a continuous reading frame, a Shine-Dalgarno sequence and a tentative 5'-regulatory region. These results demonstrated that the 3.8-kb insert is a minimal DNA fragment containing a regulatory region plus the coding sequence of the 130-kDa protein that is highly toxic to mosquito larvae.
Gene Chips: A New Tool for Biology
NASA Astrophysics Data System (ADS)
Botstein, David
2005-03-01
The knowledge of many complete genomic sequences has led to a "grand unification of biology," consisting of direct evidence that most of the basic cellular functions of all organisms are carried out by genes and proteins whose primary sequences are directly related by descent (i.e. orthologs). Further, genome sequences have made it possible to study all the genes of a single organism simultaneously. We have been using DNA microarrays (sometimes referred to as "gene chips") to study patterns of gene expression and genome rearrangement in yeast and human cells under a variety of conditions and in human tumors and normal tissues. These experiments produce huge volumes of data; new computational and statistical methods are required to analyze them properly. Examples from this work will be presented to illustrate how genome-scale experiments and analysis can result in new biological insights not obtainable by traditional analyses of genes and proteins one by one. For lymphomas, breast tumors, lung tumors, liver tumors, gastric tumors, brain tumors and soft tissue tumors we have been able, by the application of clustering algorithms, to subclassify tumors of similar anatomical origin on the basis of their gene expression patterns. These subclassifications appear to be reproducible and clinically as well as biologically meaningful. By studying synchronized cells growing in culture, we have identified many hundreds of yeast and human genes that are expressed periodically, at characteristically different points in the cell division cycle. In humans, it turns out that most of these genes are the same genes that comprise the "proliferation cluster," i.e. the genes whose expression is specifically associated with the proliferativeness of tumors and tumor cell lines. 
Finally, we have been applying a variant of our DNA microarray technology (which we call "array comparative hybridization") to follow the DNA copy number of genes, both in tumors and in yeast cells undergoing adaptive evolution during hundreds of generations of growth in continuous culture. These studies suggest a basic similarity in mechanism between adaptive evolution in yeast and tumor progression in humans.
Fine-tuning gene networks using simple sequence repeats
Egbert, Robert G.; Klavins, Eric
2012-01-01
The parameters in a complex synthetic gene network must be extensively tuned before the network functions as designed. Here, we introduce a simple and general approach to rapidly tune gene networks in Escherichia coli using hypermutable simple sequence repeats embedded in the spacer region of the ribosome binding site. By varying repeat length, we generated expression libraries that incrementally and predictably sample gene expression levels over a 1,000-fold range. We demonstrate the utility of the approach by creating a bistable switch library that programmatically samples the expression space to balance the two states of the switch, and we illustrate the need for tuning by showing that the switch’s behavior is sensitive to host context. Further, we show that mutation rates of the repeats are controllable in vivo for stability or for targeted mutagenesis—suggesting a new approach to optimizing gene networks via directed evolution. This tuning methodology should accelerate the process of engineering functionally complex gene networks. PMID:22927382
Jiang, Lulu; Hindmarch, Charles C. T.; Rogers, Mark; Campbell, Colin; Waterfall, Christy; Coghill, Jane; Mathieson, Peter W.; Welsh, Gavin I.
2016-01-01
Glucocorticoids are steroids that reduce inflammation and are used as immunosuppressive drugs for many diseases. They are also the mainstay for the treatment of minimal change nephropathy (MCN), which is characterised by an absence of inflammation. Their mechanisms of action remain elusive. Evidence suggests that immunomodulatory drugs can directly act on glomerular epithelial cells or ‘podocytes’, the cell type which is the main target of injury in MCN. To understand the nature of glucocorticoid effects on non-immune cell functions, we generated RNA sequencing data from human podocyte cell lines and identified the genes that are significantly regulated in dexamethasone-treated podocytes compared to vehicle-treated cells. The upregulated genes are of functional relevance to cytoskeleton-related processes, whereas the downregulated genes mostly encode pro-inflammatory cytokines and growth factors. We observed a tendency for dexamethasone-upregulated genes to be downregulated in MCN patients. Integrative analysis revealed gene networks composed of critical signaling pathways that are likely targeted by dexamethasone in podocytes. PMID:27774996
Oem, Jae-Ku; Xiang, Zhonghua; Zhou, Yan; Babiuk, Lorne A; Liu, Qiang
2007-09-01
Hepatitis C virus (HCV) causes severe liver diseases in a large population worldwide. HCV protein translation is controlled by an internal ribosomal entry site (IRES) within the 5'-untranslated region (UTR). HCV IRES-dependent translation is critical for HCV-associated pathogenesis. The aim of this study was to develop a plasmid DNA transfection system using RNA polymerase I promoter and terminator sequences for studying HCV IRES-dependent translation. A gene cassette containing HCV 5'-UTR, Renilla luciferase reporter gene, and HCV 3'-UTR was inserted between RNA polymerase I promoter and terminator sequences. HCV IRES-directed translation was determined by luciferase assay after transfection. Transfection of the RNA polymerase I-HCV IRES plasmid into human hepatoma Huh-7 and HepG2 cells resulted in luciferase gene expression. Deletion of the IIIf domain in HCV IRES dramatically reduced luciferase activity. Our results indicate that the plasmid vector system based on RNA polymerase I promoter and terminator sequences represents an effective approach for the study of HCV IRES-dependent translation.
NASA Astrophysics Data System (ADS)
Wu, Jiangling; Huang, Yu; Bian, Xintong; Li, DanDan; Cheng, Quan; Ding, Shijia
2016-10-01
In this work, a custom-made intensity-interrogation surface plasmon resonance imaging (SPRi) system has been developed to directly detect a specific sequence of the BCR/ABL fusion gene in chronic myelogenous leukemia (CML). The variation in the reflected light intensity detected from the sensor chip, which is composed of an array of gold islands, is proportional to the change of refractive index due to the selective hybridization of surface-bound DNA probes with target ssDNA. SPRi measurements were performed with different concentrations of synthetic target DNA sequence. The calibration curve of the synthetic target sequence shows a good relationship between the concentration of synthetic target and the change of reflected light intensity. The detection limit of this SPRi measurement could approach 10.29 nM. By comparing SPRi images, the target ssDNA and non-complementary DNA sequence can be distinguished. This SPRi system has been applied to the assay of the BCR/ABL fusion gene extracted from real samples. This nucleic acid-based SPRi biosensor therefore offers an alternative highly effective, high-throughput, label-free tool for DNA detection in biomedical research and molecular diagnosis.
Automated Gene Ontology annotation for anonymous sequence data.
Hennig, Steffen; Groth, Detlef; Lehrach, Hans
2003-07-01
Gene Ontology (GO) is the most widely accepted attempt to construct a unified and structured vocabulary for the description of genes and their products in any organism. Annotation by GO terms is performed in most of the current genome projects, which besides generality has the advantage of being very convenient for computer based classification methods. However, direct use of GO in small sequencing projects is not easy, especially for species not commonly represented in public databases. We present a software package (GOblet), which performs annotation based on GO terms for anonymous cDNA or protein sequences. It uses the species independent GO structure and vocabulary together with a series of protein databases collected from various sites, to perform a detailed GO annotation by sequence similarity searches. The sensitivity and the reference protein sets can be selected by the user. GOblet runs automatically and is available as a public service on our web server. The paper also addresses the reliability of automated GO annotations by using a reference set of more than 6000 human proteins. The GOblet server is accessible at http://goblet.molgen.mpg.de.
Ji, Feng; Zhao, Jing-Zhuang; Liu, Miao; Lu, Tong-Yan; Liu, Hong-Bai; Yin, Jiasheng; Xu, Li-Ming
2017-04-01
Infectious pancreatic necrosis (IPN) is a significant disease of farmed salmonids resulting in direct economic losses due to high mortality in China. However, no gene sequence of any Chinese infectious pancreatic necrosis virus (IPNV) isolate was available. In this study, moribund rainbow trout fry samples were collected during an outbreak of IPN in Yunnan province of southwest China in 2013. An IPNV was isolated and tentatively named ChRtm213. We determined the full genome sequence of the IPNV ChRtm213 and compared it with previously identified IPNV sequences worldwide. The sequences of different structural and non-structural protein genes were compared to those of other aquatic birnaviruses sequenced to date. The results indicated that the complete genome sequence of the ChRtm213 strain contains a segment A (3099 nucleotides) coding for a polyprotein VP2-VP4-VP3, and a segment B (2789 nucleotides) coding for an RNA-dependent RNA polymerase VP1. The phylogenetic analyses showed that the ChRtm213 strain fell within genogroup 1, serotype A9 (Jasper), having similarities of 96.3% (segment A) and 97.3% (segment B) with the IPNV strain AM98 from Japan. The results suggest that the Chinese IPNV isolate has a relatively close relationship with Japanese IPNV strains. The sequence of ChRtm213 was the first gene sequence of an IPNV isolate from China. This study provided a robust reference for diagnosis and/or control of IPNV prevalent in China.
Metabolic Pathway Assignment of Plant Genes based on Phylogenetic Profiling–A Feasibility Study
Weißenborn, Sandra; Walther, Dirk
2017-01-01
Despite many developed experimental and computational approaches, functional gene annotation remains challenging. With the rapidly growing number of sequenced genomes, the concept of phylogenetic profiling, which predicts functional links between genes that share a common co-occurrence pattern across different genomes, has gained renewed attention as it promises to annotate gene functions based on presence/absence calls alone. We applied phylogenetic profiling to the problem of metabolic pathway assignments of plant genes with a particular focus on secondary metabolism pathways. We determined phylogenetic profiles for 40,960 metabolic pathway enzyme genes with assigned EC numbers from 24 plant species based on sequence and pathway annotation data from KEGG and Ensembl Plants. For gene sequence family assignments, needed to determine the presence or absence of particular gene functions in the given plant species, we included data of all 39 species available at the Ensembl Plants database and established gene families based on pairwise sequence identities and annotation information. Aside from performing profiling comparisons, we used machine learning approaches to predict pathway associations from phylogenetic profiles alone. Selected metabolic pathways were indeed found to be composed of gene families of greater than expected phylogenetic profile similarity. This was particularly evident for primary metabolism pathways, whereas for secondary pathways, both the available annotation in different species as well as the abstraction of functional association via distinct pathways proved limiting. While phylogenetic profile similarity was generally not found to correlate with gene co-expression, direct physical interactions of proteins were reflected by a significantly increased profile similarity suggesting an application of phylogenetic profiling methods as a filtering step in the identification of protein-protein interactions. 
This feasibility study highlights the potential and challenges associated with phylogenetic profiling methods for the detection of functional relationships between genes as well as the need to enlarge the set of plant genes with proven secondary metabolism involvement as well as the limitations of distinct pathways as abstractions of relationships between genes. PMID:29163570
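The core idea of phylogenetic profiling, comparing presence/absence patterns of gene families across genomes, can be illustrated with a small sketch. The abstract does not state which similarity measure was used, so the Jaccard index below is an assumption chosen for simplicity; the family names and the six-species profiles are hypothetical toy data, not values from the study.

```python
from typing import Sequence


def jaccard_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Jaccard index of two binary presence/absence profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same set of species")
    both = sum(1 for x, y in zip(a, b) if x and y)      # present in both
    either = sum(1 for x, y in zip(a, b) if x or y)     # present in at least one
    return both / either if either else 0.0


# Hypothetical profiles: one entry per species, 1 = family present, 0 = absent.
profiles = {
    "family_A": [1, 1, 0, 1, 0, 1],
    "family_B": [1, 1, 0, 1, 0, 0],
    "family_C": [0, 0, 1, 0, 1, 0],
}

print(jaccard_similarity(profiles["family_A"], profiles["family_B"]))  # 0.75
print(jaccard_similarity(profiles["family_A"], profiles["family_C"]))  # 0.0
```

In this toy example, families A and B co-occur in most species and would be candidates for a functional link, whereas A and C never co-occur.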
Gu, Lei-Lei; Li, Xin-Hua; Han, Yue; Zhang, Dong-Hua; Gong, Qi-Ming; Zhang, Xin-Xin
2014-02-25
Glycogen storage disease type Ia (GSD-Ia) is an autosomal recessive genetic disorder resulting in hypoglycemia, hepatomegaly and growth retardation. It is caused by mutations in the G6PC gene encoding glucose-6-phosphatase (G6Pase). To date, over 80 mutations have been identified in the G6PC gene. Here we report a novel mutation found in a Chinese patient with abnormal transaminases, hypoglycemia, hepatomegaly and short stature. Direct sequencing of the coding region and splice sites in the G6PC gene revealed a novel no-stop mutation, p.*358Yext*43, leading to a 43 amino-acid extension of G6Pase. The expression level of mutant G6Pase transcripts was only 7.8% relative to wild-type transcripts. This mutation was not found in 120 chromosomes from 60 unrelated healthy control subjects using direct sequencing, and was further confirmed by digestion with Rsa I restriction endonuclease. In conclusion, we revealed a novel no-stop mutation in this study which expands the spectrum of mutations in the G6PC gene. The molecular genetic analysis was indispensable to the diagnosis of GSD-Ia for the patient. Copyright © 2013 Elsevier B.V. All rights reserved.
Re-engineering adenovirus vector systems to enable high-throughput analyses of gene function.
Stanton, Richard J; McSharry, Brian P; Armstrong, Melanie; Tomasec, Peter; Wilkinson, Gavin W G
2008-12-01
With the enhanced capacity of bioinformatics to interrogate extensive banks of sequence data, more efficient technologies are needed to test gene function predictions. Replication-deficient recombinant adenovirus (Ad) vectors are widely used in expression analysis since they provide for extremely efficient expression of transgenes in a wide range of cell types. To facilitate rapid, high-throughput generation of recombinant viruses, we have re-engineered an adenovirus vector (designated AdZ) to allow single-step, directional gene insertion using recombineering technology. Recombineering allows for direct insertion into the Ad vector of PCR products, synthesized sequences, or oligonucleotides encoding shRNAs without requirement for a transfer vector. Vectors were optimized for high-throughput applications by making them "self-excising" through incorporating the I-SceI homing endonuclease into the vector, removing the need to linearize vectors prior to transfection into packaging cells. AdZ vectors allow genes to be expressed in their native form or with strep, V5, or GFP tags. Insertion of tetracycline operators downstream of the human cytomegalovirus major immediate early (HCMV MIE) promoter permits silencing of transgenes in helper cells expressing the tet repressor, thus making the vector compatible with the cloning of toxic gene products. The AdZ vector system is robust, straightforward, and suited to both sporadic and high-throughput applications.
The human oxytocin gene promoter is regulated by estrogens.
Richard, S; Zingg, H H
1990-04-15
Gonadal steroids affect brain function primarily by altering the expression of specific genes, yet the specific mechanisms by which neuronal target genes undergo such regulation are unknown. Recent evidence suggests that the expression of the neuropeptide gene for oxytocin (OT) is modulated by estrogens. We therefore examined the possibility that this regulation occurred via a direct interaction of the estrogen-receptor complex with cis-acting elements flanking the OT gene. DNA-mediated gene transfer experiments were performed using Neuro-2a neuroblastoma cells and chimeric plasmids containing portions of the human OT gene 5'-flanking region linked to the chloramphenicol acetyltransferase gene. We identified a 19-base pair region located at -164 to -146 upstream of the transcription start site which is capable of conferring estrogen responsiveness to the homologous as well as to a heterologous promoter. The hormonal response is strictly dependent on the presence of intracellular estrogen receptors, since estrogen-induced stimulation occurred only in Neuro-2a cells co-transfected with an expression vector for the human estrogen receptor. The identified region contains a novel imperfect palindrome (GGTGACCTTGACC) with sequence similarity to other estrogen response elements (EREs). To define cis-acting elements that function in synergism with the ERE, sequences 3' to the ERE were deleted, including the CCAAT box, two additional motifs corresponding to the right half of the ERE palindrome (TGACC), as well as a CTGCTAA heptamer similar to the "elegans box" found in Caenorhabditis elegans. Interestingly, optimal function of the identified ERE was fully independent of these elements and only required a short promoter region (-49 to +36). Our studies define a molecular mechanism by which estrogens can directly modulate OT gene expression. 
However, only a subset of OT neurons are capable of binding estrogens; therefore, direct action of estrogens on the OT gene may be restricted to a subpopulation of OT neurons.
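Why the 13-bp element GGTGACCTTGACC counts as an imperfect palindrome can be shown by comparing its left half with the reverse complement of its right half. The sketch below assumes the usual ERE layout of two 5-bp half-sites separated by a 3-bp spacer (as in the widely cited consensus GGTCAnnnTGACC); the abstract itself does not spell out this layout, and the function names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_mismatches(site: str, half_len: int = 5, spacer: int = 3) -> int:
    """Count positions at which the left half-site deviates from a perfect
    palindrome, i.e. from the reverse complement of the right half-site."""
    if len(site) != 2 * half_len + spacer:
        raise ValueError("site length does not match half_len and spacer")
    left = site[:half_len]
    right = site[half_len + spacer:]
    return sum(1 for a, b in zip(left, reverse_complement(right)) if a != b)


print(palindrome_mismatches("GGTGACCTTGACC"))  # OT gene element: 1 mismatch
print(palindrome_mismatches("GGTCACAGTGACC"))  # perfect palindrome: 0 mismatches
```

Under this assumed layout, the right half-site TGACC matches the consensus, while the left half-site GGTGA differs from GGTCA at a single position.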
Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji
2010-07-01
We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
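The pooling design implies a simple expectation for how often a rare variant should appear among the reads of a pool. The sketch below assumes diploid samples, equal DNA input from every individual, and unbiased amplification, none of which is stated in the abstract; the function name is hypothetical.

```python
from fractions import Fraction


def expected_variant_fraction(pool_size: int, het_carriers: int = 1,
                              ploidy: int = 2) -> Fraction:
    """Expected fraction of reads carrying a variant allele when het_carriers
    heterozygous individuals are present in a pool of pool_size individuals."""
    return Fraction(het_carriers, pool_size * ploidy)


pools, samples_per_pool = 16, 6
print(pools * samples_per_pool)                      # 96 samples in total
print(expected_variant_fraction(samples_per_pool))   # 1/12 for one carrier
```

A single heterozygous carrier in a pool of six would therefore contribute roughly 8% of reads at the variant position, which is the signal that has to be separated from sequencing error when calling single-nucleotide variants.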
Danno, Hiroki; Michiue, Tatsuo; Hitachi, Keisuke; Yukita, Akira; Ishiura, Shoichi; Asashima, Makoto
2008-01-01
The neural-related genes Sox2, Pax6, Otx2, and Rax have been associated with severe ocular malformations such as anophthalmia and microphthalmia, but it remains unclear as to how these genes are linked functionally. We analyzed the upstream signaling of Xenopus Rax (also known as Rx1) and identified the Otx2 and Sox2 proteins as direct upstream regulators of Rax. We revealed that endogenous Otx2 and Sox2 proteins bound to the conserved noncoding sequence (CNS1) located ≈2 kb upstream of the Rax promoter. This sequence is conserved among vertebrates and is required for potent transcriptional activity. Reporter assays showed that Otx2 and Sox2 synergistically activated transcription via CNS1. Furthermore, the Otx2 and Sox2 proteins physically interacted with each other, and this interaction was affected by the Sox2-missense mutations identified in these ocular disorders. These results demonstrate that the direct interaction and interdependence between the Otx2 and Sox2 proteins coordinate Rax expression in eye development, providing molecular linkages among the genes responsible for ocular malformation. PMID:18385377
In trans paired nicking triggers seamless genome editing without double-stranded DNA cutting.
Chen, Xiaoyu; Janssen, Josephine M; Liu, Jin; Maggio, Ignazio; 't Jong, Anke E J; Mikkers, Harald M M; Gonçalves, Manuel A F V
2017-09-22
Precise genome editing involves homologous recombination between donor DNA and chromosomal sequences subjected to double-stranded DNA breaks made by programmable nucleases. Ideally, genome editing should be efficient, specific, and accurate. However, besides constituting potential translocation-initiating lesions, double-stranded DNA breaks (targeted or otherwise) are mostly repaired through unpredictable and mutagenic non-homologous recombination processes. Here, we report that the coordinated formation of paired single-stranded DNA breaks, or nicks, at donor plasmids and chromosomal target sites by RNA-guided nucleases based on CRISPR-Cas9 components, triggers seamless homology-directed gene targeting of large genetic payloads in human cells, including pluripotent stem cells. Importantly, in addition to significantly reducing the mutagenicity of the genome modification procedure, this in trans paired nicking strategy achieves multiplexed, single-step, gene targeting, and yields higher frequencies of accurately edited cells when compared to the standard double-stranded DNA break-dependent approach. CRISPR-Cas9-based gene editing involves double-strand breaks at target sequences, which are often repaired by mutagenic non-homologous end-joining. Here the authors use Cas9 nickases to generate coordinated single-strand breaks in donor and target DNA for precise homology-directed gene editing.
Sullivan, Lori S.; Baylin, Eric B.; Font, Ramon; Daiger, Stephen P.; Pepose, Jay S.; Clinch, Thomas E.; Nakamura, Hisashi; Zhao, Xinping C.
2007-01-01
Purpose To determine if a mutation within the coding region of the keratin 12 gene (KRT12) is responsible for a severe form of Meesmann's corneal dystrophy. Methods A family with clinically identified Meesmann's corneal dystrophy was recruited and studied. Electron microscopy was performed on scrapings of corneal epithelial cells from the proband. Mutations in the KRT12 gene were sought using direct genomic sequencing of leukocyte DNA from two affected and two unaffected family members. Subsequently, the observed mutation was screened in all available family members using polymerase chain reaction and direct sequencing. Results A heterozygous missense mutation (Arg430Pro) was found in exon 6 of KRT12 in all 14 affected individuals studied. Unaffected family members and 100 normal controls were negative for this mutation. Conclusions We have identified a novel mutation in the KRT12 gene that is associated with a symptomatic phenotype of Meesmann's corneal dystrophy. This mutation results in a substitution of proline for arginine in the helix termination motif that may disrupt the normal helix, leading to a dramatic structural change of the keratin 12 protein. PMID:17653038
Characterization of Cyt2Bc Toxin from Bacillus thuringiensis subsp. medellin
Juárez-Pérez, Victor; Guerchicoff, Alejandra; Rubinstein, Clara; Delécluse, Armelle
2002-01-01
We cloned and sequenced a new cytolysin gene from Bacillus thuringiensis subsp. medellin. Three IS240-like insertion sequence elements and the previously cloned cyt1Ab and p21 genes were found in the vicinity of the cytolysin gene. The cytolysin gene encodes a protein 29.7 kDa in size that is 91.5% identical to Cyt2Ba from Bacillus thuringiensis subsp. israelensis and has been designated Cyt2Bc. Inclusions containing Cyt2Bc were purified from the crystal-negative strain SPL407 of B. thuringiensis. Cyt2Bc reacted weakly with antibodies directed against Cyt2Ba and was not recognized by an antiserum directed against the reference cytolysin Cyt1Aa. Cyt2Bc was hemolytic only upon activation with trypsin and had only one-third to one-fifth of the activity of Cyt2Ba, depending on the activation time. Cyt2Bc was also mosquitocidal against Aedes aegypti, Anopheles stephensi, and Culex quinquefasciatus, including strains resistant to the Bacillus sphaericus binary toxin. Its toxicity was half of that of Cyt2Ba on all mosquito species except resistant C. quinquefasciatus. PMID:11872472
Evaluation of the arrestin gene in patients with retinitis pigmentosa or an allied disease
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeStefano, D.J.; Berson, E.L.; Dryja, T.P.
1994-09-01
Arrestin, also called 48K protein or S-antigen, plays a role in deactivating rhodopsin, the photosensitive, seven-helix, G-protein receptor found in rod photoreceptors. In Drosophila, null mutations in arrestin genes cause a light-dependent photoreceptor degeneration. It is possible that a comparable photoreceptor degeneration in humans is caused by defects in the rod arrestin gene. In order to evaluate this possibility, we are characterizing the human arrestin locus on chromosome 2q. We screened a genomic library (5 million plaques) using an arrestin cDNA clone. Sixty-eight hybridizing clones were identified; portions of 7 clones were sequenced to determine the intron sequence flanking the exons. We are using SSCP analysis and direct genomic sequencing to screen the entire coding region, splice donor and acceptor sites, and the promoter region of the arrestin gene in 188 patients with autosomal dominant and 104 patients with autosomal recessive retinitis pigmentosa. We have already obtained flanking intron sequences necessary for SSCP analysis for 13 of 16 exons. So far, we have identified 4 silent base changes at codons 67 (TGC-to-TGT), 107 (CTG-to-CTC), 163 (GCC-to-GCT), and 288 (CTG-to-TGT), all with allele frequencies at 1% or less. Several other variant bands detected by SSCP analysis are currently being sequenced.
The Complete Genome Sequence of the Plant Growth-Promoting Bacterium Pseudomonas sp. UW4
Duan, Jin; Jiang, Wei; Cheng, Zhenyu; Heikkila, John J.; Glick, Bernard R.
2013-01-01
The plant growth-promoting bacterium (PGPB) Pseudomonas sp. UW4, previously isolated from the rhizosphere of common reeds growing on the campus of the University of Waterloo, promotes plant growth in the presence of different environmental stresses, such as flooding, high concentrations of salt, cold, heavy metals, drought and phytopathogens. In this work, the genome sequence of UW4 was obtained by pyrosequencing and the gaps between the contigs were closed by directed PCR. The Pseudomonas sp. UW4 genome contains a single circular chromosome that is 6,183,388 bp with a 60.05% G+C content. The bacterial genome contains 5,423 predicted protein-coding sequences that occupy 87.2% of the genome. Nineteen genomic islands (GIs) were predicted and thirty-one complete putative insertion sequences were identified. Genes potentially involved in plant growth promotion such as indole-3-acetic acid (IAA) biosynthesis, trehalose production, siderophore production, acetoin synthesis, and phosphate solubilization were determined. Moreover, genes that contribute to the environmental fitness of UW4 were also observed, including genes responsible for resistance to heavy metals such as nickel, copper, cadmium, zinc, molybdate, cobalt, arsenate, and chromate. Whole-genome comparison with other completely sequenced Pseudomonas strains and phylogeny of four concatenated “housekeeping” genes (16S rRNA, gyrB, rpoB and rpoD) of 128 Pseudomonas strains revealed that UW4 belongs to the fluorescens group, jessenii subgroup. PMID:23516524
Assignment of the human caltractin gene (CALT) to Xq28 by fluorescence in situ hybridization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tanaka, Tanaka; Okui, Keiko; Nakamura, Yusuke
1994-12-01
The centrosome is the major microtubule-organizing center of interphase eukaryotic cells, and its duplication is essential to eukaryotic cell division. Caltractin, a structural component of centrosomes, is highly homologous in amino acid sequence to the product of the CDC31 gene of Saccharomyces cerevisiae. In S. cerevisiae, an important role for CDC31 in duplication of the spindle pole body (SPB), a kind of microtubule-organizing center, has been demonstrated by an experiment in which mutant CDC31 prevented SPB duplication and led to formation of a monopolar spindle. In view of the localization of human caltractin in centrosomes and the sequence homology it bears to yeast CDC31, it is reasonable to assume that caltractin functions in humans as CDC31 does in yeast. As a part of the Human Genome Project, we have been determining nucleotide sequences of DNA clones randomly selected from a directionally cloned cDNA library constructed from fetal brain mRNA obtained from Clontech (La Jolla, CA). By comparing 5' partial DNA sequences of these cDNA clones with known DNA sequences in the database, we found one clone that was highly homologous to the caltractin gene of Chlamydomonas, which turned out to be the same as a human gene identified recently. 4 refs., 1 fig.
Macagno, Eduardo R; Gaasterland, Terry; Edsall, Lee; Bafna, Vineet; Soares, Marcelo B; Scheetz, Todd; Casavant, Thomas; Da Silva, Corinne; Wincker, Patrick; Tasiemski, Aurélie; Salzet, Michel
2010-06-25
The medicinal leech, Hirudo medicinalis, is an important model system for the study of nervous system structure, function, development, regeneration and repair. It is also a unique species in being presently approved for use in medical procedures, such as clearing of pooled blood following certain surgical procedures. It is a current, and potentially also future, source of medically useful molecular factors, such as anticoagulants and antibacterial peptides, which may have evolved as a result of its parasitizing large mammals, including humans. Despite the broad focus of research on this system, little has been done at the genomic or transcriptomic levels and there is a paucity of openly available sequence data. To begin to address this problem, we constructed whole embryo and adult central nervous system (CNS) EST libraries and created a clustered sequence database of the Hirudo transcriptome that is available to the scientific community. A total of approximately 133,000 EST clones from two directionally-cloned cDNA libraries, one constructed from mRNA derived from whole embryos at several developmental stages and the other from adult CNS cords, were sequenced in one or both directions by three different groups: Genoscope (French National Sequencing Center), the University of Iowa Sequencing Facility and the DOE Joint Genome Institute. These were assembled using the phrap software package into 31,232 unique contigs and singletons, with an average length of 827 nt. The assembled transcripts were then translated in all six frames and compared to proteins in NCBI's non-redundant (NR) and to the Gene Ontology (GO) protein sequence databases, resulting in 15,565 matches to 11,236 proteins in NR and 13,935 matches to 8,073 proteins in GO. 
Searching the database for transcripts of genes homologous to those thought to be involved in the innate immune responses of vertebrates and other invertebrates yielded a set of nearly one hundred evolutionarily conserved sequences, representing all known pathways involved in these important functions. The sequences obtained for Hirudo transcripts represent the first major database of genes expressed in this important model system. Comparison of translated open reading frames (ORFs) with the other openly available leech datasets, the genome and transcriptome of Helobdella robusta, shows an average identity at the amino acid level of 58% in matched sequences. Interestingly, comparison with other available Lophotrochozoans shows similar high levels of amino acid identity, where sequences match, for example, 64% with Capitella capitata (a polychaete) and 56% with Aplysia californica (a mollusk), as well as 58% with Schistosoma mansoni (a platyhelminth). Phylogenetic comparisons of putative Hirudo innate immune response genes present within the Hirudo transcriptome database herein described show a strong resemblance to the corresponding mammalian genes, indicating that this important physiological response may have older origins than what has been previously proposed.
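The step of translating assembled transcripts "in all six frames" can be illustrated with a short sketch: three reading frames on the given strand and three on its reverse complement. This is a minimal illustration assuming the standard genetic code and unambiguous A/C/G/T input; the function names are hypothetical, and it is not the pipeline used by the authors.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations keyed '+1'..'+3' (forward) and '-1'..'-3' (reverse)."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCC").items():
        print(name, protein)
```

Each of the six peptide strings would then be searched against a protein database; that comparison step is outside the scope of this sketch.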
Huang, Wei-Yi; Zhao, Guang-Hui; Wei, Shu-Jun; Song, Hui-Qun; Xu, Min-Jun; Lin, Rui-Qing; Zhou, Dong-Hui; Zhu, Xing-Quan
2012-01-01
Complete mitochondrial (mt) genomes and the gene rearrangements are increasingly used as molecular markers for investigating phylogenetic relationships. Contributing to the complete mt genomes of Gastropoda, especially Pulmonata, we determined the mt genome of the freshwater snail Galba pervia, which is an important intermediate host for Fasciola spp. in China. The complete mt genome of G. pervia is 13,768 bp in length. Its genome is circular, and consists of 37 genes, including 13 genes for proteins, 2 genes for rRNA, and 22 genes for tRNA. The mt gene order of G. pervia showed a novel arrangement (tRNA-His, tRNA-Gly and tRNA-Tyr change positions and directions) when compared with mt genomes of Pulmonata species sequenced to date, indicating divergence among different species within the Pulmonata. The 13 protein-coding genes were deduced to encode a total of 3655 amino acids. The most frequently used amino acid is Leu (15.05%), followed by Phe (11.24%), Ser (10.76%) and Ile (8.346%). Phylogenetic analyses using the concatenated amino acid sequences of the 13 protein-coding genes, with three different computational algorithms (maximum parsimony, maximum likelihood and Bayesian analysis), all revealed that the families Lymnaeidae and Planorbidae are two closely related snail families, consistent with previous classifications based on morphological and molecular studies. The complete mt genome sequence of G. pervia showed a novel gene arrangement and it represents the first sequenced high quality mt genome of the family Lymnaeidae. These novel mtDNA data provide additional genetic markers for studying the epidemiology, population genetics and phylogeography of freshwater snails, as well as for understanding the interplay between the intermediate snail hosts and the intra-molluscan stages of Fasciola spp. PMID:22844544
Stam, Remco; Scheikl, Daniela; Tellier, Aurélien
2016-06-02
Nod-like receptors (NLRs) are nucleotide-binding domain and leucine-rich repeat-containing proteins that are important in plant resistance signaling. Many of the known pathogen resistance (R) genes in plants are NLRs and they can recognize pathogen molecules directly or indirectly. As such, divergence and copy number variants at these genes are found to be high between species. Within populations, positive and balancing selection are to be expected if plants coevolve with their pathogens. In order to understand the complexity of R-gene coevolution in wild nonmodel species, it is necessary to identify the full range of NLRs and infer their evolutionary history. Here we investigate and reveal polymorphism occurring at 220 NLR genes within one population of the partially selfing wild tomato species Solanum pennellii. We use a combination of enrichment sequencing and pooling of ten individuals to specifically sequence NLR genes in a resource- and cost-effective manner. We focus on the effects which different mapping and single nucleotide polymorphism calling software and settings have on calling polymorphisms in customized pooled samples. Our results are verified using Sanger sequencing of polymorphic gene fragments. Our results indicate that some NLRs, namely 13 out of 220, have maintained polymorphism within our S. pennellii population. These genes show a wide range of πN/πS ratios and differing site frequency spectra. We compare our observed rate of heterozygosity with expectations for this selfing and bottlenecked population. We conclude that our method enables us to pinpoint NLR genes which have experienced natural selection in their habitat. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Severson, Eric; Arnett, Kelly L; Wang, Hongfang; Zang, Chongzhi; Taing, Len; Liu, Hudan; Pear, Warren S; Shirley Liu, X; Blacklow, Stephen C; Aster, Jon C
2017-05-02
Notch transcription complexes (NTCs) drive target gene expression by binding to two distinct types of genomic response elements, NTC monomer-binding sites and sequence-paired sites (SPSs) that bind NTC dimers. SPSs are conserved and have been linked to the Notch responsiveness of a few genes. To assess the overall contribution of SPSs to Notch-dependent gene regulation, we determined the DNA sequence requirements for NTC dimerization using a fluorescence resonance energy transfer (FRET) assay and applied insights from these in vitro studies to Notch-"addicted" T cell acute lymphoblastic leukemia (T-ALL) cells. We found that SPSs contributed to the regulation of about a third of direct Notch target genes. Although originally described in promoters, SPSs are present mainly in long-range enhancers, including an enhancer containing a newly described SPS that regulates HES5 expression. Our work provides a general method for identifying SPSs in genome-wide data sets and highlights the widespread role of NTC dimerization in Notch-transformed leukemia cells. Copyright © 2017, American Association for the Advancement of Science.
Lau, Evan; Fisher, Meredith C.; Steudler, Paul A.; Cavanaugh, Colleen M.
2013-01-01
The mxaF gene, coding for the large (α) subunit of methanol dehydrogenase, is highly conserved among distantly related methylotrophic species in the Alpha-, Beta- and Gammaproteobacteria. It is ubiquitous in methanotrophs, in contrast to other methanotroph-specific genes such as the pmoA and mmoX genes, which are absent in some methanotrophic proteobacterial genera. This study examined the potential for using the mxaF gene as a functional and phylogenetic marker for methanotrophs. mxaF and 16S rRNA gene phylogenies were constructed based on over 100 database sequences of known proteobacterial methanotrophs and other methylotrophs to assess their evolutionary histories. Topology tests revealed that mxaF and 16S rRNA genes of methanotrophs do not show congruent evolutionary histories, with incongruencies in methanotrophic taxa in the Methylococcaceae, Methylocystaceae, and Beijerinckiaceae. However, known methanotrophs generally formed coherent clades based on mxaF gene sequences, allowing for phylogenetic discrimination of major taxa. This feature highlights the mxaF gene’s usefulness as a biomarker in studying the molecular diversity of proteobacterial methanotrophs in nature. To verify this, PCR-directed assays targeting this gene were used to detect novel methanotrophs from diverse environments including soil, peatland, hydrothermal vent mussel tissues, and methanotroph isolates. The placement of the majority of environmental mxaF gene sequences in distinct methanotroph-specific clades (Methylocystaceae and Methylococcaceae) detected in this study supports the use of mxaF as a biomarker for methanotrophic proteobacteria. PMID:23451130
Stauff, Devin L.; Bassler, Bonnie L.
2011-01-01
The bacterial pathogen Chromobacterium violaceum uses a LuxIR-type quorum-sensing system to detect and respond to changes in cell population density. CviI synthesizes the autoinducer C10-homoserine lactone (C10-HSL), and CviR is a cytoplasmic DNA binding transcription factor that activates gene expression following binding to C10-HSL. A number of behaviors are controlled by quorum sensing in C. violaceum. However, few genes have been shown to be directly controlled by CviR, in part because the DNA motif bound by CviR is not well characterized. Here, we define the DNA sequence required for promoter recognition by CviR. Using in vivo data generated from a library of point mutations in a CviR-regulated promoter, we find that CviR binds to a palindrome with the ideal sequence CTGNCCNNNNGGNCAG. We constructed a position weight matrix using these in vivo data and scanned the C. violaceum genome to predict CviR binding sites. We measured direct activation of the identified promoters by CviR and found that CviR controls the expression of the promoter for a chitinase, a type VI secretion-related gene, a transcriptional regulator gene, a guanine deaminase gene, and cviI. Indeed, regulation of cviI expression by CviR generates a canonical quorum-sensing positive-feedback loop. PMID:21622734
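The ideal CviR site quoted above, CTGNCCNNNNGGNCAG, is a palindrome in the DNA sense: it equals its own reverse complement. The sketch below checks that property and scans a sequence for exact matches to the consensus, treating N as any base. It is a deliberately simplified stand-in: the study scanned the genome with a position weight matrix built from in vivo data, which scores partial matches, whereas this code only finds perfect consensus matches. The function names and the example sequence are made up for illustration.

```python
CONSENSUS = "CTGNCCNNNNGGNCAG"  # ideal CviR binding sequence from the abstract

# Bases allowed by each consensus symbol (only the symbols used here).
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement; N stays N."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """A DNA palindrome reads the same on both strands, 5' to 3'."""
    return seq == reverse_complement(seq)


def matches_consensus(site, consensus=CONSENSUS):
    """True if every base of `site` is permitted at that consensus position."""
    return len(site) == len(consensus) and all(
        base in ALLOWED[symbol] for base, symbol in zip(site, consensus)
    )


def find_sites(sequence, consensus=CONSENSUS):
    """Return 0-based start positions of exact consensus matches."""
    width = len(consensus)
    sequence = sequence.upper()
    return [
        start
        for start in range(len(sequence) - width + 1)
        if matches_consensus(sequence[start:start + width], consensus)
    ]


if __name__ == "__main__":
    toy_sequence = "AACTGACCTTTTGGACAGTT"  # invented example, not genomic data
    print(is_palindromic(CONSENSUS))
    print(find_sites(toy_sequence))
```

Because the consensus is palindromic, a match on one strand implies a match on the other, so scanning a single strand is sufficient for this exact-match version.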
Kassner, Ursula; Salewsky, Bastian; Wühle-Demuth, Marion; Szijarto, Istvan Andras; Grenkowitz, Thomas; Binner, Priska; März, Winfried; Steinhagen-Thiessen, Elisabeth; Demuth, Ilja
2015-01-01
Rare monogenic hyperchylomicronemia is caused by loss-of-function mutations in genes involved in the catabolism of triglyceride-rich lipoproteins, including the lipoprotein lipase gene, LPL. Clinical hallmarks of this condition are eruptive xanthomas, recurrent pancreatitis and abdominal pain. Patients with LPL deficiency and severe or recurrent pancreatitis are eligible for the first gene therapy treatment approved by the European Union. Therefore the precise molecular diagnosis of familial hyperchylomicronemia may affect treatment decisions. We present a 57-year-old male patient with excessive hypertriglyceridemia despite intensive lipid-lowering therapy. Abdominal sonography showed signs of chronic pancreatitis. Direct DNA sequencing and cloning revealed two novel missense variants, c.1302A>T and c.1306G>A, in exon 8 of the LPL gene coexisting on the same allele. The variants result in the amino-acid exchanges p.(Lys434Asn) and p.(Gly436Arg). They are located in the carboxy-terminal domain of lipoprotein lipase that interacts with the glycosylphosphatidylinositol-anchored HDL-binding protein (GPIHBP1) and are likely of functional relevance. No further relevant mutations were found by direct sequencing of the genes for APOA5, APOC2, LMF1 and GPIHBP1. We conclude that heterozygosity for damaging mutations of LPL may be sufficient to produce severe hypertriglyceridemia and that chylomicronemia may be transmitted in a dominant manner, at least in some families. PMID:25585702
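The relationship between the coding-DNA positions and the amino-acid positions reported above is simple arithmetic: with c.1 defined as the A of the ATG start codon, coding position n falls in codon (n - 1) // 3 + 1. The sketch below applies this to c.1302A>T and c.1306G>A and lists which reference codons would be consistent with the reported exchanges under the standard genetic code. It does not use the actual LPL reference sequence, so the candidate codons are inferences from the code table, not data from the study; all function names are illustrative.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_position(c_pos):
    """Return (codon number, position within codon 1..3) for a c. coordinate."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1


def consistent_codons(ref_aa, alt_aa, pos_in_codon, ref_base, alt_base):
    """List reference codons that explain ref_aa -> alt_aa for this base change.

    Amino acids are given as one-letter codes (e.g. 'K' for Lys).
    """
    index = pos_in_codon - 1
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa or codon[index] != ref_base:
            continue
        mutated = codon[:index] + alt_base + codon[index + 1:]
        if CODON_TABLE[mutated] == alt_aa:
            hits.append(codon)
    return hits


if __name__ == "__main__":
    # c.1302A>T reported as p.(Lys434Asn); c.1306G>A reported as p.(Gly436Arg)
    for c_pos, ref, alt, ref_aa, alt_aa in (
        (1302, "A", "T", "K", "N"),
        (1306, "G", "A", "G", "R"),
    ):
        codon, pos = codon_position(c_pos)
        print(c_pos, codon, pos, consistent_codons(ref_aa, alt_aa, pos, ref, alt))
```

Both variants map to the codon numbers given in the abstract (434 and 436), and in each case at least one reference codon exists that makes the reported amino-acid exchange possible.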
Cas9-Guide RNA Directed Genome Editing in Soybean
Li, Zhongsen; Liu, Zhan-Bin; Xing, Aiqiu; Moon, Bryan P.; Koellhoffer, Jessica P.; Huang, Lingxia; Ward, R. Timothy; Clifton, Elizabeth; Falco, S. Carl; Cigan, A. Mark
2015-01-01
The recently discovered bacterial and archaeal adaptive immune system, consisting of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) endonuclease, has been explored for targeted genome editing in different species. Streptococcus pyogenes Cas9-guide RNA (gRNA) was successfully applied to generate targeted mutagenesis, gene integration, and gene editing in soybean (Glycine max). Two genomic sites, DD20 and DD43 on chromosome 4, were mutagenized with frequencies of 59% and 76%, respectively. Sequencing randomly selected transgenic events confirmed that the genome modifications were specific to the Cas9-gRNA cleavage sites and consisted of small deletions or insertions. Targeted gene integrations through homology-directed recombination were detected by border-specific polymerase chain reaction analysis for both sites at the callus stage, and one DD43 homology-directed recombination event was transmitted to the T1 generation. T1 progenies of the integration event segregated according to Mendelian laws, and clean homozygous T1 plants with the donor gene precisely inserted at the DD43 target site were obtained. The Cas9-gRNA system was also successfully applied to make a directed P178S mutation in the acetolactate synthase1 gene through in planta gene editing. PMID:26294043
2011-01-01
Background: Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgae species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results: Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions: The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935
An, Qian; Wright, Sarah L.; Moorman, Anthony V.; Parker, Helen; Griffiths, Mike; Ross, Fiona M.; Davies, Teresa; Harrison, Christine J.; Strefford, Jon C.
2009-01-01
The dic(9;20)(p11~13;q11) is a recurrent chromosomal abnormality in patients with acute lymphoblastic leukemia. Although it results in loss of material from 9p and 20q, the molecular targets on both chromosomes have not been fully elucidated. From an initial cohort of 58 acute lymphoblastic leukemia patients with this translocation, breakpoint mapping with fluorescence in situ hybridization on 26 of them revealed breakpoint heterogeneity of both chromosomes. PAX5 has been proposed to be the target gene on 9p, while for 20q, FISH analysis implicated the involvement of the ASXL1 gene, either by a breakpoint within (n=4) or centromeric (deletion, n=12) of the gene. Molecular copy-number counting, long-distance inverse PCR and direct sequence analysis identified six dic(9;20) breakpoint sequences. In addition to the three previously reported (PAX5-ASXL1, PAX5-C20ORF112 and PAX5-KIF3B), we identified three new ones in this study: sequences 3’ of PAX5 disrupting ASXL1, and ZCCHC7 disrupted by sequences 3’ of FRG1B and LOC1499503. This study provides insight into the breakpoint complexity underlying dicentric chromosomal formation in acute lymphoblastic leukemia and highlights putative target gene loci. PMID:19586940
Tria, Antje; Hiort, Olaf; Sinnecker, Gernot H G
2004-01-01
Defects in the steroid 5alpha-reductase type 2 (SRD5A2) activity cause decreased formation of dihydrotestosterone (DHT) from testosterone (T), resulting in defective masculinization of external genitalia; the T/DHT ratio is increased. We investigated 10 patients with elevated T/DHT ratios in whom mutations in the SRD5A2 and AR genes had been excluded to find out whether structural alterations of the SRD5A1 gene could contribute to their genital malformations. Single-strand conformation polymorphism analysis and direct sequencing were used to detect variations in the SRD5A1 gene of the patients and of 49 adult fertile men who served as controls. The sequence analysis of exon 3 of the SRD5A1 gene indicated an adenine-to-guanine change (ACA vs. ACG), both triplets encoding the amino acid residue threonine. The ACG sequence was detected in 57% of all subjects and was equally distributed in patients and controls. The T/DHT ratio was significantly higher in controls with the ACG variant as compared with those having the ACA variant. However, no particular sequence aberration was found in the SRD5A1 genes of either group. Mutant SRD5A1 isoenzyme does not seem to play a crucial role in the development of hypospadias. Copyright 2004 S. Karger AG, Basel
Pathak, B G; Neumann, J C; Croyle, M L; Lingrel, J B
1994-01-01
The Na,K-ATPase is an integral plasma membrane protein consisting of alpha and beta subunits, each of which has discrete isoforms expressed in a tissue-specific manner. Of the three functional alpha isoform genes, the one encoding the alpha 3 isoform is the most tissue-restricted in its expression, being found primarily in the brain. To identify regions of the alpha 3 isoform gene that are involved in directing expression in the brain, a 1.6 kb 5'-flanking sequence was attached to a reporter gene, chloramphenicol acetyltransferase (CAT). The alpha 3-CAT chimeric gene construct was microinjected into fertilized mouse eggs, and transgenic mice were produced. Analysis of adult transgenic mice from different lines revealed that the transgene is expressed primarily in the brain. To further delineate regions that are needed for conferring expression in this tissue, systematic deletions of the 5'-flanking sequence of the alpha 3-CAT fusion constructs were made and analyzed, again using transgenic mice. The results from these analyses indicate that DNA sequences required for mediating brain-specific expression of the alpha 3 isoform gene are present within 210 bp upstream of the transcription initiation site. alpha 3-CAT promoter constructs containing scanning mutations in this region were also assayed in transgenic mice. These studies have identified both a functional neural-restrictive silencer element as well as a positively acting cis element. PMID:7984427
Statistical inference of the generation probability of T-cell receptors from sequence repertoires.
Murugan, Anand; Mora, Thierry; Walczak, Aleksandra M; Callan, Curtis G
2012-10-02
Stochastic rearrangement of germline V-, D-, and J-genes to create variable coding sequence for certain cell surface receptors is at the origin of immune system diversity. This process, known as "VDJ recombination", is implemented via a series of stochastic molecular events involving gene choices and random nucleotide insertions between, and deletions from, genes. We use large sequence repertoires of the variable CDR3 region of human CD4+ T-cell receptor beta chains to infer the statistical properties of these basic biochemical events. Because any given CDR3 sequence can be produced in multiple ways, the probability distribution of hidden recombination events cannot be inferred directly from the observed sequences; we therefore develop a maximum likelihood inference method to achieve this end. To separate the properties of the molecular rearrangement mechanism from the effects of selection, we focus on nonproductive CDR3 sequences in T-cell DNA. We infer the joint distribution of the various generative events that occur when a new T-cell receptor gene is created. We find a rich picture of correlation (and absence thereof), providing insight into the molecular mechanisms involved. The generative event statistics are consistent between individuals, suggesting a universal biochemical process. Our probabilistic model predicts the generation probability of any specific CDR3 sequence by the primitive recombination process, allowing us to quantify the potential diversity of the T-cell repertoire and to understand why some sequences are shared between individuals. We argue that the use of formal statistical inference methods, of the kind presented in this paper, will be essential for quantitative understanding of the generation and evolution of diversity in the adaptive immune system.
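The point that one observed sequence can arise from several hidden recombination scenarios, so that its generation probability is a sum over those scenarios, can be made concrete with a toy model. The sketch below is a heavy simplification and not the authors' model: it assumes a single gene segment whose 3' end may be trimmed, followed by random nucleotide insertions with uniform base composition, and it assumes gene choice, deletion count and insertion length are independent. All distributions, names and numbers are invented for illustration.

```python
from fractions import Fraction


def generation_probability(seq, gene_probs, deletion_probs, insertion_len_probs):
    """Sum the probabilities of every hidden scenario that yields `seq`.

    A scenario is (gene, number of 3' deletions, inserted bases); the product
    sequence is the trimmed gene followed by the inserted bases. Each inserted
    base is assumed to be uniform over A, C, G, T.
    """
    total = Fraction(0)
    for gene, p_gene in gene_probs.items():
        for n_del, p_del in deletion_probs.items():
            if n_del > len(gene):
                continue  # cannot delete more bases than the gene has
            kept = gene[:len(gene) - n_del]
            if not seq.startswith(kept):
                continue  # this trimming is incompatible with the sequence
            n_ins = len(seq) - len(kept)
            p_len = insertion_len_probs.get(n_ins, Fraction(0))
            total += p_gene * p_del * p_len * Fraction(1, 4) ** n_ins
    return total


if __name__ == "__main__":
    half = Fraction(1, 2)
    toy_genes = {"ACG": Fraction(1)}      # one invented gene segment
    toy_deletions = {0: half, 1: half}    # delete 0 or 1 base
    toy_insertions = {0: half, 1: half}   # insert 0 or 1 base
    # "ACG" arises either untrimmed with no insertion, or trimmed to "AC"
    # with a single inserted "G"; both scenarios contribute to the total.
    print(generation_probability("ACG", toy_genes, toy_deletions, toy_insertions))
    print(generation_probability("AC", toy_genes, toy_deletions, toy_insertions))
```

Inferring the underlying distributions from observed repertoires is the harder inverse problem that the abstract addresses with maximum likelihood; this sketch covers only the forward calculation.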
Engineering of a DNA Polymerase for Direct m6A Sequencing.
Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas
2018-01-08
Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N6-methyladenosine (m6A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m6A-containing RNA prior to sequencing, since m6A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m6A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N6-methylation. We identified a mutant that exhibits increased misincorporation opposite m6A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m6A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
A simple procedure for parallel sequence analysis of both strands of 5'-labeled DNA.
Razvi, F; Gargiulo, G; Worcel, A
1983-08-01
Ligation of a 5'-labeled DNA restriction fragment results in a circular DNA molecule carrying the two 32Ps at the reformed restriction site. Double digestion of the circular DNA with the original enzyme and a second restriction enzyme that cleaves near the labeled site allows direct chemical sequencing of one 5'-labeled DNA strand. A similar double digestion, using an isoschizomer that cleaves differently at the 32P-labeled site, allows direct sequencing of the now 3'-labeled complementary DNA strand. It is possible to directly sequence both strands of cloned DNA inserts by using the above protocol and a multiple cloning site vector that provides the necessary restriction sites. The simultaneous and parallel visualization of both DNA strands eliminates sequence ambiguities. In addition, the labeled circular molecules are particularly useful for single-hit DNA cleavage studies and DNA footprint analysis. As an example, we show here an analysis of the micrococcal nuclease-induced breaks on the two strands of the somatic 5S RNA gene of Xenopus borealis, which suggests that the enzyme may recognize and cleave small AT-containing palindromes along the DNA helix.
Meitinger, T; Meindl, A; Bork, P; Rost, B; Sander, C; Haasemann, M; Murken, J
1993-12-01
The X-linked gene for Norrie disease, which is characterized by blindness, deafness and mental retardation, has been cloned recently. This gene has been thought to code for a putative extracellular factor; its predicted amino acid sequence is homologous to the C-terminal domain of diverse extracellular proteins. Sequence pattern searches and three-dimensional modelling now suggest that the Norrie disease protein (NDP) has a tertiary structure similar to that of transforming growth factor beta (TGF beta). Our model identifies NDP as a member of an emerging family of growth factors containing a cystine knot motif, with direct implications for the physiological role of NDP. The model also sheds light on sequence-related domains such as the C-terminal domain of mucins and of von Willebrand factor.
Cheng, Tian; Liu, Guo-Hua; Song, Hui-Qun; Lin, Rui-Qing; Zhu, Xing-Quan
2016-03-01
Hymenolepis nana, commonly known as the dwarf tapeworm, is one of the most common tapeworms of humans and rodents and can cause hymenolepiasis. Although this zoonotic tapeworm is of socio-economic significance in many countries of the world, its genetics, systematics, epidemiology, and biology are poorly understood. In the present study, we sequenced and characterized the complete mitochondrial (mt) genome of H. nana. The mt genome is 13,764 bp in size and encodes 36 genes, including 12 protein-coding genes, 2 ribosomal RNA genes, and 22 transfer RNA genes. All genes are transcribed in the same direction. The gene order and genome content are identical to those of its congener Hymenolepis diminuta. Phylogenetic analyses based on concatenated amino acid sequences of 12 protein-coding genes by Bayesian inference, maximum likelihood, and maximum parsimony showed the division of class Cestoda into two orders and supported the monophyly of both orders, Cyclophyllidea and Pseudophyllidea. Analyses of mt genome sequences also support the monophyly of the three families Taeniidae, Hymenolepididae, and Diphyllobothriidae. This novel mt genome provides a useful genetic marker for studying the molecular epidemiology, systematics, and population genetics of the dwarf tapeworm and should have implications for the diagnosis, prevention, and control of hymenolepiasis in humans.
Martínez-Castilla, León Patricio; Alvarez-Buylla, Elena R.
2003-01-01
Gene duplication is a substrate of evolution. However, the relative importance of positive selection versus relaxation of constraints in the functional divergence of gene copies is still under debate. Plant MADS-box genes encode transcriptional regulators key in various aspects of development and have undergone extensive duplications to form a large family. We recovered 104 MADS sequences from the Arabidopsis genome. Bayesian phylogenetic trees recover type II lineage as a monophyletic group and resolve a branching sequence of monophyletic groups within this lineage. The type I lineage is comprised of several divergent groups. However, contrasting gene structure and patterns of chromosomal distribution between type I and II sequences suggest that they had different evolutionary histories and support the placement of the root of the gene family between these two groups. Site-specific and site-branch analyses of positive Darwinian selection (PDS) suggest that different selection regimes could have affected the evolution of these lineages. We found evidence for PDS along the branch leading to flowering time genes that have a direct impact on plant fitness. Sites with high probabilities of having been under PDS were found in the MADS and K domains, suggesting that these played important roles in the acquisition of novel functions during MADS-box diversification. Detected sites are targets for further experimental analyses. We argue that adaptive changes in MADS-domain protein sequences have been important for their functional divergence, suggesting that changes within coding regions of transcriptional regulators have influenced phenotypic evolution of plants. PMID:14597714
Pasko, Chris; Dunn, John; Jaeckel, Heidi; Nieuwlandt, Dan; Weed, Diane; Woodruff, Evelyn; Zheng, Xiaotian
2012-01-01
Rapid diagnosis of staphylococcal bacteremia directs appropriate antimicrobial therapy, leading to improved patient outcome. We describe herein a rapid test (<75 min) that can identify the major pathogenic strains of Staphylococcus to the species level as well as the presence or absence of the methicillin resistance determinant gene, mecA. The test, Staph ID/R, combines a rapid isothermal nucleic acid amplification method, helicase-dependent amplification (HDA), with a chip-based array that produces unambiguous visible results. The analytic sensitivity was 1 CFU per reaction for the mecA gene and was 1 to 250 CFU per reaction depending on the staphylococcal species present in the positive blood culture. Staph ID/R has excellent specificity as well, with no cross-reactivity observed. We validated the performance of Staph ID/R by testing 104 frozen clinical positive blood cultures and comparing the results with rpoB gene or 16S rRNA gene sequencing for species identity determinations and mecA gene PCR to confirm mecA gene results. Staph ID/R agreed with mecA gene PCR for all samples and agreed with rpoB/16S rRNA gene sequencing in all cases except for one sample that contained a mixture of two staphylococcal species, one of which Staph ID/R correctly identified, for an overall agreement of 99.0% (P < 0.01). Staph ID/R could potentially be used to positively affect patient management for Staphylococcus-mediated bacteremia. PMID:22170912
Becságh, Péter; Szakács, Orsolya
2014-10-01
In diagnostic workflows for detecting sequence alterations, it is sometimes important to design an algorithm that combines screening and direct tests. The use of direct tests, mainly sequencing, is normally limited. There is an increasing need for effective screening tests that remain "closed tube" throughout the whole process and thereby decrease the risk of PCR product contamination. The aim of this study was to design such a closed-tube, detection-probe-based screening assay to detect different kinds of sequence alterations in exon 11 of the human c-kit gene. This region harbors a variety of possible deletions and single-nucleotide changes. During assay setup, several probe chemistry formats were screened and tested. After some optimization steps, the TaqMan probe format was selected.
Ventura, Marco; Canchaya, Carlos; Meylan, Valèrie; Klaenhammer, Todd R.; Zink, Ralf
2003-01-01
We analyzed the tuf gene, encoding elongation factor Tu, from 33 strains representing 17 Lactobacillus species and 8 Bifidobacterium species. The tuf sequences were aligned and used to infer phylogeny among species of lactobacilli and bifidobacteria. We demonstrated that the synonymous substitution affecting this gene renders elongation factor Tu a reliable molecular clock for investigating evolutionary distances of lactobacilli and bifidobacteria. In fact, the phylogeny generated by these tuf sequences is consistent with that derived from 16S rRNA analysis. The investigation of a multiple alignment of tuf sequences revealed regions conserved among strains belonging to the same species but distinct from those of other species. PCR primers complementary to these regions allowed species-specific identification of closely related species, such as Lactobacillus casei group members. The tuf gene-based assays developed in this study provide an alternative to present methods for the identification of lactic acid bacterial species. Since a variable number of tuf genes have been described for bacteria, the presence of multiple genes was examined. Southern analysis revealed one tuf gene in the genomes of lactobacilli and bifidobacteria, but the tuf gene was arranged differently in the genomes of these two taxa. Our results revealed that the tuf gene in bifidobacteria is flanked by the same gene constellation as the str operon, as originally reported for Escherichia coli. In contrast, bioinformatic and transcriptional analyses of the DNA region flanking the tuf gene in four Lactobacillus species indicated the same four-gene unit and suggested a novel tuf operon specific for the genus Lactobacillus. PMID:14602655
CRISPR-Cas9 provides the means to perform genome editing and facilitates loss-of-function screens. However, we and others demonstrated that expression of the Cas9 endonuclease induces a gene-independent response that correlates with the number of target sequences in the genome. An alternative approach to suppressing gene expression is to block transcription using a catalytically inactive Cas9 (dCas9). Here we directly compare genome editing by CRISPR-Cas9 (cutting, CRISPRc) and gene suppression using KRAB-dCas9 (CRISPRi) in loss-of-function screens to identify cell essential genes.
Microsatellite analysis in the genome of Acanthaceae: An in silico approach.
Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar
2015-01-01
Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites, and its built-in Primer3 module was used for primer design. In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future.
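The SSR screening step described in this record can be illustrated with a small regular-expression search. This is only a minimal sketch and not the SSR Locator algorithm: the function names and the minimum copy numbers in `MIN_COPIES` are assumptions chosen for illustration, and only perfect di- and trinucleotide repeats are reported.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of copies);
# these values are illustrative and are not taken from the abstract.
MIN_COPIES = {2: 6, 3: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n))


def find_ssrs(seq, min_copies=MIN_COPIES):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for size, copies in min_copies.items():
        # One motif of `size` bases followed by at least copies-1 exact repeats.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Primer design for the flanking regions is a separate step handled by dedicated tools such as Primer3 and is not attempted here.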
Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing
Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-ya; Usami, Shin-ichi
2016-01-01
Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype–phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3–2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358
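The reported frequency range can be reproduced with simple arithmetic. The sketch below assumes that the lower bound counts the three patients the authors judged to have USH1 (the CDH23 and PCDH15 carriers plus one MYO7A patient) and that the upper bound counts all five patients with mutations; this reading is an interpretation of the abstract, not a statement from the authors.

```python
COHORT_SIZE = 227          # unrelated non-syndromic deaf children screened
PATIENTS_WITH_MUTATIONS = 5
# Assumption: 2 (CDH23/PCDH15) + 1 (MYO7A with late onset of walking)
PATIENTS_JUDGED_USH1 = 3


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


lower = percent(PATIENTS_JUDGED_USH1, COHORT_SIZE)
upper = percent(PATIENTS_WITH_MUTATIONS, COHORT_SIZE)

if __name__ == "__main__":
    print(f"{lower}%-{upper}%")
```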
Piddington, C S; Kovacevich, B R; Rambosek, J
1995-01-01
Dibenzothiophene (DBT), a model compound for sulfur-containing organic molecules found in fossil fuels, can be desulfurized to 2-hydroxybiphenyl (2-HBP) by Rhodococcus sp. strain IGTS8. Complementation of a desulfurization (dsz) mutant provided the genes from Rhodococcus sp. strain IGTS8 responsible for desulfurization. A 6.7-kb TaqI fragment cloned in Escherichia coli-Rhodococcus shuttle vector pRR-6 was found to both complement this mutation and confer desulfurization to Rhodococcus fascians, which normally is not able to desulfurize DBT. Expression of this fragment in E. coli also conferred the ability to desulfurize DBT. A molecular analysis of the cloned fragment revealed a single operon containing three open reading frames involved in the conversion of DBT to 2-HBP. The three genes were designated dszA, dszB, and dszC. Neither the nucleotide sequences nor the deduced amino acid sequences of the enzymes exhibited significant similarity to sequences obtained from the GenBank, EMBL, and Swiss-Prot databases, indicating that these enzymes are novel enzymes. Subclone analyses revealed that the gene product of dszC converts DBT directly to DBT-sulfone and that the gene products of dszA and dszB act in concert to convert DBT-sulfone to 2-HBP. PMID:7574582
Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.
Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi
2016-05-01
Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis.
Nelson, Chase W; Moncla, Louise H; Hughes, Austin L
2015-11-15
New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.
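To make the idea of estimating nucleotide diversity from pooled SNP calls concrete, here is a minimal sketch of a generic per-site estimator: the fraction of read pairs at a site that carry different alleles, averaged over the sequence length. It is written in Python for illustration and is not SNPGenie's Perl implementation; the function names and the input layout (one dictionary of allele read counts per variable site) are assumptions.

```python
from itertools import combinations


def site_diversity(counts):
    """Fraction of read pairs at one site that differ in allele.

    `counts` maps allele -> number of reads, e.g. {"A": 8, "G": 2}.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    differing_pairs = sum(a * b for a, b in combinations(counts.values(), 2))
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs


def nucleotide_diversity(variable_sites, sequence_length):
    """Mean per-site diversity; sites not listed are treated as monomorphic."""
    return sum(site_diversity(c) for c in variable_sites) / sequence_length


if __name__ == "__main__":
    sites = [{"A": 8, "G": 2}, {"C": 5, "T": 5}]
    print(nucleotide_diversity(sites, sequence_length=100))
```

Splitting the same quantity into nonsynonymous and synonymous components additionally requires the CDS annotation, which this sketch leaves out.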
Exploring the roles of DNA methylation in the metal-reducing bacterium Shewanella oneidensis MR-1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bendall, Matthew L.; Luong, Khai; Wetmore, Kelly M.
2013-08-30
We performed whole genome analyses of DNA methylation in Shewanella oneidensis MR-1 to examine its possible role in regulating gene expression and other cellular processes. Single-Molecule Real Time (SMRT) sequencing revealed extensive methylation of adenine (N6mA) throughout the genome. These methylated bases were located in five sequence motifs, including three novel targets for Type I restriction/modification enzymes. The sequence motifs targeted by putative methyltransferases were determined via SMRT sequencing of gene knockout mutants. In addition, we found S. oneidensis MR-1 cultures grown under various culture conditions displayed different DNA methylation patterns. However, the small number of differentially methylated sites could not be directly linked to the much larger number of differentially expressed genes in these conditions, suggesting DNA methylation is not a major regulator of gene expression in S. oneidensis MR-1. The enrichment of methylated GATC motifs in the origin of replication indicates DNA methylation may regulate genome replication in a manner similar to that seen in Escherichia coli. Furthermore, comparative analyses suggest that many Gammaproteobacteria, including all members of the Shewanellaceae family, may also utilize DNA methylation to regulate genome replication.
LTRs of endogenous retroviruses as a source of Tbx6 binding sites
NASA Astrophysics Data System (ADS)
Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi
2017-06-01
Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box transcription factors. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.
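The (C/T)CACACCT motif named in this record translates directly into the regular expression `[CT]CACACCT`. The following minimal sketch scans both strands of a DNA string for it; the function names are hypothetical, and a sequence match is of course only a candidate site, not evidence of Tbx6 binding.

```python
import re

# Lookahead so that overlapping occurrences would also be reported.
MOTIF = re.compile(r"(?=([CT]CACACCT))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq):
    """Return sorted (start, strand, match) tuples in forward-strand coordinates."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert the position on the reverse strand back to the forward strand.
        start = len(seq) - m.start() - len(m.group(1))
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    print(find_motif("GGTCACACCTAA"))
    print(find_motif("AAAGGTGTGGTT"))
```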
Salient Features of Endonuclease Platforms for Therapeutic Genome Editing.
Certo, Michael T; Morgan, Richard A
2016-03-01
Emerging gene-editing technologies are nearing a revolutionary phase in genetic medicine: precisely modifying or repairing causal genetic defects. This may include any number of DNA sequence manipulations, such as knocking out a deleterious gene, introducing a particular mutation, or directly repairing a defective sequence by site-specific recombination. All of these edits can currently be achieved via programmable rare-cutting endonucleases to create targeted DNA breaks that can engage and exploit endogenous DNA repair pathways to impart site-specific genetic changes. Over the past decade, several distinct technologies for introducing site-specific DNA breaks have been developed, yet the different biological origins of these gene-editing technologies bring along inherent differences in parameters that impact clinical implementation. This review aims to provide an accessible overview of the various endonuclease-based gene-editing platforms, highlighting the strengths and weakness of each with respect to therapeutic applications.
Salient Features of Endonuclease Platforms for Therapeutic Genome Editing
Certo, Michael T; Morgan, Richard A
2016-01-01
Emerging gene-editing technologies are nearing a revolutionary phase in genetic medicine: precisely modifying or repairing causal genetic defects. This may include any number of DNA sequence manipulations, such as knocking out a deleterious gene, introducing a particular mutation, or directly repairing a defective sequence by site-specific recombination. All of these edits can currently be achieved via programmable rare-cutting endonucleases to create targeted DNA breaks that can engage and exploit endogenous DNA repair pathways to impart site-specific genetic changes. Over the past decade, several distinct technologies for introducing site-specific DNA breaks have been developed, yet the different biological origins of these gene-editing technologies bring along inherent differences in parameters that impact clinical implementation. This review aims to provide an accessible overview of the various endonuclease-based gene-editing platforms, highlighting the strengths and weakness of each with respect to therapeutic applications. PMID:26796671
LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites
Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi
2017-01-01
Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/−) and Tbx6-deficient mice (Tbx6 −/−), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis. PMID:28664156
LTRs of Endogenous Retroviruses as a Source of Tbx6 Binding Sites.
Yasuhiko, Yukuto; Hirabayashi, Yoko; Ono, Ryuichi
2017-01-01
Retrotransposons are abundant in mammalian genomes and can modulate the gene expression of surrounding genes by disrupting endogenous binding sites for transcription factors (TFs) or providing novel TFs binding sites within retrotransposon sequences. Here, we show that a (C/T)CACACCT sequence motif in ORR1A, ORR1B, ORR1C, and ORR1D, Long Terminal Repeats (LTRs) of MaLR endogenous retrovirus (ERV), is the direct target of Tbx6, an evolutionary conserved family of T-box TFs. Moreover, by comparing gene expression between control mice (Tbx6 +/-) and Tbx6-deficient mice (Tbx6 -/-), we demonstrate that at least four genes, Twist2, Pitx2, Oscp1, and Nfxl1, are down-regulated with Tbx6 deficiency. These results suggest that ORR1A, ORR1B, ORR1C and ORR1D may contribute to the evolution of mammalian embryogenesis.
Sun, Zichen; Stack, Colin; Šlapeta, Jan
2012-05-25
In order to investigate the genetic variation between Tritrichomonas foetus from bovine and feline origins, cysteine protease 8 (CP8) coding sequence was selected as the polymorphic DNA marker. Direct sequencing of CP8 coding sequence of T. foetus from four feline isolates and two bovine isolates with polymerase chain reaction successfully revealed conserved nucleotide polymorphisms between feline and bovine isolates. These results provide useful information for CP8-based molecular differentiation of T. foetus genotypes. Copyright © 2011 Elsevier B.V. All rights reserved.
PanFP: Pangenome-based functional profiles for microbial communities
Jun, Se -Ran; Hauser, Loren John; Schadt, Christopher Warren; ...
2015-09-26
For decades there has been increasing interest in understanding the relationships between microbial communities and ecosystem functions. Current DNA sequencing technologies allow for the exploration of microbial communities in two principal ways: targeted rRNA gene surveys and shotgun metagenomics. For large study designs, it is often still prohibitively expensive to sequence metagenomes at both the breadth and depth necessary to statistically capture the true functional diversity of a community. Although rRNA gene surveys provide no direct evidence of function, they do provide a reasonable estimation of microbial diversity, while being a very cost-effective way to screen samples of interest for later shotgun metagenomic analyses. However, there is a great deal of 16S rRNA gene survey data currently available from diverse environments, and thus a need for tools to infer functional composition of environmental samples based on 16S rRNA gene survey data. As a result, we present a computational method called pangenome-based functional profiles (PanFP), which infers functional profiles of microbial communities from 16S rRNA gene survey data for Bacteria and Archaea. PanFP is based on pangenome reconstruction of a 16S rRNA gene operational taxonomic unit (OTU) from known genes and genomes pooled from the OTU's taxonomic lineage. From this lineage, we derive an OTU functional profile by weighting a pangenome's functional profile with the OTU's abundance observed in a given sample. We validated our method by comparing PanFP to the functional profiles obtained from the direct shotgun metagenomic measurement of 65 diverse communities via Spearman correlation coefficients. These correlations improved with increasing sequencing depth, within the range of 0.8-0.9 for the most deeply sequenced Human Microbiome Project mock community samples. PanFP is very similar in performance to another recently released tool, PICRUSt, for almost all of the survey data analysed here. However, our method is unique in that any OTU building method can be used, as opposed to being limited to closed-reference OTU picking strategies against specific reference sequence databases. In conclusion, we developed an automated computational method, which derives an inferred functional profile based on the 16S rRNA gene surveys of microbial communities. The inferred functional profile provides a cost-effective way to study complex ecosystems through predicted comparative functional metagenomes and metadata analysis. All PanFP source code and additional documentation are freely available online at GitHub.
PanFP: pangenome-based functional profiles for microbial communities.
Jun, Se-Ran; Robeson, Michael S; Hauser, Loren J; Schadt, Christopher W; Gorin, Andrey A
2015-09-26
For decades there has been increasing interest in understanding the relationships between microbial communities and ecosystem functions. Current DNA sequencing technologies allow for the exploration of microbial communities in two principal ways: targeted rRNA gene surveys and shotgun metagenomics. For large study designs, it is often still prohibitively expensive to sequence metagenomes at both the breadth and depth necessary to statistically capture the true functional diversity of a community. Although rRNA gene surveys provide no direct evidence of function, they do provide a reasonable estimation of microbial diversity, while being a very cost-effective way to screen samples of interest for later shotgun metagenomic analyses. However, there is a great deal of 16S rRNA gene survey data currently available from diverse environments, and thus a need for tools to infer functional composition of environmental samples based on 16S rRNA gene survey data. We present a computational method called pangenome-based functional profiles (PanFP), which infers functional profiles of microbial communities from 16S rRNA gene survey data for Bacteria and Archaea. PanFP is based on pangenome reconstruction of a 16S rRNA gene operational taxonomic unit (OTU) from known genes and genomes pooled from the OTU's taxonomic lineage. From this lineage, we derive an OTU functional profile by weighting a pangenome's functional profile with the OTU's abundance observed in a given sample. We validated our method by comparing PanFP to the functional profiles obtained from the direct shotgun metagenomic measurement of 65 diverse communities via Spearman correlation coefficients. These correlations improved with increasing sequencing depth, within the range of 0.8-0.9 for the most deeply sequenced Human Microbiome Project mock community samples. PanFP is very similar in performance to another recently released tool, PICRUSt, for almost all of the survey data analysed here. 
But, our method is unique in that any OTU building method can be used, as opposed to being limited to closed-reference OTU picking strategies against specific reference sequence databases. We developed an automated computational method, which derives an inferred functional profile based on the 16S rRNA gene surveys of microbial communities. The inferred functional profile provides a cost effective way to study complex ecosystems through predicted comparative functional metagenomes and metadata analysis. All PanFP source code and additional documentation are freely available online at GitHub ( https://github.com/srjun/PanFP ).
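The abundance-weighting and validation steps described in the PanFP records can be sketched in a few lines. This is a simplified illustration rather than the PanFP code: it assumes each OTU already has a pangenome functional profile (function identifier mapped to a weight), sums the abundance-weighted profiles into one community profile, and compares two profiles with a Spearman rank correlation implemented in plain Python. All function names and the example identifiers are hypothetical.

```python
def community_profile(otu_abundance, pangenome_profiles):
    """Sum each OTU's pangenome functional profile, weighted by OTU abundance."""
    profile = {}
    for otu, abundance in otu_abundance.items():
        for function, weight in pangenome_profiles[otu].items():
            profile[function] = profile.get(function, 0.0) + abundance * weight
    return profile


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    abundance = {"otu1": 10, "otu2": 5}
    profiles = {"otu1": {"K1": 1.0, "K2": 0.5}, "otu2": {"K2": 1.0, "K3": 0.2}}
    print(community_profile(abundance, profiles))
```

In a validation like the one described, the two vectors passed to `spearman` would hold the inferred and the shotgun-measured abundance for the same ordered list of functions.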
Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization.
Jung, Sang-Kyu; McDonald, Karen
2011-08-16
Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net.
Visual gene developer: a fully programmable bioinformatics software for synthetic gene optimization
2011-01-01
Background: Direct gene synthesis is becoming more popular owing to decreases in gene synthesis pricing. Compared with using natural genes, gene synthesis provides a good opportunity to optimize gene sequence for specific applications. In order to facilitate gene optimization, we have developed a stand-alone software called Visual Gene Developer. Results: The software not only provides general functions for gene analysis and optimization along with an interactive user-friendly interface, but also includes unique features such as programming capability, dedicated mRNA secondary structure prediction, artificial neural network modeling, network & multi-threaded computing, and user-accessible programming modules. The software allows a user to analyze and optimize a sequence using main menu functions or specialized module windows. Alternatively, gene optimization can be initiated by designing a gene construct and configuring an optimization strategy. A user can choose several predefined or user-defined algorithms to design a complicated strategy. The software provides expandable functionality as platform software supporting module development using popular script languages such as VBScript and JScript in the software programming environment. Conclusion: Visual Gene Developer is useful for both researchers who want to quickly analyze and optimize genes, and those who are interested in developing and testing new algorithms in bioinformatics. The software is available for free download at http://www.visualgenedeveloper.net. PMID:21846353
The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons.
Braasch, Ingo; Gehrke, Andrew R; Smith, Jeramiah J; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M; Campbell, Michael S; Barrell, Daniel; Martin, Kyle J; Mulley, John F; Ravi, Vydianathan; Lee, Alison P; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E G; Sun, Yi; Hertel, Jana; Beam, Michael J; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H; Litman, Gary W; Litman, Ronda T; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F; Wang, Han; Taylor, John S; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M J; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T; Venkatesh, Byrappa; Holland, Peter W H; Guiguen, Yann; Bobe, Julien; Shubin, Neil H; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H
2016-04-01
To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before teleost genome duplication (TGD). The slowly evolving gar genome has conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization and development (mediated, for example, by Hox, ParaHox and microRNA genes). Numerous conserved noncoding elements (CNEs; often cis regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles for such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses showed that the sums of expression domains and expression levels for duplicated teleost genes often approximate the patterns and levels of expression for gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes and the function of human regulatory sequences.
Arkas: Rapid reproducible RNAseq analysis
Colombo, Anthony R.; Triche Jr, Timothy J.; Ramsingh, Giridharan
2017-01-01
The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer cloud-scale RNAseq pipelines, Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from SRA Import, facilitating raw sequencing importing, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134
Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.
1992-01-01
The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat, the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
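The figures in this abstract are mutually consistent, which a short Python check makes explicit. This is only an arithmetic sketch; the variable and function names are illustrative, and it assumes the 1671-nucleotide count covers coding codons only (1671 / 3 = 557 exactly), which the abstract does not state.

```python
ORF_NUCLEOTIDES = 1671          # open reading frame length from the abstract
SIGNAL_PEPTIDE_RESIDUES = 26    # residues removed from the precursor


def residues_encoded(orf_nucleotides):
    """Number of amino acids encoded by an ORF whose length is a multiple of 3."""
    if orf_nucleotides % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return orf_nucleotides // 3


precursor_length = residues_encoded(ORF_NUCLEOTIDES)           # precursor residues
mature_length = precursor_length - SIGNAL_PEPTIDE_RESIDUES     # after signal cleavage
n_terminal_window = len(range(27, 61 + 1))                     # residues 27 to 61 inclusive
```

The mature protein starts at residue 27, so the 35 microsequenced N-terminal residues correspond to positions 27 through 61 of the precursor.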
The spotted gar genome illuminates vertebrate evolution and facilitates human-to-teleost comparisons
Braasch, Ingo; Gehrke, Andrew R.; Smith, Jeramiah J.; Kawasaki, Kazuhiko; Manousaki, Tereza; Pasquier, Jeremy; Amores, Angel; Desvignes, Thomas; Batzel, Peter; Catchen, Julian; Berlin, Aaron M.; Campbell, Michael S.; Barrell, Daniel; Martin, Kyle J.; Mulley, John F.; Ravi, Vydianathan; Lee, Alison P.; Nakamura, Tetsuya; Chalopin, Domitille; Fan, Shaohua; Wcisel, Dustin; Cañestro, Cristian; Sydes, Jason; Beaudry, Felix E. G.; Sun, Yi; Hertel, Jana; Beam, Michael J.; Fasold, Mario; Ishiyama, Mikio; Johnson, Jeremy; Kehr, Steffi; Lara, Marcia; Letaw, John H.; Litman, Gary W.; Litman, Ronda T.; Mikami, Masato; Ota, Tatsuya; Saha, Nil Ratan; Williams, Louise; Stadler, Peter F.; Wang, Han; Taylor, John S.; Fontenot, Quenton; Ferrara, Allyse; Searle, Stephen M. J.; Aken, Bronwen; Yandell, Mark; Schneider, Igor; Yoder, Jeffrey A.; Volff, Jean-Nicolas; Meyer, Axel; Amemiya, Chris T.; Venkatesh, Byrappa; Holland, Peter W. H.; Guiguen, Yann; Bobe, Julien; Shubin, Neil H.; Di Palma, Federica; Alföldi, Jessica; Lindblad-Toh, Kerstin; Postlethwait, John H.
2016-01-01
To connect human biology to fish biomedical models, we sequenced the genome of spotted gar (Lepisosteus oculatus), whose lineage diverged from teleosts before the teleost genome duplication (TGD). The slowly evolving gar genome conserved in content and size many entire chromosomes from bony vertebrate ancestors. Gar bridges teleosts to tetrapods by illuminating the evolution of immunity, mineralization, and development (e.g., Hox, ParaHox, and miRNA genes). Numerous conserved non-coding elements (CNEs, often cis-regulatory) undetectable in direct human-teleost comparisons become apparent using gar: functional studies uncovered conserved roles of such cryptic CNEs, facilitating annotation of sequences identified in human genome-wide association studies. Transcriptomic analyses revealed that the sum of expression domains and levels from duplicated teleost genes often approximate patterns and levels of gar genes, consistent with subfunctionalization. The gar genome provides a resource for understanding evolution after genome duplication, the origin of vertebrate genomes, and the function of human regulatory sequences. PMID:26950095
Simonelli, F; Cennamo, G; Ziviello, C; Testa, F; de Crecchio, G; Nesti, A; Manitto, M P; Ciccodicola, A; Banfi, S; Brancato, R; Rinaldi, E
2003-09-01
To describe the clinical phenotype of X-linked juvenile retinoschisis in eight Italian families with six different mutations in the XLRS1 gene. Complete ophthalmic examinations, electroretinography and A and B-scan standardised echography were performed in 18 affected males. The coding sequences of the XLRS1 gene were amplified by polymerase chain reaction and directly sequenced on an automated sequencer. Six different XLRS1 mutations were identified; two of these mutations, Ile81Asn and Trp122Cys, have not been previously described. The affected males showed an electronegative response to the standard white scotopic stimulus and a prolonged implicit time of the 30 Hz flicker. In the families with Trp112Cys and Trp122Cys mutations we observed a more severe retinoschisis (RS) clinical picture compared with the other genotypes. The severe RS phenotypes associated with the Trp112Cys and Trp122Cys mutations suggest that these mutations cause a notable alteration in the function of the retinoschisin protein.
NASA Astrophysics Data System (ADS)
Lau, Yun-Fai; Kan, Yuet Wai
1983-09-01
We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.
Prospecting for pig single nucleotide polymorphisms in the human genome: have we struck gold?
Grapes, L; Rudd, S; Fernando, R L; Megy, K; Rocha, D; Rothschild, M F
2006-06-01
Gene-to-gene variation in the frequency of single nucleotide polymorphisms (SNPs) has been observed in humans, mice, rats, primates and pigs, but a relationship across species in this variation has not been described. Here, the frequency of porcine coding SNPs (cSNPs) identified by in silico methods, and the frequency of murine cSNPs, were compared with the frequency of human cSNPs across homologous genes. From 150,000 porcine expressed sequence tag (EST) sequences, a total of 452 SNP-containing sequence clusters were found, totalling 1394 putative SNPs. All the clustered porcine EST annotations and SNP data have been made publicly available at http://sputnik.btk.fi/project?name=swine. Human and murine cSNPs were identified from dbSNP and were characterized as either validated or total number of cSNPs (validated plus non-validated) for comparison purposes. The correlation between in silico pig cSNP and validated human cSNP densities was found to be 0.77 (p < 0.00001) for a set of 25 homologous genes, while a correlation of 0.48 (p < 0.0005) was found for a primarily random sample of 50 homologous human and mouse genes. This is the first evidence of conserved gene-to-gene variability in cSNP frequency across species and indicates that site-directed screening of porcine genes that are homologous to cSNP-rich human genes may rapidly advance cSNP discovery in pigs.
Intramolecular transposition by a synthetic IS50 (Tn5) derivative
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tomcsanyi, T.; Phadnis, S.H.; Berg, D.E.
1990-11-01
We report the formation of deletions and inversions by intramolecular transposition of Tn5-derived mobile elements. The synthetic transposons used contained the IS50 O and I end segments and the transposase gene, a contraselectable gene encoding sucrose sensitivity (sacB), antibiotic resistance genes, and a plasmid replication origin. Both deletions and inversions were associated with loss of a 300-bp segment that is designated the vector because it is outside of the transposon. Deletions were severalfold more frequent than inversions, perhaps reflecting constraints on DNA twisting or abortive transposition. Restriction and DNA sequence analyses showed that both types of rearrangements extended from one transposon end to many different sites in target DNA. In the case of inversions, transposition generated 9-bp direct repeats of target sequences.
[Efficient genome editing in human pluripotent stem cells through CRISPR/Cas9].
Liu, Gai-gai; Li, Shuang; Wei, Yu-da; Zhang, Yong-xian; Ding, Qiu-rong
2015-11-01
The RNA-guided CRISPR (clustered regularly interspaced short palindromic repeat)-associated Cas9 nuclease has offered a new platform for genome editing with high efficiency. Here, we report the use of CRISPR/Cas9 technology to target a specific genomic region in human pluripotent stem cells. We show that CRISPR/Cas9 can be used to disrupt a gene by introducing frameshift mutations into the gene coding region; to knock in specific sequences (e.g., a FLAG tag DNA sequence) at a targeted genomic locus via homology-directed repair; and to induce large genomic deletions through dual-guide multiplexing. Our results demonstrate the versatile application of CRISPR/Cas9 in stem cell genome editing, which can be widely utilized for functional studies of genes or genome loci in human pluripotent stem cells.
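Whether an insertion or deletion disrupts a coding region by frameshift depends on its net length: a change that is not a multiple of three shifts the reading frame. The sketch below illustrates only that rule; the function name and the example alleles are hypothetical and are not taken from the study.

```python
def is_frameshift(ref_allele, alt_allele):
    """True when the net length change between alleles is not a multiple of 3."""
    return (len(alt_allele) - len(ref_allele)) % 3 != 0


# Hypothetical alleles within a coding region.
one_base_deletion = is_frameshift("ACGT", "ACT")        # net change of -1
in_frame_insertion = is_frameshift("ACG", "ACGTTT")     # net change of +3
substitution = is_frameshift("ACG", "ATG")              # no length change
```

An in-frame edit can still disrupt function (for example by removing a critical residue), so this check addresses the reading frame only.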
Repressor-mediated tissue-specific gene expression in plants
Meagher, Richard B [Athens, GA; Balish, Rebecca S [Oxford, OH; Tehryung, Kim [Athens, GA; McKinney, Elizabeth C [Athens, GA
2009-02-17
Plant tissue-specific gene expression by way of repressor-operator complexes has enabled outcomes including, without limitation, male sterility and engineered plants having root-specific gene expression of relevant proteins to clean environmental pollutants from soil and water. A mercury hyperaccumulation strategy requires that the mercuric ion reductase coding sequence be strongly expressed. The actin promoter vector, A2pot, engineered to contain bacterial lac operator sequences, directed strong expression in all plant vegetative organs and tissues. In contrast, the expression from the A2pot construct was restricted primarily to root tissues when a modified bacterial repressor (LacIn) was coexpressed from the light-regulated rubisco small subunit promoter in above-ground tissues. Also provided are analogous repressor-operator complexes for selective expression in other plant tissues, for example, to produce male sterile plants.
Fukuzawa, M; Williams, J G
2000-06-01
The cudA gene encodes a nuclear protein that is essential for normal multicellular development. At the slug stage cudA is expressed in the prespore cells and in a sub-region of the prestalk zone. We show that cap site distal promoter sequences direct cudA expression in prespore cells, while proximal sequences direct expression in the prestalk sub-region. The promoter domain that directs prespore-specific transcription consists of a positively acting region, that has the potential to direct expression in all cells within the slug, and a negatively acting region that prevents expression in the prestalk cells. Dd-STATa is the STAT protein that regulates commitment to stalk cell gene expression, where it is known to function as a transcriptional repressor. We show that Dd-STATa binds in vitro to the positively acting part of the prespore domain of the cudA promoter. However, Dd-STATa cannot be utilised for this purpose in vivo, because analysis of a Dd-STATa null mutant strain shows that Dd-STATa is not necessary for cudA transcription in prespore cells. In contrast, the part of the cudA promoter that directs prestalk-specific expression contains a binding site for Dd-STATa that is essential for its biological activity. Dd-STATa appears therefore to serve as a direct activator of cudA transcription in prestalk cells, while a protein with a DNA binding specificity highly related to that of Dd-STATa is utilised to activate cudA transcription in prespore cells.
Hu, Min; Chilton, Neil B; Gasser, Robin B
2002-02-01
The complete mitochondrial genome sequences were determined for two species of human hookworms, Ancylostoma duodenale (13,721 bp) and Necator americanus (13,604 bp). The circular hookworm genomes are amongst the smallest reported to date for any metazoan organism. Their relatively small size relates mainly to a reduced length in the AT-rich region. Both hookworm genomes encode 12 protein, two ribosomal RNA and 22 transfer RNA genes, but lack the ATP synthetase subunit 8 gene, which is consistent with three other species of Secernentea studied to date. All genes are transcribed in the same direction and have a nucleotide composition high in A and T, but low in G and C. The AT bias had a significant effect on both the codon usage pattern and amino acid composition of proteins. For both hookworm species, genes were arranged in the same order as for Caenorhabditis elegans, except for the presence of a non-coding region between genes nad3 and nad5. In A. duodenale, this non-coding region is predicted to form a stem-and-loop structure which is not present in N. americanus. The mitochondrial genome structure for both hookworms differs from Ascaris suum only in the location of the AT-rich region, whereas there are substantial differences when compared with Onchocerca volvulus, including four gene or gene-block translocations and the positions of some transfer RNA genes and the AT-rich region. Based on genome organisation and amino acid sequence identity, A. duodenale and N. americanus were more closely related to C. elegans than to A. suum or O. volvulus (all secernentean nematodes), consistent with a previous phylogenetic study using ribosomal DNA sequence data. Determination of the complete mitochondrial genome sequences for two human hookworms (the first members of the order Strongylida ever sequenced) provides a foundation for studying the systematics, population genetics and ecology of these and other nematodes of socio-economic importance.
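The nucleotide composition bias mentioned in this abstract is straightforward to quantify. The following Python sketch computes base frequencies and AT content for a DNA string; the function names and the short example sequence are hypothetical and do not come from the hookworm genomes.

```python
from collections import Counter


def base_composition(sequence):
    """Fraction of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: counts[base] / total for base in "ACGT"}


def at_content(sequence):
    """Combined fraction of A and T bases."""
    composition = base_composition(sequence)
    return composition["A"] + composition["T"]


toy_sequence = "ATTAGC"  # hypothetical 6-base example
toy_at = at_content(toy_sequence)
```

Applied per gene or in sliding windows, the same calculation would show how composition varies along a genome, for example across an AT-rich region.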
Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D
2017-01-04
The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
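Gene list enrichment analysis of the kind mentioned here is commonly framed as an over-representation test. The sketch below uses a one-sided hypergeometric test as a generic illustration; it is an assumption for teaching purposes and not a description of the statistical test PANTHER implements. All names and numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` genes from `population`,
    of which `annotated` belong to the category of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 20000 genes, 200 in a pathway,
# an uploaded list of 100 genes, 5 of which fall in the pathway.
p_value = enrichment_p_value(20000, 200, 100, 5)
```

When many categories are tested at once, the resulting p-values would normally be corrected for multiple testing.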
Safe genetically engineered plants
NASA Astrophysics Data System (ADS)
Rosellini, D.; Veronesi, F.
2007-10-01
The application of genetic engineering to plants has provided genetically modified plants (GMPs, or transgenic plants) that are cultivated worldwide on increasing areas. The most widespread GMPs are herbicide-resistant soybean and canola and insect-resistant corn and cotton. New GMPs that produce vaccines, pharmaceutical or industrial proteins, and fortified food are approaching the market. The techniques employed to introduce foreign genes into plants allow a quite good degree of predictability of the results, and their genome is minimally modified. However, some aspects of GMPs have raised concern: (a) control of the insertion site of the introduced DNA sequences into the plant genome and of its mutagenic effect; (b) presence of selectable marker genes conferring resistance to an antibiotic or an herbicide, linked to the useful gene; (c) insertion of undesired bacterial plasmid sequences; and (d) gene flow from transgenic plants to non-transgenic crops or wild plants. In response to public concerns, genetic engineering techniques are continuously being improved. Techniques to direct foreign gene integration into chosen genomic sites, to avoid the use of selectable genes or to remove them from the cultivated plants, to reduce the transfer of undesired bacterial sequences, and make use of alternative, safer selectable genes, are all fields of active research. In our laboratory, some of these new techniques are applied to alfalfa, an important forage plant. These emerging methods for plant genetic engineering are briefly reviewed in this work.
2010-01-01
Background: Expansins form a large multi-gene family found in wheat and other cereal genomes that are involved in the expansion of cell walls as a tissue grows. The expansin family can be divided up into two main groups, namely, alpha-expansin (EXPA) and beta-expansin proteins (EXPB), with the EXPB group being of particular interest as group 1-pollen allergens. Results: In this study, three beta-expansin genes were identified and characterized from a newly sequenced region of the Triticum aestivum cv. Chinese Spring chromosome 3B physical map at the Sr2 locus (FPC contig ctg11). The analysis of a 357 kb sub-sequence of FPC contig ctg11 identified one of the beta-expansin genes as TaEXPB11, originally identified as a cDNA from the wheat cv. Wyuna. Through the analysis of intron sequences of the three wheat cv. Chinese Spring genes, we propose that two of these beta-expansin genes are duplications of the TaEXPB11 gene. Comparative sequence analysis with two other wheat cultivars (cv. Westonia and cv. Hope) and a Triticum aestivum var. spelta line validated the identification of the Chinese Spring variant of TaEXPB11. The expression in maternal and grain tissues was confirmed by examining EST databases and carrying out RT-PCR experiments. Detailed examination of the position of TaEXPB11 relative to the locus encoding Sr2 disease resistance ruled out the possibility of this gene directly contributing to the resistance phenotype. Conclusions: Through 3-D structural protein comparisons with Zea mays EXPB1, we proposed that variations within the coding sequence of TaEXPB11 in wheats may produce a functional change within features such as domain 1, related to possible involvement in cell wall structure, and domain 2, defining the pollen allergen domain and binding to IgE protein. The variation established in this gene suggests it is a clearly identifiable member of a gene family and reflects the dynamic features of the wheat genome as it adapted to a range of different environments and uses. Accession Numbers: ctg11 = FN564426. Survey sequences of TaEXPB11ws and TsEXPB11 are provided on request. PMID:20507562
A comparative molecular analysis of water-filled limestone sinkholes in north-eastern Mexico.
Sahl, Jason W; Gary, Marcus O; Harris, J Kirk; Spear, John R
2011-01-01
Sistema Zacatón in north-eastern Mexico is host to several deep, water-filled, anoxic, karstic sinkholes (cenotes). These cenotes were explored, mapped, and geochemically and microbiologically sampled by the autonomous underwater vehicle deep phreatic thermal explorer (DEPTHX). The community structure of the filterable fraction of the water column and extensive microbial mats that coat the cenote walls was investigated by comparative analysis of small-subunit (SSU) 16S rRNA gene sequences. Full-length Sanger gene sequence analysis revealed novel microbial diversity that included three putative bacterial candidate phyla and three additional groups that showed high intra-clade distance with poorly characterized bacterial candidate phyla. Limited functional gene sequence analysis in these anoxic environments identified genes associated with methanogenesis, sulfate reduction and anaerobic ammonium oxidation. A directed, barcoded amplicon, multiplex pyrosequencing approach was employed to compare ∼100,000 bacterial SSU gene sequences from water column and wall microbial mat samples from five cenotes in Sistema Zacatón. A new, high-resolution sequence distribution profile (SDP) method identified changes in specific phylogenetic types (phylotypes) in microbial mats at varied depths; Mantel tests showed a correlation of the genetic distances between mat communities in two cenotes and the geographic location of each cenote. Community structure profiles from the water column of three neighbouring cenotes showed distinct variation; statistically significant differences in the concentration of geochemical constituents suggest that the variation observed in microbial communities between neighbouring cenotes is due to geochemical variation. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
Lindahl, Susanne; Söderlund, Robert; Frosth, Sara; Pringle, John; Båverud, Viveca; Aspán, Anna
2011-11-21
Strangles is a serious respiratory disease in horses caused by Streptococcus equi subspecies equi (S. equi). Transmission of the disease occurs by direct contact with an infected horse or contaminated equipment. Genetically, S. equi strains are highly homogenous and differentiation of strains has proven difficult. However, the S. equi M-protein SeM contains a variable N-terminal region and has been proposed as a target gene to distinguish between different strains of S. equi and determine the source of an outbreak. In this study, strains of S. equi (n=60) from 32 strangles outbreaks in Sweden during 1998-2003 and 2008-2009 were genetically characterized by sequencing the SeM protein gene (seM), and by pulsed-field gel electrophoresis (PFGE). Swedish strains belonged to 10 different seM types, of which five have not previously been described. Most were identical or highly similar to allele types from strangles outbreaks in the UK. Outbreaks in 2008/2009 sharing the same seM type were associated by geographic location and/or type of usage of the horses (racing stables). Sequencing of the seM gene generally agreed with pulsed-field gel electrophoresis profiles. Our data suggest that seM sequencing as an epidemiological tool is supported by the agreement between seM and PFGE and that sequencing of the SeM protein gene is more sensitive than PFGE in discriminating strains of S. equi. Copyright © 2011 Elsevier B.V. All rights reserved.
Köberl, Martina; White, Richard A.; Erschen, Sabine; ...
2015-08-06
Streptomyces sp. strain Wb2n-11, isolated from native desert soil, exhibited broad-spectrum antagonism against plant pathogenic fungi, bacteria, and nematodes. The 8.2-Mb draft genome reveals genes putatively responsible for its promising biocontrol activity and genes which enable the soil bacterium to directly interact beneficially with plants.
Sochorová, Jana; Coriton, Olivier; Kuderová, Alena; Lunerová, Jana; Chèvre, Anne-Marie; Kovařík, Aleš
2017-01-01
Background and aims: Brassica napus (AACC, 2n = 38, oilseed rape) is a relatively recent allotetraploid species derived from the putative progenitor diploid species Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). To determine the influence of intensive breeding conditions on the evolution of its genome, we analysed structure and copy number of rDNA in 21 cultivars of B. napus, representative of genetic diversity. Methods: We used next-generation sequencing genomic approaches, Southern blot hybridization, expression analysis and fluorescence in situ hybridization (FISH). Subgenome-specific sequences derived from rDNA intergenic spacers (IGS) were used as probes for identification of loci composition on chromosomes. Key Results: Most B. napus cultivars (18/21, 86 %) had more A-genome than C-genome rDNA copies. Three cultivars analysed by FISH (‘Darmor’, ‘Yudal’ and ‘Asparagus kale’) harboured the same number (12 per diploid set) of loci. In B. napus ‘Darmor’, the A-genome-specific rDNA probe hybridized to all 12 rDNA loci (eight on the A-genome and four on the C-genome) while the C-genome-specific probe showed weak signals on the C-genome loci only. Deep sequencing revealed high homogeneity of arrays suggesting that the C-genome genes were largely overwritten by the A-genome variants in B. napus ‘Darmor’. In contrast, B. napus ‘Yudal’ showed a lack of gene conversion evidenced by additive inheritance of progenitor rDNA variants and highly localized hybridization signals of subgenome-specific probes on chromosomes. Brassica napus ‘Asparagus kale’ showed an intermediate pattern to ‘Darmor’ and ‘Yudal’. At the expression level, most cultivars (95 %) exhibited stable A-genome nucleolar dominance while one cultivar (‘Norin 9’) showed co-dominance. Conclusions: The B. napus cultivars differ in the degree and direction of rDNA homogenization. The prevalent direction of gene conversion (towards the A-genome) correlates with the direction of expression dominance indicating that gene activity may be needed for interlocus gene conversion. PMID:27707747
Genetic Testing as a New Standard for Clinical Diagnosis of Color Vision Deficiencies.
Davidoff, Candice; Neitz, Maureen; Neitz, Jay
2016-09-01
The genetics underlying inherited color vision deficiencies is well understood: causative mutations change the copy number or sequence of the long (L), middle (M), or short (S) wavelength sensitive cone opsin genes. This study evaluated the potential of opsin gene analyses for use in clinical diagnosis of color vision defects. We tested 1872 human subjects using direct sequencing of opsin genes and a novel genetic assay that characterizes single nucleotide polymorphisms (SNPs) using the MassArray system. Of the subjects, 1074 also were given standard psychophysical color vision tests for a direct comparison with current clinical methods. Protan and deutan deficiencies were classified correctly in all subjects identified by MassArray as having red-green defects. Estimates of defect severity based on SNPs that control photopigment spectral tuning correlated with estimates derived from Nagel anomaloscopy. The MassArray assay provides genetic information that can be useful in the diagnosis of inherited color vision deficiency including presence versus absence, type, and severity, and it provides information to patients about the underlying pathobiology of their disease. The MassArray assay provides a method that directly analyzes the molecular substrates of color vision that could be used in combination with, or as an alternative to current clinical diagnosis of color defects.
Genetic Testing as a New Standard for Clinical Diagnosis of Color Vision Deficiencies
Davidoff, Candice; Neitz, Maureen; Neitz, Jay
2016-01-01
Purpose The genetics underlying inherited color vision deficiencies is well understood: causative mutations change the copy number or sequence of the long (L), middle (M), or short (S) wavelength sensitive cone opsin genes. This study evaluated the potential of opsin gene analyses for use in clinical diagnosis of color vision defects. Methods We tested 1872 human subjects using direct sequencing of opsin genes and a novel genetic assay that characterizes single nucleotide polymorphisms (SNPs) using the MassArray system. Of the subjects, 1074 also were given standard psychophysical color vision tests for a direct comparison with current clinical methods. Results Protan and deutan deficiencies were classified correctly in all subjects identified by MassArray as having red–green defects. Estimates of defect severity based on SNPs that control photopigment spectral tuning correlated with estimates derived from Nagel anomaloscopy. Conclusions The MassArray assay provides genetic information that can be useful in the diagnosis of inherited color vision deficiency including presence versus absence, type, and severity, and it provides information to patients about the underlying pathobiology of their disease. Translational Relevance The MassArray assay provides a method that directly analyzes the molecular substrates of color vision that could be used in combination with, or as an alternative to current clinical diagnosis of color defects. PMID:27622081
Helper-dependent adenoviral vectors for liver-directed gene therapy
Brunetti-Pierri, Nicola; Ng, Philip
2011-01-01
Helper-dependent adenoviral (HDAd) vectors devoid of all viral-coding sequences are promising non-integrating vectors for liver-directed gene therapy because they have a large cloning capacity, can efficiently transduce a wide variety of cell types from various species independent of the cell cycle and can result in long-term transgene expression without chronic toxicity. The main obstacle preventing clinical applications of HDAd for liver-directed gene therapy is the host innate inflammatory response against the vector capsid proteins that occurs shortly after intravascular vector administration resulting in acute toxicity, the severity of which is dependent on vector dose. Intense efforts have been focused on elucidating the factors involved in this acute response and various strategies have been investigated to improve the therapeutic index of HDAd vectors. These strategies have yielded encouraging results with the potential for clinical translation. PMID:21470977
NASA Technical Reports Server (NTRS)
Everroad, R. Craig; Stuart, Rhona K.; Bebout, Brad M.; Detweiler, Angela M.; Lee, Jackson Zan; Woebken, Dagmar; Bebout, Leslie E.; Pett-Ridge, Jennifer
2016-01-01
The nonheterocystous filamentous cyanobacterium, strain ESFC-1, is a recently described member of the order Oscillatoriales within the Cyanobacteria. ESFC-1 has been shown to be a major diazotroph in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S RNA gene, ESFC-1 appears to belong to a unique, genus-level divergence; the draft genome sequence of this strain has now been determined. Here we report features of this genome as they relate to the ecological functions and capabilities of strain ESFC-1. The 5,632,035 bp genome sequence encodes 4914 protein-coding genes and 92 RNA genes. One striking feature of this cyanobacterium is the apparent lack of either uptake or bi-directional hydrogenases typically expected within a diazotroph. Additionally, a large genomic island is found that contains numerous low GC-content genes and genes related to extracellular polysaccharide production and cell wall synthesis and maintenance.
Everroad, R. Craig; Stuart, Rhona K.; Bebout, Brad M.; ...
2016-08-24
The nonheterocystous filamentous cyanobacterium, strain ESFC-1, is a recently described member of the order Oscillatoriales within the Cyanobacteria. ESFC-1 has been shown to be a major diazotroph in the intertidal microbial mat system at Elkhorn Slough, CA, USA. Based on phylogenetic analyses of the 16S RNA gene, ESFC-1 appears to belong to a unique, genus-level divergence; the draft genome sequence of this strain has now been determined. Here we report features of this genome as they relate to the ecological functions and capabilities of strain ESFC-1. The 5,632,035 bp genome sequence encodes 4914 protein-coding genes and 92 RNA genes. One striking feature of this cyanobacterium is the apparent lack of either uptake or bi-directional hydrogenases typically expected within a diazotroph. In addition, a large genomic island is found that contains numerous low GC-content genes and genes related to extracellular polysaccharide production and cell wall synthesis and maintenance.
Identification of Mycoparasitism-Related Genes in Trichoderma atroviride
Reithner, Barbara; Ibarra-Laclette, Enrique; Mach, Robert L.; Herrera-Estrella, Alfredo
2011-01-01
A high-throughput sequencing approach was utilized to carry out a comparative transcriptome analysis of Trichoderma atroviride IMI206040 during mycoparasitic interactions with the plant-pathogenic fungus Rhizoctonia solani. In this study, transcript fragments of 7,797 Trichoderma genes were sequenced, 175 of which were host responsive. According to the functional annotation of these genes by KOG (eukaryotic orthologous groups), the most abundant group during direct contact was “metabolism.” Quantitative reverse transcription (RT)-PCR confirmed the differential transcription of 13 genes (including swo1, encoding an expansin-like protein; axe1, coding for an acetyl xylan esterase; and homologs of genes encoding the aspartyl protease papA and a trypsin-like protease, pra1) in the presence of R. solani. An additional relative gene expression analysis of these genes, conducted at different stages of mycoparasitism against Botrytis cinerea and Phytophthora capsici, revealed a synergistic transcription of various genes involved in cell wall degradation. The similarities in expression patterns and the occurrence of regulatory binding sites in the corresponding promoter regions suggest a possible analog regulation of these genes during the mycoparasitism of T. atroviride. Furthermore, a chitin- and distance-dependent induction of pra1 was demonstrated. PMID:21531825
Blochlinger, K; Diggelmann, H
1984-12-01
The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells.
Blochlinger, K; Diggelmann, H
1984-01-01
The DNA coding sequence for the hygromycin B phosphotransferase gene was placed under the control of the regulatory sequences of a cloned long terminal repeat of Moloney sarcoma virus. This construction allowed direct selection for hygromycin B resistance after transfection of eucaryotic cell lines not naturally resistant to this antibiotic, thus providing another dominant marker for DNA transfer in eucaryotic cells. PMID:6098829
Analysis of SNP rs16754 of WT1 gene in a series of de novo acute myeloid leukemia patients.
Luna, Irene; Such, Esperanza; Cervera, Jose; Barragán, Eva; Jiménez-Velasco, Antonio; Dolz, Sandra; Ibáñez, Mariam; Gómez-Seguí, Inés; López-Pavía, María; Llop, Marta; Fuster, Óscar; Oltra, Silvestre; Moscardó, Federico; Martínez-Cuadrón, David; Senent, M Leonor; Gascón, Adriana; Montesinos, Pau; Martín, Guillermo; Bolufer, Pascual; Sanz, Miguel A
2012-12-01
The single nucleotide polymorphism (SNP) rs16754 of the WT1 gene has been previously described as a possible prognostic marker in normal karyotype acute myeloid leukemia (AML) patients. Nevertheless, the findings in this field are not always reproducible in different series. One hundred and seventy-five adult de novo AML patients were screened with two different methods for the detection of SNP rs16754: high-resolution melting (HRM) and FRET hybridization probes. Direct sequencing was used to validate both techniques. The SNP was detected in 52 out of 175 patients (30 %), both by HRM and hybridization probes. Direct sequencing confirmed that every positive sample in the screening methods had a variation in the DNA sequence. Patients with the wild-type genotype (WT1(AA)) for the SNP rs16754 were significantly younger than those with the heterozygous WT1(AG) genotype. No other difference was observed in baseline characteristics or outcome between patients with and without the SNP. Both techniques are equally reliable and reproducible as screening methods for the detection of the SNP rs16754, allowing for the selection of those samples that will need to be sequenced. We were unable to confirm the suggested favorable outcome of SNP rs16754 in de novo AML.
Zhang, Wanying; Wang, Tao; Huang, Shuaiwu; Zhao, Xiuli
2018-04-10
The aim was to detect mutations of the HPGD gene in three pedigrees affected with primary hypertrophic osteoarthropathy (PHO) by DNA sequencing and high-resolution melting (HRM) analysis. Genomic DNA was extracted from peripheral blood samples collected from the pedigrees. PCR and direct sequencing were carried out to identify potential mutations of the HPGD gene. Amplicons containing the mutation site were generated by nested PCR. The products were then subjected to HRM analysis using the HR-1 instrument. Direct sequencing was carried out in family members and healthy individuals to confirm the results of the HRM analysis. A homozygous c.310_311delCT mutation was detected in two affected probands, while the same mutation was detected in the heterozygous state in the third proband. HRM analysis of the fragments encompassing HPGD exon 3 showed three curve patterns representing three different genotypes, i.e., the wild type, the c.310_311delCT homozygote, and the c.310_311delCT heterozygote. The results of DNA sequencing were consistent with those of the HRM analysis and with the phenotypes of the subjects. The c.310_311delCT mutation may be the most prevalent mutation in the Chinese population. HRM analysis provides an optimized method for genetic testing of HPGD mutations owing to its simplicity, rapid turnaround and high sensitivity.
Li, Qinglian; Wang, Lifei; Xie, Yunying; Wang, Songmei; Chen, Ruxian
2013-01-01
Sansanmycins, produced by Streptomyces sp. strain SS, are uridyl peptide antibiotics with activities against Pseudomonas aeruginosa and multidrug-resistant Mycobacterium tuberculosis. In this work, the biosynthetic gene cluster of sansanmycins, comprised of 25 open reading frames (ORFs) showing considerable amino acid sequence identity to those of the pacidamycin and napsamycin gene cluster, was identified. SsaA, the archetype of a novel class of transcriptional regulators, was characterized in the sansanmycin gene cluster, with an N-terminal fork head-associated (FHA) domain and a C-terminal LuxR-type helix-turn-helix (HTH) motif. The disruption of ssaA abolished sansanmycin production, as well as the expression of the structural genes for sansanmycin biosynthesis, indicating that SsaA is a pivotal activator for sansanmycin biosynthesis. SsaA was proved to directly bind several putative promoter regions of biosynthetic genes, and comparison of sequences of the binding sites allowed the identification of a consensus SsaA binding sequence, GTMCTGACAN2TGTCAGKAC. The DNA binding activity of SsaA was inhibited by sansanmycins A and H in a concentration-dependent manner. Furthermore, sansanmycins A and H were found to directly interact with SsaA. These results indicated that SsaA strictly controls the production of sansanmycins at the transcriptional level in a feedback regulatory mechanism by sensing the accumulation of the end products. As the first characterized regulator of uridyl peptide antibiotic biosynthesis, the understanding of this autoregulatory process involved in sansanmycin biosynthesis will likely provide an effective strategy for rational improvements in the yields of these uridyl peptide antibiotics. PMID:23475969
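The consensus SsaA binding sequence quoted above, GTMCTGACAN2TGTCAGKAC, is written with IUPAC ambiguity codes (M = A or C, K = G or T, N = any base). A minimal Python sketch of how such a consensus could be turned into a regular expression for scanning a promoter region follows. The function names and the example sequence are hypothetical illustrations, the digit after N is assumed to be a repeat count, and only the given strand is scanned; this is not the authors' analysis method.

```python
import re

# Standard IUPAC nucleotide codes needed for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "[AC]",    # A or C
    "K": "[GT]",    # G or T
    "N": "[ACGT]",  # any base
}


def consensus_to_regex(consensus):
    """Translate a consensus such as 'GTMCTGACAN2TGTCAGKAC' into a regex string.

    Assumption: a digit after a symbol is a repeat count (N2 = two arbitrary bases).
    """
    parts = []
    for symbol, count in re.findall(r"([A-Z])(\d*)", consensus):
        parts.append(IUPAC[symbol] + ("{%s}" % count if count else ""))
    return "".join(parts)


SSAA_CONSENSUS = "GTMCTGACAN2TGTCAGKAC"
SSAA_PATTERN = re.compile(consensus_to_regex(SSAA_CONSENSUS))


def find_sites(sequence):
    """Return (0-based start, matched site) for each match on the given strand."""
    return [(m.start(), m.group()) for m in SSAA_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up test sequence, not taken from the sansanmycin gene cluster.
    print(consensus_to_regex(SSAA_CONSENSUS))
    print(find_sites("ttGTACTGACAGGTGTCAGTACaa"))
```

A fuller scan would also search the reverse complement and tolerate mismatches; both are left out to keep the sketch short.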
Xu, Jian-zhong; Zhang, Wei-guo
2016-01-01
With the availability of the whole genome sequence of Escherichia coli or Corynebacterium glutamicum, strategies for directed DNA manipulation have developed rapidly. DNA manipulation plays an important role in understanding the function of genes and in constructing novel engineered bacteria as required. DNA manipulation involves modifying the autologous genes and expressing the heterogenous genes. Two alternative approaches, using electroporation of linear DNA or a recombinant suicide plasmid, allow a wide variety of DNA manipulation. However, the over-expression of the desired gene is generally executed via plasmid-mediation. The current review summarizes the common strategies used for genetically modifying E. coli and C. glutamicum genomes, and discusses the technical problem of multi-layered DNA manipulation. Strategies for gene over-expression via integration into the genome are proposed. This review is intended to be an accessible introduction to DNA manipulation within the bacterial genome for novices and a source of the latest experimental information for experienced investigators. PMID:26834010
Demographic history and gene flow during silkworm domestication
2014-01-01
Background Gene flow plays an important role in the domestication history of domesticated species. However, little is known about the demographic history of the domesticated silkworm involving gene flow with its wild relative. Results In this study, four model-based evolutionary scenarios to describe the demographic history of B. mori were hypothesized. Using the Approximate Bayesian Computation method and DNA sequence data from 29 nuclear loci, we found that the gene flow at bottleneck model is the most likely scenario for silkworm domestication. The starting time of silkworm domestication was estimated to be approximately 7,500 years ago; the time of domestication termination was 3,984 years ago. Using coalescent simulation analysis, we also found that bi-directional gene flow occurred during silkworm domestication. Conclusions Estimates of silkworm domestication time are nearly consistent with the archeological evidence and our previous results. Importantly, we found that bi-directional gene flow might have occurred during silkworm domestication. Our findings add a dimension to highlight the important role of gene flow in domestication of crops and animals. PMID:25123546
Choury, Danièle; Aubert, Gérald; Szajnert, Marie-France; Azibi, Kemal; Delpech, Marc; Paul, Gérard
1999-01-01
A clinical strain of Vibrio cholerae non-O1 non-O139 isolated in France produced a new β-lactamase with a pI of 5.35. The purified enzyme, with a molecular mass of 33,000 Da, was characterized. Its kinetic constants show it to be a carbenicillin-hydrolyzing enzyme comparable to the five previously reported CARB β-lactamases and to SAR-1, another carbenicillin-hydrolyzing β-lactamase that has a pI of 4.9 and that is produced by a V. cholerae strain from Tanzania. This β-lactamase is designated CARB-6, and the gene for CARB-6 could not be transferred to Escherichia coli K-12 by conjugation. The nucleotide sequence of the structural gene was determined by direct sequencing of PCR-generated fragments from plasmid DNA with four pairs of primers covering the whole sequence of the reference CARB-3 gene. The gene encodes a 288-amino-acid protein that shares 94% homology with the CARB-1, CARB-2, and CARB-3 enzymes, 93% homology with the Proteus mirabilis N29 enzyme, and 86.5% homology with the CARB-4 enzyme. The sequence of CARB-6 differs from those of CARB-3, CARB-2, CARB-1, N29, and CARB-4 at 15, 16, 17, 19, and 37 amino acid positions, respectively. All these mutations are located in the C-terminal region of the sequence and at the surface of the molecule, according to the crystal structure of the Staphylococcus aureus PC-1 β-lactamase. PMID:9925522
The status of the species Enterobacter siamensis Khunthongpan et al. 2014. Request for an Opinion.
Kämpfer, Peter; Doijad, Swapnil; Chakraborty, Trinad; Glaeser, Stefanie P
2016-01-01
In the course of a taxonomic study describing novel species of the genus Enterobacter it was found that the 16S rRNA gene sequence of the type strain of Enterobacter siamensis, obtained both directly from the authors of the publication on Enterobacter siamensis and from the Korean Collection for Type Cultures (C2361T and KCTC 23282T, respectively), was not congruent with the 16S rRNA gene sequence deposited in the GenBank database under the accession number HQ888848, which was applied for phylogenetic analysis in the species proposal. The remaining deposit in the Japanese type culture collection, NBRC 107138T, showed an identical 16S rRNA gene sequence to the other two cultures and overall, this sequence differed at 35 positions in comparison with the 1429 bp sequence published under the accession number HQ888848. Therefore, the type strain of this species cannot be included in any further scientific comparative study. It is proposed that the Judicial Commission of the International Committee on Systematics of Prokaryotes place the name Enterobacter siamensis on the list of rejected names, if a suitable replacement for the type strain is not found or a neotype strain is not proposed within two years following the publication of this Request for an Opinion.
NASA Astrophysics Data System (ADS)
Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi
2016-03-01
Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.
Saingam, Prakit; Li, Bo; Yan, Tao
2018-06-01
DNA-based molecular detection of microbial pathogens in complex environments is still plagued by sensitivity, specificity and robustness issues. We propose to address these issues by viewing them as inadvertent consequences of requiring specific and adequate amplification (SAA) of target DNA molecules by current PCR methods. Using the invA gene of Salmonella as the model system, we investigated if next generation sequencing (NGS) can be used to directly detect target sequences in false-negative PCR reactions (PCR-NGS) in order to remove the SAA requirement from PCR. False-negative PCR and qPCR reactions were first created using serial dilutions of laboratory-prepared Salmonella genomic DNA and then analyzed directly by NGS. Target invA sequences were detected in all false-negative PCR and qPCR reactions, which lowered the method detection limits near the theoretical minimum of single gene copy detection. The capability of the PCR-NGS approach in correcting false negativity was further tested and confirmed under more environmentally relevant conditions using Salmonella-spiked stream water and sediment samples. Finally, the PCR-NGS approach was applied to ten urban stream water samples and detected invA sequences in eight samples that would be otherwise deemed Salmonella negative. Analysis of the non-target sequences in the false-negative reactions helped to identify primer dimer-like short sequences as the main cause of the false negativity. Together, the results demonstrated that the PCR-NGS approach can significantly improve method sensitivity, correct false-negative detections, and enable sequence-based analysis for failure diagnostics in complex environmental samples. Copyright © 2018 Elsevier B.V. All rights reserved.
Castro-Prieto, Aines; Wachter, Bettina; Melzheimer, Joerg; Thalwitzer, Susanne; Sommer, Simone
2011-01-01
The genes of the major histocompatibility complex (MHC) are a key component of the mammalian immune system and have become important molecular markers for fitness-related genetic variation in wildlife populations. Currently, no information about the MHC sequence variation and constitution in African leopards exists. In this study, we isolated and characterized genetic variation at the adaptively most important region of MHC class I and MHC class II-DRB genes in 25 free-ranging African leopards from Namibia and investigated the mechanisms that generate and maintain MHC polymorphism in the species. Using single-stranded conformation polymorphism analysis and direct sequencing, we detected 6 MHC class I and 6 MHC class II-DRB sequences, which likely correspond to at least 3 MHC class I and 3 MHC class II-DRB loci. Amino acid sequence variation in both MHC classes was higher or similar in comparison to other reported felids. We found signatures of positive selection shaping the diversity of MHC class I and MHC class II-DRB loci during the evolutionary history of the species. A comparison of MHC class I and MHC class II-DRB sequences of the leopard to those of other felids revealed a trans-species mode of evolution. In addition, the evolutionary relationships of MHC class II-DRB sequences between African and Asian leopard subspecies are discussed.
Defining the ABC of gene essentiality in streptococci.
Charbonneau, Amelia R L; Forman, Oliver P; Cain, Amy K; Newland, Graham; Robinson, Carl; Boursnell, Mike; Parkhill, Julian; Leigh, James A; Maskell, Duncan J; Waller, Andrew S
2017-05-31
Utilising next generation sequencing to interrogate saturated bacterial mutant libraries provides unprecedented information for the assignment of genome-wide gene essentiality. Exposure of saturated mutant libraries to specific conditions and subsequent sequencing can be exploited to uncover gene essentiality relevant to the condition. Here we present a barcoded transposon directed insertion-site sequencing (TraDIS) system to define an essential gene list for Streptococcus equi subsp. equi, the causative agent of strangles in horses, for the first time. The gene essentiality data for this group C Streptococcus was compared to that of group A and B streptococci. Six barcoded variants of pGh9:ISS1 were designed and used to generate mutant libraries containing between 33,000-66,000 unique mutants. TraDIS was performed on DNA extracted from each library and data were analysed separately and as a combined master pool. Gene essentiality determined that 19.5% of the S. equi genome was essential. Gene essentialities were compared to those of group A and group B streptococci, identifying concordances of 90.2% and 89.4%, respectively and an overall concordance of 83.7% between the three species. The use of barcoded pGh9:ISS1 to generate mutant libraries provides a highly useful tool for the assignment of gene function in S. equi and other streptococci. The shared essential gene set of group A, B and C streptococci provides further evidence of the close genetic relationships between these important pathogenic bacteria. Therefore, the ABC of gene essentiality reported here provides a solid foundation towards reporting the functional genome of streptococci.
Gene calling and bacterial genome annotation with BG7.
Tobes, Raquel; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Kovach, Evdokim; Alekhin, Alexey; Pareja, Eduardo
2015-01-01
New massive sequencing technologies are providing many bacterial genome sequences from diverse taxa but a refined annotation of these genomes is crucial for obtaining scientific findings and new knowledge. Thus, bacterial genome annotation has emerged as a key point to investigate in bacteria. Any efficient tool designed specifically to annotate bacterial genomes sequenced with massively parallel technologies has to consider the specific features of bacterial genomes (absence of introns and scarcity of nonprotein-coding sequence) and of next-generation sequencing (NGS) technologies (presence of errors and not perfectly assembled genomes). These features make it convenient to focus on coding regions and, hence, on protein sequences that are the elements directly related with biological functions. In this chapter we describe how to annotate bacterial genomes with BG7, an open-source tool based on a protein-centered gene calling/annotation paradigm. BG7 is specifically designed for the annotation of bacterial genomes sequenced with NGS. This tool is sequence error tolerant maintaining their capabilities for the annotation of highly fragmented genomes or for annotating mixed sequences coming from several genomes (as those obtained through metagenomics samples). BG7 has been designed with scalability as a requirement, with a computing infrastructure completely based on cloud computing (Amazon Web Services).
Molecular detection of Toxoplasma gondii in snakes.
Nasiri, Vahid; Teymurzadeh, Shohreh; Karimi, Gholamreza; Nasiri, Mehdi
2016-10-01
Toxoplasma gondii, an obligate intracellular protozoan parasite, is responsible for one of the most common zoonotic parasitic diseases in almost all warm-blooded vertebrates worldwide, and it is estimated that about one-third of the world human population is chronically infected with this parasite. Little is known about the circulation of T. gondii in snakes, and this study aimed for the first time to evaluate the infection rate of snakes with this parasite by PCR methods. The brains of 68 snakes that were collected between May 2012 and September 2015 and died in captivity, where they were kept for venom extraction, were examined for the presence of this parasite. DNA was extracted and a nested-PCR method was carried out with two pairs of primers to detect the 344 bp fragment of the T. gondii GRA6 gene. Five positive nested-PCR products were directly sequenced in the forward and reverse directions by Sequetech Company (Mountain View, CA). The T. gondii GRA6 gene was detected in 55 (80.88%) of 68 snake brains. Sequencing of the GRA6 gene revealed 98-100% similarity with T. gondii sequences deposited in GenBank. To our knowledge, this is the first study of molecular detection of T. gondii in snakes, and our findings show a high frequency of this organism among them. Copyright © 2016 Elsevier Inc. All rights reserved.
Liu, Wenjun; Yu, Jie; Sun, Zhihong; Song, Yuqin; Wang, Xueni; Wang, Hongmei; Wuren, Tuoya; Zha, Musu; Menghe, Bilige; Heping, Zhang
2016-01-01
Lactobacillus delbrueckii ssp. bulgaricus (L. bulgaricus) is well known for its worldwide application in yogurt production. Flavor production and acid producing are considered as the most important characteristics for starter culture screening. To our knowledge this is the first study applying functional gene sequence multilocus sequence typing technology to predict the fermentation and flavor-producing characteristics of yogurt-producing bacteria. In the present study, phenotypic characteristics of 35 L. bulgaricus strains were quantified during the fermentation of milk to yogurt and during its subsequent storage; these included fermentation time, acidification rate, pH, titratable acidity, and flavor characteristics (acetaldehyde concentration). Furthermore, multilocus sequence typing analysis of 7 functional genes associated with fermentation time, acid production, and flavor formation was done to elucidate the phylogeny and genetic evolution of the same L. bulgaricus isolates. The results showed that strains significantly differed in fermentation time, acidification rate, and acetaldehyde production. Combining functional gene sequence analysis with phenotypic characteristics demonstrated that groups of strains established using genotype data were consistent with groups identified based on their phenotypic traits. This study has established an efficient and rapid molecular genotyping method to identify strains with good fermentation traits; this has the potential to replace time-consuming conventional methods based on direct measurement of phenotypic traits. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Triplex technology in studies of DNA damage, DNA repair, and mutagenesis.
Mukherjee, Anirban; Vasquez, Karen M
2011-08-01
Triplex-forming oligonucleotides (TFOs) can bind to the major groove of homopurine-homopyrimidine stretches of double-stranded DNA in a sequence-specific manner through Hoogsteen hydrogen bonding to form DNA triplexes. TFOs by themselves or conjugated to reactive molecules can be used to direct sequence-specific DNA damage, which in turn results in the induction of several DNA metabolic activities. Triplex technology is highly utilized as a tool to study gene regulation, molecular mechanisms of DNA repair, recombination, and mutagenesis. In addition, TFO targeting of specific genes has been exploited in the development of therapeutic strategies to modulate DNA structure and function. In this review, we discuss advances made in studies of DNA damage, DNA repair, recombination, and mutagenesis by using triplex technology to target specific DNA sequences. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer
2014-10-07
The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap in the sequence UGAUGA (the initiation codon AUG occupies positions 3 to 5 of this combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact "nanogenome."
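The frame shift at the end of each round follows from simple arithmetic: 220 is not a multiple of 3, so a ribosome that keeps translating around the circle meets the first nucleotide at a different codon position on each pass. The sketch below only illustrates that arithmetic; the function name is hypothetical and it assumes uninterrupted translation with no termination.

```python
def codon_position_at_origin(genome_length: int, rounds: int) -> list[int]:
    """Codon position (0, 1 or 2) at which the circle's first nucleotide
    is read in each successive round of continuous translation."""
    return [(r * genome_length) % 3 for r in range(rounds)]


CIRCLE_NT = 220  # length of the circular RNA given in the abstract

# 220 leaves a remainder of 1 when divided by 3, so the frame shifts every round
print(CIRCLE_NT % 3)
# The original frame is only recovered after three full rounds (660 nt)
print(codon_position_at_origin(CIRCLE_NT, 4))
```

Three rounds cover 660 nt, which is a multiple of 3, so the fourth round starts in the same frame as the first.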
Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.
Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li
2015-10-16
The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by difficulties. One is ambiguous mapping of reads to reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models by all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives. 
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as a freely available software which can be found at https://github.com/PUGEA/NLDMseq.
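For reference, the "direct RPKM calculation" mentioned above (reads per kilobase of transcript per million mapped reads) can be written in a few lines. This is a minimal sketch of the standard formula, not part of NLDMseq; the function name and the example numbers are illustrative assumptions.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Implicitly assumes reads are distributed uniformly along the transcript,
    which is the assumption the abstract says is violated in practice.
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: 500 reads on a 2,000-bp gene in a 10-million-read library
print(rpkm(500, 2_000, 10_000_000))
```

Because the formula divides only by length and library size, any position- or sequence-dependent bias in read sampling passes straight through to the estimate.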
McCarthy, Alex J; Stabler, Richard A; Taylor, Peter W
2018-04-01
Escherichia coli K1 strains are major causative agents of invasive disease of newborn infants. The age dependency of infection can be reproduced in neonatal rats. Colonization of the small intestine following oral administration of K1 bacteria leads rapidly to invasion of the blood circulation; bacteria that avoid capture by the mesenteric lymphatic system and evade antibacterial mechanisms in the blood may disseminate to cause organ-specific infections such as meningitis. Some E. coli K1 surface constituents, in particular the polysialic acid capsule, are known to contribute to invasive potential, but a comprehensive picture of the factors that determine the fully virulent phenotype has not emerged so far. We constructed a library and constituent sublibraries of ∼775,000 Tn5 transposon mutants of E. coli K1 strain A192PP and employed transposon-directed insertion site sequencing (TraDIS) to identify genes required for fitness for infection of 2-day-old rats. Transposon insertions were lacking in 357 genes following recovery on selective agar; these genes were considered essential for growth in nutrient-replete medium. Colonization of the midsection of the small intestine was facilitated by 167 E. coli K1 gene products. Restricted bacterial translocation across epithelial barriers precluded TraDIS analysis of gut-to-blood and blood-to-brain transits; 97 genes were required for survival in human serum. This study revealed that a large number of bacterial genes, many of which were not previously associated with systemic E. coli K1 infection, are required to realize full invasive potential. IMPORTANCE Escherichia coli K1 strains cause life-threatening infections in newborn infants. They are acquired from the mother at birth and colonize the small intestine, from where they invade the blood and central nervous system. It is difficult to obtain information from acutely ill patients that sheds light on physiological and bacterial factors determining invasive disease.
Key aspects of naturally occurring age-dependent human infection can be reproduced in neonatal rats. Here, we employ transposon-directed insertion site sequencing to identify genes essential for the in vitro growth of E. coli K1 and genes that contribute to the colonization of susceptible rats. The presence of bottlenecks to invasion of the blood and cerebrospinal compartments precluded insertion site sequencing analysis, but we identified genes for survival in serum. Copyright © 2018 McCarthy et al.
PMID:29339415
Feng, Fan; Qi, Weiwei; Lv, Yuanda; Yan, Shumei; Xu, Liming; Yang, Wenyao; Yuan, Yue; Chen, Yihan
2018-01-01
Maize (Zea mays) endosperm is a primary tissue for nutrient storage and is highly differentiated during development. However, the regulatory networks of endosperm development and nutrient metabolism remain largely unknown. Maize opaque11 (o11) is a classic seed mutant with a small and opaque endosperm showing decreased starch and protein accumulation. We cloned O11 and found that it encodes an endosperm-specific bHLH transcription factor (TF). Loss of function of O11 significantly affected transcription of carbohydrate/amino acid metabolism and stress response genes. Genome-wide binding site analysis revealed 9885 O11 binding sites distributed over 6033 genes. Using chromatin immunoprecipitation sequencing (ChIP-seq) coupled with RNA sequencing (RNA-seq) assays, we identified 259 O11-modulated target genes. O11 was found to directly regulate key TFs in endosperm development (NKD2 and ZmDOF3) and nutrient metabolism (O2 and PBF). Moreover, O11 directly regulates cyPPDKs and multiple carbohydrate metabolic enzymes. O11 is an activator of ZmYoda, suggesting its regulatory function through the MAPK pathway in endosperm development. Many stress-response genes are also direct targets of O11. In addition, 11 O11-interacting proteins were identified, including ZmIce1, which coregulates stress response targets and ZmYoda with O11. Therefore, this study reveals an endosperm regulatory network centered around O11, which coordinates endosperm development, metabolism and stress responses. PMID:29436476
Andréasson, Claes; Schick, Anna J; Pfeiffer, Susanne M; Sarov, Mihail; Stewart, Francis; Wurst, Wolfgang; Schick, Joel A
2013-01-01
Efficient gene targeting in embryonic stem cells requires that modifying DNA sequences are identical to those in the targeted chromosomal locus. Yet, there is a paucity of isogenic genomic clones for human cell lines and PCR amplification cannot be used in many mutation-sensitive applications. Here, we describe a novel method for the direct cloning of genomic DNA into a targeting vector, pRTVIR, using oligonucleotide-directed homologous recombination in yeast. We demonstrate the applicability of the method by constructing functional targeting vectors for mammalian genes Uhrf1 and Gfap. Whereas the isogenic targeting of the gene Uhrf1 showed a substantial increase in targeting efficiency compared to non-isogenic DNA in mouse E14 cells, E14-derived DNA performed better than the isogenic DNA in JM8 cells for both Uhrf1 and Gfap. Analysis of 70 C57BL/6-derived targeting vectors electroporated in JM8 and E14 cell lines in parallel showed a clear dependence on isogenicity for targeting, but for three genes isogenic DNA was found to be inhibitory. In summary, this study provides a straightforward methodological approach for the direct generation of isogenic gene targeting vectors.
Simmon, Keith; Karaca, Dilek; Langeland, Nina; Wiker, Harald G.
2012-01-01
Broad-range amplification and sequencing of the bacterial 16S rRNA gene directly from clinical specimens are offered as a diagnostic service in many laboratories. One major pitfall is primer cross-reactivity with human DNA which will result in mixed chromatograms. Mixed chromatograms will complicate subsequent sequence analysis and impede identification. In SYBR green real-time PCR assays, it can also affect crossing threshold values and consequently the status of a specimen as positive or negative. We evaluated two conventional primer pairs in common use and a new primer pair based on the dual priming oligonucleotide (DPO) principle. Cross-reactivity was observed when both conventional primer pairs were used, resulting in interpretation difficulties. No cross-reactivity was observed using the DPOs even in specimens with a high ratio of human to bacterial DNA. In addition to reducing cross-reactivity, the DPO principle also offers a high degree of flexibility in the design of primers and should be considered for any PCR assay intended for detection and identification of pathogens directly from human clinical specimens. PMID:22278843
The Essential Genome of Escherichia coli K-12.
Goodall, Emily C A; Robinson, Ashley; Johnston, Iain G; Jabbari, Sara; Turner, Keith A; Cunningham, Adam F; Lund, Peter A; Cole, Jeffrey A; Henderson, Ian R
2018-02-20
Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. IMPORTANCE Incentives to define lists of genes that are essential for bacterial survival include the identification of potential targets for antibacterial drug development, genes required for rapid growth for exploitation in biotechnology, and discovery of new biochemical pathways. To identify essential genes in Escherichia coli, we constructed a transposon mutant library of unprecedented density. Initial automated analysis of the resulting data revealed many discrepancies compared to the literature.
We now report more extensive statistical analysis supported by both literature searches and detailed inspection of high-density TraDIS sequencing data for each putative essential gene for the E. coli model laboratory organism. This paper is important because it provides a better understanding of the essential genes of E. coli, reveals the limitations of relying on automated analysis alone, and provides a new standard for the analysis of TraDIS data. Copyright © 2018 Goodall et al.
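The gene-length correction mentioned above can be illustrated with a simple per-gene insertion density. The sketch below is an assumption-laden toy, not the statistical procedure used by the authors: the function names, the threshold, and the example genes are all hypothetical, and a real analysis would fit the threshold to the observed distribution of densities.

```python
def insertion_index(insertions: int, gene_length_bp: int) -> float:
    """Transposon insertions per base pair, so long and short genes are comparable."""
    return insertions / gene_length_bp


def candidate_essential(genes: dict[str, tuple[int, int]], threshold: float) -> list[str]:
    """Names of genes whose insertion density falls below a chosen threshold.

    genes maps a gene name to (insertion count, gene length in bp).
    """
    return sorted(
        name
        for name, (insertions, length) in genes.items()
        if insertion_index(insertions, length) < threshold
    )


# Hypothetical counts; the 0.01 threshold is for illustration only
toy_genes = {"geneA": (0, 900), "geneB": (120, 1200), "geneC": (2, 1500)}
print(candidate_essential(toy_genes, threshold=0.01))
```

A single density per gene cannot detect the short essential regions or orientation-dependent effects described in the abstract, which is one reason manual inspection of insertion profiles remains useful.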
Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji
2015-12-01
Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulation of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
Editing Transgenic DNA Components by Inducible Gene Replacement in Drosophila melanogaster
Lin, Chun-Chieh; Potter, Christopher J.
2016-01-01
Gene conversions occur when genomic double-strand DNA breaks (DSBs) trigger unidirectional transfer of genetic material from a homologous template sequence. Exogenous or mutated sequence can be introduced through this homology-directed repair (HDR). We leveraged gene conversion to develop a method for genomic editing of existing transgenic insertions in Drosophila melanogaster. The clustered regularly-interspaced palindromic repeats (CRISPR)/Cas9 system is used in the homology assisted CRISPR knock-in (HACK) method to induce DSBs in a GAL4 transgene, which is repaired by a single-genomic transgenic construct containing GAL4 homologous sequences flanking a T2A-QF2 cassette. With two crosses, this technique converts existing GAL4 lines, including enhancer traps, into functional QF2 expressing lines. We used HACK to convert the most commonly-used GAL4 lines (labeling tissues such as neurons, fat, glia, muscle, and hemocytes) to QF2 lines. We also identified regions of the genome that exhibited differential efficiencies of HDR. The HACK technique is robust and readily adaptable for targeting and replacement of other genomic sequences, and could be a useful approach to repurpose existing transgenes as new genetic reagents become available. PMID:27334272
Zhu, Qihui; Smith, Shavannor M; Ayele, Mulu; Yang, Lixing; Jogi, Ansuya; Chaluvadi, Srinivasa R; Bennetzen, Jeffrey L
2012-11-01
Tef (Eragrostis tef) is a major cereal crop in Ethiopia. Lodging is the primary constraint to increasing productivity in this allotetraploid species, accounting for losses of ∼15-45% in yield each year. As a first step toward identifying semi-dwarf varieties that might have improved lodging resistance, an ∼6× fosmid library was constructed and used to identify both homeologues of the dw3 semi-dwarfing gene of Sorghum bicolor. An EMS mutagenized population, consisting of ∼21,210 tef plants, was planted and leaf materials were collected into 23 superpools. Two dwarfing candidate genes, homeologues of dw3 of sorghum and rht1 of wheat, were sequenced directly from each superpool with 454 technology, and 120 candidate mutations were identified. Out of 10 candidates tested, six independent mutations were validated by Sanger sequencing, including two predicted detrimental mutations in both dw3 homeologues with a potential to improve lodging resistance in tef through further breeding. This study demonstrates that high-throughput sequencing can identify potentially valuable mutations in under-studied plant species like tef and has provided mutant lines that can now be combined and tested in breeding programs for improved lodging resistance.
Trinh, T. Q.; Sinden, R. R.
1993-01-01
We describe a system to measure the frequency of both deletions and duplications between direct repeats. Short 17- and 18-bp palindromic and nonpalindromic DNA sequences were cloned into the EcoRI site within the chloramphenicol acetyltransferase gene of plasmids pBR325 and pJT7. This creates an insert between direct repeated EcoRI sites and results in a chloramphenicol-sensitive phenotype. Selection for chloramphenicol resistance was utilized to select chloramphenicol resistant revertants that included those with precise deletion of the insert from plasmid pBR325 and duplication of the insert in plasmid pJT7. The frequency of deletion or duplication varied more than 500-fold depending on the sequence of the short sequence inserted into the EcoRI site. For the nonpalindromic inserts, multiple internal direct repeats and the length of the direct repeats appear to influence the frequency of deletion. Certain palindromic DNA sequences with the potential to form DNA hairpin structures that might stabilize the misalignment of direct repeats had a high frequency of deletion. Other DNA sequences with the potential to form structures that might destabilize misalignment of direct repeats had a very low frequency of deletion. Duplication mutations occurred at the highest frequency when the DNA between the direct repeats contained no direct or inverted repeats. The presence of inverted repeats dramatically reduced the frequency of duplications. The results support the slippage-misalignment model, suggesting that misalignment occurring during DNA replication leads to deletion and duplication mutations. The results also support the idea that the formation of DNA secondary structures during DNA replication can facilitate and direct specific mutagenic events. PMID:8325478
Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing
NASA Astrophysics Data System (ADS)
Chen, K.
2017-01-01
With the objective of identifying the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because of difficulties in culturing and isolating fecal coliforms on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available in the Ribosomal Database Project (RDP).
Tomar, Navneet; Mishra, Akhilesh; Mrinal, Nirotpal; Jayaram, B.
2016-01-01
Transcription factors (TFs) bind at multiple sites in the genome and regulate expression of many genes. Regulating TF binding in a gene-specific manner remains a formidable challenge in drug discovery because the same binding motif may be present at multiple locations in the genome. Here, we present Onco-Regulon (http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm), an integrated database of regulatory motifs of cancer genes combined with Unique Sequence-Predictor (USP), a software suite that identifies unique sequences for each of these regulatory DNA motifs at the specified position in the genome. USP works by extending a given DNA motif in the 5′→3′ direction, the 3′→5′ direction, or both, adding one nucleotide at each step, and calculates the frequency of each extended motif in the genome using the Frequency Counter program. This step is iterated until the frequency of the extended motif in the genome becomes unity. Thus, for each given motif, we get three possible unique sequences. The Closest Sequence Finder program predicts off-target drug binding in the genome. Inclusion of DNA-protein structural information further makes Onco-Regulon a highly informative repository for gene-specific drug development. We believe that Onco-Regulon will help researchers to design drugs that, in theory, bind to an exclusive site in the genome with no off-target effects. Database URL: http://www.scfbio-iitd.res.in/software/onco/NavSite/index.htm PMID:27515825
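The extension loop described for USP can be sketched as follows. This is a toy reimplementation under stated assumptions, not the Onco-Regulon code: the function names and the example sequence are hypothetical, only one strand is searched, and the genome is held in memory as a plain string.

```python
from typing import Optional


def count_occurrences(genome: str, motif: str) -> int:
    """Count occurrences of motif in genome, including overlapping ones."""
    count, pos = 0, genome.find(motif)
    while pos != -1:
        count += 1
        pos = genome.find(motif, pos + 1)
    return count


def extend_until_unique(genome: str, start: int, end: int,
                        direction: str = "3prime") -> Optional[str]:
    """Grow genome[start:end] one nucleotide per step until it occurs exactly once.

    direction is "5prime" (extend upstream), "3prime" (extend downstream) or "both".
    Returns the unique sequence, or None if the sequence edge is reached first.
    """
    while count_occurrences(genome, genome[start:end]) > 1:
        new_start = start - 1 if direction in ("5prime", "both") and start > 0 else start
        new_end = end + 1 if direction in ("3prime", "both") and end < len(genome) else end
        if (new_start, new_end) == (start, end):
            return None  # nothing left to add in the requested direction
        start, end = new_start, new_end
    return genome[start:end]


# Hypothetical sequence in which the motif ACGT occurs at positions 0 and 8
toy_genome = "ACGTTGCAACGTAGGCT"
for direction in ("3prime", "5prime", "both"):
    print(direction, extend_until_unique(toy_genome, 8, 12, direction))
```

Running the three directions on the same motif occurrence yields the three candidate unique sequences per motif that the abstract mentions.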
Pi, J; Wookey, P J; Pittard, A J
1991-01-01
The phenylalanine-specific permease gene (pheP) of Escherichia coli has been cloned and sequenced. The gene was isolated on a 6-kb Sau3AI fragment from a chromosomal library, and its presence was verified by complementation of a mutant lacking the functional phenylalanine-specific permease. Subcloning from this fragment localized the pheP gene on a 2.7-kb HindIII-HindII fragment. The nucleotide sequence of this 2.7-kb region was determined. An open reading frame was identified which extends from a putative start point of translation (GTG at position 636) to a termination signal (TAA at position 2010). The assignment of the GTG as the initiation codon was verified by site-directed mutagenesis of the initiation codon and by introducing a chain termination mutation into the pheP-lacZ fusion construct. A single initiation site of transcription 30 bp upstream of the start point of translation was identified by the primer extension analysis. The pheP structural gene consists of 1,374 nucleotides specifying a protein of 458 amino acid residues. The PheP protein is very hydrophobic (71% nonpolar residues). A topological model predicted from the sequence analysis defines 12 transmembrane segments. This protein is highly homologous with the AroP (general aromatic transport) system of E. coli (59.6% identity) and to a lesser extent with the yeast permeases CAN1 (arginine), PUT4 (proline), and HIP1 (histidine) of Saccharomyces cerevisiae. PMID:1711024
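The reported coordinates and sizes for pheP are mutually consistent, which a short calculation confirms. The sketch assumes that position 636 is the first nucleotide of the GTG codon and position 2010 is the first nucleotide of the TAA codon; the abstract does not state this explicitly.

```python
GTG_START = 636   # assumed: first nucleotide of the initiation codon
TAA_START = 2010  # assumed: first nucleotide of the termination codon

# The stop codon is not translated, so the coding region ends at position 2009
coding_nt = TAA_START - GTG_START
codons = coding_nt // 3

print(coding_nt, codons, coding_nt % 3)
```

Under those assumptions the coding region is 1,374 nt, a whole number of codons, matching the 458-residue protein given in the abstract.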
Akao, Takeshi; Gomi, Katsuya; Goto, Kuniyasu; Okazaki, Naoto; Akita, Osamu
2002-07-01
In solid-state cultures (SC), Aspergillus oryzae shows characteristics such as high-level production and secretion of enzymes and hyphal differentiation with asexual development which are absent in liquid (submerged) culture (LC). It was predicted that many of the genes involved in the characteristics of A. oryzae in SC are differentially expressed between SC and LC. We generated two subtracted cDNA libraries with bi-directional cDNA subtractive hybridizations to isolate and identify such genes. Among them, we identified genes upregulated in or specific to SC, such as the AOS (A. oryzae SC-specific gene) series, and those downregulated or not expressed in SC, such as the AOL (A. oryzae LC-specific) series. Sequencing analyses revealed that the AOS series and the AOL series contain genes encoding extra- and intracellular enzymes and transport proteins. However, half were functionally unclassified by nucleotide sequences. Also, by expression profile, the AOS series comprised two groups. These gene products' molecular functions and physiological roles in SC await further investigation.
Molecular mechanism for the operation of nitrogen control in cyanobacteria.
Luque, I; Flores, E; Herrero, A
1994-01-01
In cyanobacteria, ammonium exerts a negative regulation of the expression of proteins involved in the assimilation of nitrogen sources alternative to ammonium. In Synechococcus, mRNA levels of genes encoding proteins for nitrate and ammonium assimilation were observed to be negatively regulated by ammonium, and ammonium-regulated transcription start points were identified for those genes. The NtcA protein is a positive regulator of genes subjected to nitrogen control by ammonium. Mutants lacking NtcA exhibited only basal mRNA levels of the regulated genes, even in the absence of ammonium, indicating that NtcA exerts its regulatory action by positively influencing mRNA levels of the nitrogen-regulated genes. NtcA was observed to bind directly to the promoters of nitrogen-regulated genes, and the palindromic DNA sequence GTAN8TAC was identified as a sequence signature for NtcA-target sites. The structure of the nitrogen-, NtcA-regulated promoters of Synechococcus was determined to be constituted by a -10, Pribnow-like box in the form TAN3T, and an NtcA-binding site that substituted for the canonical -35 box. PMID:8026471
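The two sequence signatures in this abstract translate directly into regular expressions, reading N8 and N3 as eight and three arbitrary nucleotides. The sketch below is illustrative only: the function name and test sequences are hypothetical, matches are non-overlapping, and only the given strand is scanned.

```python
import re

# GTAN8TAC: NtcA-binding signature; TAN3T: -10, Pribnow-like box
NTCA_SITE = re.compile(r"GTA[ACGT]{8}TAC")
MINUS10_BOX = re.compile(r"TA[ACGT]{3}T")


def find_sites(pattern: re.Pattern, seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical sequences constructed to contain one site each
print(find_sites(NTCA_SITE, "ccGTAACGTTGCATACgg"))
print(find_sites(MINUS10_BOX, "GGTAGCCTGG"))
```

The short TAN3T pattern will match frequently by chance in real sequence, so in practice it is only informative at the expected spacing from an NtcA site and a mapped transcription start point.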
RNAi triggered by symmetrically transcribed transgenes in Drosophila melanogaster.
Giordano, Ennio; Rendina, Rosaria; Peluso, Ivana; Furia, Maria
2002-01-01
Specific silencing of target genes can be induced in a variety of organisms by providing homologous double-stranded RNA molecules. In vivo, these molecules can be generated either by transcription of sequences having an inverted-repeat (IR) configuration or by simultaneous transcription of sense-antisense strands. Since IR constructs are difficult to prepare and can stimulate genomic rearrangements, we investigated the silencing potential of symmetrically transcribed sequences. We report that Drosophila transgenes whose sense-antisense transcription was driven by two convergent arrays of Gal4-dependent UAS sequences can induce specific, dominant, and heritable repression of target genes. This effect is not dependent on a mechanism based on homology-dependent DNA/DNA interactions, but is directly triggered by transcriptional activation and is accompanied by specific depletion of the endogenous target RNA. Tissue-specific induction of these transgenes restricts the target gene silencing to selected body domains, and spreading phenomena described in other cases of post-transcriptional gene silencing (PTGS) were not observed. In addition to providing an additional tool useful for Drosophila functional genomic analysis, these results add further strength to the view that events of sense-antisense transcription may readily account for some, if not all, PTGS-cosuppression phenomena and can potentially play a relevant role in gene regulation. PMID:11861567
Genetic Perturbation of the Maize Methylome
Li, Qing; Hermanson, Peter J.; Zaunbrecher, Virginia M.; Song, Jawon; Wendt, Jennifer; Rosenbaum, Heidi; Madzima, Thelma F.; Sloan, Amy E.; Huang, Ji; Burgess, Daniel L.; Richmond, Todd A.; McGinnis, Karen M.; Meeley, Robert B.; Danilevskaya, Olga N.; Vaughn, Matthew W.; Kaeppler, Shawn M.; Jeddeloh, Jeffrey A.
2014-01-01
DNA methylation can play important roles in the regulation of transposable elements and genes. A collection of mutant alleles for 11 maize (Zea mays) genes predicted to play roles in controlling DNA methylation were isolated through forward- or reverse-genetic approaches. Low-coverage whole-genome bisulfite sequencing and high-coverage sequence-capture bisulfite sequencing were applied to mutant lines to determine context- and locus-specific effects of these mutations on DNA methylation profiles. Plants containing mutant alleles for components of the RNA-directed DNA methylation pathway exhibit loss of CHH methylation at many loci as well as CG and CHG methylation at a small number of loci. Plants containing loss-of-function alleles for chromomethylase (CMT) genes exhibit strong genome-wide reductions in CHG methylation and some locus-specific loss of CHH methylation. In an attempt to identify stocks with stronger reductions in DNA methylation levels than provided by single gene mutations, we performed crosses to create double mutants for the maize CMT3 orthologs, Zmet2 and Zmet5, and for the maize DDM1 orthologs, Chr101 and Chr106. While loss-of-function alleles are viable as single gene mutants, the double mutants were not recovered, suggesting that severe perturbations of the maize methylome may have stronger deleterious phenotypic effects than in Arabidopsis thaliana. PMID:25527708
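The three sequence contexts referred to above are defined by the two bases that follow a cytosine on the same strand, where H stands for A, C, or T. A minimal classifier, with a hypothetical function name and example sequence, might look like this:

```python
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i of a 5'->3' strand as CG, CHG or CHH.

    Returns None if seq[i] is not C, if too few downstream bases are available,
    or if a downstream base is ambiguous (not A, C, G or T).
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    if any(base not in "ACGT" for base in downstream):
        return None
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) < 2:
        return None
    return "CHG" if downstream[1] == "G" else "CHH"


toy = "ACGTCAGTCTA"  # hypothetical sequence with one cytosine in each context
print([(i, cytosine_context(toy, i)) for i, base in enumerate(toy) if base == "C"])
```

Cytosines on the opposite strand are classified the same way after taking the reverse complement of the sequence.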
Inaba, Takehito; Nagano, Yukio; Sakakibara, Toshihiro; Sasaki, Yukiko
1999-01-01
The pra2 gene encodes a pea (Pisum sativum) small GTPase belonging to the YPT/rab family, and its expression is down-regulated by light, mediated by phytochrome. We have isolated and characterized a genomic clone of this gene and constructed a fusion DNA of its 5′-upstream region in front of the gene for firefly luciferase. Using this construct in a transient assay, we determined a pra2 cis-regulatory region sufficient to direct the light down-regulation of the luciferase reporter gene. Both 5′- and internal deletion analyses revealed that the 93-bp sequence between −734 and −642 from the transcriptional start site was important for phytochrome down-regulation. Gain-of-function analysis showed that this 93-bp region could confer light down-regulation when fused to the cauliflower mosaic virus 35S promoter. Furthermore, linker-scanning analysis showed that a 12-bp sequence within the 93-bp region mediated phytochrome down-regulation. Gel-retardation analysis showed the presence of a nuclear factor that was specifically bound to the 12-bp sequence in vitro. These results indicate that this element is a cis-regulatory element involved in phytochrome down-regulated expression. PMID:10364400
van der Leij, F R; Visser, R G; Ponstein, A S; Jacobsen, E; Feenstra, W J
1991-08-01
The genomic sequence of the potato gene for starch granule-bound starch synthase (GBSS; "waxy protein") has been determined for the wild-type allele of a monoploid genotype from which an amylose-free (amf) mutant was derived, and for the mutant part of the amf allele. Comparison of the wild-type sequence with a cDNA sequence from the literature and a newly isolated cDNA revealed the presence of 13 introns, the first of which is located in the untranslated leader. The promoter contains a G-box-like sequence. The deduced amino acid sequence of the precursor of GBSS shows a high degree of identity with monocot waxy protein sequences in the region corresponding to the mature form of the enzyme. The transit peptide of 77 amino acids, required for routing of the precursor to the plastids, shows much less identity with the transit peptides of the other waxy preproteins, but resembles the hydropathic distributions of these peptides. Alignment of the amino acid sequences of the four mature starch synthases with the Escherichia coli glgA gene product revealed the presence of at least three conserved boxes; there is no homology with previously proposed starch-binding domains of other enzymes involved in starch metabolism. We report the use of chimeric constructs with wild-type and amf sequences to localize, via complementation experiments, the region of the amf allele in which the mutation resides. Direct sequencing of polymerase chain reaction products confirmed that the amf mutation is a deletion of a single AT basepair in the region coding for the transit peptide.(ABSTRACT TRUNCATED AT 250 WORDS)
Specific c-Jun target genes in malignant melanoma.
Schummer, Patrick; Kuphal, Silke; Vardimon, Lily; Bosserhoff, Anja K; Kappelmann, Melanie
2016-05-03
A fundamental event in the development and progression of malignant melanoma is the de-regulation of cancer-relevant transcription factors. We recently showed that c-Jun is a main regulator of melanoma progression and, thus, is the most important member of the AP-1 transcription factor family in this disease. Surprisingly, no cancer-related specific c-Jun target genes in melanoma were described in the literature, so far. Therefore, we focused on pre-existing ChIP-Seq data (Encyclopedia of DNA Elements) of 3 different non-melanoma cell lines to screen direct c-Jun target genes. Here, a specific c-Jun antibody to immunoprecipitate the associated promoter DNA was used. Consequently, we identified 44 direct c-Jun targets and a detailed analysis of 6 selected genes confirmed their deregulation in malignant melanoma. The identified genes were differentially regulated comparing 4 melanoma cell lines and normal human melanocytes and we confirmed their c-Jun dependency. Direct interaction between c-Jun and the promoter/enhancer regions of the identified genes was confirmed by us via ChIP experiments. Interestingly, we revealed that the direct regulation of target gene expression via c-Jun can be independent of the existence of the classical AP-1 (5´-TGA(C/G)TCA-3´) consensus sequence allowing for the subsequent down- or up-regulation of the expression of these cancer-relevant genes. In summary, the results of this study indicate that c-Jun plays a crucial role in the development and progression of malignant melanoma via direct regulation of cancer-relevant target genes and that inhibition of direct c-Jun targets through inhibition of c-Jun is a potential novel therapeutic option for treatment of malignant melanoma.
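As a minimal sketch of the consensus notation used above, the classical AP-1 site 5'-TGA(C/G)TCA-3' can be written as the regular expression `TGA[CG]TCA`. The Python function below scans one strand of a sequence for that motif; the name `find_ap1_sites` and the example sequence are illustrative assumptions, not taken from the study. Because the reverse complement of TGACTCA is TGAGTCA (and vice versa), scanning a single strand is enough to find sites on both strands.

```python
import re

# Classical AP-1 consensus: TGA, then C or G, then TCA.
AP1_PATTERN = re.compile(r"TGA[CG]TCA")


def find_ap1_sites(sequence):
    """Return (0-based start, matched site) for each AP-1 consensus hit.

    The input is upper-cased first, so lower-case sequence is accepted.
    """
    seq = sequence.upper()
    return [(m.start(), m.group()) for m in AP1_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical promoter fragment carrying both variants of the site.
    example = "AATGACTCAGGTGAGTCATT"
    for start, site in find_ap1_sites(example):
        print(start, site)
```

A promoter with no hits under this scan would be a candidate for the consensus-independent regulation the abstract describes, although a real analysis would also need to consider degenerate or non-canonical sites.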
Rong, Weining; Chen, Xuejuan; Li, Huiping; Liu, Yani; Sheng, Xunlun
2014-06-01
To detect the disease-causing genes in 10 retinitis pigmentosa (RP) pedigrees by using an exon combined target region capture sequencing chip. Pedigree investigation study. From October 2010 to December 2013, 10 RP pedigrees were recruited for this study at Ningxia Eye Hospital. All patients and family members received complete ophthalmic examinations. DNA was extracted from patients, family members and controls. An exon combined target region capture sequencing chip was used to screen for candidate disease-causing mutations. Polymerase chain reaction (PCR) and direct sequencing were used to confirm the disease-causing mutations. Seventy patients and 23 normal family members were recruited from the 10 pedigrees. Among the 10 RP pedigrees, 1 was autosomal dominant and 9 were autosomal recessive. Seven mutations involving 5 genes were detected in 5 pedigrees. A frameshift mutation in the BBS7 gene was detected in pedigree No. 2; the patients in this pedigree also had central obesity, polydactyly and mental handicap, and the pedigree was finally diagnosed with Bardet-Biedl syndrome. A missense mutation was detected in each of pedigrees No. 7 and No. 10; because the patients also suffered from deafness, the final diagnosis was Usher syndrome. A missense mutation in the C3 gene, which is related to age-related macular degeneration, was also detected in pedigree No. 7. A nonsense mutation and a missense mutation in the CRB1 gene were detected in pedigree No. 1, and a splice-site mutation in the PROM1 gene was detected in pedigree No. 5. Retinitis pigmentosa is a genetic eye disease with diverse clinical phenotypes. Rapid and effective genetic diagnosis technology, combined with analysis of clinical characteristics, helps to improve the clinical diagnosis of RP.
2014-01-01
Background: Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. (‘intermediate form’) is unclear. Methods: Single specimens inferred to represent Fasciola sp. (‘intermediate form’; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). Results: The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. Conclusions: The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries. PMID:24685294
Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães
2010-01-01
Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545
Liu, Guo-Hua; Gasser, Robin B; Young, Neil D; Song, Hui-Qun; Ai, Lin; Zhu, Xing-Quan
2014-03-31
Fascioliasis is an important and neglected disease of humans and other mammals, caused by trematodes of the genus Fasciola. Fasciola hepatica and F. gigantica are valid species that infect humans and animals, but the specific status of Fasciola sp. ('intermediate form') is unclear. Single specimens inferred to represent Fasciola sp. ('intermediate form'; Heilongjiang) and F. gigantica (Guangxi) from China were genetically identified and characterized using PCR-based sequencing of the first and second internal transcribed spacer regions of nuclear ribosomal DNA. The complete mitochondrial (mt) genomes of these representative specimens were then sequenced. The relationships of these specimens with selected members of the Trematoda were assessed by phylogenetic analysis of concatenated amino acid sequence datasets by Bayesian inference (BI). The complete mt genomes of representatives of Fasciola sp. and F. gigantica were 14,453 bp and 14,478 bp in size, respectively. Both mt genomes contain 12 protein-coding genes, 22 transfer RNA genes and two ribosomal RNA genes, but lack an atp8 gene. All protein-coding genes are transcribed in the same direction, and the gene order in both mt genomes is the same as that published for F. hepatica. Phylogenetic analysis of the concatenated amino acid sequence data for all 12 protein-coding genes showed that the specimen of Fasciola sp. was more closely related to F. gigantica than to F. hepatica. The mt genomes characterized here provide a rich source of markers, which can be used in combination with nuclear markers and imaging techniques, for future comparative studies of the biology of Fasciola sp. from China and other countries.
Bian, Yue-Hong; Xu, Cheng; Li, Junling; Xu, Jin; Zhang, Hongwei; Du, Shao Jun
2011-08-01
Hemojuvelin, also known as RGMc, is encoded by the hfe2 gene and plays an important role in iron homeostasis. hfe2 is specifically expressed in the notochord, developing somites and skeletal muscles during development. The molecular regulation of hfe2 expression is, however, not clear. We report here the characterization of hfe2 gene expression and the regulation of its tissue-specific expression in zebrafish embryos. We demonstrated that the 6 kb 5'-flanking sequence upstream of the ATG start codon in the zebrafish hfe2 gene could direct specific GFP expression in the notochord, somites, and skeletal muscle of zebrafish embryos, recapitulating the expression pattern of the endogenous gene. However, the Tg(hfe2:gfp) transgene is also expressed in the liver of fish embryos, which does not mimic the expression of endogenous hfe2 at the early stage. Nevertheless, the Tg(hfe2:gfp) transgenic zebrafish provides a useful model to study liver development. Treating Tg(hfe2:gfp) transgenic zebrafish embryos with valproic acid, a liver development inhibitor, significantly inhibited GFP expression in zebrafish. Together, these data indicate that the tissue-specific expression of hfe2 in the notochord, somites and muscles is regulated by regulatory elements within the 6 kb 5'-flanking sequence of the hfe2 gene. Moreover, the Tg(hfe2:gfp) transgenic zebrafish line provides a useful model system for analyzing liver development in zebrafish.
Ran, Yidong; Patron, Nicola; Kay, Pippa; Wong, Debbie; Buchanan, Margaret; Cao, Ying-Ying; Sawbridge, Tim; Davies, John P; Mason, John; Webb, Steven R; Spangenberg, German; Ainley, William M; Walsh, Terence A; Hayden, Matthew J
2018-05-07
Sequence-specific nucleases have been used to engineer targeted genome modifications in various plants. While targeted gene knockouts resulting in loss of function have been reported with relatively high rates of success, targeted gene editing using an exogenously supplied DNA repair template and site-specific transgene integration has been more challenging. Here, we report the first application of zinc finger nuclease (ZFN)-mediated, nonhomologous end-joining (NHEJ)-directed editing of a native gene in allohexaploid bread wheat to introduce, via a supplied DNA repair template, a specific single amino acid change into the coding sequence of acetohydroxyacid synthase (AHAS) to confer resistance to imidazolinone herbicides. We recovered edited wheat plants having the targeted amino acid modification in one or more AHAS homoalleles via direct selection for resistance to imazamox, an AHAS-inhibiting imidazolinone herbicide. Using a cotransformation strategy based on chemical selection for an exogenous marker, we achieved a 1.2% recovery rate of edited plants having the desired amino acid change and a 2.9% recovery of plants with targeted mutations at the AHAS locus resulting in a loss-of-function gene knockout. The latter results demonstrate a broadly applicable approach to introduce targeted modifications into native genes for nonselectable traits. All ZFN-mediated changes were faithfully transmitted to the next generation. © 2018 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Groves, Ryan A.; Hagel, Jillian M.; Zhang, Ye; Kilpatrick, Korey; Levy, Asaf; Marsolais, Frédéric; Lewinsohn, Efraim; Sensen, Christoph W.; Facchini, Peter J.
2015-01-01
Amphetamine analogues are produced by plants in the genus Ephedra and by khat (Catha edulis), and include the widely used decongestants and appetite suppressants (1S,2S)-pseudoephedrine and (1R,2S)-ephedrine. The production of these metabolites, which derive from L-phenylalanine, involves a multi-step pathway partially mapped out at the biochemical level using knowledge of benzoic acid metabolism established in other plants, and direct evidence using khat and Ephedra species as model systems. Despite the commercial importance of amphetamine-type alkaloids, only a single step in their biosynthesis has been elucidated at the molecular level. We have employed Illumina next-generation sequencing technology, paired with Trinity and Velvet-Oases assembly platforms, to establish data-mining frameworks for Ephedra sinica and khat plants. Sequence libraries representing a combined 200,000 unigenes were subjected to an annotation pipeline involving direct searches against public databases. Annotations included the assignment of Gene Ontology (GO) terms used to allocate unigenes to functional categories. As part of our functional genomics program aimed at novel gene discovery, the databases were mined for enzyme candidates putatively involved in alkaloid biosynthesis. Queries used for mining included enzymes with established roles in benzoic acid metabolism, as well as enzymes catalyzing reactions similar to those predicted for amphetamine alkaloid metabolism. Gene candidates were evaluated based on phylogenetic relationships, FPKM-based expression data, and mechanistic considerations. Establishment of expansive sequence resources is a critical step toward pathway characterization, a goal with both academic and industrial implications. PMID:25806807
Wawrzyńska, Anna; Lewandowska, Małgorzata; Sirko, Agnieszka
2010-03-01
Sulphur deficiency severely affects plant growth and agricultural productivity, leading to diverse changes in development and metabolism. Molecular mechanisms regulating gene expression under low sulphur conditions remain largely unknown. AtSLIM1, a member of the EIN3-like (EIL) family, was reported to be a central transcriptional regulator of the plant sulphur response; however, no direct interaction of this protein with any sulphur-responsive promoters was demonstrated. The focus of this study was on the analysis of a promoter region of UP9C, a tobacco gene strongly induced by sulphur limitation. Cloning and subsequent examination of this promoter resulted in the identification of a 20-nt sequence (UPE-box), also present in the promoters of several Arabidopsis genes, including three out of four homologues of UP9C. The UPE-box, consisting of two parallel tebs sequences (TEIL binding site) and of a 5-nt conserved sequence at the 3'-end, proved to be necessary to bind the transcription factors belonging to the EIL family. The yeast one-hybrid analysis resulted in the identification of one transcription factor (NtEIL2) capable of binding to the UPE-box. The interactions of NtEIL2, and its homologue from Arabidopsis, AtSLIM1, with DNA were affected by mutations within the UPE-box. Transient expression assays in Nicotiana benthamiana have further shown that both factors, NtEIL2 and AtSLIM1, activate the UP9C promoter. Interestingly, activation by NtEIL2, but not by AtSLIM1, was dependent on the sulphur deficiency of the plants.
Taniguchi, H; Ohta, H; Ogawa, M; Mizuguchi, Y
1985-05-01
Two hemolysin genes of Vibrio parahaemolyticus WP1, a thermostable direct (TSD) hemolysin gene and a thermolabile hemolysin gene, were cloned into the pBR322 vector in Escherichia coli K-12 C600. A large amount of the TSD hemolysin produced in E. coli K-12 accumulated in the periplasmic space. The TSD hemolysin gene was localized on a 0.9-kilobase HindIII-BamHI fragment by identifying qualitatively the production of the TSD hemolysin by a reverse passive hemagglutination assay in the osmotic shock fluid. The thermolabile hemolysin gene was isolated on a 1.3-kilobase HindIII-PstI fragment by selection with the hemolysin on blood agar. Southern blot hybridization and colony hybridization experiments indicated that the TSD hemolysin gene was present in the chromosomal DNA of 15 Kanagawa phenomenon-positive strains but not in 14 negative strains, whereas the thermolabile hemolysin gene was detected in all strains. No homologous DNA sequences to TSD and thermolabile hemolysin genes were detected in the chromosomes of Vibrio cholerae, Vibrio vulnificus, non-O1 V. cholerae, and Vibrio anguillarum.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharma, V.; Bonnycastle, L.; Poorkai, P.
1994-09-01
We have constructed a yeast artificial chromosome (YAC) contig of chromosome 14q24.3 which encompasses the chromosome 14 Alzheimer's disease locus (AD3). Determined by linkage analysis of early-onset Alzheimer's disease kindreds, this interval is bounded by the genetic markers D14S61-D14S63 and spans approximately 15 centimorgans. The contig consists of 29 markers and 74 YACs of which 57 are defined by one or more sequence tagged sites (STSs). The STS markers comprise 5 genes, 16 short tandem repeat polymorphisms and 8 cDNA clones. An additional number of genes, expressed sequence tags and cDNA fragments have been identified and localized to the contig by hybridization and sequence analysis of anonymous clones isolated by cDNA direct selection techniques. A minimal contig of about 15 YACs averaging 0.5-1.5 megabase in length will span this interval and is, at first approximation, in rough agreement with the genetic map. For two regions of the contig, our coverage has relied on L1/THE fingerprint and Alu-PCR hybridization data of YACs provided by CEPH/Genethon. We are currently developing sequence tagged sites from these to confirm the overlaps revealed by the fingerprint data. Among the genes which map to the contig are transforming growth factor beta 3, c-fos, and heat shock protein 2A (HSPA2). C-fos is not a candidate gene for AD3 based on the sequence analysis of affected and unaffected individuals. HSPA2 maps to the proximal edge of the contig and Calmodulin 1, a candidate gene from 4q24.3, maps outside of the region. The YAC contig is a framework physical map from which cosmid or P1 clone contigs can be constructed. As more genes and cDNAs are mapped, a highly resolved transcription map will emerge, a necessary step towards positionally cloning the AD3 gene.
Gene Function Analysis in the Ubiquitous Human Commensal and Pathogen Malassezia Genus.
Ianiri, Giuseppe; Averette, Anna F; Kingsbury, Joanne M; Heitman, Joseph; Idnurm, Alexander
2016-11-29
The genus Malassezia includes 14 species that are found on the skin of humans and animals and are associated with a number of diseases. Recent genome sequencing projects have defined the gene content of all 14 species; however, to date, genetic manipulation has not been possible for any species within this genus. Here, we develop and then optimize molecular tools for the transformation of Malassezia furfur and Malassezia sympodialis using Agrobacterium tumefaciens delivery of transfer DNA (T-DNA) molecules. These T-DNAs can insert randomly into the genome. In the case of M. furfur, targeted gene replacements were also achieved via homologous recombination, enabling deletion of the ADE2 gene for purine biosynthesis and of the LAC2 gene predicted to be involved in melanin biosynthesis. Hence, the introduction of exogenous DNA and direct gene manipulation are feasible in Malassezia species. Species in the genus Malassezia are a defining component of the microbiome of the surface of mammals. They are also associated with a wide range of skin disease symptoms. Many species are difficult to culture in vitro, and although genome sequences are available for the species in this genus, it has not been possible to assess gene function to date. In this study, we pursued a series of possible transformation methods and identified one that allows the introduction of DNA into two species of Malassezia, including the ability to make targeted integrations into the genome such that genes can be deleted. This research opens a new direction in terms of now being able to analyze gene functions in this little understood genus. These tools will contribute to define the mechanisms that lead to the commensalism and pathogenicity in this group of obligate fungi that are predominant on the skin of mammals. Copyright © 2016 Ianiri et al.
Subclinical hyperthyroidism due to a thyrotropin receptor (TSHR) gene mutation (S505R).
Pohlenz, Joachim; Pfarr, Nicole; Krüger, Silvia; Hesse, Volker
2006-12-01
To identify the molecular defect by which non-autoimmune subclinical hyperthyroidism was caused in a 6-mo-old infant who presented with weight loss. Congenital non-autoimmune hyperthyroidism is caused by activating germline mutations in the thyrotropin receptor (TSHR) gene. Therefore, the TSHR gene was sequenced directly from the patient's genomic DNA. Molecular analysis revealed a heterozygous point mutation (S505R) in the TSHR gene as the underlying defect. A constitutively activating mutation in the TSHR gene has to be considered not only in patients with severe congenital non-autoimmune hyperthyroidism, but also in children with subclinical non-autoimmune hyperthyroidism.
Biedler, James K; Tu, Zhijian
2010-07-08
The maternal zygotic transition marks the time at which transcription from the zygotic genome is initiated and a subset of maternal RNAs are progressively degraded in the developing embryo. A number of early zygotic genes have been identified in Drosophila melanogaster and comparisons to sequenced mosquito genomes suggest that some of these early zygotic genes such as bottleneck are fast-evolving or subject to turnover in dipteran insects. One objective of this study is to identify early zygotic genes from the yellow fever mosquito Aedes aegypti to study their evolution. We are also interested in obtaining early zygotic promoters that will direct transgene expression in the early embryo as part of a Medea gene drive system. Two novel early zygotic kinesin light chain genes we call AaKLC2.1 and AaKLC2.2 were identified by transcriptome sequencing of Aedes aegypti embryos at various time points. These two genes have 98% nucleotide and amino acid identity in their coding regions and show transcription confined to the early zygotic stage according to gene-specific RT-PCR analysis. These AaKLC2 genes have a paralogous gene (AaKLC1) in Ae. aegypti. Phylogenetic inference shows that an ortholog to the AaKLC2 genes is only found in the sequenced genome of Culex quinquefasciatus. In contrast, AaKLC1 gene orthologs are found in all three sequenced mosquito species including Anopheles gambiae. There is only one KLC gene in D. melanogaster and other sequenced holometabolous insects that appears to be similar to AaKLC1. Unlike AaKLC2, AaKLC1 is expressed in all life stages and tissues tested, which is consistent with the expression pattern of the An. gambiae and D. melanogaster KLC genes. Phylogenetic inference also suggests that AaKLC2 genes and their likely C. quinquefasciatus ortholog are fast-evolving genes relative to the highly conserved AaKLC1-like paralogs. 
Embryonic injection of a luciferase reporter under the control of a 1 kb fragment upstream of the AaKLC2.1 start codon shows promoter activity at least as early as 3 hours in the developing Ae. aegypti embryo. The AaKLC2.1 promoter activity reached ~1600 fold over the negative control at 5 hr after egg deposition. Transcriptome profiling by use of high throughput sequencing technologies has proven to be a valuable method for the identification and discovery of early and transient zygotic genes. The evolutionary investigation of the KLC gene family reveals that duplication is a source for the evolution of new genes that play a role in the dynamic process of early embryonic development. AaKLC2.1 may provide a promoter for early zygotic-specific transgene expression, which is a key component of the Medea gene drive system.
Cardaioli, Elena; Mignarri, Andrea; Cantisani, Teresa Anna; Malandrini, Alessandro; Nesti, Claudia; Rubegni, Anna; Funel, Niccola; Federico, Antonio; Santorelli, Filippo Maria; Dotti, Maria Teresa
2018-06-02
We sequenced the mitochondrial genome from a 40-year-old woman with myoclonus epilepsy, retinitis pigmentosa, leukoencephalopathy and cerebral calcifications. Histological and biochemical features of mitochondrial respiratory chain dysfunction were present. Direct sequencing showed a novel heteroplasmic mutation at nucleotide 5513 in the MT-TW gene that encodes tRNA(Trp). Restriction Fragment Length Polymorphism analysis confirmed that about 80% of muscle mtDNA harboured the mutation, while it was present at lower percentages in mtDNA from other tissues. The mutation is predicted to disrupt a highly conserved base pair within the aminoacyl acceptor stem of the tRNA. This is the 17th mutation in the MT-TW gene and expands the known causes of late-onset mitochondrial diseases. Copyright © 2018 Elsevier Inc. All rights reserved.
Sharma, Mukul; Vedithi, Sundeep Chaitanya; Das, Madhusmita; Roy, Anindya; Ebenezer, Mannam
2017-01-01
Survival of Mycobacterium leprae, the causative bacteria for leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. The genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%), 11 hypothetical proteins (18%), and 14 pseudogenes (23%). All these genes have homologs in M. tuberculosis and 49 (80.32%) in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae which was obtained from the signal intensities of 60-bp probes, tiling the entire genome with 10-bp overlaps. It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16S rRNA). The mRNA expression levels of a representative set of seven genes (four annotated and three hypothetical protein coding genes) were analyzed using quantitative Polymerase Chain Reaction (qPCR) assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the direct repair pathway. 
This study provided preliminary information on the potential DNA repair pathways that are extant in M. leprae and the associated genes.
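The gene counts reported above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `percent` and the dictionary labels are assumptions, and the counts are copied from the abstract. It confirms that the three classes add up to 61 and recomputes each share. Note that 49/61 comes out at 80.33% when rounded to two decimals, so the 80.32% in the abstract appears to be truncated rather than rounded.

```python
def percent(part, whole, digits=0):
    """Share of `part` in `whole`, as a percentage rounded to `digits` places."""
    return round(100 * part / whole, digits)


# Counts as stated in the abstract.
TOTAL_IDENTIFIED = 61
classes = {
    "characterized gene products": 36,
    "hypothetical proteins": 11,
    "pseudogenes": 14,
}
E_COLI_HOMOLOGS = 49

if __name__ == "__main__":
    # The three classes should partition the 61 identified genes.
    assert sum(classes.values()) == TOTAL_IDENTIFIED
    for label, count in classes.items():
        print(label, count, percent(count, TOTAL_IDENTIFIED))
    print("E. coli homologs", percent(E_COLI_HOMOLOGS, TOTAL_IDENTIFIED, 2))
```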
Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu
2009-01-01
Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Ehrhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Ehrhartoideae and Pooideae. PMID:17534593
Wehmeier, U F
1995-11-07
Four new shuttle vectors for Escherichia coli (Ec) and Streptomyces, pUWL218, pUWL219, pUWL-SK and pUWL-KS, which permit recognition of recombinant (re-) plasmids on XGal plates in Ec, were constructed. These vectors contain the replication functions of the Streptomyces wide-host-range multicopy plasmid pIJ101, the tsr gene conferring resistance to thiostrepton in Streptomyces, the ColEI origin of replication from the pUC plasmids for replication in Ec and the bla gene conferring resistance to ampicillin in Ec. They possess multiple cloning sites with a number of unique restriction sites and allow direct sequencing of re-derivatives using the pUC sequencing primers.
Improved PCR primers for the detection and identification of arbuscular mycorrhizal fungi.
Lee, Jaikoo; Lee, Sangsun; Young, J Peter W
2008-08-01
A set of PCR primers that should amplify all subgroups of arbuscular mycorrhizal fungi (AMF, Glomeromycota), but exclude sequences from other organisms, was designed to facilitate rapid detection and identification directly from field-grown plant roots. The small subunit rRNA gene was targeted for the new primers (AML1 and AML2) because phylogenetic relationships among the Glomeromycota are well understood for this gene. Sequence comparisons indicate that the new primers should amplify all published AMF sequences except those from Archaeospora trappei. The specificity of the new primers was tested using 23 different AMF spore morphotypes from trap cultures and Miscanthus sinensis, Glycine max and Panax ginseng roots sampled from the field. Non-AMF DNA of 14 plants, 14 Basidiomycota and 18 Ascomycota was also tested as negative controls. Sequences amplified from roots using the new primers were compared with those obtained using the established NS31 and AM1 primer combination. The new primers have much better specificity and coverage of all known AMF groups.
Bulgari, Daniela; Casati, Paola; Brusetti, Lorenzo; Quaglino, Fabio; Brasca, Milena; Daffonchio, Daniele; Bianco, Piero Attilio
2009-08-01
The diversity of bacterial endophytes associated with grapevine leaf tissues was analyzed by cultivation and cultivation-independent methods. In order to identify bacterial endophytes directly from the metagenome, a protocol for bacteria enrichment and DNA extraction was optimized. Sequence analysis of 16S rRNA gene libraries revealed five diverse Operational Taxonomic Units (OTUs), showing best sequence matches with gamma-Proteobacteria, family Enterobacteriaceae, with a dominance of the genus Pantoea. Isolation of bacteria through cultivation revealed the presence of six OTUs, showing best sequence matches with Actinobacteria, genus Curtobacterium, and with the Firmicutes genera Bacillus and Enterococcus. Length Heterogeneity-PCR (LH-PCR) electrophoretic peaks from single bacterial clones were used to set up a database representing the bacterial endophytes identified in association with grapevine tissues. Analysis of healthy and phytoplasma-infected grapevine plants showed that LH-PCR could be a useful complementary tool for examining the diversity of bacterial endophytes, especially for diversity surveys on a large number of samples.
TP53, PIK3CA, FBXW7 and KRAS Mutations in Esophageal Cancer Identified by Targeted Sequencing.
Zheng, Huili; Wang, Yan; Tang, Chuanning; Jones, Lindsey; Ye, Hua; Zhang, Guangchun; Cao, Weihai; Li, Jingwen; Liu, Lifeng; Liu, Zhencong; Zhang, Chao; Lou, Feng; Liu, Zhiyuan; Li, Yangyang; Shi, Zhenfen; Zhang, Jingbo; Zhang, Dandan; Sun, Hong; Dong, Haichao; Dong, Zhishou; Guo, Baishuai; Yan, H E; Lu, Qingyu; Huang, Xue; Chen, Si-Yi
2016-01-01
Esophageal cancer (EC) is a common malignancy with significant morbidity and mortality. As individual cancers exhibit unique mutation patterns, identifying and characterizing gene mutations in EC that may serve as biomarkers might help predict patient outcome and guide treatment. Traditionally, personalized cancer DNA sequencing was impractical and expensive. Recent technological advancements have made targeted DNA sequencing more cost- and time-effective with reliable results. This technology may be useful for clinicians to direct patient treatment. The Ion PGM and AmpliSeq Cancer Panel were used to identify mutations at 737 hotspot loci of 45 cancer-related genes in 64 EC samples from Chinese patients. Frequent mutations were found in TP53 and less frequent mutations in PIK3CA, FBXW7 and KRAS. These results demonstrate that targeted sequencing can reliably identify mutations in individual tumors, making this technology a possibility for clinical use. Copyright© 2016, International Institute of Anticancer Research (Dr. John G. Delinasios), All rights reserved.
Adamiak, Paul; Vanderkooi, Otto G; Kellner, James D; Schryvers, Anthony B; Bettinger, Julie A; Alcantara, Joenel
2014-06-03
Multi-locus sequence typing (MLST) is a portable, broadly applicable method for classifying bacterial isolates at an intra-species level. This methodology provides clinical and scientific investigators with a standardized means of monitoring evolution within bacterial populations. MLST uses the DNA sequences from a set of genes such that each unique combination of sequences defines an isolate's sequence type. In order to reliably determine the sequence of a typing gene, matching sequence reads for both strands of the gene must be obtained. This study assesses the ability of both the standard, and an alternative set of, Streptococcus pneumoniae MLST primers to completely sequence, in both directions, the required typing alleles. The results demonstrated that for five (aroE, recP, spi, xpt, ddl) of the seven S. pneumoniae typing alleles, the standard primers were unable to obtain the complete forward and reverse sequences. This is due to the standard primers annealing too closely to the target regions, and current sequencing technology failing to sequence the bases that are too close to the primer. The alternative primer set described here, which includes a combination of primers proposed by the CDC and several designed as part of this study, addresses this limitation by annealing to highly conserved segments further from the target region. This primer set was subsequently employed to sequence type 105 S. pneumoniae isolates collected by the Canadian Immunization Monitoring Program ACTive (IMPACT) over a period of 18 years. The inability of several of the standard S. pneumoniae MLST primers to fully sequence the required region was consistently observed and is the result of a shift in sequencing technology occurring after the original primers were designed. 
The results presented here introduce clear documentation describing this phenomenon into the literature, and provide additional guidance, through the introduction of a widely validated set of alternative primers, to research groups seeking to undertake S. pneumoniae MLST based studies.
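The definition of a sequence type as a unique combination of allele sequences can be sketched as a table lookup. This is a minimal illustration under stated assumptions, not the software used in the study: the profile table, allele numbers and ST values are hypothetical, and the locus list combines the five loci named above with gdh and gki, the remaining two loci of the S. pneumoniae scheme.

```python
from typing import Dict, Optional, Tuple

# The seven S. pneumoniae MLST loci (five are named in the abstract;
# gdh and gki complete the scheme).
LOCI = ("aroE", "gdh", "gki", "recP", "spi", "xpt", "ddl")

# Hypothetical profile table: allele numbers in LOCI order -> sequence type.
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1, 1, 1, 1, 1): 101,
    (1, 1, 2, 1, 1, 1, 1): 102,
}


def assign_sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for a complete allelic profile, or None if the
    profile is incomplete or not present in the table."""
    if any(locus not in alleles for locus in LOCI):
        # A locus without a confirmed two-strand sequence cannot be typed.
        return None
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)
```

A single locus that cannot be sequenced completely on both strands leaves the profile incomplete, which is why primer placement matters for every one of the seven alleles.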
Leptospira species molecular epidemiology in the genomic era.
Caimi, K; Repetto, S A; Varni, V; Ruybal, P
2017-10-01
Leptospirosis is a zoonotic disease whose global burden is increasing, often in relation to climate change. Hundreds of whole genome sequences from worldwide isolates of Leptospira spp. are now available, together with online tools that assign MLST sequence types (STs) directly from raw sequence data. In this work we applied R7L-MLST to a globally distributed collection of nearly 500 genomes and strains. All 10 pathogenic species, as well as intermediate species, were typed using this MLST scheme. The correlation between STs and serogroups observed in our previous work still holds in this larger dataset, supporting the use of MLST as a complementary approach to assist serological classification. Bayesian phylogenetic analysis of concatenated sequences from R7-MLST loci allowed us to resolve taxonomic inconsistencies but also showed that events such as recombination, gene conversion or lateral gene transfer played an important role in the evolution of the genus Leptospira. Whole genome sequencing thus provides epidemiological information that is useful for designing control strategies and diagnostic methods for this disease. Copyright © 2017 Elsevier B.V. All rights reserved.
Meyer, E; Butler, A; Dubrana, K; Duharcourt, S; Caron, F
1997-01-01
In ciliates, the germ line genome is extensively rearranged during the development of the somatic macronucleus from a mitotic product of the zygotic nucleus. Germ line chromosomes are fragmented in specific regions, and a large number of internal sequence elements are eliminated. It was previously shown that transformation of the vegetative macronucleus of Paramecium primaurelia with a plasmid containing a subtelomeric surface antigen gene can affect the processing of the homologous germ line genomic region during development of a new macronucleus in sexual progeny of transformed clones. The gene and telomere-proximal flanking sequences are deleted from the new macronuclear genome, although the germ line genome remains wild type. Here we show that plasmids containing nonoverlapping segments of the same genomic region are able to induce similar terminal deletions; the locations of deletion end points depend on the particular sequence used. Transformation of the maternal macronucleus with a sequence internal to a macronuclear chromosome also causes the occurrence of internal deletions between short direct repeats composed of alternating thymines and adenines. The epigenetic influence of maternal macronuclear sequences on developmental rearrangements of the zygotic genome thus appears to be both sequence specific and general, suggesting that this trans-nucleus effect is mediated by pairing of homologous sequences. PMID:9199294
Microsatellite analysis in the genome of Acanthaceae: An in silico approach
Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar
2015-01-01
Background: Acanthaceae is one of the advanced and specialized plant families and includes traditionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. Microsatellites in complete genome sequences are thought to play a direct role in genome organization, recombination, gene regulation, quantitative genetic variation, and the evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. Results: In total, 110 repeats were identified in 108 sequences of Acanthaceae family plant genomes, and dinucleotide repeats were the most abundant in the genome sequences. The essential amino acid isoleucine was abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats. Conclusion: The identified microsatellites and primers might be useful for future breeding and genetic studies of plants that belong to the Acanthaceae family. PMID:25709226
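The screening step described above can be illustrated with a small regular-expression search for perfect repeats. This is a sketch only and does not reproduce SSR Locator: the function name and the minimum repeat counts are assumptions chosen for illustration, and only mono-, di- and trinucleotide motifs are considered.

```python
import re
from typing import List, Tuple

# Assumed minimum repeat counts per motif length (illustrative thresholds,
# not the defaults of any particular tool).
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect mono-, di- and
    trinucleotide repeats in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A motif of length k followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if k > 1 and len(set(motif)) == 1:
                # "AA" or "AAA" repeated is a mononucleotide run.
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

For example, `find_ssrs("GGC" + "AT" * 7 + "CCG")` reports one dinucleotide repeat, the motif AT repeated seven times starting at position 3.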
Sugathan, Aarathi; Biagioli, Marta; Golzio, Christelle; Erdin, Serkan; Blumenthal, Ian; Manavalan, Poornima; Ragavendran, Ashok; Brand, Harrison; Lucente, Diane; Miles, Judith; Sheridan, Steven D.; Stortchevoi, Alexei; Kellis, Manolis; Haggarty, Stephen J.; Katsanis, Nicholas; Gusella, James F.; Talkowski, Michael E.
2014-01-01
Truncating mutations of chromodomain helicase DNA-binding protein 8 (CHD8), and of many other genes with diverse functions, are strong-effect risk factors for autism spectrum disorder (ASD), suggesting multiple mechanisms of pathogenesis. We explored the transcriptional networks that CHD8 regulates in neural progenitor cells (NPCs) by reducing its expression and then integrating transcriptome sequencing (RNA sequencing) with genome-wide CHD8 binding (ChIP sequencing). Suppressing CHD8 to levels comparable with the loss of a single allele caused altered expression of 1,756 genes, 64.9% of which were up-regulated. CHD8 showed widespread binding to chromatin, with 7,324 replicated sites that marked 5,658 genes. Integration of these data suggests that a limited array of direct regulatory effects of CHD8 produced a much larger network of secondary expression changes. Genes indirectly down-regulated (i.e., without CHD8-binding sites) reflect pathways involved in brain development, including synapse formation, neuron differentiation, cell adhesion, and axon guidance, whereas CHD8-bound genes are strongly associated with chromatin modification and transcriptional regulation. Genes associated with ASD were strongly enriched among indirectly down-regulated loci (P < 10⁻⁸) and CHD8-bound genes (P = 0.0043), which align with previously identified coexpression modules during fetal development. We also find an intriguing enrichment of cancer-related gene sets among CHD8-bound genes (P < 10⁻¹⁰). In vivo suppression of chd8 in zebrafish produced macrocephaly comparable to that of humans with inactivating mutations. These data indicate that heterozygous disruption of CHD8 precipitates a network of gene-expression changes involved in neurodevelopmental pathways in which many ASD-associated genes may converge on shared mechanisms of pathogenesis. PMID:25294932
Genome Engineering and Modification Toward Synthetic Biology for the Production of Antibiotics.
Zou, Xuan; Wang, Lianrong; Li, Zhiqiang; Luo, Jie; Wang, Yunfu; Deng, Zixin; Du, Shiming; Chen, Shi
2018-01-01
Antibiotic production is often governed by large gene clusters composed of genes related to antibiotic scaffold synthesis, tailoring, regulation, and resistance. With the expansion of genome sequencing, a considerable number of antibiotic gene clusters have been isolated and characterized. Emerging genome engineering techniques make more efficient engineering of antibiotics possible. In addition to genomic editing, multiple synthetic biology approaches have been developed for the exploration and improvement of antibiotic natural products. Here, we review the progress in the development of the genome editing techniques used to engineer new antibiotics, focusing on three aspects of genome engineering: direct cloning of large genomic fragments, genome engineering of gene clusters, and regulation of gene cluster expression. This review not only summarizes the current uses of genomic engineering techniques for cloning and assembly of antibiotic gene clusters or for altering antibiotic synthetic pathways but also provides perspectives on the future directions of rebuilding biological systems for the design of novel antibiotics. © 2017 Wiley Periodicals, Inc.
Tominaga-Wada, Rumi; Iwata, Mineko; Sugiyama, Junji; Kotake, Toshihisa; Ishida, Tetsuya; Yokoyama, Ryusuke; Nishitani, Kazuhiko; Okada, Kiyotaka; Wada, Takuji
2009-11-01
Arabidopsis root hair formation is determined by the patterning genes CAPRICE (CPC), GLABRA3 (GL3), WEREWOLF (WER) and GLABRA2 (GL2), but little is known about the later changes in cell wall material during root hair formation. A combined Fourier-transform infrared microspectroscopy-principal components analysis (FTIR-PCA) method was used to detect subtle differences in the cell wall material between wild-type and root hair mutants in Arabidopsis. Among several root hair mutants, only the gl2 mutation affected root cell wall polysaccharides. Five of the 10 genes encoding cellulose synthase (CESA1-10) and 4 of 33 xyloglucan endotransglucosylase (XTH1-33) genes in Arabidopsis are expressed in the root, but only CESA5 and XTH17 were affected by the gl2 mutation. The L1-box sequence located in the promoter region of these genes was recognized by the GL2 protein. These results indicate that GL2 directly regulates cell wall-related gene expression during root development.
Reading, Benjamin J; Chapman, Robert W; Schaff, Jennifer E; Scholl, Elizabeth H; Opperman, Charles H; Sullivan, Craig V
2012-02-21
The striped bass and its relatives (genus Morone) are important fisheries and aquaculture species native to estuaries and rivers of the Atlantic coast and Gulf of Mexico in North America. To open avenues of gene expression research on reproduction and breeding of striped bass, we generated a collection of expressed sequence tags (ESTs) from a complementary DNA (cDNA) library representative of their ovarian transcriptome. Sequences of a total of 230,151 ESTs (51,259,448 bp) were acquired by Roche 454 pyrosequencing of cDNA pooled from ovarian tissues obtained at all stages of oocyte growth, at ovulation (eggs), and during preovulatory atresia. Quality filtering of ESTs allowed assembly of 11,208 high-quality contigs ≥ 100 bp, including 2,984 contigs 500 bp or longer (average length 895 bp). Blastx comparisons revealed 5,482 gene orthologues (E-value < 10⁻³), of which 4,120 (36.7% of total contigs) were annotated with Gene Ontology terms (E-value < 10⁻⁶). There were 5,726 remaining unknown unique sequences (51.1% of total contigs). All of the high-quality EST sequences are available in the National Center for Biotechnology Information (NCBI) Short Read Archive (GenBank: SRX007394). Informative contigs were considered to be abundant if they were assembled from groups of ESTs comprising ≥ 0.15% of the total short read sequences (≥ 345 reads/contig). Approximately 52.5% of these abundant contigs were predicted to have predominant ovary expression through digital differential display in silico comparisons to zebrafish (Danio rerio) UniGene orthologues. Over 1,300 Gene Ontology terms from Biological Process classes of Reproduction, Reproductive process, and Developmental process were assigned to this collection of annotated contigs. This first large reference sequence database available for the ecologically and economically important temperate basses (genus Morone) provides a foundation for gene expression studies in these species.
The predicted predominance of ovary gene expression and assignment of directly relevant Gene Ontology classes suggests a powerful utility of this dataset for analysis of ovarian gene expression related to fundamental questions of oogenesis. Additionally, a high definition Agilent 60-mer oligo ovary 'UniClone' microarray with 8 × 15,000 probe format has been designed based on this striped bass transcriptome (eArray Group: Striper Group, Design ID: 029004).
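The abundance cutoff and the contig percentages quoted in this record can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the abstract; the function names are illustrative, and rounding the cutoff down to a whole read count is an assumption about how the reported value of 345 was obtained.

```python
import math

TOTAL_ESTS = 230_151         # total reads reported in the abstract
ABUNDANCE_FRACTION = 0.0015  # 0.15% of all reads
TOTAL_CONTIGS = 11_208       # high-quality contigs reported in the abstract


def min_reads_for_abundant(total_reads: int, fraction: float) -> int:
    """Read-count cutoff for an 'abundant' contig: the given fraction of
    all reads, rounded down to a whole number of reads (assumed rounding)."""
    return math.floor(total_reads * fraction)


def percent_of_contigs(n: int, total: int = TOTAL_CONTIGS) -> float:
    """Share of all contigs, in percent."""
    return 100.0 * n / total
```

With these inputs, 0.15% of 230,151 reads is about 345.2, which matches the stated cutoff of 345 reads per contig, and the 5,482 contigs with orthologues plus the 5,726 unknown sequences add up to the 11,208 contigs reported.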
Daubas, Philippe; Buckingham, Margaret E
2013-04-15
The Myf5 gene plays an important role in myogenic determination during mouse embryo development. Multiple genomic regions of the Mrf4-Myf5 locus have been characterised as enhancer sequences responsible for the complex spatiotemporal expression of the Myf5 gene at the onset of myogenesis. These include an enhancer sequence, located at -111 kb relative to the Myf5 transcription start site, which is responsible for Myf5 activation in ventral somitic domains (Ribas et al., 2011. Dev. Biol. 355, 372-380). We show that the -111 kb-Myf5 enhancer also directs transgene expression in some limb muscles, and is active at foetal as well as embryonic stages. We have carried out further characterisation of the regulation of this enhancer and show that the paired-box Pax3 transcription factor binds to it in vitro as well as in vivo, and that Pax binding sites are essential for its activity. This requirement is independent of the previously reported regulation by TEAD transcription factors. Six1/4 which, like Pax3, are important upstream regulators of myogenesis, also bind in vivo to sites in the -111 kb-Myf5 enhancer and modulate its activity. The -111 kb-Myf5 enhancer therefore shares common functional characteristics with another Myf5 regulatory sequence, the hypaxial and limb 145 bp-Myf5 enhancer, both being directly regulated in vivo by Pax3 and Six1/4 proteins. However, in the case of the -111 kb-Myf5 enhancer, Six has less effect and we conclude that Pax regulation plays a major role in controlling this aspect of Myf5 gene expression at the onset of myogenesis in the embryo. Copyright © 2013 Elsevier Inc. All rights reserved.
Gillot, Guillaume; Jany, Jean-Luc; Dominguez-Santos, Rebeca; Poirier, Elisabeth; Debaets, Stella; Hidalgo, Pedro I; Ullán, Ricardo V; Coton, Emmanuel; Coton, Monika
2017-04-01
Mycophenolic acid (MPA) is a secondary metabolite produced by various Penicillium species including Penicillium roqueforti. The MPA biosynthetic pathway was recently described in Penicillium brevicompactum. In this study, an in silico analysis of the P. roqueforti FM164 genome sequence localized a 23.5-kb putative MPA gene cluster. The cluster contains seven genes putatively coding for seven proteins (MpaA, MpaB, MpaC, MpaDE, MpaF, MpaG, MpaH) and is highly similar (i.e. gene synteny, sequence homology) to the P. brevicompactum cluster. To confirm the involvement of this gene cluster in MPA biosynthesis, gene silencing using RNA interference targeting mpaC, encoding a putative polyketide synthase, was performed in a high MPA-producing P. roqueforti strain (F43-1). In the resulting transformants, decreased MPA production (measured by LC-Q-TOF/MS) correlated with reduced mpaC gene expression measured by Q-RT-PCR. In parallel, mycotoxin quantification on multiple P. roqueforti strains suggested strain-dependent MPA production. Thus, the entire MPA cluster was sequenced for P. roqueforti strains with contrasting MPA production and a 174 bp deletion in mpaC was observed in low MPA producers. PCRs directed towards the deleted region among 55 strains showed an excellent correlation with MPA quantification. Our results indicate the clear involvement of the mpaC gene, as well as the surrounding cluster, in P. roqueforti MPA biosynthesis. Copyright © 2016 Elsevier Ltd. All rights reserved.
The king cobra genome reveals dynamic gene evolution and adaptation in the snake venom system.
Vonk, Freek J; Casewell, Nicholas R; Henkel, Christiaan V; Heimberg, Alysha M; Jansen, Hans J; McCleary, Ryan J R; Kerkkamp, Harald M E; Vos, Rutger A; Guerreiro, Isabel; Calvete, Juan J; Wüster, Wolfgang; Woods, Anthony E; Logan, Jessica M; Harrison, Robert A; Castoe, Todd A; de Koning, A P Jason; Pollock, David D; Yandell, Mark; Calderon, Diego; Renjifo, Camila; Currier, Rachel B; Salgado, David; Pla, Davinia; Sanz, Libia; Hyder, Asad S; Ribeiro, José M C; Arntzen, Jan W; van den Thillart, Guido E E J M; Boetzer, Marten; Pirovano, Walter; Dirks, Ron P; Spaink, Herman P; Duboule, Denis; McGlinn, Edwina; Kini, R Manjunatha; Richardson, Michael K
2013-12-17
Snakes are limbless predators, and many species use venom to help overpower relatively large, agile prey. Snake venoms are complex protein mixtures encoded by several multilocus gene families that function synergistically to cause incapacitation. To examine venom evolution, we sequenced and interrogated the genome of a venomous snake, the king cobra (Ophiophagus hannah), and compared it, together with our unique transcriptome, microRNA, and proteome datasets from this species, with data from other vertebrates. In contrast to the platypus, the only other venomous vertebrate with a sequenced genome, we find that snake toxin genes evolve through several distinct co-option mechanisms and exhibit surprisingly variable levels of gene duplication and directional selection that correlate with their functional importance in prey capture. The enigmatic accessory venom gland shows a very different pattern of toxin gene expression from the main venom gland and seems to have recruited toxin-like lectin genes repeatedly for new nontoxic functions. In addition, tissue-specific microRNA analyses suggested the co-option of core genetic regulatory components of the venom secretory system from a pancreatic origin. Although the king cobra is limbless, we recovered coding sequences for all Hox genes involved in amniote limb development, with the exception of Hoxd12. Our results provide a unique view of the origin and evolution of snake venom and reveal multiple genome-level adaptive responses to natural selection in this complex biological weapon system. More generally, they provide insight into mechanisms of protein evolution under strong selection.
NGSPanPipe: A Pipeline for Pan-genome Identification in Microbial Strains from Experimental Reads.
Kulsum, Umay; Kapil, Arti; Singh, Harpreet; Kaur, Punit
2018-01-01
Recent advancements in sequencing technologies have decreased both the time and the cost of sequencing a whole bacterial genome. High-throughput Next-Generation Sequencing (NGS) technology has generated enormous amounts of data on microbial populations that are publicly available across various repositories. As a consequence, it has become possible to study and compare the genomes of different bacterial strains within a species or genus in terms of evolution, ecology and diversity. Studying the pan-genome provides insights into microevolution, global composition and diversity in virulence and pathogenesis of a species. It can also assist in identifying drug targets and proposing vaccine candidates. The effective analysis of these large genome datasets necessitates the development of robust tools. Current methods for constructing a pan-genome do not support direct input of raw reads from the sequencer but require preprocessing of reads into an assembled protein/gene sequence file or a binary matrix of orthologous genes/proteins. We have designed an easy-to-use integrated pipeline, NGSPanPipe, which can directly identify the pan-genome from short reads. The output from the pipeline is compatible with other pan-genome analysis tools. We evaluated our pipeline against other methods for constructing a pan-genome, i.e. reference-based assembly and de novo assembly, using simulated reads of Mycobacterium tuberculosis. The single script pipeline (pipeline.pl) is applicable to all bacterial strains. It integrates multiple in-house Perl scripts and is freely accessible from https://github.com/Biomedinformatics/NGSPanPipe .
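The binary matrix of orthologous genes mentioned above is the usual starting point for separating a pan-genome into its parts. The following sketch shows that partition on a presence/absence table; it does not reproduce NGSPanPipe, and the function name, strain labels and gene labels are hypothetical.

```python
from typing import Dict, Set


def partition_pan_genome(presence: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split genes into core (present in every strain), unique (present in
    exactly one strain) and accessory (everything in between).

    `presence` maps each strain name to the set of genes detected in it.
    """
    if len(presence) < 2:
        raise ValueError("at least two strains are needed")
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)            # every gene seen in any strain
    core = set.intersection(*gene_sets)      # genes shared by all strains
    unique = {g for g in pan if sum(g in s for s in gene_sets) == 1}
    return {
        "pan": pan,
        "core": core,
        "accessory": pan - core - unique,
        "unique": unique,
    }
```

Given three hypothetical strains with gene sets {a, b, c}, {a, b, d} and {a, c}, the core genome is {a}, the accessory genes are {b, c} and the only strain-specific gene is d.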
Patiño, Liliana Catherine; Battu, Rajani; Ortega-Recalde, Oscar; Nallathambi, Jeyabalan; Anandula, Venkata Ramana; Renukaradhya, Umashankar; Laissue, Paul
2014-01-01
The neuronal ceroid-lipofuscinoses (NCL) are a group of neurodegenerative disorders characterized by epilepsy, visual failure, progressive mental and motor deterioration, myoclonus, dementia and reduced life expectancy. Classically, NCL-affected individuals have been classified into six categories, defined mainly by the clinical onset of symptoms. However, some patients cannot be easily included in a specific group because of significant variation in the age of onset and disease progression. Molecular genetics has emerged in recent years as a useful tool for enhancing NCL subtype classification. Fourteen NCL genetic forms (CLN1 to CLN14) have been described to date. The variant late-infantile form of the disease has been linked to CLN5, CLN6, CLN7 (MFSD8) and CLN8 mutations. Despite advances in the diagnosis of neurodegenerative disorders, mutations in these genes may cause similar phenotypes, which makes accurate candidate gene selection for direct sequencing difficult. Three siblings who were affected by variant late-infantile NCL are reported in the present study. We used whole-exome sequencing, direct sequencing and in silico approaches to identify the molecular basis of the disease. We identified the novel c.1219T>C (p.Trp407Arg) and c.1361T>C (p.Met454Thr) MFSD8 pathogenic mutations. Our results highlight next generation sequencing as a novel and powerful methodological approach for the rapid molecular diagnosis of NCL. They also provide information regarding the phenotypic and molecular spectrum of CLN7 disease.
NASA Technical Reports Server (NTRS)
Dar, M. E.; Winters, T. A.; Jorgensen, T. J.
1997-01-01
Ataxia-telangiectasia (A-T) is an autosomal-recessive lethal human disease. Homozygotes suffer from a number of neurological disorders, as well as very high cancer incidence. Heterozygotes may also have a higher than normal risk of cancer, particularly of the breast. The gene responsible for the disease (ATM) has been cloned, but its role in the mechanisms of the disease remains unknown. Cellular A-T phenotypes, such as radiosensitivity and genomic instability, suggest that a deficiency in the repair of DNA double-strand breaks (DSBs) may be the primary defect; however, overall levels of DSB rejoining appear normal. We used the shuttle vector pZ189, containing an oxidatively induced DSB, to compare the integrity of DSB rejoining in one normal and two A-T fibroblast cell lines. Mutation frequencies were two-fold higher in A-T cells, and the mutational spectrum was different. The majority of the mutations found in all three cell lines were deletions (44-63%). DNA sequence analysis indicated that in all 17 plasmids with deletion mutations from normal cells, the deletion occurred between short direct-repeat sequences (removing one of the repeats plus the intervening sequence), implicating illegitimate recombination in DSB rejoining. The combined data from both A-T cell lines showed that 21 of 24 deletions did not involve direct-repeat sequences, implicating a defect in the illegitimate recombination pathway. These findings suggest that the A-T gene product may either directly participate in illegitimate recombination or modulate the pathway. Regardless, this defect is likely to be important to a mechanistic understanding of this lethal disease.
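The distinction drawn above, between deletions that occur between short direct repeats and deletions that do not, can be made concrete with a small check on the reference sequence. This is an illustrative sketch, not the analysis used in the study: the function name and the example sequences in the note below are invented, and a deletion is described simply by the half-open interval of reference bases it removes.

```python
def flanking_repeat_length(ref: str, start: int, end: int) -> int:
    """Length of the direct repeat spanning a deletion of ref[start:end].

    A deletion between direct repeats removes one repeat copy plus the
    intervening sequence, so the deleted segment and the sequence that
    follows it begin with the same bases (or, for the equivalent
    right-shifted description, end with the same bases).
    """
    if not (0 <= start < end <= len(ref)):
        raise ValueError("deletion must satisfy 0 <= start < end <= len(ref)")
    span = end - start
    right = 0
    while (end + right < len(ref) and right < span
           and ref[start + right] == ref[end + right]):
        right += 1
    left = 0
    while (start - left - 1 >= 0 and left < span
           and ref[start - left - 1] == ref[end - left - 1]):
        left += 1
    return left + right
```

For a made-up reference `"TTAC" + "GCTTGA" + "CCTAGG" + "GCTTGA" + "TCAA"`, deleting positions 4 to 16 (the first repeat copy plus the intervening bases) gives a repeat length of 6 and leaves a single GCTTGA copy, whereas a deletion with no shared bases at its ends gives 0.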
Cappi, C; Brentani, H; Lima, L; Sanders, S J; Zai, G; Diniz, B J; Reis, V N S; Hounie, A G; Conceição do Rosário, M; Mariani, D; Requena, G L; Puga, R; Souza-Duran, F L; Shavitt, R G; Pauls, D L; Miguel, E C; Fernandez, T V
2016-01-01
Studies of rare genetic variation have identified molecular pathways conferring risk for developmental neuropsychiatric disorders. To date, no published whole-exome sequencing studies have been reported in obsessive-compulsive disorder (OCD). We sequenced all the genome coding regions in 20 sporadic OCD cases and their unaffected parents to identify rare de novo (DN) single-nucleotide variants (SNVs). The primary aim of this pilot study was to determine whether DN variation contributes to OCD risk. To this aim, we evaluated whether there is an elevated rate of DN mutations in OCD, which would justify this approach toward gene discovery in larger studies of the disorder. Furthermore, to explore functional molecular correlations among genes with nonsynonymous DN SNVs in OCD probands, a protein–protein interaction (PPI) network was generated based on databases of direct molecular interactions. We applied Degree-Aware Disease Gene Prioritization (DADA) to rank the PPI network genes based on their relatedness to a set of OCD candidate genes from two OCD genome-wide association studies (Stewart et al., 2013; Mattheisen et al., 2014). In addition, we performed a pathway analysis with genes from the PPI network. The rate of DN SNVs in OCD was 2.51 × 10⁻⁸ per base per generation, significantly higher than a previous estimated rate in unaffected subjects using the same sequencing platform and analytic pipeline. Several genes harboring DN SNVs in OCD were highly interconnected in the PPI network and ranked high in the DADA analysis. Nearly all the DN SNVs in this study are in genes expressed in the human brain, and a pathway analysis revealed enrichment in immunological and central nervous system functioning and development.
The results of this pilot study indicate that further investigation of DN variation in larger OCD cohorts is warranted to identify specific risk genes and to confirm our preliminary finding with regard to PPI network enrichment for particular biological pathways and functions. PMID:27023170
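A per-base, per-generation de novo rate of the kind reported in this record is commonly obtained by dividing the number of de novo variants by the number of diploid bases in which such a variant could have been called. The sketch below shows that calculation under stated assumptions: the function name is illustrative, the formula is the common diploid convention rather than the study's documented pipeline, and the input counts in the note are hypothetical, not the study's data.

```python
def de_novo_rate(n_variants: int, callable_bases_per_child: int, n_children: int) -> float:
    """De novo SNV rate per base per generation.

    The denominator counts both parental haplotypes of every child, hence
    the factor of 2 (an assumed, commonly used convention).
    """
    if n_variants < 0:
        raise ValueError("variant count cannot be negative")
    if callable_bases_per_child <= 0 or n_children <= 0:
        raise ValueError("callable bases and number of children must be positive")
    return n_variants / (2 * callable_bases_per_child * n_children)
```

As a purely hypothetical example, 30 de novo SNVs observed across 20 trios with 30,000,000 callable bases each would correspond to a rate of 2.5 × 10⁻⁸ per base per generation.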
Rivera-Torres, Natalia; Banas, Kelly; Bialk, Pawel; Bloh, Kevin M; Kmiec, Eric B
2017-01-01
CRISPR/Cas9 and single-stranded DNA oligonucleotides (ssODNs) have been used to direct the repair of a single base mutation in human genes. Here, we examine a method designed to increase the precision of RNA guided genome editing in human cells by utilizing a CRISPR/Cas9 ribonucleoprotein (RNP) complex to initiate DNA cleavage. The RNP is assembled in vitro and induces a double stranded break at a specific site surrounding the mutant base designated for correction by the ssODN. We use an integrated mutant eGFP gene, bearing a single base change rendering the expressed protein nonfunctional, as a single copy target in HCT 116 cells. We observe significant gene correction activity of the mutant base, promoted by the RNP and single-stranded DNA oligonucleotide with validation through genotypic and phenotypic readout. We demonstrate that all individual components must be present to obtain successful gene editing. Importantly, we examine the genotype of individually sorted corrected and uncorrected clonally expanded cell populations for the mutagenic footprint left by the action of these gene editing tools. While the DNA sequence of the corrected population is exact with no adjacent sequence modification, the uncorrected population exhibits heterogeneous mutagenicity with a wide variety of deletions and insertions surrounding the target site. We designate this type of DNA aberration as on-site mutagenicity. Analyses of two clonal populations bearing specific DNA insertions surrounding the target site, indicate that point mutation repair has occurred at the level of the gene. The phenotype, however, is not rescued because a section of the single-stranded oligonucleotide has been inserted altering the reading frame and generating truncated proteins. These data illustrate the importance of analysing mutagenicity in uncorrected cells. 
Our results also form the basis of a simple model for point mutation repair directed by a short single-stranded DNA oligonucleotide and a CRISPR/Cas9 ribonucleoprotein complex.
Rivera-Torres, Natalia; Bialk, Pawel; Bloh, Kevin M.; Kmiec, Eric B.
2017-01-01
CRISPR/Cas9 and single-stranded DNA oligonucleotides (ssODNs) have been used to direct the repair of a single base mutation in human genes. Here, we examine a method designed to increase the precision of RNA guided genome editing in human cells by utilizing a CRISPR/Cas9 ribonucleoprotein (RNP) complex to initiate DNA cleavage. The RNP is assembled in vitro and induces a double stranded break at a specific site surrounding the mutant base designated for correction by the ssODN. We use an integrated mutant eGFP gene, bearing a single base change rendering the expressed protein nonfunctional, as a single copy target in HCT 116 cells. We observe significant gene correction activity of the mutant base, promoted by the RNP and single-stranded DNA oligonucleotide with validation through genotypic and phenotypic readout. We demonstrate that all individual components must be present to obtain successful gene editing. Importantly, we examine the genotype of individually sorted corrected and uncorrected clonally expanded cell populations for the mutagenic footprint left by the action of these gene editing tools. While the DNA sequence of the corrected population is exact with no adjacent sequence modification, the uncorrected population exhibits heterogeneous mutagenicity with a wide variety of deletions and insertions surrounding the target site. We designate this type of DNA aberration as on-site mutagenicity. Analyses of two clonal populations bearing specific DNA insertions surrounding the target site, indicate that point mutation repair has occurred at the level of the gene. The phenotype, however, is not rescued because a section of the single-stranded oligonucleotide has been inserted altering the reading frame and generating truncated proteins. These data illustrate the importance of analysing mutagenicity in uncorrected cells. 
Our results also form the basis of a simple model for point mutation repair directed by a short single-stranded DNA oligonucleotide and a CRISPR/Cas9 ribonucleoprotein complex. PMID:28052104
Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick
1982-01-01
The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈ 1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. PMID:16593262
Analysis of human herpesvirus-6 IE1 sequence variation in clinical samples.
Stanton, Richard; Wilkinson, Gavin W G; Fox, Julie D
2003-12-01
Herpesvirus immediate early (IE) proteins are known to play key roles in establishing productive infections, regulating reactivation from latency, and creating a cellular environment favourable to viral replication. Human herpesvirus-6 (HHV-6) IE genes have not been studied as intensively as their homologues in the prototype betaherpesvirus human cytomegalovirus (HCMV). Whilst the HCMV IE1 gene is relatively conserved, early studies indicated that HHV-6 IE1 exhibited a high level of sequence variation between HHV-6A and HHV-6B isolates, although the observation was based primarily on virus stocks that had been isolated and propagated in vitro. In this study, we investigated the level of HHV-6 IE1 sequence variation in vivo by direct sequencing of circulating virus in clinical samples without prior in vitro culture. Sequences exactly matching those reported for reference HHV-6 isolates were identified in clinical samples, thus the HHV-6 laboratory strains used in the majority of in vitro studies appear to be representative of virus circulating in vivo with respect to the IE1 gene. The HHV-6 IE1 sequence is also conserved in reference strains that had been passaged extensively in vitro. The high degree of divergence between variant A and B type IE1 sequences was confirmed, but interestingly HHV-6B IE1 sequences were observed to further segregate into two distinct subgroups, with the laboratory strains Z29 and HST representative of these two subgroups. Within each HHV-6B subgroup, a remarkably high level of homology was observed. Thus the HHV-6 IE1 sequence appears highly stable, underlining its potential importance to the viral life cycle. Copyright 2003 Wiley-Liss, Inc.
Bacterial cellulose synthesis mechanism of facultative anaerobe Enterobacter sp. FY-07.
Ji, Kaihua; Wang, Wei; Zeng, Bing; Chen, Sibin; Zhao, Qianqian; Chen, Yueqing; Li, Guoqiang; Ma, Ting
2016-02-25
Enterobacter sp. FY-07 can produce bacterial cellulose (BC) under aerobic and anaerobic conditions. Three potential BC synthesis gene clusters (bcsI, bcsII and bcsIII) of Enterobacter sp. FY-07 have been predicted using genome sequencing and comparative genome analysis, of which bcsIII was confirmed as the main contributor to BC synthesis by gene knockout and functional reconstitution methods. Protein homology, gene arrangement and gene constitution analysis indicated that bcsIII had high identity to the bcsI operon of Enterobacter sp. 638; however, its arrangement and composition were the same as those of the BC-synthesizing operon of G. xylinum ATCC53582 except for the flanking sequences. According to the BC biosynthesis process, oxygen is not directly involved in the reactions of BC synthesis; however, energy is required to activate intermediate metabolites and synthesize the activator, c-di-GMP. Comparative transcriptome and quantitative metabolite analysis demonstrated that under anaerobic conditions genes involved in the TCA cycle were downregulated, whereas genes in the nitrate reduction and gluconeogenesis pathways were upregulated, especially genes in three pyruvate metabolism pathways. These results suggested that Enterobacter sp. FY-07 could produce energy efficiently under anaerobic conditions to meet the requirement of BC biosynthesis.
Bacterial cellulose synthesis mechanism of facultative anaerobe Enterobacter sp. FY-07
Ji, Kaihua; Wang, Wei; Zeng, Bing; Chen, Sibin; Zhao, Qianqian; Chen, Yueqing; Li, Guoqiang; Ma, Ting
2016-01-01
Enterobacter sp. FY-07 can produce bacterial cellulose (BC) under aerobic and anaerobic conditions. Three potential BC synthesis gene clusters (bcsI, bcsII and bcsIII) of Enterobacter sp. FY-07 have been predicted using genome sequencing and comparative genome analysis, of which bcsIII was confirmed as the main contributor to BC synthesis by gene knockout and functional reconstitution methods. Protein homology, gene arrangement and gene constitution analysis indicated that bcsIII had high identity to the bcsI operon of Enterobacter sp. 638; however, its arrangement and composition were the same as those of the BC-synthesizing operon of G. xylinum ATCC53582 except for the flanking sequences. According to the BC biosynthesis process, oxygen is not directly involved in the reactions of BC synthesis; however, energy is required to activate intermediate metabolites and synthesize the activator, c-di-GMP. Comparative transcriptome and quantitative metabolite analysis demonstrated that under anaerobic conditions genes involved in the TCA cycle were downregulated, whereas genes in the nitrate reduction and gluconeogenesis pathways were upregulated, especially genes in three pyruvate metabolism pathways. These results suggested that Enterobacter sp. FY-07 could produce energy efficiently under anaerobic conditions to meet the requirement of BC biosynthesis. PMID:26911736
Li, Haishan; Zhang, Lingling; Jiang, Quan; Shi, Zhenwang; Tong, Hanxing
2017-04-01
Familial adenomatous polyposis (FAP; Mendelian Inheritance in Man ID, 175100) is a rare autosomal dominant disorder characterized by the development of numerous adenomatous polyps throughout the colon and rectum, associated with an increased risk of colorectal cancer. FAP is at times accompanied by certain extraintestinal manifestations such as congenital hypertrophy of the retinal pigment epithelium, dental disorders and desmoid tumors. It is caused by mutations in the adenomatous polyposis coli (APC) gene. The present study reported on a Chinese family with FAP. Polymerase chain reaction and direct sequencing of the full coding sequence of the APC gene were performed to identify the mutation in this family. A nonsense mutation of the APC gene was identified in this pedigree. It is a heterozygous G>T substitution at position 2,971 in exon 15 of the APC gene, which formed a premature stop codon at amino acid residue 991 (p.Glu991*). The resulting truncated protein lacked 1,853 amino acids. The present study expanded the database on APC gene mutations in FAP and enriched the spectrum of known germline mutations of the APC gene. Prophylactic proctocolectomy may be considered as a possible treatment for carriers of the mutation.
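The coordinates quoted in this abstract can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that position 2,971 is a 1-based coding-sequence coordinate and that the full-length APC protein has 2,843 residues (a commonly cited figure that the abstract does not state). The helper names are hypothetical.

```python
# Assumption: canonical full-length APC protein length; not given in the abstract.
APC_PROTEIN_LENGTH = 2843
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLU_CODONS = ("GAA", "GAG")  # the two standard glutamate codons


def locate_in_codon(cds_position):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, position_in_codon, new_base):
    """Return the codon with one base replaced (position is 1-based)."""
    i = position_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


# Position 2,971 falls on the first base of codon 991.
codon_number, position_in_codon = locate_in_codon(2971)

# A G>T change at the first base turns either glutamate codon into a stop codon.
mutants = [substitute(codon, position_in_codon, "T") for codon in GLU_CODONS]

# Residues 991 onwards are missing from the truncated product.
residues_lost = APC_PROTEIN_LENGTH - (codon_number - 1)
```

Under these assumptions the numbers agree with the abstract: the substitution hits codon 991, both possible glutamate codons become stop codons, and 2,843 minus 990 retained residues leaves 1,853 missing.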
Wagamitsu, Shunsuke; Takase, Dan; Aoki, Fugaku; Suzuki, Masataka G
2017-02-01
Normal sexual differentiation in the genital organs is essential for the animal species that use sexual reproduction. Although it is known that doublesex (dsx) is required for the sexual development of the genitalia in various insect species, the direct target genes responsible for the sexual differentiation of the genitalia have not been identified. The lozenge (lz) gene is expressed in the female genital disc and is essential for the development of spermathecae and accessory glands in Drosophila melanogaster. The female-specific isoform of DSX (DSXF) is required for activating lz expression in the female genital disc. However, it still remains unclear whether DSXF directly activates the transcription of lz in the female genital disc. In this study, we found two sequences (lz-DBS1 and lz-DBS2) within the lz locus that showed high homology to the DSX binding motif identified previously. Competition assays using recombinant DSX DNA-binding domain (DSX-DBD) protein verified that the DSX-DBD protein bound to lz-DBS1 and lz-DBS2 in a sequence-specific manner with lower affinity than to the known DSX binding site in the bric-à-brac 1 (bab1) gene. Reporter gene analyses revealed that a 2.5-kbp lz genomic fragment containing lz-DBS1 and lz-DBS2 drove reporter gene (EGFP) expression in a manner similar to endogenous lz expression in the female genital disc. Mutations in lz-DBS1 alone significantly reduced the area of the EGFP-expressing region, while EGFP expression in the female genital disc was abolished when both sites were mutated. These results demonstrated that DSX directly activates female-specific lz expression in the genital disc through lz-DBS1 and lz-DBS2. Copyright © 2017 Elsevier B.V. All rights reserved.
Sun, Di; Wang, Qian; Chen, Zhi; Li, Jilun; Wen, Ying
2017-01-01
Alternative σ factors in bacteria redirect RNA polymerase to recognize alternative promoters, thereby facilitating coordinated gene expression necessary for adaptive responses. The gene sig8 (sav_741) in Streptomyces avermitilis encodes an alternative σ factor, σ8, highly homologous to σB in Streptomyces coelicolor. Studies reported here demonstrate that σ8 is an important regulator of both avermectin production and stress responses in S. avermitilis. σ8 inhibited avermectin production by indirectly repressing expression of cluster-situated activator gene aveR, and by directly initiating transcription of its downstream gene sav_742, which encodes a direct repressor of ave structural genes. σ8 had no effect on cell growth or morphological differentiation under normal growth conditions. Growth of a sig8-deletion mutant was less than that of wild-type strain on YMS plates following treatment with heat, H2O2, diamide, NaCl, or KCl. sig8 transcription was strongly induced by these environmental stresses, indicating response by σ8 itself. A series of σ8-dependent genes responsive to heat, oxidative and osmotic stress were identified by EMSAs, qRT-PCR and in vitro transcription experiments. These findings indicate that σ8 plays an important role in mediating protective responses to various stress conditions by activating transcription of its target genes. Six σ8-binding promoter sequences were determined and consensus binding sequence BGVNVH-N15-GSNNHH (B: C, T or G, V: A, C or G, S: C or G, H: A, C or T, N: any nucleotide) was identified, leading to prediction of the σ8 regulon. The list consists of 940 putative σ8 target genes, assignable to 17 functional groups, suggesting the wide range of cellular functions controlled by σ8 in S. avermitilis.
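The consensus BGVNVH-N15-GSNNHH quoted in this abstract uses IUPAC degenerate-base codes, which map directly onto character classes in a regular expression. The following is a minimal sketch of how such a consensus could be scanned against a DNA string. It is not the authors' regulon-prediction method, the function names are hypothetical, and the test sequence is made up.

```python
import re

# IUPAC codes used in the consensus, as defined in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "[CGT]", "V": "[ACG]", "S": "[CG]", "H": "[ACT]", "N": "[ACGT]",
}


def consensus_to_pattern(left, spacer_length, right):
    """Build a regex string for <left>-N<spacer_length>-<right>."""
    def expand(motif):
        return "".join(IUPAC[base] for base in motif)
    return f"{expand(left)}[ACGT]{{{spacer_length}}}{expand(right)}"


SIG8_PATTERN = consensus_to_pattern("BGVNVH", 15, "GSNNHH")


def find_sites(sequence):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead keeps the match zero-width so overlapping sites are reported.
    scanner = re.compile(f"(?=({SIG8_PATTERN}))")
    return [match.start() for match in scanner.finditer(sequence.upper())]


# Made-up example: a 27-nt site that fits the consensus, padded on both sides.
toy_site = "TGACAT" + "A" * 15 + "GCAATT"
hits = find_sites("CC" + toy_site + "GG")
```

A pattern this degenerate matches many positions in a real genome by chance, so hits from such a scan would only be candidates pending further evidence.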
A new polymorphic and multicopy MHC gene family related to nonmammalian class I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.
1994-12-31
The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.
2010-01-01
Background The cultivated olive (Olea europaea L.) is the most agriculturally important species of the Oleaceae family. Although many studies have been performed on plastid polymorphisms to evaluate taxonomy, phylogeny and phylogeography of Olea subspecies, only a few polymorphic regions discriminating among the agronomically and economically important olive cultivars have been identified. The objective of this study was to sequence the entire plastome of olive and analyze many potential polymorphic regions to develop new inter-cultivar genetic markers. Results The complete plastid genome of the olive cultivar Frantoio was determined by direct sequence analysis using universal and novel PCR primers designed to amplify all overlapping regions. The chloroplast genome of the olive has an organisation and gene order that is conserved among numerous Angiosperm species and does not contain any of the inversions, gene duplications, insertions, inverted repeat expansions and gene/intron losses that have been found in the chloroplast genomes of the genera Jasminum and Menodora, from the same family as Olea. The annotated sequence was used to evaluate the content of coding genes, the extent and distribution of repeated and long dispersed sequences, and the nucleotide composition pattern. These analyses provided essential information for structural, functional and comparative genomic studies in olive plastids. Furthermore, the alignment of the olive plastome sequence to those of other varieties and species identified 30 new organellar polymorphisms within the cultivated olive. Conclusions In addition to identifying mutations that may play a functional role in modifying the metabolism and adaptation of olive cultivars, the new chloroplast markers represent a valuable tool to assess the level of olive intercultivar plastome variation for use in population genetic analysis, phylogenesis, cultivar characterisation and DNA food tracking. PMID:20868482
Härtl, Katja; Kalinowski, Gregor; Hoffmann, Thomas; Preuss, Anja; Schwab, Wilfried
2017-05-01
RNA interference (RNAi) has been exploited as a reverse genetic tool for functional genomics in the nonmodel species strawberry (Fragaria × ananassa) since 2006. Here, we analysed for the first time different but overlapping nucleotide sections (>200 nt) of two endogenous genes, FaCHS (chalcone synthase) and FaOMT (O-methyltransferase), as inducer sequences and a transitive vector system to compare their gene silencing efficiencies. In total, ten vectors were assembled each containing the nucleotide sequence of one fragment in sense and corresponding antisense orientation separated by an intron (inverted hairpin construct, ihp). All sequence fragments along the full lengths of both target genes resulted in a significant down-regulation of the respective gene expression and related metabolite levels. Quantitative PCR data and successful application of a transitive vector system coinciding with a phenotypic change suggested propagation of the silencing signal. The spreading of the signal in strawberry fruit in the 3' direction was shown for the first time by the detection of secondary small interfering RNAs (siRNAs) outside of the primary targets by deep sequencing. Down-regulation of endogenes by the transitive method was less effective than silencing by ihp constructs probably because the numbers of primary siRNAs exceeded the quantity of secondary siRNAs by three orders of magnitude. Besides, we observed consistent hotspots of primary and secondary siRNA formation along the target sequence which fall within a distance of less than 200 nt. Thus, ihp vectors seem to be superior over the transitive vector system for functional genomics in strawberry fruit. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.
Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou
2016-01-01
Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription, in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters, is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. 
Taken together, this article highlights the high suitability of contemporary sequencing methods in future analyses of human biology in relation to evolutionary acquired retroviruses in the human genome. © 2016 APMIS. Published by John Wiley & Sons Ltd.
Physiology and Genetics of Biogenic Methane-Production from Acetate
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sowers, Kevin R
Biomass conversion catalyzed by methanogenic consortia is a widely available, renewable resource for both energy production and waste treatment. The efficiency of this process is directly dependent upon the interaction of three metabolically distinct groups of microorganisms: the fermentative and acetogenic Bacteria and the methanogenic Archaea. One of the rate limiting steps in the degradation of soluble organic matter is the dismutation of acetate, a predominant intermediate in the process, which accounts for 70% or more of the methane produced by the methanogens. Acetate utilization is controlled by regulation of expression of carbon monoxide dehydrogenase (COdh), which catalyzes the dismutation of acetate. However, physiological and molecular factors that control differential substrate utilization have not been identified in these Archaea. Our laboratory has identified sequence elements near the promoter of the gene (cdh) encoding COdh and we have confirmed that these sequences have a role in the in vivo expression of cdh. The current proposal focuses on identifying the regulatory components that interact with DNA and RNA elements, and identifying the mechanisms used to control cdh expression. We will determine whether expression is controlled at the level of transcription or if it is mediated by coordinate interaction of transcription initiation with other processes such as transcription elongation rate and differential mRNA stability. Utilizing recently sequenced methanosarcinal genomes and a DNA microarray currently under development, genes that encode regulatory proteins and transcription factors will be identified and their function confirmed by gene disruption and subsequent screening on different substrates. Functional interactions will be determined in vivo by assaying the effects of gene dosage and site-directed mutagenesis of the regulatory gene on the expression of a cdh::lacZ operon fusion. 
Results of this study will reveal whether this critical catabolic pathway is controlled by mechanisms similar to those employed by the Bacteria and Eukarya, or by a regulatory paradigm that is unique to the Archaea. The mechanism(s) revealed by this investigation will provide insight into the regulatory strategies employed by the aceticlastic methanogenic Archaea to efficiently direct carbon and electron flow in anaerobic consortia during fermentative processes.
AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer
2014-01-01
The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap UGAUGA (underline highlights the initiation codon AUG within the combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact “nanogenome.” PMID:25253891
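The frame shift "at the end of each round" described in this abstract follows from simple arithmetic: 220 is not a multiple of 3, so a ribosome that keeps reading triplets around the circle re-enters the sequence in a different frame on each pass. The sketch below illustrates only that arithmetic and uses hypothetical helper names; it says nothing about the actual initiation or termination mechanism.

```python
from math import gcd

GENOME_LENGTH = 220  # nt, as stated in the abstract


def codon_starts(length, n_codons):
    """Circular coordinates (0-based) at which successive codons begin."""
    return [(3 * i) % length for i in range(n_codons)]


def frame_offset_in_pass(length, pass_index):
    """Coordinate (0, 1 or 2) of the first codon that starts within a given pass.

    pass_index 0 is the first trip around the circle.
    """
    return (-pass_index * length) % 3


# Number of passes before the original reading frame is used again.
passes_to_return = 3 // gcd(3, GENOME_LENGTH)

# Three full passes cover 3 * 220 = 660 nt, i.e. 220 codons.
starts = codon_starts(GENOME_LENGTH, GENOME_LENGTH)
offsets = [frame_offset_in_pass(GENOME_LENGTH, k) for k in range(4)]
```

Because 3 and 220 share no common factor, the 220 codons read over three passes start at 220 distinct positions, so every nucleotide is used once in each codon position. This is consistent with the "two (or three) completely overlapping ORFs" described above.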
Orthology detection combining clustering and synteny for very large datasets.
Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K; Prohaska, Sonja J; Stadler, Peter F
2014-01-01
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets.
Orthology Detection Combining Clustering and Synteny for Very Large Datasets
Lechner, Marcus; Hernandez-Rosales, Maribel; Doerr, Daniel; Wieseke, Nicolas; Thévenin, Annelyse; Stoye, Jens; Hartmann, Roland K.; Prohaska, Sonja J.; Stadler, Peter F.
2014-01-01
The elucidation of orthology relationships is an important step both in gene function prediction as well as towards understanding patterns of sequence evolution. Orthology assignments are usually derived directly from sequence similarities for large data because more exact approaches exhibit too high computational costs. Here we present PoFF, an extension for the standalone tool Proteinortho, which enhances orthology detection by combining clustering, sequence similarity, and synteny. In the course of this work, FFAdj-MCS, a heuristic that assesses pairwise gene order using adjacencies (a similarity measure related to the breakpoint distance) was adapted to support multiple linear chromosomes and extended to detect duplicated regions. PoFF largely reduces the number of false positives and enables more fine-grained predictions than purely similarity-based approaches. The extension maintains the low memory requirements and the efficient concurrency options of its basis Proteinortho, making the software applicable to very large datasets. PMID:25137074
Irizarry, Kristopher J L; Downs, Eileen; Bryden, Randall; Clark, Jory; Griggs, Lisa; Kopulos, Renee; Boettger, Cynthia M; Carr, Thomas J; Keeler, Calvin L; Collisson, Ellen; Drechsler, Yvonne
2017-01-01
Discovering genetic biomarkers associated with disease resistance and enhanced immunity is critical to developing advanced strategies for controlling viral and bacterial infections in different species. Macrophages, important cells of innate immunity, are directly involved in cellular interactions with pathogens, the release of cytokines activating other immune cells and antigen presentation to cells of the adaptive immune response. IFNγ is a potent activator of macrophages and increased production has been associated with disease resistance in several species. This study characterizes the molecular basis for dramatically different nitric oxide production and immune function between the B2 and the B19 haplotype chicken macrophages. A large-scale RNA sequencing approach was employed to sequence the RNA of purified macrophages from each haplotype group (B2 vs. B19) during differentiation and after stimulation. Our results demonstrate that a large number of genes exhibit divergent expression between B2 and B19 haplotype cells both before and after stimulation. These differences in gene expression appear to be regulated by complex epigenetic mechanisms that need further investigation.
NASA Astrophysics Data System (ADS)
Meena, Balakrishnan; Anburajan, Lawrance; Sathish, Thadikamala; Vijaya Raghavan, Rangamaran; Dharani, Gopal; Valsalan Vinithkumar, Nambali; Kirubagaran, Ramalingam
2015-07-01
Marine actinobacteria are known to be a rich source of novel metabolites with diverse biological activities. In this study, a potential extracellular L-asparaginase was characterised from Streptomyces griseus NIOT-VKMA29. Box-Behnken based optimization was used to determine the culture medium components to enhance L-asparaginase production. pH, starch, yeast extract and L-asparagine have a direct correlation with enzyme production, with a maximum yield of 56.78 IU mL⁻¹. A verification experiment was performed to validate the experiment and more than 99% validity was established. The L-asparaginase biosynthesis gene (ansA) from Streptomyces griseus NIOT-VKMA29 was heterologously expressed in Escherichia coli M15 and the enzyme production was increased threefold (123 IU mL⁻¹) over the native strain. The ansA gene sequence reported in this study contains several base substitutions relative to the sequences reported in GenBank, resulting in altered amino acid sequences of the translated protein.
Simonelli, F; Cennamo, G; Ziviello, C; Testa, F; de Crecchio, G; Nesti, A; Manitto, M P; Ciccodicola, A; Banfi, S; Brancato, R; Rinaldi, E
2003-01-01
Aims: To describe the clinical phenotype of X linked juvenile retinoschisis in eight Italian families with six different mutations in the XLRS1 gene. Methods: Complete ophthalmic examinations, electroretinography and A and B-scan standardised echography were performed in 18 affected males. The coding sequences of the XLRS1 gene were amplified by polymerase chain reaction and directly sequenced on an automated sequencer. Results: Six different XLRS1 mutations were identified; two of these mutations, Ile81Asn and Trp122Cys, have not been previously described. The affected males showed an electronegative response to the standard white scotopic stimulus and a prolonged implicit time of the 30 Hz flicker. In the families with Trp112Cys and Trp122Cys mutations we observed a more severe retinoschisis (RS) clinical picture compared with the other genotypes. Conclusion: The severe RS phenotypes associated with the Trp112Cys and Trp122Cys mutations suggest that these mutations determine a notable alteration in the function of the retinoschisin protein. PMID:12928282
Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat
2014-01-01
Background Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution ‘nullisomic-tetrasomic’ lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. Results We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. Conclusions We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution. PMID:24726045
A novel ATTR L32V mutation causes familial amyloid polyneuropathy in a Bolivian family.
Martínez-Ulloa, Pedro L; Vallejo, Manuela; Corral, Iñigo; García-Barragán, Nuria; Alcazar, Alberto; Martínez-Alonso, Emma; Martínez-Poles, Javier; Pian, Hector; Jiménez-Escrig, Adriano
2017-09-01
We report a new transthyretin (ATTR) gene c.272C>G mutation and variant protein, p.Leu32Val, in a kindred of Bolivian origin with a rapidly progressive peripheral neuropathy and cardiomyopathy. Three individuals from a kindred with peripheral nerve and cardiac amyloidosis were examined. Analysis of the TTR gene was performed by Sanger direct sequencing. Neuropathologic examination was performed on the index patient with mass spectrometry study of the ATTR deposition. Direct DNA sequence analysis of exons 2, 3, and 4 of the TTR gene demonstrated a c.272C>G mutation in exon 2 (p.L32V). Sural nerve biopsy revealed massive amyloid deposition in the perineurium, endoneurium and vasa nervorum. Mass spectrometric analyses of ATTR immunoprecipitated from nerve biopsy showed the presence of both wild-type and variant proteins. The observed mass results for the wild-type and variant proteins were consistent with the predicted values calculated from the genetic analysis data. The ATTR L32V is associated with a severe course. This has implications for treatment of affected individuals and counseling of family members. © 2017 Peripheral Nerve Society.
Biotype-specific tcpA genes in Vibrio cholerae.
Iredell, J R; Manning, P A
1994-08-01
The tcpA gene, encoding the structural subunit of the toxin-coregulated pilus, has been isolated from a variety of clinical isolates of Vibrio cholerae, and the nucleotide sequence determined. Strict biotype-specific conservation within both the coding and putative regulatory regions was observed, with important differences between the El Tor and classical biotypes. V. cholerae O139 Bengal strains appear to have El Tor-type tcpA genes. Environmental O1 and non-O1 isolates have sequences that bind an El Tor-specific tcpA DNA probe and that are weakly and variably amplified by tcpA-specific polymerase chain reaction primers, under conditions of reduced stringency. The data presented allow the selection of primer pairs to help distinguish between clinical and environmental isolates, and to distinguish El Tor (and Bengal) biotypes from classical biotypes of V. cholerae. While the role of TcpA in cholera vaccine preparations remains unclear, the data strongly suggest that TcpA-containing vaccines directed at O1 strains need include only the two forms of TcpA, and that such vaccines directed at (O139) Bengal strains should include the TcpA of El Tor biotype.
GeneSCF: a real-time based functional enrichment tool with support for multiple organisms.
Subhash, Santhilal; Kanduri, Chandrasekhar
2016-09-13
High-throughput technologies such as ChIP-sequencing, RNA-sequencing, DNA sequencing and quantitative metabolomics generate a huge volume of data. Researchers often rely on functional enrichment tools to interpret the biological significance of the affected genes from these high-throughput studies. However, currently available functional enrichment tools need to be updated frequently to adapt to new entries from the functional database repositories. Hence there is a need for a simplified tool that can perform functional enrichment analysis by using updated information directly from source databases such as KEGG, Reactome or Gene Ontology. In this study, we focused on designing a command-line tool called GeneSCF (Gene Set Clustering based on Functional annotations) that can predict the functionally relevant biological information for a set of genes in a real-time updated manner. It is designed to handle information from more than 4000 organisms from freely available prominent functional databases like KEGG, Reactome and Gene Ontology. We successfully applied our tool to two published datasets to predict the biologically relevant functional information. The core features of this tool were tested on Linux machines without the need to install additional dependencies. GeneSCF is more reliable than other enrichment tools because of its ability to use reference functional databases in real time to perform enrichment analysis. It is easy to integrate with other pipelines available for downstream analysis of high-throughput data. More importantly, GeneSCF can run multiple gene lists simultaneously on different organisms, thereby saving time for the users. Since the tool is designed to be ready-to-use, there is no need for any complex compilation and installation procedures.
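Functional enrichment of the kind described above is often scored with a one-sided hypergeometric (Fisher-type) test. The sketch below shows that calculation only. It is not GeneSCF's code, the abstract does not say which statistic the tool uses, and the function name and the counts in the usage lines are made up for illustration.

```python
from math import comb

def hypergeom_enrichment_p(universe, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `annotated` belong to the term.

    Hypothetical helper, shown only to illustrate the statistic.
    """
    upper = min(annotated, selected)
    total = comb(universe, selected)
    # Sum the upper tail of the hypergeometric distribution exactly.
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Made-up counts: 20000 background genes, 150 in the pathway,
# 300 in the user's list, 12 of those in the pathway.
p = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"enrichment p-value: {p:.3g}")
```

In practice the p-values for all tested terms would also be corrected for multiple testing.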
Fanconi anemia gene editing by the CRISPR/Cas9 system.
Osborn, Mark J; Gabriel, Richard; Webber, Beau R; DeFeo, Anthony P; McElroy, Amber N; Jarjour, Jordan; Starker, Colby G; Wagner, John E; Joung, J Keith; Voytas, Daniel F; von Kalle, Christof; Schmidt, Manfred; Blazar, Bruce R; Tolar, Jakub
2015-02-01
Genome engineering with designer nucleases is a rapidly progressing field, and the ability to correct human gene mutations in situ is highly desirable. We employed fibroblasts derived from a patient with Fanconi anemia as a model to test the ability of the clustered regularly interspaced short palindromic repeats/Cas9 nuclease system to mediate gene correction. We show that the Cas9 nuclease and nickase each resulted in gene correction, but the nickase, because of its ability to preferentially mediate homology-directed repair, resulted in a higher frequency of corrected clonal isolates. To assess the off-target effects, we used both a predictive software platform to identify intragenic sequences of homology and a genome-wide screen utilizing linear amplification-mediated PCR. We observed no off-target activity and show that RNA-guided endonuclease candidate sites that do not possess low sequence complexity function in a highly specific manner. Collectively, we provide proof of principle for precision genome editing in Fanconi anemia, a DNA repair-deficient human disorder.
Expression profiling of the mouse early embryo: Reflections and Perspectives
Ko, Minoru S. H.
2008-01-01
The laboratory mouse plays an important role in our understanding of early mammalian development and provides an invaluable model for human early embryos, which are difficult to study for ethical and technical reasons. Comprehensive collections of cDNA clones, their sequences, and complete genome sequence information, which have been accumulated over the last two decades, have provided even more advantages to mouse models. Here the progress in global gene expression profiling in early mouse embryos and, to some extent, stem cells is reviewed and the future directions and challenges are discussed. The discussions include the restatement of global gene expression profiles as a snapshot of cellular status, and the subsequent distinction between the differentiation state and physiological state of the cells. The discussions then extend to the biological problems that can be addressed only through global expression profiling, which include: a bird’s-eye view of global gene expression changes, a molecular index for developmental potency, cell lineage trajectory, microarray-guided cell manipulation, and the possibility of delineating gene regulatory cascades and networks. PMID:16739220
Qin, Shengfang; Wang, Xueyan; Li, Yunxing; Wei, Ping; Chen, Chun; Zeng, Lan
2016-02-01
To explore the genetic mechanism underlying the phenotypic variability in a patient carrying a rare ring chromosome 9. The karyotype of the patient was analyzed with cytogenetic methods. The presence of the sex chromosome was confirmed with fluorescence in situ hybridization. The SRY gene was subjected to PCR amplification and direct sequencing. Potential deletions and duplications were detected with array-based comparative genomic hybridization (array-CGH). The karyotype of the patient comprised 6 types of cell lines containing a ring chromosome 9. The SRY gene sequence was normal. By array-CGH, the patient carried a hemizygous deletion at 9p24.3-p23 (174 201-9 721 761) encompassing 30 genes listed in Online Mendelian Inheritance in Man. The phenotypic variability of the 9p deletion syndrome in conjunction with ring chromosome 9 may be attributable to multiple factors including loss of chromosomal material, insufficient dosage of genes, instability of the ring chromosome, and pattern of inheritance.
Investigating the Genome Diversity of B. cereus and Evolutionary Aspects of B. anthracis Emergence
Papazisi, Leka; Rasko, David A.; Ratnayake, Shashikala; Bock, Geoff R.; Remortel, Brian G.; Appalla, Lakshmi; Liu, Jia; Dracheva, Tatiana; Braisted, John C.; Shallom, Shamira; Jarrahi, Benham; Snesrud, Erik; Ahn, Susie; Sun, Qiang; Rilstone, Jenifer; Økstad, Ole Andreas; Kolstø, Anne-Brit; Fleischmann, Robert D.; Peterson, Scott N.
2011-01-01
Here we report the use of a multi-genome DNA microarray to investigate the genome diversity of Bacillus cereus group members and elucidate the events associated with the emergence of B. anthracis, the causative agent of anthrax, a lethal zoonotic disease. We initially performed directed genome sequencing of seven diverse B. cereus strains to identify novel sequences encoded in those genomes. The novel genes identified, combined with those publicly available, allowed the design of a “species” DNA microarray. Comparative genomic hybridization analyses of 41 strains indicate that substantial heterogeneity exists with respect to the genes comprising functional role categories. While the acquisition of the plasmid-encoded pathogenicity island (pXO1) and capsule genes (pXO2) represents a crucial landmark dictating the emergence of B. anthracis, the evolution of this species and its close relatives was associated with an overall shift in the fraction of genes devoted to energy metabolism, cellular processes, transport, as well as virulence. PMID:21447378
Kang, In-Nee; Musa, Maslinda; Harun, Fatimah; Junit, Sarni Mat
2010-02-01
The FOXE1 gene was screened for mutations in a cohort of 34 unrelated patients with congenital hypothyroidism, 14 of whom had thyroid dysgenesis and 18 were normal (the thyroid status for 2 patients was unknown). The entire coding region of the FOXE1 gene was PCR-amplified, then analyzed using single-stranded conformational polymorphism, followed by confirmation by direct DNA sequencing. DNA sequencing analysis revealed a heterozygous A>G transition at nucleotide position 394 in one of the patients. The nucleotide transition changed asparagine to aspartate at codon 132 in the highly conserved region of the forkhead DNA binding domain of the FOXE1 gene. This mutation was not detected in a total of 104 normal healthy individuals screened. The binding ability of the mutant FOXE1 protein to the human thyroperoxidase (TPO) promoter was slightly reduced compared with the wild-type FOXE1. The mutation also caused a 5% loss of TPO transcriptional activity.
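The stated coordinates can be checked with simple arithmetic: coding nucleotide 394 is the first base of codon 132, and an A>G change at the first position of either asparagine codon (AAT or AAC) yields an aspartate codon (GAT or GAC). The sketch below only verifies that mapping. The helper name and the four-entry codon table are illustrative assumptions, and the actual FOXE1 codon sequence is not given in the abstract.

```python
def codon_position(cdna_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    codon = (cdna_pos - 1) // 3 + 1
    pos_in_codon = (cdna_pos - 1) % 3 + 1
    return codon, pos_in_codon

# Only the four codons relevant to this check (standard genetic code).
CODON_TABLE = {"AAT": "Asn", "AAC": "Asn", "GAT": "Asp", "GAC": "Asp"}

codon, pos = codon_position(394)
print(codon, pos)  # 132 1

# Apply A>G at the first codon position to both asparagine codons.
for ref in ("AAT", "AAC"):
    alt = "G" + ref[1:]
    print(ref, CODON_TABLE[ref], "->", alt, CODON_TABLE[alt])
```

Both possible asparagine codons give aspartate after the change, so the reported amino acid substitution does not depend on which codon the gene actually uses.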
Miryounesi, Mohammad; Ghafouri-Fard, Soudeh; Goodarzi, Hamedreza; Fardaei, Majid
2015-05-01
Maple syrup urine disease (MSUD) is an autosomal recessive metabolic disease caused by mutations in the BCKDHA, BCKDHB, DBT and DLD genes, which encode the E1α, E1β, E2 and E3 subunits of the branched chain α ketoacid dehydrogenase (BCKD) complex, respectively. This complex is involved in the metabolism of branched-chain amino acids. In this study, we analyzed the DNA sequences of BCKDHA and BCKDHB genes in an infant who suffered from MSUD and died at the age of 6 months. We found a new missense mutation in exon 5 of BCKDHB gene (c.508C>T). The heterozygosity of the parents for the mentioned nucleotide change was confirmed by direct sequence analysis of the corresponding segment. Another missense mutation has been found in the same codon previously and shown by in silico analyses to be deleterious. This report provides further evidence that this amino acid change can cause classic MSUD.
Pesz, Karolina; Pienkowski, Victor Murcia; Pollak, Agnieszka; Gasperowicz, Piotr; Sykulski, Maciej; Kosińska, Joanna; Kiszko, Magdalena; Krzykwa, Bogusława; Bartnik-Głaska, Magdalena; Nowakowska, Beata; Rydzanicz, Małgorzata; Sasiadek, Maria Małgorzata; Płoski, Rafał
2018-04-03
Mapping of de novo balanced chromosomal translocations (BCTs) in patients with sporadic poorly characterized disease(s) is an unbiased method of finding candidate gene(s) responsible for the observed symptoms. We present a paediatric patient suffering from epilepsy, developmental delay (DD) and atrial septal defect IIº (ASD) requiring surgery. Karyotyping indicated an apparently balanced de novo reciprocal translocation 46,XX,t(3;4)(p25.3;q31.1), whereas aCGH did not reveal any copy number changes. Using shallow mate-pair whole genome sequencing and direct Sanger sequencing of breakpoint regions, we found that the translocation disrupted the SLC6A1 and NAA15 genes. Our results confirm two previous reports indicating that loss of function of a single allele of SLC6A1 causes epilepsy. In addition, we extend existing evidence that disruption of NAA15 is associated with DD and with congenital heart defects. Copyright © 2018 Elsevier Masson SAS. All rights reserved.
Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers
Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Jerusalem, Guy
2018-01-01
Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers. PMID:29301303
Natural Antisense Transcripts: Molecular Mechanisms and Implications in Breast Cancers.
Latgé, Guillaume; Poulet, Christophe; Bours, Vincent; Josse, Claire; Jerusalem, Guy
2018-01-02
Natural antisense transcripts are RNA sequences that can be transcribed from both DNA strands at the same locus but in the opposite direction from the gene transcript. Because strand-specific high-throughput sequencing of the antisense transcriptome has only been available for less than a decade, many natural antisense transcripts were first described as long non-coding RNAs. Although the precise biological roles of natural antisense transcripts are not known yet, an increasing number of studies report their implication in gene expression regulation. Their expression levels are altered in many physiological and pathological conditions, including breast cancers. Among the potential clinical utilities of the natural antisense transcripts, the non-coding|coding transcript pairs are of high interest for treatment. Indeed, these pairs can be targeted by antisense oligonucleotides to specifically tune the expression of the coding-gene. Here, we describe the current knowledge about natural antisense transcripts, their varying molecular mechanisms as gene expression regulators, and their potential as prognostic or predictive biomarkers in breast cancers.
Lu, Zefu; Yu, Hong; Xiong, Guosheng; Wang, Jing; Jiao, Yongqing; Liu, Guifu; Jing, Yanhui; Meng, Xiangbing; Hu, Xingming; Qian, Qian; Fu, Xiangdong; Wang, Yonghong; Li, Jiayang
2013-01-01
IDEAL PLANT ARCHITECTURE1 (IPA1) is critical in regulating rice (Oryza sativa) plant architecture and substantially enhances grain yield. To elucidate its molecular basis, we first confirmed IPA1 as a functional transcription activator and then identified 1067 and 2185 genes associated with IPA1 binding sites in shoot apices and young panicles, respectively, through chromatin immunoprecipitation sequencing assays. The SQUAMOSA PROMOTER BINDING PROTEIN-box direct binding core motif GTAC was highly enriched in IPA1 binding peaks; interestingly, a previously uncharacterized indirect binding motif TGGGCC/T was found to be significantly enriched through the interaction of IPA1 with proliferating cell nuclear antigen PROMOTER BINDING FACTOR1 or PROMOTER BINDING FACTOR2. Genome-wide expression profiling by RNA sequencing revealed IPA1 roles in diverse pathways. Moreover, our results demonstrated that IPA1 could directly bind to the promoter of rice TEOSINTE BRANCHED1, a negative regulator of tiller bud outgrowth, to suppress rice tillering, and directly and positively regulate DENSE AND ERECT PANICLE1, an important gene regulating panicle architecture, to influence plant height and panicle length. The elucidation of target genes of IPA1 genome-wide will contribute to understanding the molecular mechanisms underlying plant architecture and to facilitating the breeding of elite varieties with ideal plant architecture. PMID:24170127
A Functional Nuclear Localization Sequence in the C. elegans TRPV Channel OCR-2
Ezak, Meredith J.; Ferkey, Denise M.
2011-01-01
The ability to modulate gene expression in response to sensory experience is critical to the normal development and function of the nervous system. Calcium is a key activator of the signal transduction cascades that mediate the process of translating a cellular stimulus into transcriptional changes. With the recent discovery that the mammalian Cav1.2 calcium channel can be cleaved, enter the nucleus and act as a transcription factor to control neuronal gene expression, a more direct role for the calcium channels themselves in regulating transcription has begun to be appreciated. Here we report the identification of a nuclear localization sequence (NLS) in the C. elegans transient receptor potential vanilloid (TRPV) cation channel OCR-2. TRPV channels have previously been implicated in transcriptional regulation of neuronal genes in the nematode, although the precise mechanism remains unclear. We show that the NLS in OCR-2 is functional, being able to direct nuclear accumulation of a synthetic cargo protein as well as the carboxy-terminal cytosolic tail of OCR-2 where it is endogenously found. Furthermore, we discovered that a carboxy-terminal portion of the full-length channel can localize to the nucleus of neuronal cells. These results suggest that the OCR-2 TRPV cation channel may have a direct nuclear function in neuronal cells that was not previously appreciated. PMID:21957475
Vukmirovic, Milica; Herazo-Maya, Jose D; Blackmon, John; Skodric-Trifunovic, Vesna; Jovanovic, Dragana; Pavlovic, Sonja; Stojsic, Jelena; Zeljkovic, Vesna; Yan, Xiting; Homer, Robert; Stefanovic, Branko; Kaminski, Naftali
2017-01-12
Idiopathic Pulmonary Fibrosis (IPF) is a lethal lung disease of unknown etiology. A major limitation in transcriptomic profiling of lung tissue in IPF has been a dependence on snap-frozen fresh tissues (FF). In this project we sought to determine whether genome-scale transcript profiling using RNA Sequencing (RNA-Seq) could be applied to archived Formalin-Fixed Paraffin-Embedded (FFPE) IPF tissues. We isolated total RNA from 7 IPF and 5 control FFPE lung tissues and performed 50 base pair paired-end sequencing on an Illumina HiSeq 2000. TopHat2 was used to map sequencing reads to the human genome. On average ~62 million reads (53.4% of ~116 million reads) were mapped per sample. 4,131 genes were differentially expressed between IPF and controls (1,920 increased and 2,211 decreased, FDR < 0.05). We compared our results to differentially expressed genes calculated from a previously published dataset generated from FF tissues analyzed on Agilent microarrays (GSE47460). The overlap of differentially expressed genes was very high (760 increased and 1,413 decreased, FDR < 0.05). Only 92 differentially expressed genes changed in opposite directions. Pathway enrichment analysis performed using MetaCore confirmed numerous IPF-relevant genes and pathways including extracellular remodeling, TGF-beta, and WNT. Gene network analysis of MMP7, a highly differentially expressed gene in both datasets, revealed the same canonical pathways and gene network candidates in RNA-Seq and microarray data. For validation by NanoString nCounter® we selected 35 genes that had a fold change of 2 in at least one dataset (10 discordant, 10 significantly differentially expressed in one dataset only and 15 concordant genes). High concordance of fold change and FDR was observed for each type of the samples (FF vs FFPE) with both microarrays (r = 0.92) and RNA-Seq (r = 0.90) and the number of discordant genes was reduced to four.
Our results demonstrate that RNA sequencing of RNA obtained from archived FFPE lung tissues is feasible. The results obtained from FFPE tissue are highly comparable to FF tissues. The ability to perform RNA-Seq on archived FFPE IPF tissues should greatly enhance the availability of tissue biopsies for research in IPF.
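The FDR < 0.05 thresholds quoted above imply some multiple-testing correction across genes. The abstract does not say which procedure was used, so the sketch below shows the widely used Benjamini-Hochberg adjustment purely as an illustration. The function name and the toy p-values are assumptions.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

toy_p = [0.01, 0.04, 0.03, 0.005]  # made-up per-gene p-values
toy_q = benjamini_hochberg(toy_p)
significant = [i for i, q in enumerate(toy_q) if q < 0.05]
print(toy_q, significant)
```

A gene would be called differentially expressed when its adjusted value falls below the chosen FDR cut-off, here 0.05.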
Maceachern, Sean; Muir, William M; Crosby, Seth; Cheng, Hans H
2011-06-03
Marek's disease (MD), a T cell lymphoma induced by the highly oncogenic α-herpesvirus Marek's disease virus (MDV), is the main chronic infectious disease concern threatening the poultry industry. Enhancing genetic resistance to MD in commercial poultry is an attractive method to augment MD vaccines, which are currently the control method of choice. In order to optimally implement this control strategy through marker-assisted selection (MAS) and to gain biological information, it is necessary to identify specific genes that influence MD incidence. A genome-wide screen for allele-specific expression (ASE) in response to MDV infection was conducted. The highly inbred ADOL chicken lines 6 (MD resistant) and 7 (MD susceptible) were inter-mated in reciprocal crosses and half of the progeny challenged with MDV. Splenic RNA pools were generated for each treatment group at a single time point after infection, sequenced using a next-generation sequencer, then analyzed for ASE. To validate and extend the results, Illumina GoldenGate assays for selected cSNPs were developed and used on all RNA samples from all 6 time points following MDV challenge. RNA sequencing resulted in 11-13+ million mappable reads per treatment group, 1.7+ Gb total sequence, and 22,655 high-confidence cSNPs. Analysis of these cSNPs revealed that 5360 cSNPs in 3773 genes exhibited statistically significant allelic imbalance. Of the 1536 GoldenGate assays, 1465 were successfully scored, with all but 19 exhibiting evidence for allelic imbalance. ASE is an efficient method to identify potentially all or most of the genes influencing this complex trait. The identified cSNPs can be further evaluated in resource populations to determine their allelic direction and size of effect on genetic resistance to MD as well as being directly implemented in genomic selection programs.
The described method, although demonstrated in inbred chicken lines, is applicable to all traits in any diploid species, and should prove to be a simple method to identify the majority of genes controlling any complex trait.
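Allelic imbalance at a cSNP is commonly assessed by asking whether the read counts for the two alleles depart from the 1:1 ratio expected in a heterozygote. The sketch below shows an exact two-sided binomial test for that null. It is an illustrative assumption rather than the authors' pipeline: the abstract does not name the test used, and pooled RNA data may need a model that allows for extra variance.

```python
from math import comb

def allelic_imbalance_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value against a 50:50 allelic ratio.

    With p = 0.5 every outcome k has probability comb(n, k) / 2**n, so
    outcomes can be compared using the integer binomial coefficients.
    """
    n = ref_reads + alt_reads
    observed = comb(n, ref_reads)
    # Sum all outcomes that are no more likely than the observed one.
    tail = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return tail / 2 ** n

# Made-up read counts for one cSNP.
print(allelic_imbalance_p(70, 30))
print(allelic_imbalance_p(52, 48))
```

Across thousands of cSNPs these p-values would also need a multiple-testing correction before any site is declared imbalanced.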
Günthard, H F; Wong, J K; Ignacio, C C; Havlir, D V; Richman, D D
1998-07-01
The performance of the high-density oligonucleotide array methodology (GeneChip) in detecting drug resistance mutations in HIV-1 pol was compared with that of automated dideoxynucleotide sequencing (ABI) of clinical samples, viral stocks, and plasmid-derived NL4-3 clones. Sequences from 29 clinical samples (plasma RNA, n = 17; lymph node RNA, n = 5; lymph node DNA, n = 7) from 12 patients, from 6 viral stock RNA samples, and from 13 NL4-3 clones were generated by both methods. Editing was done independently by a different investigator for each method before comparing the sequences. In addition, NL4-3 wild type (WT) and mutants were mixed in varying concentrations and sequenced by both methods. Overall, a concordance of 99.1% was found for a total of 30,865 bases compared. The comparison of clinical samples (plasma RNA and lymph node RNA and DNA) showed a slightly lower match of base calls, 98.8% for 19,831 nucleotides compared (protease region, 99.5%, n = 8272; RT region, 98.3%, n = 11,316), than for viral stocks and NL4-3 clones (protease region, 99.8%; RT region, 99.5%). Artificial mixing experiments showed a bias toward calling wild-type bases by GeneChip. Discordant base calls are most likely due to differential detection of mixtures. The concordance between GeneChip and ABI was high and appeared dependent on the nature of the templates (directly amplified versus cloned) and the complexity of mixes.
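The concordance figures above are the fraction of compared positions at which the two methods make the same base call. A minimal sketch of that comparison follows. The function name and the toy sequences are hypothetical, and the handling of mixed-base (ambiguity) calls, which the abstract identifies as the main source of discordance, is deliberately left out.

```python
def concordance(calls_a, calls_b):
    """Return (fraction of matching calls, list of discordant 0-based positions)."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("sequences must be non-empty and of equal length")
    discordant = [i for i, (a, b) in enumerate(zip(calls_a, calls_b)) if a != b]
    return 1 - len(discordant) / len(calls_a), discordant

# Toy example: 'R' stands for a mixed A/G call made by one method only.
chip_calls = "ATGGCAACCT"
sanger_calls = "ATGGCRACCT"
fraction, positions = concordance(chip_calls, sanger_calls)
print(f"{fraction:.1%} concordant; discordant at {positions}")
```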
2014-01-01
Background Although it is possible to recover the complete mitogenome directly from shotgun sequencing data, currently reported methods and pipelines are still relatively time consuming and costly. Using a sample of the Australian freshwater crayfish Engaeus lengana, we demonstrate that it is possible to achieve a three-day turnaround time (four hours hands-on time) from tissue sample to NCBI-ready submission file through the integration of the MiSeq sequencing platform, Nextera sample preparation protocol, MITObim assembly algorithm and MITOS annotation pipeline. Results The complete mitochondrial genome of the parastacid freshwater crayfish, Engaeus lengana, was recovered by modest shotgun sequencing (1.2 gigabases) using the Illumina MiSeq benchtop sequencing platform. Genome assembly using the MITObim mitogenome assembler recovered the mitochondrial genome as a single contig with a 97-fold mean coverage (min. = 17; max. = 138). The mitogenome consists of 15,934 base pairs and contains the typical 37 mitochondrial genes and a non-coding AT-rich region. The genome arrangement is similar to the only other published parastacid mitogenome from the Australian genus Cherax. Conclusions We infer that the gene order arrangement found in Cherax destructor is common to Australian crayfish and may be a derived feature of the southern hemisphere family Parastacidae. Further, we report, to our knowledge, the simplest and fastest protocol for the recovery and assembly of complete mitochondrial genomes using the MiSeq benchtop sequencer. PMID:24484414
Bacterial identification and subtyping using DNA microarray and DNA sequencing.
Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D
2012-01-01
The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies will not only identify, isotype, and serotype pathogenic bacteria, but will also aid in the discovery of new gene functions by detecting gene expression in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing using 454 pyrosequencing is so cost-effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing is a simple-to-use technique that can produce accurate and quantitative analysis of DNA sequences with great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarrays with fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) the use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
Regulatory link between DNA methylation and active demethylation in Arabidopsis
Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang
2015-01-01
De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Bialk, Pawel; Rivera-Torres, Natalia; Strouse, Bryan; Kmiec, Eric B.
2015-01-01
Single-stranded DNA oligonucleotides (ssODNs) can direct the repair of a single base mutation in human genes. While the regulation of this gene editing reaction has been partially elucidated, the low frequency with which repair occurs has hampered development toward clinical application. In this work a CRISPR/Cas9 complex is employed to induce double strand DNA breakage at specific sites surrounding the nucleotide designated for exchange. The result is a significant elevation in ssODN-directed gene repair, validated by a phenotypic readout. By analysing reaction parameters, we have uncovered restrictions on gene editing activity involving CRISPR/Cas9 complexes. First, ssODNs that hybridize to the non-transcribed strand direct a higher level of gene repair than those that hybridize to the transcribed strand. Second, cleavage must be proximal to the targeted mutant base to enable higher levels of gene editing. Third, DNA cleavage enables a higher level of gene editing activity as compared to single-stranded DNA nicks, created by modified Cas9 (Nickases). Fourth, we calculated the hybridization potential and free energy levels of ssODNs that are complementary to the guide RNA sequences of CRISPRs used in this study. We find a correlation between free energy potential and the capacity of single-stranded oligonucleotides to inhibit specific DNA cleavage activity, thereby indirectly reducing gene editing activity. Our data provide novel information that might be taken into consideration in the design and usage of CRISPR/Cas9 systems with ssODNs for gene editing. PMID:26053390
A Genome Sequence-directed Investigation of D-Tagatose Utilization by Kosmotoga olearia
NASA Astrophysics Data System (ADS)
Butzin, N. C.; Bradnan, D. M.; Noll, K. M.
2010-04-01
The research goals are to determine the pathway by which Kosmotoga olearia utilizes tagatose, the roles of Kole_0686, Kole_0737 and Kole_1652 in this process, and the evolutionary history of the genes that encode the proteins involved in tagatose catabolism.
Yamaguchi, Kiyoshi; Nagayama, Satoshi; Shimizu, Eigo; Komura, Mitsuhiro; Yamaguchi, Rui; Shibuya, Tetsuo; Arai, Masami; Hatakeyama, Seira; Ikenoue, Tsuneo; Ueno, Masashi; Miyano, Satoru; Imoto, Seiya; Furukawa, Yoichi
2016-05-24
Germline mutations in the tumor suppressor gene APC are associated with familial adenomatous polyposis (FAP). Here we applied whole-genome sequencing (WGS) to the DNA of a sporadic FAP patient in whom we did not find any pathological APC mutations by direct sequencing. WGS identified a promoter deletion of approximately 10 kb encompassing promoter 1B and exon 1B of APC. Additional allele-specific expression analysis by deep cDNA sequencing revealed that the deletion reduced the expression of the mutated APC allele to as low as 11.2% of the total APC transcripts, suggesting that the residual mutant transcripts were driven by other promoter(s). Furthermore, cap analysis of gene expression (CAGE) demonstrated that the deleted promoter 1B region is responsible for the great majority of APC transcription in many tissues except the brain. The deletion decreased the transcripts of APC-1B to 39-45% in the patient compared to the healthy controls, but it did not decrease those of APC-1A. Different deletions including promoter 1B have been reported in FAP patients. Taken together, our results strengthen the evidence that analysis of structural variations in promoter 1B should be considered for FAP patients whose pathological mutations are not identified by conventional direct sequencing.
Kvitt, H; Ucko, M; Colorni, A; Batargias, C; Zlotkin, A; Knibb, W
2002-04-05
A PCR protocol for the rapid diagnosis of fish 'pasteurellosis' based on 16S rRNA gene sequences was developed. The procedure combines low annealing temperature that detects low titers of Photobacterium damselae but also related species, and high annealing temperature for the specific identification of P. damselae directly from infected fish. The PCR protocol was validated on 19 piscine isolates of P. damselae ssp. piscicida from different geographic regions (Japan, Italy, Spain, Greece and Israel), on spontaneously infected sea bream Sparus aurata and sea bass Dicentrarchus labrax, and on closely related American Type Culture Collection (ATCC) reference strains. PCR using high annealing temperature (64 degrees C) discriminated between P. damselae and closely related reference strains, including P. histaminum. Sixteen isolates of P. damselae ssp. piscicida, 2 P. damselae ssp. piscicida reference strains and 1 P. damselae ssp. damselae reference strain were subjected to Amplified Fragment Length Polymorphism (AFLP) analysis, and a similarity matrix was produced. Accordingly, the Japanese isolates of P. damselae ssp. piscicida were distinguished from the Mediterranean/European isolates at a cut-off value of 83% similarity. A further subclustering at a cut-off value of 97% allowed discrimination between the Israeli P. damselae ssp. piscicida isolates and the other Mediterranean/European isolates. The combination of PCR direct amplification and AFLP provides a 2-step procedure, where P. damselae is rapidly identified at genus level on the basis of its 16S rRNA gene sequence and then grouped into distinct clusters on the basis of AFLP polymorphisms. The first step of direct amplification is highly sensitive and has immediate practical consequences, offering fish farmers a rapid diagnosis, while the AFLP is more specific and detects intraspecific variation which, in our study, also reflected geographic correspondence. 
Because of its superior discriminative properties, AFLP can be an important tool for epidemiological and taxonomic studies of this highly homogeneous genus.
Sequence-based screening for self-sufficient P450 monooxygenase from a metagenome library.
Kim, B S; Kim, S Y; Park, J; Park, W; Hwang, K Y; Yoon, Y J; Oh, W K; Kim, B Y; Ahn, J S
2007-05-01
Cytochrome P450 monooxygenases (CYPs) are useful catalysts for oxidation reactions. Self-sufficient CYPs harbour a reductive domain covalently connected to a P450 domain and are known for their robust catalytic activity with great potential as biocatalysts. In an effort to expand genetic sources of self-sufficient CYPs, we devised a sequence-based screening system to identify them in a soil metagenome. We constructed a soil metagenome library and performed sequence-based screening for self-sufficient CYP genes. A new CYP gene, syk181, was identified from the metagenome library. Phylogenetic analysis revealed that SYK181 formed a distinct phylogenic line with 46% amino-acid-sequence identity to CYP102A1 which has been extensively studied as a fatty acid hydroxylase. The heterologously expressed SYK181 showed significant hydroxylase activity towards naphthalene and phenanthrene as well as towards fatty acids. Sequence-based screening of metagenome libraries is expected to be a useful approach for searching self-sufficient CYP genes. The translated product of syk181 shows self-sufficient hydroxylase activity towards fatty acids and aromatic compounds. SYK181 is the first self-sufficient CYP obtained directly from a metagenome library. The genetic and biochemical information on SYK181 are expected to be helpful for engineering self-sufficient CYPs with broader catalytic activities towards various substrates, which would be useful for bioconversion of natural products and biodegradation of organic chemicals.
LinkFinder: An expert system that constructs phylogenic trees
NASA Technical Reports Server (NTRS)
Inglehart, James; Nelson, Peter C.
1991-01-01
An expert system has been developed using the C Language Integrated Production System (CLIPS) that automates the process of constructing DNA sequence based phylogenies (trees or lineages) that indicate evolutionary relationships. LinkFinder takes as input homologous DNA sequences from distinct individual organisms. It measures variations between the sequences, selects appropriate proportionality constants, and estimates the time that has passed since each pair of organisms diverged from a common ancestor. It then designs and outputs a phylogenic map summarizing these results. LinkFinder can find genetic relationships between different species, and between individuals of the same species, including humans. It was designed to take advantage of the vast amount of sequence data being produced by the Genome Project, and should be of value to evolution theorists who wish to utilize this data, but who have no formal training in molecular genetics. Evolutionary theory holds that distinct organisms carrying a common gene inherited that gene from a common ancestor. Homologous genes vary from individual to individual and species to species, and the amount of variation is now believed to be directly proportional to the time that has passed since divergence from a common ancestor. The proportionality constant must be determined experimentally; it varies considerably with the types of organisms and DNA molecules under study. Given an appropriate constant, and the variation between two DNA sequences, a simple linear equation gives the divergence time.
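The linear relationship described at the end of this abstract can be illustrated with a minimal sketch. It is not LinkFinder's CLIPS implementation: the abstract does not say how variation is measured, so the sketch assumes the simple fraction of mismatched sites between two pre-aligned, equal-length sequences, and the names `p_distance`, `divergence_time`, and `rate_constant` are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences differ.

    Assumption: this stands in for the 'variation' the abstract mentions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return mismatches / len(seq_a)


def divergence_time(variation: float, rate_constant: float) -> float:
    """Solve the linear model variation = rate_constant * time for time.

    rate_constant is the experimentally determined proportionality constant
    (variation per unit time); its value here is supplied by the caller.
    """
    if rate_constant <= 0:
        raise ValueError("rate_constant must be positive")
    return variation / rate_constant


if __name__ == "__main__":
    d = p_distance("ACGTACGTAC", "ACGTTCGTAA")  # 2 mismatches in 10 sites
    # The constant 0.02 (variation per million years) is purely illustrative.
    print(divergence_time(d, rate_constant=0.02))
```

The time unit of the result is whatever unit the proportionality constant is expressed in, which is why the constant has to match the organisms and DNA molecules under study.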
Genome sequence diversity and clues to the evolution of variola (smallpox) virus.
Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M
2006-08-11
Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.
Nonsense mutations in the PAX3 gene cause Waardenburg syndrome type I in two Chinese patients.
Yang, Shu-Zhi; Cao, Ju-Yang; Zhang, Rui-Ning; Liu, Li-Xian; Liu, Xin; Zhang, Xin; Kang, Dong-Yang; Li, Mei; Han, Dong-Yi; Yuan, Hui-Jun; Yang, Wei-Yan
2007-01-05
Waardenburg syndrome type I (WS1) is an autosomal dominant disorder characterized by sensorineural hearing loss, pigmentary abnormalities of the eye, hair and skin, and dystopia canthorum. The gene mainly responsible for WS1 is PAX3, which is involved in melanocytic development and survival. Mutations of PAX3 have been reported in familial or sporadic patients with WS1 in several populations of the world, but not in Chinese populations. In order to explore the genetic background of Chinese WS1 patients, a mutation screening of the PAX3 gene was carried out in four WS1 pedigrees. A questionnaire survey and comprehensive clinical examination were conducted in four Chinese pedigrees of WS1. Genomic DNA from each patient and their family members was extracted and exons of PAX3 were amplified by PCR. PCR fragments were ethanol-purified and sequenced in both directions on an ABI_Prism 3100 DNA sequencer with the BigDye Terminator Cycle Sequencing Ready Reaction Kit. The sequences were obtained and aligned to the wild type sequence of PAX3 with the GeneTool program. Two nonsense PAX3 mutations were found in the study population. One patient is heterozygous for a novel nonsense mutation, S209X. The other is heterozygous for R223X, a mutation previously reported in a European population. Both mutations create stop codons leading to truncation of the PAX3 protein. This is the first demonstration of PAX3 mutations in Chinese WS1 patients and one of the few examples of an identical PAX3 mutation occurring in different populations.
Overview Article: Identifying transcriptional cis-regulatory modules in animal genomes
Suryamohan, Kushal; Halfon, Marc S.
2014-01-01
Gene expression is regulated through the activity of transcription factors and chromatin modifying proteins acting on specific DNA sequences, referred to as cis-regulatory elements. These include promoters, located at the transcription initiation sites of genes, and a variety of distal cis-regulatory modules (CRMs), the most common of which are transcriptional enhancers. Because regulated gene expression is fundamental to cell differentiation and acquisition of new cell fates, identifying, characterizing, and understanding the mechanisms of action of CRMs is critical for understanding development. CRM discovery has historically been challenging, as CRMs can be located far from the genes they regulate, have few readily-identifiable sequence characteristics, and for many years were not amenable to high-throughput discovery methods. However, the recent availability of complete genome sequences and the development of next-generation sequencing methods has led to an explosion of both computational and empirical methods for CRM discovery in model and non-model organisms alike. Experimentally, CRMs can be identified through chromatin immunoprecipitation directed against transcription factors or histone post-translational modifications, identification of nucleosome-depleted “open” chromatin regions, or sequencing-based high-throughput functional screening. Computational methods include comparative genomics, clustering of known or predicted transcription factor binding sites, and supervised machine-learning approaches trained on known CRMs. All of these methods have proven effective for CRM discovery, but each has its own considerations and limitations, and each is subject to a greater or lesser number of false-positive identifications. Experimental confirmation of predictions is essential, although shortcomings in current methods suggest that additional means of validation need to be developed. PMID:25704908
Whole-exome/genome sequencing and genomics.
Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne
2013-12-01
As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.
Lopez, Philippe; Halary, Sébastien; Bapteste, Eric
2015-10-26
Microbial genetic diversity is often investigated via the comparison of relatively similar 16S molecules through multiple alignments between reference sequences and novel environmental samples using phylogenetic trees, direct BLAST matches, or phylotype counts. However, are we missing novel lineages in the microbial dark universe by relying on standard phylogenetic and BLAST methods? If so, how can we probe that universe using alternative approaches? We performed a novel type of multi-marker analysis of genetic diversity exploiting the topology of inclusive sequence similarity networks. Our protocol identified 86 ancient gene families, well distributed and rarely transferred across the 3 domains of life, and retrieved their environmental homologs among 10 million predicted ORFs from human gut samples and other metagenomic projects. Numerous highly divergent environmental homologs were observed in gut samples, although the most divergent genes were over-represented in non-gut environments. In our networks, most divergent environmental genes grouped exclusively with uncultured relatives, in maximal cliques. Sequences within these groups were under strong purifying selection and presented a range of genetic variation comparable to that of a prokaryotic domain. Many gene families included environmental homologs that were highly divergent from cultured homologs: in 79 gene families (including 18 ribosomal proteins), Bacteria and Archaea were less divergent than some groups of environmental sequences were to any cultured or viral homologs. Moreover, some groups of environmental homologs branched very deeply in phylogenetic trees of life, when they were not too divergent to be aligned. 
These results underline how limited our understanding of the most diverse elements of the microbial world remains, and encourage a deeper exploration of natural communities and their genetic resources, hinting at the possibility that still unknown yet major divisions of life have yet to be discovered.
Comparative Genomics of Carp Herpesviruses
Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.
2013-01-01
Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803
Yamamoto, O; Takakusa, N; Mishima, Y; Kominami, R; Muramatsu, M
1984-01-01
Sequences required for a faithful and efficient transcription of a cloned mouse ribosomal RNA gene (rDNA) are determined by testing a series of deletion mutants in an in vitro transcription system utilizing two kinds of mouse cellular extract. Deletion of sequences upstream of -40 or downstream of +52 causes only slight reduction in promoter activity as compared with the "wild-type" template. For upstream deletion mutants, the removal of a sequence between -40 and -35 causes a significant decrease in the capacity to direct efficient initiation. This decrease becomes more pronounced when the deletion reaches -32 and the sequence A-T-C-T-T-T, conserved among mouse, rat, and human rDNAs, is lost. Residual template activity is further reduced as more upstream sequence is deleted and finally becomes undetectable when the deletion is extended from -22 down to -17, corresponding to the loss of the conserved sequence T-A-T-T-G. As for downstream deletion mutants, the removal of the sequence downstream of +23 causes some decrease in template activity in vitro, and further deletions up to +11 cause a more serious decrease. These deletions involve other conserved sequences downstream of the transcription start site. However, the removal of the original transcription start site does not abolish the transcription initiation completely, provided that the whole upstream sequence is intact. PMID:6320178
Genetic Control of Plant Root Colonization by the Biocontrol agent, Pseudomonas fluorescens
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cole, Benjamin J.; Fletcher, Meghan; Waters, Jordan
Plant growth promoting rhizobacteria (PGPR) are a critical component of plant root ecosystems. PGPR promote plant growth by solubilizing inaccessible minerals, suppressing pathogenic microorganisms in the soil, and directly stimulating growth through hormone synthesis. Pseudomonas fluorescens is a well-established PGPR isolated from wheat roots that can also colonize the root system of the model plant, Arabidopsis thaliana. We have created barcoded transposon insertion mutant libraries suitable for genome-wide transposon-mediated mutagenesis followed by sequencing (TnSeq). These libraries consist of over 10^5 independent insertions, collectively providing loss-of-function mutants for nearly all genes in the P. fluorescens genome. Each insertion mutant can be unambiguously identified by a randomized 20 nucleotide sequence (barcode) engineered into the transposon sequence. We used these libraries in a gnotobiotic assay to examine the colonization ability of P. fluorescens on A. thaliana roots. Taking advantage of the ability to distinguish individual colonization events using barcode sequences, we assessed the timing and microbial concentration dependence of colonization of the rhizoplane niche. These data provide direct insight into the dynamics of plant root colonization in an in vivo system and define baseline parameters for the systematic identification of the bacterial genes and molecular pathways using TnSeq assays. Having determined parameters that facilitate potential colonization of roots by thousands of independent insertion mutants in a single assay, we are currently establishing a genome-wide functional map of genes required for root colonization in P. fluorescens. 
Importantly, the approach developed and optimized here for P. fluorescens-A. thaliana colonization will be applicable to a wide range of plant-microbe interactions, including biofuel feedstock plants and microbes known or hypothesized to impact on biofuel-relevant traits including biomass productivity and pathogen resistance.
Voz, Marianne L.; Coppieters, Wouter; Manfroid, Isabelle; Baudhuin, Ariane; Von Berg, Virginie; Charlier, Carole; Meyer, Dirk; Driever, Wolfgang; Martial, Joseph A.; Peers, Bernard
2012-01-01
Forward genetics using zebrafish is a powerful tool for studying vertebrate development through large-scale mutagenesis. Nonetheless, the identification of the molecular lesion is still laborious and involves time-consuming genetic mapping. Here, we show that high-throughput sequencing of the whole zebrafish genome can directly locate the interval carrying the causative mutation and at the same time pinpoint the molecular lesion. The feasibility of this approach was validated by sequencing the m1045 mutant line that displays a severe hypoplasia of the exocrine pancreas. We generated 13 Gb of sequence, equivalent to an eightfold genomic coverage, from a pool of 50 mutant embryos obtained from a map-cross between the AB mutant carrier and the WIK polymorphic strain. The chromosomal region carrying the causal mutation was localized based on its unique property to display high levels of homozygosity among sequence reads as it derives exclusively from the initial AB mutated allele. We developed an algorithm identifying such a region by calculating a homozygosity score along all chromosomes. This highlighted an 8-Mb window on chromosome 5 with a score close to 1 in the m1045 mutants. The sequence analysis of all genes within this interval revealed a nonsense mutation in the snapc4 gene. Knockdown experiments confirmed the assertion that snapc4 is the gene whose mutation leads to exocrine pancreas hypoplasia. In conclusion, this study constitutes a proof-of-concept that whole-genome sequencing is a fast and effective alternative to the classical positional cloning strategies in zebrafish. PMID:22496837
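The idea of scanning chromosomes for a window whose homozygosity score approaches 1 can be sketched as below. The abstract does not give the authors' scoring formula, so this is an assumed stand-in: each polymorphic site is scored by the fraction of reads supporting its majority allele, and a window's score is the mean over its sites. The `Site` layout and the names `site_homozygosity` and `window_scores` are hypothetical.

```python
from typing import List, Tuple

# (position in bp, reads supporting allele A, reads supporting allele B)
Site = Tuple[int, int, int]


def site_homozygosity(a_reads: int, b_reads: int) -> float:
    """Fraction of reads carrying the majority allele (0.5 = mixed, 1.0 = fixed)."""
    total = a_reads + b_reads
    if total == 0:
        raise ValueError("site has no reads")
    return max(a_reads, b_reads) / total


def window_scores(sites: List[Site], window: int, step: int) -> List[Tuple[int, float]]:
    """Mean per-site homozygosity in windows of `window` bp, advancing by `step` bp.

    Returns (window start, score) pairs; windows without sites are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not sites:
        return []
    ordered = sorted(sites)
    last_pos = ordered[-1][0]
    results = []
    start = 0
    while start <= last_pos:
        scores = [
            site_homozygosity(a, b)
            for pos, a, b in ordered
            if start <= pos < start + window
        ]
        if scores:
            results.append((start, sum(scores) / len(scores)))
        start += step
    return results


if __name__ == "__main__":
    # Toy data: mixed alleles in the first kilobase, one allele only afterwards.
    toy = [(100, 4, 4), (300, 5, 3), (1100, 8, 0), (1500, 7, 0)]
    for start, score in window_scores(toy, window=1000, step=1000):
        print(start, score)
```

In a pooled sample of mutant embryos from a map-cross, regions unlinked to the mutation carry both parental alleles and score well below 1, while the linked interval derives from a single parental allele and scores near 1, which is the property the abstract exploits.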
NASA Astrophysics Data System (ADS)
Zhao, Xiaoqing; Li, Hong; Bao, Tonglaga; Ying, Zhiqiang
2012-09-01
Much experimental evidence shows that the sequence structure of introns and intron loss/gain can influence gene expression, but the mechanisms proposed so far do not address the functions of post-spliced introns directly. We propose that post-spliced introns act on gene expression by interacting with their mRNA sequences, and that the interaction is characterized by the matched segments between introns and their CDS. In this study, we investigated the interaction characteristics with a length series by improved Smith-Waterman local alignment software for the ribosomal protein genes of C. elegans and D. melanogaster. Our results showed that the RF values of five intron groups are significantly high in the central non-conserved region and very low in the 5'-end and 3'-end splicing regions. Interestingly, the number of optimal matched regions gradually increases with intron length. The distributions of the optimal matched regions differ among the five intron groups. Our study revealed that there are more interaction regions between longer introns and their CDS than between shorter introns and theirs, and it provides a positive pattern for regulating gene expression.
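For readers unfamiliar with the alignment step, the following is a sketch of the textbook Smith-Waterman local alignment with a linear gap penalty. It is not the authors' improved software, and the scoring values (match 2, mismatch -1, gap -2) are illustrative assumptions. It returns the best local score together with the matched segments of the two sequences, which is the kind of intron-CDS matched segment the abstract refers to.

```python
from typing import Tuple


def smith_waterman(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> Tuple[int, str, str]:
    """Return (best local score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to zero.
    i, j = best_pos
    seg_a, seg_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            seg_a.append(a[i - 1])
            seg_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            seg_a.append(a[i - 1])
            seg_b.append("-")
            i -= 1
        else:
            seg_a.append("-")
            seg_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(seg_a)), "".join(reversed(seg_b))


if __name__ == "__main__":
    # Toy strings standing in for an intron and a coding sequence.
    print(smith_waterman("TTTACGTGG", "CCACGTAA"))
```

Only the single best-scoring region is reported here; counting several optimal matched regions per intron, as the study does, would need repeated searches with the already-found regions masked.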
Zheng, Zhaoqing; Keifer, Joyce
2014-01-01
Brain-derived neurotrophic factor (BDNF) is an important regulator of neuronal development and synaptic function. The BDNF gene undergoes significant activity-dependent regulation during learning. Here, we identified the BDNF promoter regions, transcription start sites, and potential regulatory sequences for BDNF exons I–III that may contribute to activity-dependent gene and protein expression in the pond turtle Trachemys scripta elegans (tBDNF). By using transfection of BDNF promoter/luciferase plasmid constructs into human neuroblastoma SHSY5Y cells and mouse embryonic fibroblast NIH3T3 cells, we identified the basal regulatory activity of promoter sequences located upstream of each tBDNF exon, designated as pBDNFI–III. Further, through chromatin immunoprecipitation (ChIP) assays, we detected CREB binding directly to exon I and exon III promoters, while BHLHB2, but not CREB, binds within the exon II promoter. Elucidation of the promoter regions and regulatory protein binding sites in the tBDNF gene is essential for understanding the regulatory mechanisms that control tBDNF gene expression. PMID:24443176
Ambigapathy, Ganesh; Zheng, Zhaoqing; Keifer, Joyce
2014-08-01
Brain-derived neurotrophic factor (BDNF) is an important regulator of neuronal development and synaptic function. The BDNF gene undergoes significant activity-dependent regulation during learning. Here, we identified the BDNF promoter regions, transcription start sites, and potential regulatory sequences for BDNF exons I-III that may contribute to activity-dependent gene and protein expression in the pond turtle Trachemys scripta elegans (tBDNF). By using transfection of BDNF promoter/luciferase plasmid constructs into human neuroblastoma SHSY5Y cells and mouse embryonic fibroblast NIH3T3 cells, we identified the basal regulatory activity of promoter sequences located upstream of each tBDNF exon, designated as pBDNFI-III. Further, through chromatin immunoprecipitation (ChIP) assays, we detected CREB binding directly to exon I and exon III promoters, while BHLHB2, but not CREB, binds within the exon II promoter. Elucidation of the promoter regions and regulatory protein binding sites in the tBDNF gene is essential for understanding the regulatory mechanisms that control tBDNF gene expression.
NASA Astrophysics Data System (ADS)
Mackiewicz, P.; Gierlik, A.; Kowalczuk, M.; Szczepanik, D.; Dudek, M. R.; Cebrat, S.
1999-12-01
We have analysed protein coding and intergenic sequences in the Borrelia burgdorferi (the Lyme disease bacterium) genome using different kinds of DNA walks. Genes occupying the leading strand of DNA have significantly different nucleotide composition from genes occupying the lagging strand. Nucleotide compositional bias of the two DNA strands reflects the amino acid composition of proteins. 96% of genes coding for ribosomal proteins lie on the leading DNA strand, which suggests that the positions of these as well as other genes are non-random. In the B. burgdorferi genome, the asymmetry in intergenic DNA sequences is lower than the asymmetry in the third positions in codons. All these characteristics of the B. burgdorferi genome suggest that both replication-associated mutational pressure and recombination mechanisms have established the specific structure of the genome and now any recombination leading to inversion of a gene with respect to the direction of replication is forbidden. This property of the genome allows us to assume that it is in a steady state, which enables us to fix some parameters for simulations of DNA evolution.
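A DNA walk turns a sequence into a cumulative trajectory so that compositional bias shows up as a steady drift. The abstract does not specify which walks were used, so the sketch below assumes one common form: step +1 for one nucleotide, -1 for another, and 0 otherwise (G versus C by default). The function name `dna_walk` and its parameters are hypothetical.

```python
from itertools import accumulate
from typing import List


def dna_walk(seq: str, up: str = "G", down: str = "C") -> List[int]:
    """Cumulative walk: +1 at each `up` base, -1 at each `down` base, 0 elsewhere."""
    up, down = up.upper(), down.upper()
    steps = (1 if base == up else -1 if base == down else 0 for base in seq.upper())
    return list(accumulate(steps))


if __name__ == "__main__":
    # Toy sequence; a real analysis would walk along a whole strand or
    # along selected codon positions only.
    print(dna_walk("GGCATC"))
```

A walk that climbs along one replichore and descends along the other is the signature of the strand asymmetry the abstract describes; restricting the input to third codon positions or to intergenic sequence would allow the two asymmetries to be compared.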