USDA-ARS's Scientific Manuscript database
Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...
Refactoring the Genetic Code for Increased Evolvability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pines, Gur; Winkler, James D.; Pines, Assaf
ABSTRACT The standard genetic code is robust to mutations during transcription and translation. Point mutations are likely to be synonymous or to preserve the chemical properties of the original amino acid. Saturation mutagenesis experiments suggest that in some cases the best-performing mutant requires replacement of more than a single nucleotide within a codon. These replacements are essentially inaccessible to common error-based laboratory engineering techniques that alter a single nucleotide per mutation event, due to the extreme rarity of adjacent mutations. In this theoretical study, we suggest a radical reordering of the genetic code that maximizes the mutagenic potential of single nucleotide replacements. We explore several possible genetic codes that allow a greater degree of accessibility to the mutational landscape and may result in a hyperevolvable organism that could serve as an ideal platform for directed evolution experiments. We then conclude by evaluating the challenges of constructing such recoded organisms and their potential applications within the field of synthetic biology. IMPORTANCE The conservative nature of the genetic code prevents bioengineers from efficiently accessing the full mutational landscape of a gene via common error-prone methods. Here, we present two computational approaches to generate alternative genetic codes with increased accessibility. These new codes allow mutational transitions to a larger pool of amino acids and with a greater extent of chemical differences, based on a single nucleotide replacement within the codon, thus increasing evolvability both at the single-gene and at the genome levels. Given the widespread use of these techniques for strain and protein improvement, along with more fundamental evolutionary biology questions, the use of recoded organisms that maximize evolvability should significantly improve the efficiency of directed evolution, library generation, and fitness maximization.
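The accessibility argument in this abstract can be illustrated with a minimal sketch. It assumes the standard genetic code (NCBI translation table 1) and simply lists which amino acids a codon can reach through one nucleotide replacement. The function name `single_step_amino_acids` is hypothetical, and this is not the authors' computational approach, only an illustration of the constraint they describe.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_step_amino_acids(codon):
    """Amino acids reachable from `codon` by exactly one nucleotide change.

    Synonymous changes and stop codons ('*') are excluded (an assumption
    made for this illustration).
    """
    original = CODON_TABLE[codon]
    reachable = set()
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid not in (original, "*"):
                reachable.add(amino_acid)
    return reachable


# Tryptophan (TGG) reaches only 5 of the 19 other amino acids in one step.
print(sorted(single_step_amino_acids("TGG")))
```

Under the standard code, any amino acid outside the returned set requires two or three adjacent replacements, which is the situation the abstract describes as essentially inaccessible to error-prone methods.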
Refactoring the Genetic Code for Increased Evolvability
Pines, Gur; Winkler, James D.; Pines, Assaf; ...
2017-11-14
Duellman, Tyler; Warren, Christopher; Yang, Jay
2014-01-01
Microribonucleic acids (miRNAs) work with exquisite specificity and are able to distinguish a target from a non-target based on a single nucleotide mismatch in the core nucleotide domain. We questioned whether miRNA regulation of gene expression could occur in a single nucleotide polymorphism (SNP)-specific manner, manifesting as a post-transcriptional control of expression of genetic polymorphisms. In our recent study of the functional consequences of matrix metalloproteinase (MMP)-9 SNPs, we discovered that expression of a coding exon SNP in the pro-domain of the protein resulted in a profound decrease in the secreted protein. This missense SNP results in the N38S amino acid change and a loss of an N-glycosylation site. A systematic study demonstrated that the loss of secreted protein was due not to the loss of an N-glycosylation site, but rather to SNP-specific targeting by miR-671-3p and miR-657. Bioinformatics analysis identified 41 SNP-specific miRNAs targeting MMP-9 SNPs, mostly in the coding exon, and an extension of the analysis to chromosome 20, where the MMP-9 gene is located, suggested that SNP-specific miRNAs targeting the coding exon are prevalent. This selective post-transcriptional regulation of a target messenger RNA harboring genetic polymorphisms by miRNAs offers an SNP-dependent post-transcriptional regulatory mechanism, allowing for polymorphic-specific differential gene regulation. PMID:24627221
IL-TIF/IL-22: genomic organization and mapping of the human and mouse genes.
Dumoutier, L; Van Roost, E; Ameye, G; Michaux, L; Renauld, J C
2000-12-01
IL-TIF is a new cytokine originally identified as a gene induced by IL-9 in murine T lymphocytes, and showing 22% amino acid identity with IL-10. Here, we report the sequence and organization of the mouse and human IL-TIF genes, which both consist of 6 exons spreading over approximately 6 Kb. The IL-TIF gene is a single copy gene in humans, and is located on chromosome 12q15, at 90 Kb from the IFN gamma gene, and at 27 Kb from the AK155 gene, which codes for another IL-10-related cytokine. In the mouse, the IL-TIF gene is located on chromosome 10, also in the same region as the IFN gamma gene. Although it is a single copy gene in BALB/c and DBA/2 mice, the IL-TIF gene is duplicated in other strains such as C57Bl/6, FVB and 129. The two copies, which show 98% nucleotide identity in the coding region, were named IL-TIF alpha and IL-TIF beta. Beside single nucleotide variations, they differ by a 658 nucleotide deletion in IL-TIF beta, including the first non-coding exon and 603 nucleotides from the promoter. A DNA fragment corresponding to this deletion was sufficient to confer IL-9-regulated expression of a luciferase reporter plasmid, suggesting that the IL-TIF beta gene is either differentially regulated, or not expressed at all.
Sequence of a cDNA encoding pancreatic preprosomatostatin-22.
Magazin, M; Minth, C D; Funckes, C L; Deschenes, R; Tavianini, M A; Dixon, J E
1982-01-01
We report the nucleotide sequence of a precursor to somatostatin that upon proteolytic processing may give rise to a hormone of 22 amino acids. The nucleotide sequence of a cDNA from the channel catfish (Ictalurus punctatus) encodes a precursor to somatostatin that is 105 amino acids (Mr, 11,500). The cDNA coding for somatostatin-22 consists of 36 nucleotides in the 5' untranslated region, 315 nucleotides that code for the precursor to somatostatin-22, 269 nucleotides at the 3' untranslated region, and a variable length of poly(A). The putative preprohormone contains a sequence of hydrophobic amino acids at the amino terminus that has the properties of a "signal" peptide. A connecting sequence of approximately 57 amino acids is followed by a single Arg-Arg sequence, which immediately precedes the hormone. Somatostatin-22 is homologous to somatostatin-14 in 7 of the 14 amino acids, including the Phe-Trp-Lys sequence. Hybridization selection of mRNA, followed by its translation in a wheat germ cell-free system, resulted in the synthesis of a single polypeptide having a molecular weight of approximately 10,000 as estimated on Na-DodSO4/polyacrylamide gels. PMID:6127673
Dasgupta, R; Kaesberg, P
1982-01-01
The nucleotide sequences of the subgenomic coat protein messengers (RNA4's) of two related bromoviruses, brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV), have been determined by direct RNA and cDNA sequencing without cloning. BMV RNA4 is 876 b long, including a 5' noncoding region of nine nucleotides and a 3' noncoding region of 300 nucleotides. CCMV RNA4 is 824 b long, including a 5' noncoding region of 10 nucleotides and a 3' noncoding region of 244 nucleotides. The encoded coat proteins are similar in length (188 amino acids for BMV and 189 amino acids for CCMV) and display about 70% homology in their amino acid sequences. Length difference between the two RNAs is due mostly to a single deletion, in CCMV with respect to BMV, of about 57 b immediately following the coding region. Allowing for this deletion the RNAs are indicate that mutations leading to divergence were constrained in the coding region primarily by the requirement of maintaining a favorable coat protein structure and in the 3' noncoding region primarily by the requirement of maintaining a favorable RNA spatial configuration. PMID:6895941
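The lengths quoted in this abstract are mutually consistent, which a short worked calculation makes explicit. The sketch below assumes that the coding region is everything between the two noncoding regions and that it ends with one stop codon; the function name `coat_protein_length` is hypothetical.

```python
def coat_protein_length(rna_length, utr5_length, utr3_length):
    """Amino acids encoded by an mRNA, given its noncoding region lengths.

    Assumes the coding region spans everything between the two noncoding
    regions and ends with a single stop codon that is not translated.
    """
    coding_nucleotides = rna_length - utr5_length - utr3_length
    if coding_nucleotides % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return coding_nucleotides // 3 - 1  # subtract the stop codon


# Figures taken from the abstract above.
print(coat_protein_length(876, 9, 300))   # BMV RNA4
print(coat_protein_length(824, 10, 244))  # CCMV RNA4
```

With these assumptions the BMV figures give 567 coding nucleotides (189 codons) and the CCMV figures give 570 (190 codons), matching the reported 188 and 189 amino acids.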
On fuzzy semantic similarity measure for DNA coding.
Ahmad, Muneer; Jung, Low Tang; Bhuiyan, Md Al-Amin
2016-02-01
A coding measure scheme numerically translates the DNA sequence to a time domain signal for protein coding regions identification. A number of coding measure schemes based on numerology, geometry, fixed mapping, statistical characteristics and chemical attributes of nucleotides have been proposed in recent decades. Such coding measure schemes lack the biologically meaningful aspects of nucleotide data and hence do not significantly discriminate coding regions from non-coding regions. This paper presents a novel fuzzy semantic similarity measure (FSSM) coding scheme centering on FSSM codons' clustering and genetic code context of nucleotides. Certain natural characteristics of nucleotides i.e. appearance as a unique combination of triplets, preserving special structure and occurrence, and ability to own and share density distributions in codons have been exploited in FSSM. The nucleotides' fuzzy behaviors, semantic similarities and defuzzification based on the center of gravity of nucleotides revealed a strong correlation between nucleotides in codons. The proposed FSSM coding scheme attains a significant enhancement in coding regions identification i.e. 36-133% as compared to other existing coding measure schemes tested over more than 250 benchmarked and randomly taken DNA datasets of different organisms. Copyright © 2015 Elsevier Ltd. All rights reserved.
Gritz, L; Davies, J
1983-11-01
The plasmid-borne gene hph coding for hygromycin B phosphotransferase (HPH) in Escherichia coli has been identified and its nucleotide sequence determined. The hph gene is 1026 nucleotides long, coding for a protein with a predicted Mr of 39 000. The hph gene was placed in a shuttle plasmid vector, downstream from the promoter region of the cyc 1 gene of Saccharomyces cerevisiae, and an hph construction containing a single AUG in the 5' noncoding region allowed direct selection following transformation in yeast and in E. coli. Thus the hph gene can be used in cloning vectors for both pro- and eukaryotes.
Xu, Zhi; Reynolds, Gavin P; Yuan, Yonggui; Shi, Yanyan; Pu, Mengjia; Zhang, Zhijun
2016-11-01
Variation in genes implicated in monoamine neurotransmission may interact with environmental factors to influence antidepressant response. We aimed to determine how a range of single nucleotide polymorphisms in monoaminergic genes influence this response to treatment and how they interact with childhood trauma and recent life stress in a Chinese sample. An initial study of monoaminergic coding region single nucleotide polymorphisms identified significant associations of TPH2 and HTR1B single nucleotide polymorphisms with treatment response that showed interactions with childhood and recent life stress, respectively (Xu et al., 2012). A total of 47 further single nucleotide polymorphisms in 17 candidate monoaminergic genes were genotyped in 281 Chinese Han patients with major depressive disorder. Response to 6 weeks' antidepressant treatment was determined by change in the 17-item Hamilton Depression Rating Scale score, and previous stressful events were evaluated by the Life Events Scale and Childhood Trauma Questionnaire-Short Form. Three TPH2 single nucleotide polymorphisms (rs11178998, rs7963717, and rs2171363) were significantly associated with antidepressant response in this Chinese sample, as was a haplotype in TPH2 (rs2171363 and rs1487278). One of these, rs2171363, showed a significant interaction with childhood adversity in its association with antidepressant response. These findings provide further evidence that variation in TPH2 is associated with antidepressant response and may also interact with childhood trauma to influence outcome of antidepressant treatment. © The Author 2016. Published by Oxford University Press on behalf of CINP.
Reynolds, Gavin P.; Yuan, Yonggui; Shi, Yanyan; Pu, Mengjia; Zhang, Zhijun
2016-01-01
Background: Variation in genes implicated in monoamine neurotransmission may interact with environmental factors to influence antidepressant response. We aimed to determine how a range of single nucleotide polymorphisms in monoaminergic genes influence this response to treatment and how they interact with childhood trauma and recent life stress in a Chinese sample. An initial study of monoaminergic coding region single nucleotide polymorphisms identified significant associations of TPH2 and HTR1B single nucleotide polymorphisms with treatment response that showed interactions with childhood and recent life stress, respectively (Xu et al., 2012). Methods: A total of 47 further single nucleotide polymorphisms in 17 candidate monoaminergic genes were genotyped in 281 Chinese Han patients with major depressive disorder. Response to 6 weeks’ antidepressant treatment was determined by change in the 17-item Hamilton Depression Rating Scale score, and previous stressful events were evaluated by the Life Events Scale and Childhood Trauma Questionnaire-Short Form. Results: Three TPH2 single nucleotide polymorphisms (rs11178998, rs7963717, and rs2171363) were significantly associated with antidepressant response in this Chinese sample, as was a haplotype in TPH2 (rs2171363 and rs1487278). One of these, rs2171363, showed a significant interaction with childhood adversity in its association with antidepressant response. Conclusions: These findings provide further evidence that variation in TPH2 is associated with antidepressant response and may also interact with childhood trauma to influence outcome of antidepressant treatment. PMID:27521242
The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition
NASA Astrophysics Data System (ADS)
Štambuk, Nikola
The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.
Seligmann, Hervé
2013-03-01
Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
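The exchange rules listed in this abstract (for example C↔U, or A↔U+C↔G) are simple symmetric substitutions, and a minimal sketch shows what applying one to a sequence looks like. The function name `exchange_nucleotides` and the example sequence are hypothetical; this only illustrates the transformation, not the author's GenBank search procedure.

```python
def exchange_nucleotides(rna, pairs):
    """Apply systematic symmetric nucleotide exchanges to an RNA string.

    `pairs` is a list of two-letter tuples, e.g. [("C", "U")] for C<->U or
    [("A", "U"), ("C", "G")] for the double exchange A<->U + C<->G.
    """
    mapping = {}
    for first, second in pairs:
        mapping[first] = second
        mapping[second] = first
    return rna.translate(str.maketrans(mapping))


sequence = "AUGCCU"  # made-up example sequence
print(exchange_nucleotides(sequence, [("C", "U")]))
print(exchange_nucleotides(sequence, [("A", "U"), ("C", "G")]))
```

Because each rule is symmetric, applying the same exchange twice returns the original sequence, so a candidate exchange transcript can be mapped back to the regular sequence with the same function.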
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuppuswamy, M.N.; Hoffmann, J.W.; Spitzer, S.G.
1991-02-15
In this report, the authors describe an approach to detect the presence of abnormal alleles in those genetic diseases in which frequency of occurrence of the same mutation is high (e.g., hemophilia B). Initially, from each subject, the DNA fragment containing the putative mutation site is amplified by the polymerase chain reaction. For each fragment two reaction mixtures are then prepared. Each contains the amplified fragment, a primer (18-mer or longer) whose sequence is identical to the coding sequence of the normal gene immediately flanking the 5′ end of the mutation site, and either an α-³²P-labeled nucleotide corresponding to the normal coding sequence at the mutation site or an α-³²P-labeled nucleotide corresponding to the mutant sequence. An essential feature of the present methodology is that the base immediately 3′ to the template-bound primer is one of those altered in the mutant, since in this way an extension of the primer by a single base will give an extended molecule characteristic of either the mutant or the wild type. The method is rapid and should be useful in carrier detection and prenatal diagnosis of every genetic disease with a known sequence variation.
Evidence for a Complex Class of Nonadenylated mRNA in Drosophila
Zimmerman, J. Lynn; Fouts, David L.; Manning, Jerry E.
1980-01-01
The amount, by mass, of poly(A+) mRNA present in the polyribosomes of third-instar larvae of Drosophila melanogaster, and the relative contribution of the poly(A+) mRNA to the sequence complexity of total polysomal RNA, has been determined. Selective removal of poly(A+) mRNA from total polysomal RNA by use of either oligo-dT-cellulose, or poly(U)-sepharose affinity chromatography, revealed that only 0.15% of the mass of the polysomal RNA was present as poly(A+) mRNA. The present study shows that this RNA hybridized at saturation with 3.3% of the single-copy DNA in the Drosophila genome. After correction for asymmetric transcription and reactability of the DNA, 7.4% of the single-copy DNA in the Drosophila genome is represented in larval poly(A+) mRNA. This corresponds to 6.73 × 10⁶ nucleotides of mRNA coding sequences, or approximately 5,384 diverse RNA sequences of average size 1,250 nucleotides. However, total polysomal RNA hybridizes at saturation to 10.9% of the single-copy DNA sequences. After correcting this value for asymmetric transcription and tracer DNA reactability, 24% of the single-copy DNA in Drosophila is represented in total polysomal RNA. This corresponds to 2.18 × 10⁷ nucleotides of RNA coding sequences or 17,440 diverse RNA molecules of size 1,250 nucleotides. This value is 3.2 times greater than that observed for poly(A+) mRNA, and indicates that ≃69% of the polysomal RNA sequence complexity is contributed by nonadenylated RNA. Furthermore, if the number of different structural genes represented in total polysomal RNA is ≃1.7 × 10⁴, then the number of genes expressed in third-instar larvae exceeds the number of chromomeres in Drosophila by about a factor of three. This numerology indicates that the number of chromomeres observed in polytene chromosomes does not reflect the number of structural gene sequences in the Drosophila genome. PMID:6777246
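The sequence-complexity figures in this abstract follow from a few divisions, reproduced here as a small worked calculation. All input values are taken from the abstract; the variable and function names are hypothetical.

```python
AVERAGE_MRNA_LENGTH = 1_250  # nucleotides, as stated in the abstract


def diverse_sequences(complexity_nucleotides, average_length=AVERAGE_MRNA_LENGTH):
    """Number of distinct RNA species implied by a sequence complexity."""
    return complexity_nucleotides / average_length


poly_a_species = diverse_sequences(6.73e6)  # poly(A+) mRNA
total_species = diverse_sequences(2.18e7)   # total polysomal RNA

# How much more complex total polysomal RNA is than poly(A+) mRNA.
fold_difference = total_species / poly_a_species

# Share of the complexity that is not accounted for by poly(A+) mRNA.
nonadenylated_fraction = 1 - poly_a_species / total_species

print(round(poly_a_species), round(total_species))
print(round(fold_difference, 1), round(nonadenylated_fraction * 100))
```

The results match the reported 5,384 and 17,440 sequences, the 3.2-fold difference, and the roughly 69% contribution of nonadenylated RNA.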
NASA Astrophysics Data System (ADS)
Liu, Siwei; Li, Qi; Yu, Hong; Kong, Lingfeng
2017-02-01
Glycogen is important not only for the energy supply of oysters, but also for human consumption. High glycogen content can improve the stress survival of oysters. A key enzyme in glycogenesis is glycogen synthase, which is encoded by the glycogen synthase gene GYS. In this study, the relationship between single nucleotide polymorphisms (SNPs) in coding regions of Crassostrea gigas GYS (Cg-GYS) and individual glycogen content was investigated with 321 individuals from five full-sib families. A single-strand conformation polymorphism (SSCP) procedure was combined with sequencing to confirm individual SNP genotypes of Cg-GYS. Least-square analysis of variance was performed to assess the relationship of variation in glycogen content of C. gigas with single SNP genotype and SNP haplotype. As a consequence, six SNPs in coding regions were found to be significantly associated with glycogen content (P < 0.01), from which we constructed four main haplotypes due to linkage disequilibrium. Furthermore, the most effective haplotype H2 (GAGGAT) had an extremely significant relationship with high glycogen content (P < 0.0001). These findings revealed the potential influence of Cg-GYS polymorphism on the glycogen content and provided molecular biological information for the selective breeding of good quality traits of C. gigas.
Jheng, Cheng-Fong; Chen, Tien-Chih; Lin, Jhong-Yi; Chen, Ting-Chieh; Wu, Wen-Luan; Chang, Ching-Chun
2012-07-01
The chloroplast genome of Phalaenopsis equestris was determined and compared to those of Phalaenopsis aphrodite and Oncidium Gower Ramsey in Orchidaceae. The chloroplast genome of P. equestris is 148,959 bp, and a pair of inverted repeats (25,846 bp) separates the genome into large single-copy (85,967 bp) and small single-copy (11,300 bp) regions. The genome encodes 109 genes, including 4 rRNA, 30 tRNA and 75 protein-coding genes, but loses four ndh genes (ndhA, E, F and H) and seven other ndh genes are pseudogenes. The rate of inter-species variation between the two moth orchids was 0.74% (1107 sites) for single nucleotide substitution and 0.24% for insertions (161 sites; 1388 bp) and deletions (189 sites; 1393 bp). The IR regions have a lower rate of nucleotide substitution (3.5-5.8-fold) and indels (4.3-7.1-fold) than single-copy regions. The intergenic spacers are the most divergent, and based on the length variation of the three intergenic spacers, 11 native Phalaenopsis orchids could be successfully distinguished. The coding genes, IR junction and RNA editing sites are relatively more conserved between the two moth orchids than between those of Phalaenopsis and Oncidium spp. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
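The region sizes reported for the P. equestris chloroplast genome can be cross-checked with a short calculation, since a quadripartite plastome consists of one large single-copy region, one small single-copy region, and two copies of the inverted repeat. The figures come from the abstract; the function name `plastome_length` is hypothetical, and dividing the substitution sites by the genome length is an assumption about how the reported rate was derived.

```python
def plastome_length(large_single_copy, small_single_copy, inverted_repeat):
    """Total length of a quadripartite chloroplast genome in bp."""
    return large_single_copy + small_single_copy + 2 * inverted_repeat


genome_length = plastome_length(85_967, 11_300, 25_846)

# Gene counts reported in the abstract: rRNA + tRNA + protein-coding.
gene_total = 4 + 30 + 75

# Substitution sites as a percentage of the genome length (assumed basis).
substitution_percent = 1_107 / genome_length * 100

print(genome_length, gene_total, round(substitution_percent, 2))
```

The sum reproduces the stated 148,959 bp and 109 genes, and the substitution percentage agrees with the reported 0.74%.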
Correlation approach to identify coding regions in DNA sequences
NASA Technical Reports Server (NTRS)
Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.
1994-01-01
Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
Werling, Donna M; Brand, Harrison; An, Joon-Yong; Stone, Matthew R; Zhu, Lingxue; Glessner, Joseph T; Collins, Ryan L; Dong, Shan; Layer, Ryan M; Markenscoff-Papadimitriou, Eirene; Farrell, Andrew; Schwartz, Grace B; Wang, Harold Z; Currall, Benjamin B; Zhao, Xuefang; Dea, Jeanselle; Duhn, Clif; Erdman, Carolyn A; Gilson, Michael C; Yadav, Rachita; Handsaker, Robert E; Kashin, Seva; Klei, Lambertus; Mandell, Jeffrey D; Nowakowski, Tomasz J; Liu, Yuwen; Pochareddy, Sirisha; Smith, Louw; Walker, Michael F; Waterman, Matthew J; He, Xin; Kriegstein, Arnold R; Rubenstein, John L; Sestan, Nenad; McCarroll, Steven A; Neale, Benjamin M; Coon, Hilary; Willsey, A Jeremy; Buxbaum, Joseph D; Daly, Mark J; State, Matthew W; Quinlan, Aaron R; Marth, Gabor T; Roeder, Kathryn; Devlin, Bernie; Talkowski, Michael E; Sanders, Stephan J
2018-05-01
Genomic association studies of common or rare protein-coding variation have established robust statistical approaches to account for multiple testing. Here we present a comparable framework to evaluate rare and de novo noncoding single-nucleotide variants, insertion/deletions, and all classes of structural variation from whole-genome sequencing (WGS). Integrating genomic annotations at the level of nucleotides, genes, and regulatory regions, we define 51,801 annotation categories. Analyses of 519 autism spectrum disorder families did not identify association with any categories after correction for 4,123 effective tests. Without appropriate correction, biologically plausible associations are observed in both cases and controls. Despite excluding previously identified gene-disrupting mutations, coding regions still exhibited the strongest associations. Thus, in autism, the contribution of de novo noncoding variation is probably modest in comparison to that of de novo coding variants. Robust results from future WGS studies will require large cohorts and comprehensive analytical strategies that consider the substantial multiple-testing burden.
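To give a sense of the multiple-testing burden mentioned in this abstract, the sketch below computes a Bonferroni-style per-test threshold. The family-wise error rate of 0.05 and the use of a plain Bonferroni division are assumptions made for illustration; the abstract only states that results were corrected for 4,123 effective tests and does not specify the method.

```python
def bonferroni_threshold(family_wise_alpha, number_of_tests):
    """Per-test p-value threshold under a Bonferroni correction."""
    if number_of_tests <= 0:
        raise ValueError("number_of_tests must be positive")
    return family_wise_alpha / number_of_tests


ALPHA = 0.05  # assumed family-wise error rate, not stated in the abstract

# Threshold for the 4,123 effective tests reported in the abstract.
effective_threshold = bonferroni_threshold(ALPHA, 4_123)

# For comparison: treating all 51,801 annotation categories as independent.
naive_threshold = bonferroni_threshold(ALPHA, 51_801)

print(f"{effective_threshold:.2e}", f"{naive_threshold:.2e}")
```

Under these assumptions a category would need a p-value near 1.2 × 10⁻⁵ to survive correction, which is roughly twelve times less strict than treating every category as an independent test.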
Identification of two allelic IgG1 C(H) coding regions (Cgamma1) of cat.
Kanai, T H; Ueda, S; Nakamura, T
2000-01-31
Two types of cDNA encoding IgG1 heavy chain (gamma1) were isolated from a single domestic short-hair cat. Sequence analysis indicated a higher level of similarity of these Cgamma1 sequences to human Cgamma1 sequence (76.9 and 77.0%) than to mouse sequence (70.0 and 69.7%) at the nucleotide level. Predicted primary structures of both the feline Cgamma1 genes, designated as Cgamma1a and Cgamma1b, were similar to that of human Cgamma1 gene, for instance, as to the size of constant domains, the presence of six conserved cysteine residues involved in formation of the domain structure, and the location of a conserved N-linked glycosylation site. Sequence comparison between the two alleles showed that 7 out of 10 nucleotide differences were within the C(H)3 domain coding region, all leading to nonsynonymous changes in amino acid residues. Partial sequence analysis of genomic clones showed three nucleotide substitutions between the two Cgamma1 alleles in the intron between the CH2 and C(H)3 domain coding regions. In 12 domestic short-hair cats used in this study, the frequency of Cgamma1a allele (62.5%) was higher than that of the Cgamma1b allele (37.5%).
Van, K; Onoda, S; Kim, M Y; Kim, K D; Lee, S-H
2008-03-01
The Waxy (Wx) gene product controls the formation of a straight chain polymer of amylose in the starch pathway. Dominance/recessiveness of the Wx allele is associated with amylose content, leading to non-waxy/waxy phenotypes. For a total of 113 foxtail millet accessions, agronomic traits and the molecular differences of the Wx gene were surveyed to evaluate genetic diversities. Molecular types were associated with phenotypes determined by four specific primer sets (non-waxy, Type I; low amylose, Type VI; waxy, Type IV or V). Additionally, the insertion of a transposable element in waxy was confirmed by ex1/TSI2R, TSI2F/ex2, ex2int2/TSI7R and TSI7F/ex4r. Seventeen single nucleotide polymorphisms (SNPs) were observed in non-coding regions, while three SNPs in coding regions were non-synonymous. Interestingly, the phenotype of No. 88 was still non-waxy, although a seven-nucleotide (AATTGGT) insertion at 2,993 bp led to a protein 78 amino acids shorter. The rapid decline of r² in the sequenced region (exon 1-intron 1-exon 2) suggested a low level of linkage disequilibrium and limited haplotype structure. Ks values and estimation of evolutionary events indicate early divergence of S. italica among cereal crops. This study suggested the Wx gene was one of the targets in the selection process during domestication.
Chery, Joyce G; Sass, Chodon; Specht, Chelsea D
2017-09-01
We developed a bioinformatic pipeline that leverages a publicly available genome and published transcriptomes to design primers in conserved coding sequences flanking targeted introns of single-copy nuclear loci. Paullinieae (Sapindaceae) is used to demonstrate the pipeline. Transcriptome reads phylogenetically closer to the lineage of interest are aligned to the closest genome. Single-nucleotide polymorphisms are called, generating a "pseudoreference" closer to the lineage of interest. Several filters are applied to meet the criteria of single-copy nuclear loci with introns of a desired size. Primers are designed in conserved coding sequences flanking introns. Using this pipeline, we developed nine single-copy nuclear intron markers for Paullinieae. This pipeline is highly flexible and can be used for any group with available genomic and transcriptomic resources. This pipeline led to the development of nine variable markers for phylogenetic study without generating sequence data de novo.
Ahmad, Muneer; Jung, Low Tan; Bhuiyan, Al-Amin
2017-10-01
Digital signal processing techniques commonly employ fixed-length window filters to process signal contents. DNA signals differ in character from common digital signals because they carry nucleotides as contents. Nucleotides have a genetic-code context and exhibit fuzzy behavior owing to their structure and order in the DNA strand. Applying conventional fixed-length window filters to DNA signals produces spectral leakage and hence signal noise. A biological-context-aware adaptive window filter is therefore required to process DNA signals. This paper introduces a biologically inspired fuzzy adaptive window median filter (FAWMF), which computes the fuzzy membership strength of nucleotides in each slide of the window and filters nucleotides by median filtering with a combination of s-shaped and z-shaped filters. Because coding regions exhibit 3-base periodicity, caused by an unbalanced nucleotide distribution that produces a relatively high bias in nucleotide usage, FAWMF exploits this fundamental characteristic to suppress signal noise. Along with the adaptive response of FAWMF, a strong correlation between median nucleotides and the Π-shaped filter was observed, which enhanced discrimination between coding and non-coding regions compared with conventional fixed-length window filters. The proposed FAWMF improves coding-region identification by 40% to 125% compared with other conventional window filters, tested over more than 250 benchmarked and randomly selected DNA datasets from different organisms. This study shows that conventional fixed-length window filters applied to DNA signals do not achieve significant results because the nucleotides carry genetic-code context. The proposed FAWMF algorithm is adaptive and significantly outperforms them in processing DNA signal contents. Applied to a variety of DNA datasets, the algorithm produced noteworthy discrimination between coding and non-coding regions compared with conventional fixed-length window filters. Copyright © 2017 Elsevier B.V. All rights reserved.
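As a rough illustration of the 3-base periodicity that the abstract above relies on, the following sketch computes the spectral power at frequency 1/3 inside a fixed-length window. It shows the conventional fixed-window baseline only, not the FAWMF algorithm, whose details the abstract does not give. The function names, window size, and toy sequences are assumptions made for illustration.

```python
import cmath

def period3_power(seq, start, window):
    """Spectral power at frequency 1/3, summed over the four base-indicator signals."""
    total = 0.0
    for base in "ACGT":
        acc = 0j
        for n in range(window):
            if seq[start + n] == base:
                # DFT term for one cycle every three bases
                acc += cmath.exp(-2j * cmath.pi * n / 3)
        total += abs(acc) ** 2
    return total

def sliding_period3(seq, window=351, step=3):
    """Period-3 power for each fixed-length window along the sequence (hypothetical defaults)."""
    return [period3_power(seq, s, window)
            for s in range(0, len(seq) - window + 1, step)]

periodic = "ATG" * 30   # toy sequence with exact 3-base periodicity
aperiodic = "AT" * 45   # toy sequence with period 2 only
print(period3_power(periodic, 0, 90) > period3_power(aperiodic, 0, 90))  # True
```

In a real sequence the window length trades resolution against leakage, which is the limitation the abstract attributes to fixed-length filters.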
Liljeqvist, Jan-Åke; Svennerholm, Bo; Bergström, Tomas
1999-01-01
Herpes simplex virus (HSV) codes for several envelope glycoproteins, including glycoprotein G-2 (gG-2) of HSV type 2 (HSV-2), which are dispensable for replication in cell culture. However, clinical isolates which are deficient in such proteins occur rarely. We describe here five clinical HSV-2 isolates which were found to be unreactive to a panel of anti-gG-2 monoclonal antibodies and therefore considered phenotypically gG-2 negative. These isolates were further examined for expression of the secreted amino-terminal and cell-associated carboxy-terminal portions of gG-2 by immunoblotting and radioimmunoprecipitation. The gG-2 gene was completely inactivated in four isolates, with no expression of the two protein products. For one isolate a normally produced secreted portion and a truncated carboxy-terminal portion of gG-2 were detected in virus-infected cell medium. Sequencing of the complete gG-2 gene identified a single insertion or deletion of guanine or cytosine nucleotides in all five strains, resulting in a premature termination codon. The frameshift mutations were localized within runs of five or more guanine or cytosine nucleotides and were dispersed throughout the gene. For the isolate for which a partially inactivated gG-2 gene was detected, the frameshift mutation was localized upstream of but adjacent to the nucleotides coding for the transmembranous region. Thus, this study demonstrates the existence of clinical HSV-2 isolates which do not express an envelope glycoprotein and identifies the underlying molecular mechanism to be a single frameshift mutation. PMID:10559290
Tan, Ene-Choo; Li, Haixia
2006-07-19
Most studies of single nucleotide variations concern substitutions rather than insertions/deletions. In this study, we examined the distribution and characteristics of single nucleotide insertions/deletions (SNindels), using data available from dbSNP for all the human chromosomes. There are almost 300,000 SNindels in the database, of which only 0.8% are validated. They occur at a frequency of 0.887 per 10 kb on average for the whole genome, or approximately 1 for every 11,274 bp. More than half occur in regions with mononucleotide repeats, the longest of which is 47 bases. Overall, the mononucleotide repeats involving C and G are much shorter than those for A and T. About 12% are surrounded by palindromes. There is a general correlation between chromosome size and the total number for each chromosome. Inter-chromosomal variation in density ranges from 0.6 to 21.7 per kilobase. The overall spectrum shows a very high proportion of SNindels of types -/A and -/T, at over 81%. The proportion of -/A and -/T SNindels for each chromosome is correlated with its AT content. Less than half of the SNindels are within or near known genes and even fewer (<0.183%) are in coding regions; more than 1.4% of -/C and -/G types are in coding regions, compared with 0.2% for -/A and -/T types. SNindels of -/A and -/T types make up 80% of those found within untranslated regions but less than 40% of those within coding regions. A separate analysis using the subset of 2324 validated SNindels showed a slightly lower AT bias of 74%; SNindels not within mononucleotide repeats showed an even lower AT bias of 58%. The density of validated SNindels is 0.007/10 kb overall, and 90% are found within or near genes. Among all chromosomes, Y has the lowest numbers and densities for all SNindels, validated SNindels, and SNindels not within repeats.
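The density figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable and function names are illustrative assumptions.

```python
def bp_per_event(density_per_10kb):
    """Convert a density in events per 10 kb into an average spacing in bp."""
    return 10_000 / density_per_10kb

all_density = 0.887          # SNindels per 10 kb (from the abstract)
validated_fraction = 0.008   # 0.8% validated (from the abstract)
validated_count = 2324       # validated SNindels (from the abstract)

spacing = bp_per_event(all_density)
validated_density = all_density * validated_fraction
implied_total = validated_count / validated_fraction

print(round(spacing))               # 11274
print(round(validated_density, 3))  # 0.007
print(round(implied_total))         # 290500
```

The three results agree with the abstract's "1 for every 11,274 bp", "0.007/10 kb", and "almost 300,000" respectively.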
RNAcentral: A comprehensive database of non-coding RNA sequences
Williams, Kelly Porter; Lau, Britney Yan
2016-10-28
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating databases to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within the context of a single species. Furthermore, the website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality.
USDA-ARS's Scientific Manuscript database
Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a mor...
Zhang, Yuqin; Lin, Fanbo; Zhang, Youyu; Li, Haitao; Zeng, Yue; Tang, Hao; Yao, Shouzhuo
2011-01-01
A new method for the detection of point mutations in DNA, based on monobase-coded cadmium telluride (CdTe) nanoprobes and the quartz crystal microbalance (QCM) technique, is reported. A DNA QCM sensor for point mutations (single-base adenine, thymine, cytosine, and guanine mutations in the DNA strand, denoted A, T, C, and G, respectively) was fabricated by immobilizing magnetic beads modified with single-base-mutation DNA onto the electrode surface with an external magnetic field near the electrode. The DNA-modified magnetic beads were obtained from the biotin-avidin affinity reaction of biotinylated DNA and streptavidin-functionalized core/shell Fe3O4/Au magnetic nanoparticles, followed by a DNA hybridization reaction. Single-base coded CdTe nanoprobes (A-CdTe, T-CdTe, C-CdTe, and G-CdTe, respectively) were used as the detection probes. The mutation site in DNA was distinguished by detecting the decrease in the resonance frequency of the piezoelectric quartz crystal when the coded nanoprobe was added to the test system. The proposed detection strategy for point mutations in DNA proved to be sensitive, simple, repeatable, and low-cost; consequently, it has great potential for single nucleotide polymorphism (SNP) detection. 2011 © The Japan Society for Analytical Chemistry
Systematic screening for mutations in the promoter and the coding region of the 5-HT1A gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Erdmann, J.; Shimron-Abarbanell, D.; Cichon, S.
1995-10-09
In the present study we sought to identify genetic variation in the 5-HT1A receptor gene which, through alteration of protein function or level of expression, might contribute to the genetic predisposition to neuropsychiatric diseases. Genomic DNA samples from 159 unrelated subjects (including 45 schizophrenic, 46 bipolar affective, and 43 patients with Tourette's syndrome, as well as 25 healthy controls) were investigated by single-strand conformation analysis. Overlapping PCR (polymerase chain reaction) fragments covered the whole coding sequence as well as the 5′ untranslated region of the 5-HT1A gene. The region upstream of the coding sequence that we investigated contains a functional promoter. We found two rare nucleotide sequence variants. Both mutations are located in the coding region of the gene: a coding mutation (A→G) in nucleotide position 82, which leads to an amino acid exchange (Ile→Val) in position 28 of the receptor protein, and a silent mutation (C→T) in nucleotide position 549. The occurrence of the Ile-28-Val substitution was studied in an extended sample of patients (n = 352) and controls (n = 210) but was found in similar frequencies in all groups. Thus, this mutation is unlikely to play a significant role in the genetic predisposition to the diseases investigated. In conclusion, our study does not provide evidence that the 5-HT1A gene plays either a major or a minor role in the genetic predisposition to schizophrenia, bipolar affective disorder, or Tourette's syndrome. 29 refs., 4 figs., 1 tab.
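The relationship between the nucleotide positions and the amino acid position reported above can be checked with integer arithmetic. The sketch below assumes that nucleotide numbering starts at the first base of the start codon (position 1) and uses the standard genetic code; the abstract does not state which isoleucine codon is actually present in the gene, so all three are tried. The function name is hypothetical.

```python
ILE = {"ATT", "ATC", "ATA"}          # isoleucine codons, standard genetic code
VAL = {"GTT", "GTC", "GTA", "GTG"}   # valine codons, standard genetic code

def codon_position(nt_pos):
    """Map a 1-based coding-sequence position to (codon number, position within codon)."""
    codon = (nt_pos - 1) // 3 + 1
    offset = (nt_pos - 1) % 3 + 1
    return codon, offset

print(codon_position(82))    # (28, 1)
print(codon_position(549))   # (183, 3)

# An A->G change at the first codon position turns every Ile codon into a Val codon.
print(all("G" + c[1:] in VAL for c in ILE))  # True
```

Position 82 is the first base of codon 28, consistent with the Ile-28-Val exchange, and position 549 is the third base of codon 183, where many changes are synonymous.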
Weng, Jianfeng; Li, Bo; Liu, Changlin; Yang, Xiaoyan; Wang, Hongwei; Hao, Zhuanfang; Li, Mingshun; Zhang, Degui; Ci, Xiaoke; Li, Xinhai; Zhang, Shihuang
2013-07-05
Kernel weight, controlled by quantitative trait loci (QTL), is an important component of grain yield in maize. Cytokinins (CKs) participate in determining grain morphology and final grain yield in crops. ZmIPT2, which is expressed mainly in the basal transfer cell layer, endosperm, and embryo during maize kernel development, encodes an isopentenyl transferase (IPT) that is involved in CK biosynthesis. The coding region of ZmIPT2 was sequenced across a panel of 175 maize inbred lines that are currently used in Chinese maize breeding programs. Only 16 single nucleotide polymorphisms (SNPs) and seven haplotypes were detected among these inbred lines. Nucleotide diversity (π) values within the ZmIPT2 window and coding region were 0.347 and 0.0047, respectively, and they were significantly lower than the mean nucleotide diversity value of 0.372 for maize Chromosome 2 (P < 0.01). Association mapping revealed that a single nucleotide change from cytosine (C) to thymine (T) in the ZmIPT2 coding region, which converted a proline residue into a serine residue, was significantly associated with hundred kernel weight (HKW) in three environments (P < 0.05), and explained 4.76% of the total phenotypic variation. In vitro characterization suggests that the dimethylallyl diphosphate (DMAPP) IPT activity of ZmIPT2-T is higher than that of ZmIPT2-C, as the amounts of adenosine triphosphate (ATP), adenosine diphosphate (ADP), and adenosine monophosphate (AMP) consumed by ZmIPT2-T were 5.48-, 2.70-, and 1.87-fold, respectively, greater than those consumed by ZmIPT2-C. The effects of artificial selection on the ZmIPT2 coding region were evaluated using Tajima's D tests across six subgroups of Chinese maize germplasm, with the most frequent favorable allele identified in subgroup PB (Partner B). These results showed that ZmIPT2, which is associated with kernel weight, was subjected to artificial selection during the maize breeding process.
ZmIPT2-T had higher IPT activity than ZmIPT2-C, and this favorable allele for kernel weight could be used in molecular marker-assisted selection for improvement of grain yield components in Chinese maize breeding programs.
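Nucleotide diversity (π), as reported in the abstract above, is conventionally defined as the average number of pairwise differences per site among aligned sequences. The sketch below implements that textbook definition on toy sequences. It is not the authors' software, and the function name and input data are assumptions for illustration.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site across equal-length aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

toy = ["ACGT", "ACGT", "ACGA"]   # hypothetical alignment, three sequences, four sites
print(nucleotide_diversity(toy))
```

For the toy alignment, two of the three pairs differ at one site each, giving 2 differences over 3 pairs and 4 sites.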
Miyakawa, Hiroe; Miyamoto, Toshinobu; Koh, Eitetsu; Tsujimura, Akira; Miyagawa, Yasushi; Saijo, Yasuaki; Namiki, Mikio; Sengoku, Kazuo
2012-01-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, 10 novel genes involved in human spermatogenesis, including human SEPTIN12, were identified by expression microarray analysis of human testicular tissue. Septin12 is a member of the septin family of conserved cytoskeletal GTPases that form heteropolymeric filamentous structures in interphase cells. It is expressed specifically in the testis. Therefore, we hypothesized that mutations or polymorphisms of SEPTIN12 participate in male infertility, especially Sertoli cell-only syndrome (SCOS). To investigate whether SEPTIN12 gene defects are associated with azoospermia caused by SCOS, mutational analysis was performed in 100 Japanese patients by direct sequencing of coding regions. Statistical analysis was performed in patients with SCOS and in 140 healthy control men. No mutations were found in SEPTIN12; however, 8 coding single-nucleotide polymorphisms (SNP1-SNP8) could be detected in the patients with SCOS. The genotype and allele frequencies in SNP3, SNP4, and SNP6 were notably higher in the SCOS group than in the control group (P < .001). These results suggest that SEPTIN12 might play a critical role in human spermatogenesis.
Hewett, Duncan; Samuelsson, Lena; Polding, Joanne; Enlund, Fredrik; Smart, Devi; Cantone, Kathryn; See, Chee Gee; Chadha, Sapna; Inerot, Annica; Enerback, Charlotta; Montgomery, Doug; Christodolou, Chris; Robinson, Phil; Matthews, Paul; Plumpton, Mary; Wahlstrom, Jan; Swanbeck, Gunnar; Martinsson, Tommy; Roses, Allen; Riley, John; Purvis, Ian
2002-03-01
Psoriasis is a chronic inflammatory disease of the skin with both genetic and environmental risk factors. Here we describe the creation of a single-nucleotide polymorphism (SNP) map spanning 900-1200 kb of chromosome 3q21, which had been previously recognized as containing a psoriasis susceptibility locus, PSORS5. We genotyped 644 individuals, from 195 Swedish psoriatic families, for 19 polymorphisms. Linkage disequilibrium (LD) between marker and disease was assessed using the transmission/disequilibrium test (TDT). In the TDT analysis, alleles of three of these SNPs showed significant association with disease (P < 0.05). A 160-kb interval encompassing these three SNPs was sequenced, and a coding sequence consisting of 13 exons was identified. The predicted protein shares 30-40% homology with the family of cation/chloride cotransporters. A five-marker haplotype spanning the 3' half of this gene is associated with psoriasis to a P value of 3.8 × 10⁻⁵. We have called this gene SLC12A8, coding for a member of the solute carrier family 12 proteins. It belongs to a class of genes that were previously unrecognized as playing a role in psoriasis pathogenesis.
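The transmission/disequilibrium test mentioned above compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. A minimal sketch of the standard biallelic statistic, (b - c)² / (b + c) with one degree of freedom, is given below. The counts are hypothetical and are not taken from the study; the function name is an assumption.

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and its 1-df chi-square p-value."""
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: allele transmitted 60 times, not transmitted 40 times
chi2, p_value = tdt(60, 40)
print(chi2)             # 4.0
print(p_value < 0.05)   # True
```

With these made-up counts the statistic is 4.0, which falls just below the 0.05 significance threshold used in the abstract.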
Association analysis identifies 65 new breast cancer risk loci.
Michailidou, Kyriaki; Lindström, Sara; Dennis, Joe; Beesley, Jonathan; Hui, Shirley; Kar, Siddhartha; Lemaçon, Audrey; Soucy, Penny; Glubb, Dylan; Rostamianfar, Asha; Bolla, Manjeet K; Wang, Qin; Tyrer, Jonathan; Dicks, Ed; Lee, Andrew; Wang, Zhaoming; Allen, Jamie; Keeman, Renske; Eilber, Ursula; French, Juliet D; Qing Chen, Xiao; Fachal, Laura; McCue, Karen; McCart Reed, Amy E; Ghoussaini, Maya; Carroll, Jason S; Jiang, Xia; Finucane, Hilary; Adams, Marcia; Adank, Muriel A; Ahsan, Habibul; Aittomäki, Kristiina; Anton-Culver, Hoda; Antonenkova, Natalia N; Arndt, Volker; Aronson, Kristan J; Arun, Banu; Auer, Paul L; Bacot, François; Barrdahl, Myrto; Baynes, Caroline; Beckmann, Matthias W; Behrens, Sabine; Benitez, Javier; Bermisheva, Marina; Bernstein, Leslie; Blomqvist, Carl; Bogdanova, Natalia V; Bojesen, Stig E; Bonanni, Bernardo; Børresen-Dale, Anne-Lise; Brand, Judith S; Brauch, Hiltrud; Brennan, Paul; Brenner, Hermann; Brinton, Louise; Broberg, Per; Brock, Ian W; Broeks, Annegien; Brooks-Wilson, Angela; Brucker, Sara Y; Brüning, Thomas; Burwinkel, Barbara; Butterbach, Katja; Cai, Qiuyin; Cai, Hui; Caldés, Trinidad; Canzian, Federico; Carracedo, Angel; Carter, Brian D; Castelao, Jose E; Chan, Tsun L; David Cheng, Ting-Yuan; Seng Chia, Kee; Choi, Ji-Yeob; Christiansen, Hans; Clarke, Christine L; Collée, Margriet; Conroy, Don M; Cordina-Duverger, Emilie; Cornelissen, Sten; Cox, David G; Cox, Angela; Cross, Simon S; Cunningham, Julie M; Czene, Kamila; Daly, Mary B; Devilee, Peter; Doheny, Kimberly F; Dörk, Thilo; Dos-Santos-Silva, Isabel; Dumont, Martine; Durcan, Lorraine; Dwek, Miriam; Eccles, Diana M; Ekici, Arif B; Eliassen, A Heather; Ellberg, Carolina; Elvira, Mingajeva; Engel, Christoph; Eriksson, Mikael; Fasching, Peter A; Figueroa, Jonine; Flesch-Janys, Dieter; Fletcher, Olivia; Flyger, Henrik; Fritschi, Lin; Gaborieau, Valerie; Gabrielson, Marike; Gago-Dominguez, Manuela; Gao, Yu-Tang; Gapstur, Susan M; García-Sáenz, José A; Gaudet, Mia M; Georgoulias, 
Vassilios; Giles, Graham G; Glendon, Gord; Goldberg, Mark S; Goldgar, David E; González-Neira, Anna; Grenaker Alnæs, Grethe I; Grip, Mervi; Gronwald, Jacek; Grundy, Anne; Guénel, Pascal; Haeberle, Lothar; Hahnen, Eric; Haiman, Christopher A; Håkansson, Niclas; Hamann, Ute; Hamel, Nathalie; Hankinson, Susan; Harrington, Patricia; Hart, Steven N; Hartikainen, Jaana M; Hartman, Mikael; Hein, Alexander; Heyworth, Jane; Hicks, Belynda; Hillemanns, Peter; Ho, Dona N; Hollestelle, Antoinette; Hooning, Maartje J; Hoover, Robert N; Hopper, John L; Hou, Ming-Feng; Hsiung, Chia-Ni; Huang, Guanmengqian; Humphreys, Keith; Ishiguro, Junko; Ito, Hidemi; Iwasaki, Motoki; Iwata, Hiroji; Jakubowska, Anna; Janni, Wolfgang; John, Esther M; Johnson, Nichola; Jones, Kristine; Jones, Michael; Jukkola-Vuorinen, Arja; Kaaks, Rudolf; Kabisch, Maria; Kaczmarek, Katarzyna; Kang, Daehee; Kasuga, Yoshio; Kerin, Michael J; Khan, Sofia; Khusnutdinova, Elza; Kiiski, Johanna I; Kim, Sung-Won; Knight, Julia A; Kosma, Veli-Matti; Kristensen, Vessela N; Krüger, Ute; Kwong, Ava; Lambrechts, Diether; Le Marchand, Loic; Lee, Eunjung; Lee, Min Hyuk; Lee, Jong Won; Neng Lee, Chuen; Lejbkowicz, Flavio; Li, Jingmei; Lilyquist, Jenna; Lindblom, Annika; Lissowska, Jolanta; Lo, Wing-Yee; Loibl, Sibylle; Long, Jirong; Lophatananon, Artitaya; Lubinski, Jan; Luccarini, Craig; Lux, Michael P; Ma, Edmond S K; MacInnis, Robert J; Maishman, Tom; Makalic, Enes; Malone, Kathleen E; Kostovska, Ivana Maleva; Mannermaa, Arto; Manoukian, Siranoush; Manson, JoAnn E; Margolin, Sara; Mariapun, Shivaani; Martinez, Maria Elena; Matsuo, Keitaro; Mavroudis, Dimitrios; McKay, James; McLean, Catriona; Meijers-Heijboer, Hanne; Meindl, Alfons; Menéndez, Primitiva; Menon, Usha; Meyer, Jeffery; Miao, Hui; Miller, Nicola; Taib, Nur Aishah Mohd; Muir, Kenneth; Mulligan, Anna Marie; Mulot, Claire; Neuhausen, Susan L; Nevanlinna, Heli; Neven, Patrick; Nielsen, Sune F; Noh, Dong-Young; Nordestgaard, Børge G; Norman, Aaron; Olopade, 
Olufunmilayo I; Olson, Janet E; Olsson, Håkan; Olswold, Curtis; Orr, Nick; Pankratz, V Shane; Park, Sue K; Park-Simon, Tjoung-Won; Lloyd, Rachel; Perez, Jose I A; Peterlongo, Paolo; Peto, Julian; Phillips, Kelly-Anne; Pinchev, Mila; Plaseska-Karanfilska, Dijana; Prentice, Ross; Presneau, Nadege; Prokofyeva, Darya; Pugh, Elizabeth; Pylkäs, Katri; Rack, Brigitte; Radice, Paolo; Rahman, Nazneen; Rennert, Gadi; Rennert, Hedy S; Rhenius, Valerie; Romero, Atocha; Romm, Jane; Ruddy, Kathryn J; Rüdiger, Thomas; Rudolph, Anja; Ruebner, Matthias; Rutgers, Emiel J T; Saloustros, Emmanouil; Sandler, Dale P; Sangrajrang, Suleeporn; Sawyer, Elinor J; Schmidt, Daniel F; Schmutzler, Rita K; Schneeweiss, Andreas; Schoemaker, Minouk J; Schumacher, Fredrick; Schürmann, Peter; Scott, Rodney J; Scott, Christopher; Seal, Sheila; Seynaeve, Caroline; Shah, Mitul; Sharma, Priyanka; Shen, Chen-Yang; Sheng, Grace; Sherman, Mark E; Shrubsole, Martha J; Shu, Xiao-Ou; Smeets, Ann; Sohn, Christof; Southey, Melissa C; Spinelli, John J; Stegmaier, Christa; Stewart-Brown, Sarah; Stone, Jennifer; Stram, Daniel O; Surowy, Harald; Swerdlow, Anthony; Tamimi, Rulla; Taylor, Jack A; Tengström, Maria; Teo, Soo H; Beth Terry, Mary; Tessier, Daniel C; Thanasitthichai, Somchai; Thöne, Kathrin; Tollenaar, Rob A E M; Tomlinson, Ian; Tong, Ling; Torres, Diana; Truong, Thérèse; Tseng, Chiu-Chen; Tsugane, Shoichiro; Ulmer, Hans-Ulrich; Ursin, Giske; Untch, Michael; Vachon, Celine; van Asperen, Christi J; Van Den Berg, David; van den Ouweland, Ans M W; van der Kolk, Lizet; van der Luijt, Rob B; Vincent, Daniel; Vollenweider, Jason; Waisfisz, Quinten; Wang-Gohrke, Shan; Weinberg, Clarice R; Wendt, Camilla; Whittemore, Alice S; Wildiers, Hans; Willett, Walter; Winqvist, Robert; Wolk, Alicja; Wu, Anna H; Xia, Lucy; Yamaji, Taiki; Yang, Xiaohong R; Har Yip, Cheng; Yoo, Keun-Young; Yu, Jyh-Cherng; Zheng, Wei; Zheng, Ying; Zhu, Bin; Ziogas, Argyrios; Ziv, Elad; Lakhani, Sunil R; Antoniou, Antonis C; Droit, Arnaud; 
Andrulis, Irene L; Amos, Christopher I; Couch, Fergus J; Pharoah, Paul D P; Chang-Claude, Jenny; Hall, Per; Hunter, David J; Milne, Roger L; García-Closas, Montserrat; Schmidt, Marjanka K; Chanock, Stephen J; Dunning, Alison M; Edwards, Stacey L; Bader, Gary D; Chenevix-Trench, Georgia; Simard, Jacques; Kraft, Peter; Easton, Douglas F
2017-11-02
Breast cancer risk is influenced by rare coding variants in susceptibility genes, such as BRCA1, and many common, mostly non-coding variants. However, much of the genetic contribution to breast cancer risk remains unknown. Here we report the results of a genome-wide association study of breast cancer in 122,977 cases and 105,974 controls of European ancestry and 14,068 cases and 13,104 controls of East Asian ancestry. We identified 65 new loci that are associated with overall breast cancer risk at P < 5 × 10⁻⁸. The majority of credible risk single-nucleotide polymorphisms in these loci fall in distal regulatory elements, and by integrating in silico data to predict target genes in breast cells at each locus, we demonstrate a strong overlap between candidate target genes and somatic driver genes in breast tumours. We also find that heritability of breast cancer due to all single-nucleotide polymorphisms in regulatory features was 2- to 5-fold enriched relative to the genome-wide average, with strong enrichment for particular transcription factor binding sites. These results provide further insight into genetic susceptibility to breast cancer and will improve the use of genetic risk scores for individualized screening and prevention.
Hall, L; Laird, J E; Craig, R K
1984-01-01
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
The complete mitochondrial genome of the stomatopod crustacean Squilla mantis
Cook, Charles E
2005-01-01
Background: Animal mitochondrial genomes are physically separate from the much larger nuclear genomes and have proven useful both for phylogenetic studies and for understanding genome evolution. Within the phylum Arthropoda, the subphylum Crustacea includes over 50,000 named species with immense variation in body plans and habitats, yet only 23 complete mitochondrial genomes are available from this subphylum. Results: I describe here the complete mitochondrial genome of the crustacean Squilla mantis (Crustacea: Malacostraca: Stomatopoda). This 15994-nucleotide genome, the first described from a hoplocarid, contains the standard complement of 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and a non-coding AT-rich region that is found in most other metazoans. The gene order is identical to that considered ancestral for hexapods and crustaceans. The 70% AT base composition is within the range described for other arthropods. A single unusual feature of the genome is a 230-nucleotide non-coding region between a serine transfer RNA and the nad1 gene, which has no apparent function. I also compare gene order, nucleotide composition, and codon usage of the S. mantis genome and eight other malacostracan crustaceans. A translocation of the histidine transfer RNA gene is shared by three taxa in the order Decapoda, infraorder Brachyura: Callinectes sapidus, Portunus trituberculatus, and Pseudocarcinus gigas. This translocation may be diagnostic for the Brachyura. For all nine taxa, nucleotide composition is biased towards AT-richness, as expected for arthropods, and is within the range reported for other arthropods. Codon usage is biased, and much of this bias is probably due to the skew in nucleotide composition towards AT-richness. Conclusion: The mitochondrial genome of Squilla mantis contains one unusual feature, a 230 base pair non-coding region that has so far not been described in any other malacostracan. Comparisons with other Malacostraca show that all nine genomes, like most other mitochondrial genomes, share a bias toward AT-richness and a related bias in codon usage. The nine malacostracans included in this analysis are not representative of the diversity of the class Malacostraca, and additional malacostracan sequences would surely reveal other unusual genomic features that could be useful in understanding mitochondrial evolution in this taxon. PMID:16091132
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species.
Chen, Zhiwen; Feng, Kun; Grover, Corrinne E; Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F; Wang, Kunbo; Hua, Jinping
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium.
Grzes, M; Nowacka-Woszuk, J; Szczerbal, I; Czerwinska, J; Gracz, J; Switonski, M
2009-01-01
The gene encoding myostatin (MSTN), due to its crucial function for growth of skeletal muscle mass, is an important candidate for muscularity. In this study we analyzed the nucleotide sequence and FISH localization of this gene in 4 canids, including 3 farm species. The nucleotide sequence of the MSTN coding fragment turned out to be highly conserved, since its identity among the studied species was very high and varied between 99.4 and 99.7%. Only 1, widely spread, silent single nucleotide polymorphism (SNP) was found in exon 1 of the Chinese raccoon dog. The MSTN gene was localized close to the centromere in one-armed chromosomes of the dog (37q11) and bi-armed chromosomes of the red fox (16p11) and arctic fox (10q11), with an exception of the Chinese raccoon dog chromosome (2q14-q21). This chromosome is orthologous to 3 canine chromosomes and thus the MSTN was found more interstitially. Our results are in agreement with the hypothesis that karyotypes of the canids evolved mainly through centric fusion/fission events, while tandem fusions occurred rarely. (c) 2009 S. Karger AG, Basel.
Chloroplast DNA Structural Variation, Phylogeny, and Age of Divergence among Diploid Cotton Species
Li, Pengbo; Liu, Fang; Wang, Yumei; Xu, Qin; Shang, Mingzhao; Zhou, Zhongli; Cai, Xiaoyan; Wang, Xingxing; Wendel, Jonathan F.; Wang, Kunbo
2016-01-01
The cotton genus (Gossypium spp.) contains 8 monophyletic diploid genome groups (A, B, C, D, E, F, G, K) and a single allotetraploid clade (AD). To gain insight into the phylogeny of Gossypium and molecular evolution of the chloroplast genome in this group, we performed a comparative analysis of 19 Gossypium chloroplast genomes, six reported here for the first time. Nucleotide distance in non-coding regions was about three times that of coding regions. As expected, distances were smaller within than among genome groups. Phylogenetic topologies based on nucleotide and indel data support the resolution of the 8 genome groups into 6 clades. Phylogenetic analysis of indel distribution among the 19 genomes demonstrates contrasting evolutionary dynamics in different clades, with a parallel genome downsizing in two genome groups and a biased accumulation of insertions in the clade containing the cultivated cottons leading to large (for Gossypium) chloroplast genomes. Divergence time estimates derived from the cpDNA sequence suggest that the major diploid clades had diverged approximately 10 to 11 million years ago. The complete nucleotide sequences of 6 cpDNA genomes are provided, offering a resource for cytonuclear studies in Gossypium. PMID:27309527
Ancient DNA sequence revealed by error-correcting codes.
Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo
2015-07-10
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.
Ancient DNA sequence revealed by error-correcting codes
Brandão, Marcelo M.; Spoladore, Larissa; Faria, Luzinete C. B.; Rocha, Andréa S. L.; Silva-Filho, Marcio C.; Palazzo, Reginaldo
2015-01-01
A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code. PMID:26159228
Bijective transformation circular codes and nucleotide exchanging RNA transcription.
Michel, Christian J; Seligmann, Hervé
2014-04-01
The C(3) self-complementary circular code X identified in genes of prokaryotes and eukaryotes is a set of 20 trinucleotides enabling reading frame retrieval and maintenance, i.e. a framing code (Arquès and Michel, 1996; Michel, 2012, 2013). Some mitochondrial RNAs correspond to DNA sequences when RNA transcription systematically exchanges between nucleotides (Seligmann, 2013a,b). We study here the 23 bijective transformation codes ΠX of X which may code nucleotide exchanging RNA transcription as suggested by this mitochondrial observation. The 23 bijective transformation codes ΠX are C(3) trinucleotide circular codes, seven of them are also self-complementary. Furthermore, several correlations are observed between the Reading Frame Retrieval (RFR) probability of bijective transformation codes ΠX and the different biological properties of ΠX related to their numbers of RNAs in GenBank's EST database, their polymerization rate, their number of amino acids and the chirality of amino acids they code. Results suggest that the circular code X with the functions of reading frame retrieval and maintenance in regular RNA transcription, may also have, through its bijective transformation codes ΠX, the same functions in nucleotide exchanging RNA transcription. Associations with properties such as amino acid chirality suggest that the RFR of X and its bijective transformations molded the origins of the genetic code's machinery. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
He, Feng; Wen, Haishen; Yu, Dahui; Li, Jifang; Shi, Bao; Chen, Caifang; Zhang, Jiaren; Jin, Guoxiong; Chen, Xiaoyan; Shi, Dan; Yang, Yanping
2010-12-01
Follicle stimulating hormone β (FSHβ) of Japanese flounder (Paralichthys olivaceus) plays a key role in the regulation of gonadal development. This study aimed to investigate molecular genetic characteristics of the FSHβ gene and elucidate the effects of single nucleotide polymorphisms (SNPs) of FSHβ on reproductive traits in Japanese flounder. We used polymerase chain reaction single-strand conformation polymorphism (PCR-SSCP) and sequencing of the FSHβ gene in 60 individuals. We identified only one SNP (T/C), located in the coding region of exon 3 of FSHβ. This SNP, at position 340 bp of the FSHβ gene, did not lead to an amino acid change. Statistical analysis showed that the SNP was significantly associated with testosterone (T) level and gonadosomatic index (GSI) (P < 0.05). Individuals with genotype TC had significantly higher serum T levels and GSI (P < 0.05) than those with genotype CC. Therefore, the FSHβ gene could be a useful molecular marker in selection for prominent reproductive traits in Japanese flounder.
Seligmann, Hervé
2013-05-07
GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
Shannon information entropy in the canonical genetic code.
Nemzer, Louis R
2017-02-21
The Shannon entropy measures the expected information value of messages. As with thermodynamic entropy, the Shannon entropy is only defined within a system that identifies at the outset the collections of possible messages, analogous to microstates, that will be considered indistinguishable macrostates. This fundamental insight is applied here for the first time to amino acid alphabets, which group the twenty common amino acids into families based on chemical and physical similarities. To evaluate these schemas objectively, a novel quantitative method is introduced based on the inherent redundancy in the canonical genetic code. Each alphabet is taken as a separate system that partitions the 64 possible RNA codons, the microstates, into families, the macrostates. By calculating the normalized mutual information, which measures the reduction in Shannon entropy conveyed by single nucleotide messages, groupings that best leverage this aspect of fault tolerance in the code are identified. The relative importance of properties related to protein folding (like hydropathy and size) and function, including side-chain acidity, can also be estimated. This approach allows the quantification of the average information value of nucleotide positions, which can shed light on the coevolution of the canonical genetic code with the tRNA-protein translation mechanism. Copyright © 2016 Elsevier Ltd. All rights reserved.
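The entropy reduction described in this abstract can be illustrated with a minimal sketch. It assumes all 64 codons are equally likely and uses the trivial alphabet in which every amino acid (and the stop signal) is its own family; both choices, and the names `CODON_TABLE`, `entropy` and `position_information`, are illustrative assumptions rather than the author's method. The sketch reports the raw mutual information in bits and does not attempt to reproduce the paper's normalization.

```python
from collections import Counter
from itertools import product
from math import log2

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered TTT, TTC, TTA, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def entropy(counts):
    """Shannon entropy (bits) of a distribution given as a Counter of occurrences."""
    total = sum(counts.values())
    return -sum(n / total * log2(n / total) for n in counts.values())


def position_information(position, family=lambda aa: aa):
    """Mutual information (bits) between the nucleotide at one codon position
    (0, 1 or 2) and the amino acid family, assuming uniformly used codons."""
    h_prior = entropy(Counter(family(aa) for aa in CODON_TABLE.values()))
    h_conditional = 0.0
    for base in BASES:
        # Each base fixes 16 of the 64 codons, hence the weight 1/4.
        subset = Counter(
            family(aa) for codon, aa in CODON_TABLE.items() if codon[position] == base
        )
        h_conditional += 0.25 * entropy(subset)
    return h_prior - h_conditional


if __name__ == "__main__":
    for pos in range(3):
        print(f"codon position {pos + 1}: {position_information(pos):.3f} bits")
```

Passing a different `family` function (for example one that maps amino acids to hydropathy classes) would evaluate a coarser alphabet in the same way.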
Screening of reproduction-related single-nucleotide variations from MeDIP-seq data in sheep.
Cao, Jiaxue; Wei, Caihong; Zhang, Shuzhen; Capellini, Terence D; Zhang, Li; Zhao, Fuping; Li, Li; Zhong, Tao; Wang, Linjie; Du, Lixin; Zhang, Hongping
2016-11-01
Extensive variation in reproduction has arisen in Chinese Mongolian sheep during recent domestication. Hu and Small-tailed Han sheep, for example, have become non-seasonal breeders and exhibit higher fecundity than Tan and Ujumqin breeds. We therefore scanned reproduction-related single-nucleotide variations from methylated DNA-immunoprecipitation sequencing data generated from each of those four breeds to uncover potential mechanisms underlying this breed variation. We generated a high-quality map of single nucleotide variations (SNVs) in DNA methylation enriched regions, and found that the majority of variants are located within non-coding regions. We identified 359 SNVs within the Sheep Quantitative Trait Locus (QTL) database. Nineteen of these SNVs are associated with the Aseasonal Reproduction QTL, and 10 of the 19 reside close to genes with known reproduction functions. We also identified the well-known FecB mutation in high-fecundity sheep (Hu and Small-tailed Han sheep). When we applied this FecB finding to our breeding system, we improved lambing rate by 175%. In summary, this study provided strong candidate SNVs associated with sheep fecundity that can serve as targets for functional testing and to enhance selective breeding strategies. Mol. Reprod. Dev. 83: 958-967, 2016 © 2016 Wiley Periodicals, Inc.
Genetic Variation Linked to Lung Cancer Survival in White Smokers | Center for Cancer Research
CCR investigators have discovered evidence that links lung cancer survival with genetic variations (called single nucleotide polymorphisms) in the MBL2 gene, a key player in innate immunity. The variations in the gene, which codes for a protein called the mannose-binding lectin, occur in its promoter region, where the RNA polymerase molecule binds to start transcription, and
Bendezu Eguis, Jorge; Montesinos, Ricardo; Fernández-Díaz, Manolo
2018-01-01
ABSTRACT We report here the first genome sequence of infectious laryngotracheitis virus isolated in Peru from tracheal tissues of layer chickens. The genome showed 99.98% identity to the J2 strain genome sequence. Single nucleotide polymorphisms were detected in five gene-coding sequences related to vaccine development, virus attachment, and viral immune evasion. PMID:29519822
Wong, Wing Chung; Kim, Dewey; Carter, Hannah; Diekhans, Mark; Ryan, Michael C; Karchin, Rachel
2011-08-01
Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. MySQL database, source code and binaries are freely available for academic/government use at http://wiki.chasmsoftware.org. Source is in Python and C++. Requires a 32- or 64-bit Linux system (tested on Fedora Core 8, 10, 11 and Ubuntu 10), 2.5 ≤ Python < 3.0, MySQL server > 5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), and 2 GB of RAM.
Rosinski-Chupin, Isabelle; Sauvage, Elisabeth; Sismeiro, Odile; Villain, Adrien; Da Cunha, Violette; Caliot, Marie-Elise; Dillies, Marie-Agnès; Trieu-Cuot, Patrick; Bouloc, Philippe; Lartigue, Marie-Frédérique; Glaser, Philippe
2015-05-30
Streptococcus agalactiae, or Group B Streptococcus, is a leading cause of neonatal infections and an increasing cause of infections in adults with underlying diseases. In an effort to reconstruct the transcriptional networks involved in S. agalactiae physiology and pathogenesis, we performed an extensive and robust characterization of its transcriptome through a combination of differential RNA-sequencing in eight different growth conditions or genetic backgrounds and strand-specific RNA-sequencing. Our study identified 1,210 transcription start sites (TSSs) and 655 transcript ends as well as 39 riboswitches and cis-regulatory regions, 39 cis-antisense non-coding RNAs and 47 small RNAs potentially acting in trans. Among these putative regulatory RNAs, ten were differentially expressed in response to an acid stress and two riboswitches sensed directly or indirectly the pH modification. Strikingly, 15% of the TSSs identified were associated with the incorporation of pseudo-templated nucleotides, showing that reiterative transcription is a pervasive process in S. agalactiae. In particular, 40% of the TSSs upstream genes involved in nucleotide metabolism show reiterative transcription potentially regulating gene expression, as exemplified for pyrG and thyA encoding the CTP synthase and the thymidylate synthase respectively. This comprehensive map of the transcriptome at the single nucleotide resolution led to the discovery of new regulatory mechanisms in S. agalactiae. It also provides the basis for in depth analyses of transcriptional networks in S. agalactiae and of the regulatory role of reiterative transcription following variations of intra-cellular nucleotide pools.
n-Nucleotide circular codes in graph theory.
Fimmel, Elena; Michel, Christian J; Strüngmann, Lutz
2016-03-13
The circular code theory proposes that genes are constituted of two trinucleotide codes: the classical genetic code with 61 trinucleotides for coding the 20 amino acids (except the three stop codons {TAA,TAG,TGA}) and a circular code based on 20 trinucleotides for retrieving, maintaining and synchronizing the reading frame. It relies on two main results: the identification of a maximal C(3) self-complementary trinucleotide circular code X in genes of bacteria, eukaryotes, plasmids and viruses (Michel 2015 J. Theor. Biol. 380, 156-177. (doi:10.1016/j.jtbi.2015.04.009); Arquès & Michel 1996 J. Theor. Biol. 182, 45-58. (doi:10.1006/jtbi.1996.0142)) and the finding of X circular code motifs in tRNAs and rRNAs, in particular in the ribosome decoding centre (Michel 2012 Comput. Biol. Chem. 37, 24-37. (doi:10.1016/j.compbiolchem.2011.10.002); El Soufi & Michel 2014 Comput. Biol. Chem. 52, 9-17. (doi:10.1016/j.compbiolchem.2014.08.001)). The universally conserved nucleotides A1492 and A1493 and the conserved nucleotide G530 are included in X circular code motifs. Recently, dinucleotide circular codes were also investigated (Michel & Pirillo 2013 ISRN Biomath. 2013, 538631. (doi:10.1155/2013/538631); Fimmel et al. 2015 J. Theor. Biol. 386, 159-165. (doi:10.1016/j.jtbi.2015.08.034)). As the genetic motifs of different lengths are ubiquitous in genes and genomes, we introduce a new approach based on graph theory to study in full generality n-nucleotide circular codes X, i.e. of length 2 (dinucleotide), 3 (trinucleotide), 4 (tetranucleotide), etc. Indeed, we prove that an n-nucleotide code X is circular if and only if the corresponding graph [Formula: see text] is acyclic. Moreover, the maximal length of a path in [Formula: see text] corresponds to the window of nucleotides in a sequence for detecting the correct reading frame. Finally, the graph theory of tournaments is applied to the study of dinucleotide circular codes. It is fully equivalent to the combinatorics theory (Michel & Pirillo 2013 ISRN Biomath. 2013, 538631. (doi:10.1155/2013/538631)) and the group theory (Fimmel et al. 2015 J. Theor. Biol. 386, 159-165. (doi:10.1016/j.jtbi.2015.08.034)) of dinucleotide circular codes, while its mathematical approach is simpler. © 2016 The Author(s).
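The acyclicity criterion stated in this abstract can be sketched in a few lines. The abstract elides the graph definition, so the construction below is an assumption: each word of the code is split at every internal position, and a directed edge is drawn from the prefix to the suffix. The function names `code_graph` and `is_circular` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def code_graph(code):
    """Directed graph of a code: for each word and each split point, add an
    edge prefix -> suffix (assumed construction, see the lead-in)."""
    edges = defaultdict(set)
    for word in code:
        for i in range(1, len(word)):
            edges[word[:i]].add(word[i:])
    return edges


def is_circular(code):
    """Return True when the graph of the code has no directed cycle."""
    edges = code_graph(code)
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    state = defaultdict(int)

    def visit(vertex):
        state[vertex] = GREY
        for nxt in edges.get(vertex, ()):
            if state[nxt] == GREY:  # back edge: a cycle exists
                return False
            if state[nxt] == WHITE and not visit(nxt):
                return False
        state[vertex] = BLACK
        return True

    return all(state[v] != WHITE or visit(v) for v in list(edges))


if __name__ == "__main__":
    print(is_circular({"AAC", "GTT"}))  # no cycle among the prefix/suffix vertices
    print(is_circular({"ACG", "CGA"}))  # A -> CG -> A is a cycle
```

The second example contains a trinucleotide together with one of its circular permutations, which is the simplest way for such a code to lose the frame-retrieval property.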
Iwanowicz, L; Densmore, C; Hahn, C; McAllister, P; Odenkirk, J
2013-09-01
The Northern Snakehead Channa argus is an introduced species that now inhabits the Chesapeake Bay. During a preliminary survey for introduced pathogens possibly harbored by these fish in Virginia waters, a filterable agent was isolated from five specimens that produced cytopathic effects in BF-2 cells. Based on PCR amplification and partial sequencing of the major capsid protein (MCP), DNA polymerase (DNApol), and DNA methyltransferase (Mtase) genes, the isolates were identified as Largemouth Bass virus (LMBV). Nucleotide sequences of the MCP (492 bp) and DNApol (419 bp) genes were 100% identical to those of LMBV. The nucleotide sequence of the Mtase (206 bp) gene was 99.5% identical to that of LMBV, and the single nucleotide substitution did not lead to a predicted amino acid coding change. This is the first report of LMBV from the Northern Snakehead, and provides evidence that noncentrarchid fishes may be susceptible to this virus.
Association of α-, β-, and γ-Synuclein With Diffuse Lewy Body Disease
Nishioka, Kenya; Wider, Christian; Vilariño-Güell, Carles; Soto-Ortolaza, Alexandra I.; Lincoln, Sarah J.; Kachergus, Jennifer M.; Jasinska-Myga, Barbara; Ross, Owen A.; Rajput, Alex; Robinson, Christopher A.; Ferman, Tanis J.; Wszolek, Zbigniew K.; Dickson, Dennis W.; Farrer, Matthew J.
2016-01-01
Objective To determine the association of the genes that encode α-, β-, and γ-synuclein (SNCA, SNCB, and SNCG, respectively) with diffuse Lewy body disease (DLBD). Design Case-control study. Subjects A total of 172 patients with DLBD consistent with a clinical diagnosis of Parkinson disease dementia/dementia with Lewy bodies and 350 clinically and 97 pathologically normal controls. Interventions Sequencing of SNCA, SNCB, and SNCG and genotyping of single-nucleotide polymorphisms performed on an Applied Biosystems capillary sequencer and a Sequenom MassArray pLEX platform, respectively. Associations were determined using χ2 or Fisher exact tests. Results Initial sequencing studies of the coding regions of each gene in 89 patients with DLBD did not detect any pathogenic substitutions. Nevertheless, genotyping of known polymorphic variability in sequence-conserved regions detected several single-nucleotide polymorphisms in the SNCA and SNCG genes that were significantly associated with disease (P=.05 to <.001). Significant association was also observed for 3 single-nucleotide polymorphisms located in SNCB when comparing DLBD cases and pathologically confirmed normal controls (P=.03-.01); however, this association was not significant for the clinical controls alone or the combined clinical and pathological controls (P>.05). After correction for multiple testing, only 1 single-nucleotide polymorphism in SNCG (rs3750823) remained significant in all of the analyses (P=.05-.009). Conclusion These findings suggest that variants in all 3 members of the synuclein gene family, particularly SNCA and SNCG, affect the risk of developing DLBD and warrant further investigation in larger, pathologically defined data sets as well as clinically diagnosed Parkinson disease/dementia with Lewy bodies case-control series. PMID:20697047
Nowacka-Woszuk, Joanna; Switonski, Marek
2009-01-01
The sex determination process is under the control of several genes of which two (SRY and SOX9), encoding transcription factors, play a crucial role. It is well-known that mutations at these genes may cause the development of an intersexual phenotype. The aim of this study was to conduct a comparative analysis of the coding sequence and 5'-flanking regions of both genes in four species of the family Canidae (the dog, red fox, arctic fox and Chinese raccoon dog). Similarity of the coding sequence of the SOX9 gene among the studied species was higher (99.7-99.9%) than in the case of the SRY gene (96.7-97.3%). Only single nucleotide changes were found in the compared coding sequences, whereas in the 5'-flanking region of both genes nucleotide substitutions, as well as insertions and deletions were observed. None of the changes detected in the 5'-flanking region occurred within the potential consensus sequences for transcription factors. No polymorphism was found for either of these genes in any of the analyzed species.
USDA-ARS's Scientific Manuscript database
The actions of prolactin (PRL) are mediated by both long (LF) and short isoforms (SF) of the PRL receptor (PRLR). Here, we report on a genetic and functional analysis of the porcine PRLR (pPRLR) SF. Three single nucleotide polymorphisms (SNPs) within exon 11 of the pPRLR-SF give rise to four amino a...
Masking as an effective quality control method for next-generation sequencing data analysis.
Yun, Sajung; Yun, Sijung
2014-12-13
Next generation sequencing produces base calls with low quality scores that can affect the accuracy of identifying simple nucleotide variation calls, including single nucleotide polymorphisms and small insertions and deletions. Here we compare the effectiveness of two data preprocessing methods, masking and trimming, and the accuracy of simple nucleotide variation calls on whole-genome sequence data from Caenorhabditis elegans. Masking substitutes low quality base calls with 'N's (undetermined bases), whereas trimming removes low quality bases, which results in shorter read lengths. We demonstrate that masking is more effective than trimming in reducing the false-positive rate in single nucleotide polymorphism (SNP) calling. However, neither preprocessing method changed the false-negative rate in SNP calling with statistical significance compared to the data analysis without preprocessing. False-positive and false-negative rates for small insertions and deletions did not differ between masking and trimming. We recommend masking over trimming as a more effective preprocessing method for next generation sequencing data analysis, since masking reduces the false-positive rate in SNP calling without sacrificing the false-negative rate, although trimming is currently more commonly used in the field. The perl script for masking is available at http://code.google.com/p/subn/. The sequencing data used in the study were deposited in the Sequence Read Archive (SRX450968 and SRX451773).
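The difference between the two preprocessing methods can be made concrete with a small sketch. It is not the authors' perl script: the Phred cutoff of 20, the choice to trim only from the 3' end, and the function names `mask_read` and `trim_read` are all assumptions made for illustration.

```python
def mask_read(seq, quals, min_q=20):
    """Replace every base whose quality is below min_q with 'N'.
    The read keeps its original length."""
    return "".join(base if q >= min_q else "N" for base, q in zip(seq, quals))


def trim_read(seq, quals, min_q=20):
    """Remove low-quality bases from the 3' end until a base with quality
    >= min_q is reached (one common trimming variant, assumed here)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end]


if __name__ == "__main__":
    read = "ACGTACGT"
    phred = [30, 30, 10, 30, 30, 30, 5, 8]
    print(mask_read(read, phred))  # low-quality calls become N, length unchanged
    print(trim_read(read, phred))  # the low-quality 3' tail is dropped
```

In this toy read, masking hides the internal low-quality call at the third position, while 3' trimming leaves it in place and shortens the read instead.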
Kimura, Hiroki; Tsuboi, Daisuke; Wang, Chenyao; Kushima, Itaru; Koide, Takayoshi; Ikeda, Masashi; Iwayama, Yoshimi; Toyota, Tomoko; Yamamoto, Noriko; Kunimoto, Shohko; Nakamura, Yukako; Yoshimi, Akira; Banno, Masahiro; Xing, Jingrui; Takasaki, Yuto; Yoshida, Mami; Aleksic, Branko; Uno, Yota; Okada, Takashi; Iidaka, Tetsuya; Inada, Toshiya; Suzuki, Michio; Ujike, Hiroshi; Kunugi, Hiroshi; Kato, Tadafumi; Yoshikawa, Takeo; Iwata, Nakao; Kaibuchi, Kozo; Ozaki, Norio
2015-01-01
Background: Nuclear distribution E homolog 1 (NDE1), located within chromosome 16p13.11, plays an essential role in microtubule organization, mitosis, and neuronal migration and has been suggested by several studies of rare copy number variants to be a promising schizophrenia (SCZ) candidate gene. Recently, increasing attention has been paid to rare single-nucleotide variants (SNVs) discovered by deep sequencing of candidate genes, because such SNVs may have large effect sizes and their functional analysis may clarify etiopathology. Methods and Results: We conducted mutation screening of NDE1 coding exons using 433 SCZ and 145 pervasive developmental disorders samples in order to identify rare single nucleotide variants with a minor allele frequency ≤5%. We then performed genetic association analysis using a large number of unrelated individuals (3554 SCZ, 1041 bipolar disorder [BD], and 4746 controls). Among the discovered novel rare variants, we detected significant associations between SCZ and S214F (P = .039), and between BD and R234C (P = .032). Furthermore, functional assays showed that S214F affected axonal outgrowth and the interaction between NDE1 and YWHAE (14-3-3 epsilon; a neurodevelopmental regulator). Conclusions: This study strengthens the evidence for association between rare variants within NDE1 and SCZ, and may shed light on the molecular mechanisms underlying this severe psychiatric disorder. PMID:25332407
Khrustalev, Vladislav Victorovich
2009-01-01
We showed that GC-content of nucleotide sequences coding for linear B-cell epitopes of herpes simplex virus type 1 (HSV1) glycoprotein B (gB) is higher than GC-content of sequences coding for epitope-free regions of this glycoprotein (G + C = 73 and 64%, respectively). Linear B-cell epitopes have been predicted in HSV1 gB by BepiPred algorithm ( www.cbs.dtu.dk/services/BepiPred ). Proline is an acrophilic amino acid residue (it is usually situated on the surface of protein globules, and so included in linear B-cell epitopes). Indeed, the level of proline is much higher in predicted epitopes of gB than in epitope-free regions (17.8% versus 1.8%). This amino acid is coded by GC-rich codons (CCX) that can be produced due to nucleotide substitutions caused by mutational GC-pressure. GC-pressure will also lead to disappearance of acrophobic phenylalanine, isoleucine, methionine and tyrosine coded by GC-poor codons. Results of our "in-silico directed mutagenesis" showed that single nonsynonymous substitutions in AT to GC direction in two long epitope-free regions of gB will cause formation of new linear epitopes or elongation of previously existing epitopes flanking these regions in 25% of 539 possible cases. The calculations of GC-content and amino acid content have been performed by CodonChanges algorithm ( www.barkovsky.hotmail.ru ).
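As a rough illustration of the two quantities this abstract relies on, the sketch below computes the GC-content of a sequence and enumerates single substitutions in the AT to GC direction. It is not the CodonChanges algorithm; the function names `gc_content` and `at_to_gc_mutants` are hypothetical, and no filtering for nonsynonymous changes or epitope prediction is attempted.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def at_to_gc_mutants(seq):
    """All sequences obtained by replacing exactly one A or T with G or C."""
    seq = seq.upper()
    mutants = []
    for i, base in enumerate(seq):
        if base in "AT":
            for new in "GC":
                mutants.append(seq[:i] + new + seq[i + 1:])
    return mutants


if __name__ == "__main__":
    print(gc_content("CCA"))        # a proline codon (CCX) is GC-rich
    print(gc_content("ATT"))        # an isoleucine codon is GC-poor
    print(at_to_gc_mutants("ATG"))  # every single AT -> GC substitution
```

Each mutant returned by `at_to_gc_mutants` has a higher GC-content than its parent, which is the direction of change that mutational GC-pressure favors.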
DOE Office of Scientific and Technical Information (OSTI.GOV)
Machlin, S.M.; Hanson, R.S.
The nucleotide sequence of a cloned 2.5-kilobase-pair SmaI fragment containing the methanol dehydrogenase (MDH) structural gene from Methylobacterium organophilum XX was determined. A single open reading frame with a coding capacity of 626 amino acids (molecular weight, 66,000) was identified on one strand, and N-terminal sequencing of purified MDH revealed that 27 of these residues constituted a putative signal peptide. Primer extension mapping of in vivo transcripts indicated that the start of mRNA synthesis was 160 to 170 base pairs upstream of the ATG codon. Northern (RNA) blot analysis further demonstrated that the transcript was 2.1 kilobase pairs in length and therefore appeared to encode only MDH.
Genome sequences of a mouse-avirulent and a mouse-virulent strain of Ross River virus.
Faragher, S G; Meek, A D; Rice, C M; Dalgarno, L
1988-04-01
The nucleotide sequence of the genomic RNA of a mouse-avirulent strain of Ross River virus, RRV NB5092 (isolated in 1969), has been determined and the corresponding sequence for the prototype mouse-virulent strain, RRV T48 (isolated in 1959), has been completed. The RRV NB5092 genome is approximately 11,674 nucleotides in length, compared with 11,853 nucleotides for RRV T48. RRV NB5092 and RRV T48 have the same genome organization. For both viruses an untranslated region of 80 nucleotides at the 5' end of the genome is followed by a 7440-nucleotide open reading frame which is interrupted after 5586 nucleotides by a single opal termination codon. By homology with other alphaviruses, the 5586-nucleotide open reading frame encodes the nonstructural proteins nsP1, nsP2, and nsP3; a fourth nonstructural protein, nsP4, is produced by read-through of the opal codon. The RRV nonstructural proteins show strong homology with the corresponding proteins of Sindbis virus and Semliki Forest virus in terms of size, net charge, and hydropathy characteristics. However, homology is not uniform between or within the proteins; nsP1, nsP2, and nsP4 contain extended domains which are highly conserved between alphaviruses, while the C-terminal region of nsP3 shows little conservation in sequence or length between alphaviruses. An untranslated "junction" region of 44 nucleotides (for RRV NB5092) or 47 nucleotides (for RRV T48) separates the nonstructural and structural protein coding regions. The structural proteins (capsid-E3-E2-6K-E1) are translated from an open reading frame of 3762 nucleotides which is followed by a 3'-untranslated region of approximately 348 nucleotides (for RRV NB5092) or 524 nucleotides (for RRV T48). Excluding deletions and insertions, the genomes of RRV NB5092 and RRV T48 differ at 284 nucleotides, representing a sequence divergence of 2.38%. 
Sequence deletions or insertions were found only in the noncoding regions and include a 173-nucleotide deletion in the 3'-untranslated region of RRV NB5092, compared with RRV T48. In the coding regions, most of the nucleotide differences are silent; there are 36 amino acid differences in the nonstructural proteins and 12 in the structural proteins. The distribution of amino acid differences between the two RRV strains correlates with the location of domains which are poorly conserved in sequence between alphaviruses. The possible role of amino acid differences in envelope glycoproteins E1 and E2 in determining the different antigenic and biological properties of RRV NB5092 and RRV T48 is discussed.
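The genome organization reported above can be checked by adding up the stated segment lengths. The sketch below uses only the numbers given in the abstract; the dictionary layout and the name `genome_length` are illustrative assumptions.

```python
# Segment lengths in nucleotides, as stated in the abstract.
SHARED = {"5' untranslated region": 80, "nonstructural ORF": 7440, "structural ORF": 3762}
STRAIN_SPECIFIC = {
    "RRV NB5092": {"junction": 44, "3' untranslated region": 348},
    "RRV T48": {"junction": 47, "3' untranslated region": 524},
}


def genome_length(strain):
    """Sum the shared and strain-specific segment lengths for one strain."""
    return sum(SHARED.values()) + sum(STRAIN_SPECIFIC[strain].values())


if __name__ == "__main__":
    for strain in STRAIN_SPECIFIC:
        print(strain, genome_length(strain))
```

The sums reproduce the reported genome lengths of approximately 11,674 and 11,853 nucleotides, and the difference of 179 nucleotides is accounted for by the junction region (3 nucleotides) and the 3'-untranslated region (176 nucleotides).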
Mistranslation: from adaptations to applications.
Hoffman, Kyle S; O'Donoghue, Patrick; Brandl, Christopher J
2017-11-01
The conservation of the genetic code indicates that there was a single origin, but like all genetic material, the cell's interpretation of the code is subject to evolutionary pressure. Single nucleotide variations in tRNA sequences can modulate codon assignments by altering codon-anticodon pairing or tRNA charging. Either can increase translation errors and even change the code. The frozen accident hypothesis argued that changes to the code would destabilize the proteome and reduce fitness. In studies of model organisms, mistranslation often acts as an adaptive response. These studies reveal evolutionarily conserved mechanisms to maintain proteostasis even during high rates of mistranslation. This review discusses the evolutionary basis of altered genetic codes, how mistranslation is identified, and how deviations from the genetic code are exploited. We revisit early discoveries of genetic code deviations and provide examples of adaptive mistranslation events in nature. Lastly, we highlight innovations in synthetic biology to expand the genetic code. The genetic code is still evolving. Mistranslation increases proteomic diversity that enables cells to survive stress conditions or suppress a deleterious allele. Genetic code variants have been identified by genome and metagenome sequence analyses, suppressor genetics, and biochemical characterization. Understanding the mechanisms of translation and genetic code deviations enables the design of new codes to produce novel proteins. Engineering the translation machinery and expanding the genetic code to incorporate non-canonical amino acids are valuable tools in synthetic biology that are impacting biomedical research. This article is part of a Special Issue entitled "Biochemistry of Synthetic Biology - Recent Developments" Guest Editor: Dr. Ilka Heinemann and Dr. Patrick O'Donoghue. Copyright © 2017 Elsevier B.V. All rights reserved.
Singh, Vineet K; Ring, Robert P; Aswani, Vijay; Stemper, Mary E; Kislow, Jennifer; Ye, Zhan; Shukla, Sanjay K
2017-12-01
Staphylococcus aureus is an opportunistic human pathogen that can cause serious infections in humans. A plethora of known and putative virulence factors are produced by staphylococci that collectively orchestrate pathogenesis. Ear protein (Escherichia coli ampicillin resistance) in S. aureus is an exoprotein in COL strain, predicted to be a superantigen, and speculated to play roles in antibiotic resistance and virulence. The goal of this study was to determine if expression of ear is modulated by single nucleotide polymorphisms in its promoter and coding sequences and whether this gene plays roles in antibiotic resistance and virulence. Promoter, coding sequences and expression of the ear gene in clinical and carriage S. aureus strains with distinct genetic backgrounds were analysed. The JE2 strain and its isogenic ear mutant were used in a systemic infection mouse model to determine the competitiveness of the ear mutant. Results/Key findings. The ear gene showed variable expression, with USA300FPR3757 showing high-level expression compared to many of the other strains tested, including some showing negligible expression. Higher expression was associated with agr type 1 but not correlated with phylogenetic relatedness of the ear gene based upon single nucleotide polymorphisms in the promoter or coding regions, suggesting complex regulation. An isogenic JE2 (USA300 background) ear mutant showed no significant difference in its growth, antibiotic susceptibility or virulence in a mouse model. Our data suggest that despite being highly expressed in a USA300 genetic background, Ear is not a significant contributor to virulence in that strain.
Sorimachi, Kenji; Okayasu, Teiji
2015-01-01
The complete vertebrate mitochondrial genome consists of 13 coding genes. We used this genome to investigate the existence of natural selection in vertebrate evolution. From the complete mitochondrial genomes, we predicted nucleotide contents and then separated these values into coding and non-coding regions. When nucleotide contents of a coding or non-coding region were plotted against the nucleotide content of the complete mitochondrial genomes, we obtained linear regression lines only between homonucleotides and their analogs. On every plot using purine (G or A) content, G content in aquatic vertebrates was higher than that in terrestrial vertebrates, while A content in aquatic vertebrates was lower than that in terrestrial vertebrates. Based on these relationships, vertebrates were separated into two groups, terrestrial and aquatic. However, using pyrimidine (C or T) content, clear separation between these two groups was not obtained. The hagfish (Eptatretus burgeri) was further separated from both terrestrial and aquatic vertebrates. Based on these results, nucleotide content relationships predicted from the complete vertebrate mitochondrial genomes reveal the existence of natural selection based on evolutionary separation between terrestrial and aquatic vertebrate groups. In addition, we propose that separation of the two groups might be linked to ammonia detoxification based on high G and low A contents, which encode Glu rich and Lys poor proteins.
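The nucleotide-content comparison described above can be illustrated with a minimal sketch. It assumes plain A/C/G/T strings and uses short made-up sequences, not real mitochondrial data; the function name `nucleotide_content` is hypothetical and is not taken from the study.

```python
from collections import Counter


def nucleotide_content(seq: str) -> dict[str, float]:
    """Return the fraction of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: counts[base] / total for base in "ACGT"}


# Hypothetical toy sequences standing in for a whole genome and one coding region.
whole_genome = "ATGGCGTACGTTAGCAAGGCTTAA"
coding_region = "ATGGCGTACGTTAGC"

genome_content = nucleotide_content(whole_genome)
coding_content = nucleotide_content(coding_region)

# One (x, y) point per base for a "region content vs. genome content" plot.
points = {base: (genome_content[base], coding_content[base]) for base in "ACGT"}
```

Repeating this for many species would give the points through which a regression line of region content against genome content could be fitted.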
NASA Technical Reports Server (NTRS)
Lacey, J. C., Jr.; Mullins, D. W., Jr.; Watkins, C. L.; Hall, L. M.
1986-01-01
Cellular organisms store information as sequences of nucleotides in double stranded DNA. This information is useless unless it can be converted into the active molecular species, protein. This is done in contemporary creatures first by transcription of one strand to give a complementary strand of mRNA. The sequence of nucleotides is then translated into a specific sequence of amino acids in a protein. Translation is made possible by a genetic coding system in which a sequence of three nucleotides codes for a specific amino acid. The origin and evolution of any chemical system can be understood through elucidation of the properties of the chemical entities which make up the system. There is an underlying logic to the coding system revealed by a correlation of the hydrophobicities of amino acids and their anticodonic nucleotides (i.e., the complement of the codon). Its importance lies in the fact that every amino acid going into protein synthesis must first be activated. This is universally accomplished with ATP. Past studies have concentrated on the chemistry of the adenylates, but more recently we have found, through the use of NMR, that we can observe intramolecular interactions even at low concentrations, between amino acid side chains and nucleotide base rings in these adenylates. The use of this type of compound thus affords a novel way of elucidating the manner in which amino acids and nucleotides interact with each other. In aqueous solution, when a hydrophobic amino acid is attached to the most hydrophobic nucleotide, AMP, a hydrophobic interaction takes place between the amino acid side chain and the adenine ring. The studies to be reported concern these hydrophobic interactions.
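The relationship between a codon and its anticodonic nucleotides can be sketched as follows. This is a simplified illustration that assumes strict Watson-Crick pairing and ignores wobble pairing and modified bases; the helper name `anticodon` is hypothetical. Because both sequences are written 5' to 3', the complement is also reversed.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon(codon: str) -> str:
    """Return the anticodon (5'->3') that pairs with an mRNA codon (5'->3')."""
    codon = codon.upper()
    if len(codon) != 3 or set(codon) - set("ACGU"):
        raise ValueError("codon must be three RNA bases (A, C, G, U)")
    # Complement each base, then reverse so the result also reads 5'->3'.
    return codon.translate(COMPLEMENT)[::-1]


met_anticodon = anticodon("AUG")  # methionine codon
phe_anticodon = anticodon("UUU")  # phenylalanine codon
```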
CHASM and SNVBox: toolkit for detecting biologically important single nucleotide mutations in cancer
Carter, Hannah; Diekhans, Mark; Ryan, Michael C.; Karchin, Rachel
2011-01-01
Summary: Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. Availability and Implementation: MySQL database, source code and binaries freely available for academic/government use at http://wiki.chasmsoftware.org, Source in Python and C++. Requires 32 or 64-bit Linux system (tested on Fedora Core 8,10,11 and Ubuntu 10), 2.5*≤ Python <3.0*, MySQL server >5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), 2 GB of RAM. Contact: karchin@jhu.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:21685053
CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.
Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H
2010-07-06
The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
Lourenco-Jaramillo, Diana Lelidett; Sifuentes-Rincón, Ana María; Parra-Bracamonte, Gaspar Manuel; de la Rosa-Reyna, Xochitl Fabiola; Segura-Cabrera, Aldo; Arellano-Vera, Williams
2012-01-01
DNA from four cattle breeds was used to re-sequence all of the exons and 56% of the introns of the bovine tyrosine hydroxylase (TH) gene and 97% and 13% of the bovine dopamine β-hydroxylase (DBH) coding and non-coding sequences, respectively. Two novel single nucleotide polymorphisms (SNPs) and a microsatellite motif were found in the TH sequences. The DBH sequences contained 62 nucleotide changes, including eight non-synonymous SNPs (nsSNPs) that are of particular interest because they may alter protein function and therefore affect the phenotype. These DBH nsSNPs resulted in amino acid substitutions that were predicted to destabilize the protein structure. Six SNPs (one from TH and five from DBH non-synonymous SNPs) were genotyped in 140 animals; all of them were polymorphic and had a minor allele frequency of > 9%. There were significant differences in the intra- and inter-population haplotype distributions. The haplotype differences between Brahman cattle and the three B. t. taurus breeds (Charolais, Holstein and Lidia) were interesting from a behavioural point of view because of the differences in temperament between these breeds. PMID:22888292
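The minor allele frequency reported above is a simple count over genotyped alleles. The sketch below assumes a biallelic SNP and genotypes written as two-letter strings; the genotype list is invented for illustration and is not data from the study.

```python
from collections import Counter


def minor_allele_frequency(genotypes: list[str]) -> float:
    """Return the frequency of the rarer allele at a biallelic SNP."""
    alleles = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(alleles.values())
    if total == 0:
        raise ValueError("no genotypes supplied")
    if len(alleles) > 2:
        raise ValueError("more than two alleles observed; SNP is not biallelic")
    if len(alleles) == 1:
        return 0.0  # monomorphic in this sample
    return min(alleles.values()) / total


# Hypothetical sample of 10 animals: 7 AA, 2 AG, 1 GG.
sample = ["AA"] * 7 + ["AG"] * 2 + ["GG"]
maf = minor_allele_frequency(sample)
```

In this toy sample the G allele occurs 4 times among 20 alleles, so the SNP would pass a 9% threshold.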
Costa, Valerio; Federico, Antonio; Pollastro, Carla; Ziviello, Carmela; Cataldi, Simona; Formisano, Pietro; Ciccodicola, Alfredo
2016-01-01
Type 2 diabetes (T2D) is one of the most frequent mortality causes in western countries, with rapidly increasing prevalence. Anti-diabetic drugs are the first therapeutic approach, although many patients develop drug resistance. Most drug responsiveness variability can be explained by genetic causes. Inter-individual variability is principally due to single nucleotide polymorphisms, and differential drug responsiveness has been correlated to alteration in genes involved in drug metabolism (CYP2C9) or insulin signaling (IRS1, ABCC8, KCNJ11 and PPARG). However, most genome-wide association studies did not provide clues about the contribution of DNA variations to impaired drug responsiveness. Thus, characterizing T2D drug responsiveness variants is needed to guide clinicians toward tailored therapeutic approaches. Here, we extensively investigated polymorphisms associated with altered drug response in T2D, predicting their effects in silico. Combining different computational approaches, we focused on the expression pattern of genes correlated to drug resistance and inferred evolutionary conservation of polymorphic residues, computationally predicting the biochemical properties of polymorphic proteins. Using RNA-Sequencing followed by targeted validation, we identified and experimentally confirmed that two nucleotide variations in the CAPN10 gene—currently annotated as intronic—fall within two new transcripts in this locus. Additionally, we found that a Single Nucleotide Polymorphism (SNP), currently reported as intergenic, maps to the intron of a new transcript, harboring CAPN10 and GPR35 genes, which undergoes non-sense mediated decay. Finally, we analyzed variants that fall into non-coding regulatory regions of yet underestimated functional significance, predicting that some of them can potentially affect gene expression and/or post-transcriptional regulation of mRNAs affecting the splicing. PMID:27347941
A large-scale study of the random variability of a coding sequence: a study on the CFTR gene.
Modiano, Guido; Bombieri, Cristina; Ciminelli, Bianca Maria; Belpinati, Francesca; Giorgi, Silvia; Georges, Marie des; Scotet, Virginie; Pompei, Fiorenza; Ciccacci, Cinzia; Guittard, Caroline; Audrézet, Marie Pierre; Begnini, Angela; Toepfer, Michael; Macek, Milan; Ferec, Claude; Claustres, Mireille; Pignatti, Pier Franco
2005-02-01
Coding single nucleotide substitutions (cSNSs) have been studied on hundreds of genes using small samples (n(g) approximately 100-150 genes). In the present investigation, a large random European population sample (average n(g) approximately 1500) was studied for a single gene, the CFTR (Cystic Fibrosis Transmembrane conductance Regulator). The nonsynonymous (NS) substitutions exhibited, in accordance with previous reports, a mean probability of being polymorphic (q > 0.005), much lower than that of the synonymous (S) substitutions, but they showed a similar rate of subpolymorphic (q < 0.005) variability. This indicates that, in autosomal genes that may have harmful recessive alleles (nonduplicated genes with important functions), genetic drift overwhelms selection in the subpolymorphic range of variability, making disadvantageous alleles behave as neutral. These results imply that the majority of the subpolymorphic nonsynonymous alleles of these genes are selectively negative or even pathogenic.
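The distinction between polymorphic and subpolymorphic variants rests on the allele frequency q and the 0.005 threshold quoted above. A minimal sketch, assuming q is estimated as the allele count divided by the number of genes (chromosomes) sampled; the function name and the example counts are hypothetical, and a frequency exactly at the threshold is assigned to the subpolymorphic class as an arbitrary choice.

```python
POLYMORPHIC_THRESHOLD = 0.005  # threshold on q stated in the abstract


def classify_variant(allele_count: int, genes_sampled: int) -> str:
    """Classify a substitution by its sample allele frequency q."""
    if genes_sampled <= 0 or not 0 <= allele_count <= genes_sampled:
        raise ValueError("need 0 <= allele_count <= genes_sampled and genes_sampled > 0")
    q = allele_count / genes_sampled
    if q == 0:
        return "not observed"
    # Assumption: q exactly at the threshold counts as subpolymorphic.
    return "polymorphic" if q > POLYMORPHIC_THRESHOLD else "subpolymorphic"


# Hypothetical counts in a sample of about 1500 genes.
rare = classify_variant(6, 1500)     # q = 0.004
common = classify_variant(9, 1500)   # q = 0.006
```

With samples of only 100 to 150 genes, a single observation already gives q above 0.005, which is why the subpolymorphic range needs a sample of the size used here.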
Henrich, Oliver; Gutiérrez Fosado, Yair Augusto; Curk, Tine; Ouldridge, Thomas E
2018-05-10
During the last decade coarse-grained nucleotide models have emerged that allow us to study DNA and RNA on unprecedented time and length scales. Among them is oxDNA, a coarse-grained, sequence-specific model that captures the hybridisation transition of DNA and many structural properties of single- and double-stranded DNA. oxDNA was previously only available as standalone software, but has now been implemented into the popular LAMMPS molecular dynamics code. This article describes the new implementation and analyses its parallel performance. Practical applications are presented that focus on single-stranded DNA, an area of research which has been so far under-investigated. The LAMMPS implementation of oxDNA lowers the entry barrier for using the oxDNA model significantly, facilitates future code development and interfacing with existing LAMMPS functionality as well as other coarse-grained and atomistic DNA models.
Miyamoto, T; Koh, E; Tsujimura, A; Miyagawa, Y; Saijo, Y; Namiki, M; Sengoku, K
2014-04-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, ten novel genes involved in human spermatogenesis, including human LRWD1, have been identified by expression microarray analysis of human testicular tissue. The human LRWD1 protein mediates the origin recognition complex in chromatin, which is critical for the initiation of pre-replication complex assembly in G1 and chromatin organization in post-G1 cells. The Lrwd1 gene expression is specific to the testis in mice. Therefore, we hypothesized that mutation or polymorphisms of LRWD1 participate in male infertility, especially azoospermia. To investigate whether LRWD1 gene defects are associated with azoospermia caused by Sertoli cell-only syndrome (SCOS) and meiotic arrest (MA), mutational analysis was performed in 100 and 30 Japanese patients by direct sequencing of the coding regions, respectively. Statistical analysis was performed for patients with SCOS and MA and in 100 healthy control men. No mutations were found in LRWD1; however, three coding single-nucleotide polymorphisms (SNP1-SNP3) could be detected in the patients. The genotype and allele frequencies in SNP1 and SNP2 were notably higher in the SCOS group than in the control group (P < 0.05). These results suggest the critical role of LRWD1 in human spermatogenesis. © 2013 Blackwell Verlag GmbH.
Bester-Van Der Merwe, Aletta; Blaauw, Sonja; Du Plessis, Jana; Roodt-Wilding, Rouvay
2013-09-23
Haliotis midae is one of the most valuable commercial abalone species in the world, but is highly vulnerable, due to exploitation, habitat destruction and predation. In order to preserve wild and cultured stocks, genetic management and improvement of the species has become crucial. Fundamental to this is the availability and employment of molecular markers, such as microsatellites and single nucleotide polymorphisms (SNPs). Transcriptome sequences generated through sequencing-by-synthesis technology were utilized for the in vitro and in silico identification of 505 putative SNPs from a total of 316 selected contigs. A subset of 234 SNPs were further validated and characterized in wild and cultured abalone using two Illumina GoldenGate genotyping assays. Combined with VeraCode technology, this genotyping platform yielded a 65%-69% conversion rate (percentage polymorphic markers) with a global genotyping success rate of 76%-85% and provided a viable means for validating SNP markers in a non-model species. The utility of 31 of the validated SNPs in population structure analysis was confirmed, while a large number of SNPs (174) were shown to be informative and are, thus, good candidates for linkage map construction. The non-synonymous SNPs (50) located in coding regions of genes that showed similarities with known proteins will also be useful for genetic applications, such as the marker-assisted selection of genes of relevance to abalone aquaculture.
Zhao, Na; Xiao, Jianqiu; Zheng, Zhiyong; Fei, Guoqiang; Zhang, Feng; Jin, Lirong; Zhong, Chunjiu
2015-04-01
Our previous studies have demonstrated that ceruloplasmin (CP) dysmetabolism is correlated with Parkinson's disease (PD). However, the causes of decreased serum CP levels in PD patients remain to be clarified. This study aimed to explore the potential association between genetic variants of the CP gene and PD. Clinical features, serum CP levels, and the CP gene (both promoter and coding regions) were analyzed in 60 PD patients and 50 controls. A luciferase reporter system was used to investigate the function of promoter single-nucleotide polymorphisms (SNPs). High-density comparative genomic hybridization microarrays were also used to detect large-scale copy-number variations in CP and an additional 47 genes involved in PD and/or copper/iron metabolism. The frequencies of eight SNPs (one intronic SNP and seven promoter SNPs of the CP gene) and their haplotypes were significantly different between PD patients, especially those with lowered serum CP levels, and controls. However, the luciferase reporter system revealed no significant effect of the risk haplotype on promoter activity of the CP gene. Neither these SNPs nor their haplotypes were correlated with the Hoehn and Yahr staging of PD. The results of this study suggest that common genetic variants of CP are associated with PD and further investigation is needed to explore their functions in PD.
Ferlaino, Michael; Rogers, Mark F.; Shihab, Hashem A.; Mort, Matthew; Cooper, David N.; Gaunt, Tom R.; Campbell, Colin
2018-01-01
Background Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. Results We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. Conclusions FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome. PMID:28985712
Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon
2015-01-01
A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis of coding region SNPs is also applicable to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. From the study of Joseon skeleton samples, we found that the mtDNA haplogroup determined by coding region SNP markers falls within the same haplogroup that sequence analysis of the control region assigns. Considering that ancient samples in previous studies yielded no small number of errors in control region mtDNA sequencing, coding region SNP analysis can be used as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190
Ferlaino, Michael; Rogers, Mark F; Shihab, Hashem A; Mort, Matthew; Cooper, David N; Gaunt, Tom R; Campbell, Colin
2017-10-06
Small insertions and deletions (indels) have a significant influence in human disease and, in terms of frequency, they are second only to single nucleotide variants as pathogenic mutations. As the majority of mutations associated with complex traits are located outside the exome, it is crucial to investigate the potential pathogenic impact of indels in non-coding regions of the human genome. We present FATHMM-indel, an integrative approach to predict the functional effect, pathogenic or neutral, of indels in non-coding regions of the human genome. Our method exploits various genomic annotations in addition to sequence data. When validated on benchmark data, FATHMM-indel significantly outperforms CADD and GAVIN, state of the art models in assessing the pathogenic impact of non-coding variants. FATHMM-indel is available via a web server at indels.biocompute.org.uk. FATHMM-indel can accurately predict the functional impact and prioritise small indels throughout the whole non-coding genome.
The nucleotide sequence and genome organization of Plasmopara halstedii virus.
Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar
2011-03-17
Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates. Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.
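The segment lengths quoted in this abstract can be cross-checked with a few lines of arithmetic: for each RNA, the 5' UTR, ORF and 3' UTR should sum to the total length exclusive of the poly(A) tract. The sketch below uses only the numbers given in the abstract; the `Segment` class is a hypothetical helper, and the codon count assumes the stated ORF length includes the stop codon, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    total_nt: int  # length exclusive of the 3' poly(A) tract
    utr5_nt: int
    orf_nt: int
    utr3_nt: int

    def is_consistent(self) -> bool:
        """True if 5' UTR + ORF + 3' UTR equals the reported total length."""
        return self.utr5_nt + self.orf_nt + self.utr3_nt == self.total_nt

    def codons(self) -> int:
        """Number of codons in the ORF (includes the stop codon if it was counted)."""
        if self.orf_nt % 3:
            raise ValueError(f"{self.name}: ORF length is not a multiple of three")
        return self.orf_nt // 3


rna1 = Segment("RNA1", total_nt=2793, utr5_nt=18, orf_nt=2745, utr3_nt=30)
rna2 = Segment("RNA2", total_nt=1526, utr5_nt=164, orf_nt=1128, utr3_nt=234)

for segment in (rna1, rna2):
    print(segment.name, segment.is_consistent(), segment.codons())
```

Both segments are internally consistent, and both ORF lengths are multiples of three.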
Arlindo, Samuel; Calo, Pilar; Franco, Carlos; Prado, Marta; Cepeda, Alberto; Barros-Velázquez, Jorge
2006-12-01
The bacteriocins produced by two lactic acid bacteria isolated from nonfermented fresh meat and fish, respectively, and exhibiting a remarkable antilisterial activity, were characterized. Bacteriocinogenic strains were identified as Enterococcus faecium and the maximum bacteriocin production by both strains was detected in the stationary phase of growth. The activity against Listeria monocytogenes was maintained in pH range of 3-7 and was stable in both strains after heating at 100 or 121 degrees C. The genes coding for enterocin P were detected, isolated, and sequenced in both E. faecium strains. They exhibited DNA/DNA homology in the 87.1-97.2% range with respect to the other four enterocin P genes reported so far. Three single nucleotide polymorphism events, silent at the amino acid level, were detected at nucleotide positions 45 (G/A), 75 (A/G), and 90 (T/C) in E. faecium LHICA 28-4 and may explain the differences reported for those loci in other enterocin P-producing E. faecium strains. This work provides the first description of enterocin P-producing E. faecium strains in nonfermented foodstuffs and, in the case of E. faecium LHICA 51, the first report of an enterocin P-producing strain isolated from fish so far.
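The three silent polymorphisms at positions 45, 75 and 90 are consistent with third-codon-position changes, which can be checked by mapping each nucleotide position to its codon. The sketch assumes positions are 1-based and counted from the first nucleotide of the start codon; the abstract does not state the numbering convention, and the function name is hypothetical.

```python
def codon_coordinates(nt_position: int) -> tuple[int, int]:
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    if nt_position < 1:
        raise ValueError("positions are 1-based")
    codon_number = (nt_position - 1) // 3 + 1
    position_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, position_in_codon


# SNP positions reported in the abstract (numbering convention assumed, see above).
snp_positions = [45, 75, 90]
coordinates = {pos: codon_coordinates(pos) for pos in snp_positions}
```

Under that assumption all three SNPs fall on the third base of codons 15, 25 and 30, the position where substitutions most often leave the encoded amino acid unchanged.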
Koole, Cassandra; Savage, Emilia E.; Christopoulos, Arthur; Miller, Laurence J.
2013-01-01
The glucagon-like peptide-1 receptor (GLP-1R) controls the physiological responses to the incretin hormone glucagon-like peptide-1 and is a major therapeutic target for the treatment of type 2 diabetes, owing to the broad range of effects that are mediated upon its activation. These include the promotion of glucose-dependent insulin secretion, increased insulin biosynthesis, preservation of β-cell mass, improved peripheral insulin action, and promotion of weight loss. Regulation of GLP-1R function is complex, with multiple endogenous and exogenous peptides that interact with the receptor that result in the activation of numerous downstream signaling cascades. The current understanding of GLP-1R signaling and regulation is limited, with the desired spectrum of signaling required for the ideal therapeutic outcome still to be determined. In addition, there are several single-nucleotide polymorphisms (used in this review as defining a natural change of single nucleotide in the receptor sequence; clinically, this is viewed as a single-nucleotide polymorphism only if the frequency of the mutation occurs in 1% or more of the population) distributed within the coding sequence of the receptor protein that have the potential to produce differential responses for distinct ligands. In this review, we discuss the current understanding of GLP-1R function, in particular highlighting recent advances in the field on ligand-directed signal bias, allosteric modulation, and probe dependence and the implications of these behaviors for drug discovery and development. PMID:23864649
Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.
Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant
2017-11-28
Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.
Chen, L P; E, G X; Zhao, Y J; Na, R S; Zhao, Z Q; Zhang, J H; Ma, Y H; Sun, Y W; Zhong, T; Zhang, H P; Huang, Y F
2015-06-18
DRA encodes the alpha chain of the DR heterodimer, is closely linked to DRB and is considered almost monomorphic in the major histocompatibility complex region. In this study, we examined exon 2 of DRA to evaluate the immunogenetic diversity of indigenous goats of southern China. Two single nucleotide polymorphisms in an untranslated region and one synonymous substitution in the coding region were identified. These data suggest high immunodiversity in the native Chinese goat population.
Complex genetic diseases: controversy over the Croesus code.
Wright, A F; Hastie, N D
2001-01-01
The polarization of views on how best to exploit new information from the Human Genome Project for medicine reflects our ignorance of the genetic architecture underlying common diseases: are susceptibility alleles common or rare, neutral or deleterious, few or many? Single-nucleotide polymorphism (SNP) technology is almost in place to dissect such diseases and to create a personalized medicine, but success is critically dependent on the biology and "Nature to be commanded must be obeyed" (Francis Bacon, 1620, Novum Organum).
ADOMA: A Command Line Tool to Modify ClustalW Multiple Alignment Output.
Zaal, Dionne; Nota, Benjamin
2016-01-01
We present ADOMA, a command line tool that produces alternative outputs from ClustalW multiple alignments of nucleotide or protein sequences. ADOMA can simplify the output of alignments by showing only the different residues between sequences, which is often desirable when only small differences such as single nucleotide polymorphisms are present (e.g., between different alleles). Another feature of ADOMA is that it can enhance the ClustalW output by coloring the residues in the alignment. This tool is easily integrated into automated Linux pipelines for next-generation sequencing data analysis, and may be useful for researchers in a broad range of scientific disciplines including evolutionary biology and biomedical sciences. The source code is freely available at https://sourceforge.net/projects/adoma/. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
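The idea of showing only the residues that differ between aligned sequences can be sketched in a few lines. This is not ADOMA's code or its output format; it is an independent illustration that assumes pre-aligned sequences of equal length, uses the first sequence as the reference, and uses invented allele names and sequences.

```python
def show_differences(aligned: dict[str, str], match_char: str = ".") -> dict[str, str]:
    """Replace residues identical to the first (reference) sequence with match_char."""
    names = list(aligned)
    if not names:
        return {}
    reference = aligned[names[0]]
    if any(len(seq) != len(reference) for seq in aligned.values()):
        raise ValueError("aligned sequences must all have the same length")
    result = {names[0]: reference}  # the reference is shown in full
    for name in names[1:]:
        result[name] = "".join(
            match_char if residue == ref_residue else residue
            for residue, ref_residue in zip(aligned[name], reference)
        )
    return result


# Hypothetical alleles differing by single nucleotides.
alleles = {
    "allele_1": "ATGGCGTAC",
    "allele_2": "ATGGTGTAC",
    "allele_3": "ATGGCGTAT",
}
simplified = show_differences(alleles)
for name, row in simplified.items():
    print(f"{name}  {row}")
```

Masking the matching columns makes each single-nucleotide difference stand out immediately, which is the use case the abstract describes for comparing alleles.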
Single Color Multiplexed ddPCR Copy Number Measurements and Single Nucleotide Variant Genotyping.
Wood-Bouwens, Christina M; Ji, Hanlee P
2018-01-01
Droplet digital PCR (ddPCR) allows for accurate quantification of genetic events such as copy number variation and single nucleotide variants. Probe-based assays represent the current "gold-standard" for detection and quantification of these genetic events. Here, we introduce a cost-effective single color ddPCR assay that allows for single genome resolution quantification of copy number and single nucleotide variation.
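Quantification in droplet digital PCR is commonly based on a Poisson correction of the fraction of negative droplets. The sketch below shows that general calculation and a copy number estimate against a reference locus; it is not the single color assay described in the abstract, and the function names, droplet counts and the two-copy reference are assumptions made for illustration.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson estimate of mean template copies per droplet: -ln(negative fraction)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    negative_fraction = (total - positive) / total
    return -math.log(negative_fraction)


def copy_number(target_pos: int, target_total: int,
                ref_pos: int, ref_total: int, ref_copies: int = 2) -> float:
    """Copy number of the target relative to a reference locus with ref_copies copies."""
    reference = copies_per_droplet(ref_pos, ref_total)
    if reference == 0:
        raise ValueError("reference locus has no positive droplets")
    return ref_copies * copies_per_droplet(target_pos, target_total) / reference


# Hypothetical droplet counts for a target and a diploid reference locus.
estimate = copy_number(target_pos=6000, target_total=15000,
                       ref_pos=4200, ref_total=15000)
```

The logarithm accounts for droplets that contain more than one template molecule, which a simple positive fraction would undercount.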
Amexis, Georgios; Rubin, Steven; Chatterjee, Nando; Carbone, Kathryn; Chumakov, Kostantin
2003-06-01
A single clinical isolate of mumps virus designated 88-1961 was obtained from a patient hospitalized with a clinical history of upper respiratory tract infection, parotitis, severe headache, fever and lymphadenopathy. We have sequenced the full-length genome of 88-1961 and compared it against all available full-length sequences of mumps virus. Based upon its nucleotide sequence of the SH gene 88-1961 was identified as a genotype H mumps strain. The overall extent of nucleotide and amino acid differences between each individual gene and protein of 88-1961 and the full-length mumps samples showed that the missense to silent ratios were unevenly distributed. Upon evaluation of the consensus sequence of 88-1961, four positions were found to be clearly heterogeneous at the nucleotide level (NP 315C/T, NP 318C/T, F 271A/C, and HN 855C/T). Sequence analysis revealed that the amino acid sequences for the NP, M, and the L protein were the most conserved, whereas the SH protein exhibited the highest variability among the compared mumps genotypes A, B, and G. No identifying molecular patterns in the non-coding (intergenic) or coding regions of 88-1961 were found when we compared it against relatively virulent (Urabe AM9 B, Glouc1/UK96, 87-1004 and 87-1005) and non-virulent mumps strains (Jeryl Lynn and all Urabe Am9 A substrains). Copyright 2003 Wiley-Liss, Inc.
Alvarado, David M; Yang, Ping; Druley, Todd E; Lovett, Michael; Gurnett, Christina A
2014-06-01
Despite declining sequencing costs, few methods are available for cost-effective single-nucleotide polymorphism (SNP), insertion/deletion (INDEL) and copy number variation (CNV) discovery in a single assay. Commercially available methods require a high investment to a specific region and are only cost-effective for large samples. Here, we introduce a novel, flexible approach for multiplexed targeted sequencing and CNV analysis of large genomic regions called multiplexed direct genomic selection (MDiGS). MDiGS combines biotinylated bacterial artificial chromosome (BAC) capture and multiplexed pooled capture for SNP/INDEL and CNV detection of 96 multiplexed samples on a single MiSeq run. MDiGS is advantageous over other methods for CNV detection because pooled sample capture and hybridization to large contiguous BAC baits reduces sample and probe hybridization variability inherent in other methods. We performed MDiGS capture for three chromosomal regions consisting of ∼ 550 kb of coding and non-coding sequence with DNA from 253 patients with congenital lower limb disorders. PITX1 nonsense and HOXC11 S191F missense mutations were identified that segregate in clubfoot families. Using a novel pooled-capture reference strategy, we identified recurrent chromosome chr17q23.1q23.2 duplications and small HOXC 5' cluster deletions (51 kb and 12 kb). Given the current interest in coding and non-coding variants in human disease, MDiGS fulfills a niche for comprehensive and low-cost evaluation of CNVs, coding, and non-coding variants across candidate regions of interest. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Helfenbein, Kevin G.; Fourcade, H. Matthew; Vanjani, Rohit G.
2004-05-01
We report the first complete mitochondrial (mt) DNA sequence from a member of the phylum Chaetognatha (arrow worms). The Paraspadella gotoi mtDNA is highly unusual, missing 23 of the genes commonly found in animal mtDNAs, including atp6, which has otherwise been found universally to be present. Its 14 genes are unusually arranged into two groups, one on each strand. One group is punctuated by numerous non-coding intergenic nucleotides, while the other group is tightly packed, having no non-coding nucleotides, leading to speculation that there are two transcription units with differing modes of expression. The phylogenetic position of the Chaetognatha within the Metazoa has long been uncertain, with conflicting or equivocal results from various morphological analyses and rRNA sequence comparisons. Comparisons here of amino acid sequences from mitochondrially encoded proteins give a single most parsimonious tree that supports a position of Chaetognatha as sister to the protostomes studied here. From this, one can more clearly interpret the patterns of evolution of various developmental features, especially regarding the embryological fate of the blastopore.
Optimization of algorithm of coding of genetic information of Chlamydia
NASA Astrophysics Data System (ADS)
Feodorova, Valentina A.; Ulyanov, Sergey S.; Zaytsev, Sergey S.; Saltykov, Yury V.; Ulianova, Onega V.
2018-04-01
A new method of coding genetic information using coherent optical fields is developed. A universal technique for transforming the nucleotide sequence of a bacterial gene into a laser speckle pattern is suggested. Reference speckle patterns are generated for the nucleotide sequences of the omp1 gene of typical wild strains of Chlamydia trachomatis of genovars D, E, F, G, J and K, as well as of Chlamydia psittaci serovar I. The algorithm for coding gene information into a speckle pattern is optimized. The formation of fully developed speckles with Gaussian statistics in the gene-based speckle patterns has been used as the optimization criterion.
The complete chloroplast genome of Sinopodophyllum hexandrum Ying (Berberidaceae).
Meng, Lihua; Liu, Ruijuan; Chen, Jianbing; Ding, Chenxu
2017-05-01
The complete nucleotide sequence of the Sinopodophyllum hexandrum Ying chloroplast genome (cpDNA) was determined based on next-generation sequencing technologies in this study. The genome was 157 203 bp in length, containing a pair of inverted repeat (IRa and IRb) regions of 25 960 bp each, which were separated by a large single-copy (LSC) region of 87 065 bp and a small single-copy (SSC) region of 18 218 bp. The cpDNA contained 148 genes, including 96 protein-coding genes, 8 ribosomal RNA genes, and 44 tRNA genes. Among these genes, eight harbored a single intron, and two (ycf3 and clpP) contained two introns. The AT content of the S. hexandrum cpDNA is 61.5%.
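The region lengths quoted in this abstract can be cross-checked against the reported genome size, because a quadripartite plastome is the sum of the LSC, the SSC and two inverted-repeat copies. A minimal sketch using only the figures in the abstract; the variable and function names are illustrative assumptions, not part of the study:

```python
# Region lengths (bp) as quoted in the abstract.
LSC = 87_065   # large single-copy region
SSC = 18_218   # small single-copy region
IR = 25_960    # one inverted repeat; the genome carries two (IRa and IRb)

def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a plastome with two inverted-repeat copies."""
    return lsc + ssc + 2 * ir

total = quadripartite_length(LSC, SSC, IR)
gene_total = 96 + 8 + 44            # protein-coding + rRNA + tRNA genes
gc_percent = round(100 - 61.5, 1)   # GC content implied by the stated AT content

print(total, gene_total, gc_percent)
```

Under these figures the sum reproduces the stated 157 203 bp and 148 genes, which is a quick sanity check worth running on any annotated plastome summary.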
Huang, C.; Chien, M.S.; Landolt, M.L.; Batts, W.; Winton, J.
1996-01-01
Twelve neutralizing monoclonal antibodies (MAbs) against the fish rhabdovirus, infectious haematopoietic necrosis virus (IHNV), were used to select 20 MAb escape mutants. The nucleotide sequence of the entire glycoprotein (G) gene was determined for six mutants representing differing cross-neutralization patterns and each had a single nucleotide change leading to a single amino acid substitution within one of three regions of the protein. These data were used to design nested PCR primers to amplify portions of the G gene of the 14 remaining mutants. When the PCR products from these mutants were sequenced, they also had single nucleotide substitutions coding for amino acid substitutions at the same, or nearby, locations. Of the 20 mutants for which all or part of the glycoprotein gene was sequenced, two MAbs selected mutants with substitutions at amino acids 230-231 (antigenic site I) and the remaining MAbs selected mutants with substitutions at amino acids 272-276 (antigenic site II). Two MAbs that selected mutants mapping to amino acids 272-276, selected other mutants that mapped to amino acids 78-81, raising the possibility that this portion of the N terminus of the protein was part of a discontinuous epitope defining antigenic site II. CLUSTAL alignment of the glycoproteins of rabies virus, vesicular stomatitis virus and IHNV revealed similarities in the location of the neutralizing epitopes and a high degree of conservation among cysteine residues, indicating that the glycoproteins of three different genera of animal rhabdoviruses may share a similar three-dimensional structure in spite of extensive sequence divergence.
Deng, Hong-Zhu; You, Cong; Xing, Yu; Chen, Kai-Yun; Zou, Xiao-Bing
2016-05-01
Autism spectrum disorder is a group of neurodevelopmental disorders with a higher prevalence in males. Our previous studies have indicated lower progesterone levels in children with autism spectrum disorder, suggesting involvement of the cytochrome P-450scc gene (CYP11A1) and cytochrome P-45011beta gene (CYP11B1) as candidate genes in autism spectrum disorder. The aim of this study was to investigate the family-based genetic association between autism spectrum disorder in Chinese children and three single-nucleotide polymorphisms, rs2279357 in the CYP11A1 gene and rs4534 and rs4541 in the CYP11B1 gene, which were selected according to their location in the coding region and 5' and 3' regions and minor allele frequencies of greater than 0.05 in the Chinese populations. The transmission disequilibrium test and case-control association analyses were performed in 100 Chinese Han autism spectrum disorder family trios. The genotype and allele frequencies of the 3 single-nucleotide polymorphisms showed no statistical difference between the children with autism spectrum disorder and their parents (P > .05). Transmission disequilibrium test analysis showed transmission disequilibrium of the CYP11A1 gene rs2279357 single-nucleotide polymorphism (χ² = 5.038, P < .001). Our findings provide further support for the hypothesis that a susceptibility gene for autism spectrum disorder exists within or near the CYP11A1 gene in the Han Chinese population. © The Author(s) 2015.
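The transmission disequilibrium test named in this abstract compares how often heterozygous parents pass one allele, rather than the other, to an affected child. The following is a minimal sketch of the standard McNemar-type statistic with one degree of freedom. The function name and the counts are hypothetical and are not taken from the study, whose exact procedure is not described here.

```python
import math

def tdt(transmitted: int, untransmitted: int):
    """Return (chi-square, p-value) for the TDT with 1 degree of freedom.

    transmitted / untransmitted: number of times heterozygous parents did /
    did not pass the allele of interest to an affected child.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (transmitted - untransmitted) ** 2 / n
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts, for illustration only.
chi2, p = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Only heterozygous parents enter the counts, which is what makes the test robust to population stratification in trio designs.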
Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.
Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro
2010-05-07
Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.
Oh, Juliana J.; Koegel, Ashley; Phan, Diana T.; Razfar, Ali; Slamon, Dennis J.
2007-01-01
Summary Allele loss and genetic alteration in chromosome 3p, particularly in the 3p21.3 region, are the most frequent and the earliest genomic abnormalities found in lung cancer. Multiple 3p21.3 genes exhibit various degrees of tumour suppression activity, suggesting that 3p21.3 genes may function as an integrated tumour suppressor region through their diverse biological activities. We have previously demonstrated the growth inhibitory effects and tumour suppression mechanism of the H37/RBM5 gene, which is one of the 19 genes residing in the 370 kb minimal overlap region at 3p21.3. In the current study, in an attempt to find any mutations in the H37 coding region in lung cancer cells, we compared nucleotide sequences of the entire H37 gene in tumour vs. adjacent normal tissues from 17 non-small cell lung cancer (NSCLC) patients. No mutations were detected; instead, we found two silent single nucleotide polymorphisms (SNPs), C1138T and C2185T, within the coding region of the H37 gene. In addition, we found that specific allele types at these SNP positions are correlated with different histological subtypes of NSCLC; tumours containing heterozygous alleles (C+T) at these SNP positions are more likely to be associated with adenocarcinoma (AC) whereas homozygous alleles (either C or T) are associated with squamous cell carcinoma (SCC) (p=0.0098). We postulate that these two silent polymorphisms may be in linkage disequilibrium (LD) with a disease causative allele in the 3p21.3 tumour suppressor region, which is packed with a large number of important genes affecting lung cancer development. In addition, because of the prevalent loss of heterozygosity (LOH) detected at 3p21.3, which precedes lung cancer initiation, these SNPs may be developed into a marker for screening high-risk individuals. PMID:17606309
Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity
Shabalina, Svetlana A.; Spiridonov, Nikolay A.; Kashina, Anna
2013-01-01
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. The RNA level selection is acting on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions. PMID:23293005
Tang, Clara S; Zhang, He; Cheung, Chloe Y Y; Xu, Ming; Ho, Jenny C Y; Zhou, Wei; Cherny, Stacey S; Zhang, Yan; Holmen, Oddgeir; Au, Ka-Wing; Yu, Haiyi; Xu, Lin; Jia, Jia; Porsch, Robert M; Sun, Lijie; Xu, Weixian; Zheng, Huiping; Wong, Lai-Yung; Mu, Yiming; Dou, Jingtao; Fong, Carol H Y; Wang, Shuyu; Hong, Xueyu; Dong, Liguang; Liao, Yanhua; Wang, Jiansong; Lam, Levina S M; Su, Xi; Yan, Hua; Yang, Min-Lee; Chen, Jin; Siu, Chung-Wah; Xie, Gaoqiang; Woo, Yu-Cho; Wu, Yangfeng; Tan, Kathryn C B; Hveem, Kristian; Cheung, Bernard M Y; Zöllner, Sebastian; Xu, Aimin; Eugene Chen, Y; Jiang, Chao Qiang; Zhang, Youyi; Lam, Tai-Hing; Ganesh, Santhi K; Huo, Yong; Sham, Pak C; Lam, Karen S L; Willer, Cristen J; Tse, Hung-Fat; Gao, Wei
2015-12-22
Blood lipids are important risk factors for coronary artery disease (CAD). Here we perform an exome-wide association study by genotyping 12,685 Chinese, using a custom Illumina HumanExome BeadChip, to identify additional loci influencing lipid levels. Single-variant association analysis on 65,671 single nucleotide polymorphisms reveals 19 loci associated with lipids at exome-wide significance (P < 2.69 × 10⁻⁷), including three Asian-specific coding variants in known genes (CETP p.Asp459Gly, PCSK9 p.Arg93Cys and LDLR p.Arg257Trp). Furthermore, missense variants at two novel loci (PNPLA3 p.Ile148Met and PKD1L3 p.Thr429Ser) also influence levels of triglycerides and low-density lipoprotein cholesterol, respectively. Another novel gene, TEAD2, is found to be associated with high-density lipoprotein cholesterol through gene-based association analysis. Most of these newly identified coding variants show suggestive association (P < 0.05) with CAD. These findings demonstrate that exome-wide genotyping on samples of non-European ancestry can identify additional population-specific possible causal variants, shedding light on novel lipid biology and CAD.
Schmale, H; Ivell, R; Breindl, M; Darmer, D; Richter, D
1984-01-01
The vasopressin gene from normal and diabetes insipidus (Brattleboro) rats has been isolated and sequenced. Except for a single deletion of a G residue in the region coding for the neurophysin carrier protein, the approximately 2300 nucleotides of both genes are identical. Blot analysis of hypothalamic RNA as well as transfection and microinjection experiments indicate that the mutant gene is correctly transcribed and spliced; however, the resulting mRNA is not efficiently translated. PMID:6526016
Lee, Ciaran M; Zhu, Haibao; Davis, Timothy H; Deshmukh, Harshahardhan; Bao, Gang
2017-01-01
The CRISPR/Cas9 system is a powerful tool for precision genome editing. The ability to accurately modify genomic DNA in situ with single nucleotide precision opens up new possibilities for not only basic research but also biotechnology applications and clinical translation. In this chapter, we outline the procedures for design, screening, and validation of CRISPR/Cas9 systems for targeted modification of coding sequences in the human genome and how to perform genome editing in induced pluripotent stem cells with high efficiency and specificity.
Genetic Variation Linked to Lung Cancer Survival in White Smokers | Center for Cancer Research
CCR investigators have discovered evidence that links lung cancer survival with genetic variations (called single nucleotide polymorphisms) in the MBL2 gene, a key player in innate immunity. The variations in the gene, which codes for a protein called the mannose-binding lectin, occur in its promoter region, where the RNA polymerase molecule binds to start transcription, and in the first exon that is responsible for the correct structure of MBL. The findings appear in the September 19, 2007, issue of the Journal of the National Cancer Institute.
The primary structure of the Saccharomyces cerevisiae gene for 3-phosphoglycerate kinase.
Hitzeman, R A; Hagie, F E; Hayflick, J S; Chen, C Y; Seeburg, P H; Derynck, R
1982-01-01
The DNA sequence of the gene for the yeast glycolytic enzyme, 3-phosphoglycerate kinase (PGK), has been obtained by sequencing part of a 3.1 kbp HindIII fragment obtained from the yeast genome. The structural gene sequence corresponds to a reading frame of 1251 bp coding for 416 amino acids with no intervening DNA sequences. The amino acid sequence is approximately 65 percent homologous with human and horse PGK protein sequences and is in general agreement with the published protein sequence for yeast PGK. As for other highly expressed structural genes in yeast, the coding sequence is highly codon biased with 95 percent of the amino acids coded for by a select 25 codons (out of 61 possible). Besides structural DNA sequence, 291 bp of 5'-flanking sequence and 286 bp of 3'-flanking sequence were determined. Transcription starts 36 nucleotides upstream from the translational start and stops 86-93 nucleotides downstream from the translational stop. These results suggest a non-polyadenylated mRNA length of 1373 to 1380 nucleotides, which is consistent with the observed length of 1500 nucleotides for polyadenylated PGK mRNA. A sequence TATATATAAA is found at 145 nucleotides upstream from the translational start. This sequence resembles the TATAAA box that is possibly associated with RNA polymerase II binding. PMID:6296791
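The mRNA length given in this abstract follows from simple addition of the untranslated flanks to the reading frame. A short sketch of that arithmetic, using only the numbers in the abstract; the constant names are illustrative, and the stop codon is assumed to be counted within the 1251 bp frame:

```python
CDS_BP = 1251        # reading frame length quoted in the abstract
UTR5_NT = 36         # transcription start to translational start
UTR3_NT = (86, 93)   # translational stop to transcription stop (reported range)

# Assumption: 1251 bp = 416 sense codons + 1 stop codon.
codons = CDS_BP // 3
amino_acids = codons - 1

# Non-polyadenylated mRNA length for the shortest and longest 3' end.
mrna_range = tuple(UTR5_NT + CDS_BP + tail for tail in UTR3_NT)

print(amino_acids, mrna_range)
```

The result matches the 416 residues and the 1373 to 1380 nucleotide range stated above; the gap to the observed 1500 nucleotides would then correspond to the poly(A) tail.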
Proteogenomic Investigation of Strain Variation in Clinical Mycobacterium tuberculosis Isolates.
Heunis, Tiaan; Dippenaar, Anzaan; Warren, Robin M; van Helden, Paul D; van der Merwe, Ruben G; Gey van Pittius, Nicolaas C; Pain, Arnab; Sampson, Samantha L; Tabb, David L
2017-10-06
Mycobacterium tuberculosis consists of a large number of different strains that display unique virulence characteristics. Whole-genome sequencing has revealed substantial genetic diversity among clinical M. tuberculosis isolates, and elucidating the phenotypic variation encoded by this genetic diversity will be of the utmost importance to fully understand M. tuberculosis biology and pathogenicity. In this study, we integrated whole-genome sequencing and mass spectrometry (GeLC-MS/MS) to reveal strain-specific characteristics in the proteomes of two clinical M. tuberculosis Latin American-Mediterranean isolates. Using this approach, we identified 59 peptides containing single amino acid variants, which covered ∼9% of all coding nonsynonymous single nucleotide variants detected by whole-genome sequencing. Furthermore, we identified 29 distinct peptides that mapped to a hypothetical protein not present in the M. tuberculosis H37Rv reference proteome. Here, we provide evidence for the expression of this protein in the clinical M. tuberculosis SAWC3651 isolate. The strain-specific databases enabled confirmation of genomic differences (i.e., large genomic regions of difference and nonsynonymous single nucleotide variants) in these two clinical M. tuberculosis isolates and allowed strain differentiation at the proteome level. Our results contribute to the growing field of clinical microbial proteogenomics and can improve our understanding of phenotypic variation in clinical M. tuberculosis isolates.
Phomopsis longicolla RNA virus 1 - Novel virus at the edge of myco- and plant viruses.
Hrabáková, Lenka; Koloniuk, Igor; Petrzik, Karel
2017-06-01
The complete nucleotide sequence of a new RNA mycovirus in the KY isolate of Phomopsis longicolla Hobbs 1985 and its protoplast subcultures p5, p9, and ME711 was determined. The virus, provisionally named Phomopsis longicolla RNA virus 1 (PlRV1), was localized in mitochondria and was determined to have a genome 2822 nucleotides long. A single open reading frame could be translated in silico by both standard and mitochondrial genetic codes into a product featuring conserved domains for an RNA-dependent RNA polymerase (RdRp). The RdRp of PlRV1 has no counterpart among mycoviruses, but it is about 30% identical with the RdRp of plant ourmiaviruses. Recently, new mycoviruses related to plant ourmiaviruses and forming one clade with PlRV1 have been discovered. This separate clade could represent the crucial link between plant and fungal viruses. Copyright © 2017 Elsevier Inc. All rights reserved.
Dey, Avishek; Samanta, Milan Kumar; Gayen, Srimonta; Sen, Soumitra K.; Maiti, Mrinal K.
2016-01-01
Drought is one of the major limiting factors for productivity of crops including rice (Oryza sativa L.). Understanding the role of allelic variations of key regulatory genes involved in stress-tolerance is essential for developing an effective strategy to combat drought. The bZIP transcription factors play a crucial role in abiotic-stress adaptation in plants via the abscisic acid (ABA) signaling pathway. The present study aimed to search for allelic polymorphism in the OsbZIP23 gene across selected drought-tolerant and drought-sensitive rice genotypes, and to characterize the new allele through overexpression (OE) and gene-silencing (RNAi). Analyses of the coding DNA sequence (CDS) of the cloned OsbZIP23 gene revealed single nucleotide polymorphism at four places and a 15-nucleotide deletion at one place. The single-copy OsbZIP23 gene is expressed at a relatively higher level in leaf tissues of drought-tolerant genotypes, and its abundance is greater at the reproductive stage. Cloning and sequence analyses of the OsbZIP23-promoter from drought-tolerant O. rufipogon and drought-sensitive IR20 cultivar showed variation in the number of stress-responsive cis-elements and a 35-nucleotide deletion at 5’-UTR in IR20. Analysis of the GFP reporter gene function revealed that the promoter activity of O. rufipogon is comparatively higher than that of IR20. The overexpression of either of the two polymorphic forms (1083 bp and 1068 bp CDS) of OsbZIP23 improved drought tolerance and yield-related traits significantly by retaining higher content of cellular water, soluble sugar and proline, and exhibited a decrease in membrane lipid peroxidation in comparison to RNAi lines and non-transgenic plants. The OE lines showed higher expression of target genes (OsRab16B, OsRab21 and OsLEA3-1) and increased ABA sensitivity, indicating that OsbZIP23 is a positive transcriptional regulator of the ABA-signaling pathway. Taken together, the present study concludes that enhanced gene expression, rather than natural polymorphism in the coding sequence of OsbZIP23, is accountable for improved drought tolerance and yield performance in rice genotypes. PMID:26959651
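The two CDS lengths quoted in this abstract differ by exactly the reported 15-nucleotide deletion, and because 15 is a multiple of three the shorter form stays in frame. A minimal sketch of that check follows. It assumes each CDS length includes one stop codon, which the abstract does not state, and the names are illustrative.

```python
CDS_LENGTHS = {"long": 1083, "short": 1068}  # bp, from the abstract

def protein_length(cds_bp: int) -> int:
    """Residues encoded, assuming the CDS length includes one stop codon."""
    if cds_bp % 3:
        raise ValueError("CDS length is not a multiple of three")
    return cds_bp // 3 - 1

deletion = CDS_LENGTHS["long"] - CDS_LENGTHS["short"]
in_frame = deletion % 3 == 0   # an in-frame deletion removes whole codons
residues = {name: protein_length(bp) for name, bp in CDS_LENGTHS.items()}

print(deletion, in_frame, residues)
```

Under that assumption the two forms would encode proteins differing by five residues, consistent with both being functional when overexpressed.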
Essentials of Conservation Biotechnology: A mini review
NASA Astrophysics Data System (ADS)
Merlyn Keziah, S.; Subathra Devi, C.
2017-11-01
Equilibrium of biodiversity is essential for the maintenance of the ecosystem, as they are interdependent. The decline in biodiversity is a global problem and an inevitable threat to mankind. Major threats include unsustainable exploitation, habitat destruction, fragmentation, transformation, genetic pollution, invasive exotic species and degradation. This review covers the management strategies of biotechnology, which include in situ and ex situ conservation, computerized taxonomic analysis through construction of phylogenetic trees, calculating genetic distance, prioritizing the group for conservation, digital preservation of biodiversity within the coding and decoding keys, and molecular approaches to assess biodiversity such as polymerase chain reaction, real-time PCR, randomly amplified polymorphic DNA, restriction fragment length polymorphism, amplified fragment length polymorphism, simple sequence repeats, DNA fingerprinting, single nucleotide polymorphism, cryopreservation and vitrification.
Sudden infant death syndrome (SIDS) and polymorphisms in Monoamine oxidase A gene (MAOA): a revisit.
Groß, Maximilian; Bajanowski, Thomas; Vennemann, Mechtild; Poetsch, Micaela
2014-01-01
Literature describes multiple possible links between genetic variations in the neuroadrenergic system and the occurrence of sudden infant death syndrome. The X-chromosomal Monoamine oxidase A (MAOA) is one of the genes with regulatory activity in the noradrenergic and serotonergic neuronal systems and a polymorphism of the promoter which affects the activity of this gene has been proclaimed to contribute significantly to the prevalence of sudden infant death syndrome (SIDS) in three studies from 2009, 2012 and 2013. However, these studies described different significant correlations regarding gender or age of children. Since several studies, suggesting associations between genetic variations and SIDS, were disproved by follow-up analysis, this study was conducted to take a closer look at the MAOA gene and its polymorphisms. The functional MAOA promoter length polymorphism was investigated in 261 SIDS cases and 93 control subjects. Moreover, the allele distribution of 12 coding and non-coding single nucleotide polymorphisms (SNPs) of the MAOA gene was examined in 285 SIDS cases and 93 controls by a minisequencing technique. In contrast to prior studies with fewer individuals, no significant correlations between the occurrence of SIDS and the frequency of allele variants of the promoter polymorphism could be demonstrated, even including the results from the abovementioned previous studies. Regarding the SNPs, three statistically significant associations were observed which had not been described before. This study clearly disproves interactions between MAOA promoter polymorphisms and SIDS, even if variations in single nucleotide polymorphisms of MAOA should be subjected to further analysis to clarify their impact on SIDS.
Implication of common and disease specific variants in CLU, CR1, and PICALM.
Ferrari, Raffaele; Moreno, Jorge H; Minhajuddin, Abu T; O'Bryant, Sid E; Reisch, Joan S; Barber, Robert C; Momeni, Parastoo
2012-08-01
Two recent genome-wide association studies (GWAS) for late onset Alzheimer's disease (LOAD) revealed 3 new genes: clusterin (CLU), phosphatidylinositol binding clathrin assembly protein (PICALM), and complement receptor 1 (CR1). In order to evaluate association with these GWAS-identified genes and to isolate the variants contributing to the pathogenesis of LOAD, we genotyped the top single nucleotide polymorphisms (SNPs), rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), and sequenced the entire coding regions of these genes in our cohort of 342 LOAD patients and 277 control subjects. We confirmed the association of rs3851179 (PICALM) (p = 7.4 × 10⁻³) with the disease status. Through sequencing we identified 18 variants in CLU, 3 of which were found exclusively in patients; 8 variants (out of 65) in the CR1 gene were found only in patients, and the 16 variants identified in the PICALM gene were present in both patients and controls. In silico analysis of the variants in PICALM did not predict any damaging effect on the protein. The haplotype analysis of the variants in each gene predicted a common haplotype when the 3 single nucleotide polymorphisms rs11136000 (CLU), rs3818361 (CR1), and rs3851179 (PICALM), respectively, were included. For each gene the haplotype structure and size differed between patients and controls. In conclusion, we confirmed association of CLU, CR1, and PICALM genes with the disease status in our cohort through identification of a number of disease-specific variants among patients through the sequencing of the coding region of these genes. Published by Elsevier Inc.
Pokorska, J; Dusza, M; Kułaj, D; Żukowski, K; Makulska, J
2016-04-28
The aim of this study was to identify the association between single nucleotide polymorphisms (SNPs) in the bovine chemokine receptor (CXCR1) gene and the resistance or susceptibility of cows to mastitis. The analysis of the CXCR1 polymorphism was carried out using polymerase chain reaction restriction fragment length polymorphism analysis for six SNP mutations (c.+291C>T, c.+365T>C, c.+816C>A, c.+819G>A, +1093C>T, and +1373C>A), of which four were located within the coding region and two in the 3'UTR region of the CXCR1 gene. Genetic material from 146 Polish Holstein-Friesian cows was analyzed after dividing into two groups depending on the incidence of clinical mastitis. Identified polymorphisms were in linkage disequilibrium and formed two linkage groups. Three haplotypes (CCCATA, TTAGCC, CTCGCC), forming six haplotype combinations, were detected. The logistic regression showed a significant association between the CC genotype at c.+365T>C and susceptibility of cows to clinical mastitis (P = 0.047). The frequency of haplotype combination 1/1 (CCCATA/CCCATA) was not significantly higher in cows susceptible to mastitis (P = 0.062). Of the identified SNP mutations, only c.+365T>C is a nonsynonymous mutation that induces a change in the coded protein [GCC (Ala) to GTC (Val) at the 122nd amino acid]. This amino acid change can result in changes in receptor function, which may be a reason for the increased mastitis incidence observed in cows with polymorphism at this site.
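The link between nucleotide position 365 and the 122nd amino acid in this abstract can be reproduced by mapping a coding-sequence position to its codon. The sketch below assumes the position is numbered from the first base of the start codon; that numbering, and the helper names, are assumptions for illustration rather than details given in the abstract.

```python
# Only the two codons named in the abstract are needed here.
CODON_TO_AA = {"GCC": "Ala", "GTC": "Val"}

def locate(cds_position: int):
    """Map a 1-based coding-sequence position to (codon number, base in codon)."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1

def substitute(codon: str, base_in_codon: int, new_base: str) -> str:
    """Return the codon with the given 1-based base replaced."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]

codon_number, base_in_codon = locate(365)
swapped = substitute("GCC", base_in_codon, "T")

print(codon_number, base_in_codon, swapped, CODON_TO_AA[swapped])
```

Position 365 falls on the second base of codon 122, and the two codons quoted in the abstract differ only at that base, which is why a single substitution there exchanges alanine and valine.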
Schroeder, H; Hoeltken, A M; Fladung, M
2012-03-01
Within the genus Populus several species belonging to different sections are cross-compatible. Hence, high numbers of interspecies hybrids occur naturally and, additionally, have been artificially produced in huge breeding programmes during the last 100 years. Therefore, determination of a single poplar species, used for the production of 'multi-species hybrids' is often difficult, and represents a great challenge for the use of molecular markers in species identification. Within this study, over 20 chloroplast regions, both intergenic spacers and coding regions, have been tested for their ability to differentiate different poplar species using 23 already published barcoding primer combinations and 17 newly designed primer combinations. About half of the published barcoding primers yielded amplification products, whereas the new primers designed on the basis of the total sequenced cpDNA genome of Populus trichocarpa Torr. & Gray yielded much higher amplification success. Intergenic spacers were found to be more variable than coding regions within the genus Populus. The highest discrimination power of Populus species was found in the combination of two intergenic spacers (trnG-psbK, psbK-psbl) and the coding region rpoC. In barcoding projects, the coding regions matK and rbcL are often recommended, but within the genus Populus they only show moderate variability and are not efficient in species discrimination. © 2011 German Botanical Society and The Royal Botanical Society of the Netherlands.
Molee, A.; Kongroi, K.; Kuadsantia, P.; Poompramun, C.; Likitdecharote, B.
2016-01-01
The aim of the present study was to investigate the effect of single nucleotide polymorphisms in the major histocompatibility complex (MHC) class II gene on resistance to Newcastle disease virus and body weight of the Thai indigenous chicken, Leung Hang Khao (Gallus gallus domesticus). Blood samples were collected for single nucleotide polymorphism analysis from 485 chickens. Polymerase chain reaction sequencing was used to classify single nucleotide polymorphisms of class II MHC. Body weights were measured at the ages of 3, 4, 5, and 7 months. Titres of Newcastle disease virus at 2 weeks to 7 months were determined and the correlation between body weight and titre was analysed. The association between single nucleotide polymorphisms and body weight and titre were analysed by a generalized linear model. Seven single nucleotide polymorphisms were identified: C125T, A126T, C209G, C242T, A243T, C244T, and A254T. Significant correlations between log titre and body weight were found at 2 and 4 weeks. Associations between single nucleotide polymorphisms and titre were found for C209G and A254T, and between all single nucleotide polymorphisms (except A243T) and body weight. The results showed that class II MHC is associated with both titre of Newcastle disease virus and body weight in Leung Hang Khao chickens. This is of concern because improved growth traits are the main goal of breeding selection. Moreover, the results suggested that MHC has a pleiotropic effect on the titre and growth performance. This mechanism should be investigated in a future study. PMID:26732325
Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly
Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka
2010-01-01
Background Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877
New t-gap insertion-deletion-like metrics for DNA hybridization thermodynamic modeling.
D'yachkov, Arkadii G; Macula, Anthony J; Pogozelski, Wendy K; Renz, Thomas E; Rykov, Vyacheslav V; Torney, David C
2006-05-01
We discuss the concept of t-gap block isomorphic subsequences and use it to describe new abstract string metrics that are similar to the Levenshtein insertion-deletion metric. Some of the metrics that we define can be used to model a thermodynamic distance function on single-stranded DNA sequences. Our model captures a key aspect of the nearest neighbor thermodynamic model for hybridized DNA duplexes. One version of our metric gives the maximum number of stacked pairs of hydrogen-bonded nucleotide base pairs that can be present in any secondary structure in a hybridized DNA duplex without pseudoknots. Thermodynamic distance functions are important components in the construction of DNA codes, and DNA codes are important components in biomolecular computing, nanotechnology, and other biotechnical applications that employ DNA hybridization assays. We show how our new distances can be calculated by using a dynamic programming method, and we derive a Varshamov-Gilbert-like lower bound on the size of some codes using these distance functions as constraints. We also discuss software implementation of our DNA code design methods.
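The classical insertion-deletion metric that the abstract uses as its point of comparison can itself be computed by dynamic programming. The sketch below illustrates only that baseline, not the t-gap metrics or the thermodynamic distance of the paper, and the function names are hypothetical. It relies on the identity that the insertion-deletion distance between two strings equals the sum of their lengths minus twice the length of their longest common subsequence (LCS).

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)          # DP row for the previous character of a
    for ch_a in a:
        curr = [0]                     # column 0: empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions and deletions turning a into b."""
    return len(a) + len(b) - 2 * lcs_length(a, b)


if __name__ == "__main__":
    print(indel_distance("ACGTAC", "AGTC"))
```

Substitutions are not allowed in this metric, so replacing one base costs two operations (one deletion plus one insertion).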
The complete mitochondrial genome of Rapana venosa (Gastropoda, Muricidae).
Sun, Xiujun; Yang, Aiguo
2016-01-01
The complete mitochondrial (mt) genome of the veined rapa whelk, Rapana venosa, was determined using genome walking techniques in this study. The total length of the mt genome sequence of R. venosa was 15,271 bp, which is comparable to the reported Muricidae mitogenomes to date. It contained 13 protein-coding genes, 21 transfer RNA genes, and two ribosomal RNA genes. A bias towards a higher representation of nucleotides A and T (69%) was detected in the mt genome of R. venosa. A small number of non-coding nucleotides (302 bp) was detected, and the largest non-coding region was 74 bp in length.
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus.
Gustafson, G; Armour, S L
1986-01-01
The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs. PMID:3754962
Organization and transient expression of the gene for human U11 snRNA
Clemens, Suter-Crazzolara; Walter, Keller
1991-01-01
The nucleotide sequence of U11 small nuclear RNA, a minor U RNA from HeLa cells, was determined. Computer analysis of the sequence (135 residues) predicts two strong hairpin loops which are separated by seventeen nucleotides containing an Sm binding site (AAUUUUUUGG). A synthetic gene was constructed in which the coding region of U11 RNA is under the control of a T7 promoter. This vector can be used to produce U11 RNA in vitro. Southern hybridization and PCR analysis of HeLa genomic DNA suggest that U11 RNA is encoded by a single copy gene, and that at least three genomic regions could be U11 RNA pseudogenes. A HeLa genomic copy of a U11 gene was isolated by inverted PCR. This gene contains the U11 RNA coding sequence and several sequence elements unique for the U RNA genes. These include a Distal Sequence Element (DSE, ATTTGCATA) present between positions −215 and −223 relative to the start of transcription; a Proximal Sequence Element (PSE, TTCACCTTTACCAAAAATG) located between positions −43 and −63 ; and a 3′box (GTTAGGCGAAATATTA) between positions +150 and +166. Transfection of HeLa cells with this gene revealed that it is functioning in vivo and can produce U11 RNA. PMID:1820214
Kamada, Anselmo J; Bianco, Anna M; Zupin, Luisa; Girardelli, Martina; Matte, Maria C C; Medeiros, Rúbia Marília de; Almeida, Sabrina Esteves de Matos; Rocha, Marineide M; Segat, Ludovica; Chies, José A B; Kuhn, Louise; Crovella, Sergio
2016-07-01
Bone marrow stromal cell antigen-2 (BST-2)/Tetherin is a restriction factor that prevents Human immunodeficiency virus type 1 (HIV-1) release from infected cells and mediates pro-inflammatory cytokine production. This study investigated the risk conferred by single nucleotide polymorphisms (rs919266, rs9192677, and rs9576) at BST-2 coding gene (BST2) in HIV-1 mother-to-child transmission and in disease progression. Initially, 101 HIV-1+ pregnant women and 331 neonates exposed to HIV-1 from Zambia were enrolled. Additional BST2 single nucleotide polymorphism analyses were performed in 2 cohorts with acquired immunodeficiency syndrome (AIDS) progression: an adult Brazilian cohort (37 rapid, 30 chronic and 21 long-term non-progressors) and an Italian pediatric cohort (21 rapid and 67 slow progressors). The rs9576A allele was nominally associated with protection during breastfeeding (P = 0.019) and individuals carrying rs919266 GA showed slower progression to AIDS (P = 0.033). Despite the influence of rs919266 and rs9576 on BST2 expression being still undetermined, a preventive role by BST2 polymorphisms was found during HIV-1 infection.
Functional analysis of regulatory single-nucleotide polymorphisms.
Pampín, Sandra; Rodríguez-Rey, José C
2007-04-01
The identification of regulatory polymorphisms has become a key problem in human genetics. In the past few years there has been a conceptual change in the way in which regulatory single-nucleotide polymorphisms are studied. We review the new approaches and discuss how gene expression studies can contribute to a better knowledge of the genetics of common diseases. New techniques for the association of single-nucleotide polymorphisms with changes in gene expression have been recently developed. This, together with a more comprehensive use of the old in-vitro methods, has produced a great amount of genetic information. When added to current databases, it will help to design better tools for the detection of regulatory single-nucleotide polymorphisms. The identification of functional regulatory single-nucleotide polymorphisms cannot be done by the simple inspection of DNA sequence. In-vivo techniques, based on primer-extension, and the more recently developed 'haploChIP' allow the association of gene variants to changes in gene expression. Gene expression analysis by conventional in-vitro techniques is the only way to identify the functional consequences of regulatory single-nucleotide polymorphisms. The amount of information produced in the last few years will help to refine the tools for the future analysis of regulatory gene variants.
Joseph, S; Schmidt, L M; Danquah, W B; Timper, P; Mekete, T
2017-02-01
To generate single spore lines from a population of Pasteuria penetrans, a bacterial parasite of root-knot nematode (RKN), isolated from Florida, and to examine the genotypic variation and virulence characteristics that exist within the population. Six single spore lines (SSP), 16SSP, 17SSP, 18SSP, 25SSP, 26SSP and 30SSP were generated. Genetic variability was evaluated by comparing single-nucleotide polymorphisms (SNPs) in six protein-coding genes and the 16S rRNA gene. An average of one SNP was observed for every 69 bp in the 16S rRNA, whereas no SNPs were observed in the protein-coding sequences. Hierarchical cluster analysis of 16S rRNA sequences placed the clones into three distinct clades. Bio-efficacy analysis revealed significant heterogeneity in the level of virulence and host specificity between the individual clones. The SNP markers developed for the 5' hypervariable region of the 16S rRNA gene may be useful in biotype differentiation within a population of P. penetrans. This study demonstrates an efficient method for generating single spore lines of P. penetrans and gives insight into the genetic heterogeneity and varying levels of virulence that exist within a population parasitizing a specific Meloidogyne sp. host. The results also suggest that the application of generalist spore lines in nematode management may achieve broad RKN control. © 2016 The Society for Applied Microbiology.
The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101
NASA Astrophysics Data System (ADS)
Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.
2014-08-01
Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.
Global variation in CYP2C8–CYP2C9 functional haplotypes
Speed, William C; Kang, Soonmo Peter; Tuck, David P; Harris, Lyndsay N; Kidd, Kenneth K
2009-01-01
We have studied the global frequency distributions of 10 single nucleotide polymorphisms (SNPs) across 132 kb of CYP2C8 and CYP2C9 in ∼2500 individuals representing 45 populations. Five of the SNPs were in noncoding sequences; the other five involved the more common missense variants (four in CYP2C8, one in CYP2C9) that change amino acids in the gene products. One haplotype containing two CYP2C8 coding variants and one CYP2C9 coding variant reaches an average frequency of 10% in Europe; a set of haplotypes with a different CYP2C8 coding variant reaches 17% in Africa. In both cases these haplotypes are found in other regions of the world at <1%. This considerable geographic variation in haplotype frequencies impacts the interpretation of CYP2C8/CYP2C9 association studies, and has pharmacogenomic implications for drug interactions. PMID:19381162
Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.
Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K
1991-09-15
We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.
Origin of the polymorphism of the involucrin gene in Asians.
Djian, P; Delhomme, B; Green, H
1995-01-01
The involucrin gene, encoding a protein of the terminally differentiated keratinocyte, is polymorphic in the human. There is polymorphism of marker nucleotides at two positions in the coding region, and there are over eight polymorphic forms based on the number and kind of 10-codon tandem repeats in that part of the coding region most recently added in the human lineage. The involucrin alleles of Caucasians and Africans differ in both nucleotides and repeat patterns. We show that the involucrin alleles of East Asians (Chinese and Japanese) can be divided into two populations according to whether they possess the two marker nucleotides typical of Africans or Caucasians. The Asian population bearing Caucasian-type marker nucleotides has repeat patterns similar to those of Caucasians, whereas Asians bearing African-type marker nucleotides have repeat patterns that resemble those of Africans more than those of Caucasians. The existence of two populations of East Asian involucrin alleles gives support for the existence of a Eurasian stem lineage from which Caucasians and a part of the Asian population originated. PMID:7762559
Using hidden Markov models and observed evolution to annotate viral genomes.
McCauley, Stephen; Hein, Jotun
2006-06-01
ssRNA (single stranded) viral genomes are generally constrained in length and utilize overlapping reading frames to maximally exploit the coding potential within the genome length restrictions. This overlapping coding phenomenon leads to complex evolutionary constraints operating on the genome. In regions which code for more than one protein, silent mutations in one reading frame generally have a protein coding effect in another. To maximize coding flexibility in all reading frames, overlapping regions are often compositionally biased towards amino acids which are 6-fold degenerate with respect to the 64 codon alphabet. Previous methodologies have used this fact in an ad hoc manner to look for overlapping genes by motif matching. In this paper differentiated nucleotide compositional patterns in overlapping regions are incorporated into a probabilistic hidden Markov model (HMM) framework which is used to annotate ssRNA viral genomes. This work focuses on single sequence annotation and applies an HMM framework to ssRNA viral annotation. A description of how the HMM is parameterized, whilst annotating within a missing data framework, is given. A Phylogenetic HMM (Phylo-HMM) extension, as applied to 14 aligned HIV2 sequences, is also presented. This evolutionary extension serves as an illustration of the potential of the Phylo-HMM framework for ssRNA viral genomic annotation. The single sequence annotation procedure (SSA) is applied to 14 different strains of the HIV2 virus. Further results on alternative ssRNA viral genomes are presented to illustrate more generally the performance of the method. The results of the SSA method are encouraging; however, there is still room for improvement. Since there is overwhelming evidence to indicate that comparative methods can improve coding sequence (CDS) annotation, the SSA method is extended to a Phylo-HMM to incorporate evolutionary information. 
The Phylo-HMM extension is applied to the same set of 14 HIV2 sequences which are pre-aligned. The performance improvement that results from including the evolutionary information in the analysis is illustrated.
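The compositional signal mentioned in this abstract, a bias towards amino acids that are 6-fold degenerate in the 64-codon alphabet, can be made concrete with a few lines of code. The sketch below is not the HMM of the paper; it only derives the 6-fold degenerate amino acids from the standard genetic code and measures how often they occur in a chosen reading frame. It assumes an uppercase DNA-alphabet sequence (T rather than U), and the name `six_fold_fraction` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Amino acids encoded by exactly six codons.
codon_counts = Counter(CODON_TABLE.values())
SIX_FOLD = {aa for aa, n in codon_counts.items() if n == 6}


def six_fold_fraction(seq: str, frame: int = 0) -> float:
    """Fraction of complete codons in the given frame that encode a 6-fold degenerate amino acid."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(CODON_TABLE[c] in SIX_FOLD for c in codons) / len(codons)


if __name__ == "__main__":
    print(sorted(SIX_FOLD))
    print(six_fold_fraction("CTGAGCCGT", frame=0))
```

Comparing this fraction across the three frames of a region is one crude way to see the kind of compositional difference that a probabilistic model would exploit more systematically.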
Khan, Waqasuddin; Saripella, Ganapathi Varma-; Ludwig, Thomas; Cuppens, Tania; Thibord, Florian; Génin, Emmanuelle; Deleuze, Jean-Francois; Trégouët, David-Alexandre
2018-05-03
Predicted deleteriousness of coding variants is a frequently used criterion to filter out variants detected in next-generation sequencing projects and to select candidates impacting on the risk of human diseases. Most available dedicated tools implement a base-to-base annotation approach that could be biased in the presence of several variants in the same genetic codon. Here we propose the MACARON program that, from a standard VCF file, identifies, re-annotates and predicts the amino acid change resulting from multiple single nucleotide variants (SNVs) within the same genetic codon. Applied to the whole exome dataset of 573 individuals, MACARON identifies 114 situations where multiple SNVs within a genetic codon induce an amino acid change that is different from those predicted by standard single-SNV annotation tools. Such events are not uncommon and deserve to be studied in sequencing projects with inconclusive findings. MACARON is written in Python, with code available on the GENMED website (www.genmed.fr). Contact: david-alexandre.tregouet@inserm.fr. Supplementary data are available at Bioinformatics online.
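The bias described here is easy to reproduce on a single codon. The following sketch is not MACARON's code and does not read VCF files; it is a minimal illustration, with hypothetical function names, of how annotating each SNV separately can give a different amino acid from annotating the SNVs jointly. It assumes the variants lie on the same haplotype and that codon positions are given as 0, 1 or 2.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(codon: str, snvs: dict) -> str:
    """Return the codon after substituting alt bases at the given positions (0-2)."""
    bases = list(codon)
    for pos, alt in snvs.items():
        bases[pos] = alt
    return "".join(bases)


def annotate(ref_codon: str, snvs: dict):
    """Return (reference aa, aa predicted for each SNV alone, aa for all SNVs together)."""
    ref_aa = CODON_TABLE[ref_codon]
    single = {pos: CODON_TABLE[apply_snvs(ref_codon, {pos: alt})]
              for pos, alt in snvs.items()}
    combined = CODON_TABLE[apply_snvs(ref_codon, snvs)]
    return ref_aa, single, combined


if __name__ == "__main__":
    # Reference codon AAA (Lys) carrying A>G at position 0 and A>C at position 1.
    print(annotate("AAA", {0: "G", 1: "C"}))
```

In this example the single-SNV annotations predict Lys to Glu and Lys to Thr, whereas the joint codon GCA encodes Ala, which neither single annotation reports.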
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.
Zhang, Chun-Ting; Wang, Ju; Zhang, Ren
2002-02-01
The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is at most 5579, about 3.8-7.0% fewer than the currently widely accepted 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
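The idea of classifying an ORF by the Euclidean distance of its position-specific base frequencies can be sketched as follows. This is an illustration under stated assumptions, not the published algorithm: it uses all 12 frequencies (4 bases at 3 codon positions), a nearest-centroid decision rule, and invented toy training sequences, and every function name is hypothetical. The paper's exact feature set, training data and threshold are not given in the abstract.

```python
import math

BASE_INDEX = {"A": 0, "C": 1, "G": 2, "T": 3}


def position_frequencies(orf: str) -> list:
    """12-dimensional vector: frequency of A, C, G, T at codon positions 1, 2 and 3."""
    n_codons = len(orf) // 3
    counts = [[0] * 4 for _ in range(3)]
    for i in range(n_codons * 3):              # ignore a trailing partial codon
        counts[i % 3][BASE_INDEX[orf[i]]] += 1
    return [c / n_codons for row in counts for c in row]


def centroid(vectors: list) -> list:
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def classify(orf: str, coding_centroid: list, noncoding_centroid: list) -> str:
    """Assign the ORF to whichever class centroid is closer in Euclidean distance."""
    v = position_frequencies(orf)
    d_coding = math.dist(v, coding_centroid)
    d_noncoding = math.dist(v, noncoding_centroid)
    return "coding" if d_coding < d_noncoding else "non-coding"


if __name__ == "__main__":
    # Toy training sets (invented for illustration only).
    coding = centroid([position_frequencies(s) for s in ["GCAGCAGCA", "GATGATGAT"]])
    noncoding = centroid([position_frequencies(s) for s in ["ACGTACGTACGT", "TGCATGCATGCA"]])
    print(classify("GCTGATGCA", coding, noncoding))
```

The toy non-coding sequences have uniform base frequencies at every position, mirroring the NNN pattern the abstract attributes to intergenic 'codons', while the toy coding sequences have strongly position-dependent composition.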
DNA as a Binary Code: How the Physical Structure of Nucleotide Bases Carries Information
ERIC Educational Resources Information Center
McCallister, Gary
2005-01-01
The DNA triplet code also functions as a binary code. Because double-ring compounds cannot bind to double-ring compounds in the DNA code, the sequence of bases classified simply as purines or pyrimidines can encode for smaller groups of possible amino acids. This is an intuitive approach to teaching the DNA code. (Contains 6 figures.)
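The claim that a purine/pyrimidine reading of the code narrows the set of possible amino acids can be checked directly against the standard genetic code. The sketch below is an illustration and not material from the article; the choice of 1 for purines (A, G) and 0 for pyrimidines (C, T), and the function name `to_binary`, are assumptions made here.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}  # double-ring bases; C and T are single-ring pyrimidines


def to_binary(seq: str) -> str:
    """Encode a DNA sequence as 1 (purine) / 0 (pyrimidine)."""
    return "".join("1" if base in PURINES else "0" for base in seq)


# Group the amino acids (and stop signals) reachable from each binary codon pattern.
groups = defaultdict(set)
for codon, aa in CODON_TABLE.items():
    groups[to_binary(codon)].add(aa)

if __name__ == "__main__":
    for pattern in sorted(groups):
        print(pattern, sorted(groups[pattern]))
```

Each of the eight binary patterns covers exactly eight codons, so knowing only the purine/pyrimidine pattern of a codon already restricts it to a small subset of amino acids.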
MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments
Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin
2017-01-01
Motivation: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. Results: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. Availability and implementation: MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27605100
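The four negative-control operations named for the Transmutation tool (scrambling, reversing, complementing and introducing random mutations) are simple string transformations. The sketch below shows one plausible way to write them; it is not MPRAnator's source code or its REST API, and the function names, the `seed` parameter and the choice to mutate distinct positions are assumptions made for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def scramble(seq: str, seed: int = 0) -> str:
    """Shuffle the bases, preserving the overall base composition."""
    chars = list(seq)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def reverse(seq: str) -> str:
    """Reverse the sequence without complementing it."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Complement each base without reversing the sequence."""
    return seq.translate(COMPLEMENT)


def mutate(seq: str, k: int, seed: int = 0) -> str:
    """Substitute k distinct, randomly chosen positions with a different base."""
    rng = random.Random(seed)
    chars = list(seq)
    for pos in rng.sample(range(len(seq)), k):
        chars[pos] = rng.choice([b for b in "ACGT" if b != chars[pos]])
    return "".join(chars)


if __name__ == "__main__":
    motif = "TGACGTCA"
    for control in (scramble(motif, seed=1), reverse(motif),
                    complement(motif), mutate(motif, k=2, seed=1)):
        print(control)
```

Fixing the seed makes the scrambled and mutated controls reproducible, which matters when the same library design has to be regenerated later.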
Chen, Zhongxue; Ng, Hon Keung Tony; Li, Jing; Liu, Qingzhong; Huang, Hanwen
2017-04-01
In the past decade, hundreds of genome-wide association studies have been conducted to detect the significant single-nucleotide polymorphisms that are associated with certain diseases. However, most of the data from the X chromosome were not analyzed and only a few significant associated single-nucleotide polymorphisms from the X chromosome have been identified from genome-wide association studies. This is mainly due to the lack of powerful statistical tests. In this paper, we propose a novel statistical approach that combines the information of single-nucleotide polymorphisms on the X chromosome from both males and females in an efficient way. The proposed approach avoids the need to make strong assumptions about the underlying genetic models. Our proposed statistical test is a robust method that only makes the assumption that the risk allele is the same for both females and males if the single-nucleotide polymorphism is associated with the disease for both genders. Through a simulation study and a real data application, we show that the proposed procedure is robust and has excellent performance compared to existing methods. We expect that many more associated single-nucleotide polymorphisms on the X chromosome will be identified if the proposed approach is applied to currently available genome-wide association study data.
USDA-ARS's Scientific Manuscript database
Principal component analysis (PCA) with 36,621 polymorphic genome-anchored single nucleotide polymorphisms (SNPs) identified collectively for Capsicum annuum and Capsicum baccatum was used to show the distribution of these 2 important incompatible cultivated pepper species. Estimated mean nucleotide...
Mechanisms of radiation-induced gene responses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woloschak, G.E.; Paunesku, T.
1996-10-01
In the process of identifying genes differentially expressed in cells exposed to ultraviolet radiation, we have identified a transcript having a 26-bp region that is highly conserved in a variety of species including Bacillus circulans, yeast, pumpkin, Drosophila, mouse, and man. When in the 5' region (flanking region or UTR) of a gene, the sequence is predominantly in +/+ orientation with respect to the coding DNA strand, while in the coding region and the 3' region (UTR), the sequence is most frequently in the +/- orientation with respect to the coding DNA strand. In two genes, the element is split into two parts; however, in most cases, it is found only once but with a minimum of 11 consecutive nucleotides precisely depicting the original sequence. The element is found in a large number of different genes with diverse functions (from human ras p21 to B. circulans chitinase). Gel shift assays demonstrated the presence of a protein in HeLa cell extracts that binds to the sense and antisense single-stranded consensus oligomers, as well as to the double-stranded oligonucleotide. When double-stranded oligomer was used, the size shift demonstrated an additional protein-oligomer complex larger than the one bound to either sense or antisense single-stranded consensus oligomers alone. It is speculated either that this element binds to protein(s) important in maintaining DNA in a single-stranded orientation for transcription or, alternatively, that this element is important in the transcription-coupled DNA repair process.
Choudhry, Shweta; Baskin, Laurence S; Lammer, Edward J; Witte, John S; Dasgupta, Sudeshna; Ma, Chen; Surampalli, Abhilasha; Shen, Joel; Shaw, Gary M; Carmichael, Suzan L
2015-05-01
Estrogenic endocrine disruptors acting via estrogen receptors α (ESR1) and β (ESR2) have been implicated in the etiology of hypospadias, a common congenital malformation of the male external genitalia. We determined the association of single nucleotide polymorphisms in ESR1 and ESR2 genes with hypospadias in a racially/ethnically diverse study population of California births. We investigated the relationship between hypospadias and 108 ESR1 and 36 ESR2 single nucleotide polymorphisms in 647 cases and 877 population based nonmalformed controls among infants born in selected California counties from 1990 to 2003. Subgroup analyses were performed by race/ethnicity (nonHispanic white and Hispanic subjects) and by hypospadias severity (mild to moderate and severe). Odds ratios for 33 of the 108 ESR1 single nucleotide polymorphisms had p values less than 0.05 (p = 0.05 to 0.007) for risk of hypospadias. However, none of the 36 ESR2 single nucleotide polymorphisms was significantly associated. In stratified analyses the association results were consistent by disease severity but different sets of single nucleotide polymorphisms were significantly associated with hypospadias in nonHispanic white and Hispanic subjects. Due to high linkage disequilibrium across the single nucleotide polymorphisms, haplotype analyses were conducted and identified 6 haplotype blocks in ESR1 gene that had haplotypes significantly associated with an increased risk of hypospadias (OR 1.3 to 1.8, p = 0.04 to 0.00001). Similar to single nucleotide polymorphism analysis, different ESR1 haplotypes were associated with risk of hypospadias in nonHispanic white and Hispanic subjects. No significant haplotype association was observed for ESR2. The data provide evidence that ESR1 single nucleotide polymorphisms and haplotypes influence the risk of hypospadias in white and Hispanic subjects, and warrant further examination in other study populations. 
Copyright © 2015 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Evolving nucleotide binding surfaces
NASA Technical Reports Server (NTRS)
Kieber-Emmons, T.; Rein, R.
1981-01-01
An analysis is presented of the stability and nature of binding of a nucleotide to several known dehydrogenases. The employed approach includes calculation of hydrophobic stabilization of the binding motif and its intermolecular interaction with the ligand. The evolutionary changes of the binding motif are studied by calculating the Euclidean deviation of the respective dehydrogenases. Attention is given to the possible structural elements involved in the origin of nucleotide recognition by non-coded primordial polypeptides.
Phylogenetic Network for European mtDNA
Finnilä, Saara; Lehtonen, Mervi S.; Majamaa, Kari
2001-01-01
The sequence in the first hypervariable segment (HVS-I) of the control region has been used as a source of evolutionary information in most phylogenetic analyses of mtDNA. Population genetic inference would benefit from a better understanding of the variation in the mtDNA coding region, but, thus far, complete mtDNA sequences have been rare. We determined the nucleotide sequence in the coding region of mtDNA from 121 Finns, by conformation-sensitive gel electrophoresis and subsequent sequencing and by direct sequencing of the D loop. Furthermore, 71 sequences from our previous reports were included, so that the samples represented all the mtDNA haplogroups present in the Finnish population. We found a total of 297 variable sites in the coding region, which allowed the compilation of unambiguous phylogenetic networks. The D loop harbored 104 variable sites, and, in most cases, these could be localized within the coding-region networks, without discrepancies. Interestingly, many homoplasies were detected in the coding region. Nucleotide variation in the rRNA and tRNA genes was 6%, and that in the third nucleotide positions of structural genes amounted to 22% of that in the HVS-I. The complete networks enabled the relationships between the mtDNA haplogroups to be analyzed. Phylogenetic networks based on the entire coding-region sequence in mtDNA provide a rich source for further population genetic studies, and complete sequences make it easier to differentiate between disease-causing mutations and rare polymorphisms. PMID:11349229
Liu, Dewu; Zhang, Yushan; Du, Yinjun; Yang, Guanfu; Zhang, Xiquan
2007-06-01
The growth-correlated genes that are part of the neuroendocrine growth axis play crucial roles in the regulation of growth and development in pigs. The identification of genetic polymorphisms in these genes will enable scientists to evaluate the biological relevance of such polymorphisms and to gain a better understanding of quantitative traits like growth. In the present study, seven pairs of primers were designed to obtain unknown sequences of growth-correlated genes, and another 25 pairs of primers were designed to identify single nucleotide polymorphisms (SNPs) using denaturing high-performance liquid chromatography (DHPLC) in four pig breeds (Duroc, Landrace, Lantang and Wuzhishan) that differ significantly in growth and development characteristics. A total of 101 polymorphisms were discovered in 10,707 base pairs (bp) from six genes: ghrelin (GHRL), leptin (LEP), insulin-like growth factor II (IGF-II), insulin-like growth factor binding protein 2 (IGFBP-2), insulin-like growth factor binding protein 3 (IGFBP-3), and somatostatin (SS). The observed average distances between SNPs in the 5'UTR, coding regions, introns and 3'UTR were 134, 521, 81 and 92 bp, respectively. Four SNPs were found in the coding regions of IGF-II, IGFBP-2 and LEP. Two synonymous mutations were found, one each in the IGF-II and LEP genes, and two non-synonymous mutations were found, one each in the IGFBP-2 and LEP genes. Seven other mutations were also observed. Thirty-two PCR-RFLP markers were found among the 101 polymorphisms of the six genes. The SNPs discovered in this study would provide suitable markers for association studies of candidate genes with growth-related traits in pigs.
[Detection of gene mutation in glucose-6-phosphate dehydrogenase deficiency by RT-PCR sequencing].
Lyu, Rong-Yu; Chen, Xiao-Wen; Zhang, Min; Chen, Yun-Sheng; Yu, Jie; Wen, Fei-Qiu
2016-07-01
Since glucose-6-phosphate dehydrogenase (G6PD) deficiency is the most common hereditary hemolytic erythrocyte enzyme deficiency, most cases have single nucleotide mutations in the coding region, and current test methods for gene mutation have some missed detections, this study aimed to investigate the feasibility of RT-PCR sequencing in the detection of gene mutation in G6PD deficiency. According to the G6PD/6GPD ratio, 195 children with anemia of unknown cause or who underwent physical examination between August 2013 and July 2014 were classified into G6PD-deficiency group with 130 children (G6PD/6GPD ratio <1.00) and control group with 65 children (G6PD/6GPD ratio≥1.00). The primer design and PCR amplification conditions were optimized, and RT-PCR sequencing was used to analyze the complete coding sequence and verify the genomic DNA sequence in the two groups. In the G6PD-deficiency group, the detection rate of gene mutation was 100% and 13 missense mutations were detected, including one new mutation. In the control group, no missense mutation was detected in 28 boys; 13 heterozygous missense mutations, 1 homozygous same-sense mutation (C1191T) which had not been reported in China and abroad, and 14 single nucleotide polymorphisms of C1311T were detected in 37 girls. The control group showed a high rate of missed detection of G6PD deficiency (carriers) in the specimens from girls (35%, 13/37). RT-PCR sequencing has a high detection rate of G6PD gene mutation and a certain value in clinical diagnosis of G6PD deficiency.
Compositions and methods for detecting single nucleotide polymorphisms
Yeh, Hsin-Chih; Werner, James; Martinez, Jennifer S.
2016-11-22
Described herein are nucleic acid based probes and methods for discriminating and detecting single nucleotide variants in nucleic acid molecules (e.g., DNA). The methods include use of a pair of probes that can be used to detect and identify polymorphisms, for example single nucleotide polymorphisms in DNA. The pair of probes emits a different fluorescent wavelength of light depending on the association and alignment of the probes when hybridized to a target nucleic acid molecule. Each pair of probes is capable of discriminating at least two different nucleic acid molecules that differ by at least a single nucleotide. The methods and probes can be used, for example, for detection of DNA polymorphisms that are indicative of a particular disease or condition.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brinson, E.C.; Adriano, T.; Bloch, W.
1994-09-01
We have developed a rapid, single-tube, non-isotopic assay that screens a patient sample for the presence of 31 cystic fibrosis (CF) mutations. This assay can identify these mutations in a single reaction tube and a single electrophoresis run. Sample preparation is a simple, boil-and-go procedure, completed in less than an hour. The assay is composed of a 15-plex PCR, followed by a 61-plex oligonucleotide ligation assay (OLA), and incorporates a novel detection scheme, Sequence Coded Separation. Initially, the multiplex PCR amplifies 15 relevant segments of the CFTR gene, simultaneously. These PCR amplicons serve as templates for the multiplex OLA, which detects the normal or mutant allele at all loci, simultaneously. Each polymorphic site is interrogated by three oligonucleotide probes, a common probe and two allele-specific probes. Each common probe is tagged with a fluorescent dye, and the competing normal and mutant allelic probes incorporate different, non-nucleotide, mobility modifiers. These modifiers are composed of hexaethylene oxide (HEO) units, incorporated as HEO phosphoramidite monomers during automated DNA synthesis. The OLA is based on both probe hybridization and the ability of DNA ligase to discriminate single base mismatches at the junction between paired probes. Each single tube assay is electrophoresed in a single gel lane of a 4-color fluorescent DNA sequencer (Applied Biosystems, Model 373A). Each of the ligation products is identified by its unique combination of electrophoretic mobility and one of three colors. The fourth color is reserved for the in-lane size standard, used by GENESCAN™ software (Applied Biosystems) to size the OLA electrophoresis products. The Genotyper™ software (Applied Biosystems) decodes these Sequence-Coded-Separation data to create a patient summary report for all loci tested.
Polymorphism of prion protein gene in Arctic fox (Vulpes lagopus).
Wan, Jiayu; Bai, Xue; Liu, Wensen; Xu, Jing; Xu, Ming; Gao, Hongwei
2009-07-01
Prion diseases are fatal neurodegenerative disorders of humans and certain other mammals. The prion protein gene (Prnp) is associated with susceptibility and the species barrier to prion diseases. No natural or experimental prion diseases have been documented to date in the Arctic fox. In the present study, the coding region of Prnp from 135 Arctic foxes was cloned and screened for polymorphisms. Our results indicated that the Arctic fox Prnp open reading frame (ORF) contains 771 nucleotides encoding 257 amino acids. Four single nucleotide polymorphisms (SNPs) (G312C, A337G, C541T, and A723G) were identified. SNPs G312C and A723G produced silent mutations, but SNPs A337G and C541T resulted in an M-V change at codon 113 and an R-C change at codon 181, respectively. The Arctic fox Prnp amino acid sequence was similar to that of the dog (XM 542906). In short, this study provides preliminary information about genotypes of Prnp in the Arctic fox.
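The codon numbers quoted in this abstract follow from integer arithmetic on the nucleotide positions. The sketch below assumes that positions are 1-based and counted from the first base of the ORF, which the abstract does not state explicitly; the function name `codon_of` is illustrative only.

```python
def codon_of(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based ORF nucleotide position to (codon number, position within codon)."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are assumed to be 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


ORF_LENGTH = 771  # nucleotides, as reported in the abstract
print("codons in ORF:", ORF_LENGTH // 3)

# SNP names and positions are taken from the abstract.
for snp, pos in [("G312C", 312), ("A337G", 337), ("C541T", 541), ("A723G", 723)]:
    codon, offset = codon_of(pos)
    print(f"{snp}: codon {codon}, codon position {offset}")
```

Under this numbering assumption, A337G and C541T fall on the first base of codons 113 and 181, which agrees with the reported amino acid changes, and the two silent SNPs fall on third codon positions.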
Kang, In-Nee; Musa, Maslinda; Harun, Fatimah; Junit, Sarni Mat
2010-02-01
The FOXE1 gene was screened for mutations in a cohort of 34 unrelated patients with congenital hypothyroidism, 14 of whom had thyroid dysgenesis and 18 were normal (the thyroid status for 2 patients was unknown). The entire coding region of the FOXE1 gene was PCR-amplified, then analyzed using single-stranded conformational polymorphism, followed by confirmation by direct DNA sequencing. DNA sequencing analysis revealed a heterozygous A>G transition at nucleotide position 394 in one of the patients. The nucleotide transition changed asparagine to aspartate at codon 132 in the highly conserved region of the forkhead DNA binding domain of the FOXE1 gene. This mutation was not detected in a total of 104 normal healthy individuals screened. The binding ability of the mutant FOXE1 protein to the human thyroperoxidase (TPO) promoter was slightly reduced compared with the wild-type FOXE1. The mutation also caused a 5% loss of TPO transcriptional activity.
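The reported asparagine-to-aspartate change can be checked against the standard genetic code. The sketch below assumes the standard (nuclear) code and 1-based numbering from the start codon; the abstract does not give the codon sequence, so both asparagine codons are tried. The helper name `a_to_g_outcomes` is hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def a_to_g_outcomes(codon: str) -> list[tuple[int, str, str]]:
    """Return (codon position, mutant codon, amino acid) for every possible A>G change."""
    outcomes = []
    for i, base in enumerate(codon):
        if base == "A":
            mutant = codon[:i] + "G" + codon[i + 1:]
            outcomes.append((i + 1, mutant, CODON_TABLE[mutant]))
    return outcomes


asn_codons = [codon for codon, aa in CODON_TABLE.items() if aa == "N"]
for codon in asn_codons:
    for pos, mutant, aa in a_to_g_outcomes(codon):
        print(f"{codon} -> {mutant} (codon position {pos}) encodes {aa}")

# Nucleotide 394 is the first base of codon 132 under 1-based numbering.
print((394 - 1) // 3 + 1, (394 - 1) % 3 + 1)
```

Only an A>G change at the first codon position converts asparagine (N) to aspartate (D), and nucleotide 394 is a first codon position, so the reported nucleotide and codon numbers are mutually consistent.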
Spliced RNA of woodchuck hepatitis virus.
Ogston, C W; Razman, D G
1992-07-01
Polymerase chain reaction was used to investigate RNA splicing in liver of woodchucks infected with woodchuck hepatitis virus (WHV). Two spliced species were detected, and the splice junctions were sequenced. The larger spliced RNA has an intron of 1300 nucleotides, and the smaller spliced sequence shows an additional downstream intron of 1104 nucleotides. We did not detect singly spliced sequences from which the smaller intron alone was removed. Control experiments showed that spliced sequences are present in both RNA and DNA in infected liver, showing that the viral reverse transcriptase can use spliced RNA as template. Spliced sequences were detected also in virion DNA prepared from serum. The upstream intron produces a reading frame that fuses the core to the polymerase polypeptide, while the downstream intron causes an inframe deletion in the polymerase open reading frame. Whereas the splicing patterns in WHV are superficially similar to those reported recently in hepatitis B virus, we detected no obvious homology in the coding capacity of spliced RNAs from these two viruses.
Listorti, Valeria; Laconi, Andrea; Catelli, Elena; Cecchinato, Mattia; Lupini, Caterina; Naylor, Clive J
2017-10-09
IBV genotype QX causes sufficient disease in Europe for several commercial companies to have started developing live attenuated vaccines. Here, one of those vaccines (L1148) was fully consensus sequenced alongside its progenitor field strain (1148-A) to determine vaccine markers, thereby enabling detection on farms. Twenty-eight single nucleotide substitutions were associated with the 1148-A attenuation, of which any combination can identify vaccine L1148 in the field. Sixteen substitutions resulted in amino acid coding changes of which half were in spike. One change in the 1b gene altered the normally highly conserved final 5 nucleotides of the transcription regulatory sequence of the S gene, common to all IBV QX genes. No mutations can currently be associated with the attenuation process. Field vaccination strategies would greatly benefit by such comparative sequence data being mandatorily submitted to regulators prior to vaccine release following a successful registration process. Copyright © 2017. Published by Elsevier Ltd.
Shao, Renfu; Barker, Stephen C
2011-02-15
The mitochondrial (mt) genome of the human body louse, Pediculus humanus, consists of 18 minichromosomes. Each minichromosome is 3 to 4 kb long and has 1 to 3 genes. There is unequivocal evidence for recombination between different mt minichromosomes in P. humanus. It is not known, however, how these minichromosomes recombine. Here, we report the discovery of eight chimeric mt minichromosomes in P. humanus. We classify these chimeric mt minichromosomes into two groups: Group I and Group II. Group I chimeric minichromosomes contain parts of two different protein-coding genes that are from different minichromosomes. The two parts of protein-coding genes in each Group I chimeric minichromosome are joined at a microhomologous nucleotide sequence; microhomologous nucleotide sequences are hallmarks of non-homologous recombination. Group II chimeric minichromosomes contain all of the genes and the non-coding regions of two different minichromosomes. The conserved sequence blocks in the non-coding regions of Group II chimeric minichromosomes resemble the "recombination repeats" in the non-coding regions of the mt genomes of higher plants. These repeats are essential to homologous recombination in higher plants. Our analyses of the nucleotide sequences of chimeric mt minichromosomes indicate both homologous and non-homologous recombination between minichromosomes in the mitochondria of the human body louse. Copyright © 2010 Elsevier B.V. All rights reserved.
Can, Ceren; Yazıcıoğlu, Mehtap; Gürkan, Hakan; Tozkır, Hilmi; Görgülü, Adnan; Süt, Necdet Hilmi
2017-01-01
Background: Atopic dermatitis is the most common chronic inflammatory skin disease. A complex interaction of both genetic and environmental factors is thought to contribute to the disease. Aims: To evaluate whether single nucleotide polymorphisms in the TLR2 gene c.2258C>T (R753Q) (rs5743708) and TLR2 c.-148+1614T>A (A-16934T) (rs4696480) (NM_0032643) are associated with atopic dermatitis in Turkish children. Study Design: Case-control study. Methods: The study was conducted on 70 Turkish children with atopic dermatitis aged 0.5-18 years. The clinical severity of atopic dermatitis was evaluated by the severity scoring of atopic dermatitis index. Serum total IgE levels and specific IgE antibodies to inhalant and food allergens were measured in both atopic dermatitis patients and controls; skin prick tests were done on the 70 children with atopic dermatitis. Genotyping for TLR2 (R753Q and A-16934T) single nucleotide polymorphisms was performed in both atopic dermatitis patients and controls. Results: Cytosine-cytosine and cytosine-thymine genotype frequencies of the TLR2 R753Q single nucleotide polymorphism in the atopic dermatitis group were 98.6% and 1.4%; the cytosine allele frequency was 99.29% and the thymine allele frequency was 0.71%. Thymine-thymine, thymine-adenine, and adenine-adenine genotype frequencies of the TLR2 A-16934T single nucleotide polymorphism were 24.3%, 44.3%, and 31.4%, respectively. The thymine allele frequency for the TLR2 A-16934T single nucleotide polymorphism in the atopic dermatitis group was 46.43%, and the adenine allele frequency was 53.57%. There was no statistically significant difference between the groups for any of the investigated polymorphisms (p>0.05). For all single nucleotide polymorphisms studied, allelic distribution was analogous among atopic dermatitis patients and controls.
No homozygous carriers of the TLR2 R753Q single nucleotide polymorphism were found in the atopic dermatitis and control groups. Conclusion: The TLR2 (R753Q and A-16934T) single nucleotide polymorphisms are not associated with atopic dermatitis in a group of Turkish patients. PMID:28443596
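The allele frequencies in this abstract can be reproduced from genotype counts by allele counting. The abstract reports only percentages, so the counts below are back-calculated from them for the 70 patients and should be read as an assumption rather than as reported data; the function name `allele_frequencies` is illustrative.

```python
def allele_frequencies(hom_first: int, het: int, hom_second: int) -> tuple[float, float]:
    """Return (first allele frequency, second allele frequency) from genotype counts."""
    total_alleles = 2 * (hom_first + het + hom_second)
    if total_alleles == 0:
        raise ValueError("at least one genotyped individual is required")
    first = (2 * hom_first + het) / total_alleles
    return first, 1.0 - first


# Assumed counts, inferred from the reported percentages with n = 70 patients.
# R753Q: CC = 69 (98.6%), CT = 1 (1.4%), TT = 0.
c_freq, t_freq = allele_frequencies(69, 1, 0)
print(f"R753Q: C {100 * c_freq:.2f}%, T {100 * t_freq:.2f}%")

# A-16934T: TT = 17 (24.3%), TA = 31 (44.3%), AA = 22 (31.4%).
t2_freq, a2_freq = allele_frequencies(17, 31, 22)
print(f"A-16934T: T {100 * t2_freq:.2f}%, A {100 * a2_freq:.2f}%")
```

With these inferred counts the computed allele frequencies round to the values given in the abstract (99.29% and 0.71%; 46.43% and 53.57%).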
Improved prediction of biochemical recurrence after radical prostatectomy by genetic polymorphisms.
Morote, Juan; Del Amo, Jokin; Borque, Angel; Ars, Elisabet; Hernández, Carlos; Herranz, Felipe; Arruza, Antonio; Llarena, Roberto; Planas, Jacques; Viso, María J; Palou, Joan; Raventós, Carles X; Tejedor, Diego; Artieda, Marta; Simón, Laureano; Martínez, Antonio; Rioja, Luis A
2010-08-01
Single nucleotide polymorphisms are inherited genetic variations that can predispose or protect individuals against clinical events. We hypothesized that single nucleotide polymorphism profiling may improve the prediction of biochemical recurrence after radical prostatectomy. We performed a retrospective, multi-institutional study of 703 patients treated with radical prostatectomy for clinically localized prostate cancer who had at least 5 years of followup after surgery. All patients were genotyped for 83 prostate cancer related single nucleotide polymorphisms using a low density oligonucleotide microarray. Baseline clinicopathological variables and single nucleotide polymorphisms were analyzed to predict biochemical recurrence within 5 years using stepwise logistic regression. Discrimination was measured by ROC curve AUC, specificity, sensitivity, predictive values, net reclassification improvement and integrated discrimination index. The overall biochemical recurrence rate was 35%. The model with the best fit combined 8 covariates, including the 5 clinicopathological variables prostate specific antigen, Gleason score, pathological stage, lymph node involvement and margin status, and 3 single nucleotide polymorphisms at the KLK2, SULT1A1 and TLR4 genes. Model predictive power was defined by 80% positive predictive value, 74% negative predictive value and an AUC of 0.78. The model based on clinicopathological variables plus single nucleotide polymorphisms showed significant improvement over the model without single nucleotide polymorphisms, as indicated by 23.3% net reclassification improvement (p = 0.003), integrated discrimination index (p <0.001) and likelihood ratio test (p <0.001). Internal validation proved model robustness (bootstrap corrected AUC 0.78, range 0.74 to 0.82). The calibration plot showed close agreement between biochemical recurrence observed and predicted probabilities. 
Predicting biochemical recurrence after radical prostatectomy based on clinicopathological data can be significantly improved by including patient genetic information. Copyright (c) 2010 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Deep sequencing approaches for the analysis of prokaryotic transcriptional boundaries and dynamics.
James, Katherine; Cockell, Simon J; Zenkin, Nikolay
2017-05-01
The identification of the protein-coding regions of a genome is straightforward due to the universality of start and stop codons. However, the boundaries of the transcribed regions, conditional operon structures, non-coding RNAs and the dynamics of transcription, such as pausing of elongation, are non-trivial to identify, even in the comparatively simple genomes of prokaryotes. Traditional methods for the study of these areas, such as tiling arrays, are noisy, labour-intensive and lack the resolution required for densely-packed bacterial genomes. Recently, deep sequencing has become increasingly popular for the study of the transcriptome due to its lower costs, higher accuracy and single nucleotide resolution. These methods have revolutionised our understanding of prokaryotic transcriptional dynamics. Here, we review the deep sequencing and data analysis techniques that are available for the study of transcription in prokaryotes, and discuss the bioinformatic considerations of these analyses. Copyright © 2017 Elsevier Inc. All rights reserved.
HERC1 polymorphisms: population-specific variations in haplotype composition.
Yuasa, Isao; Umetsu, Kazuo; Nishimukai, Hiroaki; Fukumori, Yasuo; Harihara, Shinji; Saitou, Naruya; Jin, Feng; Chattopadhyay, Prasanta K; Henke, Lotte; Henke, Jürgen
2009-08-01
Human HERC1 is one of six HERC proteins and may play an important role in intracellular membrane trafficking. The human HERC1 gene is suggested to have been affected by local positive selection. To assess the global frequency distributions of coding and non-coding single nucleotide polymorphisms (SNPs) in the HERC1 gene, we developed a new simultaneous genotyping method for four SNPs, and applied this method to investigate 1213 individuals from 12 global populations. The results confirmed marked differences in the allele and haplotype frequencies between East Asian and non-East Asian populations. One of the three common haplotypes observed was found to be characteristic of East Asians, who showed a relatively uniform distribution of haplotypes. Information on haplotypes would be useful for testing the function of polymorphisms in the HERC1 gene. This is the first study to investigate the distribution of HERC1 polymorphisms in various populations. (c) 2009 John Wiley & Sons, Ltd.
Polymorphisms of 20 regulatory proteins between Mycobacterium tuberculosis and Mycobacterium bovis.
Bigi, María M; Blanco, Federico Carlos; Araújo, Flabio R; Thacker, Tyler C; Zumárraga, Martín J; Cataldi, Angel A; Soria, Marcelo A; Bigi, Fabiana
2016-08-01
Mycobacterium tuberculosis and Mycobacterium bovis are responsible for tuberculosis in humans and animals, respectively. Both species are closely related and belong to the Mycobacterium tuberculosis complex (MTC). M. tuberculosis is the most ancient species from which M. bovis and other members of the MTC evolved. The genome of M. bovis is >99.95% identical to that of M. tuberculosis but with seven deletions ranging in size from 1 to 12.7 kb. In addition, 1200 single nucleotide mutations in coding regions distinguish M. bovis from M. tuberculosis. In the present study, we assessed 75 M. tuberculosis genomes and 23 M. bovis genomes to identify non-synonymous mutations in 202 coding sequences of regulatory genes between both species. We identified species-specific variants in 20 regulatory proteins and confirmed differential expression of hypoxia-related genes between M. bovis and M. tuberculosis. © 2016 The Societies and John Wiley & Sons Australia, Ltd.
Wu, Lei; He, Yao; Zhang, Di
2015-11-01
To systematically evaluate the association between single nucleotide polymorphism of rs2231142 genetic susceptibility and gout in East Asian population. The literature retrieval was conducted by using English databases (Medline, EMbase), Chinese databases (CNKI, Vip, Wanfang, SinaMed) and others to collect the published papers on the association between single nucleotide polymorphism of rs2231142 genetic susceptibility and gout by the end of December 2014. Meta-analysis was performed with software Stata 12.0. Nine studies were included. There were significant associations between increased risk of gout and single nucleotide polymorphism of rs2231142, the combined OR was 2.04 (95%CI: 1.82-2.28) for A allele and C allele, 1.97 (95%CI: 1.57-2.48) for CA and CC, 3.71 (95%CI: 3.07-4.47) for AA and CC. Sex and region specific subgroup analysis showed less heterogeneity. There is significant association between gout and single nucleotide polymorphism of rs2231142 in East Asian population, and A allele is a high risk gene for gout.
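The pooled odds ratios and confidence intervals in this abstract can be sanity-checked on the log scale. The sketch below assumes Wald-type 95% intervals that are symmetric around the log odds ratio, which is the usual convention but is not stated in the abstract; the function name `log_or_summary` is illustrative.

```python
import math


def log_or_summary(lower: float, upper: float, z: float = 1.96) -> tuple[float, float]:
    """Recover (point estimate, standard error of log OR) from a symmetric log-scale CI."""
    log_lo, log_hi = math.log(lower), math.log(upper)
    point = math.exp((log_lo + log_hi) / 2)  # geometric midpoint of the interval
    se = (log_hi - log_lo) / (2 * z)
    return point, se


# (reported OR, lower limit, upper limit) as given in the abstract.
reported = {
    "A vs C": (2.04, 1.82, 2.28),
    "CA vs CC": (1.97, 1.57, 2.48),
    "AA vs CC": (3.71, 3.07, 4.47),
}

for label, (odds_ratio, lo, hi) in reported.items():
    point, se = log_or_summary(lo, hi)
    print(f"{label}: reported OR {odds_ratio}, midpoint {point:.2f}, SE(log OR) {se:.3f}")
```

Under that assumption, the geometric midpoint of each interval lands close to the reported odds ratio, which suggests the three estimates and their intervals are internally consistent.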
CNTNAP2 Is Significantly Associated With Speech Sound Disorder in the Chinese Han Population.
Zhao, Yun-Jing; Wang, Yue-Ping; Yang, Wen-Zhu; Sun, Hong-Wei; Ma, Hong-Wei; Zhao, Ya-Ru
2015-11-01
Speech sound disorder is the most common communication disorder. Some investigations support the possibility that the CNTNAP2 gene might be involved in the pathogenesis of speech-related diseases. To investigate single-nucleotide polymorphisms in the CNTNAP2 gene, 300 unrelated speech sound disorder patients and 200 normal controls were included in the study. Five single-nucleotide polymorphisms were amplified and directly sequenced. Significant differences were found in the genotype (P = .0003) and allele (P = .0056) frequencies of rs2538976 between patients and controls. The excess frequency of the A allele in the patient group remained significant after Bonferroni correction (P = .0280). A significant haplotype association with rs2710102T/+rs17236239A/+2538976A/+2710117A (P = 4.10e-006) was identified. A neighboring single-nucleotide polymorphism, rs10608123, was found in complete linkage disequilibrium with rs2538976, and the genotypes exactly corresponded to each other. The authors propose that these CNTNAP2 variants increase the susceptibility to speech sound disorder. The single-nucleotide polymorphisms rs10608123 and rs2538976 may merge into one single-nucleotide polymorphism. © The Author(s) 2015.
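The Bonferroni-corrected value quoted above is consistent with multiplying the raw allele P value by the number of SNPs tested. The abstract does not state the correction factor, so using five tests (one per SNP) is an assumption; the function name `bonferroni` is illustrative.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni adjustment: multiply by the number of tests and cap at 1."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return min(1.0, p_value * n_tests)


N_SNPS = 5          # SNPs sequenced in the study; assumed to be the number of tests
ALLELE_P = 0.0056   # raw allele-frequency P value for rs2538976, from the abstract

adjusted = bonferroni(ALLELE_P, N_SNPS)
print(f"adjusted P = {adjusted:.4f}")
```

With five tests the adjusted value is 0.0280 to four decimal places, matching the corrected P value in the abstract.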
Evolution of Nucleotide Punctuation Marks: From Structural to Linear Signals.
El Houmami, Nawal; Seligmann, Hervé
2017-01-01
We present an evolutionary hypothesis assuming that signals marking nucleotide synthesis (DNA replication and RNA transcription) evolved from multi- to unidimensional structures, and were carried over from transcription to translation. This evolutionary scenario presumes that signals combining secondary and primary nucleotide structures are evolutionary transitions. Mitochondrial replication initiation fits this scenario. Some observations reported in the literature corroborate that several signals for nucleotide synthesis function in translation, and vice versa. (a) Polymerase-induced frameshift mutations occur preferentially at translational termination signals (nucleotide deletion is interpreted as termination of nucleotide polymerization, paralleling the role of stop codons in translation). (b) Stem-loop hairpin presence/absence modulates codon-amino acid assignments, showing that translational signals sometimes combine primary and secondary nucleotide structures (here codon and stem-loop). (c) Homopolymer nucleotide triplets (AAA, CCC, GGG, TTT) cause transcriptional and ribosomal frameshifts. Here we find in recently described human mitochondrial RNAs that systematically lack mono-, dinucleotides after each trinucleotide (delRNAs) that delRNA triplets include 2x more homopolymers than mitogenome regions not covered by delRNA. Further analyses of delRNAs show that the natural circular code X (a little-known group of 20 translational signals enabling ribosomal frame retrieval consisting of 20 codons {AAC, AAT, ACC, ATC, ATT, CAG, CTC, CTG, GAA, GAC, GAG, GAT, GCC, GGC, GGT, GTA, GTC, GTT, TAC, TTC} universally overrepresented in coding versus other frames of gene sequences), regulates frameshift in transcription and translation. This dual transcription and translation role confirms for X the hypothesis that translational signals were carried over from transcriptional signals.
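Some basic properties of the codon set X listed above can be checked directly. The sketch below verifies that the list contains 20 distinct codons and none of the homopolymer triplets mentioned in point (c). It also tests closure under reverse complementation, a property the abstract does not mention and which is included here only as an additional check. Helper names are illustrative.

```python
# Codon set X exactly as listed in the abstract.
X = ("AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
     "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC").split()

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(codon: str) -> str:
    """Return the reverse complement of a DNA codon."""
    return codon.translate(COMPLEMENT)[::-1]


def is_homopolymer(codon: str) -> bool:
    """True for triplets made of one repeated base, such as AAA or TTT."""
    return len(set(codon)) == 1


x_set = set(X)
print("distinct codons:", len(x_set))
print("contains a homopolymer:", any(is_homopolymer(c) for c in x_set))
print("closed under reverse complement:",
      all(reverse_complement(c) in x_set for c in x_set))
```

This check covers only set-level properties; it does not test the circularity (frame-retrieval) property itself, which needs a separate analysis.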
USDA-ARS's Scientific Manuscript database
Single-nucleotide polymorphisms (SNPs) are the most common genetic markers in Theobroma cacao, occurring approximately once in every 200 nucleotides. SNPs, like microsatellites, are co-dominant and PCR-based, but they have several advantages over microsatellites. They are unambiguous, so that a SN...
Seto, P; Hirayu, H; Magnusson, R P; Gestautas, J; Portmann, L; DeGroot, L J; Rapoport, B
1987-01-01
The thyroid microsomal antigen (MSA) in autoimmune thyroid disease is a protein of approximately 107 kD. We screened a human thyroid cDNA library constructed in the expression vector lambda gt11 with anti-107-kD monoclonal antibodies. Of five clones obtained, the recombinant beta-galactosidase fusion protein from one clone (PM-5) was confirmed to react with the monoclonal antiserum. The complementary DNA (cDNA) insert from PM-5 (0.8 kb) was used as a probe on Northern blot analysis to estimate the size of the mRNA coding for the MSA. The 2.9-kb messenger RNA (mRNA) species observed was the same size as that coding for human thyroid peroxidase (TPO). The probe did not bind to human liver mRNA, indicating the thyroid-specific nature of the PM-5-related mRNA. The nucleotide sequence of PM-5 (842 bp) was determined and consisted of a single open reading frame. Comparison of the nucleotide sequence of PM-5 with that presently available for pig TPO indicates 84% homology. In conclusion, a cDNA clone representing part of the microsomal antigen has been isolated. Sequence homology with porcine TPO, as well as identity in the size of the mRNA species for both the microsomal antigen and TPO, indicate that the microsomal antigen is, at least in part, TPO. PMID:3654979
McCutchen-Maloney, Sandra L.
2002-01-01
DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.
2011-01-01
Background: The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results: The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. In total, 84% of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively. 
Conclusions: Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non-conserved structure both in gene number and organisation, as well as in the features of the noncoding DNA. The transfer of nuclear DNA to the melon mitochondrial genome and the high proportion of repetitive DNA appear to explain the size of the largest mitochondrial genome reported so far. PMID:21854637
Association between AMELX polymorphisms and dental caries in Koreans.
Kang, S W; Yoon, I; Lee, H W; Cho, J
2011-05-01
Dental caries is a disease greatly influenced by environmental factors, but there is increasing evidence for a genetic component in caries susceptibility. AMELX is the gene encoding amelogenin, which is the most important factor for normal enamel development. The aim of this study was to examine the relationship between dental caries and single nucleotide polymorphisms (SNPs) in AMELX. For this study, we used DNA samples collected from 120 unrelated individuals older than 12 years of age. All of them were examined for their oral and dental status under the WHO recommended criteria, and clinical information such as DMFT and DMFS was evaluated. Individuals whose DMFT and DMFS indices were lower than 2 were designated 'very low caries experience' and those whose indices were higher than 3 were designated 'higher caries experience'. Genomic DNA was extracted from hair samples, and single nucleotide polymorphisms of AMELX were genotyped. Three SNPs (rs17878486, rs5933871, rs5934997, intron) in the AMELX gene were genotyped by direct sequencing and analyzed with SNPStats. There were significant associations between the rs5933871 and rs5934997 SNPs and caries susceptibility in the water fluoridation group. These results suggest that SNPs of AMELX might be associated with dental caries susceptibility in the Korean population. © 2010 John Wiley & Sons A/S.
Comparative genomics of the mimicry switch in Papilio dardanus.
Timmermans, Martijn J T N; Baxter, Simon W; Clark, Rebecca; Heckel, David G; Vogel, Heiko; Collins, Steve; Papanicolaou, Alexie; Fukova, Iva; Joron, Mathieu; Thompson, Martin J; Jiggins, Chris D; ffrench-Constant, Richard H; Vogler, Alfried P
2014-07-22
The African Mocker Swallowtail, Papilio dardanus, is a textbook example in evolutionary genetics. Classical breeding experiments have shown that wing pattern variation in this polymorphic Batesian mimic is determined by the polyallelic H locus that controls a set of distinct mimetic phenotypes. Using bacterial artificial chromosome (BAC) sequencing, recombination analyses and comparative genomics, we show that H co-segregates with an interval of less than 500 kb that is collinear with two other Lepidoptera genomes and contains 24 genes, including the transcription factor genes engrailed (en) and invected (inv). H is located in a region of conserved gene order, which argues against any role for genomic translocations in the evolution of a hypothesized multi-gene mimicry locus. Natural populations of P. dardanus show significant associations of specific morphs with single nucleotide polymorphisms (SNPs), centred on en. In addition, SNP variation in the H region reveals evidence of non-neutral molecular evolution in the en gene alone. We find evidence for a duplication potentially driving physical constraints on recombination in the lamborni morph. Absence of perfect linkage disequilibrium between different genes in the other morphs suggests that H is limited to nucleotide positions in the regulatory and coding regions of en. Our results therefore support the hypothesis that a single gene underlies wing pattern variation in P. dardanus.
Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo
2018-01-01
We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. PMID:29367403
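The perfect-match idea described above can be illustrated with a toy k-mer sketch. This is not the authors' PMGL pipeline: the function names, the k-mer length and the use of a plain k-mer count per reference position are all assumptions made for illustration. It only shows why a single nucleotide variant leaves a run of k zero-coverage positions in such a landscape.

```python
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer that occurs in the query reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def perfect_match_landscape(reference, reads, k):
    """For each reference position, count reads' exact matches to the k-mer starting there."""
    counts = kmer_counts(reads, k)
    return [counts[reference[i:i + k]] for i in range(len(reference) - k + 1)]


def zero_runs(landscape):
    """Return (start, end) index pairs of consecutive zero-count positions."""
    runs, start = [], None
    for i, value in enumerate(landscape):
        if value == 0 and start is None:
            start = i
        elif value != 0 and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(landscape) - 1))
    return runs


if __name__ == "__main__":
    # Hypothetical 20-nt reference and a query carrying one substitution at index 10.
    reference = "ACGTTGCAAGGCTTACGGAT"
    query = reference[:10] + "T" + reference[11:]
    landscape = perfect_match_landscape(reference, [query], k=5)
    print(landscape)
    print(zero_runs(landscape))
```

In this sketch the location of the variant is read off the zero run alone; working out what kind of variant it is would need a separate alignment step, which mirrors the decoupling the abstract describes.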
Mendes-Junior, C T; Castelli, E C; Meyer, D; Simões, A L; Donadi, E A
2013-12-01
HLA-G has an important role in the modulation of the maternal immune system during pregnancy, and evidence that balancing selection acts in the promoter and 3'UTR regions has been previously reported. To determine whether selection acts on the HLA-G coding region in the Amazon Rainforest, exons 2, 3 and 4 were analyzed in a sample of 142 Amerindians from nine villages of five isolated tribes that inhabit the Central Amazon. Six previously described single-nucleotide polymorphisms (SNPs) were identified and the Expectation-Maximization (EM) and PHASE algorithms were used to computationally reconstruct SNP haplotypes (HLA-G alleles). A new HLA-G allele, which originated in Amerindian populations by a crossing-over event between two widespread HLA-G alleles, was identified in 18 individuals. Neutrality tests evidenced that natural selection has a complex part in the HLA-G coding region. Although balancing selection is the type of selection that shapes variability at a local level (Native American populations), we have also shown that purifying selection may occur on a worldwide scale. Moreover, the balancing selection does not seem to act on the coding region as strongly as it acts on the flanking regulatory regions, and such coding signature may actually reflect a hitchhiking effect.
Grotegut, Chad A; Ngan, Emily; Garrett, Melanie E; Miranda, Marie Lynn; Ashley-Koch, Allison E; Swamy, Geeta K
2017-09-01
Oxytocin is a potent uterotonic agent that is widely used for induction and augmentation of labor. Oxytocin has a narrow therapeutic index and the optimal dosing for any individual woman varies widely. The objective of this study was to determine whether genetic variation in the oxytocin receptor (OXTR) or in the gene encoding G protein-coupled receptor kinase 6 (GRK6), which regulates desensitization of the oxytocin receptor, could explain variation in oxytocin dosing and labor outcomes among women being induced near term. Pregnant women with a singleton gestation residing in Durham County, NC, were prospectively enrolled as part of the Healthy Pregnancy, Healthy Baby cohort study. Those women undergoing an induction of labor at 36 weeks or greater were genotyped for 18 haplotype-tagging single-nucleotide polymorphisms in OXTR and 7 haplotype-tagging single-nucleotide polymorphisms in GRK6 using TaqMan assays. Linear regression was used to examine the relationship between maternal genotype and maximal oxytocin infusion rate, total oxytocin dose received, and duration of labor. Logistic regression was used to test for the association of maternal genotype with mode of delivery. For each outcome, backward selection techniques were utilized to control for important confounding variables and additive genetic models were used. Race/ethnicity was included in all models because of differences in allele frequencies across populations, and Bonferroni correction for multiple testing was used. DNA was available from 482 women undergoing induction of labor at 36 weeks or greater. Eighteen haplotype-tagging single-nucleotide polymorphisms within OXTR and 7 haplotype-tagging single-nucleotide polymorphisms within GRK6 were examined. Five single-nucleotide polymorphisms in OXTR showed nominal significance with maximal infusion rate of oxytocin, and two single-nucleotide polymorphisms in OXTR were associated with total oxytocin dose received. 
One single-nucleotide polymorphism in OXTR and two single-nucleotide polymorphisms in GRK6 were associated with duration of labor, one of which met the multiple testing threshold (P = .0014, rs2731664 [GRK6], mean duration of labor, 17.7 hours vs 20.2 hours vs 23.5 hours for AA, AC, and CC genotypes, respectively). Three single-nucleotide polymorphisms, two in OXTR and one in GRK6, showed nominal significance with mode of delivery. Genetic variation in OXTR and GRK6 is associated with the amount of oxytocin required as well as the duration of labor and risk for cesarean delivery among women undergoing induction of labor near term. With further research, pharmacogenomic approaches may potentially be utilized to develop personalized treatment to improve safety and efficacy outcomes among women undergoing induction of labor. Copyright © 2017 Elsevier Inc. All rights reserved.
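As a rough check on the multiple-testing statement, the sketch below computes a Bonferroni threshold. It assumes a family-wise alpha of 0.05 and a correction over the 25 genotyped SNPs (18 in OXTR plus 7 in GRK6); the abstract does not state either choice, so both are assumptions.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumed inputs: alpha = 0.05 and one test per genotyped SNP.
n_snps = 18 + 7
threshold = bonferroni_threshold(0.05, n_snps)

# P value reported in the abstract for rs2731664 (GRK6) and duration of labor.
reported_p = 0.0014
meets_threshold = reported_p < threshold

if __name__ == "__main__":
    print(n_snps, threshold, meets_threshold)
```

Under these assumptions the threshold is 0.05 / 25 = 0.002, and the reported P = .0014 falls below it, which is consistent with the abstract's statement that this SNP met the multiple-testing threshold.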
Long-term excretion of vaccine-derived poliovirus by a healthy child.
Martín, Javier; Odoom, Kofi; Tuite, Gráinne; Dunn, Glynis; Hopewell, Nicola; Cooper, Gill; Fitzharris, Catherine; Butler, Karina; Hall, William W; Minor, Philip D
2004-12-01
A child was found to be excreting type 1 vaccine-derived poliovirus (VDPV) with a 1.1% sequence drift from Sabin type 1 vaccine strain in the VP1 coding region 6 months after he was immunized with oral live polio vaccine. Seventeen type 1 poliovirus isolates were recovered from stools taken from this child during the following 4 months. Contrary to expectation, the child was not deficient in humoral immunity and showed high levels of serum neutralization against poliovirus. Selected virus isolates were characterized in terms of their antigenic properties, virulence in transgenic mice, sensitivity for growth at high temperatures, and differences in nucleotide sequence from the Sabin type 1 strain. The VDPV isolates showed mutations at key nucleotide positions that correlated with the observed reversion to biological properties typical of wild polioviruses. A number of capsid mutations mapped at known antigenic sites leading to changes in the viral antigenic structure. Estimates of sequence evolution based on the accumulation of nucleotide changes in the VP1 coding region detected a "defective" molecular clock running at an apparent faster speed of 2.05% nucleotide changes per year versus 1% shown in previous studies. Remarkably, when compared to several type 1 VDPV strains of different origins, isolates from this child showed a much higher proportion of nonsynonymous versus synonymous nucleotide changes in the capsid coding region. This anomaly could explain the high VP1 sequence drift found and the ability of these virus strains to replicate in the gut for a longer period than expected.
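The clock argument can be made concrete with a small calculation. The sketch assumes a strictly linear clock (elapsed time equals divergence divided by rate), which is a simplification, and the function name is hypothetical.

```python
def months_of_replication(divergence_pct, rate_pct_per_year):
    """Time in months needed to accumulate a divergence at a constant rate."""
    return 12.0 * divergence_pct / rate_pct_per_year


if __name__ == "__main__":
    # 1.1% VP1 drift at the apparent rate of 2.05% per year reported for this child.
    print(round(months_of_replication(1.1, 2.05), 1))
    # The same drift at the 1% per year rate cited from previous studies.
    print(round(months_of_replication(1.1, 1.0), 1))
```

At 2.05% per year, 1.1% drift corresponds to about 6.4 months, close to the 6 months since immunization; at 1% per year the same drift would imply about 13.2 months, which is longer than the time elapsed.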
Engqvist, Martin K M; Nielsen, Jens
2015-08-21
The Ambiguous Nucleotide Tool (ANT) is a desktop application that generates and evaluates degenerate codons. Degenerate codons are used to represent DNA positions that have multiple possible nucleotide alternatives. This is useful for protein engineering and directed evolution, where primers specified with degenerate codons are used as a basis for generating libraries of protein sequences. ANT is intuitive and can be used in a graphical user interface or by interacting with the code through a defined application programming interface. ANT comes with full support for nonstandard, user-defined, or expanded genetic codes (translation tables), which is important because synthetic biology is being applied to an ever widening range of natural and engineered organisms. The Python source code for ANT is freely distributed so that it may be used without restriction, modified, and incorporated in other software or custom data pipelines.
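To show what evaluating a degenerate codon involves, here is a minimal sketch that expands IUPAC ambiguity codes and translates the result with the standard genetic code. It does not use ANT's API; `expand_codon` and `encoded_amino_acids` are hypothetical names introduced only for this illustration.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(degenerate):
    """List every concrete codon represented by a degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[ch] for ch in degenerate.upper()))]


def encoded_amino_acids(degenerate, table=CODON_TABLE):
    """Set of amino acids ('*' marks a stop) encoded by a degenerate codon."""
    return {table[codon] for codon in expand_codon(degenerate)}


if __name__ == "__main__":
    print(expand_codon("RAT"))
    print(len(expand_codon("NNK")), sorted(encoded_amino_acids("NNK")))
```

Passing a different `table` dictionary is one simple way to model a nonstandard or expanded genetic code in this sketch.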
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.
Zhang, Wei; Qi, Weihong; Albert, Thomas J; Motiwala, Alifiya S; Alland, David; Hyytia-Trees, Eija K; Ribot, Efrain M; Fields, Patricia I; Whittam, Thomas S; Swaminathan, Bala
2006-06-01
Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10^-9 per site per year), we estimate that the most recent common ancestor of the contemporary beta-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens.
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms
Zhang, Wei; Qi, Weihong; Albert, Thomas J.; Motiwala, Alifiya S.; Alland, David; Hyytia-Trees, Eija K.; Ribot, Efrain M.; Fields, Patricia I.; Whittam, Thomas S.; Swaminathan, Bala
2006-01-01
Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10^-9 per site per year), we estimate that the most recent common ancestor of the contemporary β-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens. PMID:16606700
Genetic spell-checking: gene editing using single-stranded DNA oligonucleotides.
Rivera-Torres, Natalia; Kmiec, Eric B
2016-02-01
Single-stranded oligonucleotides (ssODNs) can be used to direct the exchange of a single nucleotide or the repair of a single base within the coding region of a gene in a process that is known, generically, as gene editing. These molecules are composed of either all DNA residues or a mixture of RNA and DNA bases and utilize inherent metabolic functions to execute the genetic alteration within the context of a chromosome. The mechanism of action of gene editing is now being elucidated as well as an understanding of its regulatory circuitry, work that has been particularly important in establishing a foundation for designing effective gene editing strategies in plants. Double-strand DNA breakage and the activation of the DNA damage response pathway play key roles in determining the frequency with which gene editing activity takes place. Cellular regulators respond to such damage and their action impacts the success or failure of a particular nucleotide exchange reaction. A consequence of such activation is the natural slowing of replication fork progression, which naturally creates a more open chromatin configuration, thereby increasing access of the oligonucleotide to the DNA template. Herein, how critical reaction parameters influence the effectiveness of gene editing is discussed. Functional interrelationships between DNA damage, the activation of DNA response pathways and the stalling of replication forks are presented in detail as potential targets for increasing the frequency of gene editing by ssODNs in plants and plant cells. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
A Tradeoff Drives the Evolution of Reduced Metal Resistance in Natural Populations of Yeast
Chang, Shang-Lin; Leu, Jun-Yi
2011-01-01
Various types of genetic modification and selective forces have been implicated in the process of adaptation to novel or adverse environments. However, the underlying molecular mechanisms are not well understood in most natural populations. Here we report that a set of yeast strains collected from Evolution Canyon (EC), Israel, exhibit an extremely high tolerance to the heavy metal cadmium. We found that cadmium resistance is primarily caused by an enhanced function of a metal efflux pump, PCA1. Molecular analyses demonstrate that this enhancement can be largely attributed to mutations in the promoter sequence, while mutations in the coding region have a minor effect. Reconstruction experiments show that three single nucleotide substitutions in the PCA1 promoter quantitatively increase its activity and thus enhance the cells' cadmium resistance. Comparison among different yeast species shows that the critical nucleotides found in EC strains are conserved and functionally important for cadmium resistance in other species, suggesting that they represent an ancestral type. However, these nucleotides had diverged in most Saccharomyces cerevisiae populations, which gave cells growth advantages under conditions where cadmium is low or absent. Our results provide a rare example of a selective sweep in yeast populations driven by a tradeoff in metal resistance. PMID:21483812
Identification of common, unique and polymorphic microsatellites among 73 cyanobacterial genomes.
Kabra, Ritika; Kapil, Aditi; Attarwala, Kherunnisa; Rai, Piyush Kant; Shanker, Asheesh
2016-04-01
Microsatellites, also known as Simple Sequence Repeats, are short tandem repeats of 1-6 nucleotides. These repeats are found in coding as well as non-coding regions of both prokaryotic and eukaryotic genomes and play a significant role in the study of gene regulation, genetic mapping, DNA fingerprinting and evolutionary studies. The availability of 73 complete genome sequences of cyanobacteria enabled us to mine and statistically analyze microsatellites in these genomes. The cyanobacterial microsatellites identified through bioinformatics analysis were stored in a user-friendly database named CyanoSat, which is an efficient data representation and query system designed using ASP.net. The information in CyanoSat comprises perfect, imperfect and compound microsatellites found in coding, non-coding and coding-non-coding regions. Moreover, it contains PCR primers with 200-nucleotide-long flanking regions. The mined cyanobacterial microsatellites can be freely accessed at www.compubio.in/CyanoSat/home.aspx. In addition, 82 polymorphic, 13,866 unique and 2390 common microsatellites were detected. These microsatellites will be useful in strain identification and genetic diversity studies of cyanobacteria.
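Mining perfect microsatellites with motifs of 1-6 nucleotides can be sketched with a back-referencing regular expression. This is not the CyanoSat pipeline: the function name and the minimum repeat counts per motif length are hypothetical, and imperfect or compound repeats are not handled.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 nt); not taken from the study.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 3, 6: 3}


def find_perfect_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    thresholds = DEFAULT_MIN_REPEATS if min_repeats is None else min_repeats
    hits = []
    for size, n in sorted(thresholds.items()):
        # A motif of `size` bases followed by at least n - 1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "ATAT".
            if any(motif == motif[:d] * (size // d)
                   for d in range(1, size) if size % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: (AT)7 and (AGC)5 embedded in short flanks.
    toy = "GGC" + "AT" * 7 + "CCGT" + "AGC" * 5 + "TTG"
    print(find_perfect_ssrs(toy))
```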
The complete chloroplast genome of North American ginseng, Panax quinquefolius.
Han, Zeng-Jie; Li, Wei; Liu, Yuan; Gao, Li-Zhi
2016-09-01
We report the complete nucleotide sequence of the Panax quinquefolius chloroplast genome using next-generation sequencing technology. The genome size is 156 359 bp, including two inverted repeats (IRs) of 52 153 bp, separated by the large single-copy (LSC, 86 184 bp) and small single-copy (SSC, 18 081 bp) regions. This cp genome encodes 114 unigenes (80 protein-coding genes, four rRNA genes, and 30 tRNA genes), of which 18 are duplicated in the IR regions. Overall GC content of the genome is 38.08%. A phylogenomic analysis of the 10 complete chloroplast genomes from Araliaceae using Daucus carota from Apiaceae as outgroup showed that P. quinquefolius is closely related to the other two members of the genus Panax, P. ginseng and P. notoginseng.
Studying the genetic basis of speciation in high gene flow marine invertebrates
2016-01-01
A growing number of genes responsible for reproductive incompatibilities between species (barrier loci) exhibit the signals of positive selection. However, the possibility that genes experiencing positive selection diverge early in speciation and commonly cause reproductive incompatibilities has not been systematically investigated on a genome-wide scale. Here, I outline a research program for studying the genetic basis of speciation in broadcast spawning marine invertebrates that uses a priori genome-wide information on a large, unbiased sample of genes tested for positive selection. A targeted sequence capture approach is proposed that scores single-nucleotide polymorphisms (SNPs) in widely separated species populations at an early stage of allopatric divergence. The targeted capture of both coding and non-coding sequences enables SNPs to be characterized at known locations across the genome and at genes with known selective or neutral histories. The neutral coding and non-coding SNPs provide robust background distributions for identifying FST-outliers within genes that can, in principle, identify specific mutations experiencing diversifying selection. If natural hybridization occurs between species, the neutral coding and non-coding SNPs can provide a neutral admixture model for genomic clines analyses aimed at finding genes exhibiting strong blocks to introgression. Strongylocentrotid sea urchins are used as a model system to outline the approach but it can be used for any group that has a complete reference genome available. PMID:29491951
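The FST-outlier step can be sketched as follows. The estimator is the simple heterozygosity-based FST for one biallelic SNP in two equally weighted populations, and the cutoff is an empirical quantile of the neutral SNPs; the abstract does not specify either choice, so both are assumptions, as are all names and values in the example.

```python
def fst(p1, p2):
    """Heterozygosity-based FST for a biallelic SNP in two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # Monomorphic in both populations: no differentiation to measure.
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def fst_outliers(candidates, neutral_values, quantile=0.99):
    """Return candidate SNPs whose FST exceeds an empirical quantile of the neutral set."""
    ranked = sorted(neutral_values)
    cutoff = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return {name: value for name, value in candidates.items() if value > cutoff}


if __name__ == "__main__":
    # Hypothetical neutral background and two hypothetical candidate SNPs.
    neutral = [i / 100 for i in range(10)]
    candidates = {"snp_in_geneA": fst(0.2, 0.8), "snp_in_geneB": fst(0.45, 0.55)}
    print(fst_outliers(candidates, neutral, quantile=0.9))
```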
Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R; Voß, Björn
2015-04-22
In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5'UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5'UTR. Such an sRNA/mRNA structure, which we name 'actuaton', represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation.
Energy efficiency trade-offs drive nucleotide usage in transcribed regions
Chen, Wei-Hua; Lu, Guanting; Bork, Peer; Hu, Songnian; Lercher, Martin J.
2016-01-01
Efficient nutrient usage is a trait under universal selection. A substantial part of cellular resources is spent on making nucleotides. We thus expect preferential use of cheaper nucleotides especially in transcribed sequences, which are often amplified thousand-fold compared with genomic sequences. To test this hypothesis, we derive a mutation-selection-drift equilibrium model for nucleotide skews (strand-specific usage of 'A' versus 'T' and 'G' versus 'C'), which explains nucleotide skews across 1,550 prokaryotic genomes as a consequence of selection on efficient resource usage. Transcription-related selection generally favours the cheaper nucleotides 'U' and 'C' at synonymous sites. However, the information encoded in mRNA is further amplified through translation. Due to unexpected trade-offs in the codon table, cheaper nucleotides encode on average energetically more expensive amino acids. These trade-offs apply to both strand-specific nucleotide usage and GC content, causing a universal bias towards the more expensive nucleotides 'A' and 'G' at non-synonymous coding sites. PMID:27098217
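For readers unfamiliar with nucleotide skews, the sketch below uses the common definition (X - Y) / (X + Y) applied to one strand. Whether the study uses exactly this formula is an assumption, and the function names are hypothetical.

```python
def skew(seq, x, y):
    """Strand-specific skew (count_x - count_y) / (count_x + count_y); 0.0 if neither occurs."""
    n_x, n_y = seq.count(x), seq.count(y)
    total = n_x + n_y
    return (n_x - n_y) / total if total else 0.0


def at_skew(seq):
    """Excess of 'A' over 'T' on the given strand."""
    return skew(seq.upper(), "A", "T")


def gc_skew(seq):
    """Excess of 'G' over 'C' on the given strand."""
    return skew(seq.upper(), "G", "C")


if __name__ == "__main__":
    # Toy coding-strand fragment: 3 A, 1 T, 4 G, 2 C.
    fragment = "AAATGGGGCC"
    print(at_skew(fragment), gc_skew(fragment))
```

A positive value means the first nucleotide of the pair is over-represented on that strand; the reverse complement has skews of the opposite sign.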
Jafari, Naghmeh; Broer, Linda; Hoppenbrouwers, Ilse A; van Duijn, Cornelia M; Hintzen, Rogier Q
2010-11-01
Multiple sclerosis is a presumed autoimmune disease associated with genetic and environmental risk factors such as infectious mononucleosis. Recent research has shown infectious mononucleosis to be associated with a specific HLA class I polymorphism. Our aim was to test if the infectious mononucleosis-linked HLA class I single nucleotide polymorphism (rs6457110) is also associated with multiple sclerosis. Genotyping of the HLA-A single nucleotide polymorphism rs6457110 using TaqMan was performed in 591 multiple sclerosis cases and 600 controls. The association of multiple sclerosis with the HLA-A single nucleotide polymorphism was tested using logistic regression adjusted for age, sex and HLA-DRB1*1501. The HLA-A minor allele (A) is associated with multiple sclerosis (OR = 0.68; p = 4.08 × 10^-5). After stratification for carriers of the HLA-DRB1*1501 risk allele (T), we found a significant OR of 0.70 (p = 0.003) for HLA-A. HLA class I single nucleotide polymorphism rs6457110 is associated with infectious mononucleosis and multiple sclerosis, independent of the major class II allele, supporting the hypothesis that shared genetics may contribute to the association between infectious mononucleosis and multiple sclerosis.
Porcine insulin receptor substrate 4 (IRS4) gene: cloning, polymorphism and association study
USDA-ARS's Scientific Manuscript database
Using PCR and IPCR techniques we obtained a 4498 bp nucleotide sequence FN424076 encompassing the complete coding sequence of the porcine IRS4 gene and its proximal promoter. The 1269-amino acid porcine protein deduced from the nucleotide sequence shares 92% identity with the human IRS4 and possesse...
Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila
2010-07-16
Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
Ito, M; Mori, Y; Oiso, Y; Saito, H
1991-01-01
To elucidate the molecular mechanism of familial central diabetes insipidus (FDI), we sequenced the arginine vasopressin-neurophysin II (AVP-NPII) gene in 2 patients belonging to a pedigree that is consistent with an autosomal dominant mode of inheritance. 10 patients with idiopathic central diabetes insipidus (IDI) and 5 normal subjects were also studied. The AVP-NPII gene, located on chromosome 20, consists of three exons that encode putative signal peptide, AVP, NPII, and glycoprotein. Using polymerase chain reaction, fragments including the promoter region and all coding regions were amplified from genomic DNA and subjected to direct sequencing. Sequences of 10 patients with IDI were identical with those of normal subjects, while in 2 patients with FDI, a single base substitution was detected in one of two alleles of the AVP-NPII gene, indicating they were heterozygotes for this mutation. It was a G→A transition at nucleotide position 1859 in the second exon, resulting in a substitution of Gly for Ser at amino acid position 57 in the NPII moiety. It was speculated that the mutated AVP-NPII precursor or the mutated NPII molecule, through their conformational changes, might be responsible for AVP deficiency. PMID:1840604
A mechanism for exon skipping caused by nonsense or missense mutations in BRCA1 and other genes.
Liu, H X; Cartegni, L; Zhang, M Q; Krainer, A R
2001-01-01
Point mutations can generate defective and sometimes harmful proteins. The nonsense-mediated mRNA decay (NMD) pathway minimizes the potential damage caused by nonsense mutations. In-frame nonsense codons located at a minimum distance upstream of the last exon-exon junction are recognized as premature termination codons (PTCs), targeting the mRNA for degradation. Some nonsense mutations cause skipping of one or more exons, presumably during pre-mRNA splicing in the nucleus; this phenomenon is termed nonsense-mediated altered splicing (NAS), and its underlying mechanism is unclear. By analyzing NAS in BRCA1, we show here that inappropriate exon skipping can be reproduced in vitro, and results from disruption of a splicing enhancer in the coding sequence. Enhancers can be disrupted by single nonsense, missense and translationally silent point mutations, without recognition of an open reading frame as such. These results argue against a nuclear reading-frame scanning mechanism for NAS. Coding-region single-nucleotide polymorphisms (cSNPs) within exonic splicing enhancers or silencers may affect the patterns or efficiency of mRNA splicing, which may in turn cause phenotypic variability and variable penetrance of mutations elsewhere in a gene.
Electrical detection and quantification of single and mixed DNA nucleotides in suspension
NASA Astrophysics Data System (ADS)
Ahmad, Mahmoud Al; Panicker, Neena G.; Rizvi, Tahir A.; Mustafa, Farah
2016-09-01
High speed sequential identification of the building blocks of DNA (deoxyribonucleotides, or nucleotides for short) without labeling or processing in long reads of DNA is the need of the hour. This can be accomplished by exploiting their unique electrical properties. In this study, the four different types of nucleotides that constitute a DNA molecule were suspended in a buffer, and several types of electrical measurements were then performed. These electrical parameters were then used to quantify the suspended DNA nucleotides. Thus, we present a purely electrical counting scheme based on semiconductor theory that allows one to determine the number of nucleotides in a solution by measuring their capacitance-voltage dependency. The nucleotide count was observed to be similar to the product of the corresponding dopant concentration and Debye volume after de-embedding the buffer contribution. The presented approach allows for a fast and label-free quantification of single and mixed nucleotides in a solution.
Wang, Xiaodan; Ma, Dehong; Huang, Xinwei; Li, Lihua; Li, Duo; Zhao, Yujiao; Qiu, Lijuan; Pan, Yue; Chen, Junying; Xi, Juemin; Shan, Xiyun; Sun, Qiangming
2017-06-15
In the past few decades, dengue has spread rapidly and is an emerging disease in China. An unexpected dengue outbreak occurred in Xishuangbanna, Yunnan, China, resulting in 1331 patients in 2013. To obtain the complete genome information and perform mutation and evolutionary analysis of the causative agent of this largest outbreak of dengue fever, the viruses were isolated by cell culture and evaluated by genome sequence analysis. Phylogenetic trees were then constructed by the Neighbor-Joining method (MEGA6.0), followed by analysis of nucleotide mutations and amino acid substitutions. The diversity of secondary structure of the E and NS1 proteins was also analyzed. Selection pressures acting on the coding sequences were then estimated with PAML software. The complete genome sequences of two isolated strains (YNSW1, YNSW2) were 10,710 and 10,702 nucleotides in length, respectively. Phylogenetic analysis revealed that both strains were classified as genotype II of DENV-3. The results indicated that both the strains isolated in Xishuangbanna in 2013 and the Laos 2013 strains (KF816161.1, KF816158.1, LC147061.1, LC147059.1, KF816162.1) were most similar to the Bangladesh strain (AY496873.2) from 2002. Compared with DENV-3SS (H87), 62 amino acid substitutions were identified in translated regions, and 38 amino acid substitutions were identified in translated regions compared with the DENV-3 genotype II Bangladesh strain (AY496873.2). 27 (YNSW1) or 28 (YNSW2) single nucleotide changes were observed in structural protein sequences, with 7 (YNSW1) or 8 (YNSW2) non-synonymous mutations compared with AY496873.2. Of them, 4 non-synonymous mutations were identified in E protein sequences (2 in the β-sheet, 2 in the coil). Meanwhile, 117 (YNSW1) or 115 (YNSW2) single nucleotide changes were observed in non-structural protein sequences, with 31 (YNSW1) or 30 (YNSW2) non-synonymous mutations.
Particularly, 14 single nucleotide changes were observed in NS1 sequences with 4/14 non-synonymous substitutions (4 in the coil). Selection pressure analysis revealed no positive selection in the amino acid sites of the genes encoding for structural and non-structural proteins. This study may help understand the intrinsic geographical relatedness of dengue virus 3 and contributes further to research on their infectivity, pathogenicity and vaccine development. Copyright © 2017 Elsevier B.V. All rights reserved.
2012-01-01
Background: Detecting the borders between coding and non-coding regions is an essential step in genome annotation, and information entropy measures are useful for describing the signals in a genome sequence. However, the accuracy of previous border-finding methods based on entropic segmentation still needs to be improved. Methods: In this study, we first applied a new recursive entropic segmentation method to DNA sequences to get preliminary significant cuts. A 22-symbol alphabet is used to capture the differential composition of nucleotide doublets and stop codon patterns along three phases in both DNA strands. This process requires no prior training datasets. Results: Compared with previous segmentation methods, the experimental results on three bacterial genomes, Rickettsia prowazekii, Borrelia burgdorferi and E. coli, show that our approach improves the accuracy of finding the borders between coding and non-coding regions in DNA sequences. Conclusions: This paper presents a new segmentation method for prokaryotes based on Jensen-Rényi divergence with a 22-symbol alphabet. For three bacterial genomes, compared with the A12_JR method, our method raised the accuracy of finding the borders between protein coding and non-coding regions in DNA sequences. PMID:23282225
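The core step of an entropic segmentation can be sketched as a search for the single cut that maximizes a divergence between the two resulting subsequences. The sketch below is only an illustration under stated assumptions: it uses a length-weighted Jensen-Rényi divergence of order 0.5 over the plain 4-letter nucleotide alphabet, not the 22-symbol alphabet of the abstract, and the function names, `min_len`, and the toy sequence are hypothetical. A recursive method would reapply `best_cut` to each half while the divergence stays significant; that stopping rule is not shown.

```python
import math
from collections import Counter

def renyi_entropy(counts, total, alpha=0.5):
    """Rényi entropy (natural log) of order alpha for a table of symbol counts."""
    s = sum((c / total) ** alpha for c in counts.values() if c > 0)
    return math.log(s) / (1.0 - alpha)

def jensen_renyi(left, right, alpha=0.5):
    """Length-weighted Jensen-Rényi divergence between two subsequences."""
    n_l, n_r = len(left), len(right)
    n = n_l + n_r
    c_l, c_r = Counter(left), Counter(right)
    merged = c_l + c_r  # composition of the undivided sequence
    return (renyi_entropy(merged, n, alpha)
            - (n_l / n) * renyi_entropy(c_l, n_l, alpha)
            - (n_r / n) * renyi_entropy(c_r, n_r, alpha))

def best_cut(seq, min_len=5, alpha=0.5):
    """Return (position, divergence) of the single cut with maximal divergence."""
    best = (None, 0.0)
    for i in range(min_len, len(seq) - min_len + 1):
        d = jensen_renyi(seq[:i], seq[i:], alpha)
        if d > best[1]:
            best = (i, d)
    return best

# Toy sequence (assumption): an AT-only block followed by a GC-only block.
toy = "AT" * 20 + "GC" * 20
cut, score = best_cut(toy)
```

On this toy input the maximal divergence falls at the compositional border between the two blocks.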
Bond, C; LaForge, K S; Tian, M; Melia, D; Zhang, S; Borg, L; Gong, J; Schluger, J; Strong, J A; Leal, S M; Tischfield, J A; Kreek, M J; Yu, L
1998-08-04
Opioid drugs play important roles in the clinical management of pain, as well as in the development and treatment of drug abuse. The mu opioid receptor is the primary site of action for the most commonly used opioids, including morphine, heroin, fentanyl, and methadone. By sequencing DNA from 113 former heroin addicts in methadone maintenance and 39 individuals with no history of drug or alcohol abuse or dependence, we have identified five different single-nucleotide polymorphisms (SNPs) in the coding region of the mu opioid receptor gene. The most prevalent SNP is a nucleotide substitution at position 118 (A118G), predicting an amino acid change at a putative N-glycosylation site. This SNP displays an allelic frequency of approximately 10% in our study population. Significant differences in allele distribution were observed among ethnic groups studied. The variant receptor resulting from the A118G SNP did not show altered binding affinities for most opioid peptides and alkaloids tested. However, the A118G variant receptor binds beta-endorphin, an endogenous opioid that activates the mu opioid receptor, approximately three times more tightly than the most common allelic form of the receptor. Furthermore, beta-endorphin is approximately three times more potent at the A118G variant receptor than at the most common allelic form in agonist-induced activation of G protein-coupled potassium channels. These results show that SNPs in the mu opioid receptor gene can alter binding and signal transduction in the resulting receptor and may have implications for normal physiology, therapeutics, and vulnerability to develop or protection from diverse diseases including the addictive diseases.
Kishi, H; Mukai, T; Hirono, A; Fujii, H; Miwa, S; Hori, K
1987-01-01
Fructose-1,6-bisphosphate aldolase A (fructose-bisphosphate aldolase; EC 4.1.2.13) deficiency is an autosomal recessive disorder associated with hereditary hemolytic anemia. To clarify the molecular mechanism of the deficiency at the nucleotide level, we have cloned aldolase A cDNA from a patient's poly(A)+ RNA that was expressed in cultured lymphoblastoid cells. Nucleotide analysis of the patient's aldolase A cDNA showed a substitution of a single nucleotide (adenine to guanine) at position 386 in a coding region. As a result, the 128th amino acid, aspartic acid, was replaced with glycine (GAT to GGT). Furthermore, change of the second letter of the aspartic acid codon extinguished a FokI restriction site (GGATG to GGGTG). Southern blot analysis of the genomic DNA showed the patient carried a homozygous mutation inherited from his parents. When compared with normal human aldolase A, the patient's enzyme from erythrocytes and from cultured lymphoblastoid cells was found to be highly thermolabile, suggesting that this mutation causes a functional defect of the enzyme. To further examine this possibility, the thermal stability of aldolase A of the patient and of a normal control, expressed in Escherichia coli using expression plasmids, was determined. The results of E. coli expression of the mutated aldolase A enzyme confirmed the thermolabile nature of the abnormal enzyme. The Asp-128 is conserved in aldolase A, B, and C of eukaryotes, including an insect, Drosophila, suggesting that the Asp-128 of the aldolase A protein is likely to be an amino acid residue with a crucial role in maintaining the correct spatial structure or in performing the catalytic function of the enzyme. PMID:2825199
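The codon change and the loss of the restriction site described above can be checked with a few lines of code. This is a minimal sketch that uses only the sequences quoted in the abstract (GAT to GGT, GGATG to GGGTG); the helper names and the two-entry codon table are hypothetical and cover only this case.

```python
# Only the two codons mentioned in the abstract are listed (illustrative table).
CODON_TO_AA = {"GAT": "Asp", "GGT": "Gly"}

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def has_site(seq, site="GGATG"):
    """True if the recognition sequence occurs on either strand of seq."""
    s = seq.upper()
    return site in s or revcomp(site) in s

normal = "GGATG"                        # fragment quoted in the abstract
mutant = normal.replace("GAT", "GGT")   # A to G at the second codon position

aa_change = (CODON_TO_AA["GAT"], CODON_TO_AA["GGT"])
site_lost = has_site(normal) and not has_site(mutant)
```

Because the recognition sequence is not palindromic, the sketch checks both strands before concluding that the site is gone.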
Novel methodologies for spectral classification of exon and intron sequences
NASA Astrophysics Data System (ADS)
Kwan, Hon Keung; Kwan, Benjamin Y. M.; Kwan, Jennifer Y. Y.
2012-12-01
Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
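The period-3 measure can be illustrated with the classical 4-sequence Voss representation mentioned in the abstract: one binary indicator sequence per base, with the spectral power evaluated at frequency 1/3. The sketch below is an assumption-laden illustration, not the article's K-Quaternary codes or its thresholding methods (the abstract does not define them); the function name and test sequences are hypothetical.

```python
import cmath

def period3_power(seq):
    """Sum over A, C, G, T of |DFT at frequency 1/3|^2 of the Voss indicator sequences."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # DFT coefficient at frequency 1/3 of the indicator sequence for this base
        coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                    for n, b in enumerate(seq) if b == base)
        total += abs(coeff) ** 2
    return total

periodic = "ATG" * 30        # 90 bases with a strict 3-base periodicity
flat = "ACGT" * 22 + "AC"    # 90 bases with a 4-base periodicity instead

p_periodic = period3_power(periodic)
p_flat = period3_power(flat)
```

A classifier of the kind described would compare such a power value against a threshold; no threshold is given here because the abstract does not state one.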
Brunak, S; Engelbrecht, J
1996-06-01
A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from the clustering of rare codons, but from the similarity of codons coding for very abundant amino acid residues at the N- and C-termini of helices and sheets. No correlation between the positioning of rare codons and the location of structural units was found. The mRNA signals were also compared with conserved nucleotide features of 16S-like ribosomal RNA sequences and related to mechanisms for maintaining the correct reading frame by the ribosome.
Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.
2014-01-01
Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168
De Wachter, R; Neefs, J M; Goris, A; Van de Peer, Y
1992-01-01
The nucleotide sequence of the gene coding for small ribosomal subunit RNA in the basidiomycete Ustilago maydis was determined. It revealed the presence of a group I intron with a length of 411 nucleotides. This is the third occurrence of such an intron discovered in a small subunit rRNA gene encoded by a eukaryotic nuclear genome. The other two occurrences are in Pneumocystis carinii, a fungus of uncertain taxonomic status, and Ankistrodesmus stipitatus, a green alga. The nucleotides of the conserved core structure of 101 group I intron sequences present in different genes and genome types were aligned and their evolutionary relatedness was examined. This revealed a cluster including all group I introns hitherto found in eukaryotic nuclear genes coding for small and large subunit rRNAs. A secondary structure model was designed for the area of the Ustilago maydis small ribosomal subunit RNA precursor where the intron is situated. It shows that the internal guide sequence pairing with the intron boundaries fits between two helices of the small subunit rRNA, and that minimal rearrangement of base pairs suffices to achieve the definitive secondary structure of the 18S rRNA upon splicing. PMID:1561081
Bataillon, Thomas; Duan, Jinjie; Hvilsom, Christina; Jin, Xin; Li, Yingrui; Skov, Laurits; Glemin, Sylvain; Munch, Kasper; Jiang, Tao; Qian, Yu; Hobolth, Asger; Wang, Jun; Mailund, Thomas; Siegismund, Hans R; Schierup, Mikkel H
2015-03-30
We study genome-wide nucleotide diversity in three subspecies of extant chimpanzees using exome capture. After strict filtering, Single Nucleotide Polymorphisms and indels were called and genotyped for greater than 50% of exons at a mean coverage of 35× per individual. Central chimpanzees (Pan troglodytes troglodytes) are the most polymorphic (nucleotide diversity, θw = 0.0023 per site) followed by Eastern (P. t. schweinfurthii) chimpanzees (θw = 0.0016) and Western (P. t. verus) chimpanzees (θw = 0.0008). A demographic scenario of divergence without gene flow fits the patterns of autosomal synonymous nucleotide diversity well except for a signal of recent gene flow from Western into Eastern chimpanzees. The striking contrast in X-linked versus autosomal polymorphism and divergence previously reported in Central chimpanzees is also found in Eastern and Western chimpanzees. We show that the direction of selection statistic exhibits a strong nonmonotonic relationship with the strength of purifying selection S, making it inappropriate for estimating S. We instead use counts in synonymous versus nonsynonymous frequency classes to infer the distribution of S coefficients acting on nonsynonymous mutations in each subspecies. The strength of purifying selection we infer is congruent with the differences in effective sizes of each subspecies: Central chimpanzees are undergoing the strongest purifying selection followed by Eastern and Western chimpanzees. Coding indels show stronger selection against indels changing the reading frame than observed in human populations. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
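The per-site diversity values quoted above are Watterson estimates, which divide the number of segregating sites by the harmonic number of the sample size and by the number of sites surveyed. The following is a minimal sketch of that standard formula; the function names and the input numbers are hypothetical and are not data from the study.

```python
def harmonic(n_chromosomes):
    """a_n = sum_{i=1}^{n-1} 1/i for a sample of n chromosomes."""
    return sum(1.0 / i for i in range(1, n_chromosomes))

def watterson_theta(segregating_sites, n_chromosomes, n_sites):
    """Watterson's estimator of theta per site: S / (a_n * L)."""
    return segregating_sites / (harmonic(n_chromosomes) * n_sites)

# Hypothetical numbers: 11 segregating sites, 4 chromosomes, 1000 sites surveyed.
theta = watterson_theta(11, 4, 1000)
```

The harmonic-number correction is what makes estimates comparable between subspecies sampled at different depths.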
Lin, C S; Sun, Y L; Liu, C Y; Yang, P C; Chang, L C; Cheng, I C; Mao, S J; Huang, M C
1999-08-05
The complete nucleotide sequence of the pig (Sus scrofa) mitochondrial genome, containing 16,613 bp, is presented in this report. The genome does not have a fixed length because of variable numbers of the tandem repeat 5'-CGTGCGTACA in the displacement loop (D-loop). Genes responsible for 12S and 16S rRNAs, 22 tRNAs, and 13 protein-coding regions are found. The genome carries very few intergenic nucleotides with several instances of overlap between protein-coding or tRNA genes, except in the D-loop region. For evaluating the possible evolutionary relationships between Artiodactyla and Cetacea, the nucleotide substitutions and amino acid sequences of 13 protein-coding genes were aligned by pairwise comparisons of the pig, cow, and fin whale. By comparing these sequences, we suggest that there is a closer relationship between the pig and cow than that between either of these species and fin whale. In addition, the accumulation of transversions and gaps in pig 12S and 16S rRNA genes was compared with that in other eutherian species, including cow, fin whale, human, horse, and harbor seal. The results also reveal a close phylogenetic relationship between pig and cow, as compared to fin whale and others. Thus, according to the sequence differences of mitochondrial rRNA genes in eutherian species, the evolutionary separation of pig and cow occurred about 53-60 million years ago.
van Lieburg, A. F.; Verdijk, M. A.; Knoers, V. V.; van Essen, A. J.; Proesmans, W.; Mallmann, R.; Monnens, L. A.; van Oost, B. A.; van Os, C. H.; Deen, P. M.
1994-01-01
Mutations in the X-chromosomal V2 receptor gene are known to cause nephrogenic diabetes insipidus (NDI). Besides the X-linked form, an autosomal mode of inheritance has been described. Recently, mutations in the autosomal gene coding for water-channel aquaporin 2 (AQP2) of the renal collecting duct were reported in an NDI patient. In the present study, missense mutations and a single nucleotide deletion in the aquaporin 2 gene of three NDI patients from consanguineous matings are described. Expression studies in Xenopus oocytes showed that the missense AQP2 proteins are nonfunctional. These results prove that mutations in the AQP2 gene cause autosomal recessive NDI. PMID:7524315
Yi, Ping; Chen, Zhuqin; Zhao, Yan; Guo, Jianxin; Fu, Huabin; Zhou, Yuanguo; Yu, Lili; Li, Li
2009-03-01
The discovery of fetal DNA in maternal plasma has opened up an approach for noninvasive diagnosis. We have now assessed the possibility of detecting single-nucleotide differences between fetal and maternal DNA in maternal plasma by polymerase chain reaction (PCR)/ligase detection reaction (LDR)/capillary electrophoresis. PCR/LDR/capillary electrophoresis was applied to detect the genotype of c.454-397T>C in the estrogen receptor alpha gene (ESR1) from experimental DNA models of maternal plasma at different sensitivity levels and 13 maternal plasma samples. (1) Our results demonstrated that the technique could discriminate low abundance single-nucleotide mutation with a mutant/normal allele ratio up to 1:10 000. (2) Examination of ESR1 c.454-397T>C genotypes by using the method of restriction fragment length analysis was performed in 25 pregnant women, of whom 13 pregnant women had homozygous genotypes. The c.454-397T>C genotypes of paternally inherited fetal DNA in maternal plasma of these 13 women were detected by PCR/LDR/capillary electrophoresis, which were concordant with the results of umbilical cord blood. PCR/LDR/capillary electrophoresis has very high sensitivity to distinguish low abundance single nucleotide differences and can discriminate point mutations and single-nucleotide polymorphisms (SNPs) of paternally inherited fetal DNA in maternal plasma.
Focareta, T; Manning, P A
1987-01-01
The gene encoding the extracellular DNase of Vibrio cholerae was cloned into Escherichia coli K-12. A maximal coding region of 1.2 kb and a minimal region of 0.6 kb were determined by transposon mutagenesis and deletion analysis. The nucleotide sequence of this region contained a single open reading frame of 690 bp corresponding to a protein of Mr 26,389 with a typical N-terminal signal sequence of 18 aa which, when removed, would give a mature protein of Mr 24,163. This is in good agreement with the size of 24 kDa, calculated directly by Coomassie blue staining following sodium dodecyl sulphate-polyacrylamide gel electrophoresis and indirectly via a DNA-hydrolysis assay. The protein is located in the periplasmic space of E. coli K-12 unlike in V. cholerae where it is excreted into the extracellular medium. The introduction of the DNase gene into a periplasmic (tolA) leaky mutant of E. coli K-12 facilitates the release of the protein, further confirming the periplasmic location.
Skorczyk, Anna; Flisikowski, Krzysztof; Switonski, Marek
2012-05-01
Numerous mutations of the human melanocortin receptor type 4 (MC4R) gene are responsible for monogenic obesity, and some of them appear to be associated with predisposition or resistance to polygenic obesity. Thus, this gene is considered a functional candidate for fat tissue accumulation and body weight in domestic mammals. The aim of the study was a comparative analysis of chromosome localization, nucleotide sequence, and polymorphism of the MC4R gene in two farmed species of the Canidae family, namely the Chinese raccoon dog (Nyctereutes procyonoides procyonoides) and the arctic fox (Alopex lagopus). The whole coding sequence, including fragments of 3'UTR and 5'UTR, shows 89% similarity between the arctic fox (1276 bp) and Chinese raccoon dog (1213 bp). Altogether, 30 farmed Chinese raccoon dogs and 30 farmed arctic foxes were searched for polymorphisms. In the Chinese raccoon dog, only one silent substitution in the coding sequence was identified, whereas in the arctic fox, four InDels and two single-nucleotide polymorphisms (SNPs) in the 5'UTR and six silent SNPs in the exon were found. The studied gene was mapped by FISH to the Chinese raccoon dog chromosome 9 (NPP9q1.2) and arctic fox chromosome 24 (ALA24q1.2-1.3). The obtained results are discussed in terms of genome evolution of species belonging to the family Canidae and their potential use in animal breeding.
Beaton, Derek; Dunlop, Joseph; Abdi, Hervé
2016-12-01
For nearly a century, detecting the genetic contributions to cognitive and behavioral phenomena has been a core interest for psychological research. Recently, this interest has been reinvigorated by the availability of genotyping technologies (e.g., microarrays) that provide new genetic data, such as single nucleotide polymorphisms (SNPs). These SNPs, which represent pairs of nucleotide letters (e.g., AA, AG, or GG) found at specific positions on human chromosomes, are best considered as categorical variables, but this coding scheme can make difficult the multivariate analysis of their relationships with behavioral measurements, because most multivariate techniques developed for the analysis between sets of variables are designed for quantitative variables. To palliate this problem, we present a generalization of partial least squares (a technique used to extract the information common to 2 different data tables measured on the same observations) called partial least squares correspondence analysis, which is specifically tailored for the analysis of categorical and mixed ("heterogeneous") data types. Here, we formally define and illustrate, in a tutorial format, how partial least squares correspondence analysis extends to various types of data and design problems that are particularly relevant for psychological research that include genetic data. We illustrate partial least squares correspondence analysis with genetic, behavioral, and neuroimaging data from the Alzheimer's Disease Neuroimaging Initiative. R code is available on the Comprehensive R Archive Network and via the authors' websites. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
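Treating SNP genotypes as categorical variables usually means expanding each SNP into indicator columns, one per genotype level, before a correspondence-analysis-style method is applied. The sketch below shows such a disjunctive (one-hot) coding in Python; it is an assumption about the preprocessing, not the authors' R implementation, and the function name and toy genotypes are hypothetical.

```python
def disjunctive_code(genotypes, levels=None):
    """Expand one categorical SNP into 0/1 indicator columns, one per genotype level."""
    levels = sorted(set(genotypes)) if levels is None else list(levels)
    rows = [[1 if g == lvl else 0 for lvl in levels] for g in genotypes]
    return levels, rows

# Toy data (assumption): genotypes of four participants at a single SNP.
levels, rows = disjunctive_code(["AA", "AG", "GG", "AG"])
```

Each row sums to 1, so the coded table records which genotype a participant carries without implying any numeric ordering among AA, AG, and GG.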
Campo, Daniel; García-Vázquez, Eva
2012-01-01
The 5S rDNA is organized in the genome as tandemly repeated copies of a structural unit composed of a coding sequence plus a nontranscribed spacer (NTS). The coding region is highly conserved in evolution, whereas the NTS varies in both length and sequence. It has been proposed that 5S rRNA genes are members of a gene family that have arisen through concerted evolution. In this study, we describe the molecular organization and evolution of the 5S rDNA in the genera Lepidorhombus and Scophthalmus (Scophthalmidae) and compare it with the already known 5S rDNA of the very different genera Merluccius (Merluccidae) and Salmo (Salmoninae), to identify common structural elements or patterns for understanding 5S rDNA evolution in fish. High intra- and interspecific diversity within the 5S rDNA family in all the genera can be explained by a combination of duplications, deletions, and transposition events. Sequence blocks with high similarity in all the 5S rDNA members across species were identified for the four studied genera, with evidence of intense gene conversion within noncoding regions. We propose a model to explain the evolution of the 5S rDNA, in which the evolutionary units are blocks of nucleotides rather than the entire sequences or single nucleotides. This model implies a "two-speed" evolution: slow within blocks (homogenized by recombination) and fast within the gene family (diversified by duplications and deletions).
Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki
2015-01-01
Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are unidentifiable in some patients; the failure to detect mutations is presumably due to mutations in other causative genes or outside the analyzed regions of PTCH1, or to copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions. PMID:26544948
Palacios-Flores, Kim; García-Sotelo, Jair; Castillo, Alejandra; Uribe, Carina; Aguilar, Luis; Morales, Lucía; Gómez-Romero, Laura; Reyes, José; Garciarubio, Alejandro; Boege, Margareta; Dávila, Guillermo
2018-04-01
We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository. Copyright © 2018 by the Genetics Society of America.
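The perfect-match idea can be illustrated with a toy landscape: for every position of the reference, count how many query reads contain the k-mer starting there with no mismatch, and look for stretches where the count collapses. This is only a sketch under stated assumptions, not the authors' PMGL pipeline; the k-mer length, function names, sequences, and the use of a single full-length "read" are all hypothetical.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurring in the query reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def perfect_match_landscape(reference, reads, k):
    """For each reference position, number of exact matches of its k-mer in the reads."""
    counts = kmer_counts(reads, k)
    return [counts[reference[i:i + k]] for i in range(len(reference) - k + 1)]

reference = "ACGTTGCAAGGCTTACGATC"
# Hypothetical query genome with one single nucleotide variant at index 10 (G to T).
query = reference[:10] + "T" + reference[11:]
reads = [query]  # one full-length read keeps the toy example small

landscape = perfect_match_landscape(reference, reads, k=5)
zero_positions = [i for i, c in enumerate(landscape) if c == 0]
```

In this toy case the single variant removes the perfect matches for the k consecutive reference k-mers that overlap it, which is the kind of signature that locates a variant before any targeted alignment resolves its nature.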
Yanagi, Kumiko; Kaname, Tadashi; Wakui, Keiko; Hashimoto, Ohiko; Fukushima, Yoshimitsu; Naritomi, Kenji
2012-01-01
Mutations in the X-linked genes neuroligin 3 (NLGN3) and neuroligin 4X (NLGN4X) were first implicated in the pathogenesis of X-linked autism in Swedish families. However, reports of mutations in these genes in autism spectrum disorder (ASD) patients from various ethnic backgrounds present conflicting results regarding the etiology of ASD, possibly because of genetic heterogeneity and/or differences in their ethnic background. An additional mutation screening study in another ethnic background could help to clarify the relevance of the genes to ASD. We scanned the entire coding regions of NLGN3 and NLGN4X in 62 Japanese patients with ASD by polymerase chain reaction-high-resolution melting curve and direct sequencing analyses. Four synonymous substitutions, one in NLGN3 and three in NLGN4X, were identified in four of the 62 patients. These substitutions were not present in 278 control X-chromosomes from unrelated Japanese individuals and were not registered in the database of Single Nucleotide Polymorphisms build 132 or in the Japanese Single Nucleotide Polymorphisms database, indicating that they were novel and specific to ASD. Though further analysis is necessary to determine the physiological and clinical importance of such substitutions, the possible relevance of both synonymous and nonsynonymous substitutions to the etiology of ASD should be considered. PMID:22934180
Ghalandari, Hamid; Hosseini-Esfahani, Firoozeh; Mirmiran, Parvin
2015-07-01
Leptin and ghrelin are two important appetite and energy balance-regulating peptides. Common polymorphisms in the genes coding for these peptides and their related receptors have been shown to be associated with body weight, different markers of obesity and metabolic abnormalities. This review article aims to investigate the association of common polymorphisms of these genes with overweight/obesity and the metabolic disturbances related to it. The keywords leptin, ghrelin, polymorphism, single-nucleotide polymorphism (SNP), obesity, overweight, Body Mass Index, metabolic syndrome, and type 2 diabetes mellitus (T2DM) (MeSH headings) were used to search the following databases: Pubmed, Sciencedirect (Elsevier), and Google scholar. Overall, 24 case-control studies relevant to our topic met the criteria and were included in the review. The most prevalent leptin/leptin receptor genes (LEP/LEPR) and ghrelin/ghrelin receptor genes (GHRL/GHSR) single nucleotide polymorphisms studied were LEP G-2548A, LEPR Q223R, and Leu72Met, respectively. Nine of the 17 studies on LEP/LEPR and three of the seven studies on GHRL/GHSR showed significant relationships. In general, our study suggests that the association between LEP/LEPR and GHRL/GHSR with overweight/obesity and the related metabolic disturbances is inconclusive. These results may be due to unidentified gene-environment interactions. More investigations are needed to further clarify this association.
Shalia, Kavita; Saranath, Dhananjaya; Rayar, Jaipreet; Shah, Vinod K.; Mashru, Manoj R.; Soneji, Surendra L.
2017-01-01
Background & objectives: Acute myocardial infarction (AMI) is a major health concern in India. The aim of the study was to identify single nucleotide polymorphisms (SNPs) associated with AMI in patients using a dedicated chip and to validate the identified SNPs on custom-designed chips using high-throughput microarray analysis. Methods: In the pilot phase, 48 AMI patients and 48 healthy controls were screened for SNPs using the human CVD55K BeadChip with 48,472 SNP probes on the Illumina high-throughput microarray platform. The identified SNPs were validated by genotyping an additional 160 patients and 179 controls using a custom-made Illumina VeraCode GoldenGate Genotyping Assay. Analysis was carried out using PLINK software. Results: From the pilot phase, 98 SNPs present on 94 genes were identified with increased risk of AMI (odds ratio of 1.84-8.85, P=0.04861-0.003337). Five of these SNPs demonstrated association with AMI in the validation phase (P<0.05). Among these, one SNP, rs9978223 on the interferon gamma receptor 2 [IFNGR2, interferon (IFN)-gamma transducer 1] gene, showed a significant association (P=0.00021) with AMI below the Bonferroni corrected P value (P=0.00061). IFNGR2 is the second subunit of the receptor for IFN-gamma, an important cytokine in inflammatory reactions. Interpretation & conclusions: The study identified an SNP, rs9978223 on the IFNGR2 gene, associated with increased risk of AMI in patients from India. PMID:29434065
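The Bonferroni comparison mentioned above can be sketched as follows. The correction divides the family-wise significance level by the number of tests. The abstract does not state how many tests were corrected for, so the count of 82 used here is purely an assumption, chosen only because 0.05/82 rounds to the reported threshold of 0.00061; the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def passes_bonferroni(p_value, alpha, n_tests):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# ASSUMPTION: 82 tests; the abstract reports only the resulting threshold.
ASSUMED_N_TESTS = 82
threshold = bonferroni_threshold(0.05, ASSUMED_N_TESTS)
rs9978223_significant = passes_bonferroni(0.00021, 0.05, ASSUMED_N_TESTS)
```

Under that assumed test count, the reported P=0.00021 for rs9978223 falls below the corrected threshold, whereas a nominal P value such as 0.04861 would not.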
Drögemüller, Cord; Philipp, Ute; Haase, Bianca; Günzel-Apel, Anne-Rose; Leeb, Tosso
2007-01-01
Coat color dilution in several breeds of dog is characterized by a specific pigmentation phenotype and sometimes accompanied by hair loss and recurrent skin inflammation, the so-called color dilution alopecia or black hair follicular dysplasia. Coat color dilution (d) is inherited as a Mendelian autosomal recessive trait. In a previous study, MLPH polymorphisms showed perfect cosegregation with the dilute phenotype within breeds. However, different dilute haplotypes were found in different breeds, and no single polymorphism was identified in the coding sequence that was likely to be causative for the dilute phenotype. We resequenced the 5'-region of the canine MLPH gene and identified a strong candidate single nucleotide polymorphism within the nontranslated exon 1, which showed perfect association with the dilute phenotype in 65 dilute dogs from 7 different breeds. The A/G polymorphism is located at the last nucleotide of exon 1 and the mutant A-allele is predicted to reduce splicing efficiency 8-fold. An MLPH mRNA expression study using quantitative reverse transcriptase-polymerase chain reaction confirmed that dd animals had only approximately 25% of the MLPH transcript compared with DD animals. These results provide preliminary evidence that the reported regulatory MLPH mutation might represent a causal mutation for coat color dilution in dogs.
Aspergillus and Penicillium identification using DNA sequences: Barcode or MLST?
USDA-ARS's Scientific Manuscript database
Current methods in DNA technology can detect single nucleotide polymorphisms with measurable accuracy using several different approaches appropriate for different uses. If there are even single nucleotide differences that are invariant markers of the species, we can accomplish identification through...
Grasse, Wolfgang; Spring, Otmar
2015-03-01
Plasmopara halstedii virus (PhV) is a ss(+)RNA virus that exclusively occurs in the sunflower downy mildew pathogen Plasmopara halstedii, a biotrophic oomycete of severe economic impact. The virus origin and its genomic variability are unknown. A PCR-based screening of 128 samples of P. halstedii from five continents and up to 40 y old was conducted. PhV RNA was found in over 90 % of the isolates with no correlation to geographic origin or pathotype of its host. Sequence analyses of the two open reading frames (ORFs) revealed only 18 single nucleotide polymorphisms (SNPs) in 3873 nucleotides. The SNPs had no recognizable effect on the two encoded virus proteins. In 398 nucleotides of the untranslated regions (UTRs) of the RNA 2 strand, eight additional SNPs and one short deletion were found. Modelling experiments revealed no effects of these variations on the secondary structure of the RNA. The results showed the presence of PhV in P. halstedii isolates of global origin and the existence of the virus for more than 40 y. The virus genome revealed a surprisingly low variation in both coding and noncoding parts. No sequence differences were correlated with host pathotype or geographic populations of the oomycete. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Marra, M A; Prasad, S S; Baillie, D L
1993-01-01
A previous study of genomic organization described the identification of nine potential coding regions in 150 kb of genomic DNA from the unc-22(IV) region of Caenorhabditis elegans. In this study, we focus on the genomic organization of a small interval of 0.1 map unit bordered on the right by unc-22 and on the left by the left-hand breakpoints of the deficiencies sDf9, sDf19 and sDf65. This small interval at present contains a single mutagenically defined locus, the essential gene let-56. The cosmid C11F2 has previously been used to rescue let-56. Therefore, at least some of C11F2 must reside in the interval. In this paper, we report the characterization of two coding elements that reside on C11F2. Analysis of nucleotide sequence data obtained from cDNAs and cosmid subclones revealed that one of the coding elements closely resembles aromatic amino acid decarboxylases from several species. The other of these coding elements was found to closely resemble a human growth factor activatable Na+/H+ antiporter. Pairs of oligonucleotide primers, predicted from both coding elements, have been used in PCR experiments to position these coding elements between the left breakpoint of sDf19 and the left breakpoint of sDf65, between the essential genes let-653 and let-56.
Kopf, Matthias; Klähn, Stephan; Scholz, Ingeborg; Hess, Wolfgang R.; Voß, Björn
2015-01-01
In all studied organisms, a substantial portion of the transcriptome consists of non-coding RNAs that frequently execute regulatory functions. Here, we have compared the primary transcriptomes of the cyanobacteria Synechocystis sp. PCC 6714 and PCC 6803 under 10 different conditions. These strains share 2854 protein-coding genes and a 16S rRNA identity of 99.4%, indicating their close relatedness. Conserved major transcriptional start sites (TSSs) give rise to non-coding transcripts within the sigB gene, from the 5′UTRs of cmpA and isiA, and 168 loci in antisense orientation. Distinct differences include single nucleotide polymorphisms rendering promoters inactive in one of the strains, e.g., for cmpR and for the asRNA PsbA2R. Based on the genome-wide mapped location, regulation and classification of TSSs, non-coding transcripts were identified as the most dynamic component of the transcriptome. We identified a class of mRNAs that originate by read-through from an sRNA that accumulates as a discrete and abundant transcript while also serving as the 5′UTR. Such an sRNA/mRNA structure, which we name ‘actuaton’, represents another way for bacteria to remodel their transcriptional network. Our findings support the hypothesis that variations in the non-coding transcriptome constitute a major evolutionary element of inter-strain divergence and capability for physiological adaptation. PMID:25902393
2013-01-01
Demand for nonnutritive sweeteners continues to increase due to their ability to provide desirable sweetness with minimal calories. Acesulfame potassium and saccharin are well-studied nonnutritive sweeteners commonly found in food products. Some individuals report aversive sensations from these sweeteners, such as bitter and metallic side tastes. Recent advances in molecular genetics have provided insight into the cause of perceptual differences across people. For example, common alleles for the genes TAS2R9 and TAS2R38 explain variable response to the bitter drugs ofloxacin in vitro and propylthiouracil in vivo. Here, we wanted to determine whether differences in the bitterness of acesulfame potassium could be predicted by common polymorphisms (genetic variants) in bitter taste receptor genes (TAS2Rs). We genotyped participants (n = 108) for putatively functional single nucleotide polymorphisms in 5 TAS2Rs and asked them to rate the bitterness of 25 mM acesulfame potassium on a general labeled magnitude scale. Consistent with prior reports, we found 2 single nucleotide polymorphisms in TAS2R31 were associated with acesulfame potassium bitterness. However, TAS2R9 alleles also predicted additional variation in acesulfame potassium bitterness. Conversely, single nucleotide polymorphisms in TAS2R4, TAS2R38, and near TAS2R16 were not significant predictors. Using 1 single nucleotide polymorphism each from TAS2R9 and TAS2R31, we modeled the simultaneous influence of these single nucleotide polymorphisms on acesulfame potassium bitterness; together, these 2 single nucleotide polymorphisms explained 13.4% of the variance in perceived bitterness. These data suggest multiple polymorphisms within TAS2Rs contribute to the ability to perceive the bitterness from acesulfame potassium. PMID:23599216
Nucleotide cleaving agents and method
Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.
2000-01-01
The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...
Hobbs, A A; Rosen, J M
1982-01-01
The complete sequences of rat alpha- and gamma-casein mRNAs have been determined. The 1402-nucleotide alpha- and 864-nucleotide gamma-casein mRNAs both encode 15 amino acid signal peptides and mature proteins of 269 and 164 residues, respectively. Considerable homology between the 5' non-coding regions, and the regions encoding the signal peptides and the phosphorylation sites, in these mRNAs as compared to several other rodent casein mRNAs, was observed. Significant homology was also detected between rat alpha- and bovine alpha s1-casein. Comparison of the rodent and bovine sequences suggests that the caseins evolved at about the time of the appearance of the primitive mammals. This may have occurred by intragenic duplication of a nucleotide sequence encoding a primitive phosphorylation site, -(Ser)n-Glu-Glu-, and intergenic duplication resulting in the small casein multigene family. A unique feature of the rat alpha-casein sequence is an insertion in the coding region containing 10 repeated elements of 18 nucleotides each. This insertion appears to have occurred 7-12 million years ago, just prior to the divergence of rat and mouse. PMID:6298707
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pham-Dinh, D.; Gaspera, D.B.; Dautigny, A.
1995-09-20
Myelin/oligodendrocyte glycoprotein (MOG), a special component of the central nervous system localized on the outermost lamellae of mature myelin, is a member of the immunoglobulin superfamily. We report here the organization of the human MOG gene, which spans approximately 17 kb, and the characterization of six MOG mRNA splicing variants. The intron/exon structure of the human MOG gene confirmed the splicing pattern, supporting the hypothesis that mRNA isoforms could arise by alternative splicing of a single gene. In addition to the eight exons coding for the major MOG isoform, the human MOG gene also contains, in the 3' region, a previously unknown alternatively spliced coding exon, VIA. Alternative utilization of two acceptor splicing sites for exon VIII could produce two different C-termini. The nucleotide sequences presented here may be a useful tool to study further the possible involvement of the MOG gene in hereditary neurological disorders. 23 refs., 5 figs.
Splendore, A; Silva, E O; Alonso, L G; Richieri-Costa, A; Alonso, N; Rosa, A; Carakushanky, G; Cavalcanti, D P; Brunoni, D; Passos-Bueno, M R
2000-10-01
Twenty-eight families with a clinical diagnosis of Treacher Collins syndrome were screened for mutations in the 25 coding exons of TCOF1 and their adjacent splice junctions through SSCP and direct sequencing. Pathogenic mutations were detected in 26 patients, yielding the highest detection rate reported so far for this disease (93%) and bringing the number of known disease-causing mutations from 35 to 51. This is the first report to describe clustering of pathogenic mutations. Thirteen novel polymorphic alterations were characterized, confirming previous reports that TCOF1 has an unusually high rate of single-nucleotide polymorphisms (SNPs) within its coding region. We suggest a possible different mechanism leading to TCS or genetic heterogeneity for this condition, as we identified two families with no apparent pathogenic mutation in the gene. Furthermore, our data confirm the absence of genotype-phenotype correlation and reinforce that the apparent anticipation often observed in TCS families is due to ascertainment bias. Copyright 2000 Wiley-Liss, Inc.
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...
2016-09-20
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
The human genome contracts again.
Pavlichin, Dmitri S; Weissman, Tsachy; Yona, Golan
2013-09-01
The number of human genomes that have been sequenced completely for different individuals has increased rapidly in recent years. Storing and transferring complete genomes between computers for the purpose of applying various applications and analysis tools will soon become a major hurdle, hindering the analysis phase. Therefore, there is a growing need to compress these data efficiently. Here, we describe a technique to compress human genomes based on entropy coding, using a reference genome and known Single Nucleotide Polymorphisms (SNPs). Furthermore, we explore several intrinsic features of genomes and information in other genomic databases to further improve the compression attained. Using these methods, we compress James Watson's genome to 2.5 megabytes (MB), improving on recent work by 37%. Similar compression is obtained for most genomes available from the 1000 Genomes Project. Our biologically inspired techniques promise even greater gains for genomes of lower organisms and for human genomes as more genomic data become available. Code is available at sourceforge.net/projects/genomezip/
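The core of reference-based compression can be sketched as below. This is a simplified illustration, not the authors' method: it assumes the individual genome has the same length as the reference (substitutions only, no insertions or deletions), it does not use a catalogue of known SNPs, and it stops before the entropy-coding stage. The function names are hypothetical. The genome is stored as a list of (gap, base) pairs, where gap is the number of matching bases since the previous difference.

```python
def encode_against_reference(reference, genome):
    """Encode genome as (gap, base) pairs relative to reference.
    ASSUMPTION: equal lengths, i.e. substitutions only."""
    if len(reference) != len(genome):
        raise ValueError("this sketch handles substitutions only")
    edits = []
    last = -1  # position of the previous difference
    for pos, (ref_base, base) in enumerate(zip(reference, genome)):
        if ref_base != base:
            edits.append((pos - last - 1, base))
            last = pos
    return edits


def decode_against_reference(reference, edits):
    """Rebuild the genome from the reference and the (gap, base) pairs."""
    bases = list(reference)
    pos = -1
    for gap, base in edits:
        pos += gap + 1
        bases[pos] = base
    return "".join(bases)


# Hypothetical example with two substitutions (indices 3 and 8).
reference = "ACGTACGTAC"
genome = "ACGAACGTTC"
edits = encode_against_reference(reference, genome)
restored = decode_against_reference(reference, edits)
```

Storing gaps rather than absolute positions keeps the numbers small, which is what would make a subsequent entropy coder effective; that stage, and the use of known SNP positions to shorten common variants further, are left out of this sketch.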
Cis-acting RNA elements in the Hepatitis C virus RNA genome
Sagan, Selena M.; Chahal, Jasmin; Sarnow, Peter
2017-01-01
Hepatitis C virus (HCV) infection is a rapidly increasing global health problem with an estimated 170 million people infected worldwide. HCV is a hepatotropic, positive-sense RNA virus of the family Flaviviridae. As a positive-sense RNA virus, the HCV genome itself must serve as a template for translation, replication and packaging. The viral RNA must therefore be a dynamic structure that is able to readily accommodate structural changes to expose different regions of the genome to viral and cellular proteins to carry out the HCV life cycle. The ∼9600 nucleotide viral genome contains a single long open reading frame flanked by 5′ and 3′ non-coding regions that contain cis-acting RNA elements important for viral translation, replication and stability. Additional cis-acting RNA elements have also been identified in the coding sequences as well as in the 3′ end of the negative-strand replicative intermediate. Herein, we provide an overview of the importance of these cis-acting RNA elements in the HCV life cycle. PMID:25576644
Advanced complex trait analysis.
Gray, A; Stewart, I; Tenesa, A
2012-12-01
The Genome-wide Complex Trait Analysis (GCTA) software package can quantify the contribution of genetic variation to phenotypic variation for complex traits. However, as those datasets of interest continue to increase in size, GCTA becomes increasingly computationally prohibitive. We present an adapted version, Advanced Complex Trait Analysis (ACTA), demonstrating dramatically improved performance. We restructure the genetic relationship matrix (GRM) estimation phase of the code and introduce the highly optimized parallel Basic Linear Algebra Subprograms (BLAS) library combined with manual parallelization and optimization. We introduce the Linear Algebra PACKage (LAPACK) library into the restricted maximum likelihood (REML) analysis stage. For a test case with 8999 individuals and 279,435 single nucleotide polymorphisms (SNPs), we reduce the total runtime, using a compute node with two multi-core Intel Nehalem CPUs, from ∼17 h to ∼11 min. The source code is fully available under the GNU Public License, along with Linux binaries. For more information see http://www.epcc.ed.ac.uk/software-products/acta. Contact: a.gray@ed.ac.uk. Supplementary data are available at Bioinformatics online.
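To make the GRM estimation step concrete, here is a minimal pure-Python sketch. It assumes the commonly used estimator in which each SNP's genotype counts (0, 1 or 2 copies of an allele) are centred by twice the allele frequency and scaled by the square root of 2p(1-p), and the GRM is the average over SNPs of the products of these standardized values. The abstract does not spell out the formula, so treat this as an assumption; the function name is hypothetical, and skipping monomorphic SNPs is a choice made only for this sketch.

```python
import math


def genetic_relationship_matrix(genotypes):
    """Estimate a GRM from genotypes[individual][snp] in {0, 1, 2}.
    ASSUMPTION: A = Z Z^T / m with Z the standardized genotype matrix."""
    n = len(genotypes)
    standardized = [[] for _ in range(n)]
    for snp in range(len(genotypes[0])):
        column = [row[snp] for row in genotypes]
        p = sum(column) / (2 * n)  # sample allele frequency
        if p <= 0.0 or p >= 1.0:
            continue  # monomorphic SNP: carries no information, skip it
        scale = math.sqrt(2 * p * (1 - p))
        for j in range(n):
            standardized[j].append((column[j] - 2 * p) / scale)
    m = len(standardized[0])
    if m == 0:
        raise ValueError("no polymorphic SNPs")
    # Dense product Z Z^T / m, written out as explicit dot products.
    return [
        [sum(a * b for a, b in zip(standardized[j], standardized[k])) / m
         for k in range(n)]
        for j in range(n)
    ]


# Hypothetical toy data: 3 individuals, 3 SNPs.
grm = genetic_relationship_matrix([[0, 1, 2], [1, 1, 0], [2, 0, 1]])
```

The final step is a dense matrix product, which is the kind of operation an optimized BLAS routine accelerates; the nested Python loops here are for readability only and would not scale to thousands of individuals.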
Opioid system genes in alcoholism: a case-control study in Croatian population.
Cupic, B; Stefulj, J; Zapletal, E; Matosic, A; Bordukalo-Niksic, T; Cicin-Sain, L; Gabrilovac, J
2013-10-01
Due to their involvement in dependence pathways, opioid system genes represent strong candidates for association studies investigating alcoholism. In this study, single nucleotide polymorphisms within the genes for mu (OPRM1) and kappa (OPRK1) opioid receptors and precursors of their ligands - proopiomelanocortin (POMC), coding for beta-endorphin, and prodynorphin (PDYN), coding for dynorphins - were analyzed in a case-control study that included 354 male alcohol-dependent and 357 male control subjects from the Croatian population. Analysis of allele and genotype frequencies of the selected polymorphisms of the genes OPRM1/POMC and OPRK1/PDYN revealed no differences between the tested groups. The same was true when alcohol-dependent persons were subdivided according to Cloninger's criteria into type-1 and type-2 groups, known to differ in the extent of genetic control. Thus, the data obtained suggest no association of the selected polymorphisms of the genes OPRM1/POMC and OPRK1/PDYN with alcoholism in the Croatian population. Copyright © 2013 Elsevier Ltd. All rights reserved.
McKernan, Kevin J.; Spangler, Jessica; Zhang, Lei; Tadigotla, Vasisht; McLaughlin, Stephen; Warner, Jason; Zare, Amir; Boles, Richard G.
2014-01-01
We have developed a PCR method, coined Déjà vu PCR, that utilizes six nucleotides in PCR with two methyl specific restriction enzymes that respectively digest these additional nucleotides. Use of this enzyme-and-nucleotide combination enables what we term a “DNA diode”, where DNA can advance in a laboratory in only one direction and cannot feedback into upstream assays. Here we describe aspects of this method that enable consecutive amplification with the introduction of a 5th and 6th base while simultaneously providing methylation dependent mitochondrial DNA enrichment. These additional nucleotides enable a novel DNA decontamination technique that generates ephemeral and easy to decontaminate DNA. PMID:24788618
Chen, Sherry Xi; Seelig, Georg
2016-04-20
Even a single-nucleotide difference between the sequences of two otherwise identical biological nucleic acids can have dramatic functional consequences. Here, we use model-guided reaction pathway engineering to quantitatively improve the performance of selective hybridization probes in recognizing single nucleotide variants (SNVs). Specifically, we build a detection system that combines discrimination by competition with DNA strand displacement-based catalytic amplification. We show, both mathematically and experimentally, that the single nucleotide selectivity of such a system in binding to single-stranded DNA and RNA is quadratically better than discrimination due to competitive hybridization alone. As an additional benefit, the integrated circuit inherits the property of amplification and provides at least 10-fold better sensitivity than standard hybridization probes. Moreover, we demonstrate how the detection mechanism can be tuned such that the detection reaction is agnostic to the position of the SNV within the target sequence. In contrast, prior strand displacement-based probes designed for kinetic discrimination are highly sensitive to position effects. We apply our system to reliably discriminate between different members of the let-7 microRNA family that differ in only a single base position. Our results demonstrate the power of systematic reaction network design to quantitatively improve biotechnology.
Schuster, W; Wissinger, B; Unseld, M; Brennicke, A
1990-01-01
A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. PMID:1688531
Zhou, Daling; Du, Qingzhang; Chen, Jinhui; Wang, Qingshi; Zhang, Deqiang
2017-10-01
Long non-coding RNAs (lncRNAs) function in various biological processes. However, their roles in secondary growth of plants remain poorly understood. Here, 15,691 lncRNAs were identified from vascular cambium, developing xylem, and mature xylem of Populus tomentosa with high and low biomass using RNA-seq, including 1,994 lncRNAs that were differentially expressed (DE) among the six libraries. In total, 3,569 cis-regulated and 3,297 trans-regulated protein-coding genes were predicted as potential target genes (PTGs) of the DE lncRNAs to participate in biological regulation. Then, 476 and 28 lncRNAs were identified as putative targets and endogenous target mimics (eTMs) of Populus known microRNAs (miRNAs), respectively. Genome re-sequencing of 435 individuals from a natural population of P. tomentosa found 34,015 single nucleotide polymorphisms (SNPs) within 178 lncRNA loci and 522 PTGs. Single-SNP association analysis detected 2,993 associations with 10 growth and wood-property traits under additive and dominance models. Epistasis analysis identified 17,656 epistatic SNP pairs, providing evidence for potential regulatory interactions between lncRNAs and their PTGs. Furthermore, a reconstructed epistatic network, representing interactions of 8 lncRNAs and 15 PTGs, might enrich regulation roles of genes in the phenylpropanoid pathway. These findings may enhance our understanding of non-coding genes in plants. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Mustafa, Saima; Fatima, Hira; Fatima, Sadia; Khosa, Tafheem; Akbar, Atif; Shaikh, Rehan Sadiq; Iqbal, Furhan
2018-01-01
To find out whether there is a correlation between single nucleotide polymorphisms in the cluster of differentiation 28 and cluster of differentiation 40 genes and Graves' disease. This case-control study was conducted at the Multan Institute of Nuclear Medicine and Radiotherapy, Multan, Pakistan, and comprised blood samples of Graves' disease patients and controls. Various risk factors were also correlated either with the genotype at each single-nucleotide polymorphism or with various combinations of the genotypes studied during the present investigation. Of the 160 samples, there were 80(50%) each from patients and controls. Risk factor analysis revealed that gender (p=0.008), marital status (p<0.001), education (p<0.001), smoking (p<0.001), tri-iodothyronine (p<0.001), thyroxin (p<0.001) and thyroid-stimulating hormone (p<0.000) levels in blood were associated with Graves' disease. None of the single-nucleotide polymorphisms in either gene was associated with Graves' disease, either individually or in any combined form.
Stockley, Jacqueline; Nisar, Shaista P; Leo, Vincenzo C; Sabi, Essa; Cunningham, Margaret R; Eikenboom, Jeroen C; Lethagen, Stefan; Schneppenheim, Reinhard; Goodeve, Anne C; Watson, Steve P; Mundell, Stuart J; Daly, Martina E
2015-01-01
The clinical expression of type 1 von Willebrand disease may be modified by co-inheritance of other mild bleeding diatheses. We previously showed that mutations in the platelet P2Y12 ADP receptor gene (P2RY12) could contribute to the bleeding phenotype in patients with type 1 von Willebrand disease. Here we investigated whether variations in platelet G protein-coupled receptor genes other than P2RY12 also contributed to the bleeding phenotype. Platelet G protein-coupled receptor genes P2RY1, F2R, F2RL3, TBXA2R and PTGIR were sequenced in 146 index cases with type 1 von Willebrand disease and the potential effects of identified single nucleotide variations were assessed using in silico methods and heterologous expression analysis. Seven heterozygous single nucleotide variations were identified in 8 index cases. Two single nucleotide variations were detected in F2R; a novel c.-67G>C transversion which reduced F2R transcriptional activity and a rare c.1063C>T transition predicting a p.L355F substitution which did not interfere with PAR1 expression or signalling. Two synonymous single nucleotide variations were identified in F2RL3 (c.402C>G, p.A134 =; c.1029 G>C p.V343 =), both of which introduced less commonly used codons and were predicted to be deleterious, though neither of them affected PAR4 receptor expression. A third single nucleotide variation in F2RL3 (c.65 C>A; p.T22N) was co-inherited with a synonymous single nucleotide variation in TBXA2R (c.6680 C>T, p.S218 =). Expression and signalling of the p.T22N PAR4 variant was similar to wild-type, while the TBXA2R variation introduced a cryptic splice site that was predicted to cause premature termination of protein translation. The enrichment of single nucleotide variations in G protein-coupled receptor genes among type 1 von Willebrand disease patients supports the view of type 1 von Willebrand disease as a polygenic disorder.
Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue
2016-01-01
DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962
Self-complementary circular codes in coding theory.
Fimmel, Elena; Michel, Christian J; Starman, Martin; Strüngmann, Lutz
2018-04-01
Self-complementary circular codes are involved in pairing genetic processes. A maximal [Formula: see text] self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16 2017, J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph [Formula: see text] associated with any code X mirrors the properties of the code. In the present paper, we demonstrate a necessary condition for the self-complementarity of an arbitrary code X in terms of the graph theory. The same condition has been proven to be sufficient for codes which are circular and of large size [Formula: see text] trinucleotides, in particular for maximal circular codes ([Formula: see text] trinucleotides). For codes of small-size [Formula: see text] trinucleotides, some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with the self-complementary circular codes are investigated. It has been proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result, the reading frame in any arbitrary sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides, from the circular code X identified in genes. Thus, an X motif of a length of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the reading (correct) frame, an important criterion for analyzing the X motifs in genes in the future.
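The two properties discussed in this abstract can be checked mechanically for a small code. The sketch below is an illustration only, not the authors' implementation. It assumes the graph construction usually attributed to Fimmel et al. (2016): each trinucleotide N1N2N3 contributes the edges N1 -> N2N3 and N1N2 -> N3, and a code is circular when that graph has no directed cycle. The 20 trinucleotides listed for X are quoted from the circular-code literature as recalled here; they do not appear in the abstract itself. All function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Assumption: the maximal code X as usually listed in the circular-code
# literature (not given in the abstract above).
X = {
    "AAC", "AAT", "ACC", "ATC", "ATT", "CAG", "CTC", "CTG", "GAA", "GAC",
    "GAG", "GAT", "GCC", "GGC", "GGT", "GTA", "GTC", "GTT", "TAC", "TTC",
}


def reverse_complement(trinucleotide):
    """Return the reverse complement, e.g. AAC -> GTT."""
    return "".join(COMPLEMENT[base] for base in reversed(trinucleotide))


def is_self_complementary(code):
    """A code is self-complementary if it contains the reverse
    complement of each of its trinucleotides."""
    return all(reverse_complement(t) in code for t in code)


def code_graph(code):
    """Build the directed graph with edges N1 -> N2N3 and N1N2 -> N3."""
    edges = {}
    for t in code:
        edges.setdefault(t[0], set()).add(t[1:])
        edges.setdefault(t[:2], set()).add(t[2])
    return edges


def is_acyclic(edges):
    """Depth-first search with three colours; a grey successor is a cycle."""
    white, grey, black = 0, 1, 2
    colour = {}

    def visit(vertex):
        colour[vertex] = grey
        for successor in edges.get(vertex, ()):
            state = colour.get(successor, white)
            if state == grey:
                return False
            if state == white and not visit(successor):
                return False
        colour[vertex] = black
        return True

    return all(visit(v) for v in list(edges) if colour.get(v, white) == white)


if __name__ == "__main__":
    print("self-complementary:", is_self_complementary(X))
    print("graph acyclic:", is_acyclic(code_graph(X)))
```

A single-word code such as {"AAA"} gives the cycle A -> AA -> A, so the same check reports it as not circular, which is a convenient sanity test for the graph construction.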
Detecting Single-Nucleotides by Tunneling Current Measurements at Sub-MHz Temporal Resolution.
Morikawa, Takanori; Yokota, Kazumichi; Tanimoto, Sachie; Tsutsui, Makusu; Taniguchi, Masateru
2017-04-18
Label-free detection of single-nucleotides was performed by fast tunneling current measurements in a polar solvent at 1 MHz sampling rate using SiO₂-protected Au nanoprobes. Short current spikes were observed, suggestive of trapping/detrapping of individual nucleotides between the nanoelectrodes. The fall and rise features of the electrical signatures indicated signal retardation by capacitance effects with a time constant of about 10 microseconds. The high temporal resolution revealed current fluctuations, reflecting the molecular conformation degrees of freedom in the electrode gap. The method presented in this work may enable direct characterizations of dynamic changes in single-molecule conformations in an electrode gap in liquid.
Li, Su-Xia
2004-12-01
Single nucleotide polymorphism (SNP) is the third genetic marker, after restriction fragment length polymorphism (RFLP) and short tandem repeats. It represents the densest genetic variability in the human genome and has been widely used in gene localization, cloning, and research on heritable variation, as well as in parentage identification in forensic medicine. As a stable heritable polymorphism, the SNP is becoming a focus of attention for monitoring chimerism and minimal residual disease in patients after allogeneic hematopoietic stem cell transplantation. The article reviews the hereditary characteristics of SNPs, analysis techniques, and applications in allogeneic stem cell transplantation and other fields.
Su, Zhipeng; Zhu, Jiawen; Xu, Zhuofei; Xiao, Ran; Zhou, Rui; Li, Lu; Chen, Huanchun
2016-01-01
Actinobacillus pleuropneumoniae is the pathogen of porcine contagious pleuropneumonia, a highly contagious respiratory disease of swine. Although the genome of A. pleuropneumoniae was sequenced several years ago, limited information is available on the genome-wide transcriptional analysis to accurately annotate the gene structures and regulatory elements. High-throughput RNA sequencing (RNA-seq) has been applied to study the transcriptional landscape of bacteria, which can efficiently and accurately identify gene expression regions and unknown transcriptional units, especially small non-coding RNAs (sRNAs), UTRs and regulatory regions. The aim of this study is to comprehensively analyze the transcriptome of A. pleuropneumoniae by RNA-seq in order to improve the existing genome annotation and promote our understanding of A. pleuropneumoniae gene structures and RNA-based regulation. In this study, we utilized RNA-seq to construct a single nucleotide resolution transcriptome map of A. pleuropneumoniae. More than 3.8 million high-quality reads (average length ~90 bp) from a cDNA library were generated and aligned to the reference genome. We identified 32 open reading frames encoding novel proteins that were mis-annotated in the previous genome annotations. The start sites for 35 genes based on the current genome annotation were corrected. Furthermore, 51 sRNAs in the A. pleuropneumoniae genome were discovered, of which 40 sRNAs were never reported in previous studies. The transcriptome map also enabled visualization of 5'- and 3'-UTR regions, which contained 11 sRNAs. In addition, 351 operons covering 1230 genes throughout the whole genome were identified. The RNA-Seq based transcriptome map validated annotated genes and corrected annotations of open reading frames in the genome, and led to the identification of many functional elements (e.g. regions encoding novel proteins, non-coding sRNAs and operon structures).
The transcriptional units described in this study provide a foundation for future studies concerning the gene functions and the transcriptional regulatory architectures of this pathogen. PMID:27018591
Mutharasan, Priscilla; Galdones, Eugene; Peñalver Bernabé, Beatriz; Garcia, Obed A; Jafari, Nadereh; Shea, Lonnie D; Woodruff, Teresa K; Legro, Richard S; Dunaif, Andrea; Urbanek, Margrit
2013-01-01
A previous genome-wide association study in Chinese women with polycystic ovary syndrome (PCOS) identified a region on chromosome 2p16.3 encoding the LH/choriogonadotropin receptor (LHCGR) and FSH receptor (FSHR) genes as a reproducible PCOS susceptibility locus. The objective of the study was to determine the role of the LHCGR and/or FSHR gene in the etiology of PCOS in women of European ancestry. This was a genetic association study in a European ancestry cohort of women with PCOS. The study was conducted at an academic medical center. Participants in the study included 905 women with PCOS diagnosed by National Institutes of Health criteria and 956 control women. We genotyped 94 haplotype-tagging single-nucleotide polymorphisms and two coding single-nucleotide polymorphisms mapping to the coding region of LHCGR and FSHR plus 20 kb upstream and downstream of the genes and tested for association in the case-control cohort and for association with nine quantitative traits in the women with PCOS. We found strong evidence for an association of PCOS with rs7562215 (P = 0.0037) and rs10495960 (P = 0.0046). Although the marker with the strongest association in the Chinese PCOS genome-wide association study (rs13405728) was not informative in the European populations, we identified and genotyped three markers (rs35960650, rs2956355, and rs7562879) within 5 kb of rs13405728. Of these, rs7562879 was nominally associated with PCOS (P = 0.020). The strongest evidence for association mapping to FSHR was observed with rs1922476 (P = 0.0053). Furthermore, markers within the FSHR gene region were associated with FSH levels in women with PCOS. Fine mapping of the chromosome 2p16.3 Chinese PCOS susceptibility locus in a European ancestry cohort provides evidence for association with two independent loci and PCOS. The gene products LHCGR and FSHR therefore are likely to be important in the etiology of PCOS, regardless of ethnicity.
EnsembleGASVR: a novel ensemble method for classifying missense single nucleotide polymorphisms.
Rapakoulia, Trisevgeni; Theofilatos, Konstantinos; Kleftogiannis, Dimitrios; Likothanasis, Spiros; Tsakalidis, Athanasios; Mavroudi, Seferina
2014-08-15
Single nucleotide polymorphisms (SNPs) are considered the most frequently occurring DNA sequence variations. Several computational methods have been proposed for the classification of missense SNPs as neutral or disease-associated. However, existing computational approaches fail to select relevant features by choosing them arbitrarily without sufficient documentation. Moreover, they are limited by missing values and by imbalance between the learning datasets, and most of them do not support their predictions with confidence scores. To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR employs a two-step algorithm, which in its first step applies a novel evolutionary embedded algorithm to locate close to optimal Support Vector Regression models. In its second step, these models are combined to extract a universal predictor, which is less prone to overfitting issues, systematizes the rebalancing of the learning sets and uses an internal approach for solving the missing values problem without loss of information. Confidence scores support all the predictions and the model becomes tunable by modifying the classification thresholds. An extensive study was performed for collecting the most relevant features for the problem of classifying SNPs, and a superset of 88 features was constructed. Experimental results show that the proposed framework outperforms well-known algorithms in terms of classification performance in the examined datasets. Finally, the proposed algorithmic framework was able to uncover the significant role of certain features such as the solvent accessibility feature, and the top-scored predictions were further validated by linking them with disease phenotypes. Datasets and codes are freely available on the Web at http://prlab.ceid.upatras.gr/EnsembleGASVR/dataset-codes.zip. All the required information about the article is available through http://prlab.ceid.upatras.gr/EnsembleGASVR/site.html.
© The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Koenig, R; Loss, S; Specht, J; Varrelmann, M; Lüddecke, P; Deml, G
2009-03-01
Beet necrotic yellow vein virus (BNYVV) A type isolates E12 and S8, originating from areas where resistance-breaking had or had not been observed, respectively, served as starting material for studying the influence of sequence variations in BNYVV RNA 3 on virus accumulation in partially resistant sugar beet varieties. Sub-isolates containing only RNAs 1 and 2 were obtained by serial local lesion passages; biologically active cDNA clones were prepared for RNAs 3 which differed in their coding sequences for P25 aa 67, 68 and 129. Sugar beet seedlings were mechanically inoculated with RNA 1+2/RNA 3 pseudorecombinants. The origin of RNAs 1+2 had little influence on virus accumulation in rootlets. E12 RNA 3 coding for V(67)C(68)Y(129) P25, however, enabled a much higher virus accumulation than S8 RNA 3 coding for A(67)H(68)H(129) P25. Mutants revealed that this was due only to the V(67) 'GUU' codon as opposed to the A(67) 'GCU' codon.
USDA-ARS's Scientific Manuscript database
Single-nucleotide polymorphisms (SNPs) are highly abundant markers, which are broadly distributed in animal genomes. For rainbow trout, SNP discovery has been done through sequencing of restriction-site associated DNA (RAD) libraries, reduced representation libraries (RRL), RNA sequencing, and whole...
USDA-ARS's Scientific Manuscript database
High-density single nucleotide polymorphism (SNP) genotyping chips are a powerful tool for studying genomic patterns of diversity, inferring ancestral relationships among individuals in populations and studying marker-trait associations in mapping experiments. We developed a genotyping array includ...
Taira, Chiaki; Matsuda, Kazuyuki; Yamaguchi, Akemi; Sueki, Akane; Koeda, Hiroshi; Takagi, Fumio; Kobayashi, Yukihiro; Sugano, Mitsutoshi; Honda, Takayuki
2013-09-23
Single nucleotide alterations such as single nucleotide polymorphisms (SNP) and single nucleotide mutations are associated with responses to drugs and predisposition to several diseases, and they contribute to the pathogenesis of malignancies. We developed a rapid genotyping assay based on the allele-specific polymerase chain reaction (AS-PCR) with our droplet-PCR machine (droplet-AS-PCR). Using 8 SNP loci, we evaluated the specificity and sensitivity of droplet-AS-PCR. Buccal cells were pretreated with proteinase K and subjected directly to the droplet-AS-PCR without DNA extraction. The genotypes determined using the droplet-AS-PCR were then compared with those obtained by direct sequencing. Specific PCR amplifications for the 8 SNP loci were detected, and the detection limit of the droplet-AS-PCR was found to be 0.1-5.0% by dilution experiments. Droplet-AS-PCR provided specific amplification when using buccal cells, and all the genotypes determined within 9 min were consistent with those obtained by direct sequencing. Our novel droplet-AS-PCR assay enabled high-speed amplification retaining specificity and sensitivity and provided ultra-rapid genotyping. Crude samples such as buccal cells were available for the droplet-AS-PCR assay, resulting in the reduction of the total analysis time. Droplet-AS-PCR may therefore be useful for genotyping or the detection of single nucleotide alterations. Copyright © 2013 Elsevier B.V. All rights reserved.
RNA Editing in Plant Mitochondria
NASA Astrophysics Data System (ADS)
Hiesel, Rudolf; Wissinger, Bernd; Schuster, Wolfgang; Brennicke, Axel
1989-12-01
Comparative sequence analysis of genomic and complementary DNA clones from several mitochondrial genes in the higher plant Oenothera revealed nucleotide sequence divergences between the genomic and the messenger RNA-derived sequences. These sequence alterations could be most easily explained by specific post-transcriptional nucleotide modifications. Most of the nucleotide exchanges in coding regions lead to altered codons in the mRNA that specify amino acids better conserved in evolution than those encoded by the genomic DNA. Several instances show that the genomic arginine codon CGG is edited in the mRNA to the tryptophan codon TGG in amino acid positions that are highly conserved as tryptophan in the homologous proteins of other species. This editing suggests that the standard genetic code is used in plant mitochondria and resolves the frequent coincidence of CGG codons and tryptophan in different plant species. The apparently frequent and non-species-specific equivalency of CGG and TGG codons in particular suggests that RNA editing is a common feature of all higher plant mitochondria.
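The comparison described in this abstract, genomic sequence against cDNA read codon by codon, can be sketched in a few lines. This is a minimal illustration under stated assumptions: both sequences are already aligned, in frame, and of equal length; the standard genetic code applies (as the abstract concludes for plant mitochondria); and the function name `codon_differences` and the toy sequences are invented for the example, not taken from the study.

```python
BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_differences(genomic, cdna):
    """List codons that differ between aligned genomic and cDNA sequences.

    Each entry is (codon_index, genomic_codon, cdna_codon,
    genomic_amino_acid, cdna_amino_acid).
    """
    if len(genomic) != len(cdna) or len(genomic) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    differences = []
    for start in range(0, len(genomic), 3):
        g_codon = genomic[start:start + 3]
        c_codon = cdna[start:start + 3]
        if g_codon != c_codon:
            differences.append(
                (start // 3, g_codon, c_codon,
                 CODON_TABLE[g_codon], CODON_TABLE[c_codon])
            )
    return differences


if __name__ == "__main__":
    # Toy sequences (made up): the second codon is CGG in the genome
    # and TGG in the cDNA, i.e. arginine edited to tryptophan.
    for site in codon_differences("ATGCGGGCT", "ATGTGGGCT"):
        print(site)
```

With the toy input, the only reported difference is the second codon, CGG (R) against TGG (W), which mirrors the arginine-to-tryptophan change the abstract highlights.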
3D RNA and functional interactions from evolutionary couplings
Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.
2016-01-01
Summary Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444
Gritsun, T S; Venugopal, K; Zanotto, P M; Mikhailov, M V; Sall, A A; Holmes, E C; Polkinghorne, I; Frolova, T V; Pogodina, V V; Lashkevich, V A; Gould, E A
1997-05-01
The complete nucleotide sequences of two tick-transmitted flaviviruses, Vasilchenko (Vs) from Siberia and louping ill (LI) from the UK, have been determined. The genomes were, respectively, 10928 and 10871 nucleotides (nt) in length. The coding strategy and functional protein sequence motifs of tick-borne flaviviruses are presented in both Vs and LI viruses. The phylogenies based on maximum likelihood, maximum parsimony and distance analysis of the polyproteins, identified Vs virus as a member of the tick-borne encephalitis virus subgroup within the tick-borne serocomplex, genus Flavivirus, family Flaviviridae. Comparative alignment of the 3'-untranslated regions revealed deletions of different lengths essentially at the same position downstream of the stop codon for all tick-borne viruses. Two direct 27 nucleotide repeats at the 3'-end were found only for Vs and LI virus. Immediately following the deletions a region of 332-334 nt with relatively conserved primary structure (67-94% identity) was observed at the 3'-non-coding end of the virus genome. Pairwise comparisons of the nucleotide sequence data revealed similar levels of variation between the coding region, and the 5' and 3'-termini of the genome, implying an equivalent strong selective control for translated and untranslated regions. Indeed the predicted folding of the 5' and 3'-untranslated regions revealed patterns of stem and loop structures conserved for all tick-borne flaviviruses suggesting a purifying selection for preservation of essential RNA secondary structures which could be involved in translational control and replication. The possible implications of these findings are discussed.
Castro-Chavez, Fernando
2011-01-01
My previous theoretical research shows that the rotating circular genetic code is a viable tool that makes it easier to distinguish the rules of variation applied to the amino acid exchange; it presents a precise and positional bio-mathematical balance of codons, according to the amino acids they codify. Here, I demonstrate that when using the conventional or classic circular genetic code, a clearer pattern for the human codon usage per amino acid and per genome emerges. The most used human codons per amino acid were the ones ending with the three hydrogen bond nucleotides: C for 12 amino acids and G for the remaining 8, plus one codon for arginine ending in A that was used with approximately the same frequency as the one ending in G for this same amino acid (plus *). The most used codons in man fall almost all the time at the rightmost position, clockwise, ending either in C or in G within the circular genetic code. The human codon usage per genome is compared to other organisms such as fruit flies (Drosophila melanogaster), squid (Loligo pealei), and many others. The biosemiotic codon usage of each genomic population or ‘Theme’ is equated to a ‘molecular language’. The C/U choice or difference, and the G/A difference in the third nucleotide of the most used codons per amino acid are illustrated by comparing the most used codons per genome in humans and squids. The human distribution in the third position of most used codons is a 12-8-2, C-G-A, nucleotide ending signature, while the squid distribution was odd, or uneven: 13-6-3, U-A-G, as its nucleotide ending signature. These findings may help to design computational tools to compare human genomes, to determine the exchangeability between compatible codons and amino acids, and for the early detection of incompatible changes leading to hereditary diseases. PMID:22997484
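The "nucleotide ending signature" described in this abstract is a tally of third-position nucleotides over the most used codon of each amino acid. A minimal sketch of that tally follows, assuming the standard genetic code, DNA letters (T in place of U), and a single in-frame coding sequence as input; the function names and the toy sequence are hypothetical, and real codon usage tables are built from whole genomes rather than one short string.

```python
from collections import Counter

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def most_used_codons(cds):
    """Map each amino acid to its most frequent codon in an in-frame CDS.

    Ties are resolved in favour of the codon seen first; stop codons and
    codons with unexpected letters are skipped.
    """
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    best = {}
    for codon, n in counts.items():
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is None or amino_acid == "*":
            continue
        if amino_acid not in best or n > counts[best[amino_acid]]:
            best[amino_acid] = codon
    return best


def third_position_signature(best):
    """Count the third nucleotide over the most used codons."""
    return Counter(codon[2] for codon in best.values())


if __name__ == "__main__":
    # Toy sequence (made up): leucine mostly CTG, alanine mostly GCC.
    toy = "CTG" * 3 + "CTA" + "GCC" * 2 + "GCT" + "AAG"
    best = most_used_codons(toy)
    print(best)
    print(third_position_signature(best))
```

Applied to a genome-wide codon count, the same tally would yield a signature of the kind the abstract reports (for example 12-8-2 for C-G-A in humans); the toy input here only exercises the mechanics.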
Brown, Jessica A.; Pack, Lindsey R.; Sherrer, Shanen M.; Kshetry, Ajay K.; Newmister, Sean A.; Fowler, Jason D.; Taylor, John-Stephen; Suo, Zucai
2010-01-01
DNA polymerase λ (Pol λ) is a novel X-family DNA polymerase that shares 34% sequence identity with DNA polymerase β (Pol β). Pre-steady state kinetic studies have shown that the Pol λ•DNA complex binds both correct and incorrect nucleotides 130-fold tighter on average than the Pol β•DNA complex, although the base substitution fidelity of both polymerases is 10⁻⁴ to 10⁻⁵. To better understand Pol λ’s tight nucleotide binding affinity, we created single- and double-substitution mutants of Pol λ to disrupt interactions between active site residues and an incoming nucleotide or a template base. Single-turnover kinetic assays showed that Pol λ binds to an incoming nucleotide via cooperative interactions with active site residues (R386, R420, K422, Y505, F506, A510, and R514). Disrupting protein interactions with an incoming correct or incorrect nucleotide impacted binding with each of the common structural moieties in the following order: triphosphate ≫ base > ribose. In addition, the loss of Watson-Crick hydrogen bonding between the nucleotide and template base led to a moderate increase in the Kd. The fidelity of Pol λ was maintained predominantly by a single residue, R517, which has minor groove interactions with the DNA template. PMID:20851705
USDA-ARS's Scientific Manuscript database
Unfavorable genetic correlations between production and fertility traits are well documented. Genetic selection for fertility traits is slow, however, due to low heritabilities. Identification of single nucleotide polymorphisms (SNP) involved in reproduction could improve reliability of genomic esti...
Discovery, Validation and Characterization of 1039 Cattle Single Nucleotide Polymorphisms
USDA-ARS's Scientific Manuscript database
We identified approximately 13000 putative single nucleotide polymorphisms (SNPs) by comparison of repeat-masked BAC-end sequences from the cattle RPCI-42 BAC library with whole-genome shotgun contigs of cattle genome assembly Btau 1.0. Genotyping of a subset of these SNPs was performed on a panel ...
USDA-ARS's Scientific Manuscript database
Multiplexed single nucleotide polymorphism (SNP) markers have the potential to increase the speed and cost-effectiveness of genotyping, provided that an optimal SNP density is used for each application. To test the efficiency of multiplexed SNP genotyping for diversity, mapping and breeding applicat...
USDA-ARS's Scientific Manuscript database
Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...
USDA-ARS's Scientific Manuscript database
Single nucleotide polymorphisms (SNPs) were genotyped using a high-density array and DNAs from individual plants from important onion populations from major production regions world-wide and the likely progenitor of onion, Allium vavilovii. Genotypes at 1226 SNPs were used to estimate genetic relati...
USDA-ARS's Scientific Manuscript database
Genome scans in the pig have identified a region on chromosome 2 (SSC2) associated with tenderness. Calpastatin is a likely positional candidate gene in this region because of its inhibitory role in the calpain system that is involved in postmortem tenderization. Novel single nucleotide polymorphism...
Lineage and genogroup-defining single nucleotide polymorphisms of Escherichia coli O157:H7
USDA-ARS's Scientific Manuscript database
Escherichia coli O157:H7 is a zoonotic human pathogen for which cattle are an important reservoir host. Using both previously published and new sequencing data, a 48-locus single nucleotide polymorphism (SNP) based typing panel was developed that redundantly identified eleven genogroups that span ...
Seligmann, Hervé; Warthi, Ganesh
2017-01-01
A new codon property, codon directional asymmetry in nucleotide content (CDA), reveals a biologically meaningful genetic code dimension: palindromic codons (first and last nucleotides identical, codon structure XZX) are symmetric (CDA = 0), codons with structures ZXX/XXZ are 5'/3' asymmetric (CDA = - 1/1; CDA = - 0.5/0.5 if Z and X are both purines or both pyrimidines, assigning negative/positive (-/+) signs is an arbitrary convention). Negative/positive CDAs associate with (a) Fujimoto's tetrahedral codon stereo-table; (b) tRNA synthetase class I/II (aminoacylate the 2'/3' hydroxyl group of the tRNA's last ribose, respectively); and (c) high/low antiparallel (not parallel) betasheet conformation parameters. Preliminary results suggest CDA-whole organism associations (body temperature, developmental stability, lifespan). Presumably, CDA impacts spatial kinetics of codon-anticodon interactions, affecting cotranslational protein folding. Some synonymous codons have opposite CDA sign (alanine, leucine, serine, and valine), putatively explaining how synonymous mutations sometimes affect protein function. Correlations between CDA and tRNA synthetase classes are weaker than between CDA and antiparallel betasheet conformation parameters. This effect is stronger for mitochondrial genetic codes, and potentially drives mitochondrial codon-amino acid reassignments. CDA reveals information ruling nucleotide-protein relations embedded in reversed (not reverse-complement) sequences (5'-ZXX-3'/5'-XXZ-3').
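The scoring rule stated in this abstract can be written as a small function. The sketch below covers only the cases the abstract spells out: XZX codons score 0, ZXX codons score -1, XXZ codons score +1, and the magnitude is halved to 0.5 when Z and X are both purines or both pyrimidines. Two points are assumptions of this sketch rather than statements from the source: codons of the form XXX are treated as symmetric (their first and last nucleotides are identical), and codons with three different nucleotides return None because the abstract does not say how they are scored. The function name `cda` is hypothetical.

```python
PURINES = set("AG")
PYRIMIDINES = set("CTU")  # accept both DNA (T) and RNA (U) spellings


def same_chemical_class(a, b):
    """True if both nucleotides are purines or both are pyrimidines."""
    return {a, b} <= PURINES or {a, b} <= PYRIMIDINES


def cda(codon):
    """Codon directional asymmetry for the cases given in the abstract.

    Returns 0.0 for XZX, -1.0 or -0.5 for ZXX, +1.0 or +0.5 for XXZ,
    and None for codons with three different nucleotides (not defined
    in the abstract).
    """
    codon = codon.upper()
    if len(codon) != 3 or any(b not in PURINES | PYRIMIDINES for b in codon):
        raise ValueError("expected a codon of three nucleotides")
    first, middle, last = codon
    if first == last:
        # XZX; XXX is also treated as symmetric here (assumption).
        return 0.0
    if middle == last:
        sign, z, x = -1.0, first, middle   # ZXX, 5' asymmetric
    elif first == middle:
        sign, z, x = 1.0, last, first      # XXZ, 3' asymmetric
    else:
        return None
    magnitude = 0.5 if same_chemical_class(z, x) else 1.0
    return sign * magnitude


if __name__ == "__main__":
    for example in ("AUA", "GCC", "CCG", "AGG", "GGA", "ACG"):
        print(example, cda(example))
```

The sign convention follows the abstract, which itself notes that assigning negative to ZXX and positive to XXZ is arbitrary.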
Castro-Chavez, Fernando
2012-01-01
Background Three binary representations of the genetic code according to the ancient I Ching of Fu-Xi will be presented, depending on their defragging capabilities by pairing based on three biochemical properties of the nucleic acids: H-bonds, Purine/Pyrimidine rings, and the Keto-enol/Amino-imino tautomerism, yielding the last pair a 32/32 single-strand self-annealed genetic code and I Ching tables. Methods Our working tool is the ancient binary I Ching's resulting genetic code chromosomes defragged by vertical and by horizontal pairing, reverse engineered into non-binaries of 2D rotating 4×4×4 circles and 8×8 squares and into one 3D 100% symmetrical 16×4 tetrahedron coupled to a functional tetrahedron with apical signaling and central hydrophobicity (codon formula: 4[1(1)+1(3)+1(4)+4(2)]; 5:5, 6:6 in man) forming a stella octangula, and compared to Nirenberg's 16×4 codon table (1965) pairing the first two nucleotides of the 64 codons in axis y. Results One horizontal and one vertical defragging had the start Met at the center. Two, both horizontal and vertical pairings produced two pairs of 2×8×4 genetic code chromosomes naturally arranged (M and I), rearranged by semi-introversion of central purines or pyrimidines (M' and I') and by clustering hydrophobic amino acids; their quasi-identity was disrupted by amino acids with odd codons (Met and Tyr pairing to Ile and TGA Stop); in all instances, the 64-grid 90° rotational ability was restored. Conclusions We defragged three I Ching representations of the genetic code while emphasizing Nirenberg's historical finding. 
The synthetic genetic code chromosomes obtained reflect the protective strategy of enzymes with a similar function, having both humans and mammals a biased G-C dominance of three H-bonds in the third nucleotide of their most used codons per amino acid, as seen in one chromosome of the i, M and M' genetic codes, while a two H-bond A-T dominance was found in their complementary chromosome, as seen in invertebrates and plants. The reverse engineering of chromosome I' into 2D rotating circles and squares was undertaken, yielding a 100% symmetrical 3D geometry which was coupled to a previously obtained genetic code tetrahedron in order to differentiate the start methionine from the methionine that is acting as a codifying non-start codon. PMID:23431415
Getting it Right: How DNA Polymerases Select the Right Nucleotide.
Ludmann, Samra; Marx, Andreas
2016-01-01
All living organisms are defined by their genetic code encrypted in their DNA. DNA polymerases are the enzymes that are responsible for all DNA syntheses occurring in nature. For DNA replication, repair and recombination these enzymes have to read the parental DNA and recognize the complementary nucleotide out of a pool of four structurally similar deoxynucleotide triphosphates (dNTPs) for a given template. The selection of the nucleotide is in accordance with the Watson-Crick rule. In this process the accuracy of DNA synthesis is crucial for the maintenance of the genome stability. However, to spur evolution a certain degree of freedom must be allowed. This brief review highlights the mechanistic basis for selecting the right nucleotide by DNA polymerases.
Detection of small interfering RNA (siRNA) by mass spectrometry procedures in doping controls.
Thomas, Andreas; Walpurgis, Katja; Delahaut, Philippe; Kohler, Maxie; Schänzer, Wilhelm; Thevis, Mario
2013-01-01
Uncovering manipulation of athletic performance via small interfering (si)RNA is an emerging field in sports drug testing. Due to the potential to principally knock down every target gene in the organism by means of the RNA interference pathway, this facet of gene doping has become a realistic scenario. In the present study, two distinct model siRNAs comprising 21 nucleotides were designed as double strands which were perfect counterparts to a sequence of the respective messenger RNA coding the muscle regulator myostatin of Rattus norvegicus. Several modified nucleotides were introduced in both the sense and the antisense strand comprising phosphothioates, 2'-O-methylation, 2'-fluoro-nucleotides, locked nucleic acids and a cholesterol tag at the 3'-end. The model siRNAs were applied to rats at 1 mg/kg (i.v.) and blood as well as urine samples were collected. After isolation of the RNA by means of an RNA purification kit, the target analytes were detected by liquid chromatography - high resolution/high accuracy mass spectrometry (LC-HRMS). Analytes were detected as modified nucleotides after alkaline hydrolysis, as intact oligonucleotide strands (top-down) and by means of denaturing SDS-PAGE analysis. The gel-separated siRNA was further subjected to in-gel hydrolysis with different RNases and subsequent identification of the fragments by untargeted LC-HRMS analysis (bottom-up, 'experimental RNomics'). Combining the results of all approaches, the identification of several 3'-truncated urinary metabolites was accomplished and target analytes were detected up to 24 h after a single administration. Simultaneously collected blood samples yielded no promising results. The methods were validated and found fit-for-purpose for doping controls. Copyright © 2013 John Wiley & Sons, Ltd.
A novel MALDI–TOF based methodology for genotyping single nucleotide polymorphisms
Blondal, Thorarinn; Waage, Benedikt G.; Smarason, Sigurdur V.; Jonsson, Frosti; Fjalldal, Sigridur B.; Stefansson, Kari; Gulcher, Jeffery; Smith, Albert V.
2003-01-01
A new MALDI–TOF based detection assay was developed for analysis of single nucleotide polymorphisms (SNPs). It is a significant modification on the classic three-step minisequencing method, which includes a polymerase chain reaction (PCR), removal of excess nucleotides and primers, followed by primer extension in the presence of dideoxynucleotides using modified thermostable DNA polymerase. The key feature of this novel assay is reliance upon deoxynucleotide mixes, lacking one of the nucleotides at the polymorphic position. During primer extension in the presence of depleted nucleotide mixes, standard thermostable DNA polymerases dissociate from the template at positions requiring a depleted nucleotide; this principle was harnessed to create a genotyping assay. The assay design requires a primer-extension primer having its 3′-end one nucleotide upstream from the interrogated site. The assay further utilizes the same DNA polymerase in both PCR and the primer extension step. This not only simplifies the assay but also greatly reduces the cost per genotype compared to minisequencing methodology. We demonstrate accurate genotyping using this methodology for two SNPs run in both singleplex and duplex reactions. We term this assay nucleotide depletion genotyping (NUDGE). Nucleotide depletion genotyping could be extended to other genotyping assays based on primer extension such as detection by gel or capillary electrophoresis. PMID:14654708
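The stalling principle described in the abstract above can be illustrated with a toy model. The sketch below is not the published protocol: the function name `extend_primer`, the example sequences, and the choice of depleted nucleotide are all made up for illustration. It assumes the primer ends one nucleotide upstream of the interrogated site, so the first nucleotide to be incorporated is the polymorphic one, and that extension stops at the first position requiring the depleted nucleotide.

```python
def extend_primer(incoming, depleted):
    """Return the nucleotides added before the polymerase stalls.

    incoming: nucleotides the polymerase would add, in order, starting at the
              interrogated site (i.e. the complement of the template).
    depleted: the single nucleotide missing from the dNTP mix.
    """
    added = []
    for nucleotide in incoming:
        if nucleotide == depleted:
            break  # the required dNTP is absent, so extension stops here
        added.append(nucleotide)
    return "".join(added)


# Hypothetical A/G polymorphism at the first incorporated position,
# assayed with a mix lacking dGTP.
allele_a = "ACGTTAGC"
allele_g = "GCGTTAGC"
print(repr(extend_primer(allele_a, "G")))  # extended by two nucleotides
print(repr(extend_primer(allele_g, "G")))  # no extension at all
```

In this toy case the two alleles give extension products of different length, and therefore different mass, which is the kind of difference a MALDI-TOF readout could separate; a heterozygous sample would show both products.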
Expressed gene sequence of the IFN-gamma-response chemokine CXCL9 of cattle, horses, and swine
USDA-ARS's Scientific Manuscript database
This report describes the cloning and characterization of expressed gene sequences of bovine, equine, and swine CXCL9 from RNA obtained from peripheral blood mononuclear cell (PBMC) or other tissues. The bovine coding region was 378 nucleotides in length, while the equine and swine coding regions w...
Identification and characterization of long non-coding RNAs in rainbow trout eggs
USDA-ARS's Scientific Manuscript database
Long non-coding RNAs (lncRNAs) are in general considered as a diverse class of transcripts longer than 200 nucleotides that structurally resemble mRNAs but do not encode proteins. Recent advances in RNA sequencing (RNA-Seq) and bioinformatics methods have provided an opportunity to identify and ana...
Biological nanopore MspA for DNA sequencing
NASA Astrophysics Data System (ADS)
Manrao, Elizabeth A.
Unlocking the information hidden in the human genome provides insight into the inner workings of complex biological systems and can be used to greatly improve health-care. In order to allow for widespread sequencing, new technologies are required that provide fast and inexpensive readings of DNA. Nanopore sequencing is a third generation DNA sequencing technology that is currently being developed to fulfill this need. In nanopore sequencing, a voltage is applied across a small pore in an electrolyte solution and the resulting ionic current is recorded. When DNA passes through the channel, the ionic current is partially blocked. If the DNA bases uniquely modulate the ionic current flowing through the channel, the time trace of the current can be related to the sequence of DNA passing through the pore. There are two main challenges to realizing nanopore sequencing: identifying a pore with sensitivity to single nucleotides and controlling the translocation of DNA through the pore so that the small single nucleotide current signatures are distinguishable from background noise. In this dissertation, I explore the use of Mycobacterium smegmatis porin A (MspA) for nanopore sequencing. In order to determine MspA's sensitivity to single nucleotides, DNA strands of various compositions are held in the pore as the resulting ionic current is measured. DNA is immobilized in MspA by attaching it to a large molecule which acts as an anchor. This technique confirms the single nucleotide resolution of the pore and additionally shows that MspA is sensitive to epigenetic modifications and single nucleotide polymorphisms. The forces from the electric field within MspA, the effective charge of nucleotides, and elasticity of DNA are estimated using a Freely Jointed Chain model of single stranded DNA. These results offer insight into the interactions of DNA within the pore. 
With the nucleotide sensitivity of MspA confirmed, a method is introduced to controllably pass DNA through the pore. Using a DNA polymerase, DNA strands are stepped through MspA one nucleotide at a time. The steps are observable as distinct levels on the ionic-current time-trace and are related to the DNA sequence. These experiments overcome the two fundamental challenges to realizing MspA nanopore sequencing and pave the way to the development of a commercial technology.
Congenital sideroblastic anemia due to mutations in the mitochondrial HSP70 homologue HSPA9
Schmitz-Abe, Klaus; Ciesielski, Szymon J.; Schmidt, Paul J.; Campagna, Dean R.; Rahimov, Fedik; Schilke, Brenda A.; Cuijpers, Marloes; Rieneck, Klaus; Lausen, Birgitte; Linenberger, Michael L.; Sendamarai, Anoop K.; Guo, Chaoshe; Hofmann, Inga; Newburger, Peter E.; Matthews, Dana; Shimamura, Akiko; Snijders, Pieter J. L. M.; Towne, Meghan C.; Niemeyer, Charlotte M.; Watson, Henry G.; Dziegiel, Morten H.; Heeney, Matthew M.; May, Alison; Bottomley, Sylvia S.; Swinkels, Dorine W.; Markianos, Kyriacos; Craig, Elizabeth A.
2015-01-01
The congenital sideroblastic anemias (CSAs) are relatively uncommon diseases characterized by defects in mitochondrial heme synthesis, iron-sulfur (Fe-S) cluster biogenesis, or protein synthesis. Here we demonstrate that mutations in HSPA9, a mitochondrial HSP70 homolog located in the chromosome 5q deletion syndrome 5q33 critical deletion interval and involved in mitochondrial Fe-S biogenesis, result in CSA inherited as an autosomal recessive trait. In a fraction of patients with just 1 severe loss-of-function allele, expression of the clinical phenotype is associated with a common coding single nucleotide polymorphism in trans that correlates with reduced messenger RNA expression and results in a pseudodominant pattern of inheritance. PMID:26491070
Kurushima, J. D.; Lipinski, M. J.; Gandolfi, B.; Froenicke, L.; Grahn, J. C.; Grahn, R. A.; Lyons, L. A.
2012-01-01
Summary Both cat breeders and the lay public have interests in the origins of their pets, not only in the genetic identity of the purebred individuals, but also the historical origins of common household cats. The cat fancy is a relatively new institution with over 85% of its 40–50 breeds arising only in the past 75 years, primarily through selection on single-gene aesthetic traits. The short, yet intense cat breed history poses a significant challenge to the development of a genetic marker-based breed identification strategy. Using different breed assignment strategies and methods, 477 cats representing 29 fancy breeds were analysed with 38 short tandem repeats, 148 intergenic and five phenotypic single nucleotide polymorphisms. Results suggest the frequentist method of Paetkau (accuracy single nucleotide polymorphisms = 0.78, short tandem repeats = 0.88) surpasses the Bayesian method of Rannala and Mountain (single nucleotide polymorphisms = 0.56, short tandem repeats = 0.83) for accurate assignment of individuals to the correct breed. Additionally, a post-assignment verification step with the five phenotypic single nucleotide polymorphisms accurately identified between 0.31 and 0.58 of the mis-assigned individuals, raising the sensitivity of assignment with the frequentist method to 0.89 and 0.92 for single nucleotide polymorphisms and short tandem repeats, respectively. This study provides a novel multi-step assignment strategy and suggests that, despite their short breed history and breed family groupings, a majority of cats can be assigned to their proper breed or population of origin, i.e. race. PMID:23171373
USDA-ARS's Scientific Manuscript database
The objective of this study is to investigate single nucleotide polymorphism (SNP) genotypes imputation of Hereford cattle. Purebred Herefords were from two sources, Line 1 Hereford (N=240) and representatives of Industry Herefords (N=311). Using different reference panels of 62 and 494 males with 1...
USDA-ARS's Scientific Manuscript database
Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...
USDA-ARS's Scientific Manuscript database
Fertilization and development of the preimplantation embryo is under genetic control. The goal of the current study was to test 434 single nucleotide polymorphisms (SNPs) for association with genetic variation in fertilization and early embryonic development. The approach was to produce embryos from...
Prospects for inferring pairwise relationships with single nucleotide polymorphisms
Jeffery C. Glaubitz; O. Eugene, Jr. Rhodes; J. Andrew DeWoody
2003-01-01
An extraordinarily large number of single nucleotide polymorphisms (SNPs) are now available in humans as well as in other model organisms. Technological advancements may soon make it feasible to assay hundreds of SNPs in virtually any organism of interest. One potential application of SNPs is the determination of pairwise genetic relationships in populations without...
USDA-ARS's Scientific Manuscript database
Call rate has been used as a measure of quality on both a single nucleotide polymorphism (SNP) and animal basis since SNP genotypes were first used in genomic evaluation of dairy cattle. The genotyping laboratories perform initial quality control screening and genotypes that fail are usually exclude...
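As a rough illustration of the quantity discussed in the record above, call rate is commonly computed as the fraction of genotype calls that are not missing, either per animal or per SNP. The sketch below assumes a small genotype matrix with `None` marking a failed call; the function name `call_rates` and the data are hypothetical, and no quality-control threshold is implied because the truncated abstract does not give one.

```python
def call_rates(genotypes):
    """Return (per-animal, per-SNP) call rates for a rectangular genotype matrix.

    genotypes: list of rows, one per animal; each entry is a genotype code
               (e.g. 0, 1, 2) or None when the call failed.
    """
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    per_animal = [
        sum(call is not None for call in row) / n_snps for row in genotypes
    ]
    per_snp = [
        sum(row[j] is not None for row in genotypes) / n_animals
        for j in range(n_snps)
    ]
    return per_animal, per_snp


# Three hypothetical animals genotyped at four SNPs.
example = [
    [0, 1, None, 2],
    [1, None, None, 0],
    [2, 2, 1, 1],
]
animal_rates, snp_rates = call_rates(example)
print(animal_rates)
print(snp_rates)
```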
USDA-ARS's Scientific Manuscript database
Large datasets containing single nucleotide polymorphisms (SNPs) are used to analyze genome-wide diversity in a robust collection of cultivars from representative accessions, across the world. The extent of linkage disequilibrium (LD) within a population determines the number of markers required fo...
Yu, Hong; Liu, Jun; Yang, Aiping; Yang, Guohui; Yang, Wenjun; Lei, Heyue; Quan, Jianjun; Zhang, Zengyu
2016-04-01
Genetic factors play an important role in childhood autism. This study is to determine the association of single-nucleotide polymorphisms in dopa decarboxylase (DDC) and dopamine receptor-1 (DRD1) genes with childhood autism, in a Chinese Han population. A total of 211 autistic children and 250 age- and gender-matched healthy controls were recruited. The severity of disease was determined by Children Autism Rating Scale scores. TaqMan Probe by real-time polymerase chain reaction was used to determine genotypes and allele frequencies of single-nucleotide polymorphism rs6592961 in DDC and rs251937 in DRD1. Case-control and case-only studies were respectively performed, to determine the contribution of both single-nucleotide polymorphisms to the predisposition of disease and its severity. Our results showed that there was no significant association of the genotypes and allele frequencies of both single-nucleotide polymorphisms concerning childhood autism and its severity. More studies with larger samples are needed to corroborate their predicting roles. © The Author(s) 2015.
Single-molecule comparison of DNA Pol I activity with native and analog nucleotides
NASA Astrophysics Data System (ADS)
Gul, Osman; Olsen, Tivoli; Choi, Yongki; Corso, Brad; Weiss, Gregory; Collins, Philip
2014-03-01
DNA polymerases are critical enzymes for DNA replication, and because of their complex catalytic cycle they are excellent targets for investigation by single-molecule experimental techniques. Recently, we studied the Klenow fragment (KF) of DNA polymerase I using a label-free, electronic technique involving single KF molecules attached to carbon nanotube transistors. The electronic technique allowed long-duration monitoring of a single KF molecule while processing thousands of template strands. Processivity of up to 42 nucleotide bases was directly observed, and statistical analysis of the recordings determined key kinetic parameters for the enzyme's open and closed conformations. Subsequently, we have used the same technique to compare the incorporation of canonical nucleotides like dATP to analogs like 1-thio-2'-dATP. The analog had almost no effect on duration of the closed conformation, during which the nucleotide is incorporated. On the other hand, the analog increased the rate-limiting duration of the open conformation by almost 40%. We propose that the thiolated analog interferes with KF's recognition and binding, two key steps that determine its ensemble turnover rate.
Mühlhausen, Stefanie; Findeisen, Peggy; Plessmann, Uwe; Urlaub, Henning; Kollmar, Martin
2016-01-01
The genetic code is the cellular translation table for the conversion of nucleotide sequences into amino acid sequences. Changes to the meaning of sense codons would introduce errors into almost every translated message and are expected to be highly detrimental. However, reassignment of single or multiple codons in mitochondria and nuclear genomes, although extremely rare, demonstrates that the code can evolve. Several models for the mechanism of alteration of nuclear genetic codes have been proposed (including “codon capture,” “genome streamlining,” and “ambiguous intermediate” theories), but with little resolution. Here, we report a novel sense codon reassignment in Pachysolen tannophilus, a yeast related to the Pichiaceae. By generating proteomics data and using tRNA sequence comparisons, we show that Pachysolen translates CUG codons as alanine and not as the more usual leucine. The Pachysolen tRNACAG is an anticodon-mutated tRNAAla containing all major alanine tRNA recognition sites. The polyphyly of the CUG-decoding tRNAs in yeasts is best explained by a tRNA loss driven codon reassignment mechanism. Loss of the CUG-tRNA in the ancient yeast is followed by gradual decrease of respective codons and subsequent codon capture by tRNAs whose anticodon is not part of the aminoacyl-tRNA synthetase recognition region. Our hypothesis applies to all nuclear genetic code alterations and provides several testable predictions. We anticipate more codon reassignments to be uncovered in existing and upcoming genome projects. PMID:27197221
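The effect of the reassignment reported above can be shown with a small translation sketch. It builds the standard codon table from the conventional TCAG ordering and then overrides a single entry so that CUG (CTG in DNA letters) is read as alanine, as the abstract reports for Pachysolen. The names `STANDARD`, `PACHYSOLEN`, and `translate` and the input sequence are hypothetical; this is an illustration, not the authors' analysis pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated with the first base varying slowest.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Single-codon override: CUG decoded as alanine instead of leucine.
PACHYSOLEN = dict(STANDARD, CTG="A")


def translate(sequence, table=STANDARD):
    """Translate an in-frame DNA or RNA sequence codon by codon."""
    dna = sequence.upper().replace("U", "T")
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(table[codon] for codon in codons)


message = "AUGCUGGCU"  # made-up sequence containing one CUG codon
print(translate(message))              # standard reading
print(translate(message, PACHYSOLEN))  # CUG read as alanine
```

Because only one table entry changes, every other codon is translated identically under both readings, which makes the consequence of a single sense-codon reassignment easy to see.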
Van Rechem, Capucine; Black, Joshua C; Greninger, Patricia; Zhao, Yang; Donado, Carlos; Burrowes, Paul D; Ladd, Brendon; Christiani, David C; Benes, Cyril H; Whetstine, Johnathan R
2015-03-01
SNPs occur within chromatin-modulating factors; however, little is known about how these variants within the coding sequence affect cancer progression or treatment. Therefore, there is a need to establish their biochemical and/or molecular contribution, their use in subclassifying patients, and their impact on therapeutic response. In this report, we demonstrate that coding SNP-A482 within the lysine tridemethylase gene KDM4A/JMJD2A has different allelic frequencies across ethnic populations, associates with differential outcome in patients with non-small cell lung cancer (NSCLC), and promotes KDM4A protein turnover. Using an unbiased drug screen against 87 preclinical and clinical compounds, we demonstrate that homozygous SNP-A482 cells have increased mTOR inhibitor sensitivity. mTOR inhibitors significantly reduce SNP-A482 protein levels, which parallels the increased drug sensitivity observed with KDM4A depletion. Our data emphasize the importance of using variant status as candidate biomarkers and highlight the importance of studying SNPs in chromatin modifiers to achieve better targeted therapy. This report documents the first coding SNP within a lysine demethylase that associates with worse outcome in patients with NSCLC. We demonstrate that this coding SNP alters the protein turnover and associates with increased mTOR inhibitor sensitivity, which identifies a candidate biomarker for mTOR inhibitor therapy and a therapeutic target for combination therapy. ©2015 American Association for Cancer Research.
2010-01-01
Background Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. 
balbisiana haplotypes. Conclusions A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
Subramanian, Sankar; Lingala, Syamala Gowri; Swaminathan, Siva; Huynen, Leon; Lambert, David
2014-08-01
The complete mitochondrial genome of the Chinstrap penguin (Pygoscelis antarcticus) was sequenced and compared with other penguin mitogenomes. The genome is 15,972 bp in length with the number and order of protein coding genes and RNAs being very similar to that of other known penguin mitogenomes. Comparative nucleotide analysis showed the Chinstrap mitogenome shares 94% homology with the mitogenome of its sister species, Pygoscelis adelie (Adélie penguin). Divergence at nonsynonymous nucleotide positions was found to be up to 23 times less than that observed in synonymous positions of protein coding genes, suggesting high selection constraints. The complete mitogenome data will be useful for genetic and evolutionary studies of penguins.
Balintová, Jana; Plucnara, Medard; Vidláková, Pavlína; Pohl, Radek; Havran, Luděk; Fojta, Miroslav; Hocek, Michal
2013-09-16
Benzofurazane has been attached to nucleosides and dNTPs, either directly or through an acetylene linker, as a new redox label for electrochemical analysis of nucleotide sequences. Primer extension incorporation of the benzofurazane-modified dNTPs by polymerases has been developed for the construction of labeled oligonucleotide probes. In combination with nitrophenyl and aminophenyl labels, we have successfully developed a three-potential coding of DNA bases and have explored the relevant electrochemical potentials. The combination of benzofurazane and nitrophenyl reducible labels has proved to be excellent for ratiometric analysis of nucleotide sequences and is suitable for bioanalytical applications. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T
1987-01-01
The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486
Analysis of correlation structures in the Synechocystis PCC6803 genome.
Wu, Zuo-Bing
2014-12-01
Transfer of nucleotide strings in the Synechocystis sp. PCC6803 genome is investigated to exhibit periodic and non-periodic correlation structures by using the recurrence plot method and the phase space reconstruction technique. The periodic correlation structures are generated by periodic transfer of several substrings in long periodic or non-periodic nucleotide strings embedded in the coding regions of genes. The non-periodic correlation structures are generated by non-periodic transfer of several substrings covering or overlapping with the coding regions of genes. In the periodic and non-periodic transfer, some gaps divide the long nucleotide strings into the substrings and prevent their global transfer. Most of the gaps are either the replacement of one base or the insertion/reduction of one base. In the reconstructed phase space, the points generated from two or three steps for the continuous iterative transfer via the second maximal distance can be fitted by two lines. It partly reveals an intrinsic dynamics in the transfer of nucleotide strings. Due to the comparison of the relative positions and lengths, the substrings concerned with the non-periodic correlation structures are almost identical to the mobile elements annotated in the genome. The mobile elements are thus endowed with the basic results on the correlation structures. Copyright © 2014 Elsevier Ltd. All rights reserved.
Allen, Alexandra M; Barker, Gary L A; Berry, Simon T; Coghill, Jane A; Gwilliam, Rhian; Kirby, Susan; Robinson, Phil; Brenchley, Rachel C; D'Amore, Rosalinda; McKenzie, Neil; Waite, Darren; Hall, Anthony; Bevan, Michael; Hall, Neil; Edwards, Keith J
2011-12-01
Food security is a global concern and substantial yield increases in cereal crops are required to feed the growing world population. Wheat is one of the three most important crops for human and livestock feed. However, the complexity of the genome coupled with a decline in genetic diversity within modern elite cultivars has hindered the application of marker-assisted selection (MAS) in breeding programmes. A crucial step in the successful application of MAS in breeding programmes is the development of cheap and easy to use molecular markers, such as single-nucleotide polymorphisms. To mine selected elite wheat germplasm for intervarietal single-nucleotide polymorphisms, we have used expressed sequence tags derived from public sequencing programmes and next-generation sequencing of normalized wheat complementary DNA libraries, in combination with a novel sequence alignment and assembly approach. Here, we describe the development and validation of a panel of 1114 single-nucleotide polymorphisms in hexaploid bread wheat using competitive allele-specific polymerase chain reaction genotyping technology. We report the genotyping results of these markers on 23 wheat varieties, selected to represent a broad cross-section of wheat germplasm including a number of elite UK varieties. Finally, we show that, using relatively simple technology, it is possible to rapidly generate a linkage map containing several hundred single-nucleotide polymorphism markers in the doubled haploid mapping population of Avalon × Cadenza. © 2011 The Authors. Plant Biotechnology Journal © 2011 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Utilizing Gene Tree Variation to Identify Candidate Effector Genes in Zymoseptoria tritici
McDonald, Megan C.; McGinness, Lachlan; Hane, James K.; Williams, Angela H.; Milgate, Andrew; Solomon, Peter S.
2016-01-01
Zymoseptoria tritici is a host-specific, necrotrophic pathogen of wheat. Infection by Z. tritici is characterized by its extended latent period, which typically lasts 2 wks, and is followed by extensive host cell death, and rapid proliferation of fungal biomass. This work characterizes the level of genomic variation in 13 isolates, for which we have measured virulence on 11 wheat cultivars with differential resistance genes. Between the reference isolate, IPO323, and the 13 Australian isolates we identified over 800,000 single nucleotide polymorphisms, of which ∼10% had an effect on the coding regions of the genome. Furthermore, we identified over 1700 probable presence/absence polymorphisms in genes across the Australian isolates using de novo assembly. Finally, we developed a gene tree sorting method that quickly identifies groups of isolates within a single gene alignment whose sequence haplotypes correspond with virulence scores on a single wheat cultivar. Using this method, we have identified < 100 candidate effector genes whose gene sequence correlates with virulence toward a wheat cultivar carrying a major resistance gene. PMID:26837952
Zhou, Bo; Wei, Fan-Yan; Kanai, Narumi; Fujimura, Atsushi; Kaitsuka, Taku; Tomizawa, Kazuhito
2014-09-01
Single-nucleotide polymorphisms (SNPs) in CDKAL1 have been associated with the development of type 2 diabetes (T2D). CDKAL1 catalyzes 2-methylthio modification of adenosine at position 37 of tRNA(Lys)(UUU). A deficit of this modification causes aberrant protein synthesis, and is associated with impairment of insulin secretion in both mouse model and human. However, it is unknown whether the T2D-associated SNPs in CDKAL1 are associated with downregulation of CDKAL1 by regulating the gene expression. Here, we report a specific splicing variant of CDKAL1 termed CDKAL1-v1 that is markedly lower in individuals carrying risk SNPs of CDKAL1. Interestingly, CDKAL1-v1 is a non-coding transcript, which regulates the CDKAL1 level by competitive binding to a CDKAL1-targeting miRNA. By direct editing of the genome, we further show that the nucleotides around the SNP regions are critical for the alternative splicing of CDKAL1-v1. These findings reveal that the T2D-associated SNPs in CDKAL1 reduce CDKAL1-v1 levels by impairing splicing, which in turn increases miRNA-mediated suppression of CDKAL1. Our results suggest that CDKAL1-v1-mediated suppression of CDKAL1 might underlie the pathogenesis of T2D in individuals carrying the risk SNPs. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Effects of GWAS-Associated Genetic Variants on lncRNAs within IBD and T1D Candidate Loci
Brorsson, Caroline A.; Pociot, Flemming
2014-01-01
Long non-coding RNAs are a new class of non-coding RNAs that are at the crosshairs in many human diseases such as cancers, cardiovascular disorders, and inflammatory and autoimmune diseases such as Inflammatory Bowel Disease (IBD) and Type 1 Diabetes (T1D). Nearly 90% of the phenotype-associated single-nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) lie outside of the protein coding regions, and map to the non-coding intervals. However, the relationship between phenotype-associated loci and the non-coding regions including the long non-coding RNAs (lncRNAs) is poorly understood. Here, we systematically identified all annotated IBD and T1D loci-associated lncRNAs, and mapped nominally significant GWAS/ImmunoChip SNPs for IBD and T1D within these lncRNAs. Additionally, we identified tissue-specific cis-eQTLs, and strong linkage disequilibrium (LD) signals associated with these SNPs. We explored sequence and structure based attributes of these lncRNAs, and also predicted the structural effects of mapped SNPs within them. We also identified lncRNAs in IBD and T1D that are under recent positive selection. Our analysis identified putative lncRNA secondary structure-disruptive SNPs within and in close proximity (+/−5 kb flanking regions) of IBD and T1D loci-associated candidate genes, suggesting that these RNA conformation-altering polymorphisms might be associated with diseased-phenotype. Disruption of lncRNA secondary structure due to presence of GWAS SNPs provides valuable information that could be potentially useful for future structure-function studies on lncRNAs. PMID:25144376
Junk DNA and the long non-coding RNA twist in cancer genetics
Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A
2015-01-01
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphism (SNP) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839
Pang, Erli; Wu, Xiaomei; Lin, Kui
2016-06-01
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts: domains or unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped according to two classifications: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found: the density of synonymous SNPs within domains is significantly greater than that of synonymous SNPs within unassigned regions; however, the density of non-synonymous SNPs shows the opposite pattern. We also found there are signatures of purifying selection on both the domain and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all of the human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
Origins of Genes: "Big Bang" or Continuous Creation?
NASA Astrophysics Data System (ADS)
Kesse, Paul K.; Gibbs, Adrian
1992-10-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
Spielmann, A; Stutz, E
1983-10-25
The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and contains also an open reading frame for 353 aminoacids. The aminoacid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2.
Sasaya, Takahide; Kusaba, Shinnosuke; Ishikawa, Koichi; Koganezawa, Hiroki
2004-09-01
Lettuce big-vein virus (LBVV) is the type species of the genus Varicosavirus and is a two-segmented negative-sense single-stranded RNA virus. The larger LBVV genome segment (RNA1) consists of 6797 nt and encodes an L polymerase that resembles that of rhabdoviruses. Here, the nucleotide sequence of the second LBVV genome segment (RNA2) is reported. LBVV RNA2 consisted of 6081 nt and contained antisense information for five major ORFs: ORF1 (nt 210-1403 on the viral RNA), ORF2 (nt 1493-2494), ORF3 (nt 2617-3489), ORF4 (nt 3843-4337) and ORF5 (nt 4530-5636), which had coding capacities of 44, 36, 32, 19 and 41 kDa, respectively. The gene at the 3' end of the viral RNA encoded a coat protein, while the other four genes encoded proteins of unknown functions. The 3'-terminal 11 nt of LBVV RNA2 were identical to those of LBVV RNA1, and the 5'-terminal regions of LBVV RNA1 and RNA2 contained a long common nucleotide stretch of about 100 nt. Northern blot analysis using probes specific to the individual ORFs revealed that LBVV transcribes monocistronic RNAs. Analysis of the terminal sequences, and primer extension and RNase H digestion analysis of LBVV mRNAs, suggested that LBVV utilizes a transcription termination/initiation strategy comparable with that of rhabdoviruses.
Base-By-Base: single nucleotide-level analysis of whole viral genome alignments.
Brodie, Ryan; Smith, Alex J; Roper, Rachel L; Tcherepanov, Vasily; Upton, Chris
2004-07-14
With ever increasing numbers of closely related virus genomes being sequenced, it has become desirable to be able to compare two genomes at a level more detailed than gene content because two strains of an organism may share the same set of predicted genes but still differ in their pathogenicity profiles. For example, detailed comparison of multiple isolates of the smallpox virus genome (each approximately 200 kb, with 200 genes) is not feasible without new bioinformatics tools. A software package, Base-By-Base, has been developed that provides visualization tools to enable researchers to 1) rapidly identify and correct alignment errors in large, multiple genome alignments; and 2) generate tabular and graphical output of differences between the genomes at the nucleotide level. Base-By-Base uses detailed annotation information about the aligned genomes and can list each predicted gene with nucleotide differences, display whether variations occur within promoter regions or coding regions and whether these changes result in amino acid substitutions. Base-By-Base can connect to our mySQL database (Virus Orthologous Clusters; VOCs) to retrieve detailed annotation information about the aligned genomes or use information from text files. Base-By-Base enables users to quickly and easily compare large viral genomes; it highlights small differences that may be responsible for important phenotypic differences such as virulence. It is available via the Internet using Java Web Start and runs on Macintosh, PC and Linux operating systems with the Java 1.4 virtual machine.
Natural variations in OsγTMT contribute to diversity of the α-tocopherol content in rice.
Wang, Xiao-Qiang; Yoon, Min-Young; He, Qiang; Kim, Tae-Sung; Tong, Wei; Choi, Bu-Woong; Lee, Young-Sang; Park, Yong-Jin
2015-12-01
Tocopherols and tocotrienols, collectively known as tocochromanols, are lipid-soluble molecules that belong to the group of vitamin E compounds. Among them, α-tocopherol (αΤ) is one of the antioxidants with diverse functions and benefits for humans and animals. Thus, understanding the genetic basis of these traits would be valuable to improve nutritional quality by breeding in rice. Genome-wide association study (GWAS) has emerged as a powerful strategy for identifying genes or quantitative trait loci (QTL) underlying complex traits in plants. To discover the genes or QTLs underlying the naturally occurring variations of αΤ content in rice, we performed GWAS using 1.44 million high-quality single-nucleotide polymorphisms acquired from re-sequencing of 137 accessions from a diverse rice core collection. Thirteen candidate genes were found across 2-year phenotypic data, among which gamma-tocopherol methyltransferase (OsγTMT) was identified as the major factor responsible for the αΤ content among rice accessions. Nucleotide variations in the coding region of OsγTMT were significantly associated with the αΤ content variations, while nucleotide polymorphisms in the promoter region of OsγTMT also could partly demonstrate the correlation with αΤ content variations, according to our RNA expression analyses. This study provides useful information on the genetic factors underlying αΤ content variations in rice, which will significantly contribute to research on αΤ biosynthesis mechanisms and αΤ improvement of rice.
DeWitt, D L; Smith, W L
1988-01-01
Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. Images PMID:3125548
PredictSNP: Robust and Accurate Consensus Classifier for Prediction of Disease-Related Mutations
Bendl, Jaroslav; Stourac, Jan; Salanda, Ondrej; Pavelka, Antonin; Wieben, Eric D.; Zendulka, Jaroslav; Brezovsky, Jan; Damborsky, Jiri
2014-01-01
Single nucleotide variants represent a prevalent form of genetic variation. Mutations in the coding regions are frequently associated with the development of various genetic diseases. Computational tools for the prediction of the effects of mutations on protein function are very important for analysis of single nucleotide variants and their prioritization for experimental characterization. Many computational tools are already widely employed for this purpose. Unfortunately, their comparison and further improvement is hindered by large overlaps between the training datasets and benchmark datasets, which lead to biased and overly optimistic reported performances. In this study, we have constructed three independent datasets by removing all duplicates, inconsistencies and mutations previously used in the training of evaluated tools. The benchmark dataset containing over 43,000 mutations was employed for the unbiased evaluation of eight established prediction tools: MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP. The six best performing tools were combined into a consensus classifier, PredictSNP, which resulted in significantly improved prediction performance and at the same time returned results for all mutations, confirming that consensus prediction represents an accurate and robust alternative to the predictions delivered by individual tools. A user-friendly web interface enables easy access to all eight prediction tools, the consensus classifier PredictSNP and annotations from the Protein Mutant Database and the UniProt database. The web server and the datasets are freely available to the academic community at http://loschmidt.chemi.muni.cz/predictsnp. PMID:24453961
Thomas, Laurent F.; Sætrom, Pål
2012-01-01
Alternative polyadenylation (APA) can for example occur when a protein-coding gene has several polyadenylation (polyA) signals in its last exon, resulting in messenger RNAs (mRNAs) with different 3′ untranslated region (UTR) lengths. Different 3′UTR lengths can give different microRNA (miRNA) regulation such that shortened transcripts have increased expression. The APA process is part of human cells' natural regulatory processes, but APA also seems to play an important role in many human diseases. Although altered APA in disease can have many causes, we reasoned that mutations in DNA elements that are important for the polyA process, such as the polyA signal and the downstream GU-rich region, can be one important mechanism. To test this hypothesis, we identified single nucleotide polymorphisms (SNPs) that can create or disrupt APA signals (APA-SNPs). By using a data-integrative approach, we show that APA-SNPs can affect 3′UTR length, miRNA regulation, and mRNA expression—both between homozygote individuals and within heterozygote individuals. Furthermore, we show that a significant fraction of the alleles that cause APA are strongly and positively linked with alleles found by genome-wide studies to be associated with disease. Our results confirm that APA-SNPs can give altered gene regulation and that APA alleles that give shortened transcripts and increased gene expression can be important hereditary causes for disease. PMID:22915998
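The idea of a SNP that creates or disrupts a polyA signal can be illustrated with a minimal Python sketch. It assumes only the canonical AATAAA hexamer counts as a signal and ignores the downstream GU-rich region; the function names and the toy sequences are hypothetical and are not taken from the study.

```python
POLYA_SIGNAL = "AATAAA"  # canonical hexamer only; an assumption of this sketch


def has_signal_at(seq: str, pos: int) -> bool:
    """Return True if a canonical hexamer overlaps the 0-based index pos."""
    k = len(POLYA_SIGNAL)
    first_start = max(0, pos - k + 1)
    last_start = min(pos, len(seq) - k)
    return any(seq[s:s + k] == POLYA_SIGNAL
               for s in range(first_start, last_start + 1))


def classify_apa_snp(seq: str, pos: int, alt: str) -> str:
    """Compare the reference sequence with the sequence carrying allele alt at pos."""
    ref_seq = seq.upper()
    alt_seq = ref_seq[:pos] + alt.upper() + ref_seq[pos + 1:]
    ref_hit = has_signal_at(ref_seq, pos)
    alt_hit = has_signal_at(alt_seq, pos)
    if ref_hit and not alt_hit:
        return "disrupts"
    if alt_hit and not ref_hit:
        return "creates"
    return "no change"


if __name__ == "__main__":
    print(classify_apa_snp("GGAATAAAGG", 4, "C"))  # T>C inside the hexamer
    print(classify_apa_snp("GGAACAAAGG", 4, "T"))  # C>T completes the hexamer
```

A real analysis would also need strand handling and the known non-canonical signal variants, which this sketch leaves out.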
Zhu, Yuanqi; Hein, David W.
2007-01-01
Genetic variants of human N-acetyltransferase 1 (NAT1) are associated with cancer and birth defects. N- and O-acetyltransferase catalytic activities, Michaelis-Menten kinetic constants (Km & Vmax), and steady state expression levels of NAT1-specific mRNA and protein were determined for the reference NAT1*4 and variant human NAT1 haplotypes possessing single nucleotide polymorphisms (SNPs) in the open reading frame. Although none of the SNPs caused a significant effect on steady state levels of NAT1-specific mRNA, C97T(R33stop), C190T(R64W), C559T (R187stop) and A752T(D251V) each reduced NAT1 protein level and/or N- and O-acetyltransferase catalytic activities to levels below detection. G560A(R187Q) substantially reduced NAT1 protein level and catalytic activities and increased substrate Km. The G445A(V149I), G459A(synonymous) and T640G(S214A) haplotype present in NAT1*11 significantly (p<0.05) increased NAT1 protein level and catalytic activity. Neither T21G(synonymous), T402C(synonymous), A613G(M205V), T777C(synonymous), G781A(E261K), or A787G(I263V) significantly affected Km, catalytic activity, mRNA or protein level. These results suggest heterogeneity among slow NAT1 acetylator phenotypes. PMID:17909564
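The SNP labels in the abstract above pair a nucleotide position in the open reading frame with an amino acid position, for example C97T(R33stop). As a small illustrative sketch, assuming 1-based numbering from the first nucleotide of the start codon, the codon number follows from integer division by three. The helper name `codon_number` is hypothetical.

```python
def codon_number(nt_position: int) -> int:
    """Map a 1-based ORF nucleotide position to its 1-based codon number."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


# Nucleotide position -> residue number, as reported in the abstract.
# The synonymous SNPs are omitted because no residue number is given for them.
reported = {
    97: 33, 190: 64, 559: 187, 560: 187, 752: 251,
    445: 149, 640: 214, 613: 205, 781: 261, 787: 263,
}

# Collect any label whose residue number disagrees with the computed codon.
mismatches = {nt: aa for nt, aa in reported.items() if codon_number(nt) != aa}

if __name__ == "__main__":
    print(mismatches)
```

Positions 559 and 560 fall in the same codon (187), which is consistent with the abstract reporting both R187stop and R187Q.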
Identification of four novel mutations in the COL4A5 gene of patients with Alport syndrome
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lemmink, H.H.; Schroeder, C.H.; Brunner, H.G.
1993-08-01
The type IV collagen α5 chain (COL4A5) genes of patients with Alport syndrome were tested for major gene rearrangements by Southern blot analysis, using COL4A5 cDNA clones as probes. In addition, individual exons were screened for small mutations by single-strand conformation polymorphism (SSCP) analysis. Four new COL4A5 mutations were detected. A duplication of the nine most 3′ located nucleotides of exon 49 and the first nucleotide of intron 49 was identified in the COL4A5 gene of one patient. Two patients displayed single base substitutions leading to, respectively, a proline to threonine and an arginine to glutamine substitution in the C-terminal end. Both substitutions involve amino acids conserved through evolution. In COL4A5 intron 41 a mutation changing the splice acceptor site from AG to AA was identified. All mutations cosegregate with the clinical phenotype of Alport syndrome in affected family members. In a control population of 50 individuals tested by PCR-SSCP these mutations were never identified. Together with two mutations reported previously, a total of six mutations were found in 26 patients with Alport syndrome (23%) after systematic screening of about 30% of the COL4A5 coding region. The clinical features of these six patients are described in detail. 21 refs., 2 figs., 3 tabs.
Prospecting for pig single nucleotide polymorphisms in the human genome: have we struck gold?
Grapes, L; Rudd, S; Fernando, R L; Megy, K; Rocha, D; Rothschild, M F
2006-06-01
Gene-to-gene variation in the frequency of single nucleotide polymorphisms (SNPs) has been observed in humans, mice, rats, primates and pigs, but a relationship across species in this variation has not been described. Here, the frequency of porcine coding SNPs (cSNPs) identified by in silico methods, and the frequency of murine cSNPs, were compared with the frequency of human cSNPs across homologous genes. From 150,000 porcine expressed sequence tag (EST) sequences, a total of 452 SNP-containing sequence clusters were found, totalling 1394 putative SNPs. All the clustered porcine EST annotations and SNP data have been made publicly available at http://sputnik.btk.fi/project?name=swine. Human and murine cSNPs were identified from dbSNP and were characterized as either validated or total number of cSNPs (validated plus non-validated) for comparison purposes. The correlation between in silico pig cSNP and validated human cSNP densities was found to be 0.77 (p < 0.00001) for a set of 25 homologous genes, while a correlation of 0.48 (p < 0.0005) was found for a primarily random sample of 50 homologous human and mouse genes. This is the first evidence of conserved gene-to-gene variability in cSNP frequency across species and indicates that site-directed screening of porcine genes that are homologous to cSNP-rich human genes may rapidly advance cSNP discovery in pigs.
Payen, Thibaut; Murat, Claude; Gigant, Anaïs; Morin, Emmanuelle; De Mita, Stéphane; Martin, Francis
2015-09-01
The Périgord black truffle (Tuber melanosporum Vittad.), considered a gastronomic delicacy worldwide, is an ectomycorrhizal filamentous fungus that is ecologically important in Mediterranean French, Italian and Spanish woodlands. In this study, we developed a novel resource of single nucleotide polymorphisms (SNPs) for T. melanosporum using Illumina high-throughput resequencing. The genome from six T. melanosporum geographical accessions was sequenced to a depth of approximately 20×. These geographical accessions were selected from different populations within the northern and southern regions of the geographical species distribution. Approximately 80% of the reads for each of the six resequenced geographical accessions mapped against the reference T. melanosporum genome assembly, estimating the core genome size of this organism to be approximately 110 Mbp. A total of 442 326 SNPs corresponding to 3540 SNPs/Mbps were identified as being included in all seven genomes. The SNPs occurred more frequently in repeated sequences (85%), although 4501 SNPs were also identified in the coding regions of 2587 genes. Using the ratio of nonsynonymous mutations per nonsynonymous site (pN) to synonymous mutations per synonymous site (pS) and Tajima's D index scanning the whole genome, we were able to identify genomic regions and genes potentially subjected to positive or purifying selection. The SNPs identified represent a valuable resource for future population genetics and genomics studies. © 2015 John Wiley & Sons Ltd.
Ghalandari, Hamid; Hosseini-Esfahani, Firoozeh; Mirmiran, Parvin
2015-01-01
Context: Leptin and ghrelin are two important appetite and energy balance-regulating peptides. Common polymorphisms in the genes coding these peptides and their related receptors are shown to be associated with body weight, different markers of obesity and metabolic abnormalities. This review article aims to investigate the association of common polymorphisms of these genes with overweight/obesity and the metabolic disturbances related to it. Evidence Acquisition: The keywords leptin, ghrelin, polymorphism, single-nucleotide polymorphism (SNP), obesity, overweight, Body Mass Index, metabolic syndrome, and type 2 diabetes mellitus (T2DM) (MeSH headings) were used to search in the following databases: Pubmed, Sciencedirect (Elsevier), and Google scholar. Overall, 24 case-control studies, relevant to our topic, met the criteria and were included in the review. Results: The most prevalent leptin/leptin receptor genes (LEP/LEPR) and ghrelin/ghrelin receptor genes (GHRL/GHSR) single nucleotide polymorphisms studied were LEP G-2548A, LEPR Q223R, and Leu72Met, respectively. Nine studies of the 17 studies on LEP/LEPR, and three studies of the seven studies on GHRL/GHSR showed significant relationships. Conclusions: In general, our study suggests that the association between LEP/LEPR and GHRL/GHSR with overweight/obesity and the related metabolic disturbances is inconclusive. These results may be due to unidentified gene-environment interactions. More investigations are needed to further clarify this association. PMID:26425125
Raynal, Caroline; Ciccolini, Joseph; Mercier, Cédric; Boyer, Jean-Christophe; Polge, Anne; Lallemant, Benjamin; Mouzat, Kévin; Lumbroso, Serge; Brouillet, Jean-Paul; Evrard, Alexandre
2010-02-01
Gemcitabine (2',2'-difluorodeoxycytidine) is a major antimetabolite cytotoxic drug with a wide spectrum of activity against solid tumors. Hepatic elimination of gemcitabine depends on a catabolic pathway through a deamination step driven by the enzyme cytidine deaminase (CDA). Severe hematologic toxicity to gemcitabine was reported in patients harboring genetic polymorphisms in CDA gene. High-resolution melting (HRM) analysis of polymerase chain reaction amplicon emerges today as a powerful technique for both genotyping and gene scanning strategies. In this study, 46 DNA samples from gemcitabine-treated patients were subjected to HRM analysis on a LightCycler 480 platform. Residual serum CDA activity was assayed as a surrogate marker for the overall functionality of this enzyme. Genotyping of three well-described single nucleotide polymorphisms in coding region (c.79A>C, c.208G>A and c.435C>T) was successfully achieved by HRM analysis of small polymerase chain reaction fragments, whereas unknown single nucleotide polymorphisms were searched by a gene scanning strategy with longer amplicons (up to 622 bp). The gene scanning strategy allowed us to find a new intronic mutation c.246+37G>A in a female patient displaying marked CDA deficiency and who had an extreme toxic reaction with a fatal outcome to gemcitabine treatment. Our work demonstrates that HRM-based methods, owing to their simplicity, reliability, and speed, are useful tools for diagnosis of CDA deficiency and could be of interest for personalized medicine.
Gowin, Ewelina; Świątek-Kościelna, Bogna; Kałużna, Ewelina; Nowak, Jerzy; Michalak, Michał; Wysocki, Jacek; Januszkiewicz-Lewandowska, Danuta
2017-07-01
The aim was to analyse TLR2 rs5743708, TLR2 rs4696480, TLR4 rs4986790, TLR9 rs5743836, and TLR9 rs352140 single nucleotide polymorphisms (SNPs) in children with pneumococcal and meningococcal meningitis and their family members. The study group consisted of 39 children with bacterial meningitis (25 with meningococcal meningitis and 14 with pneumococcal meningitis) and 49 family members. Laboratory test results and the course of the diseases were analyzed. Genomic DNA was extracted from 1.2 ml of peripheral blood in order to analyze the five SNPs. Patients with pneumococcal and meningococcal meningitis showed a similar male/female ratio, mean age, and duration of symptoms. There were no statistically significant differences in biochemical markers between the two groups. All patients possessed at least one polymorphic variant of the analyzed SNPs. The most common SNP was TLR9 rs352140, detected in 89.7% of patients. No significant differences in SNP frequency were found between patients, family members, and the general population. The allele frequencies in the population studied are in accordance with the literature data. The study did not find an association between the analyzed SNPs and susceptibility to bacterial meningitis. The role of SNPs in genes coding toll-like receptors and the interactions between them in controlling inflammation in the central nervous system need further evaluation. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-05-21
To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene (HBX) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
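The abstract above does not state the formula used for per-position information content. A common choice is the Shannon-based measure, where a DNA column scores log2(4) = 2 bits minus its entropy. The following Python sketch is written under that assumption, with each haplotype weighted equally and no small-sample correction; the function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(column: str) -> float:
    """Information content in bits of one alignment column (maximum 2.0 for DNA)."""
    counts = Counter(base for base in column.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column holds only gaps or ambiguity codes
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy


def alignment_information(haplotypes: list[str]) -> list[float]:
    """Per-position information content of equal-length aligned haplotypes."""
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("haplotypes must be aligned to the same length")
    return [column_information("".join(col)) for col in zip(*haplotypes)]


if __name__ == "__main__":
    toy_alignment = ["AA", "AC", "AG", "AT"]
    print(alignment_information(toy_alignment))
```

In this toy alignment the first column is fully conserved (2 bits) and the second is maximally variable (0 bits). A hyper-conserved region would appear as a run of consecutive positions close to 2 bits.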
Chen, Mengqiang; Xu, Mengyun; Xiao, Yao; Cui, Dandan; Qin, Yongqiang; Wu, Jiaqi; Wang, Wenyi; Wang, Guoping
2018-01-01
Anthocyanins are the main pigments in flowers and fruits. These pigments are responsible for the red, red-purple, violet, and purple color in plants, and act as insect and animal attractants. In this study, phenotypic analysis of the purple flower color in eggplant indicated that the flower color is controlled by a single dominant gene, FAS. Using an F2 mapping population derived from a cross between purple-flowered ‘Blacknite’ and white-flowered ‘Small Round’, Flower Anthocyanidin Synthase (FAS) was fine mapped to an approximately 165.6-kb region between InDel marker Indel8-11 and Cleaved Amplified Polymorphic Sequences (CAPS) marker Efc8-32 on Chromosome 8. On the basis of bioinformatic analysis, 29 genes were subsequently located in the FAS target region, among which were two potential Anthocyanidin Synthase (ANS) gene candidates. Allelic sequence comparison results showed that one ANS gene (Sme2.5_01638.1_g00003.1) was conserved in promoter and coding sequences without any nucleotide change between parents, whereas four single-nucleotide polymorphisms were detected in another ANS gene (Sme2.5_01638.1_g00005.1). Crucially, a single base pair deletion at site 438 resulted in premature termination of FAS, leading to the loss of anthocyanin accumulation. In addition, FAS displayed strong expression in purple flowers compared with white flowers and other tissues. Collectively, our results indicate that Sme2.5_01638.1_g00005.1 is a good candidate gene for FAS, which controls anthocyanidin synthase in eggplant flowers. The present study provides information that could facilitate genetic engineering to improve anthocyanin levels in plants. PMID:29522465
Current knowledge of microRNA-mediated regulation of drug metabolism in humans.
Nakano, Masataka; Nakajima, Miki
2018-05-01
Understanding the factors causing inter- and intra-individual differences in drug metabolism potencies is required for the practice of personalized or precision medicine, as well as for the promotion of efficient drug development. The expression of drug-metabolizing enzymes is controlled by transcriptional regulation by nuclear receptors and transcriptional factors, epigenetic regulation, such as DNA methylation and histone acetylation, and post-translational modification. In addition to such regulation mechanisms, recent studies revealed that microRNAs (miRNAs), endogenous ~22-nucleotide non-coding RNAs that regulate gene expression through the translational repression and degradation of mRNAs, significantly contribute to post-transcriptional regulation of drug-metabolizing enzymes. Areas covered: This review summarizes the current knowledge regarding miRNA-dependent regulation of drug-metabolizing enzymes and transcriptional factors and its physiological and clinical significance. We also describe recent advances in miRNA-dependent regulation research, showing that the presence of pseudogenes, single-nucleotide polymorphisms, and RNA editing affects miRNA targeting. Expert opinion: It is an unwavering fact that miRNAs are critical factors causing inter- and intra-individual differences in the expression of drug-metabolizing enzymes. Consideration of miRNA-dependent regulation would be a helpful tool for optimizing personalized and precision medicine.
Single Nucleotide Polymorphisms Predict Symptom Severity of Autism Spectrum Disorder
ERIC Educational Resources Information Center
Jiao, Yun; Chen, Rong; Ke, Xiaoyan; Cheng, Lu; Chu, Kangkang; Lu, Zuhong; Herskovits, Edward H.
2012-01-01
Autism is widely believed to be a heterogeneous disorder; diagnosis is currently based solely on clinical criteria, although genetic, as well as environmental, influences are thought to be prominent factors in the etiology of most forms of autism. Our goal is to determine whether a predictive model based on single-nucleotide polymorphisms (SNPs)…
USDA-ARS's Scientific Manuscript database
Background/Objectives: The misincorporation of uracil into DNA leads to genomic instability. In a previous study, some of us identified four common single nucleotide polymorphisms (SNPs) in uracil-processing genes (rs2029166 and rs7296239 in SMUG1, rs34259 in UNG and rs4775748 in DUT) that were asso...
USDA-ARS's Scientific Manuscript database
Previously, a candidate gene approach identified 51 single nucleotide polymorphisms (SNP) associated with genetic merit for reproductive traits and 26 associated with genetic merit for production in dairy bulls. We evaluated association of the 77 SNPs with days open (DO) for first lactation in a pop...
USDA-ARS's Scientific Manuscript database
Watermelon (Citrullus lanatus var. lanatus) is an important vegetable fruit throughout the world. A high number of single nucleotide polymorphism (SNP) and simple sequence repeat (SSR) markers should provide large coverage of the watermelon genome and high phylogenetic resolution of germplasm acces...
Demirci, Berna; Lee, Yoosook; Lanzaro, Gregory C; Alten, Bulent
2012-05-01
Culex theileri Theobald (Diptera: Culicidae) is one of the most common mosquito species in northeastern Turkey and serves as a vector for various zoonotic diseases including West Nile virus. Although there have been some studies on the ecology of Cx. theileri, very little genetic data has been made available. We successfully sequenced 11 gene fragments from Cx. theileri specimens collected from the northeastern part of Turkey. On average, we found a single nucleotide polymorphism every 45 bp. Transitions outnumbered transversions at a ratio of 2:1. This is the first report of genetic polymorphisms in Cx. theileri, and the single nucleotide polymorphisms discovered in this study can be used to investigate population structure and gene-environment interactions.
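The two summary statistics quoted in this abstract (SNP spacing and the transition:transversion ratio) can be illustrated with a minimal Python sketch. The function names and the toy SNP list below are hypothetical and are not taken from the study; the code only shows how such figures are typically derived from a list of reference/alternate base pairs.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Classify a single-base change as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or ref not in VALID or alt not in VALID:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    # A transition stays within one chemical class (purine or pyrimidine).
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) pairs."""
    counts = Counter(classify_substitution(r, a) for r, a in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return counts["transition"] / counts["transversion"]


def bp_per_snp(n_snps, sequenced_bp):
    """Average number of sequenced base pairs per SNP."""
    return sequenced_bp / n_snps


# Toy data (hypothetical): four transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_snps))   # 2.0, i.e. a 2:1 ratio
print(bp_per_snp(20, 900))     # 45.0 bp per SNP
```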
José, Marco V.; Govezensky, Tzipe; García, José A.; Bobadilla, Juan R.
2009-01-01
Herein two genetic codes from which the primeval RNA code could have originated the standard genetic code (SGC) are derived. One of them, called extended RNA code type I, consists of all codons of the type RNY (purine-any base-pyrimidine) plus codons obtained by considering the RNA code but in the second (NYR type) and third (YRN type) reading frames. The extended RNA code type II comprises all codons of the type RNY plus codons that arise from transversions of the RNA code in the first (YNY type) and third (RNR) nucleotide bases. In order to test if putative nucleotide sequences in the RNA World and in both extended RNA codes share the same scaling and statistical properties as those encountered in current prokaryotes, we used the genomes of four Eubacteria and three Archaea. For each prokaryote, we obtained their respective genomes obeying the RNA code or the extended RNA codes types I and II. In each case, we estimated the scaling properties of triplet sequences via a renormalization group approach, and we calculated the frequency distributions of distances for each codon. Remarkably, the scaling properties of the distance series of some codons from the RNA code and most codons from both extended RNA codes turned out to be identical or very close to the scaling properties of codons of the SGC. To test for the robustness of these results, we show, via computer simulation experiments, that random mutations of current genomes, at the rate of 10^-10 per site per year over three billion years, were not enough to destroy the observed patterns. Therefore, we conclude that most current prokaryotes may still contain relics of the primeval RNA World and that both extended RNA codes may well represent two plausible evolutionary paths between the RNA code and the current SGC. PMID:19183813
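The codon classes named in this abstract can be enumerated directly from the IUPAC symbols R (purine), Y (pyrimidine) and N (any base). The following Python sketch is only an illustration of that enumeration; the variable names are assumptions and the code does not reproduce the authors' scaling analysis.

```python
from itertools import product

# IUPAC nucleotide classes, written for RNA.
CLASSES = {"R": "AG", "Y": "CU", "N": "ACGU"}


def expand(pattern):
    """Return the set of codons matching a three-letter class pattern."""
    return {"".join(bases) for bases in product(*(CLASSES[s] for s in pattern))}


# Primeval RNA code: purine, any base, pyrimidine.
RNA_CODE = expand("RNY")

# Extended code type I: RNY plus the patterns read in the other two frames.
EXTENDED_I = RNA_CODE | expand("NYR") | expand("YRN")

# Extended code type II: RNY plus transversions at the first (YNY)
# and third (RNR) codon positions.
EXTENDED_II = RNA_CODE | expand("YNY") | expand("RNR")

print(len(RNA_CODE), len(EXTENDED_I), len(EXTENDED_II))  # 16 48 48
```

Because the three patterns in each extended code are mutually exclusive, each union contains 3 x 16 = 48 of the 64 possible codons.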
New genetic variants of LATS1 detected in urinary bladder and colon cancer.
Saadeldin, Mona K; Shawer, Heba; Mostafa, Ahmed; Kassem, Neemat M; Amleh, Asma; Siam, Rania
2014-01-01
LATS1, the large tumor suppressor 1 gene, encodes a serine/threonine kinase protein and is implicated in cell cycle progression. LATS1 is down-regulated in various human cancers, such as breast cancer and astrocytoma. Point mutations in LATS1 were reported in human sarcomas. Additionally, loss of heterozygosity of the LATS1 chromosomal region predisposes to breast, ovarian, and cervical tumors. In the current study, we investigated LATS1 genetic variations, including single nucleotide polymorphisms (SNPs), in 28 Egyptian patients with either urinary bladder or colon cancers. The LATS1 gene was amplified and sequenced, and the expression of LATS1 at the RNA level was assessed in 12 urinary bladder cancer samples. We report the identification of a total of 29 variants, including previously identified SNPs within LATS1 coding and non-coding sequences. A total of 18 variants were novel. The majority of the novel variants (13) were mapped to intronic sequences and untranslated regions of the gene. Four of the five novel variants located in the coding region of the gene represented missense mutations within the serine/threonine kinase catalytic domain. Interestingly, LATS1 RNA steady-state levels were lost in urinary bladder cancerous tissue harboring four specific SNPs (16045 + 41736 + 34614 + 56177) positioned in the 5'UTR, intron 6, and two silent mutations within exon 4 and exon 8, respectively. This study identifies novel single-base-sequence alterations in the LATS1 gene. These newly identified variants could potentially be used as novel diagnostic or prognostic tools in cancer.
Updating Our View of Organelle Genome Nucleotide Landscape
Smith, David Roy
2012-01-01
Organelle genomes show remarkable variation in architecture and coding content, yet their nucleotide composition is relatively unvarying across the eukaryotic domain, with most having a high adenine and thymine (AT) content. Recent studies, however, have uncovered guanine and cytosine (GC)-rich mitochondrial and plastid genomes. These sequences come from a small but eclectic list of species, including certain green plants and animals. Here, I review GC-rich organelle DNAs and the insights they have provided into the evolution of nucleotide landscape. I emphasize that GC-biased mitochondrial and plastid DNAs are more widespread than once thought, sometimes occurring together in the same species, and suggest that the forces biasing their nucleotide content can differ both among and within lineages, and may be associated with specific genome architectural features and life history traits. PMID:22973299
Okamura, Kohji; Sakaguchi, Hironari; Sakamoto-Abutani, Rie; Nakanishi, Mahito; Nishimura, Ken; Yamazaki-Inoue, Mayu; Ohtaka, Manami; Periasamy, Vaiyapuri Subbarayan; Alshatwi, Ali Abdullah; Higuchi, Akon; Hanaoka, Kazunori; Nakabayashi, Kazuhiko; Takada, Shuji; Hata, Kenichiro; Toyoda, Masashi; Umezawa, Akihiro
2016-01-01
Disease-specific induced pluripotent stem cells (iPSCs) have been used as a model to analyze pathogenesis of disease. In this study, we generated iPSCs derived from a fibroblastic cell line of xeroderma pigmentosum (XP) group A (XPA-iPSCs), a rare autosomal recessive hereditary disease in which patients develop skin cancer in the areas of skin exposed to sunlight. XPA-iPSCs exhibited hypersensitivity to ultraviolet exposure and accumulation of single-nucleotide substitutions when compared with ataxia telangiectasia-derived iPSCs that were established in a previous study. However, XPA-iPSCs did not show any chromosomal instability in vitro, i.e. intact chromosomes were maintained. The results were mutually compensating for examining two major sources of mutations, nucleotide excision repair deficiency and double-strand break repair deficiency. Like XP patients, XPA-iPSCs accumulated single-nucleotide substitutions that are associated with malignant melanoma, a manifestation of XP. These results indicate that XPA-iPSCs may serve as a monitoring tool (analogous to the Ames test but using mammalian cells) to measure single-nucleotide alterations, and may be a good model to clarify pathogenesis of XP. In addition, XPA-iPSCs may allow us to facilitate development of drugs that delay genetic alteration and decrease hypersensitivity to ultraviolet for therapeutic applications. PMID:27197874
Summary of evidence for an anticodonic basis for the origin of the genetic code
NASA Technical Reports Server (NTRS)
Lacey, J. C., Jr.; Mullins, D. W., Jr.
1981-01-01
This article summarizes data supporting the hypothesis that the genetic code origin was based on relationships (probably affinities) between amino acids and their anticodon nucleotides. Selective activation seems to follow from selective affinity and consequently, incorporation of amino acids into peptides can also be selective. It is suggested that these selectivities in affinity and activation, coupled with the base pairing specificities, allowed the origin of the code and the process of translation.
Ishikawa, Sohta A; Inagaki, Yuji; Hashimoto, Tetsuo
2012-01-01
In phylogenetic analyses of nucleotide sequences, 'homogeneous' substitution models, which assume the stationarity of base composition across a tree, are widely used, albeit individual sequences may bear distinctive base frequencies. In the worst-case scenario, a homogeneous model-based analysis can yield an artifactual union of two distantly related sequences that achieved similar base frequencies in parallel. Such potential difficulty can be countered by two approaches, 'RY-coding' and 'non-homogeneous' models. The former approach converts four bases into purine and pyrimidine to normalize base frequencies across a tree, while the heterogeneity in base frequency is explicitly incorporated in the latter approach. The two approaches have been applied to real-world sequence data; however, their basic properties have not been fully examined by pioneering simulation studies. Here, we assessed the performances of the maximum-likelihood analyses incorporating RY-coding and a non-homogeneous model (RY-coding and non-homogeneous analyses) on simulated data with parallel convergence to similar base composition. Both RY-coding and non-homogeneous analyses showed superior performances compared with homogeneous model-based analyses. Curiously, the performance of RY-coding analysis appeared to be significantly affected by a setting of the substitution process for sequence simulation relative to that of non-homogeneous analysis. The performance of a non-homogeneous analysis was also validated by analyzing a real-world sequence data set with significant base heterogeneity.
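RY-coding, as described in this abstract, recodes each base as purine (R) or pyrimidine (Y) so that sequences differing mainly in base composition become comparable. A minimal Python sketch follows; the function name is hypothetical, and leaving gaps and ambiguity codes unchanged is an assumption rather than something the abstract specifies.

```python
# Translation table: purines -> R, pyrimidines -> Y.
# Characters not listed (gaps, N, other ambiguity codes) pass through unchanged.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y", "U": "Y"})


def ry_code(sequence):
    """Recode a nucleotide sequence into the two-state R/Y alphabet."""
    return sequence.upper().translate(RY_TABLE)


# Two toy sequences with opposite base composition (AT-only vs GC-only)
# collapse to the same RY string, which is the normalizing effect.
at_rich = "AATTAATT"
gc_rich = "GGCCGGCC"
print(ry_code(at_rich))  # RRYYRRYY
print(ry_code(gc_rich))  # RRYYRRYY
```

The trade-off is that transitions become invisible after recoding, so only transversions carry phylogenetic signal.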
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species
Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha
2011-01-01
Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in the Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that over expressed tri-nucleotide SSRs in genomic and coding regions might be responsible in the generation of functional variation of proteins expressed which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
High-throughput discovery of rare human nucleotide polymorphisms by Ecotilling
Till, Bradley J.; Zerr, Troy; Bowers, Elisabeth; Greene, Elizabeth A.; Comai, Luca; Henikoff, Steven
2006-01-01
Human individuals differ from one another at only ∼0.1% of nucleotide positions, but these single nucleotide differences account for most heritable phenotypic variation. Large-scale efforts to discover and genotype human variation have been limited to common polymorphisms. However, these efforts overlook rare nucleotide changes that may contribute to phenotypic diversity and genetic disorders, including cancer. Thus, there is an increasing need for high-throughput methods to robustly detect rare nucleotide differences. Toward this end, we have adapted the mismatch discovery method known as Ecotilling for the discovery of human single nucleotide polymorphisms. To increase throughput and reduce costs, we developed a universal primer strategy and implemented algorithms for automated band detection. Ecotilling was validated by screening 90 human DNA samples for nucleotide changes in 5 gene targets and by comparing results to public resequencing data. To increase throughput for discovery of rare alleles, we pooled samples 8-fold and found Ecotilling to be efficient relative to resequencing, with a false negative rate of 5% and a false discovery rate of 4%. We identified 28 new rare alleles, including some that are predicted to damage protein function. The detection of rare damaging mutations has implications for models of human disease. PMID:16893952
Binladen, Jonas; Gilbert, M Thomas P; Bollback, Jonathan P; Panitz, Frank; Bendixen, Christian; Nielsen, Rasmus; Willerslev, Eske
2007-02-14
The invention of the Genome Sequence 20 DNA Sequencing System (454 parallel sequencing platform) has enabled the rapid and high-volume production of sequence data. Until now, however, individual emulsion PCR (emPCR) reactions and subsequent sequencing runs have been unable to combine template DNA from multiple individuals, as homologous sequences cannot be subsequently assigned to their original sources. We use conventional PCR with 5'-nucleotide tagged primers to generate homologous DNA amplification products from multiple specimens, followed by sequencing through the high-throughput Genome Sequence 20 DNA Sequencing System (GS20, Roche/454 Life Sciences). Each DNA sequence is subsequently traced back to its individual source through 5' tag analysis. We demonstrate that this new approach enables the assignment of virtually all the generated DNA sequences to the correct source once sequencing anomalies are accounted for (misassignment rate <0.4%). Therefore, the method enables accurate sequencing and assignment of homologous DNA sequences from multiple sources in a single high-throughput GS20 run. We observe a bias in the distribution of the differently tagged primers that is dependent on the 5' nucleotide of the tag. In particular, primers 5' labelled with a cytosine are heavily overrepresented among the final sequences, while those 5' labelled with a thymine are strongly underrepresented. A weaker bias also exists with regard to the distribution of the sequences as sorted by the second nucleotide of the dinucleotide tags. As the results are based on a single GS20 run, the general applicability of the approach requires confirmation. However, our experiments demonstrate that 5' primer tagging is a useful method in which the sequencing power of the GS20 can be applied to PCR-based assays of multiple homologous PCR products.
The new approach will be of value to a broad range of research areas, such as those of comparative genomics, complete mitochondrial analyses, population genetics, and phylogenetics.
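The tag-based assignment step described above can be sketched in Python as a lookup on the first bases of each read. This is a simplified illustration: the tag-to-specimen mapping, the read strings and the function name are all hypothetical, and real data would also need primer trimming and handling of sequencing errors in the tag.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len=2):
    """Assign reads to samples by their 5' tag.

    Returns (assigned, unassigned): a dict mapping sample name to the list of
    reads with the tag removed, and a list of reads whose tag is unknown.
    """
    assigned = defaultdict(list)
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len].upper(), read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)
        else:
            assigned[sample].append(insert)
    return dict(assigned), unassigned


# Hypothetical dinucleotide tags and reads.
tags = {"AC": "specimen_1", "GT": "specimen_2"}
reads = ["ACTTGACC", "GTTTGACC", "ACTTGATC", "CCTTGACC"]
by_sample, leftover = demultiplex(reads, tags)
print(by_sample)
print(leftover)
```

Counting reads per tag from the returned dictionary is also how a representation bias by 5' nucleotide, like the one reported in the abstract, would become visible.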
Shiao, Yih-Horng; Lupascu, Sorin T; Gu, Yuhan D; Kasprzak, Wojciech; Hwang, Christopher J; Fields, Janet R; Leighty, Robert M; Quiñones, Octavio; Shapiro, Bruce A; Alvord, W Gregory; Anderson, Lucy M
2009-10-19
Ribosomal RNA (rRNA) is a central regulator of cell growth and may control cancer development. A cis noncoding rRNA (nc-rRNA) upstream from the 45S rRNA transcription start site has recently been implicated in control of rRNA transcription in mouse fibroblasts. We investigated whether a similar nc-rRNA might be expressed in human cancer epithelial cells, and related to any genomic characteristics. Using quantitative rRNA measurement, we demonstrated that a nc-rRNA is transcribed in human lung epithelial and lung cancer cells, starting from approximately -1000 nucleotides upstream of the rRNA transcription start site (+1) and extending at least to +203. This nc-rRNA was significantly more abundant in the majority of lung cancer cell lines, relative to a nontransformed lung epithelial cell line. Its abundance correlated negatively with total 45S rRNA in 12 of 13 cell lines (P = 0.014). During sequence analysis from -388 to +306, we observed diverse, frequent intercopy single nucleotide polymorphisms (SNPs) in rRNA, with a frequency greater than predicted by chance at 12 sites. A SNP at +139 (U/C) in the 5' leader sequence varied among the cell lines and correlated negatively with level of the nc-rRNA (P = 0.014). Modelling of the secondary structure of the rRNA 5'-leader sequence indicated a small increase in structural stability due to the +139 U/C SNP and a minor shift in local configuration occurrences. The results demonstrate occurrence of a sense nc-rRNA in human lung epithelial and cancer cells, and imply a role in regulation of the rRNA gene, which may be affected by a +139 SNP in the 5' leader sequence of the primary rRNA transcript.
Khalaila, Jawad M; Elami, Amir; Caraco, Yoseph
2007-10-01
Single nucleotide polymorphisms at nucleotides 46, 79 and 491 of the beta2 adrenergic receptor (beta2AR) gene modify its pharmacological properties and may alter the response to agonists. The purpose of this study was to evaluate the role played by beta2AR polymorphisms on isoproterenol-induced relaxation of internal mammary arteries ex vivo. Internal mammary leftover segments were collected from 96 patients undergoing coronary artery bypass operation. Vascular rings were allowed to reach equilibrium with physiological Krebs solution before precontraction with U46619. Using the organ bath technique, cumulative dose-response curve of isoproterenol was constructed and average EC50 calculated. beta2AR genotyping was performed using a PCR-RFLP analysis. Arterial segments obtained from Gly16 homozygotes displayed reduced sensitivity to isoproterenol compared with carriers of Arg16 allele(s) [Mean (-log) EC50+/-SD, 6.42+/-0.24, 95% confidence interval (CI) 6.32-6.53 vs. 6.67+/-0.25, 95% CI 6.62-6.73, P<0.001]. Among Gly16 homozygotes, the presence of two Glu27 alleles restored vascular response to the level noted among Arg16 carriers (6.58+/-0.17, 95% CI 6.41-6.76). The least response to isoproterenol was noted in a single patient carrying the Gly16Gly-Gln27Glu-Thr164Ile combined genotype requiring almost six-fold higher isoproterenol concentration than carriers of the wild-type genotype to achieve half the maximal arterial dilatation (17.78 x 10(-7) vs. 3.01 x 10(-7) +/- 2.62 x 10(-7) mol/l). Vascular dilatation by isoproterenol is determined by a complex interaction between polymorphisms at nucleotides 46, 79 and 491 of the beta2AR gene. Further studies are warranted to evaluate the effect of additional polymorphisms in the coding and noncoding regions on vascular reactivity.
Volkán-Kacsó, Sándor; Marcus, Rudolph A
2016-10-25
A recently proposed chemomechanical group transfer theory of rotary biomolecular motors is applied to treat single-molecule controlled rotation experiments. In these experiments, single-molecule fluorescence is used to measure the binding and release rate constants of nucleotides by monitoring the occupancy of binding sites. It is shown how missed events of nucleotide binding and release in these experiments can be corrected using theory, with F1-ATP synthase as an example. The missed events are significant when the reverse rate is very fast. Using the theory, the actual rate constants in the controlled rotation experiments and the corrections are predicted from independent data, including other single-molecule rotation and ensemble biochemical experiments. The effective torsional elastic constant is found to depend on the binding/releasing nucleotide, and it is smaller for ADP than for ATP. There is a good agreement, with no adjustable parameters, between the theoretical and experimental results of controlled rotation experiments and stalling experiments, for the range of angles where the data overlap. This agreement is perhaps all the more surprising because it occurs even though the binding and release of fluorescent nucleotides is monitored at single-site occupancy concentrations, whereas the stalling and free rotation experiments have multiple-site occupancy.
NASA Astrophysics Data System (ADS)
Mackiewicz, P.; Gierlik, A.; Kowalczuk, M.; Szczepanik, D.; Dudek, M. R.; Cebrat, S.
1999-12-01
We have analysed protein coding and intergenic sequences in the Borrelia burgdorferi (the Lyme disease bacterium) genome using different kinds of DNA walks. Genes occupying the leading strand of DNA have significantly different nucleotide composition from genes occupying the lagging strand. Nucleotide compositional bias of the two DNA strands reflects the amino acid composition of proteins. 96% of genes coding for ribosomal proteins lie on the leading DNA strand, which suggests that the positions of these as well as other genes are non-random. In the B. burgdorferi genome, the asymmetry in intergenic DNA sequences is lower than the asymmetry in the third positions in codons. All these characters of the B. burgdorferi genome suggest that both replication-associated mutational pressure and recombination mechanisms have established the specific structure of the genome and now any recombination leading to inversion of a gene with respect to the direction of replication is forbidden. This property of the genome allows us to assume that it is in a steady state, which enables us to fix some parameters for simulations of DNA evolution.
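A DNA walk turns a sequence into a cumulative trajectory so that compositional asymmetry between strands shows up as a trend. The abstract does not say which walk definitions were used, so the Python sketch below assumes one common choice (step up for G, down for C, no step otherwise); the function name and toy sequence are illustrative only.

```python
from itertools import accumulate


def dna_walk(sequence, up="G", down="C"):
    """Cumulative walk: +1 for each `up` base, -1 for each `down` base."""
    steps = (1 if base == up else -1 if base == down else 0
             for base in sequence.upper())
    return list(accumulate(steps))


# Toy sequence: an excess of G over C produces a net upward drift.
walk = dna_walk("GGCATG")
print(walk)  # [1, 2, 1, 1, 1, 2]
```

On a real chromosome, a change in the direction of such a walk is often read as a switch between leading and lagging strand; restricting the walk to third codon positions or to intergenic positions would allow the kind of comparison the abstract describes.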
Substitution rate and natural selection in parvovirus B19
Stamenković, Gorana G.; Ćirković, Valentina S.; Šiljić, Marina M.; Blagojević, Jelena V.; Knežević, Aleksandra M.; Joksić, Ivana D.; Stanojević, Maja P.
2016-01-01
The aim of this study was to estimate substitution rate and imprints of natural selection on parvovirus B19 genotype 1. Studied datasets included 137 near complete coding B19 genomes (positions 665 to 4851) for phylogenetic and substitution rate analysis and 146 and 214 partial genomes for selection analyses in open reading frames ORF1 and ORF2, respectively, collected 1973–2012 and including 9 newly sequenced isolates from Serbia. Phylogenetic clustering assigned the majority of studied isolates to G1A. Nucleotide substitution rate for total coding DNA was 1.03 (0.6–1.27) × 10^-4 substitutions/site/year, with higher values for analyzed genome partitions. In spite of the highest evolutionary rate, VP2 codons were found to be under purifying selection with rare episodic positive selection, whereas codons under diversifying selection were found in the unique part of VP1, known to contain B19 immune epitopes important in persistent infection. Analyses of overlapping gene regions identified nucleotide positions under opposite selective pressure in different ORFs, suggesting complex evolutionary mechanisms of nucleotide changes in B19 viral genomes. PMID:27775080
Li, Tang; Chamberlin, Stephen G; Caraco, M Daniel; Liberles, David A; Gaucher, Eric A; Benner, Steven A
2006-01-01
Background The exchange of nucleotides at synonymous sites in a gene encoding a protein is believed to have little impact on the fitness of a host organism. This should be especially true for synonymous transitions, where a pyrimidine nucleotide is replaced by another pyrimidine, or a purine is replaced by another purine. This suggests that transition redundant exchange (TREx) processes at the third position of conserved two-fold codon systems might offer the best approximation for a neutral molecular clock, serving to examine, within coding regions, theories that require neutrality, determine whether transition rate constants differ within genes in a single lineage, and correlate dates of events recorded in genomes with dates in the geological and paleontological records. To date, TREx analysis of the yeast genome has recognized correlated duplications that established new metabolic strategies in fungi, and supported analyses of functional change in aromatases in pigs. TREx dating has limitations, however. Multiple transitions at synonymous sites may cause equilibration and loss of information. Further, to be useful to correlate events in the genomic record, different genes within a genome must suffer transitions at similar rates. Results A formalism to analyze divergence at two-fold redundant codon systems is presented. This formalism exploits two-state approach-to-equilibrium kinetics from chemistry. This formalism captures, in a single equation, the possibility of multiple substitutions at individual sites, avoiding any need to "correct" for these. The formalism also connects specific rate constants for transitions to specific approximations in an underlying evolutionary model, including assumptions that transition rate constants are invariant at different sites, in different genes, in different lineages, and at different times. Therefore, the formalism supports analyses that evaluate these approximations.
Transitions at synonymous sites within two-fold redundant coding systems were examined in the mouse, rat, and human genomes. The key metric (f2), the fraction of those sites that holds the same nucleotide, was measured for putative ortholog pairs. A transition redundant exchange (TREx) distance was calculated from f2 for these pairs. Pyrimidine-pyrimidine transitions at these sites occur approximately 14% faster than purine-purine transitions in various lineages. Transition rate constants were similar in different genes within the same lineages; within a set of orthologs, the f2 distribution is only modestly overdispersed. No correlation between disparity and overdispersion is observed. In rodents, evidence was found for greater conservation of TREx sites in genes on the X chromosome, accounting for a small part of the overdispersion, however. Conclusion The TREx metric is useful to analyze the history of transition rate constants within these mammals over the past 100 million years. The TREx metric estimates the extent to which silent nucleotide substitutions accumulate in different genes, on different chromosomes, with different compositions, in different lineages, and at different times. PMID:16545144
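The two-state approach-to-equilibrium idea can be sketched as follows. If the fraction of conserved two-fold sites decays from 1 toward an equilibrium value f_eq as f2 = f_eq + (1 - f_eq) exp(-kt), then a distance kt can be recovered from an observed f2. The Python sketch below assumes this generic first-order form and uses f_eq = 0.5 purely as an illustrative default (equal forward and reverse rate constants); neither the exact equation nor the equilibrium value is taken from the abstract.

```python
import math


def trex_distance(f2, f_eq=0.5):
    """Distance kt from the fraction f2 of two-fold sites still identical.

    Assumes f2 = f_eq + (1 - f_eq) * exp(-kt); f_eq is an assumed
    equilibrium fraction, not a value reported in the source.
    """
    if not f_eq < f2 <= 1.0:
        raise ValueError("f2 must be greater than f_eq and at most 1")
    return -math.log((f2 - f_eq) / (1.0 - f_eq))


def expected_f2(kt, f_eq=0.5):
    """Forward model: expected conserved fraction after distance kt."""
    return f_eq + (1.0 - f_eq) * math.exp(-kt)


# Round trip on a toy value: the distance is recovered from the model's f2.
print(round(trex_distance(expected_f2(0.3)), 6))  # 0.3
```

Because the exponential already accounts for sites that changed and changed back, no separate multiple-hit correction is applied, which is the property the abstract highlights. As f2 approaches f_eq the distance diverges, reflecting the loss of information at equilibration.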
Hu, Guang Fu; Liu, Xiang Jiang; Zou, Gui Wei; Li, Zhong; Liang, Hong-Wei; Hu, Shao-Na
2016-01-01
We sequenced the complete mitogenomes of (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio). Comparison of these two mitogenomes revealed that the two common carp strains were remarkably similar in genome length, gene order and content, and AT content. There were only 55 bp variations in 16,581 nucleotides: 1 bp was located in rRNAs, 2 bp in tRNAs, 9 bp in the control region and 43 bp in protein-coding genes. Furthermore, the forty-three variable nucleotides in the protein-coding genes of the two strains led to four variable amino acids, which were located in the ND2, ATPase 6, ND5 and ND6 genes, respectively.
The Single Nucleotide Polymorphism Consortium
NASA Technical Reports Server (NTRS)
Morgan, Michael
2003-01-01
I want to discuss both the Single Nucleotide Polymorphism (SNP) Consortium and the Human Genome Project. I am afraid most of my presentation will be thin on law and possibly too high on rhetoric. Having been engaged in a personal and direct way with these issues as a trained scientist, I find it quite difficult to be always as objective as I ought to be.
Analysis of single nucleotide polymorphisms in case-control studies.
Li, Yonghong; Shiffman, Dov; Oberbauer, Rainer
2011-01-01
Single nucleotide polymorphisms (SNPs) are the most common type of genetic variants in the human genome. SNPs are known to modify susceptibility to complex diseases. We describe and discuss methods used to identify SNPs associated with disease in case-control studies. An outline on study population selection, sample collection and genotyping platforms is presented, complemented by SNP selection, data preprocessing and analysis.
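A basic allelic test of the kind used in case-control SNP analysis can be sketched in Python from a 2 x 2 table of allele counts. The counts below are invented for illustration, the function names are hypothetical, and the abstract does not state which specific tests the authors cover; this shows only the standard odds ratio and Pearson chi-square statistic (1 degree of freedom, no continuity correction).

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Allelic odds ratio for the alternate allele, cases versus controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical allele counts: 100 cases and 100 controls, two alleles each.
case_alt, case_ref = 60, 140
ctrl_alt, ctrl_ref = 40, 160

print(round(odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref), 3))      # 1.714
print(round(chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref), 3))  # 5.333
```

With 1 degree of freedom, a statistic above about 3.84 corresponds to P < 0.05 for a single test; studies that test many SNPs need a multiple-testing correction on top of this.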
Xiao, Zhuo; Lie, Puchang; Fang, Zhiyuan; Yu, Luxin; Chen, Junhua; Liu, Jie; Ge, Chenchen; Zhou, Xuemeng; Zeng, Lingwen
2012-09-04
A lateral flow biosensor for detection of single nucleotide polymorphism based on circular strand displacement reaction (CSDPR) has been developed. Taking advantage of high fidelity of T4 DNA ligase, signal amplification by CSDPR, and the optical properties of gold nanoparticles, this assay has reached a detection limit of 0.01 fM.
A Laboratory Exercise for Genotyping Two Human Single Nucleotide Polymorphisms
ERIC Educational Resources Information Center
Fernando, James; Carlson, Bradley; LeBard, Timothy; McCarthy, Michael; Umali, Finianne; Ashton, Bryce; Rose, Ferrill F., Jr.
2016-01-01
The dramatic decrease in the cost of sequencing a human genome is leading to an era in which a wide range of students will benefit from having an understanding of human genetic variation. Since over 90% of sequence variation between humans is in the form of single nucleotide polymorphisms (SNPs), a laboratory exercise has been devised in order to…
USDA-ARS?s Scientific Manuscript database
The association of single nucleotide polymorphisms (SNPs) of calpastatin (CAST) gene with shear force of 2.54 cm steaks from M. longissimus dorsi from Gannan yaks (Bos grunniens, n=181) was studied. Yaks were harvested at 2, 3, and 4 yr of age (n=51, 59, and 71, respectively), and samples of each ya...
Winterhagen, Patrick; Wünsche, Jens-Norbert
2016-05-01
Within a polyembryonic mango seedling tree population, the genetic background of individuals should be identical because vigorous plants for cultivation are expected to develop from nucellar embryos representing maternal clones. Due to the fact that the mango cultivar 'Hôi' is assigned to the polyembryonic ecotype, an intra-cultivar variability of ethylene receptor genes was unexpected. Ethylene receptors in plants are conserved, but the number of receptors or receptor isoforms is variable regarding different plant species. However, it is shown here that the ethylene receptor MiETR1 is present in various isoforms within the mango cultivar 'Hôi'. The investigation of single nucleotide polymorphisms revealed that different MiETR1 isoforms cannot be discriminated simply by individual single nucleotide exchanges but by the specific arrangement of single nucleotide polymorphisms at certain positions in the exons of MiETR1. Furthermore, an MiETR1 isoform devoid of introns in the genomic sequence was identified. The investigation demonstrates some limitations of high resolution melting and ScreenClust analysis and points out the necessity of sequencing to identify individual isoforms and to determine the variability within the tree population.
Wu, N; Qin, H; Wang, M; Bian, Y; Dong, B; Sun, G; Zhao, W; Chang, G; Xu, Q; Chen, G
2017-04-01
1. Endothelin receptor B subtype 2 (EDNRB2) is a paralog of EDNRB, which encodes a 7-transmembrane G-protein coupled receptor. Previous studies reported that EDNRB was essential for melanoblast migration in mammals and ducks. 2. Muscovy ducks have different plumage colour phenotypes. Variations in EDNRB2 coding sequences (CDSs) and mRNA expression levels were investigated in 4 different Muscovy duck plumage colour phenotypes, including black, black mutant, silver and white head. 3. The EDNRB2 gene from Muscovy duck was cloned; it had a length of 6435 bp and encoded 437 amino acids. The coding region was screened and potential single nucleotide polymorphisms were identified. Eight mutations were obtained, including one missense variant (c.64C > T) and 7 synonymous substitutions. The substitutions were associated with plumage colour phenotypes. 4. The EDNRB2 mRNA expression levels were compared between feather pulp from black birds and black mutant birds. The results indicated that EDNRB2 transcripts in feather pulp were significantly higher in black feathers than in white feathers. 5. The results determined the variation of EDNRB2 CDS and mRNA expression in Muscovy ducks of various plumage colours.
Lathe, R
1985-05-05
Synthetic probes deduced from amino acid sequence data are widely used to detect cognate coding sequences in libraries of cloned DNA segments. The redundancy of the genetic code dictates that a choice must be made between (1) a mixture of probes reflecting all codon combinations, and (2) a single longer "optimal" probe. The second strategy is examined in detail. The frequency of sequences matching a given probe by chance alone can be determined and also the frequency of sequences closely resembling the probe and contributing to the hybridization background. Gene banks cannot be treated as random associations of the four nucleotides, and probe sequences deduced from amino acid sequence data occur more often than predicted by chance alone. Probe lengths must be increased to confer the necessary specificity. Examination of hybrids formed between unique homologous probes and their cognate targets reveals that short stretches of perfect homology occurring by chance make a significant contribution to the hybridization background. Statistical methods for improving homology are examined, taking human coding sequences as an example, and considerations of codon utilization and dinucleotide frequencies yield an overall homology of greater than 82%. Recommendations for probe design and hybridization are presented, and the choice between using multiple probes reflecting all codon possibilities and a unique optimal probe is discussed.
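The chance-match frequency discussed in the abstract above can be illustrated with a baseline calculation. This is a minimal sketch, assuming a uniform random model in which the four nucleotides are independent and equally likely; the abstract itself notes that real gene banks deviate from this model, so the result is best read as an optimistic lower bound on background. The function name and parameters are hypothetical.

```python
def expected_chance_matches(probe_length: int, library_bases: int,
                            both_strands: bool = True) -> float:
    """Expected number of exact probe matches in random sequence.

    Assumption (not from the abstract): each position is an independent
    draw from four equally likely nucleotides.
    """
    # Number of windows of the probe's length that fit in the library.
    positions = max(library_bases - probe_length + 1, 0)
    # A probe can hybridize to either strand of double-stranded DNA.
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** probe_length


if __name__ == "__main__":
    # Compare a short and a long probe against a 3-gigabase library.
    for n in (14, 17, 20, 30):
        print(n, expected_chance_matches(n, 3_000_000_000))
```

Under this model each extra nucleotide cuts the expected background fourfold, which is one way to see why longer probes are needed for specificity.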
O'Shannessy, Daniel J; Bendas, Katie; Schweizer, Charles; Wang, Wenquan; Albone, Earl; Somers, Elizabeth B; Weil, Susan; Meredith, Rhonda K; Wustner, Jason; Grasso, Luigi; Landers, Mark; Nicolaides, Nicholas C
2017-07-01
Farletuzumab (FAR) is a humanized monoclonal antibody (mAb) that binds to folate receptor alpha. A Ph3 trial in ovarian cancer patients treated with carboplatin/taxane plus FAR or placebo did not meet the primary statistical endpoint. Subgroup analysis demonstrated that subjects with high FAR exposure levels (Cmin>57.6μg/mL) showed statistically significant improvements in PFS and OS. The neonatal Fc receptor (fcgrt) plays a central role in albumin/IgG stasis and mAb pharmacokinetics (PK). Here we evaluated fcgrt sequence and association of its promoter variable number tandem repeats (VNTR) and coding single nucleotide variants (SNV) with albumin/IgG levels and FAR PK in the Ph3 patients. A statistical correlation existed between high FAR Cmin and AUC in patients with the highest quartile of albumin and lowest quartile of IgG1. Analysis of fcgrt identified 5 different VNTRs in the promoter region and 9 SNVs within the coding region, 4 of which are novel. Copyright © 2017. Published by Elsevier Inc.
SNPConvert: SNP Array Standardization and Integration in Livestock Species.
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
2016-06-09
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to whole genome sequence data analysis, SNP array data do not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git.
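The allele-coding problem described above can be illustrated with a small conversion routine. This is a minimal sketch and not SNPConvert's own interface: it assumes an "AB"-style genotype coding and a per-SNP lookup table, and every name in it (`AB_MAP`, `ab_to_nucleotide`, the SNP identifiers) is hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical lookup: SNP name -> (nucleotide coded as "A", nucleotide coded as "B").
AB_MAP: Dict[str, Tuple[str, str]] = {
    "snp1": ("A", "G"),
    "snp2": ("C", "T"),
}


def ab_to_nucleotide(snp: str, genotype: str,
                     ab_map: Dict[str, Tuple[str, str]],
                     missing: str = "0") -> str:
    """Translate an AB-coded genotype call into nucleotide alleles.

    Any character other than "A" or "B" (for example "-") is treated as a
    missing call and replaced by the `missing` symbol.
    """
    allele_a, allele_b = ab_map[snp]
    table = {"A": allele_a, "B": allele_b}
    return "".join(table.get(call, missing) for call in genotype)


if __name__ == "__main__":
    print(ab_to_nucleotide("snp1", "AB", AB_MAP))
    print(ab_to_nucleotide("snp2", "BB", AB_MAP))
```

Keeping the mapping in one table per array makes it explicit which convention each data source uses before files are merged.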
A resource of vectors and ES cells for targeted deletion of microRNAs in mice
Prosser, Haydn M.; Koike-Yusa, Hiroko; Cooper, James D.; Law, Frances C.; Bradley, Allan
2011-01-01
The 21-23 nucleotide single-stranded RNAs classified as microRNAs (miRNA) perform fundamental roles in a wide range of cellular and developmental processes. miRNAs regulate protein expression through sequence-specific base pairing with target messenger RNAs (mRNA) reducing both their stability and the process of protein translation [1, 2]. At least 30% of protein coding genes appear to be conserved targets for miRNAs [1]. In contrast to the protein coding genes [3, 4], no public resource of miRNA mouse mutant alleles exists. We have generated a library of highly germ-line transmissible C57BL/6N mouse mutant embryonic stem (ES) cells with targeted deletions for the majority of miRNA genes currently annotated within the miRBase registry [5]. These alleles have been designed to be highly adaptable research tools that can be efficiently altered to create reporter, conditional and other allelic variants. This ES cell resource can be searched electronically and is available from ES cell repositories for distribution to the scientific community [6]. PMID:21822254
L'Huillier, P J; Davis, S R; Bellamy, A R
1992-01-01
Ribozymes targeted to five sites along the alpha-lactalbumin (alpha-lac) mRNA were delivered to the cytoplasm of mouse C127I mammary cells using the T7-vaccinia virus delivery system and the amount of alpha-lac mRNA was monitored 24-48 h post-transfection. Three target sites were selected in the alpha-lac coding region (nucleotides 15, 145 and 361) and two were located in the 3' non-coding region (nucleotides 442 and 694). Acting in trans and at a target:ribozyme ratio of 1:1000, ribozymes targeting sites 361 and 694 reduced alpha-lac mRNA by > 80%; another two ribozymes (targeting nucleotides 442 and 145) reduced mRNA levels by 80 and 60% respectively; the fifth ribozyme (targeting nucleotide 15, near the AUG) was largely ineffective. The kinetic activity (kcat) of each ribozyme in vitro was somewhat predictive of the activity of the two ribozymes that targeted nucleotides 361 and 694, but was not predictive of the in vivo activity of the other three ribozymes. Down-regulation of the intracellular levels of alpha-lac paralleled the ribozyme-dependent reduction achieved for mRNA. For site 442, the reduction in both mRNA and protein was attributed to the catalytic activity of the ribozyme rather than to the antisense effects of the flanking arms, because delivery of an engineered (catalytically-inactive) variant had no effect on mRNA levels and a minimal effect on the level of alpha-lac present in the cell. PMID:1425576
2012-01-01
Background: Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results: We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion: The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
The chemical basis for the origin of the genetic code and the process of protein synthesis
NASA Technical Reports Server (NTRS)
1982-01-01
The major thrust is to understand just how the process of protein synthesis, including that very important aspect, genetic coding, came to be. Two aspects of the problem: the chemistry of active aminoacyl species; and affinities between amino acids and nucleotides, and specifically, how these affinities might affect the chemistry between the two are stressed.
Khodakov, Dmitriy A; Khodakova, Anastasia S; Huang, David M; Linacre, Adrian; Ellis, Amanda V
2015-03-04
Single nucleotide polymorphisms (SNPs) are a prime source of genetic diversity. Discriminating between different SNPs provides an enormous leap towards the better understanding of the uniqueness of biological systems. Here we report on a new approach for SNP discrimination using toehold-mediated DNA strand displacement. The distinctiveness of the approach is based on the combination of both 3- and 4-way branch migration mechanisms, which allows for reliable discrimination of SNPs within double-stranded DNA generated from real-life human mitochondrial DNA samples. Aside from the potential diagnostic value, the current study represents an additional way to control the strand displacement reaction rate without altering other reaction parameters and provides new insights into the influence of single nucleotide substitutions on 3- and 4-way branch migration efficiency and kinetics.
Single nucleotide polymorphism analysis using different colored dye dimer probes
NASA Astrophysics Data System (ADS)
Marmé, Nicole; Friedrich, Achim; Denapaite, Dalia; Hakenbeck, Regine; Knemeyer, Jens-Peter
2006-09-01
Fluorescence quenching by dye dimer formation has been utilized to develop hairpin-structured DNA probes for the detection of a single nucleotide polymorphism (SNP) in the penicillin target gene pbp2x, which is implicated in the penicillin resistance of Streptococcus pneumoniae. We designed two specific DNA probes for the identification of the pbp2x genes from a penicillin susceptible strain R6 and a resistant strain Streptococcus mitis 661 using green-fluorescent tetramethylrhodamine (TMR) and red-fluorescent DY-636, respectively. Hybridization of each of the probes to its respective target DNA sequence opened the DNA hairpin probes, consequently breaking the nonfluorescent dye dimers into fluorescent species. This hybridization of the target with the hairpin probe achieved single nucleotide specific detection at nanomolar concentrations via increased fluorescence.
New insights into the phylogenetics and population structure of the prairie falcon (Falco mexicanus)
Doyle, Jacqueline M.; Bell, Douglas A.; Bloom, Peter H.; Emmons, Gavin; Fesnock, Amy; Katzner, Todd; LePre, Larry; Leonard, Kolbe; SanMiguel, Phillip; Westerman, Rick; DeWoody, J. Andrew
2018-01-01
Background: Management requires a robust understanding of between- and within-species genetic variability; however, such data are still lacking in many species. For example, although multiple population genetics studies of the peregrine falcon (Falco peregrinus) have been conducted, no similar studies have been done of the closely-related prairie falcon (F. mexicanus) and it is unclear how much genetic variation and population structure exists across the species’ range. Furthermore, the phylogenetic relationship of F. mexicanus relative to other falcon species is contested. We utilized a genomics approach (i.e., genome sequencing and assembly followed by single nucleotide polymorphism genotyping) to rapidly address these gaps in knowledge. Results: We sequenced the genome of a single female prairie falcon and generated a 1.17 Gb (gigabases) draft genome assembly. We generated maximum likelihood phylogenetic trees using complete mitochondrial genomes as well as nuclear protein-coding genes. This process provided evidence that F. mexicanus is an outgroup to the clade that includes the peregrine falcon and members of the subgenus Hierofalco. We annotated > 16,000 genes and almost 600,000 high-quality single nucleotide polymorphisms (SNPs) in the nuclear genome, providing the raw material for a SNP assay design featuring > 140 gene-associated markers and a molecular-sexing marker. We subsequently genotyped ~ 100 individuals from California (including the San Francisco East Bay Area, Pinnacles National Park and the Mojave Desert) and Idaho (Snake River Birds of Prey National Conservation Area). We tested for population structure and found evidence that individuals sampled in California and Idaho represent a single panmictic population. Conclusions: Our study illustrates how genomic resources can rapidly shed light on genetic variability in understudied species and resolve phylogenetic relationships. Furthermore, we found evidence of a single, randomly mating population of prairie falcons across our sampling locations. Prairie falcons are highly mobile and relatively rare long-distance dispersal events may promote gene flow throughout the range. As such, California’s prairie falcons might be managed as a single population, indicating that management actions undertaken to benefit the species at the local level have the potential to influence the species as a whole.
Genetics of Congenital Heart Disease: Past and Present.
Muntean, Iolanda; Togănel, Rodica; Benedek, Theodora
2017-04-01
Congenital heart disease is the most common congenital anomaly, representing an important cause of infant morbidity and mortality. Congenital heart disease represents a group of heart anomalies that include septal defects, valve defects, and outflow tract anomalies. The exact genetic, epigenetic, or environmental basis of congenital heart disease remains poorly understood, although the mechanism is likely multifactorial. However, the development of new technologies for detecting copy number variants and single-nucleotide polymorphisms, together with next-generation sequencing, is accelerating the detection of genetic causes of heart anomalies. Recent studies suggest a role of small non-coding RNAs (microRNAs) in congenital heart disease. The recently described epigenetic factors have also been found to contribute to cardiac morphogenesis. In this review, we present past and recent genetic discoveries in congenital heart disease.
Porcine MYF6 gene: sequence, homology analysis, and variation in the promoter region.
Wyszyńska-Koko, J; Kurył, J
2004-01-01
The MYF6 gene codes for a bHLH transcription factor belonging to the MyoD family. Its expression accompanies the processes of differentiation and maturation of myotubes during embryogenesis and continues at a relatively high level after birth, affecting the muscle phenotype. The porcine MYF6 gene was amplified, sequenced, and compared with MYF6 gene sequences of other species. The amino acid sequence was deduced and an interspecies homology analysis was performed. The Myf-6 protein is highly conserved among species, with 99 and 97% identity when comparing pig with cow and human, respectively, and 93% when comparing pig with mouse and rat. A single nucleotide polymorphism (SNP) was revealed within the promoter region; it is a T → C transition recognized by the MspI restriction enzyme.
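The way a T → C transition can be read out with MspI, as in the abstract above, can be sketched in a few lines. MspI cuts at C^CGG; the two 10-base sequences below are invented for illustration and are not the porcine MYF6 promoter sequence. The function name is likewise hypothetical.

```python
MSPI_SITE = "CCGG"   # MspI recognition sequence
CUT_OFFSET = 1       # MspI cuts after the first C (C^CGG)


def mspi_fragment_lengths(sequence: str) -> list:
    """Return fragment lengths after cutting a linear sequence at every MspI site."""
    seq = sequence.upper()
    cuts = []
    start = seq.find(MSPI_SITE)
    while start != -1:
        cuts.append(start + CUT_OFFSET)
        start = seq.find(MSPI_SITE, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    t_allele = "AATCTGGTTA"  # hypothetical: no CCGG, stays uncut
    c_allele = "AATCCGGTTA"  # hypothetical: T -> C creates a CCGG site
    print(mspi_fragment_lengths(t_allele))
    print(mspi_fragment_lengths(c_allele))
```

In a PCR-RFLP setting, the uncut and cut fragment patterns would distinguish the two alleles on a gel; heterozygotes would show both patterns.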
TCIRG1-associated congenital neutropenia.
Makaryan, Vahagn; Rosenthal, Elisabeth A; Bolyard, Audrey Anna; Kelley, Merideth L; Below, Jennifer E; Bamshad, Michael J; Bofferding, Kathryn M; Smith, Joshua D; Buckingham, Kati; Boxer, Laurence A; Skokowa, Julia; Welte, Karl; Nickerson, Deborah A; Jarvik, Gail P; Dale, David C
2014-07-01
Severe congenital neutropenia (SCN) is a rare hematopoietic disorder, with estimated incidence of 1 in 200,000 individuals of European descent, many cases of which are inherited in an autosomal dominant pattern. Although several causal genes have been identified, the genetic basis for >30% of cases remains unknown. We report a five-generation family segregating a novel single nucleotide variant (SNV) in TCIRG1. There is perfect cosegregation of the SNV with congenital neutropenia in this family; all 11 affected, but none of the unaffected, individuals carry this novel SNV. Western blot analysis shows reduced levels of TCIRG1 protein in affected individuals, compared to healthy controls. Two unrelated patients with SCN, identified by independent investigators, are heterozygous for different, rare, highly conserved, coding variants in TCIRG1. © 2014 WILEY PERIODICALS, INC.
TCIRG1 associated Congenital Neutropenia
Makaryan, Vahagn; Rosenthal, Elisabeth A.; Bolyard, Audrey Anna; Kelley, Merideth L.; Below, Jennifer E.; Bamshad, Michael J.; Bofferding, Kathryn M.; Smith, Joshua D.; Buckingham, Kati; Boxer, Laurence A.; Skokowa, Julia; Welte, Karl; Nickerson, Deborah A.; Jarvik, Gail P.; Dale, David C.
2014-01-01
Severe congenital neutropenia (SCN) is a rare hematopoietic disorder, with estimated incidence of 1 in 200,000 individuals of European descent, many cases of which are inherited in an autosomal dominant pattern. Although several causal genes have been identified, the genetic basis for >30% of cases remains unknown. We report a five-generation family segregating a novel single nucleotide variant (SNV) in TCIRG1. There is perfect co-segregation of the SNV with congenital neutropenia in this family; all 11 affected, but none of the unaffected, individuals carry this novel SNV. Western blot analysis shows reduced levels of TCIRG1 protein in affected individuals, compared to healthy controls. Two unrelated patients with SCN, identified by independent investigators, are heterozygous for different, rare, highly conserved, coding variants in TCIRG1. PMID:24753205
Kochanowski, N; Blanchard, F; Cacan, R; Chirat, F; Guedon, E; Marc, A; Goergen, J-L
2006-01-15
Analysis of intracellular nucleotide and nucleotide sugar contents is essential in studying protein glycosylation of mammalian cells. Nucleotides and nucleotide sugars are the donor substrates of glycosyltransferases, and nucleotides are involved in cellular energy metabolism and its regulation. A sensitive and reproducible ion-pair reverse-phase high-performance liquid chromatography (RP-HPLC) method has been developed, allowing the direct and simultaneous detection and quantification of some essential nucleotides and nucleotide sugars. After a perchloric acid extraction, 13 molecules (8 nucleotides and 5 nucleotide sugars) were separated, including activated sugars such as UDP-glucose, UDP-galactose, GDP-mannose, UDP-N-acetylglucosamine, and UDP-N-acetylgalactosamine. To validate the analytical parameters, the reproducibility, linearity of calibration curves, detection limits, and recovery were evaluated for standard mixtures and cell extracts. The developed method is capable of resolving picomolar quantities of nucleotides and nucleotide sugars in a single chromatographic run. The HPLC method was then applied to quantify intracellular levels of nucleotides and nucleotide sugars of Chinese hamster ovary (CHO) cells cultivated in a bioreactor batch process. Evolutions of the titers of nucleotides and nucleotide sugars during the batch process are discussed.
Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.
Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W
2016-08-01
Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa.
Shahin, Arwa; van Kaauwen, Martijn; Esselink, Danny; Bargsten, Joachim W; van Tuyl, Jaap M; Visser, Richard G F; Arens, Paul
2012-11-20
Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of the estimated lily transcriptome and 39% of the estimated tulip transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs, lily and tulip (12,017 groups), and among the three monocot species, lily, tulip, and rice (6,900 groups), were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology.
Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies.
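The two SNP marker sets described in the abstract above differ only in how many neighbouring SNPs are tolerated within 50 bp of the target. The following is a minimal sketch of such a filter, assuming unique SNP positions on a single contig; it is not the authors' pipeline, and the function name and example positions are hypothetical.

```python
from bisect import bisect_left, bisect_right


def select_snps(positions, flank=50, max_flanking=0):
    """Keep SNPs with at most `max_flanking` other SNPs within `flank` bp.

    Assumes `positions` are unique coordinates on one contig.
    """
    ordered = sorted(positions)
    kept = []
    for pos in ordered:
        lo = bisect_left(ordered, pos - flank)
        hi = bisect_right(ordered, pos + flank)
        neighbours = hi - lo - 1  # subtract the target SNP itself
        if neighbours <= max_flanking:
            kept.append(pos)
    return kept


if __name__ == "__main__":
    snps = [100, 130, 300, 340, 360, 600]
    print(select_snps(snps, max_flanking=0))  # strict set
    print(select_snps(snps, max_flanking=1))  # relaxed set
```

Relaxing `max_flanking` from 0 to 1 enlarges the marker set, which is the trade-off the abstract reports between marker number and assay design constraints.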
Arita, Minetaro; Zhu, Shuang-Li; Yoshida, Hiromu; Yoneyama, Tetsuo; Miyamura, Tatsuo; Shimizu, Hiroyuki
2005-01-01
Outbreaks of poliomyelitis caused by circulating vaccine-derived polioviruses (cVDPVs) have been reported in areas where indigenous wild polioviruses (PVs) were eliminated by vaccination. Most of these cVDPVs contained unidentified sequences in the nonstructural protein coding region which were considered to be derived from human enterovirus species C (HEV-C) by recombination. In this study, we report isolation of a Sabin 3-derived PV recombinant (Cambodia-02) from an acute flaccid paralysis (AFP) case in Cambodia in 2002. We attempted to identify the putative recombination counterpart of Cambodia-02 by sequence analysis of nonpolio enterovirus isolates from AFP cases in Cambodia from 1999 to 2003. Based on the previously estimated evolution rates of PVs, the recombination event resulting in Cambodia-02 was estimated to have occurred within 6 months after the administration of oral PV vaccine (99.3% nucleotide identity in VP1 region). The 2BC and the 3Dpol coding regions of Cambodia-02 were grouped into the genetic cluster of indigenous coxsackie A virus type 17 (CAV17) (the highest [87.1%] nucleotide identity) and the cluster of indigenous CAV13-CAV18 (the highest [94.9%] nucleotide identity) by the phylogenic analysis of the HEV-C isolates in 2002, respectively. CAV13-CAV18 and CAV17 were the dominant HEV-C serotypes in 2002 but not in 2001 and in 2003. We found a putative recombination between CAV13-CAV18 and CAV17 in the 3CDpro coding region of a CAV17 isolate. These results suggested that a part of the 3Dpol coding region of PV3(Cambodia-02) was derived from a HEV-C strain genetically related to indigenous CAV13-CAV18 strains in 2002 in Cambodia. PMID:16188967
USDA-ARS's Scientific Manuscript database
Using linear regression models, we studied the main and two-way interaction effects of the predictor variables gender, age, BMI, and 64 folate/vitamin B-12/homocysteine/lipid/cholesterol-related single nucleotide polymorphisms (SNP) on log-transformed plasma homocysteine normalized by red blood cell...
ERIC Educational Resources Information Center
Gadow, Kenneth D.; Roohi, Jasmin; DeVincent, Carla J.; Kirsch, Sarah; Hatchwell, Eli
2010-01-01
Investigated association of single nucleotide polymorphism (SNP) rs301430 in glutamate transporter gene ("SLC1A1") with severity of repetitive behaviors (obsessive-compulsive behaviors, tics) and anxiety in children with autism spectrum disorder (ASD). Mothers and/or teachers completed a validated DSM-IV-referenced rating scale for 67 children…
USDA-ARS's Scientific Manuscript database
The periodic need to restock reagent pools for genotyping chips provides an opportunity to increase the number of single-nucleotide polymorphisms (SNP) on a chip at no increase in cost. A high-density chip with >140,000 SNP has been developed by GeneSeek Inc. (Lincoln, NE) to increase accuracy of ge...
Keith R. Merrill; Craig E. Coleman; Susan E. Meyer; Elizabeth A. Leger; Katherine A. Collins
2016-01-01
Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the...
ERIC Educational Resources Information Center
Zhang, Xu; Shao, Meng; Gao, Lu; Zhao, Yuanyuan; Sun, Zixuan; Zhou, Liping; Yan, Yongmin; Shao, Qixiang; Xu, Wenrong; Qian, Hui
2017-01-01
Laboratory exercise is helpful for medical students to understand the basic principles of molecular biology and to learn about the practical applications of molecular biology. We have designed a lab course on molecular biology about the determination of single nucleotide polymorphism (SNP) in human REV3 gene, the product of which is a subunit of…
Brimacombe, M.; Hazbon, M.; Motiwala, A. S.; Alland, D.
2007-01-01
A single-nucleotide polymorphism-based cluster grouping (SCG) classification system for Mycobacterium tuberculosis was used to examine antibiotic resistance type and resistance mutations in relationship to specific evolutionary lineages. Drug resistance and resistance mutations were seen across all SCGs. SCG-2 had higher proportions of katG codon 315 mutations and resistance to four drugs. PMID:17846140
Stranges, P. Benjamin; Palla, Mirkó; Kalachikov, Sergey; Nivala, Jeff; Dorwart, Michael; Trans, Andrew; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Tao, Chuanjuan; Morozova, Irina; Li, Zengmin; Shi, Shundi; Aberra, Aman; Arnold, Cleoma; Yang, Alexander; Aguirre, Anne; Harada, Eric T.; Korenblum, Daniel; Pollard, James; Bhat, Ashwini; Gremyachinskiy, Dmitriy; Bibillo, Arek; Chen, Roger; Davis, Randy; Russo, James J.; Fuller, Carl W.; Roever, Stefan; Ju, Jingyue; Church, George M.
2016-01-01
Scalable, high-throughput DNA sequencing is a prerequisite for precision medicine and biomedical research. Recently, we presented a nanopore-based sequencing-by-synthesis (Nanopore-SBS) approach, which used a set of nucleotides with polymer tags that allow discrimination of the nucleotides in a biological nanopore. Here, we designed and covalently coupled a DNA polymerase to an α-hemolysin (αHL) heptamer using the SpyCatcher/SpyTag conjugation approach. These porin–polymerase conjugates were inserted into lipid bilayers on a complementary metal oxide semiconductor (CMOS)-based electrode array for high-throughput electrical recording of DNA synthesis. The designed nanopore construct successfully detected the capture of tagged nucleotides complementary to a DNA base on a provided template. We measured over 200 tagged-nucleotide signals for each of the four bases and developed a classification method to uniquely distinguish them from each other and background signals. The probability of falsely identifying a background event as a true capture event was less than 1.2%. In the presence of all four tagged nucleotides, we observed sequential additions in real time during polymerase-catalyzed DNA synthesis. Single-polymerase coupling to a nanopore, in combination with the Nanopore-SBS approach, can provide the foundation for a low-cost, single-molecule, electronic DNA-sequencing platform. PMID:27729524
Mühlhausen, Stefanie; Findeisen, Peggy; Plessmann, Uwe; Urlaub, Henning; Kollmar, Martin
2016-07-01
The genetic code is the cellular translation table for the conversion of nucleotide sequences into amino acid sequences. Changes to the meaning of sense codons would introduce errors into almost every translated message and are expected to be highly detrimental. However, reassignment of single or multiple codons in mitochondria and nuclear genomes, although extremely rare, demonstrates that the code can evolve. Several models for the mechanism of alteration of nuclear genetic codes have been proposed (including "codon capture," "genome streamlining," and "ambiguous intermediate" theories), but with little resolution. Here, we report a novel sense codon reassignment in Pachysolen tannophilus, a yeast related to the Pichiaceae. By generating proteomics data and using tRNA sequence comparisons, we show that Pachysolen translates CUG codons as alanine and not as the more usual leucine. The Pachysolen tRNACAG is an anticodon-mutated tRNA(Ala) containing all major alanine tRNA recognition sites. The polyphyly of the CUG-decoding tRNAs in yeasts is best explained by a tRNA loss driven codon reassignment mechanism. Loss of the CUG-tRNA in the ancient yeast is followed by gradual decrease of respective codons and subsequent codon capture by tRNAs whose anticodon is not part of the aminoacyl-tRNA synthetase recognition region. Our hypothesis applies to all nuclear genetic code alterations and provides several testable predictions. We anticipate more codon reassignments to be uncovered in existing and upcoming genome projects. © 2016 Mühlhausen et al.; Published by Cold Spring Harbor Laboratory Press.
Heated oligonucleotide ligation assay (HOLA): an affordable single nucleotide polymorphism assay.
Black, W C; Gorrochotegui-Escalante, N; Duteau, N M
2006-03-01
Most single nucleotide polymorphism (SNP) detection requires expensive equipment and reagents. The oligonucleotide ligation assay (OLA) is an inexpensive SNP assay that detects ligation between a biotinylated "allele-specific detector" and a 3' fluorescein-labeled "reporter" oligonucleotide. No ligation occurs unless the 3' detector nucleotide is complementary to the SNP nucleotide. The original OLA used chemical denaturation and neutralization. Heated OLA (HOLA) instead uses a thermostable ligase and cycles of denaturing and hybridization for ligation and SNP detection. The cost per genotype is approximately US$1.25 with two-allele SNPs or approximately US$1.75 with three-allele SNPs. We illustrate the development of HOLA for SNP detection in the Early Trypsin and Abundant Trypsin loci in the mosquito Aedes aegypti (L.) and at the α-glycerophosphate dehydrogenase locus in the mosquito Anopheles gambiae s.s.
Competence in Streptococcus pneumoniae Is a Response to an Increasing Mutational Burden
Gagne, Alyssa L.; Stevens, Kathleen E.; Cassone, Marco; Pujari, Amit; Abiola, Olufunke E.; Chang, Diana J.; Sebert, Michael E.
2013-01-01
Competence for genetic transformation in Streptococcus pneumoniae has previously been described as a quorum-sensing trait regulated by a secreted peptide pheromone. Recently we demonstrated that competence is also activated by reduction in the accuracy of protein biosynthesis. We have now investigated whether errors upstream of translation in the form of random genomic mutations can provide a similar stimulus. Here we show that generation of a mutator phenotype in S. pneumoniae through deletions of mutX, hexA or hexB enhanced the expression of competence. Similarly, chemical mutagenesis with the nucleotide analog dPTP promoted development of competence. To investigate the relationship between mutational load and the activation of competence, replicate lineages of the mutX strain were serially passaged under conditions of relaxed selection allowing random accumulation of secondary mutations. Competence increased with propagation in these lineages but not in control lineages having wild-type mutX. Resequencing of these derived strains revealed between 1 and 9 single nucleotide polymorphisms (SNPs) per lineage, which were broadly distributed across the genome and did not involve known regulators of competence. Notably, the frequency of competence development among the sequenced strains correlated significantly with the number of nonsynonymous mutations that had been acquired. Together, these observations provide support for the hypothesis that competence in S. pneumoniae is regulated in response to the accumulated burden of coding mutations in the bacterial genome. In contrast to previously described DNA damage response systems that are activated by physical lesions in the chromosome, this pneumococcal pathway may represent a unique stress response system that monitors the coding integrity of the genome. PMID:23967325
Mitochondrial Genomes of Kinorhyncha: trnM Duplication and New Gene Orders within Animals.
Popova, Olga V; Mikhailov, Kirill V; Nikitin, Mikhail A; Logacheva, Maria D; Penin, Aleksey A; Muntyan, Maria S; Kedrova, Olga S; Petrov, Nikolai B; Panchin, Yuri V; Aleoshin, Vladimir V
2016-01-01
Many features of mitochondrial genomes of animals, such as patterns of gene arrangement, nucleotide content and substitution rate variation are extensively used in evolutionary and phylogenetic studies. Nearly 6,000 mitochondrial genomes of animals have already been sequenced, covering the majority of animal phyla. One of the groups that escaped mitogenome sequencing is phylum Kinorhyncha, an isolated taxon of microscopic worm-like ecdysozoans. The kinorhynchs are thought to be one of the early-branching lineages of Ecdysozoa, and their mitochondrial genomes may be important for resolving evolutionary relations between major animal taxa. Here we present the results of sequencing and analysis of mitochondrial genomes from two members of Kinorhyncha, Echinoderes svetlanae (Cyclorhagida) and Pycnophyes kielensis (Allomalorhagida). Their mitochondrial genomes are circular molecules approximately 15 Kbp in size. The kinorhynch mitochondrial gene sequences are highly divergent, which precludes accurate phylogenetic inference. The mitogenomes of both species encode a typical metazoan complement of 37 genes, which are all positioned on the major strand, but the gene order is distinct and unique among Ecdysozoa or animals as a whole. We predict four types of start codons for protein-coding genes in E. svetlanae and five in P. kielensis with a consensus DTD in single letter code. The mitochondrial genomes of E. svetlanae and P. kielensis encode duplicated methionine tRNA genes that display compensatory nucleotide substitutions. Two distant species of Kinorhyncha demonstrate similar patterns of gene arrangements in their mitogenomes. Both genomes have duplicated methionine tRNA genes; the duplication predates the divergence of the two species. The kinorhynchs share a few features pertaining to gene order that align them with Priapulida. Gene order analysis reveals that the gene arrangement specific to Priapulida may be ancestral for Scalidophora, Ecdysozoa, and even Protostomia.
PMID:27755612
Steel, L F; Telly, D L; Leonard, J; Rice, B A; Monks, B; Sawicki, J A
1996-10-01
Murine c-mos transcripts isolated from testes have 5'-untranslated regions (5'UTRs) of approximately 300 nucleotides with a series of four overlapping open reading frames (ORFs) upstream of the AUG codon that initiates the Mos ORF. Ovarian c-mos transcripts have shorter 5'UTRs (70-80 nucleotides) and contain only 1-2 of the upstream ORFs (uORFs). To test whether these 5'UTRs affect translational efficiency, we have constructed plasmids for the expression of chimeric transcripts with a mos-derived 5'UTR fused to the Escherichia coli beta-galactosidase coding region. Translational efficiency has been evaluated by measuring beta-galactosidase activity in NIH3T3 cells transiently transfected with these plasmids and with plasmids where various mutations have been introduced into the 5'UTR. We show that the 5'UTR characteristic of testis-specific c-mos mRNA strongly represses translation relative to the translation of transcripts that contain a 5'UTR derived from beta-globin mRNA, and this is mainly due to the four uORFs. Each of the four upstream AUG triplets can be recognized as a start site for translation, and no single uAUG dominates the repressive effect. The uORFs repress translation by a mechanism that is not affected by the amino acid sequence in the COOH-terminal region of the uORF-encoded peptides. The very short uORF (AUGUGA) present in ovary-specific transcripts does not repress translation. Staining of testis sections from transgenic mice carrying chimeric beta-galactosidase transgene constructs, which contain a mos 5'UTR with or without the uATGs, suggests that the uORFs can dramatically change the pattern of expression in spermatogenic cells.
Wagner, Isaac D.; Varghese, Litty B.; Hemme, Christopher L.; Wiegel, Juergen
2013-01-01
Thermal environments have island-like characteristics and provide a unique opportunity to study population structure and diversity patterns of microbial taxa inhabiting these sites. Strains having ≥98% 16S rRNA gene sequence similarity to the obligately anaerobic Firmicutes Thermoanaerobacter uzonensis were isolated from seven geothermal springs, separated by up to 1600 m, within the Uzon Caldera (Kamchatka, Russian Far East). The intraspecies variation and spatial patterns of diversity for this taxon were assessed by multilocus sequence analysis (MLSA) of 106 strains. Analysis of eight protein-coding loci (gyrB, lepA, leuS, pyrG, recA, recG, rplB, and rpoB) revealed that all loci were polymorphic and that nucleotide substitutions were mostly synonymous. There were 148 variable nucleotide sites across 8003 bp concatenates of the protein-coding loci. While pairwise FST values indicated a small but significant level of genetic differentiation between most subpopulations, there was a negligible relationship between genetic divergence and spatial separation. Strains with the same allelic profile were only isolated from the same hot spring, occasionally from consecutive years, and single locus variant (SLV) sequence types were usually derived from the same spring. While recombination occurred, there was an “epidemic” population structure in which a particular T. uzonensis sequence type rose in frequency relative to the rest of the population. These results demonstrate spatial diversity patterns for an anaerobic bacterial species in a relatively small geographic location and reinforce the view that terrestrial geothermal springs are excellent places to look for biogeographic diversity patterns regardless of the distances involved. PMID:23801987
NASA Technical Reports Server (NTRS)
Vercoutere, W.; Solbrig, A.; DeGuzman, V.; Deamer, D.; Akeson, M.
2003-01-01
We use a biological nano-scale pore to distinguish among individual DNA hairpins that differ by a single site of oxidation or a nick in the sugar-phosphate backbone. In earlier work we showed that the protein ion channel alpha-hemolysin can be used as a detector to distinguish single-stranded from double-stranded DNA, single base pair and single nucleotide differences. This resolution is in part a result of sensitivity to structural changes that influence the molecular dynamics of nucleotides within DNA. The strand cleavage products we examined here included a 5-base-pair (5-bp) hairpin with a 5-prime five-nucleotide overhang, and a complementary five-nucleotide oligomer. These produced predictable shoulder-spike and rapid near-full blockade signatures, respectively. When combined, strand annealing was monitored in real time. The residual current level dropped to a lower discrete level in the shoulder-spike blockade signatures, and the duration lengthened. However, these blockade signatures had a shorter duration than the unmodified 10-bp hairpin. To test the pore sensitivity to nucleotide oxidation, we examined a 9-bp hairpin with a terminal 8-oxo-deoxyguanosine (8-oxo-dG), or a penultimate 8-oxo-dG. Each produced blockade signatures that differed from the otherwise identical control 9-bp hairpins. This study showed that DNA structure is modified sufficiently by strand cleavage or oxidation damage at a single site to alter in a predictable manner the ionic current blockade signatures produced. This technique improves the ability to assess damage to DNA, and can provide a simple means to help characterize the risks of radiation exposure. It may also provide a method to test radiation protection.
Yamada, Yoshiji; Sakuma, Jun; Takeuchi, Ichiro; Yasukochi, Yoshiki; Kato, Kimihiko; Oguri, Mitsutoshi; Fujimaki, Tetsuo; Horibe, Hideki; Muramatsu, Masaaki; Sawabe, Motoji; Fujiwara, Yoshinori; Taniguchi, Yu; Obuchi, Shuichi; Kawai, Hisashi; Shinkai, Shoji; Mori, Seijiro; Arai, Tomio; Tanaka, Masashi
2017-06-13
We have performed exome-wide association studies to identify genetic variants that influence body mass index or confer susceptibility to obesity or metabolic syndrome in Japanese. The exome-wide association study for body mass index included 12,890 subjects, and those for obesity and metabolic syndrome included 12,968 subjects (3954 individuals with obesity, 9014 controls) and 6817 subjects (3998 individuals with MetS, 2819 controls), respectively. Exome-wide association studies were performed with Illumina HumanExome-12 DNA Analysis BeadChip or Infinium Exome-24 BeadChip arrays. The relation of genotypes of single nucleotide polymorphisms to body mass index was examined by linear regression analysis, and that of allele frequencies of single nucleotide polymorphisms to obesity or metabolic syndrome was evaluated with Fisher's exact test. The exome-wide association studies identified six, 11, and 40 single nucleotide polymorphisms as being significantly associated with body mass index, obesity (P < 1.21 × 10⁻⁶), or metabolic syndrome (P < 1.20 × 10⁻⁶), respectively. Subsequent multivariable logistic regression analysis with adjustment for age and sex revealed that three and five single nucleotide polymorphisms were related (P < 0.05) to obesity or metabolic syndrome, respectively, with one of these latter polymorphisms, rs7350481 (C/T) at chromosome 11q23.3, also being significantly (P < 3.13 × 10⁻⁴) associated with metabolic syndrome. The polymorphism rs7350481 may thus be a novel susceptibility locus for metabolic syndrome in Japanese. In addition, single nucleotide polymorphisms in three genes (CROT, TSC1, RIN3) and at four loci (ANKK1, ZNF804B, CSRNP3, 17p11.2) were implicated as candidate determinants of obesity and metabolic syndrome, respectively.
Dai, Weiran; Ye, Ziliang; Lu, Haili; Su, Qiang; Li, Hui; Li, Lang
2018-02-23
Previous results showed a certain correlation between the single nucleotide polymorphism of IL-10-1082G/A and rheumatic heart disease, but no systematic study had verified this conclusion. This systematic review examined the association between the single nucleotide polymorphism at the IL-10-1082G/A locus and rheumatic heart disease. PubMed, EMbase, the Cochrane Library, CBM, CNKI, VIP and WanFang Data were searched by computer from inception to June 2017, and case-control studies of the IL-10-1082G/A single nucleotide polymorphism in patients with rheumatic heart disease were collected. Two researchers independently screened the literature, extracted data and evaluated the risk of bias of the included studies, and RevMan 5.3 software was used for data analysis. A total of 3 case-control studies were included, comprising 318 patients with rheumatic heart disease and 502 controls. Meta-analysis showed that there was no correlation between IL-10-1082G/A gene polymorphism and rheumatic heart disease [AA+AG VS GG: OR = 0.62, 95% CI (0.28, 1.39), P = 0.25; AA VS AG+GG: OR = 0.73, 95% CI (0.54, 1.00), P = 0.05; AA VS GG: OR = 0.70, 95% CI (0.47, 1.05), P = 0.08; AG VS GG: OR = 0.65, 95% CI (0.22, 1.92), P = 0.43; A VS G: OR = 0.87, 95% CI (0.71, 1.06), P = 0.17]. When AA is treated as the recessive genotype, the single nucleotide polymorphism of IL-10-1082G/A is associated with the presence of rheumatic heart disease. Because of the limited quantity and quality of the included literature, further research is still needed.
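The reported odds ratios, confidence intervals and P values can be checked against one another with a short calculation. The sketch below is not taken from the paper: it assumes the intervals are symmetric on the log scale (a Wald-type interval, which is an assumption about how the software computed them) and recovers the two-sided P value implied by each interval. The function name `p_from_ci` is hypothetical.

```python
import math

def p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Two-sided P value implied by an odds ratio and its 95% CI.

    Assumes a normal approximation on the log-odds scale, so the
    standard error is the log-interval width divided by 2 * 1.96.
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2))

# (OR, CI low, CI high, reported P) copied from the abstract above.
reported = {
    "AA+AG vs GG": (0.62, 0.28, 1.39, 0.25),
    "AA vs AG+GG": (0.73, 0.54, 1.00, 0.05),
    "AA vs GG": (0.70, 0.47, 1.05, 0.08),
    "AG vs GG": (0.65, 0.22, 1.92, 0.43),
    "A vs G": (0.87, 0.71, 1.06, 0.17),
}

implied = {m: p_from_ci(o, lo, hi) for m, (o, lo, hi, _) in reported.items()}

for model, (o, lo, hi, p_rep) in reported.items():
    print(f"{model:12s} reported P = {p_rep:.2f}, implied P = {implied[model]:.3f}")
```

Under these assumptions the implied values agree with the reported ones to within rounding, and the recessive comparison (AA vs AG+GG) sits right at the conventional 0.05 boundary.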
Zhou, Jie; Kherani, Femida; Bardakjian, Tanya M.; Katowitz, James; Hughes, Nkecha; Schimmenti, Lisa A.; Schneider, Adele
2008-01-01
Purpose: Mutations in the SOX2 and CHX10 genes have been reported in patients with anophthalmia and/or microphthalmia. In this study, we evaluated 34 anophthalmic/microphthalmic patient DNA samples (two sets of siblings included) for mutations and sequence variants in SOX2 and CHX10. Methods: Conformational sensitive gel electrophoresis (CSGE) was used for the initial SOX2 and CHX10 screening of 34 affected individuals (two sets of siblings), five unaffected family members, and 80 healthy controls. Patient samples containing heteroduplexes were selected for sequence analysis. Base pair changes in SOX2 and CHX10 were confirmed by sequencing bidirectionally in patient samples. Results: Two novel heterozygous mutations and two sequence variants (one known) in SOX2 were identified in this cohort. Mutation c.310 G>T (p. Glu104X), found in one patient, was in the region encoding the high mobility group (HMG) DNA-binding domain and resulted in a change from glutamic acid to a stop codon. The second mutation, noted in two affected siblings, was a single nucleotide deletion c.549delC (p. Pro184ArgfsX19) in the region encoding the activation domain, resulting in a frameshift and premature termination of the coding sequence. The shortened protein products may result in the loss of function. In addition, a novel nucleotide substitution c.*557G>A was identified in the 3′-untranslated region in one patient. The relationship between the nucleotide change and the protein function is indeterminate. A known single nucleotide polymorphism (c. *469 C>A, SNP rs11915160) was also detected in 2 of the 34 patients. Screening of CHX10 identified two synonymous sequence variants, c.471 C>T (p.Ser157Ser, rs35435463) and c.579 G>A (p. Gln193Gln, novel SNP), and one non-synonymous sequence variant, c.871 G>A (p. Asp291Asn, novel SNP). The non-synonymous polymorphism was also present in healthy controls, suggesting non-causality.
Conclusions: These results support the role of SOX2 in ocular development. Loss of SOX2 function results in severe eye malformation. CHX10 was not implicated with microphthalmia/anophthalmia in our patient cohort. PMID:18385794
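The nonsense mutation c.310 G>T (p. Glu104X) can be traced by hand with the standard genetic code. The sketch below is illustrative only and uses hypothetical helper names. The abstract does not state which glutamate codon SOX2 carries at position 104, so both standard Glu codons (GAA and GAG) are tried; the coordinate arithmetic assumes 1-based coding-sequence numbering starting at the A of the ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code, DNA alphabet
GLU_CODONS = ("GAA", "GAG")           # the two codons for glutamic acid

def codon_index(cds_pos):
    """Map a 1-based coding position to (codon number, position in codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    i = pos_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]

codon_no, pos = codon_index(310)
mutants = [substitute(c, pos, "T") for c in GLU_CODONS]

print(f"c.310 falls in codon {codon_no}, position {pos}")
for wild_type, mutant in zip(GLU_CODONS, mutants):
    print(f"{wild_type} -> {mutant}: stop codon = {mutant in STOP_CODONS}")
```

Position 310 is the first base of codon 104, and replacing the leading G of either glutamate codon with T yields TAA or TAG, so the reported change to a stop codon holds whichever Glu codon is present.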
Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M
2007-01-01
Background: Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tags (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results: In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences.
Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion: These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
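Nucleotide diversity (π), quoted above as 0.0051 for grape, is commonly defined as the average number of differences per site between all pairs of sequences in a sample. The sketch below computes it for a toy alignment; the sequences are invented for illustration and are not grapevine data, the function name is hypothetical, and the abstract does not say which estimator or software the authors used.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences.

    A minimal estimator: all sequences must have equal length, and
    gaps or ambiguous bases are not handled specially.
    """
    n_sites = len(seqs[0])
    if any(len(s) != n_sites for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * n_sites)

# Hypothetical 10-bp alignment of four haplotypes with two variable sites.
toy_alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGTACGTAA",
]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```

With real re-sequencing data the same calculation would run over each gene fragment, and a value such as 0.0051 means roughly five differences per kilobase between two randomly chosen sequences.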
OmpF, a nucleotide-sensing nanoprobe, computational evaluation of single channel activities
NASA Astrophysics Data System (ADS)
Abdolvahab, R. H.; Mobasheri, H.; Nikouee, A.; Ejtehadi, M. R.
2016-09-01
The results of high-throughput practical single channel experiments should be formulated and validated by signal analysis approaches to increase the recognition precision of translocating molecules. For this purpose, the activities of the single nano-pore forming protein, OmpF, in the presence of nucleotides were recorded in real time by the voltage clamp technique and used as a means for nucleotide recognition. The results were analyzed based on the permutation entropy of current Time Series (TS), fractality, autocorrelation, structure function, spectral density, and peak fraction to recognize each nucleotide, based on its signature effect on the conductance, gating frequency and voltage sensitivity of the channel at different concentrations and membrane potentials. The amplitude and frequency of ion current fluctuation increased in the presence of Adenine more than Cytosine and Thymine in milli-molar (0.5 mM) concentrations. The variance of the current TS at various applied voltages showed a non-monotonic trend whose initial increasing slope in the presence of Thymine changed to a decreasing one in the second phase and was different from that of Adenine and Cytosine; e.g., by increasing the voltage from 40 to 140 mV in the 0.5 mM concentration of Adenine or Cytosine, the variance decreased by one third while for the case of Thymine it was doubled. Moreover, according to the structure function of TS, the fractality of current TS differed as a function of varying membrane potentials (pd) and nucleotide concentrations. Accordingly, the calculated permutation entropy of the TS validated the biophysical approach defined for the recognition of different nucleotides at various concentrations, pd's and polarities. Thus, the promising outcomes of the combined experimental and theoretical methodologies presented here can be implemented as a complementary means in pore-based nucleotide recognition approaches.
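Permutation entropy, one of the measures named above, summarizes how varied the ordinal patterns in a time series are. The following sketch implements the usual Bandt-Pompe definition for a plain list of samples; it is a generic illustration, not the authors' analysis pipeline, and the embedding order, delay, and example values are assumptions chosen for the demonstration.

```python
import math
from collections import Counter

def permutation_entropy(series, order=3, delay=1, normalize=False):
    """Shannon entropy (bits) of ordinal patterns of length `order`.

    Each window is reduced to the permutation that sorts it; ties are
    broken by position. With normalize=True the result is divided by
    log2(order!) so that it lies between 0 and 1.
    """
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series too short for this order and delay")
    patterns = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    entropy = -sum(
        (c / n_windows) * math.log2(c / n_windows) for c in patterns.values()
    )
    if normalize:
        entropy /= math.log2(math.factorial(order))
    return entropy

# Small hypothetical series: five windows fall into three ordinal patterns.
example = [4, 7, 9, 10, 6, 11, 3]
print(f"H = {permutation_entropy(example):.4f} bits")
```

A strictly increasing current trace would score zero because only one ordinal pattern occurs, while noisier traces approach the maximum of log2(order!) bits, which is why the measure can separate recordings with different fluctuation structure.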
β-Glucosidase coding sequences and protein from Orpinomyces PC-2
Li, Xin-Liang; Ljungdahl, Lars G.; Chen, Huizhong; Ximenes, Eduardo A.
2001-02-06
Provided is a novel β-glucosidase from Orpinomyces sp. PC2, nucleotide sequences encoding the mature protein and the precursor protein, and methods for recombinant production of this β-glucosidase.
Küpper, Clemens; Burke, Terry; Lank, David B.
2015-01-01
Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821 bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935
Origins of genes: "big bang" or continuous creation?
Keese, P K; Gibbs, A
1992-01-01
Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes. PMID:1329098
Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells
Carlile, Thomas M.; Rojas-Duran, Maria F.; Zinshteyn, Boris; Shin, Hakyung; Bartoli, Kristen M.; Gilbert, Wendy V.
2014-01-01
Post-transcriptional modification of RNA nucleosides occurs in all living organisms. Pseudouridine, the most abundant modified nucleoside in non-coding RNAs1, enhances the function of transfer RNA and ribosomal RNA by stabilizing RNA structure2–8. mRNAs were not known to contain pseudouridine, but artificial pseudouridylation dramatically affects mRNA function: it changes the genetic code by facilitating non-canonical base pairing in the ribosome decoding center9,10. However, without evidence of naturally occurring mRNA pseudouridylation, its physiological relevance was unclear. Here we present a comprehensive analysis of pseudouridylation in yeast and human RNAs using Pseudo-seq, a genome-wide, single-nucleotide-resolution method for pseudouridine identification. Pseudo-seq accurately identifies known modification sites as well as 100 novel sites in non-coding RNAs, and reveals hundreds of pseudouridylated sites in mRNAs. Genetic analysis allowed us to assign most of the new modification sites to one of seven conserved pseudouridine synthases, Pus1–4, 6, 7 and 9. Notably, the majority of pseudouridines in mRNA are regulated in response to environmental signals, such as nutrient deprivation in yeast and serum starvation in human cells. These results suggest a mechanism for the rapid and regulated rewiring of the genetic code through inducible mRNA modifications. Our findings reveal unanticipated roles for pseudouridylation and provide a resource for identifying the targets of pseudouridine synthases implicated in human disease11–13. PMID:25192136
Vorstman, Jacob A S; Olde Loohuis, Loes M; Kahn, René S; Ophoff, Roel A
2018-05-14
The co-occurrence of a Copy Number Variant (CNV) and a functional variant on the other allele may be a relevant genetic mechanism in schizophrenia. We hypothesized that the cumulative burden of such double hits - in particular those composed of a deletion and a coding single nucleotide variation (SNV) - is increased in patients with schizophrenia. We combined CNV data with coding variants data in 795 patients with schizophrenia and 474 controls. To limit false CNV detection, only CNVs called by two algorithms were included. CNV-affected genes were subsequently examined for coding SNVs, which we termed "CNV-SNVs". Correcting for total queried sequence, we assessed the CNV-SNV-burden and the combined predicted deleterious effect. We estimated p-values by permutation of the phenotype. We detected 105 CNV-SNVs; 67 in duplicated and 38 in deleted genic sequence. While the difference in CNV-SNV rates was not significant, the combined deleteriousness inferred by CNV-SNVs in deleted sequence was almost fourfold higher in cases compared to controls (nominal p = 0.009). This effect may be driven by a higher number of CNV-SNVs and/or by a higher degree of predicted deleteriousness of CNV-SNVs. No such effect was observed for duplications. We provide early evidence that deletions co-occurring with a functional variant may be relevant, albeit of modest impact, for the genetic etiology of schizophrenia. Large-scale consortium studies are required to validate our findings. Sequence-based analyses would provide the best resolution for detection of CNVs as well as coding variants genome-wide.
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.
Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A
2010-02-01
Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer
Wojcik, Sylwia E.; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S.; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z.; Rai, Kanti R.; Kipps, Thomas J.; Keating, Michael J.
2010-01-01
Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas. PMID:19926640
Chiusano, M L; D'Onofrio, G; Alvarez-Valin, F; Jabbari, K; Colonna, G; Bernardi, G
1999-09-30
We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three-state representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although as expected to a lesser extent, in the three types of secondary structure, suggesting that different selective constraints associated with the different structures affect the synonymous and nonsynonymous rates in a similar way. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.
NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.
Liu, Sophia S; Hockenberry, Adam J; Lancichinetti, Andrea; Jewett, Michael C; Amaral, Luís A N
2016-11-01
The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes as well as more effective engineering of biological systems.
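The maximum-entropy idea in the abstract above can be sketched for the simplest case: a fixed amino acid sequence and a target expected GC fraction. Under those two constraints the maximum-entropy distribution picks each codon independently with probability proportional to exp(λ · GC count) among its synonyms, and λ is tuned until the expected GC content hits the target. This is an illustrative sketch only, not NullSeq's code or API; every function name below is hypothetical.

```python
import math
import random
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS_FOR.setdefault(aa, []).append(codon)


def gc_count(codon):
    return sum(base in "GC" for base in codon)


def codon_weights(aa, lam):
    """Synonymous codons of `aa` with probabilities proportional to exp(lam * GC count)."""
    codons = CODONS_FOR[aa]
    weights = [math.exp(lam * gc_count(c)) for c in codons]
    total = sum(weights)
    return codons, [w / total for w in weights]


def expected_gc_fraction(protein, lam):
    expected = 0.0
    for aa in protein:
        codons, probs = codon_weights(aa, lam)
        expected += sum(p * gc_count(c) for c, p in zip(codons, probs))
    return expected / (3 * len(protein))


def solve_lambda(protein, target_gc, lo=-20.0, hi=20.0, iterations=60):
    """Bisection works because the expected GC fraction never decreases as lam grows."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if expected_gc_fraction(protein, mid) < target_gc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_coding_sequence(protein, target_gc, rng):
    """Draw one random coding sequence that encodes `protein` exactly."""
    lam = solve_lambda(protein, target_gc)
    return "".join(rng.choices(*codon_weights(aa, lam))[0] for aa in protein)
```

A call such as `sample_coding_sequence("MALGSRKTLV", 0.6, random.Random(1))` returns a 30-nucleotide sequence encoding that peptide. If the target lies outside the GC range the peptide can reach (for example a peptide of only Met and Trp), the bisection simply saturates at one end of the λ interval.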
Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F
1985-01-01
Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512
Gao, Li; Rafaels, Nicholas M; Huang, Lili; Potee, Joseph; Ruczinski, Ingo; Beaty, Terri H.; Paller, Amy S.; Schneider, Lynda C.; Gallo, Rich; Hanifin, Jon M.; Beck, Lisa A.; Geha, Raif S.; Mathias, Rasika A.; Leung, Donald Y. M.
2015-01-01
Background A subset of atopic dermatitis (AD) is associated with increased susceptibility to eczema herpeticum (ADEH+). We previously reported that common single nucleotide polymorphisms (SNPs) in interferon-gamma (IFNG) and receptor 1 (IFNGR1) were associated with ADEH+ phenotype. Objective To interrogate the role of rare variants in IFN-pathway genes for risk of ADEH+. Methods We performed targeted sequencing of interferon-pathway genes (IFNG, IFNGR1, IFNAR1 and IL12RB1) in 228 European American (EA) AD patients selected according to their EH status and severity measured by Eczema Area and Severity Index (EASI). Replication genotyping was performed in independent samples of 219 EA and 333 African Americans (AA). Functional investigation of ‘loss-of-function’ variants was conducted using site-directed mutagenesis. Results We identified 494 single nucleotide variants (SNVs) encompassing 105 kb of sequence, including 145 common, 349 (70.6%) rare (minor allele frequency (MAF) <5%) and 86 (17.4%) novel variants, of which 2.8% were coding-synonymous, 93.3% were non-coding (64.6% intronic), and 3.8% were missense. We identified six rare IFNGR1 missense variants, including three damaging variants (Val14Met (V14M), Val61Ile and Tyr397Cys (Y397C)) conferring a higher risk for ADEH+ (P=0.031). Variants V14M and Y397C were confirmed to be deleterious leading to partial IFNGR1 deficiency. Seven common IFNGR1 SNPs, along with common protective haplotypes (2 to 7-SNPs) conferred a reduced risk of ADEH+ (P=0.015-0.002, P=0.0015-0.0004, respectively), and both SNP and haplotype associations were replicated in an independent AA sample (P=0.004-0.0001 and P=0.001-0.0001, respectively). Conclusion Our results provide evidence that both rare and common genetic variants in the gene encoding IFNGR1 are implicated in susceptibility to the ADEH+ phenotype.
CAPSULE SUMMARY We provided the first evidence that rare functional IFNGR1 mutations contribute to a defective systemic IFN-γ immune response that accounts for the propensity of AD patients to disseminated viral skin infections. PMID:26343451
Single-Molecule Counting of Point Mutations by Transient DNA Binding
NASA Astrophysics Data System (ADS)
Su, Xin; Li, Lidan; Wang, Shanshan; Hao, Dandan; Wang, Lei; Yu, Changyuan
2017-03-01
High-confidence detection of point mutations is important for disease diagnosis and clinical practice. Hybridization probes are extensively used, but are hindered by their poor single-nucleotide selectivity. Shortening the length of DNA hybridization probes weakens the stability of the probe-target duplex, leading to transient binding between complementary sequences. The kinetics of probe-target binding events are highly dependent on the number of complementary base pairs. Here, we present a single-molecule assay for point mutation detection based on transient DNA binding and use of total internal reflection fluorescence microscopy. Statistical analysis of single-molecule kinetics enabled us to effectively discriminate between wild type DNA sequences and single-nucleotide variants at the single-molecule level. A higher single-nucleotide discrimination is achieved than in our previous work by optimizing the assay conditions, which is guided by statistical modeling of kinetics with a gamma distribution. The KRAS c.34 A mutation can be clearly differentiated from the wild type sequence (KRAS c.34 G) at a relative abundance as low as 0.01% mutant to WT. To demonstrate the feasibility of this method for analysis of clinically relevant biological samples, we used this technology to detect mutations in single-stranded DNA generated from asymmetric RT-PCR of mRNA from two cancer cell lines.
Methods and kits for nucleic acid analysis using fluorescence resonance energy transfer
Kwok, Pui-Yan; Chen, Xiangning
1999-01-01
A method for detecting the presence of a target nucleotide or sequence of nucleotides in a nucleic acid is disclosed. The method is comprised of forming an oligonucleotide labeled with two fluorophores on the nucleic acid target site. The doubly labeled oligonucleotide is formed by addition of a singly labeled dideoxynucleoside triphosphate to a singly labeled polynucleotide or by ligation of two singly labeled polynucleotides. Detection of fluorescence resonance energy transfer upon denaturation indicates the presence of the target. Kits are also provided. The method is particularly applicable to genotyping.
USDA-ARS's Scientific Manuscript database
In a marker-trait association study we estimated the statistical significance of 65 single nucleotide polymorphisms (SNP) in 23 candidate genes on HDL levels of two independent Caucasian populations. Each population consisted of men and women and their HDL levels were adjusted for gender and body we...
Eliakim, Alon; Ben Zaken, Sigal; Meckel, Yoav; Yamin, Chen; Dror, Nitzan; Nemet, Dan
2015-12-01
We present an adolescent elite water polo player who despite a genetic predisposition to develop exercise-induced severe muscle damage due to carrying the IL-6 174C allele single-nucleotide polymorphism, developed acute rhabdomyolysis only after a vigorous out-of-water training, suggesting that water polo training may be more suitable for genetically predisposed athletes.
Olsen, Randall J.; Sitkiewicz, Izabela; Ayeras, Ara A.; Gonulal, Vedia E.; Cantu, Concepcion; Beres, Stephen B.; Green, Nicole M.; Lei, Benfang; Humbird, Tammy; Greaver, Jamieson; Chang, Ellen; Ragasa, Willie P.; Montgomery, Charles A.; Cartwright, Joiner; McGeer, Allison; Low, Donald E.; Whitney, Adeline R.; Cagle, Philip T.; Blasdel, Terry L.; DeLeo, Frank R.; Musser, James M.
2010-01-01
Single-nucleotide changes are the most common cause of natural genetic variation among members of the same species, but there is remarkably little information bearing on how they alter bacterial virulence. We recently discovered a single-nucleotide mutation in the group A Streptococcus genome that is epidemiologically associated with decreased human necrotizing fasciitis (“flesh-eating disease”). Working from this clinical observation, we find that wild-type mtsR function is required for group A Streptococcus to cause necrotizing fasciitis in mice and nonhuman primates. Expression microarray analysis revealed that mtsR inactivation results in overexpression of PrsA, a chaperonin involved in posttranslational maturation of SpeB, an extracellular cysteine protease. Isogenic mutant strains that overexpress prsA or lack speB had decreased secreted protease activity in vivo and recapitulated the necrotizing fasciitis-negative phenotype of the ΔmtsR mutant strain in mice and monkeys. mtsR inactivation results in increased PrsA expression, which in turn causes decreased SpeB secreted protease activity and reduced necrotizing fasciitis capacity. Thus, a naturally occurring single-nucleotide mutation dramatically alters virulence by dysregulating a multiple gene virulence axis. Our discovery has broad implications for the confluence of population genomics and molecular pathogenesis research. PMID:20080771
Olsen, Randall J; Sitkiewicz, Izabela; Ayeras, Ara A; Gonulal, Vedia E; Cantu, Concepcion; Beres, Stephen B; Green, Nicole M; Lei, Benfang; Humbird, Tammy; Greaver, Jamieson; Chang, Ellen; Ragasa, Willie P; Montgomery, Charles A; Cartwright, Joiner; McGeer, Allison; Low, Donald E; Whitney, Adeline R; Cagle, Philip T; Blasdel, Terry L; DeLeo, Frank R; Musser, James M
2010-01-12
Single-nucleotide changes are the most common cause of natural genetic variation among members of the same species, but there is remarkably little information bearing on how they alter bacterial virulence. We recently discovered a single-nucleotide mutation in the group A Streptococcus genome that is epidemiologically associated with decreased human necrotizing fasciitis ("flesh-eating disease"). Working from this clinical observation, we find that wild-type mtsR function is required for group A Streptococcus to cause necrotizing fasciitis in mice and nonhuman primates. Expression microarray analysis revealed that mtsR inactivation results in overexpression of PrsA, a chaperonin involved in posttranslational maturation of SpeB, an extracellular cysteine protease. Isogenic mutant strains that overexpress prsA or lack speB had decreased secreted protease activity in vivo and recapitulated the necrotizing fasciitis-negative phenotype of the ΔmtsR mutant strain in mice and monkeys. mtsR inactivation results in increased PrsA expression, which in turn causes decreased SpeB secreted protease activity and reduced necrotizing fasciitis capacity. Thus, a naturally occurring single-nucleotide mutation dramatically alters virulence by dysregulating a multiple gene virulence axis. Our discovery has broad implications for the confluence of population genomics and molecular pathogenesis research.
Wang, Xiaohua; Chen, Yanling; Thomas, Catherine L; Ding, Guangda; Xu, Ping; Shi, Dexu; Grandke, Fabian; Jin, Kemo; Cai, Hongmei; Xu, Fangsen; Yi, Bin; Broadley, Martin R; Shi, Lei
2017-08-01
Breeding crops with ideal root system architecture for efficient absorption of phosphorus is an important strategy to reduce the use of phosphate fertilizers. To investigate genetic variants leading to changes in root system architecture, 405 oilseed rape cultivars were genotyped with a 60K Brassica Infinium SNP array in low and high P environments. A total of 285 single-nucleotide polymorphisms were associated with root system architecture traits at varying phosphorus levels. Nine single-nucleotide polymorphisms corroborate a previous linkage analysis of root system architecture quantitative trait loci in the BnaTNDH population. One peak single-nucleotide polymorphism region on A3 was associated with all root system architecture traits and co-localized with a quantitative trait locus for primary root length at low phosphorus. Two more single-nucleotide polymorphism peaks on A5 for root dry weight at low phosphorus were detected in both growth systems and co-localized with a quantitative trait locus for the same trait. The candidate genes identified on A3 form a haplotype 'BnA3Hap', that will be important for understanding the phosphorus/root system interaction and for the incorporation into Brassica napus breeding programs. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Guisinger, Mary M; Chumley, Timothy W; Kuehl, Jennifer V; Boore, Jeffrey L; Jansen, Robert K
2010-02-01
Plastid genomes of the grasses (Poaceae) are unusual in their organization and rates of sequence evolution. There has been a recent surge in the availability of grass plastid genome sequences, but a comprehensive comparative analysis of genome evolution has not been performed that includes any related families in the Poales. We report on the plastid genome of Typha latifolia, the first non-grass Poales sequenced to date, and we present comparisons of genome organization and sequence evolution within Poales. Our results confirm that grass plastid genomes exhibit acceleration in both genomic rearrangements and nucleotide substitutions. Poaceae have multiple structural rearrangements, including three inversions, three gene losses (accD, ycf1, ycf2), intron losses in two genes (clpP, rpoC1), and expansion of the inverted repeat (IR) into both large and small single-copy regions. These rearrangements are restricted to the Poaceae, and IR expansion into the small single-copy region correlates with the phylogeny of the family. Comparisons of 73 protein-coding genes for 47 angiosperms including nine Poaceae genera confirm that the branch leading to Poaceae has significantly accelerated rates of change relative to other monocots and angiosperms. Furthermore, rates of sequence evolution within grasses are lower, indicating a deceleration during diversification of the family. Overall there is a strong correlation between accelerated rates of genomic rearrangements and nucleotide substitutions in Poaceae, a phenomenon that has been noted recently throughout angiosperms. The cause of the correlation is unknown, but faulty DNA repair has been suggested in other systems including bacterial and animal mitochondrial genomes.
GESPA: classifying nsSNPs to predict disease association.
Khurana, Jay K; Reeder, Jay E; Shrimpton, Antony E; Thakar, Juilee
2015-07-25
Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus determining the clinical significance of each nsSNP is of great importance. Potential detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time consuming. Existing computational methods lack accuracy and features to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs. GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. The development and testing of GESPA was performed using the humsavar, ClinVar and humvar datasets. Additionally, GESPA also predicts the disease phenotype associated with a nsSNP with high accuracy, a feature unavailable in existing software. GESPA's overall accuracy exceeds existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data. GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa.
Gorbenko del Blanco, Darya; de Graaff, Laura C G; Visser, Theo J; Hokken-Koelega, Anita C S
2013-03-01
Combined pituitary hormone deficiency (CPHD) is characterized by deficiencies of two or more anterior pituitary hormones. Its genetic cause is unknown in the majority of cases. The Hedgehog (Hh) signalling pathway has been implicated in disorders associated with pituitary development. Mutations in Sonic Hedgehog (SHH) have been described in patients with holoprosencephaly (with or without pituitary involvement). Hedgehog interacting protein (HHIP) has been associated with variations in adult height in genome wide association studies. We investigated whether mutations in these two genes of the Hh pathway, SHH and HHIP, could result in 'idiopathic' CPHD. We directly sequenced the coding regions and exon - intron boundaries of SHH and HHIP in 93 CPHD patients of the Dutch HYPOPIT study in whom mutations in the classical CPHD genes PROP1, POU1F1, HESX1, LHX3 and LHX4 had been ruled out. We compared the expression of Hh genes in Hep3B transfected cells between wild-type proteins and mutants. We identified three single-nucleotide variants (p.Ala226Thr, c.1078C>T and c.*8G>T) in SHH. The function of the latter was severely affected in our in vitro assay. In HHIP, we detected a new activating variant c.-1G>C, which increases HHIP's inhibiting function on the Hh pathway. Our results suggest involvement of the Hedgehog pathway in CPHD. We suggest that both SHH and HHIP are investigated as a second screening in CPHD, after mutations in the classical CPHD genes have been ruled out. © 2012 Blackwell Publishing Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beutler, E.; Gelbart, T.; Kuhl, W.
1991-12-01
Gaucher disease is an autosomal recessive glycolipid storage disease characterized by a deficiency of glucocerebrosidase. The disease is most common in persons of Ashkenazi Jewish ancestry and the most common mutation, accounting for about 75% of the mutant alleles in this population, is known to be an A → G substitution at cDNA nucleotide (nt) 1,226. Screening for this disease has not been possible because nearly 25% of the mutant alleles had not been identified, but linkage analysis led to the suggestion that most of these could be accounted for by a single mutation. The authors now report the discovery of this mutation. The insertion of a single nucleotide, a second guanine at cDNA nt 84 (the 84GG mutation), has been detected in the 5′ coding region of the glucocerebrosidase gene. The amount of mRNA produced is shown to be normal but since the frameshift produces early termination, no translation product is seen. This finding is consistent with the virtual absence of antigen found in patients carrying this mutation. The 84GG mutation accounts for most of the previously unidentified Gaucher disease mutations in Jewish patients. The common Jewish mutation at nt 1,448 accounted for 95% of all of the Gaucher disease-producing alleles in 71 Jewish patients. This now makes it possible to screen for heterozygotes on a DNA level with a relatively low risk of missing couples at risk for producing infants with Gaucher disease.
Nascimento, Ana P B; Ortiz, Mauro F; Martins, Willames M B S; Morais, Guilherme L; Fehlberg, Lorena C C; Almeida, Luiz G P; Ciapina, Luciane P; Gales, Ana C; Vasconcelos, Ana T R
2016-01-01
Carbapenems represent the mainstay therapy for the treatment of serious P. aeruginosa infections. However, the emergence of carbapenem resistance has jeopardized the clinical use of this important class of compounds. The production of SPM-1 metallo-β-lactamase has been the most common mechanism of carbapenem resistance identified in P. aeruginosa isolated from Brazilian medical centers. Interestingly, a single SPM-1-producing P. aeruginosa clone belonging to the ST277 has been widely spread within the Brazilian territory. In the current study, we performed a next-generation sequencing of six SPM-1-producing P. aeruginosa ST277 isolates. The core genome contains 5899 coding genes relative to the reference strain P. aeruginosa PAO1. A total of 26 genomic islands were detected in these isolates. We identified remarkable elements inside these genomic islands, such as copies of the blaSPM-1 gene conferring resistance to carbapenems and a type I-C CRISPR-Cas system, which is involved in protection of the chromosome against foreign DNA. In addition, we identified single nucleotide polymorphisms causing amino acid changes in antimicrobial resistance and virulence-related genes. Together, these factors could contribute to the marked resistance and persistence of the SPM-1-producing P. aeruginosa ST277 clone. A comparison of the SPM-1-producing P. aeruginosa ST277 genomes showed that their core genome has a high level nucleotide similarity and synteny conservation. The variability observed was mainly due to acquisition of genomic islands carrying several antibiotic resistance genes.
Thomson, P A; Parla, J S; McRae, A F; Kramer, M; Ramakrishnan, K; Yao, J; Soares, D C; McCarthy, S; Morris, S W; Cardone, L; Cass, S; Ghiban, E; Hennah, W; Evans, K L; Rebolini, D; Millar, J K; Harris, S E; Starr, J M; MacIntyre, D J; McIntosh, A M; Watson, J D; Deary, I J; Visscher, P M; Blackwood, D H; McCombie, W R; Porteous, D J
2014-06-01
A balanced t(1;11) translocation that transects the Disrupted in schizophrenia 1 (DISC1) gene shows genome-wide significant linkage for schizophrenia and recurrent major depressive disorder (rMDD) in a single large Scottish family, but genome-wide and exome sequencing-based association studies have not supported a role for DISC1 in psychiatric illness. To explore DISC1 in more detail, we sequenced 528 kb of the DISC1 locus in 653 cases and 889 controls. We report 2718 validated single-nucleotide polymorphisms (SNPs) of which 2010 have a minor allele frequency of <1%. Only 38% of these variants are reported in the 1000 Genomes Project European subset. This suggests that many DISC1 SNPs remain undiscovered and are essentially private. Rare coding variants identified exclusively in patients were found in likely functional protein domains. Significant region-wide association was observed between rs16856199 and rMDD (P=0.026, unadjusted P=6.3 × 10^-5, OR=3.48). This was not replicated in additional recurrent major depression samples (replication P=0.11). Combined analysis of both the original and replication set supported the original association (P=0.0058, OR=1.46). Evidence for segregation of this variant with disease in families was limited to those of rMDD individuals referred from primary care. Burden analysis for coding and non-coding variants gave nominal associations with diagnosis and measures of mood and cognition. Together, these observations are likely to generalise to other candidate genes for major mental illness and may thus provide guidelines for the design of future studies.
Protein functional features are reflected in the patterns of mRNA translation speed.
López, Daniel; Pazos, Florencio
2015-07-09
The degeneracy of the genetic code makes it possible for the same amino acid string to be coded by different messenger RNA (mRNA) sequences. These "synonymous mRNAs" may differ largely in a number of aspects related to their overall translational efficiency, such as secondary structure content and availability of the encoded transfer RNAs (tRNAs). Consequently, they may render different yields of the translated polypeptides. These mRNA features related to translation efficiency also play a role locally, resulting in a non-uniform translation speed along the mRNA, which has been previously related to some protein structural features and also used to explain some dramatic effects of "silent" single-nucleotide polymorphisms (SNPs). In this work we perform the first large-scale analysis of the relationship between three experimental proxies of mRNA local translation efficiency and the local features of the corresponding encoded proteins. We found that a number of protein functional and structural features are reflected in the patterns of ribosome occupancy, secondary structure and tRNA availability along the mRNA. One or more of these proxies of translation speed have distinctive patterns around the mRNA regions coding for certain protein local features. In some cases the three patterns follow a similar trend. We also show specific examples where these patterns of translation speed point to the protein's important structural and functional features. This supports the idea that the genome not only codes the protein functional features as sequences of amino acids, but also as subtle patterns of mRNA properties which, probably through local effects on the translation speed, have some consequence on the final polypeptide. These results open the possibility of predicting a protein's functional regions based on a single genomic sequence, and have implications for heterologous protein expression and fine-tuning protein function.
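To give a sense of how large the space of "synonymous mRNAs" described above is, the short sketch below counts the coding sequences that encode a given peptide under the standard genetic code. It only illustrates code degeneracy and is not part of the authors' analysis; the function name is hypothetical.

```python
from collections import Counter
from math import prod

# One letter per codon of the standard genetic code (NCBI translation table 1);
# '*' marks the three stop codons. Only the letter counts matter here.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
DEGENERACY = Counter(AMINO)  # amino acid -> number of synonymous codons


def count_synonymous_mrnas(peptide):
    """Number of distinct coding sequences that translate to `peptide`."""
    return prod(DEGENERACY[aa] for aa in peptide)


print(count_synonymous_mrnas("MLS"))  # 1 * 6 * 6 = 36
```

The count grows multiplicatively with peptide length, so even short proteins have far more synonymous encodings than could be enumerated, which is part of why local proxies of translation speed are studied instead.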
Hinney, Anke; Hoch, Anne; Geller, Frank; Schäfer, Helmut; Siegfried, Wolfgang; Goldschmidt, Hanspeter; Remschmidt, Helmut; Hebebrand, Johannes
2002-06-01
Ghrelin induces obesity via central and peripheral mechanisms. Administration of ghrelin leads to increased food intake and decreased fat utilisation in rodents. Ghrelin levels are decreased in obese individuals. Recently, a polymorphism (Arg-51-Gln) within the ghrelin gene (GHRL) was described to be associated with obesity. We screened the GHRL coding region in 215 extremely obese German children and adolescents (study group 1) and 93 normal weight students (study group 2) by single strand conformation polymorphism analysis (SSCP). We found the two previously described single nucleotide polymorphisms (SNPs: Arg-51-Gln and Leu-72-Met) in similar frequencies in study groups 1 and 2 (allele frequencies were: 0.019 and 0.016 for the 51-Gln allele and 0.091 and 0.086 for the 72-Met allele, respectively). Hence, we could not confirm the previous finding. Additionally, two novel variants were identified within the coding region: (1) We detected one healthy normal weight individual with a frameshift mutation (2 bp deletion at codon 34). This frameshift mutation affects the coding region of the mature ghrelin. Hence, it is highly likely that the normal weight student is haplo-insufficient for ghrelin. (2) An A to T transversion leads to an amino acid exchange from Gln to Leu at amino acid position 90. The frequency of the 90-Leu allele was significantly higher in the extremely obese children and adolescents (0.063) than in the normal weight students (0.016; nominal p = 0.011). Additionally, we genotyped 134 underweight students and 44 normal weight adults for this SNP. Genotype frequencies were similar in extremely obese children and adolescents, underweight students and normal weight adults (p > 0.8). In conclusion, we identified four sequence variants in the coding region of the ghrelin gene in individuals belonging to different weight extremes. A frameshift mutation was detected in a normal weight individual. None of the variants seem to influence weight regulation.
Champagne, Devin P.; Shockett, Penny E.
2014-01-01
Illegitimate V(D)J recombination at oncogenes and tumor suppressor genes is implicated in formation of several T cell malignancies. Notch1 and Bcl11b, genes involved in developing T cell specification, selection, proliferation, and survival, were previously shown to contain hotspots for deletional illegitimate V(D)J recombination associated with radiation-induced thymic lymphoma. Interestingly, these deletions were also observed in wild-type animals. In this study, we conducted frequency, clonality, and junctional processing analyses of Notch1 and Bcl11b deletions during mouse development and compared results to published analyses of authentic V(D)J rearrangements at the T cell receptor beta (TCRβ) locus and illegitimate V(D)J deletions observed at the human, nonimmune HPRT1 locus not involved in T cell malignancies. We detect deletions in Notch1 and Bcl11b in thymic and splenic T cell populations, consistent with cells bearing deletions in the circulating lymphocyte pool. Deletions in thymus can occur in utero, increase in frequency between fetal and postnatal stages, are detected at all ages examined between fetal and 7 months, exhibit only limited clonality (contrasting with previous results in radiation-sensitive mouse strains), and consistent with previous reports are more frequent in Bcl11b, partially explained by relatively high Recombination Signal Information Content (RIC) scores. Deletion junctions in Bcl11b exhibit greater germline nucleotide loss, while in Notch1 palindromic (P) nucleotides are more abundant, although average P nucleotide length is similar for both genes and consistent with results at the TCRβ locus. Non-templated (N) nucleotide insertions appear to increase between fetal and postnatal stages for Notch1, consistent with normal terminal deoxynucleotidyl transferase (TdT) activity; however, neonatal Bcl11b junctions contain elevated levels of N insertions. 
Finally, contrasting with results at the HPRT1 locus, we find no obvious age or gender bias in junctional processing, and inverted repeats at recessed coding ends (Pr nucleotides) correspond mostly to single-base additions consistent with normal TdT activity. PMID:24530429
A fully decompressed synthetic bacteriophage øX174 genome assembled and archived in yeast.
Jaschke, Paul R; Lieberman, Erica K; Rodriguez, Jon; Sierra, Adrian; Endy, Drew
2012-12-20
The 5386 nucleotide bacteriophage øX174 genome has a complicated architecture that encodes 11 gene products via overlapping protein coding sequences spanning multiple reading frames. We designed a 6302 nucleotide synthetic surrogate, øX174.1, that fully separates all primary phage protein coding sequences along with cognate translation control elements. To specify øX174.1f, a decompressed genome the same length as wild type, we truncated the gene F coding sequence. We synthesized DNA encoding fragments of øX174.1f and used a combination of in vitro- and yeast-based assembly to produce yeast vectors encoding natural or designer bacteriophage genomes. We isolated clonal preparations of yeast plasmid DNA and transfected E. coli C strains. We recovered viable øX174 particles containing the øX174.1f genome from E. coli C strains that independently express full-length gene F. We expect that yeast can serve as a genomic 'drydock' within which to maintain and manipulate clonal lineages of other obligate lytic phage. Copyright © 2012 Elsevier Inc. All rights reserved.
Kowalski, Madzia P.; Baylis, Howard A.; Krude, Torsten
2015-01-01
ABSTRACT Stem bulge RNAs (sbRNAs) are a family of small non-coding stem-loop RNAs present in Caenorhabditis elegans and other nematodes, the function of which is unknown. Here, we report the first functional characterisation of nematode sbRNAs. We demonstrate that sbRNAs from a range of nematode species are able to reconstitute the initiation of chromosomal DNA replication in the presence of replication proteins in vitro, and that conserved nucleotide sequence motifs are essential for this function. By functionally inactivating sbRNAs with antisense morpholino oligonucleotides, we show that sbRNAs are required for S phase progression, early embryonic development and the viability of C. elegans in vivo. Thus, we demonstrate a new and essential role for sbRNAs during the early development of C. elegans. sbRNAs show limited nucleotide sequence similarity to vertebrate Y RNAs, which are also essential for the initiation of DNA replication. Our results therefore establish that the essential function of small non-coding stem-loop RNAs during DNA replication extends beyond vertebrates. PMID:25908866
Hirota, R; Yamagata, A; Kato, J; Kuroda, A; Ikeda, T; Takiguchi, N; Ohtake, H
2000-02-01
Pulsed-field gel electrophoresis of PmeI digests of the Nitrosomonas sp. strain ENI-11 chromosome produced four bands ranging from 1,200 to 480 kb in size. Southern hybridizations suggested that a 487-kb PmeI fragment contained two copies of the amoCAB genes, coding for ammonia monooxygenase (designated amoCAB(1) and amoCAB(2)), and three copies of the hao gene, coding for hydroxylamine oxidoreductase (hao(1), hao(2), and hao(3)). In this DNA fragment, amoCAB(1) and amoCAB(2) were about 390 kb apart, while hao(1), hao(2), and hao(3) were separated by at least about 100 kb from each other. Interestingly, hao(1) and hao(2) were located relatively close to amoCAB(1) and amoCAB(2), respectively. DNA sequence analysis revealed that hao(1) and hao(2) shared 160 identical nucleotides immediately upstream of each translation initiation codon. However, hao(3) showed only 30% nucleotide identity in the 160-bp corresponding region.
Hirota, Ryuichi; Yamagata, Akira; Kato, Junichi; Kuroda, Akio; Ikeda, Tsukasa; Takiguchi, Noboru; Ohtake, Hisao
2000-01-01
Pulsed-field gel electrophoresis of PmeI digests of the Nitrosomonas sp. strain ENI-11 chromosome produced four bands ranging from 1,200 to 480 kb in size. Southern hybridizations suggested that a 487-kb PmeI fragment contained two copies of the amoCAB genes, coding for ammonia monooxygenase (designated amoCAB1 and amoCAB2), and three copies of the hao gene, coding for hydroxylamine oxidoreductase (hao1, hao2, and hao3). In this DNA fragment, amoCAB1 and amoCAB2 were about 390 kb apart, while hao1, hao2, and hao3 were separated by at least about 100 kb from each other. Interestingly, hao1 and hao2 were located relatively close to amoCAB1 and amoCAB2, respectively. DNA sequence analysis revealed that hao1 and hao2 shared 160 identical nucleotides immediately upstream of each translation initiation codon. However, hao3 showed only 30% nucleotide identity in the 160-bp corresponding region. PMID:10633121
Spielmann, A; Stutz, E
1983-01-01
The soybean chloroplast psb A gene (photosystem II thylakoid membrane protein of Mr 32 000, lysine-free) and the trn H gene (tRNAHisGUG), which both map in the large single copy region adjacent to one of the inverted repeat structures (IR1), have been sequenced including flanking regions. The psb A gene shows in its structural part 92% sequence homology with the corresponding genes of spinach and N. debneyi and contains also an open reading frame for 353 aminoacids. The aminoacid sequence of a potential primary translation product (calculated Mr, 38 904, no lysine) diverges from that of spinach and N. debneyi in only two positions in the C-terminal part. The trn H gene has the same polarity as the psb A gene and the coding region is located at the very end of the large single copy region. The deduced sequence of the soybean chloroplast tRNAHisGUG is identical with that of Zea mays chloroplasts. Both ends of the large single copy region were sequenced including a small segment of the adjacent IR1 and IR2. PMID:6314279
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
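The segment lengths quoted in this abstract can be cross-checked with simple arithmetic. The following is a minimal Python sketch that uses only the figures stated above; the variable names are illustrative, and the idea that the 1986-nucleotide open reading frame includes the stop codon is an assumption made here, not something the abstract states.

```python
# Segment lengths in nucleotides, as stated in the abstract.
leader_5p = 212   # 5' noncoding leader
orf = 1986        # continuous open reading frame
utr_3p = 242      # 3' noncoding region

# The three segments should add up to the reported primary transcript.
transcript_length = leader_5p + orf + utr_3p
print(transcript_length)   # 2440

# An open reading frame must contain a whole number of codons.
codons, remainder = divmod(orf, 3)
print(codons, remainder)   # 662 0

# ASSUMPTION: if the ORF length includes the stop codon, the protein
# has one residue fewer than the codon count.
residues_if_stop_included = codons - 1

# Mean residue mass implied by the predicted 69,800 Da protein.
print(round(69800 / residues_if_stop_included, 1))
```

The three segments account for the full 2440-nucleotide transcript, and the open reading frame divides evenly into codons.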
Breaking the 1000-gene barrier for Mimivirus using ultra-deep genome and transcriptome sequencing.
Legendre, Matthieu; Santini, Sébastien; Rico, Alain; Abergel, Chantal; Claverie, Jean-Michel
2011-03-04
Mimivirus, a giant dsDNA virus infecting Acanthamoeba, is the prototype of the mimiviridae family, the latest addition to the family of the nucleocytoplasmic large DNA viruses (NCLDVs). Its 1.2 Mb-genome was initially predicted to encode 917 genes. A subsequent RNA-Seq analysis precisely mapped many transcript boundaries and identified 75 new genes. We now report a much deeper analysis using the SOLiD™ technology combining RNA-Seq of the Mimivirus transcriptome during the infectious cycle (202.4 Million reads), and a complete genome re-sequencing (45.3 Million reads). This study corrected the genome sequence and identified several single nucleotide polymorphisms. Our results also provided clear evidence of previously overlooked transcription units, including an important RNA polymerase subunit distantly related to Euryarchea homologues. The total Mimivirus gene count is now 1018, 11% greater than the original annotation. This study highlights the huge progress brought about by ultra-deep sequencing for the comprehensive annotation of virus genomes, opening the door to a complete one-nucleotide resolution level description of their transcriptional activity, and to the realistic modeling of the viral genome expression at the ultimate molecular level. This work also illustrates the need to go beyond bioinformatics-only approaches for the annotation of short protein and non-coding genes in viral genomes.
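The "11% greater" figure follows directly from the two gene counts given in the abstract. A minimal Python sketch of that calculation, using only the stated numbers (variable names are illustrative):

```python
original_genes = 917    # initial genome annotation
updated_genes = 1018    # total reported after ultra-deep sequencing

# Absolute and relative increase over the original annotation.
increase = updated_genes - original_genes
percent_increase = 100 * increase / original_genes

print(increase)                  # 101
print(round(percent_increase))   # 11
```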
Yang, Zujun; Zhang, Tao; Li, Guangrong; Nevo, Eviatar
2011-12-01
Dehydrins are one of the major stress-induced gene families, and the expression of dehydrin 6 (Dhn6) is strictly related to drought in barley. In order to investigate how the evolution of the Dhn6 gene is associated with adaptation to environmental changes, we examined 48 genotypes of wild barley, Hordeum spontaneum, from "Evolution Canyon" at Mount Carmel, Israel. The Dhn6 sequences of the 48 genotypes were identified, and a recent insertion of 342 bp at 5'UTR was found in the sequences of 11 genotypes. Both nucleotide and haplotype diversity of single nucleotide polymorphism in Dhn6 coding regions were higher on the AS ("African" slope or dry slope) than on the ES ("European" slope or humid slope), and the applied Tajima D and Fu-Li test rejected neutrality of SNP diversity. Expression analysis indicated that the 342 bp insertion at 5'UTR was associated with the earlier up-regulation of Dhn6 after dehydration. The genetic divergence of amino acids sequences indicated significant positive selection of Dhn6 among the wild barley populations. The diversity of Dhn6 in microclimatic divergence slopes suggested that Dhn6 has been subjected to natural selection and adaptively associated with drought resistance of wild barley at "Evolution Canyon".
Saavedra-Lira, E; Pérez-Montfort, R
1994-05-16
We isolated three overlapping clones from a DNA genomic library of Entamoeba histolytica strain HM1:IMSS, whose translated nucleotide (nt) sequence shows similarities of 51, 48 and 47% with the amino acid (aa) sequences reported for the pyruvate phosphate dikinases from Bacteroides symbiosus, maize and Flaveria trinervia, respectively. The reading frame determined codes for a protein of 886 aa.
Shimamoto, I; Sonoda, S; Vazquez, P; Minaka, N; Nishiguchi, M
1998-01-01
The sequence of the 3' terminal 2378 nucleotides of a wasabi strain of crucifer tobamovirus (CTMV-W) infectious to crucifer plants was determined. This includes the 3' non-coding region of 235 nucleotides, coat protein (CP) gene (468 nucleotides), movement protein (MP) gene (798 nucleotides) and C-terminal partial readthrough portion of 180 K protein gene (940 nucleotides). Comparison of the sequence with homologous regions of thirteen other tobamovirus genomes showed that it had much higher identity to those of four other crucifer tobamoviruses, 85.2% to cr-TMV and turnip vein-clearing virus (TVCV), 87.4% to oilseed rape mosaic virus (ORMV) and 87.1% to TMV-Cg, than to those of other tobamoviruses. Thus CTMV-W was most similar to ORMV and TMV-Cg in sequence, but only marginally so, whereas the location and size of its MP gene were the same as in cr-TMV and TVCV. These results, together with other analyses, show that CTMV-W is a new crucifer tobamovirus, that the five crucifer tobamoviruses can be classified into two subgroups based on MP gene organization, and that the rate of sequence change is not the same in all lineages.
Kim, Hyoung Tae; Kim, Ki-Joong
2014-01-01
Comparative analyses of complete chloroplast (cp) DNA sequences within a species may provide clues to understand the population dynamics and colonization histories of plant species. Equisetum arvense (Equisetaceae) is a widely distributed fern species in northeastern Asia, Europe, and North America. The complete cp DNA sequences from Asian and American E. arvense individuals were compared in this study. The Asian E. arvense cp genome was 583 bp shorter than that of the American E. arvense. In total, 159 indels were observed between two individuals, most of which were concentrated on the hypervariable trnY-trnE intergenic spacer (IGS) in the large single-copy (LSC) region of the cp genome. This IGS region held a series of 19 bp repeating units. The numbers of the 19 bp repeat unit were responsible for 78% of the total length difference between the two cp genomes. Furthermore, only other closely related species of Equisetum also show the hypervariable nature of the trnY-trnE IGS. By contrast, only a single indel was observed in the gene coding regions: the ycf1 gene showed 24 bp differences between the two continental individuals due to a single tandem-repeat indel. A total of 165 single-nucleotide polymorphisms (SNPs) were recorded between the two cp genomes. Of these, 52 SNPs (31.5%) were distributed in coding regions, 13 SNPs (7.9%) were in introns, and 100 SNPs (60.6%) were in intergenic spacers (IGS). The overall difference between the Asian and American E. arvense cp genomes was 0.12%. Despite the relatively high genetic diversity between Asian and American E. arvense, the two populations are recognized as a single species based on their high morphological similarity. This indicated that the two regional populations have been in morphological stasis. PMID:25157804
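The SNP distribution reported above can be reproduced from the three counts. A short Python sketch, using only the numbers in the abstract (the dictionary keys and variable names are illustrative):

```python
# SNP counts between the Asian and American cp genomes, by region.
snp_counts = {
    "coding regions": 52,
    "introns": 13,
    "intergenic spacers": 100,
}

total = sum(snp_counts.values())
print(total)   # 165

# Share of each region as a percentage of all SNPs, to one decimal place.
shares = {region: round(100 * n / total, 1) for region, n in snp_counts.items()}
for region, pct in shares.items():
    print(f"{region}: {pct}%")
# coding regions: 31.5%
# introns: 7.9%
# intergenic spacers: 60.6%
```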
[Long non-coding RNAs in the pathophysiology of atherosclerosis].
Novak, Jan; Vašků, Julie Bienertová; Souček, Miroslav
2018-01-01
The human genome contains about 22 000 protein-coding genes that are transcribed into an even larger number of messenger RNAs (mRNA). Interestingly, the results of the ENCODE project from 2012 show that, although up to 90 % of our genome is actively transcribed, protein-coding mRNAs make up only 2-3 % of the total transcribed RNA. The remaining RNA transcripts are not translated into proteins, which is why they are referred to as "non-coding RNAs". Non-coding RNA was formerly considered "the dark matter of the genome", or "junk" whose genes had accumulated in our DNA during the course of evolution. Today we know that non-coding RNAs fulfil a variety of regulatory functions in our body: they intervene in epigenetic processes from chromatin remodelling to histone methylation, in the transcription process itself, and even in post-transcriptional processes. Long non-coding RNAs (lncRNA) are one of the classes of non-coding RNAs, defined as being more than 200 nucleotides in length (non-coding RNAs shorter than 200 nucleotides are called small non-coding RNAs). lncRNAs represent a large and widely varied group of molecules with diverse regulatory functions. They can be identified in all conceivable cell types and tissues, and even in the extracellular space, which includes blood, specifically plasma. Their levels change during the course of organogenesis, they are specific to different tissues, and their levels also change with the development of various illnesses, including atherosclerosis. This review article aims to present the topic of lncRNAs in general and then focuses on some of their specific representatives in relation to the process of atherosclerosis (i.e. we describe lncRNA involvement in the biology of endothelial cells, vascular smooth muscle cells or immune cells), and we further describe the possible clinical potential of lncRNA, whether in diagnostics or therapy of atherosclerosis and its clinical manifestations. Key words: atherosclerosis - lincRNA - lncRNA - MALAT - MIAT.
2013-10-01
identify common genetic variations (i.e., single nucleotide polymorphisms [ SNPs ] and haplotypes) in cytokine genes, as well demographic, clinical, and...Center. The purpose of the proposed project is to identify common genetic variations (i.e., single nucleotide polymorphisms [ SNPs ] and haplotypes) in...research team continues to meet monthly to discuss progress with regards to recruitment, enrollment, and data collection. Training in Genetics In year
Single-cell analysis of intercellular heteroplasmy of mtDNA in Leber hereditary optic neuropathy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kobayashi, Y.; Sharpe, H.; Brown, N.
1994-07-01
The authors have investigated the distribution of mutant mtDNA molecules in single cells from a patient with Leber hereditary optic neuropathy (LHON). LHON is a maternally inherited disease that is characterized by a sudden-onset bilateral loss of central vision, which typically occurs in early adulthood. More than 50% of all LHON patients carry an mtDNA mutation at nucleotide position 11778. This nucleotide change converts a highly conserved arginine residue to histidine at codon 340 in the NADH-ubiquinone oxidoreductase subunit 4 (ND4) gene of mtDNA. In the present study, the authors used PCR amplification of mtDNA from lymphocytes to investigate mtDNA heteroplasmy at the single-cell level in a LHON patient. They found that most cells were either homoplasmic normal or homoplasmic mutant at nucleotide position 11778. Some (16%) cells contained both mutant and normal mtDNA.
Tian, Kai; Chen, Xiaowei; Luan, Binquan; Singh, Prashant; Yang, Zhiyu; Gates, Kent S; Lin, Mengshi; Mustapha, Azlin; Gu, Li-Qun
2018-05-22
Accurate and rapid detection of single-nucleotide polymorphism (SNP) in pathogenic mutants is crucial for many fields such as food safety regulation and disease diagnostics. Current detection methods involve laborious sample preparations and expensive characterizations. Here, we investigated a single locked nucleic acid (LNA) approach, facilitated by a nanopore single-molecule sensor, to accurately determine SNPs for detection of Shiga toxin producing Escherichia coli (STEC) serotype O157:H7, and cancer-derived EGFR L858R and KRAS G12D driver mutations. Current LNA applications require incorporation and optimization of multiple LNA nucleotides, but we found that in the nanopore system, a single LNA introduced in the probe is sufficient to enhance the SNP discrimination capability by over 10-fold, allowing accurate detection of the pathogenic mutant DNA mixed in a large amount of the wild-type DNA. Importantly, the molecular mechanistic study suggests that such a significant improvement is due to the effect of the single LNA, which both stabilizes the fully matched base-pair and destabilizes the mismatched base-pair. This sensitive method, with a simplified, low cost, easy-to-operate LNA design, could be generalized for various applications that need rapid and accurate identification of single-nucleotide variations.
McKinney, Jeffrey; Guerrier-Takada, Cecilia; Wesolowski, Donna; Altman, Sidney
2001-01-01
Narrow spectrum antimicrobial activity has been designed to reduce the expression of two essential genes, one coding for the protein subunit of RNase P (C5 protein) and one for gyrase (gyrase A). In both cases, external guide sequences (EGS) have been designed to complex with either mRNA. Using the EGS technology, the level of microbial viability is reduced to less than 10% of the wild-type strain. The EGSs are additive when used together and depend on the number of nucleotides paired when attacking gyrase A mRNA. In the case of gyrase A, three nucleotides unpaired out of a 15-mer EGS still favor complete inhibition by the EGS but five unpaired nucleotides do not. PMID:11381134
Toward an integrated knowledge environment to support modern oncology.
Blake, Patrick M; Decker, David A; Glennon, Timothy M; Liang, Yong Michael; Losko, Sascha; Navin, Nicholas; Suh, K Stephen
2011-01-01
Around the world, teams of researchers continue to develop a wide range of systems to capture, store, and analyze data including treatment, patient outcomes, tumor registries, next-generation sequencing, single-nucleotide polymorphism, copy number, gene expression, drug chemistry, drug safety, and toxicity. Scientists mine, curate, and manually annotate growing mountains of data to produce high-quality databases, while clinical information is aggregated in distant systems. Databases are currently scattered, and relationships between variables coded in disparate datasets are frequently invisible. The challenge is to evolve oncology informatics from a "systems" orientation of standalone platforms and silos into an "integrated knowledge environment" that will connect "knowable" research data with patient clinical information. The aim of this article is to review progress toward an integrated knowledge environment to support modern oncology with a focus on supporting scientific discovery and improving cancer care.
Bose, Jeffrey L
2016-01-01
The ability to create mutations is an important step towards understanding bacterial physiology and virulence. While targeted approaches are invaluable, the ability to produce genome-wide random mutations can lead to crucial discoveries. Transposon mutagenesis is a useful approach, but many interesting mutations can be missed by these insertions that interrupt coding and noncoding sequences due to the integration of an entire transposon. Chemical mutagenesis and UV-based random mutagenesis are alternate approaches to isolate mutations of interest with the potential of only single nucleotide changes. Once a standard method, difficulty in identifying mutation sites had decreased the popularity of this technique. However, thanks to the recent emergence of economical whole-genome sequencing, this approach to making mutations can once again become a viable option. Therefore, this chapter provides an overview protocol for random mutagenesis using UV light or DNA-damaging chemicals.
Electron attachment to DNA single strands: gas phase and aqueous solution.
Gu, Jiande; Xie, Yaoming; Schaefer, Henry F
2007-01-01
The 2'-deoxyguanosine-3',5'-diphosphate, 2'-deoxyadenosine-3',5'-diphosphate, 2'-deoxycytidine-3',5'-diphosphate and 2'-deoxythymidine-3',5'-diphosphate systems are the smallest units of a DNA single strand. Exploring these comprehensive subunits with reliable density functional methods enables one to approach reasonable predictions of the properties of DNA single strands. With these models, DNA single strands are found to have a strong tendency to capture low-energy electrons. The vertical attachment energies (VEAs) predicted for 3',5'-dTDP (0.17 eV) and 3',5'-dGDP (0.14 eV) indicate that both the thymine-rich and the guanine-rich DNA single strands have the ability to capture electrons. The adiabatic electron affinities (AEAs) of the nucleotides considered here range from 0.22 to 0.52 eV and follow the order 3',5'-dTDP > 3',5'-dCDP > 3',5'-dGDP > 3',5'-dADP. A substantial increase in the AEA is observed compared to that of the corresponding nucleic acid bases and the corresponding nucleosides. Furthermore, aqueous solution simulations dramatically increase the electron attracting properties of the DNA single strands. The present investigation illustrates that in the gas phase, the excess electron is situated both on the nucleobase and on the phosphate moiety for DNA single strands. However, the distribution of the extra negative charge is uneven. The attached electron favors the base moiety for the pyrimidine, while it prefers the 3'-phosphate subunit for the purine DNA single strands. In contrast, the attached electron is tightly bound to the base fragment for the cytidine, thymidine and adenosine nucleotides, while it almost exclusively resides in the vicinity of the 3'-phosphate group for the guanosine nucleotides due to the solvent effects. 
The comparatively low vertical detachment energies (VDEs) predicted for 3',5'-dADP(-) (0.26 eV) and 3',5'-dGDP(-) (0.32 eV) indicate that electron detachment might compete with reactions having high activation barriers such as glycosidic bond breakage. However, the radical anions of the pyrimidine nucleotides with high VDE are expected to be electronically stable. Thus the base-centered radical anions of the pyrimidine nucleotides might be the possible intermediates for DNA single-strand breakage.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leong, JoAnn Ching
The nucleotide sequence of the IHNV glycoprotein gene has been determined from a cDNA clone containing the entire coding region. The glycoprotein cDNA clone contained a leader sequence of 48 bases, a coding region of 1524 nucleotides, and 39 bases at the 3' end. The entire cDNA clone contains 1609 nucleotides and encodes a protein of 508 amino acids. The deduced amino acid sequence gave a translated molecular weight of 56,795 daltons. A hydropathicity profile of the deduced amino acid sequence indicated that there were two major hydrophobic domains: one, at the N-terminus, delineating a signal peptide of 18 amino acids and the other, at the C-terminus, delineating the transmembrane region. Five possible sites of N-linked glycosylation were identified. Although no nucleic acid homology existed between the IHNV glycoprotein gene and the glycoprotein genes of rabies and VSV, there was significant homology at the amino acid level between all three rhabdovirus glycoproteins.
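The relationship between the stated coding-region length and the protein length can be checked in a few lines. This is a minimal Python sketch based only on the figures in the abstract; the variable names are illustrative.

```python
coding_nt = 1524          # coding region length in nucleotides
protein_aa = 508          # reported protein length in amino acids
protein_mass_da = 56795   # translated molecular weight in daltons

# Three nucleotides per codon: the coding region should divide evenly.
codons, remainder = divmod(coding_nt, 3)
print(codons, remainder)   # 508 0

# The codon count matches the reported number of amino acids.
print(codons == protein_aa)   # True

# Mean mass per residue implied by the reported molecular weight.
print(round(protein_mass_da / protein_aa, 1))
```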
Takagi, M; Kobayashi, N; Sugimoto, M; Fujii, T; Watari, J; Yano, K
1987-01-01
The expression of a LEU gene from Candida maltosa (designated as C-LEU2) isolated previously (Kawamura et al. 1983) was shown to be regulated, when transferred into Saccharomyces cerevisiae, by leucine and threonine in the medium, as in the case of LEU2 gene of S. cerevisiae. The coding region together with the regulatory region was subcloned and the nucleotide sequence was determined. When the sequence of the coding region was compared with that of LEU2, the homology was 72% for base pairs and 76% for deduced amino acids. Comparison of the regulatory region of C-LEU2 with those of LEU1 and LEU2 suggested a few short consensus sequences which are involved in regulation of gene expression by leucine and threonine in the medium.
Selection of the simplest RNA that binds isoleucine
LOZUPONE, CATHERINE; CHANGAYIL, SHANKAR; MAJERFELD, IRENE; YARUS, MICHAEL
2003-01-01
We have identified the simplest RNA binding site for isoleucine using selection-amplification (SELEX), by shrinking the size of the randomized region until affinity selection is extinguished. Such a protocol can be useful because selection does not necessarily make the simplest active motif most prominent, as is often assumed. We find an isoleucine binding site that behaves exactly as predicted for the site that requires fewest nucleotides. This UAUU motif (16 highly conserved positions; 27 total), is also the most abundant site in successful selections on short random tracts. The UAUU site, now isolated independently at least 63 times, is a small asymmetric internal loop. Conserved loop sequences include isoleucine codon and anticodon triplets, whose nucleotides are required for amino acid binding. This reproducible association between isoleucine and its coding sequences supports the idea that the genetic code is, at least in part, a stereochemical residue of the most easily isolated RNA–amino acid binding structures. PMID:14561881
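The observation that conserved loop sequences contain isoleucine codon and anticodon triplets can be illustrated with a small scan. The sketch below is in Python and rests on stated assumptions: the isoleucine codons are those of the standard genetic code, anticodons are taken as plain Watson-Crick reverse complements (wobble pairing and modified bases are ignored), and the four-letter input is just the motif's name, not the full 27-nucleotide site. The function names are hypothetical.

```python
# Isoleucine codons in the standard genetic code.
ILE_CODONS = {"AUU", "AUC", "AUA"}

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


# ASSUMPTION: anticodons are simple reverse complements of the codons.
ILE_ANTICODONS = {reverse_complement(c) for c in ILE_CODONS}


def find_ile_triplets(rna):
    """List (position, triplet, kind) for every isoleucine codon or
    anticodon triplet found at any offset in the sequence."""
    hits = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in ILE_CODONS:
            hits.append((i, triplet, "codon"))
        if triplet in ILE_ANTICODONS:
            hits.append((i, triplet, "anticodon"))
    return hits


print(find_ile_triplets("UAUU"))
# [(0, 'UAU', 'anticodon'), (1, 'AUU', 'codon')]
```

Under these assumptions, the four nucleotides that give the motif its name already contain one overlapping anticodon and one codon triplet for isoleucine.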
Genetic risk profiling and gene signature modeling to predict risk of complications after IPAA.
Sehgal, Rishabh; Berg, Arthur; Polinski, Joseph I; Hegarty, John P; Lin, Zhenwu; McKenna, Kevin J; Stewart, David B; Poritz, Lisa S; Koltun, Walter A
2012-03-01
Severe pouchitis and Crohn's disease-like complications are 2 adverse postoperative complications that confound the success of the IPAA in patients with ulcerative colitis. To date, approximately 83 single nucleotide polymorphisms within 55 genes have been associated with IBD. The aim of this study was to identify single-nucleotide polymorphisms that correlate with complications after IPAA that could be utilized in a gene signature fashion to predict postoperative complications and aid in preoperative surgical decision making. One hundred forty-two IPAA patients were retrospectively classified as "asymptomatic" (n = 104, defined as no Crohn's disease-like complications or severe pouchitis for at least 2 years after IPAA) and compared with a "severe pouchitis" group (n = 12, ≥ 4 episodes pouchitis per year for 2 years including the need for long-term therapy to maintain remission) and a "Crohn's disease-like" group (n = 26, presence of fistulae, pouch inlet stricture, proximal small-bowel disease, or pouch granulomata, occurring at least 6 months after surgery). Genotyping for 83 single-nucleotide polymorphisms previously associated with Crohn's disease and/or ulcerative colitis was performed on a customized Illumina genotyping platform. The top 2 single-nucleotide polymorphisms statistically identified as being independently associated with each of Crohn's disease-like and severe pouchitis were used in a multivariate logistic regression model. These single-nucleotide polymorphisms were then used to create probability equations to predict overall chance of a positive or negative outcome for that complication. The top 2 single-nucleotide polymorphisms for Crohn's disease-like complications were in the 10q21 locus and the gene for PTGER4 (p = 0.006 and 0.007), whereas for severe pouchitis it was NOD2 and TNFSF15 (p = 0.003 and 0.011). 
Probability equations suggested that the risk of these 2 complications greatly increased with increasing number of risk alleles, going as high as 92% for severe pouchitis and 65% for Crohn's disease-like complications. In this IPAA patient cohort, mutations in the 10q21 locus and the PTGER4 gene were associated with Crohn's disease-like complications, whereas mutations in NOD2 and TNFSF15 correlated with severe pouchitis. Preoperative genetic analysis and use of such gene signatures hold promise for improved preoperative surgical patient selection to minimize these IPAA complications.
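A probability equation derived from a logistic regression model has the general form sketched below in Python. The abstract does not report the fitted coefficients, so the intercept and per-allele log-odds used here are hypothetical placeholders chosen only to show the shape of the curve. The sketch also simplifies by giving every risk allele the same weight, whereas the study's model may weight each polymorphism separately.

```python
import math


def complication_probability(risk_alleles, intercept, per_allele_log_odds):
    """Logistic model: probability of a complication given a risk-allele count."""
    logit = intercept + per_allele_log_odds * risk_alleles
    return 1 / (1 + math.exp(-logit))


# HYPOTHETICAL coefficients for illustration only; NOT taken from the study.
INTERCEPT = -3.0
PER_ALLELE_LOG_ODDS = 1.2

# Two biallelic polymorphisms give between 0 and 4 risk alleles per patient.
probabilities = [
    complication_probability(n, INTERCEPT, PER_ALLELE_LOG_ODDS)
    for n in range(5)
]
for n, p in enumerate(probabilities):
    print(n, round(p, 2))
```

With any positive per-allele coefficient, the predicted probability rises monotonically with the number of risk alleles, which is the qualitative pattern the abstract describes.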
Crowley, T E; Bond, M W; Meyerowitz, E M
1983-01-01
The polytene chromosome puff at 68C on the Drosophila melanogaster third chromosome is thought from genetic experiments to contain the structural gene for one of the secreted salivary gland glue polypeptides, sgs-3. Previous work has demonstrated that the DNA included in this puff contains sequences that are transcribed to give three different polyadenylated RNAs that are abundant in third-larval-instar salivary glands. These have been called the group II, group III, and group IV RNAs. In the experiments reported here, we used the nucleotide sequence of the DNA coding for these RNAs to predict some of the physical and chemical properties expected of their protein products, including molecular weight, amino acid composition, and amino acid sequence. Salivary gland polypeptides with molecular weights similar to those expected for the 68C RNA translation products, and with the expected degree of incorporation of different radioactive amino acids, were purified. These proteins were shown by amino acid sequencing to correspond to the protein products of the 68C RNAs. It was further shown that each of these proteins is a part of the secreted salivary gland glue: the group IV RNA codes for the previously described sgs-3, whereas the group II and III RNAs code for the newly identified glue polypeptides sgs-8 and sgs-7. PMID:6406838
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.
1987-06-01
To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from λgt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments with a total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.
Investigation of genes coding for inflammatory components in Parkinson's disease.
Håkansson, Anna; Westberg, Lars; Nilsson, Staffan; Buervenich, Silvia; Carmine, Andrea; Holmberg, Björn; Sydow, Olof; Olson, Lars; Johnels, Bo; Eriksson, Elias; Nissbrandt, Hans
2005-05-01
Several findings obtained recently indicate that inflammation may contribute to the pathogenesis in Parkinson's disease (PD). Genetic variants of genes coding for components involved in immune reactions in the brain might therefore influence the risk of developing PD or the age of disease onset. Five single nucleotide polymorphisms (SNPs) in the genes coding for interferon-gamma (IFN-gamma; T874A in intron 1), interferon-gamma receptor 2 (IFN-gamma R2; Gln64Arg), interleukin-10 (IL-10; G1082A in the promoter region), platelet-activating factor acetylhydrolase (PAF-AH; Val379Ala), and intercellular adhesion molecule 1 (ICAM-1; Lys469Glu) were genotyped, using pyrosequencing, in 265 patients with PD and 308 controls. None of the investigated SNPs was found to be associated with PD; however, the G1082A polymorphism in the IL-10 gene promoter was found to be related to the age of disease onset. Linear regression showed a significantly earlier onset with more A-alleles (P = 0.0095; after Bonferroni correction, P = 0.048), resulting in a 5-year delayed age of onset of the disease for individuals having two G-alleles compared with individuals having two A-alleles. The results indicate that the IL-10 G1082A SNP could possibly be related to the age of onset of PD. Copyright 2005 Movement Disorder Society.
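The Bonferroni adjustment quoted in this abstract can be reproduced with a short sketch. It assumes the correction factor equals the five genotyped SNPs, which the abstract does not state outright; under that assumption the unadjusted P = 0.0095 becomes about 0.0475, in line with the reported 0.048. The function name `bonferroni` is illustrative only.

```python
def bonferroni(p_value: float, n_tests: int) -> float:
    """Return the Bonferroni-adjusted p-value, capped at 1.0."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return min(1.0, p_value * n_tests)


# Unadjusted p-value for the IL-10 G1082A age-of-onset regression (from the abstract).
# Assumption: one test per genotyped SNP, i.e. five tests in total.
adjusted = bonferroni(0.0095, 5)
print(f"adjusted p = {adjusted:.4f}")
```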
Biochemical and genetic analysis of the role of the viral polymerase in enterovirus recombination.
Woodman, Andrew; Arnold, Jamie J; Cameron, Craig E; Evans, David J
2016-08-19
Genetic recombination in single-strand, positive-sense RNA viruses is a poorly understood mechanism responsible for generating extensive genetic change and novel phenotypes. By moving a critical cis-acting replication element (CRE) from the polyprotein coding region to the 3' non-coding region we have further developed a cell-based assay (the 3'CRE-REP assay) to yield recombinants throughout the non-structural coding region of poliovirus from dually transfected cells. We have additionally developed a defined biochemical assay in which the only protein present is the poliovirus RNA dependent RNA polymerase (RdRp), which recapitulates the strand transfer events of the recombination process. We have used both assays to investigate the role of the polymerase fidelity and nucleotide turnover rates in recombination. Our results, of both poliovirus intertypic and intratypic recombination in the CRE-REP assay and using a range of polymerase variants in the biochemical assay, demonstrate that RdRp fidelity is a fundamental determinant of recombination frequency. High fidelity polymerases exhibit reduced recombination and low fidelity polymerases exhibit increased recombination in both assays. These studies provide the basis for the analysis of poliovirus recombination throughout the non-structural region of the virus genome and provide a defined biochemical assay to further dissect this important evolutionary process. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Chimeric mitochondrial peptides from contiguous regular and swinger RNA.
Seligmann, Hervé
2016-01-01
Previous mass spectrometry analyses described human mitochondrial peptides entirely translated from swinger RNAs, RNAs where polymerization systematically exchanged nucleotides. Exchanges follow one among 23 bijective transformation rules, nine symmetric exchanges (X ↔ Y, e.g. A ↔ C) and fourteen asymmetric exchanges (X → Y → Z → X, e.g. A → C → G → A), multiplying by 24 DNA's protein coding potential. Abrupt switches from regular to swinger polymerization produce chimeric RNAs. Here, human mitochondrial proteomic analyses assuming abrupt switches between regular and swinger transcriptions, detect chimeric peptides, encoded by part regular, part swinger RNA. Contiguous regular- and swinger-encoded residues within single peptides are stronger evidence for translation of swinger RNA than previously detected, entirely swinger-encoded peptides: regular parts are positive controls matched with contiguous swinger parts, increasing confidence in results. Chimeric peptides are 200 × rarer than swinger peptides (3/100,000 versus 6/1000). Among 186 peptides with > 8 residues for each regular and swinger parts, regular parts of eleven chimeric peptides correspond to six among the thirteen recognized, mitochondrial protein-coding genes. Chimeric peptides matching partly regular proteins are rarer and less expressed than chimeric peptides matching non-coding sequences, suggesting targeted degradation of misfolded proteins. Present results strengthen hypotheses that the short mitogenome encodes far more proteins than hitherto assumed. Entirely swinger-encoded proteins could exist.
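The counts of exchange rules in this abstract can be checked by enumerating every rearrangement of the four nucleotides. The sketch below is only an illustration, not the author's pipeline: it treats a rule as "symmetric" when applying it twice restores the original sequence, which is an assumption made here to classify the rules. On that reading the 23 non-identity rules split into 9 symmetric and 14 asymmetric ones, and adding the untransformed sequence gives the 24 readings mentioned in the abstract.

```python
from itertools import permutations

BASES = "ACGT"


def swinger_rules():
    """Return every bijective nucleotide mapping except the identity."""
    rules = []
    for perm in permutations(BASES):
        mapping = dict(zip(BASES, perm))
        if all(mapping[b] == b for b in BASES):
            continue  # skip regular (untransformed) polymerization
        rules.append(mapping)
    return rules


def is_symmetric(mapping):
    """A rule is treated as symmetric if applying it twice is the identity."""
    return all(mapping[mapping[b]] == b for b in BASES)


def apply_rule(sequence, mapping):
    """Exchange every nucleotide of the sequence according to the rule."""
    return "".join(mapping[b] for b in sequence)


rules = swinger_rules()
symmetric = [r for r in rules if is_symmetric(r)]
asymmetric = [r for r in rules if not is_symmetric(r)]
print(len(rules), len(symmetric), len(asymmetric))

# The A <-> C exchange used as an example in the abstract.
a_c_swap = {"A": "C", "C": "A", "G": "G", "T": "T"}
print(apply_rule("ACGT", a_c_swap))
```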
Schermerhorn, Kelly M.; Gardner, Andrew F.
2015-01-01
Family D DNA polymerases (polDs) have been implicated as the major replicative polymerase in archaea, excluding the Crenarchaeota branch, and bear little sequence homology to other DNA polymerase families. Here we report a detailed kinetic analysis of nucleotide incorporation and exonuclease activity for a Family D DNA polymerase from Thermococcus sp. 9°N. Pre-steady-state single-turnover nucleotide incorporation assays were performed to obtain the kinetic parameters, kpol and Kd, for correct nucleotide incorporation, incorrect nucleotide incorporation, and ribonucleotide incorporation by exonuclease-deficient polD. Correct nucleotide incorporation kinetics revealed a relatively slow maximal rate of polymerization (kpol ∼2.5 s−1) and especially tight nucleotide binding (Kd(dNTP) ∼1.7 μM), compared with DNA polymerases from Families A, B, C, X, and Y. Furthermore, pre-steady-state nucleotide incorporation assays revealed that polD prevents the incorporation of incorrect nucleotides and ribonucleotides primarily through reduced nucleotide binding affinity. Pre-steady-state single-turnover assays on wild-type 9°N polD were used to examine 3′-5′ exonuclease hydrolysis activity in the presence of Mg2+ and Mn2+. Interestingly, substituting Mn2+ for Mg2+ accelerated hydrolysis rates >40-fold (kexo ≥110 s−1 versus ≥2.5 s−1). Preference for Mn2+ over Mg2+ in exonuclease hydrolysis activity is a property unique to the polD family. The kinetic assays performed in this work provide critical insight into the mechanisms that polD employs to accurately and efficiently replicate the archaeal genome. Furthermore, despite the unique properties of polD, this work suggests that a conserved polymerase kinetic pathway is present in all known DNA polymerase families. PMID:26160179
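To make the reported parameters concrete, the sketch below plugs them into the hyperbolic relation commonly used for single-turnover incorporation data, k_obs = kpol·[dNTP] / (Kd + [dNTP]). The abstract does not state which fitting equation was used, so this form is an assumption, and the function name `k_obs` is illustrative. The last lines compare the two exonuclease rates, which the abstract gives only as lower bounds.

```python
def k_obs(dntp_um: float, k_pol: float, k_d_um: float) -> float:
    """Observed incorporation rate (s^-1) at a given dNTP concentration (uM).

    Assumes the standard hyperbolic dependence on nucleotide concentration.
    """
    return k_pol * dntp_um / (k_d_um + dntp_um)


K_POL = 2.5  # s^-1, approximate value from the abstract
K_D = 1.7    # uM, approximate value from the abstract

# At [dNTP] equal to Kd the observed rate is half of kpol.
half_maximal = k_obs(K_D, K_POL, K_D)

# Ratio of the reported lower bounds for exonuclease hydrolysis (Mn2+ vs Mg2+).
fold_change = 110 / 2.5

print(half_maximal, fold_change)
```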
O'Toole, Amanda S.; Miller, Stacy; Haines, Nathan; Zink, M. Coleen; Serra, Martin J.
2006-01-01
Thermodynamic parameters are reported for duplex formation of 48 self-complementary RNA duplexes containing Watson–Crick terminal base pairs (GC, AU and UA) with all 16 possible 3′ double-nucleotide overhangs; mimicking the structures of short interfering RNAs (siRNA) and microRNAs (miRNA). Based on nearest-neighbor analysis, the addition of a second dangling nucleotide to a single 3′ dangling nucleotide increases stability of duplex formation up to 0.8 kcal/mol in a sequence dependent manner. Results from this study in conjunction with data from a previous study [A. S. O'Toole, S. Miller and M. J. Serra (2005) RNA, 11, 512.] allow for the development of a refined nearest-neighbor model to predict the influence of 3′ double-nucleotide overhangs on the stability of duplex formation. The model improves the prediction of free energy and melting temperature when tested against five oligomers with various core duplex sequences. Phylogenetic analysis of naturally occurring miRNAs was performed to support our results. Selection of the effector miR strand of the mature miRNA duplex appears to be dependent upon the identity of the 3′ double-nucleotide overhang. Thermodynamic parameters for 3′ single terminal overhangs adjacent to a UA pair are also presented. PMID:16820533
Genome-scale engineering of Saccharomyces cerevisiae with single-nucleotide precision.
Bao, Zehua; HamediRad, Mohammad; Xue, Pu; Xiao, Han; Tasan, Ipek; Chao, Ran; Liang, Jing; Zhao, Huimin
2018-07-01
We developed a CRISPR-Cas9- and homology-directed-repair-assisted genome-scale engineering method named CHAnGE that can rapidly output tens of thousands of specific genetic variants in yeast. More than 98% of target sequences were efficiently edited with an average frequency of 82%. We validate the single-nucleotide resolution genome-editing capability of this technology by creating a genome-wide gene disruption collection and apply our method to improve tolerance to growth inhibitors.
A SNP panel and online tool for checking genotype concordance through comparing QR codes.
Du, Yonghong; Martin, Joshua S; McGee, John; Yang, Yuchen; Liu, Eric Yi; Sun, Yingrui; Geihs, Matthias; Kong, Xuejun; Zhou, Eric Lingfeng; Li, Yun; Huang, Jie
2017-01-01
In the current precision medicine era, more and more samples get genotyped and sequenced. Both researchers and commercial companies expend significant time and resources to reduce the error rate. However, it has been reported that there is a sample mix-up rate of between 0.1% and 1%, not to mention the possibly higher mix-up rate during the down-stream genetic reporting processes. Even on the low end of this estimate, this translates to a significant number of mislabeled samples, especially over the projected one billion people that will be sequenced within the next decade. Here, we first describe a method to identify a small set of single nucleotide polymorphisms (SNPs) that can uniquely identify a personal genome, which utilizes allele frequencies of five major continental populations reported in the 1000 genomes project and the ExAC Consortium. To make this panel more informative, we added four SNPs that are commonly used to predict ABO blood type, and another two SNPs that are capable of predicting sex. We then implement a web interface (http://qrcme.tech), nicknamed QRC (for QR code based Concordance check), which is capable of extracting the relevant ID SNPs from raw genetic data, coding its genotype as a quick response (QR) code, and comparing QR codes to report the concordance of underlying genetic datasets. The resulting 80 fingerprinting SNPs represent a significant decrease in complexity and the number of markers used for genetic data labelling and tracking. Our method and web tool are easily accessible to both researchers and the general public who consider the accuracy of complex genetic data as a prerequisite towards precision medicine.
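A rough sketch of the idea behind a concordance check follows. It is not the QRC tool's implementation and does not involve QR encoding; the SNP identifiers and genotypes are made up for illustration, and the function name `concordance` is hypothetical. It also works out the scale of the mix-up problem at the low end of the quoted range: 0.1% of one billion samples.

```python
def concordance(sample_a: dict, sample_b: dict) -> float:
    """Fraction of shared SNPs whose genotypes match, ignoring allele order."""
    shared = sample_a.keys() & sample_b.keys()
    if not shared:
        raise ValueError("samples share no SNPs")
    matches = sum(
        1 for snp in shared if sorted(sample_a[snp]) == sorted(sample_b[snp])
    )
    return matches / len(shared)


# Hypothetical genotypes for two data files that should come from one person.
file_one = {"snp1": "AG", "snp2": "CC", "snp3": "TT", "snp4": "AG"}
file_two = {"snp1": "GA", "snp2": "CC", "snp3": "CT", "snp4": "AG"}
print(concordance(file_one, file_two))

# Low end of the quoted mix-up rate (0.1%) applied to one billion samples.
mislabeled_low_estimate = 1_000_000_000 // 1000
print(mislabeled_low_estimate)
```

A low concordance between files that are supposed to belong to the same individual would flag a possible sample mix-up; the cut-off to use is left open here because the abstract does not give one.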
A SNP panel and online tool for checking genotype concordance through comparing QR codes
Du, Yonghong; Martin, Joshua S.; McGee, John; Yang, Yuchen; Liu, Eric Yi; Sun, Yingrui; Geihs, Matthias; Kong, Xuejun; Zhou, Eric Lingfeng; Li, Yun
2017-01-01
In the current precision medicine era, more and more samples get genotyped and sequenced. Both researchers and commercial companies expend significant time and resources to reduce the error rate. However, it has been reported that there is a sample mix-up rate of between 0.1% and 1%, not to mention the possibly higher mix-up rate during the down-stream genetic reporting processes. Even on the low end of this estimate, this translates to a significant number of mislabeled samples, especially over the projected one billion people that will be sequenced within the next decade. Here, we first describe a method to identify a small set of single nucleotide polymorphisms (SNPs) that can uniquely identify a personal genome, which utilizes allele frequencies of five major continental populations reported in the 1000 genomes project and the ExAC Consortium. To make this panel more informative, we added four SNPs that are commonly used to predict ABO blood type, and another two SNPs that are capable of predicting sex. We then implement a web interface (http://qrcme.tech), nicknamed QRC (for QR code based Concordance check), which is capable of extracting the relevant ID SNPs from raw genetic data, coding its genotype as a quick response (QR) code, and comparing QR codes to report the concordance of underlying genetic datasets. The resulting 80 fingerprinting SNPs represent a significant decrease in complexity and the number of markers used for genetic data labelling and tracking. Our method and web tool are easily accessible to both researchers and the general public who consider the accuracy of complex genetic data as a prerequisite towards precision medicine. PMID:28926565
RNA-Seq Based Transcriptional Map of Bovine Respiratory Disease Pathogen “Histophilus somni 2336”
Kumar, Ranjit; Lawrence, Mark L.; Watt, James; Cooksey, Amanda M.; Burgess, Shane C.; Nanduri, Bindu
2012-01-01
Genome structural annotation, i.e., identification and demarcation of the boundaries for all the functional elements in a genome (e.g., genes, non-coding RNAs, proteins and regulatory elements), is a prerequisite for systems level analysis. Current genome annotation programs do not identify all of the functional elements of the genome, especially small non-coding RNAs (sRNAs). Whole genome transcriptome analysis is a complementary method to identify “novel” genes, small RNAs, regulatory regions, and operon structures, thus improving the structural annotation in bacteria. In particular, the identification of non-coding RNAs has revealed their widespread occurrence and functional importance in gene regulation, stress and virulence. However, very little is known about non-coding transcripts in Histophilus somni, one of the causative agents of Bovine Respiratory Disease (BRD) as well as bovine infertility, abortion, septicemia, arthritis, myocarditis, and thrombotic meningoencephalitis. In this study, we report a single nucleotide resolution transcriptome map of H. somni strain 2336 using RNA-Seq method. The RNA-Seq based transcriptome map identified 94 sRNAs in the H. somni genome of which 82 sRNAs were never predicted or reported in earlier studies. We also identified 38 novel potential protein coding open reading frames that were absent in the current genome annotation. The transcriptome map allowed the identification of 278 operon (total 730 genes) structures in the genome. When compared with the genome sequence of a non-virulent strain 129Pt, a disproportionate number of sRNAs (∼30%) were located in genomic region unique to strain 2336 (∼18% of the total genome). This observation suggests that a number of the newly identified sRNAs in strain 2336 may be involved in strain-specific adaptations. PMID:22276113
On origin of genetic code and tRNA before translation
2011-01-01
Background: Synthesis of proteins is based on the genetic code - a nearly universal assignment of codons to amino acids (aas). A major challenge to the understanding of the origins of this assignment is the archetypal "key-lock vs. frozen accident" dilemma. Here we re-examine this dilemma in light of 1) the fundamental veto on "foresight evolution", 2) modular structures of tRNAs and aminoacyl-tRNA synthetases, and 3) the updated library of aa-binding sites in RNA aptamers successfully selected in vitro for eight amino acids. Results: The aa-binding sites of arginine, isoleucine and tyrosine contain both their cognate triplets, anticodons and codons. We have noticed that these cases might be associated with palindrome-dinucleotides. For example, one-base shift to the left brings arginine codons CGN, with CG at 1-2 positions, to the respective anticodons NCG, with CG at 2-3 positions. Formally, the concomitant presence of codons and anticodons is also expected in the reverse situation, with codons containing palindrome-dinucleotides at their 2-3 positions, and anticodons exhibiting them at 1-2 positions. A closer analysis reveals that, surprisingly, RNA binding sites for Arg, Ile and Tyr "prefer" (exactly as in the actual genetic code) the anticodon(2-3)/codon(1-2) tetramers to their anticodon(1-2)/codon(2-3) counterparts, despite the seemingly perfect symmetry of the latter. However, since in vitro selection of aa-specific RNA aptamers apparently had nothing to do with translation, this striking preference provides a new strong support to the notion of the genetic code emerging before translation, in response to catalytic (and possibly other) needs of ancient RNA life.
Consistent with the pre-translation origin of the code, we propose here a new model of tRNA origin by the gradual, Fibonacci process-like, elongation of a tRNA molecule from a primordial coding triplet and 5'DCCA3' quadruplet (D is a base-determinator) to the eventual 76 base-long cloverleaf-shaped molecule. Conclusion: Taken together, our findings necessarily imply that primordial tRNAs, tRNA aminoacylating ribozymes, and (later) the translation machinery in general have been co-evolving to "fit" the (likely already defined) genetic code, rather than the other way around. Coding triplets in this primal pre-translational code were likely similar to the anticodons, with second and third nucleotides being more important than the less specific first one. Later, when the code was expanding in co-evolution with the translation apparatus, the importance of 2-3 nucleotides of coding triplets "transferred" to the 1-2 nucleotides of their complements, thus distinguishing anticodons from codons. This evolutionary primacy of anticodons in genetic coding makes the hypothesis of primal stereo-chemical affinity between amino acids and cognate triplets, the hypothesis of coding coenzyme handles for amino acids, the hypothesis of tRNA-like genomic 3' tags suggesting that tRNAs originated in replication, and the hypothesis of ancient ribozymes-mediated operational code of tRNA aminoacylation not mutually contradicting but rather co-existing in harmony. Reviewers: This article was reviewed by Eugene V. Koonin, Wentao Ma (nominated by Juergen Brosius) and Anthony Poole. PMID:21342520
Germ-line and somatic EPHA2 coding variants in lens aging and cataract.
Bennett, Thomas M; M'Hamdi, Oussama; Hejtmancik, J Fielding; Shiels, Alan
2017-01-01
Rare germ-line mutations in the coding regions of the human EPHA2 gene (EPHA2) have been associated with inherited forms of pediatric cataract, whereas, frequent, non-coding, single nucleotide variants (SNVs) have been associated with age-related cataract. Here we sought to determine if germ-line EPHA2 coding SNVs were associated with age-related cataract in a case-control DNA panel (> 50 years) and if somatic EPHA2 coding SNVs were associated with lens aging and/or cataract in a post-mortem lens DNA panel (> 48 years). Micro-fluidic PCR amplification followed by targeted amplicon (exon) next-generation (deep) sequencing of EPHA2 (17-exons) afforded high read-depth coverage (1000x) for > 82% of reads in the cataract case-control panel (161 cases, 64 controls) and > 70% of reads in the post-mortem lens panel (35 clear lens pairs, 22 cataract lens pairs). Novel and reference (known) missense SNVs in EPHA2 that were predicted in silico to be functionally damaging were found in both cases and controls from the age-related cataract panel at variant allele frequencies (VAFs) consistent with germ-line transmission (VAF > 20%). Similarly, both novel and reference missense SNVs in EPHA2 were found in the post-mortem lens panel at VAFs consistent with a somatic origin (VAF > 3%). The majority of SNVs found in the cataract case-control panel and post-mortem lens panel were transitions and many occurred at di-pyrimidine sites that are susceptible to ultraviolet (UV) radiation induced mutation. These data suggest that novel germ-line (blood) and somatic (lens) coding SNVs in EPHA2 that are predicted to be functionally deleterious occur in adults over 50 years of age. However, both types of EPHA2 coding variants were present at comparable levels in individuals with or without age-related cataract making simple genotype-phenotype correlations inconclusive.
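The variant allele frequency (VAF) cut-offs quoted in this abstract can be illustrated with a small sketch. VAF is taken here as variant-supporting reads divided by total reads at the site, which is the usual definition but is not spelled out in the abstract; the read counts and names below are hypothetical, and the two thresholds apply to different panels (blood-derived case-control DNA versus post-mortem lens DNA).

```python
GERMLINE_MIN_VAF = 0.20  # case-control panel: VAF > 20% taken as germ-line
SOMATIC_MIN_VAF = 0.03   # post-mortem lens panel: VAF > 3% taken as somatic


def variant_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def passes(vaf: float, threshold: float) -> bool:
    """True when the VAF is strictly above the panel's threshold."""
    return vaf > threshold


# Hypothetical read counts at roughly the 1000x depth mentioned in the abstract.
blood_vaf = variant_allele_frequency(480, 1000)
lens_vaf = variant_allele_frequency(50, 1000)
print(passes(blood_vaf, GERMLINE_MIN_VAF), passes(lens_vaf, SOMATIC_MIN_VAF))
```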
Germ-line and somatic EPHA2 coding variants in lens aging and cataract
Bennett, Thomas M.; M’Hamdi, Oussama; Hejtmancik, J. Fielding
2017-01-01
Rare germ-line mutations in the coding regions of the human EPHA2 gene (EPHA2) have been associated with inherited forms of pediatric cataract, whereas, frequent, non-coding, single nucleotide variants (SNVs) have been associated with age-related cataract. Here we sought to determine if germ-line EPHA2 coding SNVs were associated with age-related cataract in a case-control DNA panel (> 50 years) and if somatic EPHA2 coding SNVs were associated with lens aging and/or cataract in a post-mortem lens DNA panel (> 48 years). Micro-fluidic PCR amplification followed by targeted amplicon (exon) next-generation (deep) sequencing of EPHA2 (17-exons) afforded high read-depth coverage (1000x) for > 82% of reads in the cataract case-control panel (161 cases, 64 controls) and > 70% of reads in the post-mortem lens panel (35 clear lens pairs, 22 cataract lens pairs). Novel and reference (known) missense SNVs in EPHA2 that were predicted in silico to be functionally damaging were found in both cases and controls from the age-related cataract panel at variant allele frequencies (VAFs) consistent with germ-line transmission (VAF > 20%). Similarly, both novel and reference missense SNVs in EPHA2 were found in the post-mortem lens panel at VAFs consistent with a somatic origin (VAF > 3%). The majority of SNVs found in the cataract case-control panel and post-mortem lens panel were transitions and many occurred at di-pyrimidine sites that are susceptible to ultraviolet (UV) radiation induced mutation. These data suggest that novel germ-line (blood) and somatic (lens) coding SNVs in EPHA2 that are predicted to be functionally deleterious occur in adults over 50 years of age. However, both types of EPHA2 coding variants were present at comparable levels in individuals with or without age-related cataract making simple genotype-phenotype correlations inconclusive. PMID:29267365
Domier, L L; Latorre, I J; Steinlage, T A; McCoppin, N; Hartman, G L
2003-10-01
The variability of North American and Asian strains and isolates of Soybean mosaic virus was investigated. First, polymerase chain reaction (PCR) products representing the coat protein (CP)-coding regions of 38 SMVs were analyzed for restriction fragment length polymorphisms (RFLP). Second, the nucleotide and predicted amino acid sequence variability of the P1-coding region of 18 SMVs and the helper component/protease (HC/Pro) and CP-coding regions of 25 SMVs were assessed. The CP nucleotide and predicted amino acid sequences were the most similar and predicted phylogenetic relationships similar to those obtained from RFLP analysis. Neither RFLP nor sequence analyses of the CP-coding regions grouped the SMVs by geographical origin. The P1 and HC/Pro sequences were more variable and separated the North American and Asian SMV isolates into two groups similar to previously reported differences in pathogenic diversity of the two sets of SMV isolates. The P1 region was the most informative of the three regions analyzed. To assess the biological relevance of the sequence differences in the HC/Pro and CP coding regions, the transmissibility of 14 SMV isolates by Aphis glycines was tested. All field isolates of SMV were transmitted efficiently by A. glycines, but the laboratory isolates analyzed were transmitted poorly. The amino acid sequences from most, but not all, of the poorly transmitted isolates contained mutations in the aphid transmission-associated DAG and/or KLSC amino acid sequence motifs of CP and HC/Pro, respectively.
Simonen, Marja-Leena; Roivainen, Merja; Iber, Jane; Burns, Cara; Hovi, Tapani
2010-01-01
In 1984, a wild type 3 poliovirus (PV3/FIN84) spread all over Finland causing nine cases of paralytic poliomyelitis and one case of aseptic meningitis. The outbreak was ended in 1985 with an intensive vaccination campaign. By limited sequence comparison with previously isolated PV3 strains, closest relatives of PV3/FIN84 were found among strains circulating in the Mediterranean region. Now we wanted to reanalyse the relationships using approaches currently exploited in poliovirus surveillance. Cell lysates of 22 strains isolated during the outbreak and stored frozen were subjected to RT-PCR amplification in three genomic regions without prior subculture. Sequences of the entire VP1 coding region, 150 nucleotides in the VP1-2A junction, most of the 5' non-coding region, partial sequences of the 3D RNA polymerase coding region and partial 3' non-coding region were compared within the outbreak and with sequences available in data banks. In addition, complete nucleotide sequences were obtained for 2 strains isolated from two different cases of disease during the outbreak. The results confirmed the previously described wide intraepidemic variation of the strains, including amino acid substitutions in antigenic sites, as well as the likely Mediterranean region origin of the strains. Simplot and bootscanning analyses of the complete genomes indicated complicated evolutionary history of the non-capsid coding regions of the genome suggesting several recombinations with different HEV-C viruses in the past.
Novel insertion mutation of ABCB1 gene in an ivermectin-sensitive Border Collie.
Han, Jae-Ik; Son, Hyoung-Won; Park, Seung-Cheol; Na, Ki-Jeong
2010-12-01
P-glycoprotein (P-gp) is encoded by the ABCB1 gene and acts as an efflux pump for xenobiotics. In the Border Collie, a nonsense mutation caused by a 4-base pair deletion in the ABCB1 gene is associated with a premature stop to P-gp synthesis. In this study, we examined the full-length coding sequence of the ABCB1 gene in an ivermectin-sensitive Border Collie that lacked the aforementioned deletion mutation. The sequence was compared to the corresponding sequences of a wild-type Beagle and seven ivermectin-tolerant family members of the Border Collie. When compared to the wild-type Beagle sequence, that of the ivermectin-sensitive Border Collie was found to have one insertion mutation and eight single nucleotide polymorphisms (SNPs) in the coding sequence of the ABCB1 gene. While the eight SNPs were also found in the family members' sequences, the insertion mutation was found only in the ivermectin-sensitive dog. These results suggest the possibility that the SNPs are species-specific features of the ABCB1 gene in Border Collies, and that the insertion mutation may be related to ivermectin intolerance.
Novel insertion mutation of ABCB1 gene in an ivermectin-sensitive Border Collie
Han, Jae-Ik; Son, Hyoung-Won; Park, Seung-Cheol
2010-01-01
P-glycoprotein (P-gp) is encoded by the ABCB1 gene and acts as an efflux pump for xenobiotics. In the Border Collie, a nonsense mutation caused by a 4-base pair deletion in the ABCB1 gene is associated with a premature stop to P-gp synthesis. In this study, we examined the full-length coding sequence of the ABCB1 gene in an ivermectin-sensitive Border Collie that lacked the aforementioned deletion mutation. The sequence was compared to the corresponding sequences of a wild-type Beagle and seven ivermectin-tolerant family members of the Border Collie. When compared to the wild-type Beagle sequence, that of the ivermectin-sensitive Border Collie was found to have one insertion mutation and eight single nucleotide polymorphisms (SNPs) in the coding sequence of the ABCB1 gene. While the eight SNPs were also found in the family members' sequences, the insertion mutation was found only in the ivermectin-sensitive dog. These results suggest the possibility that the SNPs are species-specific features of the ABCB1 gene in Border Collies, and that the insertion mutation may be related to ivermectin intolerance. PMID:21113104
Finding cancer driver mutations in the era of big data research.
Poulos, Rebecca C; Wong, Jason W H
2018-04-02
In the last decade, the costs of genome sequencing have decreased considerably. The commencement of large-scale cancer sequencing projects has enabled cancer genomics to join the big data revolution. One of the challenges still facing cancer genomics research is determining which are the driver mutations in an individual cancer, as these contribute only a small subset of the overall mutation profile of a tumour. Focusing primarily on somatic single nucleotide mutations in this review, we consider both coding and non-coding driver mutations, and discuss how such mutations might be identified from cancer sequencing datasets. We describe some of the tools and databases that are available for the annotation of somatic variants and the identification of cancer driver genes. We also address the use of genome-wide variation in mutation load to establish background mutation rates from which to identify driver mutations under positive selection. Finally, we describe the ways in which mutational signatures can act as clues for the identification of cancer drivers, as these mutations may cause, or arise from, certain mutational processes. By defining the molecular changes responsible for driving cancer development, new cancer treatment strategies may be developed or novel preventative measures proposed.
Core signaling pathways in human pancreatic cancers revealed by global genomic analyses.
Jones, Siân; Zhang, Xiaosong; Parsons, D Williams; Lin, Jimmy Cheng-Ho; Leary, Rebecca J; Angenendt, Philipp; Mankoo, Parminder; Carter, Hannah; Kamiyama, Hirohiko; Jimeno, Antonio; Hong, Seung-Mo; Fu, Baojin; Lin, Ming-Tseh; Calhoun, Eric S; Kamiyama, Mihoko; Walter, Kimberly; Nikolskaya, Tatiana; Nikolsky, Yuri; Hartigan, James; Smith, Douglas R; Hidalgo, Manuel; Leach, Steven D; Klein, Alison P; Jaffee, Elizabeth M; Goggins, Michael; Maitra, Anirban; Iacobuzio-Donahue, Christine; Eshleman, James R; Kern, Scott E; Hruban, Ralph H; Karchin, Rachel; Papadopoulos, Nickolas; Parmigiani, Giovanni; Vogelstein, Bert; Velculescu, Victor E; Kinzler, Kenneth W
2008-09-26
There are currently few therapeutic options for patients with pancreatic cancer, and new insights into the pathogenesis of this lethal disease are urgently needed. Toward this end, we performed a comprehensive genetic analysis of 24 pancreatic cancers. We first determined the sequences of 23,219 transcripts, representing 20,661 protein-coding genes, in these samples. Then, we searched for homozygous deletions and amplifications in the tumor DNA by using microarrays containing probes for approximately 10^6 single-nucleotide polymorphisms. We found that pancreatic cancers contain an average of 63 genetic alterations, the majority of which are point mutations. These alterations defined a core set of 12 cellular signaling pathways and processes that were each genetically altered in 67 to 100% of the tumors. Analysis of these tumors' transcriptomes with next-generation sequencing-by-synthesis technologies provided independent evidence for the importance of these pathways and processes. Our data indicate that genetically altered core pathways and regulatory processes only become evident once the coding regions of the genome are analyzed in depth. Dysregulation of these core pathways and processes through mutation can explain the major features of pancreatic tumorigenesis.
Kim, Young Jong; Park, Jin Kyung; Kang, Won Sub; Kim, Su Kang; Han, Changsu; Na, Hae Ri; Park, Hae Jeong; Kim, Jong Woo; Kim, Young Youl; Park, Moon Ho
2017-01-01
Objective: Mitochondrial dysfunction is a prominent and early feature of Alzheimer's disease (AD). The morphologic changes observed in the AD brain could be caused by a failure of mitochondrial fusion mechanisms. The aim of this study was to investigate whether genetic polymorphisms of two genes involved in mitochondrial fusion mechanisms, optic atrophy 1 (OPA1) and mitofusin 2 (MFN2), were associated with AD in the Korean population by analyzing genotypes and allele frequencies. Methods: One coding single nucleotide polymorphism (SNP) in MFN2, rs1042837, and two coding SNPs in OPA1, rs7624750 and rs9851685, were compared between 165 patients with AD (83 men and 82 women, mean age 72.3±4.41) and 186 healthy control subjects (82 men and 104 women, mean age 76.5±5.98). Results: Among these three SNPs, rs1042837 showed statistically significant differences in allele frequency, and in genotype frequency in the co-dominant 1 model and in the dominant model. Conclusion: These results suggest that the rs1042837 polymorphism in MFN2 may be involved in the pathogenesis of AD. PMID:28096879
Aslam, Luqman; Beal, Kathryn; Ann Blomberg, Le; Bouffard, Pascal; Burt, David W.; Crasta, Oswald; Crooijmans, Richard P. M. A.; Cooper, Kristal; Coulombe, Roger A.; De, Supriyo; Delany, Mary E.; Dodgson, Jerry B.; Dong, Jennifer J.; Evans, Clive; Frederickson, Karin M.; Flicek, Paul; Florea, Liliana; Folkerts, Otto; Groenen, Martien A. M.; Harkins, Tim T.; Herrero, Javier; Hoffmann, Steve; Megens, Hendrik-Jan; Jiang, Andrew; de Jong, Pieter; Kaiser, Pete; Kim, Heebal; Kim, Kyu-Won; Kim, Sungwon; Langenberger, David; Lee, Mi-Kyung; Lee, Taeheon; Mane, Shrinivasrao; Marcais, Guillaume; Marz, Manja; McElroy, Audrey P.; Modise, Thero; Nefedov, Mikhail; Notredame, Cédric; Paton, Ian R.; Payne, William S.; Pertea, Geo; Prickett, Dennis; Puiu, Daniela; Qioa, Dan; Raineri, Emanuele; Ruffier, Magali; Salzberg, Steven L.; Schatz, Michael C.; Scheuring, Chantel; Schmidt, Carl J.; Schroeder, Steven; Searle, Stephen M. J.; Smith, Edward J.; Smith, Jacqueline; Sonstegard, Tad S.; Stadler, Peter F.; Tafer, Hakim; Tu, Zhijian (Jake); Van Tassell, Curtis P.; Vilella, Albert J.; Williams, Kelly P.; Yorke, James A.; Zhang, Liqing; Zhang, Hong-Bin; Zhang, Xiaojun; Zhang, Yang; Reed, Kent M.
2010-01-01
A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest. PMID:20838655
The bipartite mitochondrial genome of Ruizia karukerae (Rhigonematomorpha, Nematoda).
Kim, Taeho; Kern, Elizabeth; Park, Chungoo; Nadler, Steven A; Bae, Yeon Jae; Park, Joong-Ki
2018-05-10
Mitochondrial genes and whole mitochondrial genome sequences are widely used as molecular markers in studying population genetics and resolving both deep and shallow nodes in phylogenetics. In animals the mitochondrial genome is generally composed of a single chromosome, but mystifying exceptions sometimes occur. We determined the complete mitochondrial genome of the millipede-parasitic nematode Ruizia karukerae and found its mitochondrial genome consists of two circular chromosomes, which is highly unusual in bilateral animals. Chromosome I is 7,659 bp and includes six protein-coding genes, two rRNA genes and nine tRNA genes. Chromosome II comprises 7,647 bp, with seven protein-coding genes and 16 tRNA genes. Interestingly, both chromosomes share a 1,010 bp sequence containing duplicate copies of cox2 and three tRNA genes (trnD, trnG and trnH), and the nucleotide sequences between the duplicated homologous gene copies are nearly identical, suggesting a possible recent genesis for this bipartite mitochondrial genome. Given that little is known about the formation, maintenance or evolution of abnormal mitochondrial genome structures, R. karukerae mtDNA may provide an important early glimpse into this process.
Mutation detection in the human HSP70B′ gene by denaturing high-performance liquid chromatography
Hecker, Karl H.; Asea, Alexzander; Kobayashi, Kaoru; Green, Stacy; Tang, Dan; Calderwood, Stuart K.
2000-01-01
Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B′ gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER™ software. Four overlapping amplicons, which span the complete coding region of the HSP70B′ gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B′ gene on the WAVE® Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed. PMID:11189446
Mutation detection in the human HSP70B' gene by denaturing high-performance liquid chromatography.
Hecker, K H; Asea, A; Kobayashi, K; Green, S; Tang, D; Calderwood, S K
2000-11-01
Variances, particularly single nucleotide polymorphisms (SNP), in the genomic sequence of individuals are the primary key to understanding gene function as it relates to differences in the susceptibility to disease, environmental influences, and therapy. In this report, the HSP70B' gene is the target sequence for mutation detection in biopsy samples from human prostate cancer patients undergoing combined hyperthermia and radiation therapy at the Dana-Farber Cancer Institute, using temperature-modulated heteroduplex analysis (TMHA). The underlying principles of TMHA for mutation detection using DHPLC technology are discussed. The procedures involved in amplicon design for mutation analysis by DHPLC are detailed. The melting behavior of the complete coding sequence of the target gene is characterized using WAVEMAKER software. Four overlapping amplicons, which span the complete coding region of the HSP70B' gene, amenable to mutation detection by DHPLC were identified based on the software-predicted melting profile of the target sequence. TMHA was performed on PCR products of individual amplicons of the HSP70B' gene on the WAVE Nucleic Acid Fragment Analysis System. The criteria for mutation calling by comparing wild-type and mutant chromatographic patterns are discussed.
Tsioris, Konstantinos; Gupta, Namita T.; Ogunniyi, Adebola O.; Zimnisky, Ross M.; Qian, Feng; Yao, Yi; Wang, Xiaomei; Stern, Joel N. H.; Chari, Raj; Briggs, Adrian W.; Clouser, Christopher R.; Vigneault, Francois; Church, George M.; Garcia, Melissa N.; Murray, Kristy O.; Montgomery, Ruth R.; Kleinstein, Steven H.; Love, J. Christopher
2015-01-01
West Nile virus (WNV) infection is an emerging mosquito-borne disease that can lead to severe neurological illness and currently has no available treatment or vaccine. Using microengraving, an integrated single-cell analysis method, we analyzed a cohort of subjects infected with WNV - recently infected and post-convalescent subjects - and efficiently identified four novel WNV neutralizing antibodies. We also assessed the humoral response to WNV on a single-cell and repertoire level by integrating next generation sequencing (NGS) into our analysis. The results from single-cell analysis indicate persistence of WNV-specific memory B cells and antibody-secreting cells in post-convalescent subjects. These cells exhibited class-switched antibody isotypes. Furthermore, the results suggest that the antibody response itself does not predict the clinical severity of the disease (asymptomatic or symptomatic). Using the nucleotide coding sequences for WNV-specific antibodies derived from single cells, we revealed the ontogeny of expanded WNV-specific clones in the repertoires of recently infected subjects through NGS and bioinformatic analysis. This analysis also indicated that the humoral response to WNV did not depend on an anamnestic response, due to an unlikely previous exposure to the virus. The innovative and integrative approach presented here to analyze the evolution of neutralizing antibodies from natural infection on a single-cell and repertoire level can also be applied to vaccine studies, and could potentially aid the development of therapeutic antibodies and our basic understanding of other infectious diseases. PMID:26481611
Tsioris, Konstantinos; Gupta, Namita T; Ogunniyi, Adebola O; Zimnisky, Ross M; Qian, Feng; Yao, Yi; Wang, Xiaomei; Stern, Joel N H; Chari, Raj; Briggs, Adrian W; Clouser, Christopher R; Vigneault, Francois; Church, George M; Garcia, Melissa N; Murray, Kristy O; Montgomery, Ruth R; Kleinstein, Steven H; Love, J Christopher
2015-12-01
West Nile virus (WNV) infection is an emerging mosquito-borne disease that can lead to severe neurological illness and currently has no available treatment or vaccine. Using microengraving, an integrated single-cell analysis method, we analyzed a cohort of subjects infected with WNV - recently infected and post-convalescent subjects - and efficiently identified four novel WNV neutralizing antibodies. We also assessed the humoral response to WNV on a single-cell and repertoire level by integrating next generation sequencing (NGS) into our analysis. The results from single-cell analysis indicate persistence of WNV-specific memory B cells and antibody-secreting cells in post-convalescent subjects. These cells exhibited class-switched antibody isotypes. Furthermore, the results suggest that the antibody response itself does not predict the clinical severity of the disease (asymptomatic or symptomatic). Using the nucleotide coding sequences for WNV-specific antibodies derived from single cells, we revealed the ontogeny of expanded WNV-specific clones in the repertoires of recently infected subjects through NGS and bioinformatic analysis. This analysis also indicated that the humoral response to WNV did not depend on an anamnestic response, due to an unlikely previous exposure to the virus. The innovative and integrative approach presented here to analyze the evolution of neutralizing antibodies from natural infection on a single-cell and repertoire level can also be applied to vaccine studies, and could potentially aid the development of therapeutic antibodies and our basic understanding of other infectious diseases.
Lühr, B; Scheller, J; Meyer, P; Kramer, W
1998-02-01
We have analysed the correction of defined mismatches in wild-type and msh2, msh3, msh6 and msh3 msh6 mutants of Saccharomyces cerevisiae in two different yeast strain backgrounds by transformation with plasmid heteroduplex DNA constructs. Ten different base/base mismatches, two single-nucleotide loops and a 38-nucleotide loop were tested. Repair of all types of mismatches was severely impaired in msh2 and msh3 msh6 mutants. In msh6 mutants, repair efficiency of most base/base mismatches was reduced to a similar extent as in msh3 msh6 double mutants. G/T and A/C mismatches, however, displayed residual repair in msh6 mutants in one strain background, implying a role for Msh3p in recognition of base/base mismatches. Furthermore, the efficiency of repair of base/base mismatches was considerably reduced in msh3 mutants in one strain background, indicating a requirement for MSH3 for fully efficient mismatch correction. Also the efficiency of repair of the 38-nucleotide loop was reduced in msh3 mutants, and to a lesser extent in msh6 mutants. The single-nucleotide loop with an unpaired A was less efficiently repaired in msh3 mutants and that with an unpaired T was less efficiently corrected in msh6 mutants, indicating non-redundant functions for the two proteins in the recognition of single-nucleotide loops.
Battilana, Juri; Emanuelli, Francesco; Gambino, Giorgio; Gribaudo, Ivana; Gasperi, Flavia; Boss, Paul K.; Grando, Maria Stella
2011-01-01
Grape berries of Muscat cultivars (Vitis vinifera L.) contain high levels of monoterpenols and exhibit a distinct aroma related to this composition of volatiles. A structural gene of the plastidial methyl-erythritol-phosphate (MEP) pathway, 1-deoxy-D-xylulose 5-phosphate synthase (VvDXS), was recently suggested as a candidate gene for this trait, having been co-localized with a major quantitative trait locus for linalool, nerol, and geraniol concentrations in berries. In addition, a structured association study discovered a putative causal single nucleotide polymorphism (SNP) responsible for the substitution of a lysine with an asparagine at position 284 of the VvDXS protein, and this SNP was significantly associated with Muscat-flavoured varieties. The significance of this nucleotide difference was investigated by comparing the monoterpene profiles with the expression of VvDXS alleles throughout berry development in Moscato Bianco, a cultivar heterozygous for the SNP mutation. Although correlation was detected between the VvDXS transcript profile and the accumulation of free monoterpenol odorants, the modulation of VvDXS expression during berry development appears to be independent of nucleotide variation in the coding sequence. In order to assess how the non-synonymous mutation may enhance Muscat flavour, an in vitro characterization of enzyme isoforms was performed followed by in vivo overexpression of each VvDXS allele in tobacco. The results showed that the amino acid non-neutral substitution influences the enzyme kinetics by increasing the catalytic efficiency and also dramatically affects monoterpene levels in transgenic lines. These findings confirm a functional effect of the VvDXS gene polymorphism and may pave the way for metabolic engineering of terpenoid contents in grapevine. PMID:21868399
CHANG, Weihua; WANG, Juanhong; TAO, Dayong; ZHANG, Yong; HE, Jianzhong; SHI, Changqing
2015-01-01
MicroRNAs (miRNAs) are a class of short endogenous, single-stranded, non-coding small RNA molecules, about 19–25 nucleotides in length, that regulate gene expression at the translation level and influence many physiological processes, such as apoptosis, metabolism, signal transduction, and the occurrence and development of diseases. In this study, we constructed a library from the ovine luteal phase ovary by using next-generation sequencing technology (Solexa high-throughput sequencing technique) and identified 267 novel miRNAs by bioinformatics. One of the novel miRNAs (ovis_aries_ovary-m0033_3p), which was expressed in the sheep ovary and testis, was confirmed by real time PCR and northern blot. Ovis_aries_ovary-m0033_3p was 21 nucleotides in length and located on chromosome 12, and it had 100% similarity to hsa-miR-214-3p, mmu-miR-214-3p, dre-miR-214 and ssc-miR-214. Meanwhile, the pre-miRNA was 82 nucleotides in length and had a standard hairpin stem-loop structure. From the consistency of the sequence and structure, we speculated that ovis_aries_ovary-m0033_3p had a function similar to hsa-miR-214-3p, which is involved in the fine regulation of cell survival, embryonic development, breeding activities and resistance to ovarian cancer, so we defined it as oar-miR-214-3p. These experimental results will enrich the miRNA database for Ovis aries and provide the basis for researching the regulation mechanism of miRNA in relation to breeding activities of seasonal breeding animals. PMID:26268666
Jo, Jihoon; Oh, Jooseong; Lee, Hyun-Gwan; Hong, Hyun-Hee; Lee, Sung-Gwon; Cheon, Seongmin; Kern, Elizabeth M A; Jin, Soyeong; Cho, Sung-Jin; Park, Joong-Ki; Park, Chungoo
2017-01-01
The Japanese sea cucumber (Apostichopus japonicus Selenka 1867) is an economically important species as a source of seafood and ingredient in traditional medicine. It is mainly found off the coasts of northeast Asia. Recently, substantial exploitation and widespread biotic diseases in A. japonicus have generated increasing conservation concern. However, the genomic knowledge base and resources available for researchers to use in managing this natural resource and to establish genetically based breeding systems for sea cucumber aquaculture are still in a nascent stage. A total of 312 Gb of raw sequences were generated using the Illumina HiSeq 2000 platform and assembled to a final size of 0.66 Gb, which is about 80.5% of the estimated genome size (0.82 Gb). We observed nucleotide-level heterozygosity within the assembled genome to be 0.986%. The resulting draft genome assembly comprising 132 607 scaffolds with an N50 value of 10.5 kb contains a total of 21 771 predicted protein-coding genes. We identified 6.6-14.5 million heterozygous single nucleotide polymorphisms in the assembled genome of the three natural color variants (green, red, and black), resulting in an estimated nucleotide diversity of 0.00146. We report the first draft genome of A. japonicus and provide a general overview of the genetic variation in the three major color variants of A. japonicus. These data will help provide a comprehensive view of the genetic, physiological, and evolutionary relationships among color variants in A. japonicus, and will be invaluable resources for sea cucumber genomic research. © The Author 2017. Published by Oxford University Press.
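The assembly statistics quoted in the abstract above (an N50 value and the assembled fraction of the estimated genome size) can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the function name `n50` and the toy scaffold lengths are assumptions made for illustration, and only the 0.66 Gb and 0.82 Gb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths (kb), chosen only to illustrate the definition.
toy_scaffolds = [2, 3, 4, 5, 6, 10]
print(n50(toy_scaffolds))  # 10 + 6 = 16 >= 15, so the N50 is 6

# Assembled fraction, using the sizes reported in the abstract.
assembled_gb, estimated_gb = 0.66, 0.82
print(round(100 * assembled_gb / estimated_gb, 1))  # 80.5
```

Sorting in descending order and accumulating until half the total is reached is the usual textbook definition of N50; other conventions (for example NG50, which uses the estimated genome size instead of the assembly size) would give different values.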
Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick
1982-01-01
The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. PMID:16593262
In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.
Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M
2014-01-30
RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
Bryant, D A; de Lorimier, R; Lambert, D H; Dubbs, J M; Stirewalt, V L; Stevens, S E; Porter, R D; Tam, J; Jay, E
1985-01-01
The genes for the alpha- and beta-subunit apoproteins of allophycocyanin (AP) were isolated from the cyanelle genome of Cyanophora paradoxa and subjected to nucleotide sequence analysis. The AP beta-subunit apoprotein gene was localized to a 7.8-kilobase-pair Pst I restriction fragment from cyanelle DNA by hybridization with a tetradecameric oligonucleotide probe. Sequence analysis using that oligonucleotide and its complement as primers for the dideoxy chain-termination sequencing method confirmed the presence of both AP alpha- and beta-subunit genes on this restriction fragment. Additional oligonucleotide primers were synthesized as sequencing progressed and were used to determine rapidly the nucleotide sequence of a 1336-base-pair region of this cloned fragment. This strategy allowed the sequencing to be completed without a detailed restriction map and without extensive and time-consuming subcloning. The sequenced region contains two open reading frames whose deduced amino acid sequences are 81-85% homologous to cyanobacterial and red algal AP subunits whose amino acid sequences have been determined. The two open reading frames are in the same orientation and are separated by 39 base pairs. AP alpha is 5' to AP beta and both coding sequences are preceded by a polypurine, Shine-Dalgarno-type sequence. Sequences upstream from AP alpha closely resemble the Escherichia coli consensus promoter sequences and also show considerable homology to promoter sequences for several chloroplast-encoded psbA genes. A 56-base-pair palindromic sequence downstream from the AP beta gene could play a role in the termination of transcription or translation. The allophycocyanin apoprotein subunit genes are located on the large single-copy region of the cyanelle genome. PMID:2987916
Demonstration of Protein-Based Human Identification Using the Hair Shaft Proteome
Leppert, Tami; Anex, Deon S.; Hilmer, Jonathan K.; Matsunami, Nori; Baird, Lisa; Stevens, Jeffery; Parsawar, Krishna; Durbin-Johnson, Blythe P.; Rocke, David M.; Nelson, Chad; Fairbanks, Daniel J.; Wilson, Andrew S.; Rice, Robert H.; Woodward, Scott R.; Bothner, Brian; Hart, Bradley R.; Leppert, Mark
2016-01-01
Human identification from biological material is largely dependent on the ability to characterize genetic polymorphisms in DNA. Unfortunately, DNA can degrade in the environment, sometimes below the level at which it can be amplified by PCR. Protein however is chemically more robust than DNA and can persist for longer periods. Protein also contains genetic variation in the form of single amino acid polymorphisms. These can be used to infer the status of non-synonymous single nucleotide polymorphism alleles. To demonstrate this, we used mass spectrometry-based shotgun proteomics to characterize hair shaft proteins in 66 European-American subjects. A total of 596 single nucleotide polymorphism alleles were correctly imputed in 32 loci from 22 genes of subjects’ DNA and directly validated using Sanger sequencing. Estimates of the probability of resulting individual non-synonymous single nucleotide polymorphism allelic profiles in the European population, using the product rule, resulted in a maximum power of discrimination of 1 in 12,500. Imputed non-synonymous single nucleotide polymorphism profiles from European–American subjects were considerably less frequent in the African population (maximum likelihood ratio = 11,000). The converse was true for hair shafts collected from an additional 10 subjects with African ancestry, where some profiles were more frequent in the African population. Genetically variant peptides were also identified in hair shaft datasets from six archaeological skeletal remains (up to 260 years old). This study demonstrates that quantifiable measures of identity discrimination and biogeographic background can be obtained from detecting genetically variant peptides in hair shaft protein, including hair from bioarchaeological contexts. PMID:27603779
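The product rule mentioned in the abstract above multiplies per-locus genotype frequencies to estimate how common a whole profile is, which is valid only if the loci are statistically independent. The sketch below illustrates the arithmetic under that assumption; the function name and the three frequencies are hypothetical and are not taken from the study.

```python
from fractions import Fraction
from math import prod


def profile_probability(genotype_frequencies):
    """Product rule: probability of a multi-locus profile, assuming the
    loci are independent (an assumption, not something checked here)."""
    return prod(genotype_frequencies)


# Hypothetical per-locus genotype frequencies in a reference population.
frequencies = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 10)]

probability = profile_probability(frequencies)
print(probability)      # 1/80
print(1 / probability)  # 80, i.e. a power of discrimination of "1 in 80"
```

Exact fractions are used here so that the "1 in N" figure is not blurred by floating-point rounding; with real population frequencies, plain floats would normally be adequate.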
Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K
2017-04-01
There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact, more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could not only create inversion isomers of so-called single copy regions, but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion and mapping filtered, long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra-and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
NASA Astrophysics Data System (ADS)
Kraljić, K.; Strüngmann, L.; Fimmel, E.; Gumbel, M.
2018-01-01
The genetic code is degenerate and it is assumed that redundancy provides error detection and correction mechanisms in the translation process. However, the biological meaning of the code's structure is still under current research. This paper presents a Genetic Code Analysis Toolkit (GCAT) which provides workflows and algorithms for the analysis of the structure of nucleotide sequences. In particular, sets or sequences of codons can be transformed and tested for circularity, comma-freeness, dichotomic partitions and others. GCAT comes with a fertile editor custom-built to work with the genetic code and a batch mode for multi-sequence processing. With the ability to read FASTA files or load sequences from GenBank, the tool can be used for the mathematical and statistical analysis of existing sequence data. GCAT is Java-based and provides a plug-in concept for extensibility. Availability: Open source. Homepage: http://www.gcat.bio/
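One of the properties named in this abstract, comma-freeness, can be illustrated with a short Python sketch. This is not GCAT code (GCAT is Java-based and its API is not described here). The function name `is_comma_free` is hypothetical, and the sketch uses the usual definition: a set of trinucleotides is comma-free if, for any two codons x and y in the set, neither of the two frame-shifted trinucleotides inside the concatenation xy belongs to the set.

```python
from itertools import product


def is_comma_free(codons):
    """Return True if the trinucleotide set is comma-free.

    For every ordered pair (x, y) of codons in the set, the concatenation xy
    contains two out-of-frame trinucleotides (offsets 1 and 2); neither may
    itself be a member of the set.
    """
    code = {c.upper() for c in codons}
    if any(len(c) != 3 for c in code):
        raise ValueError("all codons must have length 3")
    for x, y in product(code, repeat=2):
        pair = x + y
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


# Illustrative sets chosen for this sketch, not taken from the paper.
print(is_comma_free({"ACG", "TCG"}))   # no shifted reading lands in the set
print(is_comma_free({"ACG", "CGA"}))   # "ACGACG" contains "CGA" at offset 1
```

The check is quadratic in the size of the set, which is negligible for codon sets since there are at most 64 trinucleotides.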
Coupled transcription and processing of mouse ribosomal RNA in a cell-free system.
Mishima, Y; Mitsuma, T; Ogata, K
1985-01-01
An in vitro processing system of mouse rRNA was achieved using an RNA polymerase I-specific transcription system (S100) and recombinant plasmids consisting of mouse rRNA gene (rDNA) segments containing the transcription initiation and 5'-terminal region of 18S (or 41S) rRNA. Pulse-chase experiments showed that a specific processing occurred with transcripts of the plasmid DNAs when the direction of transcription was the correct orientation relative to the 18S rRNA coding sequence, but not with transcripts of the DNA templates in which this coding sequence was in the opposite orientation. From the S1 nuclease protection analyses, we concluded that there are several steps of endonucleolytic cleavage including one 105 nucleotides upstream from the 5' end of 18S rRNA. Intermediates cleaved at this site were identified in in vivo processing of rRNA. This result indicates that endonucleolytic cleavage takes place 105 nucleotides upstream from the 5' terminus of 18S rRNA prior to the formation of mature 18S rRNA. Trimming or cleavage of the 105 nucleotides may be involved in the formation of the 5' terminus of mature 18S rRNA. PMID:3004977
Human somatostatin I: sequence of the cDNA.
Shen, L P; Pictet, R L; Rutter, W J
1982-01-01
RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875
Investigations with methanobacteria and with evolution of the genetic code
NASA Technical Reports Server (NTRS)
Jukes, T. H.
1986-01-01
Mycoplasma capricolum was found by Osawa et al. to use UGA as a codon for tryptophan and to contain 75% A + T in its DNA. This change could have been from evolutionary pressure to replace C + G by A + T. Numerous studies have been reported of evolution of proteins as measured by amino acid replacements that are observed when homologous proteins, such as hemoglobins from various vertebrates, are compared. These replacements result from nucleotide substitutions in amino acid codons in the corresponding genes. Simultaneously, silent nucleotide substitutions take place that can be studied when sequences of the genes are compared. These silent evolutionary changes take place mostly in third positions of codons. Two types of nucleotide substitutions are recognized: pyrimidine-pyrimidine and purine-purine interchanges (transitions) and pyrimidine-purine interchanges (transversions). Silent transitions are favored when a corresponding transversion would produce an amino acid replacement. Conversely, silent transversions are favored by probability when transitions and transversions will both be silent. Extensive examples of these situations have been found in protein genes, and it is evident that transversions in silent positions predominate in family boxes in most of the examples studied. In associated research a streptomycete from cow manure was found to produce an extracellular enzyme capable of lysing the pseudomurein-containing methanogen Methanobacterium formicicum.
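The transition/transversion distinction defined in this abstract lends itself to a small Python sketch. The function names `classify_substitution` and `count_substitutions` and the example sequences are hypothetical, introduced only for illustration. U is treated as T so that RNA and DNA spellings compare equally.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_substitution(ref, alt):
    """Classify a single-base change as a 'transition' or a 'transversion'.

    Transition: purine<->purine or pyrimidine<->pyrimidine.
    Transversion: purine<->pyrimidine.
    """
    ref, alt = (b.upper().replace("U", "T") for b in (ref, alt))
    for base in (ref, alt):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a nucleotide: {base!r}")
    if ref == alt:
        raise ValueError("bases are identical; no substitution")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq_a, seq_b):
    """Count transitions and transversions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a, seq_b):
        if a.upper().replace("U", "T") != b.upper().replace("U", "T"):
            counts[classify_substitution(a, b)] += 1
    return counts


# Invented aligned fragments: A->G is a transition, G->T is a transversion.
print(count_substitutions("ACGT", "GCTT"))
```

By chance alone there are twice as many possible transversions as transitions for any given base (two versus one), which is the probability argument the abstract alludes to for sites where every change is silent.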
Li, Guohui; Hu, Zhaoyang; Guo, Xuli; Li, Guangtian; Tang, Qi; Wang, Peng; Chen, Keping; Yao, Qin
2013-06-01
Bombyx mori bidensovirus (BmBDV) VD1-ORF4 (open reading frame 4, ORF4) consists of 3,318 nucleotides, which code for a predicted 1,105-amino acid protein containing a conserved DNA polymerase motif. However, its functions in viral propagation remain unknown. In the current study, the transcription of VD1-ORF4 was examined from 6 to 96 h postinfection (p.i.) by RT-PCR. 5'-RACE revealed the transcription initiation site of BmBDV ORF4 to be 16 nucleotides upstream of the start codon (position -16), and 3'-RACE revealed the transcription termination site of VD1-ORF4 to be 7 nucleotides downstream of the termination codon (position +7). Three different proteins were detected in extracts of BmBDV-infected silkworm midguts by Western blot using antibodies raised against the deduced VD1-ORF4 amino acid sequence, and a specific protein band of about 53 kDa was further detected in purified virions using the same antibodies. Taken together, BmBDV VD1-ORF4 codes for three or more proteins during the viral life cycle, one of which is a 53 kDa protein confirmed to be a component of the BmBDV virion.
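The relationship between the two lengths quoted in this abstract is simple codon arithmetic, sketched here in Python. The helper name `protein_length` is hypothetical, and the sketch assumes the 3,318-nucleotide figure includes the stop codon, which is consistent with the reported numbers but is not stated in the abstract.

```python
def protein_length(orf_nucleotides, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given nucleotide length.

    Each codon is 3 nucleotides; the stop codon, if counted in the ORF length,
    does not encode an amino acid.
    """
    if orf_nucleotides <= 0 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nucleotides // 3
    return codons - 1 if includes_stop else codons


# 3,318 nt -> 1,106 codons -> 1,105 amino acids plus one stop codon.
print(protein_length(3318))
```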
[Relevance of long non-coding RNAs in tumour biology].
Nagy, Zoltán; Szabó, Diána Rita; Zsippai, Adrienn; Falus, András; Rácz, Károly; Igaz, Péter
2012-09-23
The discovery of the biological relevance of non-coding RNA molecules represents one of the most significant advances in contemporary molecular biology. It has turned out that a major fraction of the non-coding part of the genome is transcribed. Besides small RNAs (including microRNAs), more and more data are emerging concerning long non-coding RNAs of 200 nucleotides to 100 kb in length that are implicated in the regulation of several basic molecular processes (cell proliferation, chromatin functioning, microRNA-mediated effects, etc.). Some of these long non-coding RNAs have been associated with human tumours, including H19, HOTAIR, MALAT1, etc., the differential expression of which has been noted in various neoplasms relative to healthy tissues. Long non-coding RNAs may represent novel markers of molecular diagnostics and they might even turn out to be targets of therapeutic intervention.
Spatio-Temporal Tracking and Phylodynamics of an Urban Dengue 3 Outbreak in São Paulo, Brazil
Mondini, Adriano; Bronzoni, Roberta Vieira de Moraes; Nunes, Silvia Helena Pereira; Chiaravalloti Neto, Francisco; Massad, Eduardo; Alonso, Wladimir J.; Lázzaro, Eduardo S. M.; Ferraz, Amena Alcântara; de Andrade Zanotto, Paolo Marinho; Nogueira, Maurício Lacerda
2009-01-01
The dengue virus has a single-stranded positive-sense RNA genome of ∼10,700 nucleotides with a single open reading frame that encodes three structural (C, prM, and E) and seven nonstructural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) proteins. It possesses four antigenically distinct serotypes (DENV 1–4). Many phylogenetic studies address particularities of the different serotypes using convenience samples that are not conducive to a spatio-temporal analysis in a single urban setting. We describe the pattern of spread of distinct lineages of DENV-3 circulating in São José do Rio Preto, Brazil, during 2006. Blood samples from patients presenting dengue-like symptoms were collected for DENV testing. We performed M-N-PCR using primers based on NS5 for virus detection and identification. The fragments were purified from PCR mixtures and sequenced. The positive dengue cases were geo-coded. To type the sequenced samples, 52 reference sequences were aligned. The dataset generated was used for iterative phylogenetic reconstruction with the maximum likelihood criterion. The best demographic model, the rate of growth, rate of evolutionary change, and Time to Most Recent Common Ancestor (TMRCA) were estimated. The basic reproductive rate during the epidemics was estimated. We obtained sequences from 82 patients among 174 blood samples. We were able to geo-code 46 sequences. The alignment generated a 399-nucleotide-long dataset with 134 taxa. The phylogenetic analysis indicated that all samples were of DENV-3 and related to strains circulating on the isle of Martinique in 2000–2001. Sixty DENV-3 from São José do Rio Preto formed a monophyletic group (lineage 1), closely related to the remaining 22 isolates (lineage 2). We assumed that these lineages appeared before 2006 on different occasions. By transforming the inferred exponential growth rates into the basic reproductive rate, we obtained values for lineage 1 of R0 = 1.53 and values for lineage 2 of R0 = 1.13.
Under the exponential model, TMRCA of lineage 1 dated 1 year and lineage 2 dated 3.4 years before the last sampling. The possibility of inferring the spatio-temporal dynamics from genetic data has been generally little explored, and it may shed light on DENV circulation. The use of both geographic and temporally structured phylogenetic data provided a detailed view on the spread of at least two dengue viral strains in a populated urban area. PMID:19478848
LaPolla, R J; Mayne, K M; Davidson, N
1984-01-01
A mouse cDNA clone has been isolated that contains the complete coding region of a protein highly homologous to the delta subunit of the Torpedo acetylcholine receptor (AcChoR). The cDNA library was constructed in the vector lambda 10 from membrane-associated poly(A)+ RNA from BC3H-1 mouse cells. Surprisingly, the delta clone was selected by hybridization with cDNA encoding the gamma subunit of the Torpedo AcChoR. The nucleotide sequence of the mouse cDNA clone contains an open reading frame of 520 amino acids. This amino acid sequence exhibits 59% and 50% sequence homology to the Torpedo AcChoR delta and gamma subunits, respectively. However, the mouse nucleotide sequence has several stretches of high homology with the Torpedo gamma subunit cDNA, but not with delta. The mouse protein has the same general structural features as do the Torpedo subunits. It is encoded by a 3.3-kilobase mRNA. There is probably only one, but at most two, chromosomal genes coding for this or closely related sequences. PMID:6096870
Information Entropy Analysis of the H1N1 Genetic Code
NASA Astrophysics Data System (ADS)
Martwick, Andy
2010-03-01
During the current H1N1 pandemic, viral samples are being obtained from large numbers of infected people world-wide and are being sequenced on the NCBI Influenza Virus Resource Database. The information entropy of the sequences was computed from the probability of occurrence of each nucleotide base at every position of each set of sequences using Shannon's definition of information entropy, $H = \sum_b p_b \log_2(1/p_b)$, where $H$ is the observed information entropy at each nucleotide position and $p_b$ is the probability of occurrence of base $b$ among the nucleotides A, C, G, U. Information entropy of the current H1N1 pandemic is compared to reference human and swine H1N1 entropy. As expected, the current H1N1 entropy is in a low entropy state and has a very large mutation potential. Using the entropy method in mature genes we can identify low entropy regions of nucleotides that generally correlate to critical protein function.
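The per-position Shannon entropy described in this abstract can be sketched in Python as follows. The function names `column_entropy` and `positional_entropy` and the four toy sequences are hypothetical, invented for illustration. The sketch assumes the sequences are already aligned to equal length and estimates each probability as the observed base frequency in a column.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column: sum_b p_b * log2(1 / p_b)."""
    counts = Counter(column)
    total = sum(counts.values())
    return sum((n / total) * log2(total / n) for n in counts.values())


def positional_entropy(sequences):
    """Entropy at every position of a set of aligned, equal-length sequences."""
    lengths = {len(s) for s in sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    return [column_entropy(col) for col in zip(*sequences)]


# Toy alignment (invented): position 0 is fully conserved, position 2 is a 50/50 split.
aligned = ["ACGU", "ACGA", "ACCA", "AUCA"]
for pos, h in enumerate(positional_entropy(aligned)):
    print(f"position {pos}: H = {h:.3f} bits")
```

With four possible bases the entropy at a position ranges from 0 bits (fully conserved) to 2 bits (all four bases equally frequent).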
Simultaneous determination of nucleotide sugars with ion-pair reversed-phase HPLC.
Nakajima, Kazuki; Kitazume, Shinobu; Angata, Takashi; Fujinawa, Reiko; Ohtsubo, Kazuaki; Miyoshi, Eiji; Taniguchi, Naoyuki
2010-07-01
Nucleotide sugars are important in determining cell surface glycoprotein glycosylation, which can modulate cellular properties such as growth and arrest. We have developed a conventional HPLC method for simultaneous determination of nucleotide sugars. A mixture of nucleotide sugars (CMP-NeuAc, UDP-Gal, UDP-Glc, UDP-GalNAc, UDP-GlcNAc, GDP-Man, GDP-Fuc and UDP-GlcUA) and relevant nucleotides was perfectly separated in an optimized ion-pair reversed-phase mode using Inertsil ODS-4 and ODS-3 columns. The newly developed method enabled us to determine the nucleotide sugars in cellular extracts from 1 × 10⁶ cells in a single run. We applied this method to characterize nucleotide sugar levels in breast and pancreatic cancer cell lines and revealed that the abundances of UDP-GlcNAc, UDP-GalNAc, UDP-GlcUA and GDP-Fuc were cell-type-specific features. To determine the physiological significance of changes in nucleotide sugar levels, we analyzed their changes under glucose deprivation and found that the determination of nucleotide sugar levels provided us with valuable information for obtaining an overview of cellular glycosylation status.
Generalization of Associations of Kidney-Related Genetic Loci to American Indians
Haack, Karin; Almasy, Laura; Laston, Sandra; Lee, Elisa T.; Best, Lyle G.; Fabsitz, Richard R.; MacCluer, Jean W.; Howard, Barbara V.; Umans, Jason G.; Cole, Shelley A.
2014-01-01
Summary. Background and objectives: CKD disproportionally affects American Indians, who, similar to other populations, show genetic susceptibility to kidney outcomes. Recent studies have identified several loci associated with kidney traits, but their relevance in American Indians is unknown. Design, setting, participants, & measurements: This study used data from a large, family-based genetic study of American Indians (the Strong Heart Family Study), which includes 94 multigenerational families enrolled from communities located in Oklahoma, the Dakotas, and Arizona. Individuals were recruited from the Strong Heart Study, a population-based study of cardiovascular disease in American Indians. This study selected 25 single nucleotide polymorphisms in 23 loci identified from recently published kidney-related genome-wide association studies in individuals of European ancestry to evaluate their associations with kidney function (estimated GFR; individuals 18 years or older, up to 3282 individuals) and albuminuria (urinary albumin to creatinine ratio; n=3552) in the Strong Heart Family Study. This study also examined the association of single nucleotide polymorphisms in the APOL1 region with estimated GFR in 1121 Strong Heart Family Study participants. GFR was estimated using the abbreviated Modification of Diet in Renal Disease Equation. Additive genetic models adjusted for age and sex were used. Results: This study identified significant associations of single nucleotide polymorphisms with estimated GFR in or nearby PRKAG2, SLC6A13, UBE2Q2, PIP5K1B, and WDR72 (P<2.1 × 10⁻³ to account for multiple testing). Single nucleotide polymorphisms in these loci explained 2.2% of the estimated GFR total variance and 2.9% of its heritability. An intronic variant of BCAS3 was significantly associated with urinary albumin to creatinine ratio.
APOL1 single nucleotide polymorphisms were not associated with estimated GFR in a single variant test or haplotype analyses, and the at-risk variants identified in individuals with African ancestry were not detected in DNA sequencing of American Indians. Conclusion: This study extends the genetic associations of loci affecting kidney function to American Indians, a population at high risk of kidney disease, and provides additional support for a potential biologic relevance of these loci across ancestries. PMID:24311711
Development of a single nucleotide polymorphism barcode to genotype Plasmodium vivax infections.
Baniecki, Mary Lynn; Faust, Aubrey L; Schaffner, Stephen F; Park, Daniel J; Galinsky, Kevin; Daniels, Rachel F; Hamilton, Elizabeth; Ferreira, Marcelo U; Karunaweera, Nadira D; Serre, David; Zimmerman, Peter A; Sá, Juliana M; Wellems, Thomas E; Musset, Lise; Legrand, Eric; Melnikov, Alexandre; Neafsey, Daniel E; Volkman, Sarah K; Wirth, Dyann F; Sabeti, Pardis C
2015-03-01
Plasmodium vivax, one of the five species of Plasmodium parasites that cause human malaria, is responsible for 25-40% of malaria cases worldwide. Malaria global elimination efforts will benefit from accurate and effective genotyping tools that will provide insight into the population genetics and diversity of this parasite. The recent sequencing of P. vivax isolates from South America, Africa, and Asia presents a new opportunity by uncovering thousands of novel single nucleotide polymorphisms (SNPs). Genotyping a selection of these SNPs provides a robust, low-cost method of identifying parasite infections through their unique genetic signature or barcode. Based on our experience in generating a SNP barcode for P. falciparum using High Resolution Melting (HRM), we have developed a similar tool for P. vivax. We selected globally polymorphic SNPs from available P. vivax genome sequence data that were located in putatively selectively neutral sites (i.e., intergenic, intronic, or 4-fold degenerate coding). From these candidate SNPs we defined a barcode consisting of 42 SNPs. We analyzed the performance of the 42-SNP barcode on 87 P. vivax clinical samples from parasite populations in South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We found that the P. vivax barcode is robust, as it requires only a small quantity of DNA (limit of detection 0.3 ng/μl) to yield reproducible genotype calls, and detects polymorphic genotypes with high sensitivity. The markers are informative across all clinical samples evaluated (average minor allele frequency > 0.1). Population genetic and statistical analyses show the barcode captures high degrees of population diversity and differentiates geographically distinct populations. Our 42-SNP barcode provides a robust, informative, and standardized genetic marker set that accurately identifies a genomic signature for P. vivax infections.
Development of a Single Nucleotide Polymorphism Barcode to Genotype Plasmodium vivax Infections
Baniecki, Mary Lynn; Faust, Aubrey L.; Schaffner, Stephen F.; Park, Daniel J.; Galinsky, Kevin; Daniels, Rachel F.; Hamilton, Elizabeth; Ferreira, Marcelo U.; Karunaweera, Nadira D.; Serre, David; Zimmerman, Peter A.; Sá, Juliana M.; Wellems, Thomas E.; Musset, Lise; Legrand, Eric; Melnikov, Alexandre; Neafsey, Daniel E.; Volkman, Sarah K.; Wirth, Dyann F.; Sabeti, Pardis C.
2015-01-01
Plasmodium vivax, one of the five species of Plasmodium parasites that cause human malaria, is responsible for 25–40% of malaria cases worldwide. Malaria global elimination efforts will benefit from accurate and effective genotyping tools that will provide insight into the population genetics and diversity of this parasite. The recent sequencing of P. vivax isolates from South America, Africa, and Asia presents a new opportunity by uncovering thousands of novel single nucleotide polymorphisms (SNPs). Genotyping a selection of these SNPs provides a robust, low-cost method of identifying parasite infections through their unique genetic signature or barcode. Based on our experience in generating a SNP barcode for P. falciparum using High Resolution Melting (HRM), we have developed a similar tool for P. vivax. We selected globally polymorphic SNPs from available P. vivax genome sequence data that were located in putatively selectively neutral sites (i.e., intergenic, intronic, or 4-fold degenerate coding). From these candidate SNPs we defined a barcode consisting of 42 SNPs. We analyzed the performance of the 42-SNP barcode on 87 P. vivax clinical samples from parasite populations in South America (Brazil, French Guiana), Africa (Ethiopia) and Asia (Sri Lanka). We found that the P. vivax barcode is robust, as it requires only a small quantity of DNA (limit of detection 0.3 ng/μl) to yield reproducible genotype calls, and detects polymorphic genotypes with high sensitivity. The markers are informative across all clinical samples evaluated (average minor allele frequency > 0.1). Population genetic and statistical analyses show the barcode captures high degrees of population diversity and differentiates geographically distinct populations. Our 42-SNP barcode provides a robust, informative, and standardized genetic marker set that accurately identifies a genomic signature for P. vivax infections. PMID:25781890
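The minor allele frequency (MAF) criterion mentioned in this abstract can be sketched in Python. The function name `minor_allele_frequency` and the example calls are hypothetical. The sketch assumes one allele call per sample at a biallelic SNP, with `None` marking a failed call. It is not the authors' analysis pipeline.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Frequency of the less common allele at a biallelic SNP.

    `calls` holds one allele per sample; None marks a missing call and is ignored.
    """
    counts = Counter(c for c in calls if c is not None)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no successful genotype calls")
    if len(counts) > 2:
        raise ValueError("expected a biallelic SNP")
    if len(counts) == 1:
        return 0.0  # monomorphic in this sample set
    return min(counts.values()) / total


# Invented calls for one SNP across five samples (one failed call).
print(minor_allele_frequency(["A", "A", "G", "A", None]))
```

Averaging this value over all SNPs in a barcode gives the kind of summary figure the abstract reports (average MAF > 0.1).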
Pombar-Gomez, Maria; Lopez-Lopez, Elixabet; Martin-Guerrero, Idoia; Garcia-Orad Carles, Africa; de Pancorbo, Marian M
2015-05-01
Single nucleotide polymorphisms (SNPs) are an interesting option to facilitate the analysis of highly degraded DNA by allowing the reduction of the size of the DNA amplicons. The SNPforID 52-plex panel is a clear example of the use of non-coding SNPs in forensic genetics. However, nonstop advances in studies of genetic polymorphisms are leading to the discovery of new associations between SNPs and diseases. The aim of this study was to perform a comprehensive review of the state of association between the 52 SNPs in the 52-plex panel and diseases or other traits related to their treatment, such as drug response characters. In order to achieve this goal, we have conducted a bioinformatic search for each SNP included in the panel and the SNPs in linkage disequilibrium (LD) with them in the European population (r² > 0.8). A total of 424 SNPs (52 in the panel and 372 in LD) were investigated in PubMed, Scopus, and dbSNP databases. Our results show that three SNPs in the SNPforID 52-plex panel (rs2107612, rs1979255, rs1463729) have been associated with diseases such as hypertension or macular degeneration, as well as drug response. Similarly, three out of the 372 SNPs in LD (rs2107614, r² = 0.859; rs765250, r² = 0.858; rs11064560, r² = 0.887) are also associated with various pathologies. In view of these results, we propose the need for a periodic review of the SNPs used in forensic genetics in order to keep their associations with diseases or related phenotypes updated and to evaluate their continuity in forensic panels for avoiding legal and ethical conflicts.
Regulatory single nucleotide polymorphisms (rSNPs) at the promoters 1A and 1B of the human APC gene.
Matveeva, Marina Yu; Kashina, Elena V; Reshetnikov, Vasily V; Bryzgalov, Leonid O; Antontseva, Elena V; Bondar, Natalia P; Merkulova, Tatiana I
2016-12-22
Germline mutations in the coding sequence of the tumour suppressor APC gene give rise to familial adenomatous polyposis (which leads to colorectal cancer) and are associated with many other oncopathologies. The loss of APC function because of deletion of putative promoter 1A or 1B also results in the development of colorectal cancer. Since the regions of promoters 1A and 1B contain many single nucleotide polymorphisms (SNPs), the aim of this study was to perform functional analysis of some of these SNPs by means of an electrophoretic mobility shift assay (EMSA) and a luciferase reporter assay. First, it was shown that both putative promoters of APC (1A and 1B) drive transcription in an in vitro reporter experiment. From eleven randomly selected SNPs of promoter 1A and four SNPs of promoter 1B, nine and two respectively showed differential patterns of binding of nuclear proteins to oligonucleotide probes corresponding to alternative alleles. The luciferase reporter assay showed that among the six SNPs tested, the rs75612255 C allele and rs113017087 C allele in promoter 1A as well as the rs138386816 T allele and rs115658307 T allele in promoter 1B significantly increased luciferase activity in the human erythromyeloblastoid leukaemia cell line K562. In human colorectal cancer HCT-116 cells, none of the substitutions under study had any effect, with the exception of minor allele G of rs79896135 in promoter 1B. This allele significantly decreased the luciferase reporter's activity. CONCLUSION: Our results indicate that many SNPs in APC promoters 1A and 1B are functionally relevant and that allele G of rs79896135 may be associated with the predisposition to colorectal cancer.
Germline variant FGFR4 p.G388R exposes a membrane-proximal STAT3 binding site.
Ulaganathan, Vijay K; Sperl, Bianca; Rapp, Ulf R; Ullrich, Axel
2015-12-24
Variant rs351855-G/A is a commonly occurring single-nucleotide polymorphism of coding regions in exon 9 of the fibroblast growth factor receptor FGFR4 (CD334) gene (c.1162G>A). It results in an amino-acid change at codon 388 from glycine to arginine (p.Gly388Arg) in the transmembrane domain of the receptor. Despite compelling genetic evidence for the association of this common variant with cancers of the bone, breast, colon, prostate, skin, lung, head and neck, as well as soft-tissue sarcomas and non-Hodgkin lymphoma, the underlying biological mechanism has remained elusive. Here we show that substitution of the conserved glycine 388 residue to a charged arginine residue alters the transmembrane spanning segment and exposes a membrane-proximal cytoplasmic signal transducer and activator of transcription 3 (STAT3) binding site Y(390)-(P)XXQ(393). We demonstrate that such membrane-proximal STAT3 binding motifs in the germline of type I membrane receptors enhance STAT3 tyrosine phosphorylation by recruiting STAT3 proteins to the inner cell membrane. Remarkably, such germline variants frequently co-localize with somatic mutations in the Catalogue of Somatic Mutations in Cancer (COSMIC) database. Using Fgfr4 single nucleotide polymorphism knock-in mice and transgenic mouse models for breast and lung cancers, we validate the enhanced STAT3 signalling induced by the FGFR4 Arg388-variant in vivo. Thus, our findings elucidate the molecular mechanism behind the genetic association of rs351855 with accelerated cancer progression and suggest that germline variants of cell-surface molecules that recruit STAT3 to the inner cell membrane are a significant risk for cancer prognosis and disease progression.
Chen, Shanyuan; Gomes, Rui; Costa, Vânia; Santos, Pedro; Charneca, Rui; Zhang, Ya-ping; Liu, Xue-hong; Wang, Shao-qing; Bento, Pedro; Nunes, Jose-Luis; Buzgó, József; Varga, Gyula; Anton, István; Zsolnai, Attila; Beja-Pereira, Albano
2013-10-01
The coexistence of wild boars and domestic pigs across Eurasia makes it feasible to conduct comparative genetic or genomic analyses for addressing how genetically different a domestic species is from its wild ancestor. To test whether there are differences in patterns of genetic variability between wild and domestic pigs at immunity-related genes and to detect outlier loci putatively under selection that may underlie differences in immune responses, here we analyzed 54 single-nucleotide polymorphisms (SNPs) of 19 immunity-related candidate genes on 11 autosomes in three pairs of wild boar and domestic pig populations from China, Iberian Peninsula, and Hungary. Our results showed no statistically significant differences in allele frequency and heterozygosity across SNPs between three pairs of wild and domestic populations. This observation was more likely due to the widespread and long-lasting gene flow between wild boars and domestic pigs across Eurasia. In addition, we detected eight coding SNPs from six genes as outliers being under selection consistently by three outlier tests (BayeScan2.1, FDIST2, and Arlequin3.5). Among four non-synonymous outlier SNPs, one from TLR4 gene was identified as being subject to positive (diversifying) selection and three each from CD36, IFNW1, and IL1B genes were suggested as under balancing selection. All of these four non-synonymous variants were predicted as being benign by PolyPhen-2. Our results were supported by other independent lines of evidence for positive selection or balancing selection acting on these four immune genes (CD36, IFNW1, IL1B, and TLR4). Our study showed an example applying a candidate gene approach to identify functionally important mutations (i.e., outlier loci) in wild and domestic pigs for subsequent functional experiments.
Drew, Richard John; Walsh, Anne; Laoi, Bairbre Ni; Crowley, Brendan
2012-07-01
BK polyomavirus (family Polyomaviridae) may cause hemorrhagic cystitis (BKV-HC) in hematopoietic stem cell transplant recipients. Eleven complete BKV genomes (GenBank accession numbers: JN192431-JN192441) were sequenced from urine samples of allogenic hematopoietic stem cell transplant recipients and compared to complete BKV genomes in the published literature. Of the 11 isolates, seven (64%) were subgroup Ib-1, three (27%) isolates belonged to subgroup Ib-2 and a single isolate belonged to subtype III. The analysis of single-nucleotide polymorphisms in this study showed that isolates could be subclassified into subtypes I-IV and subgroups Ib-1 and Ib-2 on the basis of VP1 of the first part of the Large T-antigen (LTag). The non-coding control region (NCCR) of the 11 isolates was also sequenced. These sequences showed that there was consistent sequence homology within subgroups Ib-1 and Ib-2. Two new mutations were described in the isolates, G→C at O(84) in isolate SJH-LG-310, and a deletion at R(2-7) in isolate SJH-LG-309. No known transcription factor is thought to be present at the site of either of these mutations. There were no rearrangements seen in isolates and this may be because the patients were not followed up over time. There were five nucleotide positions at which subgroup Ib-1 isolates differed from subgroup Ib-2 isolates in the NCCR sequence: O(41), P(18), P(31), R(4), and S(18). The mutation O(41) is present in the promoter (granulocyte/macrophage stimulating factor) gene and the P(31) mutation is present in the NF-1 gene. Copyright © 2012 Wiley Periodicals, Inc.
Mutations in SLC2A2 Gene Reveal hGLUT2 Function in Pancreatic β Cell Development*
Michau, Aurélien; Guillemain, Ghislaine; Grosfeld, Alexandra; Vuillaumier-Barrot, Sandrine; Grand, Teddy; Keck, Mathilde; L'Hoste, Sébastien; Chateau, Danielle; Serradas, Patricia; Teulon, Jacques; De Lonlay, Pascale; Scharfmann, Raphaël; Brot-Laroche, Edith; Leturque, Armelle; Le Gall, Maude
2013-01-01
The structure-function relationships of sugar transporter-receptor hGLUT2 coded by SLC2A2 and their impact on insulin secretion and β cell differentiation were investigated through the detailed characterization of a panel of mutations along the protein. We studied naturally occurring SLC2A2 variants or mutants: two single-nucleotide polymorphisms and four proposed inactivating mutations associated to Fanconi-Bickel syndrome. We also engineered mutations based on sequence alignment and conserved amino acids in selected domains. The single-nucleotide polymorphisms P68L and T110I did not impact on sugar transport as assayed in Xenopus oocytes. All the Fanconi-Bickel syndrome-associated mutations invalidated glucose transport by hGLUT2 either through absence of protein at the plasma membrane (G20D and S242R) or through loss of transport capacity despite membrane targeting (P417L and W444R), pointing out crucial amino acids for hGLUT2 transport function. In contrast, engineered mutants were located at the plasma membrane and able to transport sugar, albeit with modified kinetic parameters. Notably, these mutations resulted in gain of function. G20S and L368P mutations increased insulin secretion in the absence of glucose. In addition, these mutants increased insulin-positive cell differentiation when expressed in cultured rat embryonic pancreas. F295Y mutation induced β cell differentiation even in the absence of glucose, suggesting that mutated GLUT2, as a sugar receptor, triggers a signaling pathway independently of glucose transport and metabolism. Our results describe the first gain of function mutations for hGLUT2, revealing the importance of its receptor versus transporter function in pancreatic β cell development and insulin secretion. PMID:23986439
Khan, Imran; Ansari, Irfan A; Singh, Pratichi; Dass J, Febin Prabhu
2017-09-01
The phosphatase and tensin homolog (PTEN) gene plays a crucial role in signal transduction by negatively regulating the PI3K signaling pathway. It is the most frequently mutated gene in many human cancers. Considering its critical role, a functional analysis of missense mutations of the PTEN gene was undertaken in this study. Thirty-five nonsynonymous single nucleotide polymorphisms (nsSNPs) within the coding region of the PTEN gene were selected for our in silico investigation, and five nsSNPs (G129E, C124R, D252G, H61D, and R130G) were found to be deleterious based on combinatorial predictions of different computational tools. Moreover, molecular dynamics (MD) simulation was performed to investigate the conformational variation between the native protein and all five mutant PTEN proteins carrying predicted deleterious nsSNPs. The results of MD simulation of all mutant models illustrated variation in structural attributes such as root-mean-square deviation, root-mean-square fluctuation, radius of gyration, and total energy, which reflect the structural stability of the PTEN protein. Furthermore, mutant PTEN protein structures also showed a significant variation in the solvent accessible surface area and hydrogen bond frequencies from the native PTEN structure. In conclusion, the results of this study have established the deleterious effect of all five predicted nsSNPs on the PTEN protein structure. Thus, the results of the current study provide a platform for prioritizing nsSNPs whose phenotype and correlation with disease status can then be confirmed in case-control studies. © 2016 International Union of Biochemistry and Molecular Biology, Inc.
Chiavegatto, Silvana; Sauce, Bruno; Ambar, Guilherme; Cheverud, James M; Peripato, Andrea C
2012-01-01
Maternal care is essential in mammals, and variations in the environment provided by mothers may directly influence the viability of newborns and emotional behavior later in life. A previous study investigated genetic variations associated with maternal care in an intercross of LG/J and SM/J inbred mouse strains and identified two single-locus QTLs (quantitative trait loci). Here, we selected three candidate genes located within these QTL intervals (Oxt on chromosome 2, and FosB and Peg3 on chromosome 7) and tested their association with maternal care. LG/J females showed impaired postpartum nest building and pup retrieval, a one-day delay in milk ejection, reduced exploratory activity, and higher anxiety-like behavior when compared to SM/J females. The nucleotide sequences of Oxt and FosB were similar between strains, as were their hypothalamic expression levels. Conversely, Peg3 nucleotide sequences showed four nonsynonymous substitutions in LG/J dams, T11062G, G13744A, A13808G, and G13813A, and a 30-base-pair (10 aa) tandem repeat in the coding region with three copies in SM/J and five copies in LG/J. Maternal-care-impaired LG/J mothers expressed 37% lower Peg3 mRNA levels in the hypothalamus on the second postpartum day. We also found an association between the Peg3 repeat variant and poor maternal care in F2 heterozygote females derived from a LG/J × SM/J intercross. These results suggest that the maternally imprinted Peg3 gene may be responsible for the single-locus QTL on chromosome 7 that has been shown to influence maternal care in these strains. Furthermore, these data provide additional support for an epigenetic regulation of maternal behavior. PMID:22950040
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chain, Patrick S; Hu, Ping; Malfatti, Stephanie
2006-01-01
Yersinia pestis, the causative agent of bubonic and pneumonic plagues, has undergone detailed study at the molecular level. To further investigate the genomic diversity among this group and to help characterize lineages of the plague organism that have no sequenced members, we present here the genomes of two isolates of the "classical" antiqua biovar, strains Antiqua and Nepal516. The genomes of Antiqua and Nepal516 are 4.7 Mb and 4.5 Mb and encode 4,138 and 3,956 open reading frames, respectively. Though both strains belong to one of the three classical biovars, they represent separate lineages defined by recent phylogenetic studies. We compare all five currently sequenced Y. pestis genomes and the corresponding features in Yersinia pseudotuberculosis. There are strain-specific rearrangements, insertions, deletions, single nucleotide polymorphisms, and a unique distribution of insertion sequences. We found 453 single nucleotide polymorphisms in protein-coding regions, which were used to assess the evolutionary relationships of these Y. pestis strains. Gene reduction analysis revealed that the gene deletion processes are under selective pressure, and many of the inactivations are probably related to the organism's interaction with its host environment. The results presented here clearly demonstrate the differences between the two biovar antiqua lineages and support the notion that grouping Y. pestis strains based strictly on the classical definition of biovars (predicated upon two biochemical assays) does not accurately reflect the phylogenetic relationships within this species. A comparison of four virulent Y. pestis strains with the human-avirulent strain 91001 provides further insight into the genetic basis of virulence to humans.
Ginosar, Y; Davidson, E M; Meroz, Y; Blotnick, S; Shacham, M; Caraco, Y
2009-09-01
There are diverse reports concerning the single-nucleotide polymorphism (SNP) A118G in the gene coding for the mu-opioid receptor. This study assessed pharmacokinetic-pharmacodynamic relationships in patients with acute pain (water-immersed extracorporeal shock wave lithotripsy). Ninety-nine patients (ASA I-II, age 18-70) were assessed in this prospective observational study. Blinding was achieved by determining genotype only after the procedure. I.V. alfentanil was administered by patient-controlled administration (loading dose, 10 microg kg(-1); continuous infusion, 20 microg kg(-1) h(-1); bolus, 3 microg kg(-1); lockout time, 1 min); no other analgesic or sedating medication was used. The allelic frequency was 15.2% in our population. The G118 SNP (AG/GG) was associated with a 27% increase in plasma alfentanil concentration (P=0.034), a 54% increase in alfentanil dose (P=0.009), a 47% increase in dose per kg body weight (P=0.004), a 55% increase in dose per kg corrected for stimulus intensity (P=0.002), a 112% increase in the numbers of attempted boluses (P=0.015), a 79% increase in the numbers of successful boluses (P=0.013), and a 153% increase in the numbers of failed boluses (P=0.042). Despite the increased alfentanil self-administration, the G118 SNP was associated with a 52% increase in verbal analogue pain scores over the same period of time (P=0.047). We demonstrated increased opioid requirement for alfentanil in patients with the G118 SNP, who self-administered a higher dose, achieved higher plasma concentration, and yet complained of more severe pain. This observation suggests that G118 SNP impairs the analgesic response to opioids.
Amirian, E Susan; Scheurer, Michael E; Liu, Yanhong; D'Amelio, Anthony M; Houlston, Richard S; Etzel, Carol J; Shete, Sanjay; Swerdlow, Anthony J; Schoemaker, Minouk J; McKinney, Patricia A; Fleming, Sarah J; Muir, Kenneth R; Lophatananon, Artitaya; Bondy, Melissa L
2011-08-01
Despite extensive research on the topic, glioma etiology remains largely unknown. Exploration of potential interactions between single-nucleotide polymorphisms (SNP) of immune genes is a promising new area of glioma research. The case-only study design is a powerful and efficient design for exploring possible multiplicative interactions between factors that are independent of one another. The purpose of our study was to use this exploratory design to identify potential pairwise SNP-SNP interactions from genes involved in several different immune-related pathways for investigation in future studies. The study population consisted of two case groups: 1,224 histologically confirmed, non-Hispanic white glioma cases from the United States and a validation population of 634 glioma cases from the United Kingdom. Polytomous logistic regression, in which one SNP was coded as the outcome and the other SNP was included as the exposure, was utilized to calculate the ORs of the likelihood of cases simultaneously having the variant alleles of two different SNPs. Potential interactions were examined only between SNPs located in different genes or chromosomes. Using this data mining strategy, we found 396 significant SNP-SNP interactions among polymorphisms of immune-related genes that were present in both the U.S. and U.K. study populations. This exploratory study was conducted for the purpose of hypothesis generation, and thus has provided several new hypotheses that can be tested using traditional case-control study designs to obtain estimates of risk. This is the first study, to our knowledge, to take this novel approach to identifying SNP-SNP interactions relevant to glioma etiology. ©2011 AACR.
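The case-only logic described in this abstract can be illustrated with a much simpler calculation than the polytomous logistic regression the authors report. The sketch below assumes each SNP is collapsed to carrier versus non-carrier of the variant allele and computes the odds ratio between the two SNPs among cases alone. The function name and all counts are hypothetical and are not taken from the study.

```python
import math

def case_only_odds_ratio(both, a_only, b_only, neither):
    """Return (OR, lower, upper) for the SNP A x SNP B association among cases.

    Under the case-only design, an OR different from 1 among cases is read as
    a departure from multiplicative joint effects, provided the two SNPs are
    independent of one another in the source population.
    """
    odds_ratio = (both * neither) / (a_only * b_only)
    # Woolf standard error of log(OR) for a 2x2 table
    se_log_or = math.sqrt(1 / both + 1 / a_only + 1 / b_only + 1 / neither)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts of cases by carrier status (illustrative only)
or_, lo, hi = case_only_odds_ratio(both=40, a_only=60, b_only=50, neither=150)
print(f"case-only OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The independence assumption is the reason the abstract restricts testing to SNPs in different genes or chromosomes: linked SNPs would be correlated in cases regardless of any interaction.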
Martin, Nicolas W; Benyamin, Beben; Hansell, Narelle K; Montgomery, Grant W; Martin, Nicholas G; Wright, Margaret J; Bates, Timothy C
2011-01-01
Breast-fed C-allele carriers of the rs174575 single nucleotide polymorphism in the fatty acid desaturase 2 (FADS2) gene have been reported to show a 6.4 to 7 IQ point advantage over formula-fed C-allele carriers, with no effect of breast-feeding in GG carriers. An Australian sample was examined to determine if an interaction between breast-feeding and the rs174575 single nucleotide polymorphism had any effect on IQ. This hypothesis was tested in more than 700 families of adolescent twins assessed for IQ and breast-feeding, birth weight, and FADS2 polymorphisms, and parental socioeconomic status and education, and maternal FADS2 status. No significant evidence for a moderating effect on IQ of rs174575 C-carrier status and breast-feeding was found, and there were no effects of maternal FADS2 status on offspring IQ. In addition, no main effects of any FADS2 polymorphisms on IQ were found when the genotype was kept as two-homozygote and one-heterozygote categories, and indeed there was no evidence for effects of breast-feeding on IQ scores after controlling for parental socioeconomic status and education. The investigation was extended to two additional FADS2 polymorphisms (rs1535 and rs174583), but again, although these polymorphisms code alleles affecting fatty acid metabolism, no main or interaction effects were found on IQ. These results support the view that apparent effects of breast-feeding on IQ reflect differential likelihood of breast-feeding as a function of parental education and did not support the predicted interaction effect of FADS2 and breast-feeding on IQ. Copyright © 2011 American Academy of Child and Adolescent Psychiatry. Published by Elsevier Inc. All rights reserved.
Bank, Sarah; Sann, Manuela; Mayer, Christoph; Meusemann, Karen; Donath, Alexander; Podsiadlowski, Lars; Kozlov, Alexey; Petersen, Malte; Krogmann, Lars; Meier, Rudolf; Rosa, Paolo; Schmitt, Thomas; Wurdack, Mareike; Liu, Shanlin; Zhou, Xin; Misof, Bernhard; Peters, Ralph S; Niehuis, Oliver
2017-11-01
The wasp family Vespidae comprises more than 5000 described species which represent life history strategies ranging from solitary and presocial to eusocial and socially parasitic. The phylogenetic relationships of the major vespid wasp lineages (i.e., subfamilies and tribes) have been investigated repeatedly by analyzing behavioral and morphological traits as well as nucleotide sequences of few selected genes with largely incongruent results. Here we reconstruct their phylogenetic relationships using a phylogenomic approach. We sequenced the transcriptomes of 24 vespid wasp and eight outgroup species and exploited the transcript sequences for design of probes for enriching 913 single-copy protein-coding genes to complement the transcriptome data with nucleotide sequence data from additional 25 ethanol-preserved vespid species. Results from phylogenetic analyses of the combined sequence data revealed the eusocial subfamily Stenogastrinae to be the sister group of all remaining Vespidae, while the subfamily Eumeninae turned out to be paraphyletic. Of the three currently recognized eumenine tribes, Odynerini is paraphyletic with respect to Eumenini, and Zethini is paraphyletic with respect to Polistinae and Vespinae. Our results are in conflict with the current tribal subdivision of Eumeninae and thus, we suggest granting subfamily rank to the two major clades of "Zethini": Raphiglossinae and Zethinae. Overall, our findings corroborate the hypothesis of two independent origins of eusociality in vespid wasps and suggest a single origin of using masticated and salivated plant material for building nests by Raphiglossinae, Zethinae, Polistinae, and Vespinae. The inferred phylogenetic relationships and the open access vespid wasp target DNA enrichment probes will provide a valuable tool for future comparative studies on species of the family Vespidae, including their genomes, life styles, evolution of sociality, and co-evolution with other organisms. 
Copyright © 2017 Elsevier Inc. All rights reserved.
Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra
2015-04-10
In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there is limited or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need of additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.
Shortt, Katherine; Chaudhary, Suman; Grigoryev, Dmitry; Heruth, Daniel P.; Venkitachalam, Lakshmi; Zhang, Li Q.; Ye, Shui Q.
2014-01-01
Acute respiratory distress syndrome (ARDS) is a lung condition characterized by impaired gas exchange with systemic release of inflammatory mediators, causing pulmonary inflammation, vascular leak and hypoxemia. Existing biomarkers have limited effectiveness as diagnostic and therapeutic targets. To identify disease-associated variants in ARDS patients, whole-exome sequencing was performed on 96 ARDS patients, detecting 1,382,399 SNPs. By comparing these exome data to those of the 1000 Genomes Project, we identified a number of single nucleotide polymorphisms (SNPs) which are potentially associated with ARDS. 50,190 SNPs were found in all case subgroups and controls, of which 89 SNPs were associated with susceptibility. We validated three SNPs (rs78142040, rs9605146 and rs3848719) in additional ARDS patients to substantiate their associations with susceptibility, severity and outcome of ARDS. rs78142040 (C>T) occurs within a histone mark (intron 6) of the Arylsulfatase D gene. rs9605146 (G>A) causes a deleterious coding change (proline to leucine) in the XK, Kell blood group complex subunit-related family, member 3 gene. rs3848719 (G>A) is a synonymous SNP in the Zinc-Finger/Leucine-Zipper Co-Transducer NIF1 gene. rs78142040, rs9605146, and rs3848719 are associated significantly with susceptibility to ARDS. rs3848719 is associated with APACHE II score quartile. rs78142040 is associated with 60-day mortality in the overall ARDS patient population. Exome-seq is a powerful tool to identify potential new biomarkers for ARDS. We selectively validated three SNPs which have not been previously associated with ARDS and represent potential new genetic biomarkers for ARDS. Additional validation in larger patient populations and further exploration of underlying molecular mechanisms are warranted. PMID:25372662
Vongvanrungruang, A; Mongkolsiriwatana, C; Boonkaew, T; Sawatdichaikul, O; Srikulnath, K; Peyachoknagul, S
2016-09-19
The fragrance gene, betaine aldehyde dehydrogenase 2 (Badh2), has been well studied in many plant species. The objectives of this study were to clone Badh2 and compare the sequences between aromatic and non-aromatic coconuts. The complete coding region was cloned from cDNA of both aromatic and non-aromatic coconuts. The nucleotide sequences were highly homologous to Badh2 genes of other plants. Badh2 consisted of a 1512-bp open reading frame encoding 503 amino acids. A single nucleotide difference between aromatic and non-aromatic coconuts resulted in the conversion of alanine (non-aromatic) to proline (aromatic) at position 442, which was the substrate binding site of BADH2. The ring side chain of proline could destabilize the structure leading to a non-functional enzyme. Badh2 genomic DNA was cloned from exon 1 to 4, and from exon 5 to 15 from the two coconut types, except for intron 4 that was very long. The intron sequences of the two coconut groups were highly homologous. No differences in Badh2 expression were found among the tissues of aromatic coconut or between aromatic and non-aromatic coconuts. The amino acid sequences of BADH2 from coconut and other plants were compared and the genetic relationship was analyzed using MEGA 7.0. The phylogenetic tree reconstructed by the Bayesian information criterion consisted of two distinct groups of monocots and dicots. Among the monocots, coconut (Cocos nucifera) and oil palm (Elaeis guineensis) were the most closely related species. A marker for coconut differentiation was developed from one-base substitution site and could be successfully used.
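Two figures in this abstract can be checked with simple arithmetic: the open reading frame length against the protein length, and whether an alanine-to-proline change is reachable by a single nucleotide substitution. The sketch below uses only the standard genetic code. The abstract does not state which codon occurs at position 442 in coconut Badh2, so the code enumerates all alanine codons rather than assuming one.

```python
ORF_LENGTH_BP = 1512                # open reading frame length from the abstract
total_codons = ORF_LENGTH_BP // 3   # includes the terminating stop codon
amino_acids = total_codons - 1      # stop codon encodes no residue

# Standard genetic code (DNA alphabet) for the two residues involved
ALA_CODONS = {"GCT", "GCC", "GCA", "GCG"}
PRO_CODONS = {"CCT", "CCC", "CCA", "CCG"}

def single_base_changes(codon):
    """Yield (position, mutant_codon) for every one-base substitution."""
    for pos in range(3):
        for base in "ACGT":
            if base != codon[pos]:
                yield pos, codon[:pos] + base + codon[pos + 1:]

# Every Ala -> Pro route that needs exactly one substitution
ala_to_pro = sorted(
    (ala, mutant, pos)
    for ala in ALA_CODONS
    for pos, mutant in single_base_changes(ala)
    if mutant in PRO_CODONS
)

print(f"{ORF_LENGTH_BP} bp -> {total_codons} codons -> {amino_acids} amino acids")
for ala, pro, pos in ala_to_pro:
    print(f"{ala} -> {pro} (codon position {pos + 1})")
```

Whatever the actual codon is, an alanine-to-proline change by one substitution has to be a G to C change at the first codon position, which is consistent with the one-base marker site mentioned at the end of the abstract.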
Mutations in the Norrie disease gene.
Schuback, D E; Chen, Z Y; Craig, I W; Breakefield, X O; Sims, K B
1995-01-01
We report our experience to date in mutation identification in the Norrie disease (ND) gene. We carried out mutational analysis in 26 kindreds in an attempt to identify regions presumed critical to protein function and potentially correlated with generation of the disease phenotype. All coding exons, as well as noncoding regions of exons 1 and 2, 636 nucleotides in the noncoding region of exon 3, and 197 nucleotides of 5' flanking sequence, were analyzed for single-strand conformation polymorphisms (SSCP) by polymerase chain reaction (PCR) amplification of genomic DNA. DNA fragments that showed altered SSCP band mobilities were sequenced to locate the specific mutations. In addition to three previously described submicroscopic deletions encompassing the entire ND gene, we have now identified 6 intragenic deletions, 8 missense (seven point mutations, one 9-bp deletion), 6 nonsense (three point mutations, three single bp deletions/frameshift) and one 10-bp insertion, creating an expanded repeat in the 5' noncoding region of exon 1. Thus, mutations have been identified in a total of 24 of 26 (92%) of the kindreds we have studied to date. With the exception of two different mutations, each found in two apparently unrelated kindreds, these mutations are unique and expand the genotype database. Localization of the majority of point mutations at or near cysteine residues, potentially critical in protein tertiary structure, supports a previous protein model for norrin as member of a cystine knot growth factor family (Meitinger et al., 1993). Genotype-phenotype correlations were not evident with the limited clinical data available, except in the cases of larger submicroscopic deletions associated with a more severe neurologic syndrome.(ABSTRACT TRUNCATED AT 250 WORDS)
Yang, Yong; Wu, Zhihong; Zhao, Taimao; Wang, Hai; Zhao, Dong; Zhang, Jianguo; Wang, Yipeng; Ding, Yaozhong; Qiu, Guixing
2009-06-01
The etiology of adolescent idiopathic scoliosis is undetermined despite years of research. A number of hypotheses have been postulated to explain its development, including growth abnormalities. The irregular expression of growth hormone and insulin-like growth factor-1 (IGF-1) may disturb hormone metabolism, result in a gross asymmetry, and promote the progress of adolescent idiopathic scoliosis. Initial association studies in complex diseases have demonstrated the power of candidate gene association. Prior to our study, one study in this field had reported a negative result, and a replication study is needed to establish the reliability of that finding. To determine the relationship of growth hormone receptor and IGF-1 genes with adolescent idiopathic scoliosis, a population-based association study was performed. Single nucleotide polymorphisms with potential function were selected from candidate genes and a distribution analysis was performed. We found insufficient evidence of an association between adolescent idiopathic scoliosis and the single-nucleotide polymorphisms of the growth hormone receptor and IGF-1 genes in Han Chinese.
A single splice site mutation in human-specific ARHGAP11B causes basal progenitor amplification
Florio, Marta; Namba, Takashi; Pääbo, Svante; Hiller, Michael; Huttner, Wieland B.
2016-01-01
The gene ARHGAP11B promotes basal progenitor amplification and is implicated in neocortex expansion. It arose on the human evolutionary lineage by partial duplication of ARHGAP11A, which encodes a Rho guanosine triphosphatase–activating protein (RhoGAP). However, a lack of 55 nucleotides in ARHGAP11B mRNA leads to loss of RhoGAP activity by GAP domain truncation and addition of a human-specific carboxy-terminal amino acid sequence. We show that these 55 nucleotides are deleted by mRNA splicing due to a single C→G substitution that creates a novel splice donor site. We reconstructed an ancestral ARHGAP11B complementary DNA without this substitution. Ancestral ARHGAP11B exhibits RhoGAP activity but has no ability to increase basal progenitors during neocortex development. Hence, a single nucleotide substitution underlies the specific properties of ARHGAP11B that likely contributed to the evolutionary expansion of the human neocortex. PMID:27957544
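The combination of domain truncation and a novel carboxy-terminal sequence reported here is what one would expect when the missing segment is not a multiple of three nucleotides. The sketch below is a hedged arithmetic check only: the helper name is hypothetical, and the abstract itself does not use the word frameshift.

```python
def deletion_effect(deleted_nt: int) -> str:
    """Classify a coding-sequence deletion by its effect on the reading frame."""
    # A deletion preserves the downstream reading frame only when whole codons
    # (multiples of three nucleotides) are removed.
    return "in-frame" if deleted_nt % 3 == 0 else "frameshift"

deleted = 55  # nucleotides missing from ARHGAP11B mRNA, per the abstract
print(f"{deleted} nt deleted: remainder {deleted % 3} -> {deletion_effect(deleted)}")
```

Because 55 leaves a remainder of 1 when divided by 3, every codon downstream of the new splice junction is read in a different frame, which is consistent with both the loss of the GAP domain's end and the appearance of new amino acids.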
Li, Ming; Ohi, Kazutaka; Chen, Chunhui; He, Qinghua; Liu, Jie-Wei; Chen, Chuansheng; Luo, Xiong-Jian; Dong, Qi; Hashimoto, Ryota; Su, Bing
2014-12-01
The hippocampus is a key brain structure for learning ability and memory processes, and hippocampal atrophy is a recognized biological marker of Alzheimer's disease. However, the genetic bases of hippocampal volume are still unclear although it is a heritable trait. Genome-wide association studies (GWASs) on hippocampal volume have implicated several significantly associated genetic variants in Europeans. Here, to test the contributions of these GWAS-identified genetic variants to hippocampal volume in different ethnic populations, we screened the GWAS-identified candidate single-nucleotide polymorphisms in 3 independent healthy Asian brain imaging samples (a total of 990 subjects). The results showed that none of these single-nucleotide polymorphisms were associated with hippocampal volume in either individual or combined Asian samples. The replication results suggested a complexity of genetic architecture for hippocampal volume and potential genetic heterogeneity between different ethnic populations. Copyright © 2014 Elsevier Inc. All rights reserved.
Detecting Single-Nucleotide Substitutions Induced by Genome Editing.
Miyaoka, Yuichiro; Chan, Amanda H; Conklin, Bruce R
2016-08-01
The detection of genome editing is critical in evaluating genome-editing tools or conditions, but it is not an easy task to detect genome-editing events, especially single-nucleotide substitutions, without a surrogate marker. Here we introduce a procedure that significantly contributes to the advancement of genome-editing technologies. It uses droplet digital polymerase chain reaction (ddPCR) and allele-specific hydrolysis probes to detect single-nucleotide substitutions generated by genome editing (via homology-directed repair, or HDR). HDR events that introduce substitutions using donor DNA are generally infrequent, even with genome-editing tools, and the outcome is only a one-base-pair difference in the 3 billion base pairs of the human genome. This task is particularly difficult in induced pluripotent stem (iPS) cells, in which editing events can be very rare. Therefore, the technological advances described here have implications for therapeutic genome editing and experimental approaches to disease modeling with iPS cells. © 2016 Cold Spring Harbor Laboratory Press.
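Droplet digital PCR quantifies rare alleles by counting positive droplets and applying a Poisson correction, since a droplet may contain more than one template copy. The abstract does not describe the calculation, so the sketch below is a generic illustration of that correction and not the authors' protocol. The function names and droplet counts are hypothetical.

```python
import math

def copies_per_droplet(positive: int, total: int) -> float:
    """Mean template copies per droplet from the fraction of positive droplets."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    # Poisson: P(droplet is negative) = exp(-lambda)
    return -math.log(1 - positive / total)

def edited_allele_fraction(edited_pos: int, unedited_pos: int, total: int) -> float:
    """Fraction of alleles carrying the substitution, from two probe channels."""
    edited = copies_per_droplet(edited_pos, total)
    unedited = copies_per_droplet(unedited_pos, total)
    return edited / (edited + unedited)

# Hypothetical run: 15,000 accepted droplets, 30 positive for the edited-allele
# probe and 6,000 positive for the unedited-allele probe
fraction = edited_allele_fraction(edited_pos=30, unedited_pos=6000, total=15000)
print(f"estimated edited-allele fraction: {fraction:.4%}")
```

Treating the two probe channels independently means double-positive droplets are counted in both, which is the usual way to handle droplets that happen to contain one copy of each allele.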
Control of total GFP expression by alterations to the 3′ region nucleotide sequence
2013-01-01
Background Previously, we distinguished the Escherichia coli type II cytoplasmic membrane translocation pathways of Tat, Yid, and Sec for unfolded and folded soluble target proteins. The translocation of folded protein to the periplasm for soluble expression via the Tat pathway was controlled by an N-terminal hydrophilic leader sequence. In this study, we investigated the effect of the hydrophilic C-terminal end and its nucleotide sequence on total and soluble protein expression. Results The native hydrophilic C-terminal end of GFP was obtained by deleting the C-terminal peptide LeuGlu-6×His, derived from pET22b(+). The corresponding clones induced total and soluble GFP expression that was either slightly increased or dramatically reduced, apparently through reconstruction of the nucleotide sequence around the stop codon in the 3′ region. In the expression-induced clones, the hydrophilic C-terminus showed increased Tat pathway specificity for soluble expression. However, in the expression-reduced clone, after analyzing the role of the 5′ poly(A) coding sequence with a substituted synonymous codon, we proved that the longer 5′ poly(A) coding sequence interacted with the reconstructed 3′ region nucleotide sequence to create a new mRNA tertiary structure between the 5′ and 3′ regions, which resulted in reduced total GFP expression. Further, to recover the reduced expression by changing the 3′ nucleotide sequence, after replacing selected C-terminal 5′ codons and the stop codon in the ORF with synonymous codons, total GFP expression in most of the clones was recovered to the undeleted control level. The insertion of trinucleotides after the stop codon in the 3′-UTR recovered or reduced total GFP expression. RT-PCR revealed that the level of total protein expression was controlled by changes in translational or transcriptional regulation, which were induced or reduced by the substitution or insertion of 3′ region nucleotides. 
Conclusions We found that the hydrophilic C-terminal end of GFP increased Tat pathway specificity and that the 3′ nucleotide sequence played an important role in total protein expression through translational and transcriptional regulation. These findings may be useful for efficiently producing recombinant proteins as well as for potentially controlling the expression level of specific genes in the body for therapeutic purposes. PMID:23834827
Genome sequence, comparative analysis and haplotype structure of the domestic dog.
Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S
2005-12-08
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Epigenetic Modifications in Essential Hypertension
Wise, Ingrid A.; Charchar, Fadi J.
2016-01-01
Essential hypertension (EH) is a complex, polygenic condition with no single causative agent. Despite advances in our understanding of the pathophysiology of EH, hypertension remains one of the world’s leading public health problems. Furthermore, there is increasing evidence that epigenetic modifications are as important as genetic predisposition in the development of EH. Indeed, a complex and interactive genetic and environmental system exists to determine an individual’s risk of EH. Epigenetics refers to all heritable changes to the regulation of gene expression as well as chromatin remodelling, without involvement of nucleotide sequence changes. Epigenetic modification is recognized as an essential process in biology, but is now being investigated for its role in the development of specific pathologic conditions, including EH. Epigenetic research will provide insights into the pathogenesis of blood pressure regulation that cannot be explained by classic Mendelian inheritance. This review concentrates on epigenetic modifications to DNA structure, including the influence of non-coding RNAs on hypertension development. PMID:27023534
Global analysis of A-to-I RNA editing reveals association with common disease variants
Jain, Rajeev; Jain, Anamika; Betsholtz, Christer; Giannarelli, Chiara; Kovacic, Jason C.; Ruusalepp, Arno; Skogsberg, Josefin; Hao, Ke; Schadt, Eric E.
2018-01-01
RNA editing modifies transcripts and may alter their regulation or function. In humans, the most common modification is adenosine to inosine (A-to-I). We examined the global characteristics of RNA editing in 4,301 human tissue samples. More than 1.6 million A-to-I edits were identified in 62% of all protein-coding transcripts. mRNA recoding was extremely rare; only 11 novel recoding sites were uncovered. Thirty single nucleotide polymorphisms from genome-wide association studies were associated with RNA editing; one that influences type 2 diabetes (rs2028299) was associated with editing in ARPIN. Twenty-five genes, including LRP11 and PLIN5, had editing sites that were associated with plasma lipid levels. Our findings provide new insights into the genetic regulation of RNA editing and establish a rich catalogue for further exploration of this process. PMID:29527417
Monnier, Stéphanie; Cox, David G; Albion, Tim; Canzian, Federico
2005-01-01
Background: Single Nucleotide Polymorphism (SNP) genotyping is a major activity in biomedical research. The TaqMan technology is one of the most commonly used approaches. It produces large amounts of data that are difficult to process by hand. Laboratories not equipped with a Laboratory Information Management System (LIMS) need tools to organize the data flow. Results: We propose a package of Visual Basic programs focused on sample management and on the parsing of input and output TaqMan files. The code is written in Visual Basic, embedded in the Microsoft Office package, and it allows anyone to have access to those tools, without any programming skills and with basic computer requirements. Conclusion: We have created useful tools focused on management of TaqMan genotyping data, a critical issue in genotyping laboratories without a more sophisticated and expensive system, such as a LIMS. PMID:16221298
Zhao, Zhanqin; Xue, Yun; Hu, Zhigang; Zhou, Feng; Ma, Beibei; Long, Ta; Xue, Qiao; Liu, Huisheng
2017-04-01
This study evaluated whether there was an association between polymorphisms within the Toll-like receptor 2 gene (TLR2) of Chinese Holstein cattle and susceptibility to bovine tuberculosis (BTB). In a case-control study including 210 BTB cases and 237 control cattle, we found only two common single-nucleotide polymorphisms (SNPs) within the entire coding region of the TLR2 gene, A631G (rs95214857) and T1707C (rs1388116488). Additionally, the allele and genotype distributions of A631G and T1707C were not different between case and control groups, indicating that these SNPs were not associated with susceptibility to BTB. These results suggested that polymorphisms in the TLR2 gene might not play a significant role in the BTB risk in Chinese Holstein cattle. Copyright © 2017 Elsevier B.V. All rights reserved.
CASC15-S is a tumor suppressor lncRNA at the 6p22 neuroblastoma susceptibility locus
Russell, Mike R.; Penikis, Annalise; Oldridge, Derek A.; Alvarez-Dominguez, Juan R.; McDaniel, Lee; Diamond, Maura; Padovan, Olivia; Raman, Pichai; Li, Yimei; Wei, Jun S.; Zhang, Shile; Gnanchandran, Janahan; Seeger, Robert; Asgharzadeh, Shahab; Khan, Javed; Diskin, Sharon J.; Maris, John M.; Cole, Kristina A.
2015-01-01
Chromosome 6p22 was identified recently as a neuroblastoma susceptibility locus, but its mechanistic contributions to tumorigenesis are as yet undefined. Here we report that the most highly significant single nucleotide polymorphism (SNP) associations reside within CASC15, a long non-coding RNA that we define as a tumor suppressor at 6p22. Low-level expression of a short CASC15 isoform (CASC15-S) associated highly with advanced neuroblastoma and poor patient survival. In human neuroblastoma cells, attenuating CASC15-S increased cellular growth and migratory capacity. Gene expression analysis revealed downregulation of neuroblastoma-specific markers in cells with attenuated CASC15-S, with concomitant increases in cell adhesion and extracellular matrix transcripts. Altogether, our results point to CASC15-S as a mediator of neural growth and differentiation, which impacts neuroblastoma initiation and progression. PMID:26100672
Desmoglein 4 diversity and correlation analysis with coat color in goat.
E, G X; Zhao, Y J; Ma, Y H; Cao, G L; He, J N; Na, R S; Zhao, Z Q; Jiang, C D; Zhang, J H; Arlvd, S; Chen, L P; Qiu, X Y; Hu, W; Huang, Y F
2016-03-04
Desmoglein 4 (DSG4) has an important role in the development of wool traits in domestic animals. The full-length DSG4 gene, which contains 3918 bp, a complete open-reading-frame, and encodes a 1040-amino acid protein, was amplified from Liaoning cashmere goat. The sequence was compared with that of DSG4 from other animals and the results show that the DSG4 coding region is consistent with interspecies conservation. Thirteen single-nucleotide polymorphisms (SNPs) were identified in a highly variable region of DSG4, and one SNP (M-1, G>T) was significantly correlated with white and black coat color in goat. Haplotype distribution of the highly variable region of DSG4 was assessed in 179 individuals from seven goat breeds to investigate its association with coat color and its differentiation among populations. However, the lack of a signature result indicates DSG4 haplotypes related with the color of goat coat.
Zhao, Jianjun; Zhang, Hailing; Bai, Xue; Martella, Vito; Hu, Bo; Sun, Yangang; Zhu, Chunsheng; Zhang, Lei; Liu, Hao; Xu, Shujuan; Shao, Xiqun; Wu, Wei; Yan, Xijun
2014-04-01
A total of 16 strains of canine distemper virus (CDV) were detected from vaccinated minks, foxes, and raccoon dogs in four provinces in North-Eastern China between the end of 2011 and 2013. Upon sequence analysis of the haemagglutinin gene and comparison with wild-type CDV from different species in the same geographical areas, two non-synonymous single nucleotide polymorphisms were identified in 10 CDV strains, which led to amino acid changes at positions 542 (isoleucine to asparagine) and 549 (tyrosine to histidine) of the haemagglutinin protein coding sequence. The change at residue 542 generated a potentially novel N-glycosylation site. Masking of antigenic epitopes by sugar moieties might represent a mechanism for evasion of virus neutralising antibodies and reduced protection by vaccination. Copyright © 2014 Elsevier Ltd. All rights reserved.
Yamamoto, Hiroaki; Kudoh, Masatake
2013-09-01
A novel enantioselective alcohol dehydrogenase, (R)-2-octanol dehydrogenase (PfODH), was discovered among methylotrophic microorganisms. The enzyme was purified from Pichia finlandica and characterized. The molecular mass of the enzyme was estimated to be 83,000 and 30,000 by gel filtration and sodium dodecyl sulfate-polyacrylamide gel electrophoresis, respectively. The enzyme was an NAD(+)-dependent secondary alcohol dehydrogenase and showed a strict enantioselectivity, very broad substrate specificity, and high tolerance to SH reagents. A gene encoding PfODH was cloned and sequenced. The gene consisted of 765 nucleotides, encoding a polypeptide of 254 amino acids. The gene was expressed singly and coexpressed together with a formate dehydrogenase as an NADH regenerator in Escherichia coli. Ethyl (S)-4-chloro-3-hydroxybutanoate and (S)-2-chloro-1-phenylethanol were synthesized using a whole-cell biocatalyst in more than 99 % optical purity.
Khrustalev, Vladislav Victorovich
2009-01-01
Guanine is the most mutable nucleotide in HIV genes because of frequently occurring G to A transitions, which are caused by cytosine deamination in viral DNA minus strands catalyzed by APOBEC enzymes. The distribution of guanine among the three codon positions should influence the probability that a G to A mutation is nonsynonymous (that is, occurs in the first or second codon position). We discovered that nucleotide sequences of env genes coding for third variable regions (V3 loops) of gp120 from HIV1 and HIV2 have different kinds of guanine usage biases. In the HIV1 reference strain and 100 additionally analyzed HIV1 strains, the guanine usage bias in V3 loop coding regions (2G>1G>3G) should lead to elevated rates of nonsynonymous G to A transitions. In the HIV2 reference strain and 100 other HIV2 strains, the guanine usage bias in V3 loop coding regions (3G>2G>1G) should protect V3 loops from hypermutability. According to the HIV1 and HIV2 V3 alignment, insertion of a sequence enriched with 2G (21 codons in length) occurred during the evolution of the HIV1 predecessor, while insertion of a different sequence enriched with 3G (19 codons in length) occurred during the evolution of the HIV2 predecessor. The higher the level of 3G in the V3 coding region, the lower the rate of immune-escape mutations should be. This hypothesis was tested in this study by comparing the guanine usage in V3 loop coding regions from HIV1 fast and slow progressors. All calculations have been performed by our algorithms "VVK In length", "VVK Dinucleotides" and "VVK Consensus" (www.barkovsky.hotmail.ru).
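The 1G, 2G and 3G notation above can be illustrated with a minimal Python sketch. It assumes that each value is the fraction of codons carrying G at the first, second or third codon position of an in-frame coding sequence; that definition, the function names and the toy sequence are illustrative assumptions, and this is not the "VVK" software named in the abstract.

```python
def guanine_usage_by_codon_position(cds: str) -> dict[str, float]:
    """Fraction of codons with G at codon positions 1, 2 and 3.

    Assumption: 1G/2G/3G are per-position guanine fractions of an
    in-frame coding sequence (hypothetical definition for illustration).
    """
    seq = cds.upper().replace("U", "T")
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("expected a non-empty, in-frame coding sequence")
    n_codons = len(seq) // 3
    usage = {}
    for pos in range(3):
        # Walk every third base starting at this codon position.
        g_count = sum(1 for i in range(pos, len(seq), 3) if seq[i] == "G")
        usage[f"{pos + 1}G"] = g_count / n_codons
    return usage


def rank_bias(usage: dict[str, float]) -> str:
    """Format the bias as in the abstract, e.g. '2G>1G>3G'."""
    return ">".join(sorted(usage, key=usage.get, reverse=True))


if __name__ == "__main__":
    toy = "AGAAGGCGT"  # hypothetical 3-codon sequence: AGA AGG CGT
    u = guanine_usage_by_codon_position(toy)
    print(u, rank_bias(u))
```

A G to A change at the first or second codon position usually alters the amino acid, which is why a sequence ranked 2G>1G>3G is expected to be more exposed to nonsynonymous G to A transitions than one ranked 3G>2G>1G.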
Seal, B S; Neill, J D; Ridpath, J F
1994-07-01
Caliciviruses are nonenveloped with a polyadenylated genome of approximately 7.6 kb and a single capsid protein. The "RNA Fold" computer program was used to analyze 3'-terminal noncoding sequences of five feline calicivirus (FCV), rabbit hemorrhagic disease virus (RHDV), and two San Miguel sea lion virus (SMSV) isolates. The FCV 3'-terminal sequences are 40-46 nucleotides in length and 72-91% similar. The FCV sequences were predicted to contain two possible duplex structures and one stem-loop structure with free energies of -2.1 to -18.2 kcal/mole. The RHDV genomic 3'-terminal RNA sequences are 54 nucleotides in length and share 49% sequence similarity to homologous regions of the FCV genome. The RHDV sequence was predicted to form two duplex structures in the 3'-terminal noncoding region with a single stem-loop structure, resembling that of FCV. In contrast, the SMSV 1 and 4 genomic 3'-terminal noncoding sequences were 185 and 182 nucleotides in length, respectively. Ten possible duplex structures were predicted with an average structural free energy of -35 kcal/mole. Sequence similarity between the two SMSV isolates was 75%. Furthermore, extensive cloverleaflike structures are predicted in the 3' noncoding region of the SMSV genome, in contrast to the predicted single stem-loop structures of FCV or RHDV.
Nelson, Chase W; Moncla, Louise H; Hughes, Austin L
2015-11-15
New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved.
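As a rough illustration of what "nucleotide diversity" means for pooled SNP data, the following Python sketch computes the proportion of pairwise differences at a site from allele counts. It is a simplified textbook estimator under stated assumptions, not SNPGenie's Perl implementation: it does not split sites into nonsynonymous and synonymous classes, and the function names and counts are hypothetical.

```python
from itertools import combinations


def site_pairwise_diversity(allele_counts: dict[str, int]) -> float:
    """Proportion of sequence pairs that differ at one site.

    allele_counts maps each observed nucleotide to its count
    (for pooled data, the read count supporting that allele).
    """
    n = sum(allele_counts.values())
    if n < 2:
        return 0.0
    # Pairs of sequences carrying two different alleles.
    differing_pairs = sum(
        a * b for a, b in combinations(allele_counts.values(), 2)
    )
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs


def mean_diversity(sites: list[dict[str, int]], n_sites: int) -> float:
    """Average per-site diversity over a region of n_sites positions.

    Only polymorphic sites need to be listed; invariant sites add zero.
    """
    return sum(site_pairwise_diversity(s) for s in sites) / n_sites


if __name__ == "__main__":
    snp = {"A": 8, "G": 2}  # hypothetical counts at one SNP
    print(site_pairwise_diversity(snp))    # 16 differing pairs out of 45
    print(mean_diversity([snp], n_sites=100))
```

Dividing by the full region length rather than by the number of SNPs keeps the estimate comparable across genes or sliding windows of different sizes.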
Base Preferences in Non-Templated Nucleotide Incorporation by MMLV-Derived Reverse Transcriptases
Zajac, Pawel; Islam, Saiful; Hochgerner, Hannah; Lönnerberg, Peter; Linnarsson, Sten
2013-01-01
Reverse transcriptases derived from Moloney Murine Leukemia Virus (MMLV) have an intrinsic terminal transferase activity, which causes the addition of a few non-templated nucleotides at the 3′ end of cDNA, with a preference for cytosine. This mechanism can be exploited to make the reverse transcriptase switch template from the RNA molecule to a secondary oligonucleotide during first-strand cDNA synthesis, and thereby to introduce arbitrary barcode or adaptor sequences in the cDNA. Because the mechanism is relatively efficient and occurs in a single reaction, it has recently found use in several protocols for single-cell RNA sequencing. However, the base preference of the terminal transferase activity is not known in detail, which may lead to inefficiencies in template switching when starting from tiny amounts of mRNA. Here, we used fully degenerate oligos to determine the exact base preference at the template switching site up to a distance of ten nucleotides. We found a strong preference for guanosine at the first non-templated nucleotide, with a greatly reduced bias at progressively more distant positions. Based on this result, and a number of careful optimizations, we report conditions for efficient template switching for cDNA amplification from single cells. PMID:24392002
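One simple way to summarise a per-position base preference such as the one described above is a positional frequency table. The Python sketch below assumes the sequenced reads have already been trimmed so that index 0 is the first position at the template switching site; the function name and the input reads are hypothetical and do not reproduce the authors' analysis pipeline.

```python
from collections import Counter


def positional_base_frequencies(
    reads: list[str], n_positions: int = 10
) -> list[dict[str, float]]:
    """Base frequencies at each of the first n_positions of the reads.

    Assumption: each read starts at the first position of interest.
    Bases other than A, C, G, T (for example N) are ignored.
    """
    counts = [Counter() for _ in range(n_positions)]
    for read in reads:
        for i, base in enumerate(read[:n_positions].upper()):
            if base in "ACGT":
                counts[i][base] += 1
    freqs = []
    for c in counts:
        total = sum(c.values())
        freqs.append({b: (c[b] / total if total else 0.0) for b in "ACGT"})
    return freqs


if __name__ == "__main__":
    toy_reads = ["GGA", "GCA", "GTT", "AGC"]  # hypothetical reads
    for i, f in enumerate(positional_base_frequencies(toy_reads, 3), 1):
        print(i, f)
```

A bias that fades with distance would show up as frequencies far from 0.25 at position 1 that move toward 0.25 at later positions.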
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa
2012-01-01
Background: Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results: We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of the lily and 39% of the tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2- to 3-fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs, lily and tulip (12,017 groups), and among the three monocot species, lily, tulip, and rice (6,900 groups), were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. The lily and tulip contigs generated were annotated and described according to Gene Ontology terminology.
Conclusions: Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies. PMID:23167289
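The two SNP marker sets described in the Results differ only in how many neighbouring SNPs are tolerated within 50 bp of the target. A minimal Python sketch of that filter is given below, assuming SNP positions on a single contig are unique integers; the function name, parameters and positions are hypothetical, and the check that each contig has a full 50 bp of sequence on both sides is left out.

```python
from bisect import bisect_left, bisect_right


def select_snps(
    positions: list[int], flank: int = 50, max_flanking: int = 0
) -> list[int]:
    """Keep target SNPs with at most max_flanking other SNPs
    within `flank` bp on either side (same contig, unique positions)."""
    pos = sorted(positions)
    selected = []
    for p in pos:
        lo = bisect_left(pos, p - flank)
        hi = bisect_right(pos, p + flank)
        neighbours = (hi - lo) - 1  # exclude the target SNP itself
        if neighbours <= max_flanking:
            selected.append(p)
    return selected


if __name__ == "__main__":
    contig_snps = [100, 130, 300, 500, 540, 560]  # hypothetical positions
    print(select_snps(contig_snps, max_flanking=0))  # strict set
    print(select_snps(contig_snps, max_flanking=1))  # relaxed set
```

With these toy positions the strict setting keeps one SNP and the relaxed setting keeps five, which shows how allowing a single flanking SNP can enlarge the marker set.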
Urschitz, Johann; Sultan, Omar; Ward, Kenneth
2011-01-01
Objective: Various Asian and Pacific Islander groups have higher prevalence rates of type 2 diabetes and gestational diabetes. This increased incidence is likely to include genetic factors. Single nucleotide polymorphisms in the retinol binding protein 4 gene have been linked to the occurrence of type 2 diabetes. Hypothesizing a link between retinol binding protein 4 and gestational diabetes, we performed a candidate gene study to look for an association between an important retinol binding protein gene polymorphism (rs3758539) and gestational diabetes. Study Design: Blood was collected from Caucasian, Asian, and Pacific Islander women diagnosed with gestational diabetes and from ethnically matched non-diabetic controls. DNA was extracted and real time PCR technology (TaqMan, Applied Biosystems) used to screen for the rs3758539 single nucleotide polymorphism located 5′ of exon 1 of the retinol binding protein 4 gene. Results: Genotype and allele frequencies in the controls and gestational diabetes cases were tested using chi-square contingency tests. Genotype frequencies were in Hardy-Weinberg equilibrium. There was no association between the rs3758539 retinol binding protein 4 single nucleotide polymorphism and gestational diabetes in the Caucasian, Filipino, or Pacific Islander groups. Conclusion: Interestingly, the rs3758539 retinol binding protein 4 single nucleotide polymorphism was not found to be associated with gestational diabetes. The absence of association suggests that gestational and type 2 diabetes may have more divergent molecular pathophysiology than previously suspected. PMID:21886308
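The Hardy-Weinberg check mentioned in the Results can be sketched as a chi-square goodness-of-fit test for a biallelic locus with one degree of freedom. The Python example below uses only the standard library; the genotype counts are hypothetical, since the abstract reports none, and the function name is an illustrative assumption.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test of Hardy-Weinberg proportions (1 degree of freedom).

    Returns (chi-square statistic, p-value) for genotype counts
    AA, AB and BB at a biallelic locus.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum(
        (o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0
    )
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts for illustration only.
    print(hwe_chi_square(25, 50, 25))  # matches HWE expectations exactly
    print(hwe_chi_square(50, 0, 50))   # complete heterozygote deficit
```

A large p-value, as in the first call, is consistent with Hardy-Weinberg equilibrium; a very small one, as in the second, points to genotyping problems or population structure that should be resolved before testing for association.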
Pooled genome wide association detects association upstream of FCRL3 with Graves' disease.
Khong, Jwu Jin; Burdon, Kathryn P; Lu, Yi; Laurie, Kate; Leonardos, Lefta; Baird, Paul N; Sahebjada, Srujana; Walsh, John P; Gajdatsy, Adam; Ebeling, Peter R; Hamblin, Peter Shane; Wong, Rosemary; Forehan, Simon P; Fourlanos, Spiros; Roberts, Anthony P; Doogue, Matthew; Selva, Dinesh; Montgomery, Grant W; Macgregor, Stuart; Craig, Jamie E
2016-11-18
Graves' disease is an autoimmune thyroid disease of complex inheritance. Multiple genetic susceptibility loci are thought to be involved in Graves' disease and it is therefore likely that these can be identified by genome wide association studies. This study aimed to determine if a genome wide association study, using a pooling methodology, could detect genomic loci associated with Graves' disease. Nineteen of the top ranking single nucleotide polymorphisms, including HLA-DQA1 and C6orf10, were clustered within the Major Histocompatibility Complex region on chromosome 6p21, with rs1613056 reaching genome wide significance (p = 5 × 10⁻⁸). Technical validation of top ranking non-Major Histocompatibility Complex single nucleotide polymorphisms with individual genotyping in the discovery cohort revealed four single nucleotide polymorphisms with p ≤ 10⁻⁴. Rs17676303 on chromosome 1q23.1, located upstream of FCRL3, showed evidence of association with Graves' disease across the discovery, replication and combined cohorts. A second single nucleotide polymorphism, rs9644119, downstream of DPYSL2 showed some evidence of association, supported by findings in the replication cohort, and warrants further study. The pooled genome wide association study identified a genetic variant upstream of FCRL3 as a susceptibility locus for Graves' disease in addition to those identified in the Major Histocompatibility Complex. A second locus downstream of DPYSL2 is potentially a novel genetic variant in Graves' disease that requires further confirmation.