2014-01-01
Background The fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code for the delta-5 and delta-6 desaturases, respectively, which are rate-limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and -6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in the FADS1 and FADS2 genes and plasma and tissue concentrations of omega-3 and -6 FAs. The aim of this study was to evaluate the extent of sequence variation within these two genes in Canadian Holstein cows as well as the association between sequence variants and health-promoting FAs in milk. Results Thirty-three SNPs were detected within the studied regions of the genes, including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3′UTR SNP (FADS2-23, rs109772589), and another 3′UTR SNP with an effect on a microRNA binding site within the FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three of the seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomo-gamma-linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), between FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and between one intronic SNP, FADS1-01 (rs136261927), and C20:3n6. Conclusion Our study has demonstrated positive associations between three SNPs within the FADS1 and FADS2 genes (a SNP within the 3′UTR, a synonymous SNP and an intronic SNP) and three milk PUFAs of Canadian Holstein cows, thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis.
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health. PMID:24533445
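The abstract reports FDR-adjusted P values without naming the adjustment procedure. As a hedged illustration only, the sketch below applies the Benjamini-Hochberg step-up adjustment, one common way to control the false discovery rate. The choice of procedure, the function name and the example P values are assumptions and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is
    # the minimum over its own rank and all larger ranks (monotonicity).
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four SNP-trait tests (not study data).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

With these made-up inputs only the first test remains below 0.05 after adjustment, which shows why a raw P value of 0.03 or 0.04 need not survive FDR control.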
Ibeagha-Awemu, Eveline M; Akwanji, Kingsley A; Beaudoin, Frédéric; Zhao, Xin
2014-02-17
The fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code for the delta-5 and delta-6 desaturases, respectively, which are rate-limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and -6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in the FADS1 and FADS2 genes and plasma and tissue concentrations of omega-3 and -6 FAs. The aim of this study was to evaluate the extent of sequence variation within these two genes in Canadian Holstein cows as well as the association between sequence variants and health-promoting FAs in milk. Thirty-three SNPs were detected within the studied regions of the genes, including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3'UTR SNP (FADS2-23, rs109772589), and another 3'UTR SNP with an effect on a microRNA binding site within the FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three of the seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomo-gamma-linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), between FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and between one intronic SNP, FADS1-01 (rs136261927), and C20:3n6. Our study has demonstrated positive associations between three SNPs within the FADS1 and FADS2 genes (a SNP within the 3'UTR, a synonymous SNP and an intronic SNP) and three milk PUFAs of Canadian Holstein cows, thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis.
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health.
Zhang, Ya-Ran; Gui, Lin-Sheng; Li, Yao-Kun; Jiang, Bi-Jie; Wang, Hong-Cheng; Zhang, Ying-Ying; Zan, Lin-Sen
2015-07-27
The Smoothened (Smo)-mediated Hedgehog (Hh) signaling pathway governs the patterning, morphogenesis and growth of many different regions within animal body plans. This study evaluated the effects of genetic variations of the bovine SMO gene on economically important body size traits in Chinese Qinchuan cattle. Altogether, eight single nucleotide polymorphisms (SNPs: 1-8) were identified and genotyped via direct sequencing covering most of the coding region and 3'UTR of the bovine SMO gene. Both the p.698Ser.>Ser. synonymous mutation resulting from SNP1 and the p.700Ser.>Pro. non-synonymous mutation caused by SNP2 mapped to the intracellular C-terminal tail of the bovine Smo protein; the other six SNPs were non-coding variants located in the 3'UTR. Linkage disequilibrium was analyzed, and five haplotypes were discovered in 520 Qinchuan cattle. Association analyses showed that SNP2, SNP3/5, SNP4 and SNP6/7 were significantly associated with some body size traits (p < 0.05), whereas SNP1/8 were not (p > 0.05). Meanwhile, cattle with the wild-type combined haplotype Hap1/Hap1 had significantly (p < 0.05) greater body length than those with Hap2/Hap2. Our results indicate that variations in the SMO gene could affect body size traits of Qinchuan cattle, and the wild-type haplotype Hap1 together with the wild-type alleles of these detected SNPs in the SMO gene could be used to breed cattle with superior body size traits. Therefore, our results could be helpful for marker-assisted selection in beef cattle breeding programs.
Zhang, Ya-Ran; Gui, Lin-Sheng; Li, Yao-Kun; Jiang, Bi-Jie; Wang, Hong-Cheng; Zhang, Ying-Ying; Zan, Lin-Sen
2015-01-01
The Smoothened (Smo)-mediated Hedgehog (Hh) signaling pathway governs the patterning, morphogenesis and growth of many different regions within animal body plans. This study evaluated the effects of genetic variations of the bovine SMO gene on economically important body size traits in Chinese Qinchuan cattle. Altogether, eight single nucleotide polymorphisms (SNPs: 1–8) were identified and genotyped via direct sequencing covering most of the coding region and 3ʹUTR of the bovine SMO gene. Both the p.698Ser.>Ser. synonymous mutation resulting from SNP1 and the p.700Ser.>Pro. non-synonymous mutation caused by SNP2 mapped to the intracellular C-terminal tail of the bovine Smo protein; the other six SNPs were non-coding variants located in the 3ʹUTR. Linkage disequilibrium was analyzed, and five haplotypes were discovered in 520 Qinchuan cattle. Association analyses showed that SNP2, SNP3/5, SNP4 and SNP6/7 were significantly associated with some body size traits (p < 0.05), whereas SNP1/8 were not (p > 0.05). Meanwhile, cattle with the wild-type combined haplotype Hap1/Hap1 had significantly (p < 0.05) greater body length than those with Hap2/Hap2. Our results indicate that variations in the SMO gene could affect body size traits of Qinchuan cattle, and the wild-type haplotype Hap1 together with the wild-type alleles of these detected SNPs in the SMO gene could be used to breed cattle with superior body size traits. Therefore, our results could be helpful for marker-assisted selection in beef cattle breeding programs. PMID:26225956
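The abstract mentions a linkage disequilibrium analysis without giving the statistics used. As a minimal sketch, assuming the standard pairwise measures D, D' and r² for two biallelic loci, the calculation from haplotype and allele frequencies might look like this. The function name and the example frequencies are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of alleles A and B (each strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    # The largest |D| attainable depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r_squared


# Hypothetical frequencies, not taken from the Qinchuan cattle data.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.3, p_a=0.6, p_b=0.4)
```

Pairs of SNPs reported together in the abstract (for example SNP3/5) would be expected to show values close to 1 under such measures, although the abstract does not report them.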
LS-SNP/PDB: annotated non-synonymous SNPs mapped to Protein Data Bank structures.
Ryan, Michael; Diekhans, Mark; Lien, Stephanie; Liu, Yun; Karchin, Rachel
2009-06-01
LS-SNP/PDB is a new WWW resource for genome-wide annotation of human non-synonymous (amino acid changing) SNPs. It serves high-quality protein graphics rendered with UCSF Chimera molecular visualization software. The system is kept up-to-date by an automated, high-throughput build pipeline that systematically maps human nsSNPs onto Protein Data Bank structures and annotates several biologically relevant features. LS-SNP/PDB is available at (http://ls-snp.icm.jhu.edu/ls-snp-pdb) and via links from protein data bank (PDB) biology and chemistry tabs, UCSC Genome Browser Gene Details and SNP Details pages and PharmGKB Gene Variants Downloads/Cross-References pages.
Pavy, Nathalie; Parsons, Lee S; Paule, Charles; MacKay, John; Bousquet, Jean
2006-01-01
Background High-throughput genotyping technologies represent a highly efficient way to accelerate genetic mapping and enable association studies. As a first step toward this goal, we aimed to develop a resource of candidate Single Nucleotide Polymorphisms (SNP) in white spruce (Picea glauca [Moench] Voss), a softwood tree of major economic importance. Results A white spruce SNP resource encompassing 12,264 SNPs was constructed from a set of 6,459 contigs derived from Expressed Sequence Tags (EST) and by using the bayesian-based statistical software PolyBayes. Several parameters influencing the SNP prediction were analysed including the a priori expected polymorphism, the probability score (PSNP), and the contig depth and length. SNP detection in 3' and 5' reads from the same clones revealed a level of inconsistency between overlapping sequences as low as 1%. A subset of 245 predicted SNPs were verified through the independent resequencing of genomic DNA of a genotype also used to prepare cDNA libraries. The validation rate reached a maximum of 85% for SNPs predicted with either PSNP ≥ 0.95 or ≥ 0.99. A total of 9,310 SNPs were detected by using PSNP ≥ 0.95 as a criterion. The SNPs were distributed among 3,590 contigs encompassing an array of broad functional categories, with an overall frequency of 1 SNP per 700 nucleotide sites. Experimental and statistical approaches were used to evaluate the proportion of paralogous SNPs, with estimates in the range of 8 to 12%. The 3,789 coding SNPs identified through coding region annotation and ORF prediction, were distributed into 39% nonsynonymous and 61% synonymous substitutions. Overall, there were 0.9 SNP per 1,000 nonsynonymous sites and 5.2 SNPs per 1,000 synonymous sites, for a genome-wide nonsynonymous to synonymous substitution rate ratio (Ka/Ks) of 0.17. 
Conclusion We integrated the SNP data in the ForestTreeDB database along with functional annotations to provide a tool facilitating the choice of candidate genes for mapping purposes or association studies. PMID:16824208
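The reported Ka/Ks of 0.17 follows directly from the two per-site SNP densities quoted in the Results. The short sketch below reproduces that arithmetic; the function name is an illustrative choice, and the calculation is the simple ratio of densities rather than any model-based estimator the authors may also have used.

```python
def ka_ks(nonsyn_per_site, syn_per_site):
    """Ratio of non-synonymous to synonymous substitutions per site."""
    if syn_per_site <= 0:
        raise ValueError("synonymous rate must be positive")
    return nonsyn_per_site / syn_per_site


# Values from the abstract: 0.9 SNP per 1,000 non-synonymous sites
# and 5.2 SNPs per 1,000 synonymous sites.
ratio = ka_ks(0.9 / 1000, 5.2 / 1000)
```

A ratio well below 1, as here, is usually read as evidence that most amino acid changes are removed by purifying selection.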
LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources.
Karchin, Rachel; Diekhans, Mark; Kelly, Libusha; Thomas, Daryl J; Pieper, Ursula; Eswar, Narayanan; Haussler, David; Sali, Andrej
2005-06-15
The NCBI dbSNP database lists over 9 million single nucleotide polymorphisms (SNPs) in the human genome, but currently contains limited annotation information. SNPs that result in amino acid residue changes (nsSNPs) are of critical importance in variation between individuals, including disease and drug sensitivity. We have developed LS-SNP, a genomic-scale software pipeline to annotate nsSNPs. LS-SNP comprehensively maps nsSNPs onto protein sequences, functional pathways and comparative protein structure models, and predicts positions where nsSNPs destabilize proteins, interfere with the formation of domain-domain interfaces, have an effect on protein-ligand binding or severely impact human health. It currently annotates 28,043 validated SNPs that produce amino acid residue substitutions in human proteins from the SwissProt/TrEMBL database. Annotations can be viewed via a web interface either in the context of a genomic region or by selecting sets of SNPs, genes, proteins or pathways. These results are useful for identifying candidate functional SNPs within a gene, haplotype or pathway and in probing molecular mechanisms responsible for functional impacts of nsSNPs. Availability: http://www.salilab.org/LS-SNP. Contact: rachelk@salilab.org. Supplementary information: http://salilab.org/LS-SNP/supp-info.pdf.
Liu, Chengzhang; Wang, Xia; Xiang, Jianhai; Li, Fuhua
2012-09-01
Pacific white shrimp has become a major aquaculture and fishery species worldwide. Although a large scale EST resource has been publicly available since 2008, the data have not yet been widely used for SNP discovery or transcriptome-wide assessment of selective pressure. In this study, a set of 155 411 expressed sequence tags (ESTs) from the NCBI database were computationally analyzed and 17 225 single nucleotide polymorphisms (SNPs) were predicted, including 9 546 transitions, 5 124 transversions and 2 481 indels. Among the 7 298 SNP substitutions located in functionally annotated contigs, 58.4% (4 262) are non-synonymous SNPs capable of introducing amino acid mutations. Two hundred and fifty nonsynonymous SNPs in genes associated with economic traits have been identified as candidates for markers in selective breeding. Diversity estimates among the synonymous nucleotides were on average 3.49 times greater than those in non-synonymous, suggesting negative selection. Distribution of non-synonymous to synonymous substitutions (Ka/Ks) ratio ranges from 0 to 4.01, (average 0.42, median 0.26), suggesting that the majority of the affected genes are under purifying selection. Enrichment analysis identified multiple gene ontology categories under positive or negative selection. Categories involved in innate immune response and male gamete generation are rich in positively selected genes, which is similar to reports in Drosophila and primates. This work is the first transcriptome-wide assessment of selective pressure in a Penaeid shrimp species. The functionally annotated SNPs provide a valuable resource of potential molecular markers for selective breeding.
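The split of predicted substitutions into transitions and transversions rests on a simple rule: a transition swaps two purines or two pyrimidines, and a transversion swaps a purine for a pyrimidine. The sketch below illustrates that rule and the transition-to-transversion ratio implied by the counts in the abstract; the function name is an assumption and this is not the authors' pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts reported in the abstract: 9 546 transitions, 5 124 transversions.
ts_tv_ratio = 9546 / 5124
```

The resulting ratio of roughly 1.9 transitions per transversion is computed only from the two counts quoted above.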
Liu, Dewu; Zhang, Yushan; Du, Yinjun; Yang, Guanfu; Zhang, Xiquan
2007-06-01
The growth-correlated genes that are part of the neuroendocrine growth axis play crucial roles in the regulation of growth and development of pigs. The identification of genetic polymorphisms in these genes will enable scientists to evaluate the biological relevance of such polymorphisms and to gain a better understanding of quantitative traits like growth. In the present study, seven pairs of primers were designed to obtain unknown sequences of growth-correlated genes, and another 25 pairs of primers were designed to identify single nucleotide polymorphisms (SNPs) using denaturing high-performance liquid chromatography (DHPLC) in four pig breeds (Duroc, Landrace, Lantang and Wuzhishan) that differ significantly in growth and development characteristics. A total of 101 polymorphisms were discovered in 10,707 base pairs (bp) from six genes: ghrelin (GHRL), leptin (LEP), insulin-like growth factor II (IGF-II), insulin-like growth factor binding protein 2 (IGFBP-2), insulin-like growth factor binding protein 3 (IGFBP-3), and somatostatin (SS). The observed average distances between SNPs in the 5'UTR, coding regions, introns and 3'UTR were 134, 521, 81 and 92 bp, respectively. Four SNPs were found in the coding regions of IGF-II, IGFBP-2 and LEP. Two synonymous mutations were found, one each in the IGF-II and LEP genes, and two non-synonymous mutations, one each in the IGFBP-2 and LEP genes. Seven other mutations were also observed. Thirty-two PCR-RFLP markers were found among the 101 polymorphisms of the six genes. The SNPs discovered in this study would provide suitable markers for association studies of candidate genes with growth-related traits in pigs.
Bactericidal activity of tracheal antimicrobial peptide against respiratory pathogens of cattle.
Taha-Abdelaziz, Khaled; Perez-Casal, José; Schott, Courtney; Hsiao, Jason; Attah-Poku, Samuel; Slavić, Durđa; Caswell, Jeff L
2013-04-15
Tracheal antimicrobial peptide (TAP) is a β-defensin produced by mucosal epithelial cells of cattle. Although it is effective against several human pathogens, the activity of this bovine peptide against the bacterial pathogens that cause bovine respiratory disease has not been reported. This study compared the antibacterial effects of synthetic TAP against Mannheimia haemolytica, Histophilus somni, Pasteurella multocida, and Mycoplasma bovis. Bactericidal activity against M. bovis was not detected. In contrast, the Pasteurellaceae bacteria showed similar levels of susceptibility to that of Escherichia coli, with 0.125 μg TAP inhibiting growth in a radial diffusion assay and minimum inhibitory concentrations of 1.56-6.25 μg/ml in a bactericidal assay. Significant differences among isolates were not observed. Sequencing of exon 2 of the TAP gene from 23 cattle revealed a prevalent non-synonymous single nucleotide polymorphism (SNP) A137G, encoding either serine or asparagine at residue 20 of the mature peptide. The functional effect of this SNP was tested against M. haemolytica using synthetic peptides. The bactericidal effect of the asparagine-containing peptide was consistently higher than that of the serine-containing peptide. Bactericidal activities were similar for an acapsular mutant of M. haemolytica compared to the wild type. These findings indicate that the Pasteurellaceae bacteria that cause bovine respiratory disease are susceptible to killing by bovine TAP and appear not to have evolved resistance, whereas M. bovis appears to be resistant. A non-synonymous SNP was identified in the coding region of the TAP gene, and the corresponding peptides vary in their bactericidal activity against M. haemolytica. Copyright © 2013 Elsevier B.V. All rights reserved.
2014-01-01
Background Previous genome-wide association studies have identified significant regions of the X chromosome associated with reproductive traits in two Bos indicus-influenced breeds: Brahman cattle and Tropical Composites. Two QTL regions on this chromosome were identified in both breeds as strongly associated with scrotal circumference measurements, a reproductive trait previously shown to be useful for selection of young bulls. Scrotal circumference is genetically correlated with early age at puberty in both male and female offspring. These QTL were located at positions 69–77 and 81–92 Mb respectively, large areas each to which a significant number of potential candidate genes were mapped. Results To further characterise these regions, a bioinformatic approach was undertaken to identify novel non-synonymous SNP within the QTL regions of interest in Brahman cattle. After SNP discovery, we used conventional molecular assay technologies to perform studies of two candidate genes in both breeds. Non-synonymous SNP mapped to Testis-expressed gene 11 (Tex11) were associated (P < 0.001) with scrotal circumference in both breeds, and associations with percentage of normal sperm cells were also observed (P < 0.05). Evidence for recent selection was found as Tex11 SNP form a haplotype segment of Bos taurus origin that is retained within Brahman and Tropical Composite cattle with greatest reproductive potential. Conclusions Association of non-synonymous SNP presented here are a first step to functional genetic studies. Bovine species may serve as a model for studying the role of Tex11 in male fertility, warranting further in-depth molecular characterisation. PMID:24410912
Identification of bovine NPC1 gene cSNPs and their effects on body size traits of Qinchuan cattle.
Dang, Yonglong; Li, Mingxun; Yang, Mingjuan; Cao, Xiukai; Lan, Xianyong; Lei, Chuzhao; Zhang, Chunlei; Lin, Qing; Chen, Hong
2014-05-01
The NPC1 gene is closely related to Niemann-Pick type C (NPC): mutations in NPC1 tend to cause this lysosomal storage disorder. Previous studies have shown that the NPC1 protein plays an important role in subcellular lipid transport, homeostasis, platelet function and formation, which are basic metabolic activities in the process of development. In this study, to explore the association between NPC1 gene variation and body size traits in Qinchuan cattle, we detected four novel coding single nucleotide polymorphisms (cSNPs) in the bovine NPC1 gene, including one missense mutation (SNP1) and three synonymous mutations (SNP2, SNP3 and SNP4). Population genetic analyses of 518 individuals and association analyses between the cSNPs and bovine body size traits were conducted. The missense mutation at the SNP1 locus was found to be significantly related to heart girth, hip width and body weight (P<0.01 or P<0.05, 3.5-year-old). The two synonymous mutations at the SNP2 and SNP3 loci also showed significant effects on hip width (P<0.05, 3.5-year-old). The synonymous mutation at the SNP4 locus showed a significant effect on body weight (P<0.05, 2.0-year-old). Combined haplotypes H2H6 and H6H6 showed significant effects on body size traits such as heart girth, hip width, and body weight (3.5-year-old, P<0.01 or P<0.05). This study provides evidence that the NPC1 gene might be involved in the regulation of bovine growth and body development, and may be considered as a candidate gene for marker-assisted selection (MAS) in the beef cattle breeding industry. Copyright © 2014. Published by Elsevier B.V.
Rasal, Kiran D; Shah, Tejas M; Vaidya, Megha; Jakhesara, Subhash J; Joshi, Chaitanya G
2015-06-01
Recent advances in high-throughput sequencing technology have accelerated the study of genome-wide variation and its consequences in several organisms. In the present study, mutations in TGFBR3 that showed significant association with the FCR trait in chicken during exome sequencing were analyzed further. Out of four SNPs, one nsSNP, p.Val451Leu, was found in the coding region of TGFBR3. The in silico tools SnpSift and PANTHER predicted it as deleterious (0.04) and tolerated, respectively, while I-Mutant indicated decreased protein stability. The TGFBR3 I-TASSER model has a C-score of 0.85 and was validated using PROCHECK. Based on MD simulation, the mutant protein structure deviated from the native structure with an RMSD of 0.08 Å due to changes in the H-bonding distances of the mutant residue. Docking of TGFBR3 with its interacting partner TGFBR2 inferred that the mutant required more global energy. Therefore, the present study provides useful information about functional SNPs that have an impact on FCR traits.
Dos Reis, Mario
2015-04-01
First principles of population genetics are used to obtain formulae relating the non-synonymous to synonymous substitution rate ratio to the selection coefficients acting at codon sites in protein-coding genes. Two theoretical cases are discussed and two examples from real data (a chloroplast gene and a virus polymerase) are given. The formulae give much insight into the dynamics of non-synonymous substitutions and may inform the development of methods to detect adaptive evolution. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
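The abstract does not reproduce its formulae. As a loosely related illustration only, the sketch below uses the textbook result that a new mutation with scaled selection coefficient S fixes at a rate S / (1 - exp(-S)) relative to a neutral mutation, and averages that quantity over a set of possible non-synonymous mutations. The unweighted average (equal mutation rates, synonymous changes treated as neutral), the function names and the example coefficients are simplifying assumptions and should not be read as the formulae derived in the paper.

```python
import math


def relative_fixation_rate(s_scaled):
    """Fixation rate of a mutation relative to a neutral one.

    s_scaled is the population-scaled selection coefficient S.
    """
    if s_scaled == 0:
        return 1.0
    if s_scaled < -700:
        # Avoid overflow in expm1; the rate is effectively zero here.
        return 0.0
    return s_scaled / -math.expm1(-s_scaled)


def omega(selection_coefficients):
    """Unweighted mean relative fixation rate of non-synonymous mutations."""
    rates = [relative_fixation_rate(s) for s in selection_coefficients]
    return sum(rates) / len(rates)


# Hypothetical scaled coefficients for three non-synonymous mutations.
example_omega = omega([-2.0, -5.0, 0.0])
```

Under these assumptions, deleterious mutations (S < 0) pull the ratio below 1 and advantageous ones (S > 0) push it above 1, which is the qualitative link between selection coefficients and the rate ratio that the abstract describes.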
Zhou, Jie; Kherani, Femida; Bardakjian, Tanya M.; Katowitz, James; Hughes, Nkecha; Schimmenti, Lisa A.; Schneider, Adele
2008-01-01
Purpose Mutations in the SOX2 and CHX10 genes have been reported in patients with anophthalmia and/or microphthalmia. In this study, we evaluated 34 anophthalmic/microphthalmic patient DNA samples (two sets of siblings included) for mutations and sequence variants in SOX2 and CHX10. Methods Conformational sensitive gel electrophoresis (CSGE) was used for the initial SOX2 and CHX10 screening of 34 affected individuals (two sets of siblings), five unaffected family members, and 80 healthy controls. Patient samples containing heteroduplexes were selected for sequence analysis. Base pair changes in SOX2 and CHX10 were confirmed by sequencing bidirectionally in patient samples. Results Two novel heterozygous mutations and two sequence variants (one known) in SOX2 were identified in this cohort. Mutation c.310 G>T (p. Glu104X), found in one patient, was in the region encoding the high mobility group (HMG) DNA-binding domain and resulted in a change from glutamic acid to a stop codon. The second mutation, noted in two affected siblings, was a single nucleotide deletion c.549delC (p. Pro184ArgfsX19) in the region encoding the activation domain, resulting in a frameshift and premature termination of the coding sequence. The shortened protein products may result in the loss of function. In addition, a novel nucleotide substitution c.*557G>A was identified in the 3′-untranslated region in one patient. The relationship between the nucleotide change and the protein function is indeterminate. A known single nucleotide polymorphism (c. *469 C>A, SNP rs11915160) was also detected in 2 of the 34 patients. Screening of CHX10 identified two synonymous sequence variants, c.471 C>T (p.Ser157Ser, rs35435463) and c.579 G>A (p. Gln193Gln, novel SNP), and one non-synonymous sequence variant, c.871 G>A (p. Asp291Asn, novel SNP). The non-synonymous polymorphism was also present in healthy controls, suggesting non-causality. 
Conclusions These results support the role of SOX2 in ocular development. Loss of SOX2 function results in severe eye malformation. CHX10 was not implicated with microphthalmia/anophthalmia in our patient cohort. PMID:18385794
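The variant names in this abstract pair a coding-sequence position with a protein position (for example c.310 with residue 104), and label changes as synonymous, non-synonymous or stop-gain. The sketch below shows how both follow from the standard genetic code. The specific codons passed in the usage lines are hypothetical, since the abstract gives only the amino acids, and the function names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered with T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_number(cds_position):
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    return (cds_position - 1) // 3 + 1


def classify_substitution(ref_codon, alt_codon):
    """Label a codon change as synonymous, nonsense or missense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    # Any other amino acid change (stop-loss is not handled separately).
    return "missense"


# Hypothetical codons consistent with the amino acid changes reported.
glu_to_stop = classify_substitution("GAG", "TAG")   # Glu -> stop
ser_to_ser = classify_substitution("TCC", "TCT")    # Ser -> Ser
asp_to_asn = classify_substitution("GAC", "AAC")    # Asp -> Asn
```

For instance, `codon_number(310)` returns 104 and `codon_number(871)` returns 291, matching the residue numbers given for p.Glu104X and p.Asp291Asn.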
Johnson, Amy R; Lao, Sai; Wang, Tongwen; Galanko, Joseph A; Zeisel, Steven H
2012-01-01
Approximately 15% of couples are affected by infertility and up to half of these cases arise from male factor infertility. Unidentified genetic aberrations such as chromosomal deletions, translocations and single nucleotide polymorphisms (SNPs) may be the underlying cause of many cases of idiopathic male infertility. Deletion of the choline dehydrogenase (Chdh) gene in mice results in decreased male fertility due to diminished sperm motility; sperm from Chdh(-/-) males have decreased ATP concentrations likely stemming from abnormal sperm mitochondrial morphology and function in these cells. Several SNPs have been identified in the human CHDH gene that may result in altered CHDH enzymatic activity. rs12676 (G233T), a non-synonymous SNP located in the CHDH coding region, is associated with increased susceptibility to dietary choline deficiency and risk of breast cancer. We now report evidence that this SNP is also associated with altered sperm motility patterns and dysmorphic mitochondrial structure in sperm. Sperm produced by men who are GT or TT for rs12676 have 40% and 73% lower ATP concentrations, respectively, in their sperm. rs12676 is associated with decreased CHDH protein in sperm and hepatocytes. A second SNP located in the coding region of IL17BR, rs1025689, is linked to altered sperm motility characteristics and changes in choline metabolite concentrations in sperm.
Johnson, Amy R.; Lao, Sai; Wang, Tongwen; Galanko, Joseph A.; Zeisel, Steven H.
2012-01-01
Approximately 15% of couples are affected by infertility and up to half of these cases arise from male factor infertility. Unidentified genetic aberrations such as chromosomal deletions, translocations and single nucleotide polymorphisms (SNPs) may be the underlying cause of many cases of idiopathic male infertility. Deletion of the choline dehydrogenase (Chdh) gene in mice results in decreased male fertility due to diminished sperm motility; sperm from Chdh−/− males have decreased ATP concentrations likely stemming from abnormal sperm mitochondrial morphology and function in these cells. Several SNPs have been identified in the human CHDH gene that may result in altered CHDH enzymatic activity. rs12676 (G233T), a non-synonymous SNP located in the CHDH coding region, is associated with increased susceptibility to dietary choline deficiency and risk of breast cancer. We now report evidence that this SNP is also associated with altered sperm motility patterns and dysmorphic mitochondrial structure in sperm. Sperm produced by men who are GT or TT for rs12676 have 40% and 73% lower ATP concentrations, respectively, in their sperm. rs12676 is associated with decreased CHDH protein in sperm and hepatocytes. A second SNP located in the coding region of IL17BR, rs1025689, is linked to altered sperm motility characteristics and changes in choline metabolite concentrations in sperm. PMID:22558321
Zhao, Nan; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry
2014-01-01
Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. 
The accurate and balanced performance of the SNP-IN tool makes it well suited to studying the rewiring of large-scale protein-protein interaction networks, and the tool can be useful for functional annotation of disease-associated SNPs. The SNP-IN tool is freely accessible as a web server at http://korkinlab.org/snpintool/. PMID:24784581
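The classifiers above are compared by weighted f-measure. As a minimal sketch, assuming the usual definition (per-class F1 averaged with weights proportional to the number of true instances of each class), the metric can be computed as follows. The labels and predictions are invented for illustration and the function name is hypothetical.

```python
from collections import Counter


def weighted_f_measure(y_true, y_pred):
    """Support-weighted mean of the per-class F1 scores."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn)
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Hypothetical 2-class example (disrupt/preserve a PPI).
truth = ["disrupt", "disrupt", "disrupt", "preserve"]
predicted = ["disrupt", "disrupt", "preserve", "preserve"]
score = weighted_f_measure(truth, predicted)
```

Weighting by class size keeps a rare class from dominating the score, which matters when, as in mutagenesis data, the classes are unlikely to be balanced.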
McClure, Matthew C; Bickhart, Derek; Null, Dan; Vanraden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B; Van Tassell, Curtis P; Sonstegard, Tad S
2014-01-01
The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array.
McClure, Matthew C.; Bickhart, Derek; Null, Dan; VanRaden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B.; Van Tassell, Curtis P.; Sonstegard, Tad S.
2014-01-01
The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array. PMID:24667746
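The abstract above turns on the difference between a synonymous and a non-synonymous SNP (a T/C change that replaces phenylalanine with serine). The following is a minimal sketch of how such a classification can be made with the standard genetic code. The codon `TTT` used in the example is a hypothetical illustration, since the abstract does not state the actual SMC2 codon, and `classify_snp` is an illustrative helper, not part of any published pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... with T, C, A, G cycling at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change inside one codon.

    codon    -- reference codon on the coding strand, e.g. "TTT"
    position -- 0-based index of the changed base within the codon
    alt_base -- the alternative base at that position
    Returns (kind, reference amino acid, alternative amino acid).
    """
    alt_codon = codon[:position] + alt_base + codon[position + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return kind, ref_aa, alt_aa


# Hypothetical phenylalanine codon: a T>C change at the second base
# gives a serine codon, while the same change at the third base is silent.
print(classify_snp("TTT", 1, "C"))
print(classify_snp("TTT", 2, "C"))
```

The same table lookup applies to any coding SNP once the reading frame and strand are known; those two inputs are assumed here rather than derived.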
Sirois, Francine; Kaefer, Nadine; Currie, Krista A; Chrétien, Michel; Nkongolo, Kabwe K; Mbikay, Majambu
2012-10-01
The PCSK1 (proprotein convertase subtilisin/kexin type 1) locus encodes proprotein convertase 1/3, an endoprotease that converts prohormones and proneuropeptides to their active forms. Spontaneous loss-of-function mutations in the coding sequence of its gene have been linked to obesity in humans. Minor alleles of two common non-synonymous single-nucleotide polymorphisms (SNPs), rs6232 (T > C, N221D) and rs6235 (C > G, S690T), have been associated with increased risk of obesity in European populations. In this study, we compared the frequencies of the rs6232 and rs6234 (G > C, Q665E) SNPs in Aboriginal and Caucasian populations of Northern Ontario. Both SNPs were less frequent in Aboriginals: the minor allele frequency of the rs6232 SNP was 0.01 in Aboriginals and 0.08 in Caucasians (P < 4 × 10⁻⁶); for the rs6234 SNP, it was 0.20 and 0.32, respectively (P < 0.001). Resequencing revealed that the rs6234 SNP variation was tightly linked to that of the rs6235 SNP, as previously reported. Most interestingly, all carriers of the rs6232 SNP variation also carried the rs6234/rs6235 SNP clustered variations, but not the reverse, suggesting the former occurred later on an allele already carrying the latter. These data indicate that, in Northern Ontario Aboriginals, the triple-variant PCSK1 allele is relatively rare and might be of lesser significance for obesity risk in this population.
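The minor allele frequencies quoted above (for example 0.01 versus 0.08 for rs6232) are proportions of allele copies, with two copies counted per person. A minimal sketch of that calculation follows. The genotype counts are hypothetical, chosen only so that the arithmetic reproduces the two reported frequencies; the abstract does not give the underlying counts.

```python
def allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Frequency of the alternative allele from diploid genotype counts.

    Each individual carries two allele copies, so heterozygotes
    contribute one alternative copy and alt homozygotes contribute two.
    """
    total_copies = 2 * (n_hom_ref + n_het + n_hom_alt)
    alt_copies = n_het + 2 * n_hom_alt
    return alt_copies / total_copies


# Hypothetical cohorts of 100 people each (not the study's data).
maf_group_a = allele_frequency(n_hom_ref=98, n_het=2, n_hom_alt=0)
maf_group_b = allele_frequency(n_hom_ref=85, n_het=14, n_hom_alt=1)
print(round(maf_group_a, 2), round(maf_group_b, 2))
```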
DoGSD: the dog and wolf genome SNP database.
Bai, Bing; Zhao, Wen-Ming; Tang, Bi-Xia; Wang, Yan-Qing; Wang, Lu; Zhang, Zhang; Yang, He-Chuan; Liu, Yan-Hu; Zhu, Jun-Wei; Irwin, David M; Wang, Guo-Dong; Zhang, Ya-Ping
2015-01-01
The rapid advancement of next-generation sequencing technology has generated a deluge of genomic data from domesticated dogs and their wild ancestor, grey wolves, which have simultaneously broadened our understanding of domestication and diseases that are shared by humans and dogs. To address the scarcity of single nucleotide polymorphism (SNP) data provided by authorized databases and to make SNP data more accessible and easier to use, we propose DoGSD (http://dogsd.big.ac.cn), the first Canidae-specific database which focuses on whole genome SNP data from domesticated dogs and grey wolves. The DoGSD is a web-based, open-access resource comprising ∼ 19 million high-quality whole-genome SNPs. In addition to the dbSNP data set (build 139), DoGSD incorporates a comprehensive collection of SNPs from two newly sequenced samples (1 wolf and 1 dog) and SNPs collected from three recent dog/wolf genetic studies (7 wolves and 68 dogs), which were analysed together using the population genetic statistic Fst. In addition, DoGSD integrates closely related information including SNP annotation, summary lists of SNPs located in genes, synonymous and non-synonymous SNPs, sampling location and breed information. All these features make DoGSD a useful resource for in-depth analysis in dog-/wolf-related studies. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sounds of silence: synonymous nucleotides as a key to biological regulation and complexity
Shabalina, Svetlana A.; Spiridonov, Nikolay A.; Kashina, Anna
2013-01-01
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. A wealth of structural and regulatory information, which mRNA carries in addition to the encoded amino acid sequence, raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. The RNA level selection is acting on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on the coding gene regions follows three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions. PMID:23293005
GESPA: classifying nsSNPs to predict disease association.
Khurana, Jay K; Reeder, Jay E; Shrimpton, Antony E; Thakar, Juilee
2015-07-25
Non-synonymous single nucleotide polymorphisms (nsSNPs) are the most common DNA sequence variation associated with disease in humans. Thus determining the clinical significance of each nsSNP is of great importance. Potential detrimental nsSNPs may be identified by genetic association studies or by functional analysis in the laboratory, both of which are expensive and time consuming. Existing computational methods lack accuracy and features to facilitate nsSNP classification for clinical use. We developed the GESPA (GEnomic Single nucleotide Polymorphism Analyzer) program to predict the pathogenicity and disease phenotype of nsSNPs. GESPA is a user-friendly software package for classifying disease association of nsSNPs. It allows flexibility in acceptable input formats and predicts the pathogenicity of a given nsSNP by assessing the conservation of amino acids in orthologs and paralogs and supplementing this information with data from medical literature. The development and testing of GESPA was performed using the humsavar, ClinVar and humvar datasets. Additionally, GESPA also predicts the disease phenotype associated with a nsSNP with high accuracy, a feature unavailable in existing software. GESPA's overall accuracy exceeds existing computational methods for predicting nsSNP pathogenicity. The usability of GESPA is enhanced by fast SQL-based cloud storage and retrieval of data. GESPA is a novel bioinformatics tool to determine the pathogenicity and phenotypes of nsSNPs. We anticipate that GESPA will become a useful clinical framework for predicting the disease association of nsSNPs. The program, executable jar file, source code, GPL 3.0 license, user guide, and test data with instructions are available at http://sourceforge.net/projects/gespa.
Polymorphism of BMP4 gene in Indian goat breeds differing in prolificacy.
Sharma, Rekha; Ahlawat, Sonika; Maitra, A; Roy, Manoranjan; Mandakmale, S; Tantia, M S
2013-12-10
Bone morphogenetic proteins (BMPs) are members of the TGF-β (transforming growth factor-beta) superfamily, of which BMP4 is the most important due to its crucial role in follicular growth and differentiation, cumulus expansion and ovulation. Reproduction is a crucial trait in goat breeding and based on the important role of BMP4 gene in reproduction it was considered as a possible candidate gene for the prolificacy of goats. The objective of the present study was to detect polymorphism in intronic, exonic and 3' un-translated regions of BMP4 gene in Indian goats. Nine different goat breeds (Barbari, Beetal, Black Bengal, Malabari, Jakhrana (Twinning>40%), Osmanabadi, Sangamneri (Twinning 20-30%), Sirohi and Ganjam (Twinning<10%)) differing in prolificacy and geographic distribution were employed for polymorphism scanning. Cattle sequence (AC_000167.1) was used to design primers for the amplification of a targeted region followed by direct DNA sequencing to identify the genetic variations. Single nucleotide polymorphisms (SNPs) were not detected in exon 3, the intronic region and the 3' flanking region. A SNP (G1534A) was identified in exon 2. It was a non-synonymous mutation resulting in an arginine to lysine change in a corresponding protein sequence. G to A transition at the 1534 locus revealed two genotypes GG and GA in the nine investigated goat breeds. The GG genotype was predominant with a genotype frequency of 0.98. The GA genotype was present in the Black Bengal as well as Jakhrana breed with a genotype frequency of 0.02. A microsatellite was identified in the 3' flanking region, only 20 nucleotides downstream from the termination site of the coding region, as a short sequence with more than nineteen continuous and repeated CA dinucleotides. Since the gene is highly evolutionarily conserved, identification of a non-synonymous SNP (G1534A) in the coding region gains further importance. 
To our knowledge, this is the first report of a mutation in the coding region of the caprine BMP4 gene. However, whether the reproductive trait of goats is associated with the BMP4 polymorphism needs to be established by association studies in more populations. © 2013 Elsevier B.V. All rights reserved.
Calvo, J H; Serrano, M; Martinez-Royo, A; Lahoz, B; Sarto, P; Ibañez-Deler, A; Folch, J; Alabart, J L
2018-06-01
The aim of this study was to characterize and identify causative SNPs in the MTNR1A gene responsible for the reproductive seasonality traits in the Rasa Aragonesa sheep breed. A total of 290 ewes (155, 84 and 51 mature, young and ewe lambs, respectively) from one flock were monitored from January to August. The following three reproductive seasonality traits were considered: the total days of anoestrus (TDA) and the progesterone cycling months (P4CM), both ovarian function seasonality traits based on blood progesterone levels, and the oestrus cycling months (OCM), based on oestrus detection, which indicates behavioural signs of oestrus. We sequenced the total coding region plus 733 and 251 bp from the promoter and 3'-UTR regions, respectively, from the gene in 268 ewes. We found 9 and 4 SNPs associated with seasonality traits in the promoter (for TDA and P4CM) and exon 2 (for the three traits), respectively. The SNPs located in the gene promoter modify the putative binding sites for various trans-acting factors. In exon 2, two synonymous SNPs affect RFLP sites, rs406779174/RsaI (for the three traits) and rs430181568/MnlI (for OCM), and they have been related to seasonal reproductive activity in previous association studies with other breeds. SNP rs400830807, which is located in the 3'-UTR, was associated with the three traits, but this did not modify the putative target sites for ovine miRNAs according to in silico predictions. Finally, the SNP rs403212791 (NW_014639035.1: g.15099004G > A), which is also associated with the three seasonality phenotypes, was the most significant SNP detected in this study and was a non-synonymous polymorphism, leading to a change from an arginine to a cysteine (R336C). Haplotype analyses confirmed the association results and showed that the effects found for the seasonality traits were caused by the SNPs located in exon 2. 
We have demonstrated that the T allele in the SNP rs403212791 in the MTNR1A gene is associated with a lower TDA and higher P4CM and OCM values in the Rasa Aragonesa breed. Copyright © 2018 Elsevier Inc. All rights reserved.
Yu, Yang; Wei, Jiankai; Zhang, Xiaojun; Liu, Jingwen; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai
2014-01-01
The application of next generation sequencing technology has greatly facilitated high throughput single nucleotide polymorphism (SNP) discovery and genotyping in genetic research. In the present study, SNPs were discovered based on two transcriptomes of Litopenaeus vannamei (L. vannamei) generated from Illumina sequencing platform HiSeq 2000. One transcriptome of L. vannamei was obtained through sequencing on the RNA from larvae at mysis stage and its reference sequence was de novo assembled. The data from another transcriptome were downloaded from NCBI and the reads of the two transcriptomes were mapped separately to the assembled reference by BWA. SNP calling was performed using SAMtools. A total of 58,717 and 36,277 SNPs with high quality were predicted from the two transcriptomes, respectively. SNP calling was also performed using the reads of two transcriptomes together, and a total of 96,040 SNPs with high quality were predicted. Among these 96,040 SNPs, 5,242 and 29,129 were predicted as non-synonymous and synonymous SNPs respectively. Characterization analysis of the predicted SNPs in L. vannamei showed that the estimated SNP frequency was 0.21% (one SNP per 476 bp) and the estimated ratio for transition to transversion was 2.0. Fifty SNPs were randomly selected for validation by Sanger sequencing after PCR amplification and 76% of SNPs were confirmed, which indicated that the SNPs predicted in this study were reliable. These SNPs will be very useful for genetic study in L. vannamei, especially for the high density linkage map construction and genome-wide association studies. PMID:24498047
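Two summary statistics in the abstract above can be checked with simple arithmetic: the SNP frequency (one SNP per 476 bp corresponds to roughly 0.21%) and the transition-to-transversion ratio. The sketch below shows both calculations. The list of reference/alternative base pairs is a hypothetical toy input, not data from the study, and the function names are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for (ref, alt) base pairs."""
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = sum(
        1 for ref, alt in snps if ref != alt and not is_transition(ref, alt)
    )
    return transitions / transversions


def snp_frequency_percent(bp_per_snp):
    """Convert 'one SNP per N bp' into a percentage of sites."""
    return 100.0 / bp_per_snp


# Toy input: four transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"), ("A", "C"), ("G", "T")]
print(ts_tv_ratio(toy_snps))
print(round(snp_frequency_percent(476), 2))
```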
Wieczorek, Stefan; Holle, Julia U; Bremer, Jan P; Wibisono, David; Moosig, Frank; Fricke, Harald; Assmann, Gunter; Harper, Lorraine; Arning, Larissa; Gross, Wolfgang L; Epplen, Joerg T
2010-05-01
There is evidence that the leptin/ghrelin system is involved in T-cell regulation and plays a role in (auto)immune disorders such as SLE, RA and ANCA-associated vasculitides (AAVs). Here, we evaluate the genetic background of this system in WG. We screened variations in the genes encoding leptin, ghrelin and their receptors, the leptin receptor (LEPR) and the growth hormone secretagogue receptor (GHSR). Three single nucleotide polymorphisms (SNPs) in each gene region were analysed in 460 German WG cases and 878 ethnically matched healthy controls. A three-SNP haplotype of GHSR was significantly associated with WG [P = 0.0067; corrected P-value (P(c)) = 0.026; odds ratio (OR) = 1.30; 95% CI 1.08, 1.57], as was one non-synonymous SNP in LEPR (Lys656Asn, P = 0.0034; P(c) = 0.013; OR = 0.72; 95% CI 0.58, 0.90). These four SNPs were re-analysed in independent cohorts of 226 German WG cases and 519 controls. While the GHSR association was not confirmed, allele frequencies of the LEPR SNP were virtually identical to those from the initial cohorts. Analysis of this SNP in the combined WG and control panels revealed a significant association of the LEPR 656Lys allele with WG (P = 0.00032; P(c) = 0.0013; OR = 0.72; 95% CI 0.60, 0.86). Remarkably, the Lys656Asn SNP showed contrasting allele distribution in two cohorts of 108 and 88 German cases diagnosed with Churg-Strauss syndrome (CSS, combined P = 0.0067; OR = 1.41; 95% CI 1.10, 1.81), whereas identical allele frequencies were revealed when comparing British WG and microscopic polyangiitis cases. While GHSR has to be further evaluated, these data provide profound evidence for an association of the LEPR Lys656Asn SNP with AAV, resulting in opposing effects in WG and CSS.
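The associations above are reported as odds ratios with 95% confidence intervals. A minimal sketch of one common way to obtain such values from a 2 × 2 table of allele counts follows, using the Wald interval on the log odds ratio. This is an assumption for illustration: the abstract does not say which method the authors used, and the counts below are hypothetical.

```python
import math


def odds_ratio_ci(case_alt, case_ref, control_alt, control_ref, z=1.96):
    """Odds ratio and Wald confidence interval from allele counts.

    The standard error of log(OR) is the square root of the sum of
    reciprocal cell counts; z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    se_log_or = math.sqrt(
        1 / case_alt + 1 / case_ref + 1 / control_alt + 1 / control_ref
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts (not taken from the study).
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 40, 60)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1, as with the reported LEPR values of 0.60 to 0.86, is what marks the association as nominally significant at the 5% level.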
Doddapaneni, Harshavardhan; Yao, Jiqiang; Lin, Hong; Walker, M Andrew; Civerolo, Edwin L
2006-01-01
Background The Gram-negative, xylem-limited phytopathogenic bacterium Xylella fastidiosa is responsible for causing economically important diseases in grapevine, citrus and many other plant species. Despite its economic impact, relatively little is known about the genomic variations among strains isolated from different hosts and their influence on the population genetics of this pathogen. With the availability of genome sequence information for four strains, it is now possible to perform genome-wide analyses to identify and categorize such DNA variations and to understand their influence on strain functional divergence. Results There are 1,579 genes and 194 non-coding homologous sequences present in the genomes of all four strains, representing a 76.2% conservation of the sequenced genome. About 60% of the X. fastidiosa unique sequences exist as tandem gene clusters of 6 or more genes. Multiple alignments identified 12,754 SNPs and 14,449 INDELs in the 1528 common genes and 20,779 SNPs and 10,075 INDELs in the 194 non-coding sequences. The average SNP frequency was 1.08 × 10⁻² per base pair of DNA and the average INDEL frequency was 2.06 × 10⁻² per base pair of DNA. On average, 60.33% of the SNPs were of the synonymous type while 39.67% were of the non-synonymous type. Mutations in the form of INDELs, primarily external INDELs, were the main type of sequence variation. The relative similarity between the strains was discussed according to the INDEL and SNP differences. The numbers of genes unique to each strain were 60 (9a5c), 54 (Dixon), 83 (Ann1) and 9 (Temecula-1). A subset of the strain-specific genes showed significant differences in terms of their codon usage and GC composition from the native genes, suggesting their xenologous origin. Tandem repeat analysis of the genomic sequences of the four strains identified associations of repeat sequences with hypothetical and phage-related functions. 
Conclusion INDELs and strain specific genes have been identified as the main source of variations among strains, with individual strains showing different rates of genome evolution. Based on these genome comparisons, it appears that the Pierce's disease strain Temecula-1 genome represents the ancestral genome of the X. fastidiosa. Results of this analysis are publicly available in the form of a web database. PMID:16948851
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Jing; Li, Yuan-Yuan; Shanghai Center for Bioinformation Technology, Shanghai 200235
2012-03-02
Highlights: (1) Proper dataset partition can improve the prediction of deleterious nsSNPs. (2) Partition according to original residue type at nsSNP is a good criterion. (3) Similar strategy is supposed promising in other machine learning problems. Abstract: Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either original or substituted amino acid type at the nsSNP site. Using support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9% depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, the dataset was also randomly divided into 20 subsets, but the corresponding accuracy was only 73.2%. Our results demonstrated that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, will improve the performance of the trained classifiers significantly, which should be valuable in developing better tools for predicting the disease-association of nsSNPs.
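The partitioning strategy described above (split the training set by the original amino acid at the nsSNP site, then fit one model per subset) can be sketched as follows. To stay self-contained, the per-subset model here is a trivial majority-class predictor standing in for the SVM used in the paper; the records, feature values and function names are all hypothetical.

```python
from collections import Counter, defaultdict


def partition_by_residue(records):
    """Group (original_residue, features, label) records by residue type."""
    subsets = defaultdict(list)
    for original_residue, features, label in records:
        subsets[original_residue].append((features, label))
    return dict(subsets)


def train_majority_model(subset):
    """Stand-in for an SVM: remember the most frequent label in the subset."""
    label_counts = Counter(label for _features, label in subset)
    return label_counts.most_common(1)[0][0]


def train_per_subset(records):
    """Fit one model per residue-specific subset (up to 20 subsets)."""
    return {
        residue: train_majority_model(subset)
        for residue, subset in partition_by_residue(records).items()
    }


def predict(models, original_residue, default="neutral"):
    """Route a query to the model trained for its original residue."""
    return models.get(original_residue, default)


# Hypothetical training records: (original residue, features, label).
toy_records = [
    ("R", [0.9], "disease"),
    ("R", [0.8], "disease"),
    ("R", [0.1], "neutral"),
    ("A", [0.2], "neutral"),
    ("A", [0.3], "neutral"),
]
models = train_per_subset(toy_records)
print(predict(models, "R"), predict(models, "A"))
```

Swapping `train_majority_model` for a real classifier that uses the feature vectors leaves the routing logic unchanged, which is the point of the partitioning design.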
Johansen, Morten Bo; Izarzugaza, Jose M. G.; Brunak, Søren; Petersen, Thomas Nordahl; Gupta, Ramneek
2013-01-01
We have developed a sequence conservation-based artificial neural network predictor called NetDiseaseSNP which classifies nsSNPs as disease-causing or neutral. Our method uses the excellent alignment generation algorithm of SIFT to identify related sequences and a combination of 31 features assessing sequence conservation and the predicted surface accessibility to produce a single score which can be used to rank nsSNPs based on their potential to cause disease. NetDiseaseSNP classifies successfully disease-causing and neutral mutations. In addition, we show that NetDiseaseSNP discriminates cancer driver and passenger mutations satisfactorily. Our method outperforms other state-of-the-art methods on several disease/neutral datasets as well as on cancer driver/passenger mutation datasets and can thus be used to pinpoint and prioritize plausible disease candidates among nsSNPs for further investigation. NetDiseaseSNP is publicly available as an online tool as well as a web service: http://www.cbs.dtu.dk/services/NetDiseaseSNP PMID:23935863
Ghosh, Mrinmoy; Sodhi, Simrinder Singh; Sharma, Neelesh; Mongre, Raj Kumar; Kim, Nameun; Singh, Amit Kumar; Lee, Sung Jin; Kim, Dae Cheol; Kim, Sung Woo; Lee, Hak Kyo; Song, Ki-Duk; Jeong, Dong Kee
2016-02-04
This study was performed to identify the non-synonymous polymorphisms in the myosin heavy chain 1 gene (MYH1) associated with skeletal muscle development in the economically important Jeju Native Pig (JNP) and Berkshire breeds. Herein, we present an in silico analysis, with a focus on (a) in silico approaches to predict the functional effect of non-synonymous SNP (nsSNP) in MYH1 on growth, and (b) molecular docking and dynamic simulation of MYH1 to predict the effects of those nsSNP on protein-protein association. The NextGENe (V 2.3.4.) tool was used to identify the variants in MYH1 from JNP and Berkshire using RNA-seq. Gene ontology analysis of MYH1 revealed significant association with muscle contraction and muscle organ development. The 95 % confidence intervals clearly indicate that the mRNA expression of MYH1 is significantly higher in the Berkshire longissimus dorsi muscle samples than in the JNP breed. In a concordant in silico analysis of MYH1, open-source software tools identified 4 potential nsSNP (L884T, K972C, N981G, and Q1285C) in JNP and 1 nsSNP (H973G) in Berkshire pigs. Moreover, protein-protein interactions were studied to investigate the effect of MYH1 mutations on association with hub proteins, and MYH1 was found to be closely associated with the protein myosin light chain, phosphorylatable, fast skeletal muscle (MYLPF). The results of molecular docking studies on MYH1 (native and 4 mutants) and MYLPF demonstrated that the native complex showed higher electrostatic energy (-466.5 kcal mol⁻¹), van der Waals energy (-87.3 kcal mol⁻¹), and interaction energy (-835.7 kcal mol⁻¹) than the mutant complexes. Furthermore, the molecular dynamic simulation revealed that the native complex yielded a higher root-mean-square deviation (0.2-0.55 nm) and lower root-mean-square fluctuation (approximately 0.08-0.3 nm) as compared to the mutant complexes. 
The results suggest that the variants at L884T, K972C, N981G, and Q1285C in MYH1 in JNP might represent a cause for the poor growth performance for this breed. This study is a pioneering in-depth in silico analysis of polymorphic MYH1 and will serve as a valuable resource for further targeted molecular diagnosis and population-based studies conducted for improving the growth performance of JNP.
El-Magd, Mohammed Abu; Abo-Al-Ela, Haitham G; El-Nahas, Abeer; Saleh, Ayman A; Mansour, Ali A
2014-05-01
Insulin-like growth factor 2 receptor (IGF2R) is responsible for degradation of the muscle development initiator, IGF2, and thus it can be used as a marker for selection strategies in farm animals. The aim of this study was to search for polymorphisms in three coding loci of IGF2R, and to analyze their effect on the growth traits and on the expression levels of IGF2R and IGF2 genes in the gluteus medius muscle of Egyptian buffaloes. A novel A266C SNP was detected in the coding sequences of the third IGF2R locus (at nucleotide number 51 of exon 23) among Egyptian water buffaloes. This SNP was a non-synonymous mutation and led to replacement of the amino acid Y (tyrosine) by D (aspartic acid). Three different single-strand conformation polymorphism patterns were observed in the third IGF2R locus: AA, AC, and CC with frequencies of 0.555, 0.195, and 0.250, respectively. Statistical analysis showed that the homozygous AA genotype was significantly associated with average daily gain (ADG), compared with the AC and CC genotypes, from birth to 9 mo of age. Expression analysis showed that the A266C SNP was correlated with IGF2, but not with IGF2R, mRNA levels in the gluteus medius muscle of Egyptian buffaloes. The highest IGF2 mRNA level was estimated in the muscle of animals with the AA homozygous genotype as compared to the AC heterozygotes and CC homozygotes. We conclude that the A266C SNP at nucleotide number 51 of exon 23 of the IGF2R gene is associated with the ADG during the early stages of life (from birth to 9 mo of age) and this effect is accompanied by, and may be caused by, increased expression levels of the IGF2 gene. Copyright © 2014 Elsevier B.V. All rights reserved.
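The genotype frequencies reported above (AA 0.555, AC 0.195, CC 0.250) determine the allele frequencies directly, and from those one can compute the genotype frequencies that Hardy-Weinberg proportions would predict. The sketch below does only that arithmetic; the comparison with Hardy-Weinberg expectations is an illustration added here and is not an analysis reported in the abstract.

```python
def allele_frequencies(f_aa, f_ac, f_cc):
    """Allele frequencies from genotype frequencies at a biallelic site.

    Homozygotes contribute their whole frequency to one allele and
    heterozygotes contribute half of theirs to each allele.
    """
    total = f_aa + f_ac + f_cc
    p_a = (f_aa + f_ac / 2) / total
    return p_a, 1.0 - p_a


def hardy_weinberg_expected(p_a):
    """Expected AA, AC, CC frequencies under Hardy-Weinberg proportions."""
    q_c = 1.0 - p_a
    return p_a ** 2, 2 * p_a * q_c, q_c ** 2


# Genotype frequencies as reported in the abstract.
p_a, q_c = allele_frequencies(0.555, 0.195, 0.250)
expected = hardy_weinberg_expected(p_a)
print(round(p_a, 4), round(q_c, 4))
print([round(value, 4) for value in expected])
```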
Martínez-García, Pedro J; Fresnedo-Ramírez, Jonathan; Parfitt, Dan E; Gradziel, Thomas M; Crisosto, Carlos H
2013-01-01
Single nucleotide polymorphisms (SNPs) are a fundamental source of genomic variation. Large SNP panels have been developed for Prunus species. Fruit quality traits are essential peach breeding program objectives since they determine consumer acceptance, fruit consumption, industry trends and cultivar adoption. For many cultivars, these traits are negatively impacted by cold storage, used to extend fruit market life. The major symptoms of chilling injury are lack of flavor, off flavor, mealiness, flesh browning, and flesh bleeding. A set of 1,109 SNPs was mapped previously and 67 were linked with these complex traits. The prediction of the effects associated with these SNPs on downstream products from the 'peach v1.0' genome sequence was carried out. A total of 2,163 effects were detected, 282 effects (non-synonymous, synonymous or stop codon gained) were located in exonic regions (13.04 %) and 294 placed in intronic regions (13.59 %). An extended list of genes and proteins that could be related to these traits was developed. Two SNP markers that explain a high percentage of the observed phenotypic variance, UCD_SNP_1084 and UCD_SNP_46, are associated with zinc finger (C3HC4-type RING finger) family protein and AOX1A (alternative oxidase 1a) protein groups, respectively. In addition, phenotypic variation suggests that the observed polymorphism for SNP UCD_SNP_1084 [A/G] mutation could be a candidate quantitative trait nucleotide affecting quantitative trait loci for mealiness. The interaction and expression of affected proteins could explain the variation observed in each individual and facilitate understanding of gene regulatory networks for fruit quality traits in peach.
Egawa, Jun; Watanabe, Yuichiro; Shibuya, Masako; Endo, Taro; Sugimoto, Atsunori; Igeta, Hirofumi; Nunokawa, Ayako; Inoue, Emiko; Someya, Toshiyuki
2015-03-01
The oxytocin receptor (OXTR) is implicated in the pathophysiology of autism spectrum disorder (ASD). A recent study found a rare non-synonymous OXTR gene variation, rs35062132 (R376G), associated with ASD in a Japanese population. In order to investigate the association between rare non-synonymous OXTR variations and ASD, we resequenced OXTR and performed association analysis with ASD in a Japanese population. We resequenced the OXTR coding region in 213 ASD patients. Rare non-synonymous OXTR variations detected by resequencing were genotyped in 213 patients and 667 controls. We detected three rare non-synonymous variations: rs35062132 (R376G/C), rs151257822 (G334D), and g.8809426G>T (R150S). However, there was no significant association between these rare non-synonymous variations and ASD. Our present study does not support the contribution of rare non-synonymous OXTR variations to ASD susceptibility in the Japanese population. © 2014 The Authors. Psychiatry and Clinical Neurosciences © 2014 Japanese Society of Psychiatry and Neurology.
Polymorphisms of EpCAM gene and prognosis for non-small-cell lung cancer in Han Chinese
Yang, Yuefan; Fei, Fei; Song, Yang; Li, Xiaofei; Zhang, Zhipei; Fei, Zhou; Su, Haichuan; Wan, Shaogui
2014-01-01
The epithelial cell adhesion molecule (EpCAM) is overexpressed in a wide variety of human cancers and is associated with patient prognosis, including those with lung cancer. However, the association of single nucleotide polymorphisms (SNPs) in the EpCAM gene with the prognosis for non-small-cell lung cancer (NSCLC) patients has never been investigated. We evaluated the association between two SNPs, rs1126497 and rs1421, in the EpCAM gene and clinical outcomes in a Chinese cohort of 506 NSCLC patients. The SNPs were genotyped using the Sequenom iPLEX genotyping system. Multivariate Cox proportional hazards model and Kaplan–Meier curves were used to assess the association of EpCAM gene genotypes with the prognosis of NSCLC. We found that the non-synonymous SNP rs1126497 was significantly associated with survival. Compared with the CC genotype, the CT+TT genotype was a risk factor for both death (hazard ratio, 1.40; 95% confidence interval [CI], 1.02–1.94; P = 0.040) and recurrence (hazard ratio, 1.34; 95% CI, 1.02–1.77; P = 0.039). However, the SNP rs1421 did not show any significant effect on patient prognosis. Instead, the AG+GG genotype in rs1421 was significantly associated with early T stages (T1/T2) when compared with the AA genotype (odds ratio for late stage = 0.65; 95% CI, 0.44–0.96, P = 0.029). Further stratified analysis showed notable modulating effects of clinical characteristics on the associations between variant genotypes of rs1126497 and NSCLC outcomes. In conclusion, our study indicated that the non-synonymous SNP rs1126497 may be a potential prognostic marker for NSCLC patients. PMID:24304228
Battilana, Juri; Emanuelli, Francesco; Gambino, Giorgio; Gribaudo, Ivana; Gasperi, Flavia; Boss, Paul K.; Grando, Maria Stella
2011-01-01
Grape berries of Muscat cultivars (Vitis vinifera L.) contain high levels of monoterpenols and exhibit a distinct aroma related to this composition of volatiles. A structural gene of the plastidial methyl-erythritol-phosphate (MEP) pathway, 1-deoxy-D-xylulose 5-phosphate synthase (VvDXS), was recently suggested as a candidate gene for this trait, having been co-localized with a major quantitative trait locus for linalool, nerol, and geraniol concentrations in berries. In addition, a structured association study discovered a putative causal single nucleotide polymorphism (SNP) responsible for the substitution of a lysine with an asparagine at position 284 of the VvDXS protein, and this SNP was significantly associated with Muscat-flavoured varieties. The significance of this nucleotide difference was investigated by comparing the monoterpene profiles with the expression of VvDXS alleles throughout berry development in Moscato Bianco, a cultivar heterozygous for the SNP mutation. Although correlation was detected between the VvDXS transcript profile and the accumulation of free monoterpenol odorants, the modulation of VvDXS expression during berry development appears to be independent of nucleotide variation in the coding sequence. In order to assess how the non-synonymous mutation may enhance Muscat flavour, an in vitro characterization of enzyme isoforms was performed followed by in vivo overexpression of each VvDXS allele in tobacco. The results showed that the amino acid non-neutral substitution influences the enzyme kinetics by increasing the catalytic efficiency and also dramatically affects monoterpene levels in transgenic lines. These findings confirm a functional effect of the VvDXS gene polymorphism and may pave the way for metabolic engineering of terpenoid contents in grapevine. PMID:21868399
Al-Tobasei, Rafet; Ali, Ali; Leeds, Timothy D; Liu, Sixin; Palti, Yniv; Kenney, Brett; Salem, Mohamed
2017-08-07
Coding/functional SNPs change the biological function of a gene and, therefore, could serve as "large-effect" genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic-imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait). GATK detected 59,112 putative SNPs; of these SNPs, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; and of them, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) 86.7-93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosome 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways. These results demonstrate utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. 
The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
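The thresholds quoted in the abstract above (>2.0 as an amplification, <0.5 as loss of heterozygosity) and the overlap between the GATK and SAMtools call sets can be illustrated with a minimal Python sketch. The function names and the definition of the ratio as one pool's alternate-allele fraction divided by the other's are assumptions made for illustration; the abstract does not specify how the authors computed the ratio.

```python
def classify_imbalance(alt_frac_high, alt_frac_low):
    """Classify a SNP by the ratio of alternate-allele fractions in two pools.

    Assumption: ratio = fraction in high-ranked pool / fraction in low-ranked pool.
    Thresholds (>2.0, <0.5) are the ones quoted in the abstract.
    """
    if alt_frac_low == 0:
        return "undefined"  # ratio cannot be computed
    ratio = alt_frac_high / alt_frac_low
    if ratio > 2.0:
        return "amplification"
    if ratio < 0.5:
        return "loss of heterozygosity"
    return "balanced"


def union_size(n_a, n_b, n_shared):
    """Inclusion-exclusion: number of distinct items across two overlapping sets."""
    return n_a + n_b - n_shared


# Hypothetical allele fractions, for illustration only
print(classify_imbalance(0.6, 0.2))
print(classify_imbalance(0.1, 0.4))

# Counts taken from the abstract: 4798 (GATK), 4962 (SAMtools), 1829 shared
print(union_size(4798, 4962, 1829))
```

With the counts from the abstract, inclusion-exclusion gives 7931, within one of the 7930 non-redundant SNPs reported; the abstract does not explain the small difference.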
Karambataki, Maria; Malousi, Andigoni; Tzimagiorgis, Georgios; Haitoglou, Constantinos; Fragou, Aikaterini; Georgiou, Elisavet; Papadopoulou, Foteini; Krassas, Gerasimos E.; Kouidou, Sofia
2017-01-01
Coding synonymous single nucleotide polymorphisms (SNPs) have attracted little attention until recently. However, such SNPs located in epigenetic, CpG sites modifying exonic splicing enhancers (ESEs) can be informative with regards to the recently verified association of intragenic methylation and splicing. The present study describes the association of type 2 diabetes (T2D) with the exonic, synonymous, epigenetic SNPs, rs3749166 in calpain 10 (CAPN10) glucose transporter (GLUT4) translocator and rs5404 in solute carrier family 2, member 2 (SLC2A2), also termed GLUT2, which, according to prior bioinformatic analysis, strongly modify the splicing potential of glucose transport-associated genes. Previous association studies reveal that only rs5404 exhibits a strong negative T2D association, while data on the CAPN10 polymorphism are contradictory. In the present study DNA from blood samples of 99 Greek non-diabetic control subjects and 71 T2D patients was analyzed. In addition, relevant publicly available cases (40) resulting from examination of 110 Personal Genome Project data files were analyzed. The frequency of the rs3749166 A allele, was similar in the patients and non-diabetic control subjects. However, AG heterozygotes were more frequent among patients (73.24% for Greek patients and 54.55% for corresponding non-diabetic control subjects; P=0.0262; total cases, 52.99 and 75.00%, respectively; P=0.0039). The rs5404 T allele was only observed in CT heterozygotes (Greek non-diabetic control subjects, 39.39% and Greek patients, 22.54%; P=0.0205; total cases, 34.69 and 21.28%, respectively; P=0.0258). Notably, only one genotype, heterozygous AG/CC, was T2D-associated (Greek non-diabetic control subjects, 29.29% and Greek patients, 56.33%; P=0.004; total cases, 32.84 and 56.58%, respectively; P=0.0008). Furthermore, AG/CC was strongly associated with very high (≥8.5%) glycosylated plasma hemoglobin levels among patients (P=0.0002 for all cases). 
These results reveal the complex heterozygotic SNP association with T2D, and indicate possible synergies of these epigenetic, splicing-regulatory, synonymous SNPs, which modify the splicing potential of two alternative glucose transport-associated genes. PMID:28357066
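The genotype-frequency comparison in the abstract above can be sketched as a 2x2 test of independence. The counts below are back-calculated from the reported percentages (73.24% of 71 patients is 52 AG heterozygotes; 54.55% of 99 controls is 54). The abstract does not say which test the authors used, so this Pearson chi-square without continuity correction is an assumption and its P value need not match the reported one.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided p-value)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: patients, controls; columns: AG heterozygotes, other genotypes
patients_ag, patients_other = 52, 71 - 52
controls_ag, controls_other = 54, 99 - 54

chi2, p = chi_square_2x2(patients_ag, patients_other, controls_ag, controls_other)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```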
Kathrani, Aarti; House, Arthur; Catchpole, Brian; Murphy, Angela; German, Alex; Werling, Dirk; Allenspach, Karin
2010-01-01
Inflammatory bowel disease (IBD) is considered to be the most common cause of vomiting and diarrhoea in dogs, and the German shepherd dog (GSD) is particularly susceptible. The exact aetiology of IBD is unknown; however, associations have been identified between specific single-nucleotide polymorphisms (SNPs) in Toll-like receptors (TLRs) and human IBD. To date, no genetic studies have been undertaken in canine IBD. The aim of this study was to investigate whether polymorphisms in canine TLR 2, 4 and 5 genes are associated with IBD in GSDs. Mutational analysis of TLR2, TLR4 and TLR5 was performed in 10 unrelated GSDs with IBD. Four non-synonymous SNPs (T23C, G1039A, A1571T and G1807A) were identified in the TLR4 gene, and three non-synonymous SNPs (G22A, C100T and T1844C) were identified in the TLR5 gene. The non-synonymous SNPs identified in TLR4 and TLR5 were evaluated further in a case-control study using a SNaPSHOT multiplex reaction. Sequencing information from 55 unrelated GSDs with IBD was compared to a control group consisting of 61 unrelated GSDs. The G22A SNP in TLR5 was significantly associated with IBD in GSDs, whereas the remaining two TLR5 SNPs (C100T and T1844C) were found to be significantly protective for IBD. Furthermore, the two SNPs in TLR4 (A1571T and G1807A) were in complete linkage disequilibrium, and were also significantly associated with IBD. The TLR5 risk haplotype (ACC) without the two associated TLR4 SNP alleles was significantly associated with IBD; however, the presence of the two TLR4 SNP risk alleles without the TLR5 risk haplotype was not statistically associated with IBD. Our study suggests that the three TLR5 SNPs and two TLR4 SNPs (A1571T and G1807A) could play a role in the pathogenesis of IBD in GSDs. Further studies are required to confirm the functional importance of these polymorphisms in the pathogenesis of this disease. PMID:21203467
Genome Sequencing and Comparative Genomics of the Broad Host-Range Pathogen Rhizoctonia solani AG8
Hane, James K.; Anderson, Jonathan P.; Williams, Angela H.; Sperschneider, Jana; Singh, Karam B.
2014-01-01
Rhizoctonia solani is a soil-borne basidiomycete fungus with a necrotrophic lifestyle which is classified into fourteen reproductively incompatible anastomosis groups (AGs). One of these, AG8, is a devastating pathogen causing bare patch of cereals, brassicas and legumes. R. solani is a multinucleate heterokaryon containing significant heterozygosity within a single cell. This complexity posed significant challenges for the assembly of its genome. We present a high quality genome assembly of R. solani AG8 and a manually curated set of 13,964 genes supported by RNA-seq. The AG8 genome assembly used novel methods to produce a haploid representation of its heterokaryotic state. The whole-genomes of AG8, the rice pathogen AG1-IA and the potato pathogen AG3 were observed to be syntenic and co-linear. Genes and functions putatively relevant to pathogenicity were highlighted by comparing AG8 to known pathogenicity genes, orthology databases spanning 197 phytopathogenic taxa and AG1-IA. We also observed SNP-level “hypermutation” of CpG dinucleotides to TpG between AG8 nuclei, with similarities to repeat-induced point mutation (RIP). Interestingly, gene-coding regions were widely affected along with repetitive DNA, which has not been previously observed for RIP in mononuclear fungi of the Pezizomycotina. The rate of heterozygous SNP mutations within this single isolate of AG8 was observed to be higher than SNP mutation rates observed across populations of most fungal species compared. Comparative analyses were combined to predict biological processes relevant to AG8 and 308 proteins with effector-like characteristics, forming a valuable resource for further study of this pathosystem. Predicted effector-like proteins had elevated levels of non-synonymous point mutations relative to synonymous mutations (dN/dS), suggesting that they may be under diversifying selection pressures. In addition, the distant relationship to sequenced necrotrophs of the Ascomycota suggests the R. solani genome sequence may prove to be a useful resource in future comparative analysis of plant pathogens. PMID:24810276
Jacobo, Sarah Melissa P; Deangelis, Margaret M; Kim, Ivana K; Kazlauskas, Andrius
2013-05-01
Synonymous single nucleotide polymorphisms (SNPs) within a transcript's coding region produce no change in the amino acid sequence of the protein product and are therefore intuitively assumed to have a neutral effect on protein function. We report that two common variants of high-temperature requirement A1 (HTRA1) that increase the inherited risk of neovascular age-related macular degeneration (NvAMD) harbor synonymous SNPs within exon 1 of HTRA1 that convert common codons for Ala34 and Gly36 to less frequently used codons. The frequent-to-rare codon conversion reduced the mRNA translation rate and appeared to compromise HtrA1's conformation and function. The protein product generated from the SNP-containing cDNA displayed enhanced susceptibility to proteolysis and a reduced affinity for an anti-HtrA1 antibody. The NvAMD-associated synonymous polymorphisms lie within HtrA1's putative insulin-like growth factor 1 (IGF-1) binding domain. They reduced HtrA1's abilities to associate with IGF-1 and to ameliorate IGF-1-stimulated signaling events and cellular responses. These observations highlight the relevance of synonymous codon usage to protein function and implicate homeostatic protein quality control mechanisms that may go awry in NvAMD.
SNPdbe: constructing an nsSNP functional impacts database.
Schaefer, Christian; Meier, Alice; Rost, Burkhard; Bromberg, Yana
2012-02-15
Many existing databases annotate experimentally characterized single nucleotide polymorphisms (SNPs). Each non-synonymous SNP (nsSNP) changes one amino acid in the gene product (single amino acid substitution;SAAS). This change can either affect protein function or be neutral in that respect. Most polymorphisms lack experimental annotation of their functional impact. Here, we introduce SNPdbe-SNP database of effects, with predictions of computationally annotated functional impacts of SNPs. Database entries represent nsSNPs in dbSNP and 1000 Genomes collection, as well as variants from UniProt and PMD. SAASs come from >2600 organisms; 'human' being the most prevalent. The impact of each SAAS on protein function is predicted using the SNAP and SIFT algorithms and augmented with experimentally derived function/structure information and disease associations from PMD, OMIM and UniProt. SNPdbe is consistently updated and easily augmented with new sources of information. The database is available as an MySQL dump and via a web front end that allows searches with any combination of organism names, sequences and mutation IDs. http://www.rostlab.org/services/snpdbe.
Profiling deleterious non-synonymous SNPs of smoker's gene CYP1A1.
Ramesh, A Sai; Khan, Imran; Farhan, Md; Thiagarajan, Padma
2013-01-01
CYP1A1 gene belongs to the cytochrome P450 family and is known better as smokers' gene due to its hyperactivation as a consequence of long term smoking. The expression of CYP1A1 induces polycyclic aromatic hydrocarbon production in the lungs, which when over expressed, is known to cause smoking related diseases, such as cardiovascular pathologies, cancer, and diabetes. Single nucleotide polymorphisms (SNPs) are the simplest form of genetic variations that occur at a higher frequency, and are denoted as synonymous and non-synonymous SNPs on the basis of their effects on the amino acids. This study adopts a systematic in silico approach to predict the deleterious SNPs that are associated with disease conditions. It is inferred that four SNPs are highly deleterious, among which the SNP with rs17861094 is commonly predicted to be harmful by all tools. Hydrophobic (isoleucine) to hydrophilic (serine) amino acid variation was observed in the candidate gene. Hence, this investigation aims to characterize a candidate gene from 159 SNPs of CYP1A1.
Gan, W; Song, Q; Zhang, N N; Xiong, X P; Wang, D M C; Li, L
2015-06-18
The fat mass and obesity-associated gene (FTO) is an excellent candidate gene that affects energy metabolism. Single nucleotide polymorphisms (SNPs) in FTO are associated with carcass and meat quality traits in pigs, cattle, and rabbits. The aim of this study was to investigate the association between novel SNPs in the FTO coding region and carcass and meat quality traits in 95 crossbred ducks, using DNA sequencing. We found two transitions G/A (SNP 387 and 473) within exon 3. SNP 387 was a synonymous mutation, whereas SNP 473 was a missense mutation. Association analysis suggested that SNP g.387G>A was significantly associated with all of the carcass traits measured, the intramuscular fat content (IMF), cooking yield (CY), pH values 45 min after slaughter (pH45m), drip losses from the breast muscle, and the leg muscle (P < 0.05). For SNP g.473G>A, the genotype AA exhibited greater leg muscle weight than the genotypes GG or AG (P < 0.05). The D value suggested that the two SNPs exhibited strong linkage disequilibrium. Three haplotypes (G1G2, G1A2, and A1A2) were significantly associated with IMF, CY, the a* value, and all of the carcass traits measured (P < 0.05). The results suggest that FTO is a candidate locus that affects carcass and meat quality traits in ducks.
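The linkage disequilibrium statement in the abstract above can be made concrete with a short sketch of the standard D, D' and r² statistics for two biallelic loci. The haplotype frequencies used here are hypothetical; the abstract reports only that three haplotypes (G1G2, G1A2 and A1A2) were observed, not their frequencies.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    variance = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / variance if variance > 0 else 0.0
    return d, d_prime, r2


# Hypothetical haplotype frequencies: G1G2 = 0.5, G1A2 = 0.2, A1A2 = 0.3, A1G2 = 0.0
freq_a1a2 = 0.3
freq_a1 = 0.3          # A allele at SNP 387 (A1A2 + A1G2)
freq_a2 = 0.2 + 0.3    # A allele at SNP 473 (G1A2 + A1A2)

d, d_prime, r2 = ld_stats(freq_a1a2, freq_a1, freq_a2)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

When only three of the four possible haplotypes are present, as in this illustrative input, |D'| equals 1 regardless of the exact frequencies, while r² can still be well below 1.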
Is α‐T catenin (VR22) an Alzheimer's disease risk gene?
Bertram, Lars; Mullin, Kristina; Parkinson, Michele; Hsiao, Monica; Moscarillo, Thomas J; Wagner, Steven L; Becker, K David; Velicelebi, Gonul; Blacker, Deborah; Tanzi, Rudolph E
2007-01-01
Background Recently, conflicting reports have been published on the potential role of genetic variants in the α‐T catenin gene (VR22; CTNNA3) on the risk for Alzheimer's disease. In these papers, evidence for association is mostly observed in multiplex families with Alzheimer's disease, whereas case–control samples of sporadic Alzheimer's disease are predominantly negative. Methods After sequencing VR22 in multiplex families with Alzheimer's disease linked to chromosome 10q21, we identified a novel non‐synonymous (Ser596Asn; rs4548513) single nucleotide polymorphism (SNP). This and four non‐coding SNPs were assessed in two independent samples of families with Alzheimer's disease, one with 1439 subjects from 437 multiplex families with Alzheimer's disease and the other with 489 subjects from 217 discordant sibships. Results A weak association with the Ser596Asn SNP in the multiplex sample, predominantly in families with late‐onset Alzheimer's disease (p = 0.02), was observed. However, this association does not seem to contribute substantially to the chromosome 10 Alzheimer's disease linkage signal that we and others have reported previously. No evidence was found of association with any of the four additional SNPs tested in the multiplex families with Alzheimer's disease. Finally, the Ser596Asn change was not associated with the risk for Alzheimer's disease in the independent discordant sibship sample. Conclusions This is the first study to report evidence of an association between a potentially functional, non‐synonymous SNP in VR22 and the risk for Alzheimer's disease. As the underlying effects are probably small, and are only seen in families with multiple affected members, the population‐wide significance of this finding remains to be determined. PMID:17209133
Screening for susceptibility genes in hereditary non-polyposis colorectal cancer.
Yu, Li; Yin, Bo; Qu, Kaiying; Li, Jingjing; Jin, Qiao; Liu, Ling; Liu, Chunlan; Zhu, Yuxing; Wang, Qi; Peng, Xiaowei; Zhou, Jianda; Cao, Peiguo; Cao, Ke
2018-06-01
In the present study, hereditary non-polyposis colorectal cancer (HNPCC) susceptibility genes were screened for using whole exome sequencing in 3 HNPCC patients from 1 family and using single nucleotide polymorphism (SNP) genotyping assays in 96 other colorectal cancer and control samples. Peripheral blood was obtained from 3 HNPCC patients from 1 family; the proband and the proband's brother and cousin. High-throughput sequencing was performed using whole exome capture technology. Sequences were aligned against the HAPMAP, dbSNP130 and 1,000 Genome Project databases. Reported common variations and synonymous mutations were filtered out. Non-synonymous single nucleotide variants in the 3 HNPCC patients were integrated and the candidate genes were identified. Finally, SNP genotyping was performed for the genes in 96 peripheral blood samples. In total, 60.4 Gb of data was retrieved from the 3 HNPCC patients using whole exome capture technology. Subsequently, according to certain screening criteria, 15 candidate genes were identified. Among the 96 samples that had been SNP genotyped, 92 were successfully genotyped for 15 gene loci, while genotyping for HTRA1 failed in 4 sporadic colorectal cancer patient samples. In 12 control subjects and 81 sporadic colorectal cancer patients, genotypes at 13 loci were wild-type, namely DDX20, ZFYVE26, PIK3R3, SLC26A8, ZEB2, TP53INP1, SLC11A1, LRBA, CEBPZ, ETAA1, SEMA3G, IFRD2 and FAT1 . The CEP290 genotype was mutant in 1 sporadic colorectal cancer patient and was wild-type in all other subjects. A total of 5 of the 12 control subjects and 30 of the 81 sporadic colorectal cancer patients had a mutant HTRA1 genotype. In all 3 HNPCC patients, the same mutant genotypes were identified at all 15 gene loci. Overall, 13 potential susceptibility genes for HNPCC were identified, namely DDX20, ZFYVE26, PIK3R3, SLC26A8, ZEB2, TP53INP1, SLC11A1, LRBA, CEBPZ, ETAA1, SEMA3G, IFRD2 and FAT1 .
Bester-Van Der Merwe, Aletta; Blaauw, Sonja; Du Plessis, Jana; Roodt-Wilding, Rouvay
2013-09-23
Haliotis midae is one of the most valuable commercial abalone species in the world, but is highly vulnerable, due to exploitation, habitat destruction and predation. In order to preserve wild and cultured stocks, genetic management and improvement of the species has become crucial. Fundamental to this is the availability and employment of molecular markers, such as microsatellites and single nucleotide polymorphisms (SNPs). Transcriptome sequences generated through sequencing-by-synthesis technology were utilized for the in vitro and in silico identification of 505 putative SNPs from a total of 316 selected contigs. A subset of 234 SNPs were further validated and characterized in wild and cultured abalone using two Illumina GoldenGate genotyping assays. Combined with VeraCode technology, this genotyping platform yielded a 65%-69% conversion rate (percentage polymorphic markers) with a global genotyping success rate of 76%-85% and provided a viable means for validating SNP markers in a non-model species. The utility of 31 of the validated SNPs in population structure analysis was confirmed, while a large number of SNPs (174) were shown to be informative and are, thus, good candidates for linkage map construction. The non-synonymous SNPs (50) located in coding regions of genes that showed similarities with known proteins will also be useful for genetic applications, such as the marker-assisted selection of genes of relevance to abalone aquaculture.
Standardization of PCR-RFLP analysis of nsSNP rs1468384 of NPC1L1 gene
Balgir, Praveen P.; Khanna, Divya; Kaur, Gurlovleen
2008-01-01
Niemann-Pick C1-like 1 (NPC1L1) protein is a newly identified sterol influx transporter located at the apical membrane of the enterocyte, which may actively facilitate the uptake of cholesterol by promoting the passage of sterols across the brush border membrane of the enterocyte. It affects intestinal cholesterol absorption and intracellular transport and as such is an integral part of the complex process of cholesterol homeostasis. The study of population data for the distribution of single nucleotide polymorphisms (SNPs) of NPC1L1 has led to the identification of six non-synonymous single nucleotide polymorphisms (nsSNPs). The in vitro analysis using the software MuPro and StructureSNP shows that nsSNP M510I (rs1468384), which involves an A→G base pair change, leads to a decrease in the stability of the protein. A reproducible and cost-effective PCR-RFLP based assay was developed to screen for the SNP among population data. This SNP has been studied in Caucasian, Asian, and African American populations. To date, no data are available on the Indian population. The distribution of the M510I NPC1L1 genotype was estimated in the North Western Indian population as a test case. The allele distribution in the Indian population differs significantly from that of other populations. The methodology thus proved to be robust enough to bring out these differences. PMID:20300301
Chen, N B; Ma, Y; Yang, T; Lin, F; Fu, W W; Xu, Y J; Li, F; Li, J Y; Gao, S X
2015-08-01
Angiopoietin-like protein 3 (ANGPTL3) is a secreted protein that regulates lipid, glucose and energy metabolism. This study was conducted to better understand the effect of ANGPTL3 on important economic traits in cattle. First, transcript profiles for ANGPTL3 were measured in nine different Jiaxian cattle tissues. Second, polymorphisms were identified in the complete coding region and promoter region of the bovine ANGPTL3 gene in 707 cattle samples. Finally, an association study was carried out utilizing these single nucleotide polymorphisms (SNPs) to determine the effect of these SNPs on the growth and meat quality traits. Quantitative real-time PCR analysis showed that ANGPTL3 was mainly expressed in the liver. The promoter of the bovine ANGPTL3 contained several putative transcription factor binding sites (SF1, HNF-1, LXRα, NFκβ, HNF-3 and C/EBP). In total, four SNPs of the bovine ANGPTL3 gene were identified by direct sequencing. SNP1 (rs469906272: g.-38T>C) was identified in the promoter, SNP2 (rs451104723:g.104A>T) and SNP3 (rs482516226: g.509A>G) were identified in exon 1, and SNP4 (rs477165942: g.8661T>C) was identified in exon 6. Changes in predicted protein structures due to non-synonymous SNPs were analyzed. Haplotype frequencies and linkage disequilibrium were also investigated. Analysis of four SNPs in cattle from different native Chinese breeds (Nanyang (NY) and Jiaxian (JX)) and commercial breeds (Angus (AG), Hereford (HF), Limousin (LM), Luxi (LX), Simmental (ST) and Jinnan (JN)) revealed a significant association with growth traits (including: BW and hipbone width) and meat quality traits (including: Warner-Bratzler shear force and ribeye area). Therefore, implementation of these four mutations in selection indices in the beef industry may be beneficial in selecting individuals with superior growth and meat quality traits.
Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng
2005-09-10
Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.
Duellman, Tyler; Warren, Christopher; Yang, Jay
2014-01-01
Microribonucleic acids (miRNAs) work with exquisite specificity and are able to distinguish a target from a non-target based on a single nucleotide mismatch in the core nucleotide domain. We questioned whether miRNA regulation of gene expression could occur in a single nucleotide polymorphism (SNP)-specific manner, manifesting as a post-transcriptional control of expression of genetic polymorphisms. In our recent study of the functional consequences of matrix metalloproteinase (MMP)-9 SNPs, we discovered that expression of a coding exon SNP in the pro-domain of the protein resulted in a profound decrease in the secreted protein. This missense SNP results in the N38S amino acid change and a loss of an N-glycosylation site. A systematic study demonstrated that the loss of secreted protein was due not to the loss of an N-glycosylation site, but rather an SNP-specific targeting by miR-671-3p and miR-657. Bioinformatics analysis identified 41 SNP-specific miRNA targeting MMP-9 SNPs, mostly in the coding exon and an extension of the analysis to chromosome 20, where the MMP-9 gene is located, suggesting that SNP-specific miRNAs targeting the coding exon are prevalent. This selective post-transcriptional regulation of a target messenger RNA harboring genetic polymorphisms by miRNAs offers an SNP-dependent post-transcriptional regulatory mechanism, allowing for polymorphic-specific differential gene regulation. PMID:24627221
Ning, Shangwei; Yue, Ming; Wang, Peng; Liu, Yue; Zhi, Hui; Zhang, Yan; Zhang, Jizhou; Gao, Yue; Guo, Maoni; Zhou, Dianshuang; Li, Xin; Li, Xia
2017-01-04
We describe LincSNP 2.0 (http://bioinfo.hrbmu.edu.cn/LincSNP), an updated database that is used specifically to store and annotate disease-associated single nucleotide polymorphisms (SNPs) in human long non-coding RNAs (lncRNAs) and their transcription factor binding sites (TFBSs). In LincSNP 2.0, we have updated the database with more data and several new features, including (i) expanding disease-associated SNPs in human lncRNAs; (ii) identifying disease-associated SNPs in lncRNA TFBSs; (iii) updating LD-SNPs from the 1000 Genomes Project; and (iv) collecting more experimentally supported SNP-lncRNA-disease associations. Furthermore, we developed three flexible online tools to retrieve and analyze the data. Linc-Mart is a convenient way for users to customize their own data. Linc-Browse is a tool for all data visualization. Linc-Score predicts the associations between lncRNA and disease. In addition, we provided users a newly designed, user-friendly interface to search and download all the data in LincSNP 2.0 and we also provided an interface to submit novel data into the database. LincSNP 2.0 is a continually updated database and will serve as an important resource for investigating the functions and mechanisms of lncRNAs in human diseases. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Dimitrieva, Slavica; Anisimova, Maria
2014-01-01
In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore are not subject to natural selection. Yet increasingly, cases of non-neutral evolution at certain synonymous sites were reported over the last decade. To evaluate the extent and the nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) and identified gene properties that make SRV more likely in a large database of protein-coding gene families and protein domains. To our knowledge, this is the first study that explores the determinants and patterns of the SRV in real data. We show that the SRV is widespread in the evolution of protein-coding sequences, putting in doubt the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, the SRV appears to play important role in optimizing the domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection, but are less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, those coding for proteins with larger number of interactions or forming larger number of structures, located in intracellular components and those involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented for metabolic pathways and those associated with several genetic diseases, particularly cancers and diabetes.
McClure, Matthew; Kim, Euisoo; Bickhart, Derek; Null, Daniel; Cooper, Tabatha; Cole, John; Wiggans, George; Ajmone-Marsan, Paolo; Colli, Licia; Santus, Enrico; Liu, George E.; Schroeder, Steve; Matukumalli, Lakshmi; Van Tassell, Curt; Sonstegard, Tad
2013-01-01
Bovine Progressive Degenerative Myeloencephalopathy (Weaver Syndrome) is a recessive neurological disease that has been observed in the Brown Swiss cattle breed since the 1970s in North America and Europe. Bilateral hind leg weakness and ataxia appear in afflicted animals at 6 to 18 months of age, and slowly progress to total loss of hind limb control by 3 to 4 years of age. While Weaver has previously been mapped to Bos taurus autosome (BTA) 4:46–56 Mb and a diagnostic test based on the 6 microsatellite (MS) markers is commercially available, neither the causative gene nor mutation has been identified; therefore misdiagnosis can occur due to recombination between the diagnostic MS markers and the causative mutation. Analysis of 34,980 BTA 4 SNP genotypes derived from the Illumina BovineHD assay for 20 Brown Swiss Weaver carriers and 49 homozygous normal bulls refined the Weaver locus to 48–53 Mb. Genotyping of 153 SNPs, identified from whole genome sequencing of 10 normal and 10 carrier animals, across a validation set of 841 animals resulted in the identification of 41 diagnostic SNPs that were concordant with the disease. Except for one intergenic SNP all are associated with genes expressed in nervous tissues: 37 distal to NRCAM, one non-synonymous (serine to asparagine) in PNPLA8, one synonymous and one non-synonymous (lysine to glutamic acid) in CTTNBP2. Haplotype and imputation analyses of 7,458 Brown Swiss animals with Illumina BovineSNP50 data and the 41 diagnostic SNPs resulted in the identification of only one haplotype concordant with the Weaver phenotype. Use of this haplotype and the diagnostic SNPs more accurately identifies Weaver carriers in both Brown Swiss purebred and influenced herds. PMID:23527149
Analysis of the mitochondrial genome of cheetahs (Acinonyx jubatus) with neurodegenerative disease.
Burger, Pamela A; Steinborn, Ralf; Walzer, Christian; Petit, Thierry; Mueller, Mathias; Schwarzenberger, Franz
2004-08-18
The complete mitochondrial genome of Acinonyx jubatus was sequenced and mitochondrial DNA (mtDNA) regions were screened for polymorphisms as candidates for the cause of a neurodegenerative demyelinating disease affecting captive cheetahs. The mtDNA reference sequences were established on the basis of the complete sequences of two diseased and two nondiseased animals as well as partial sequences of 26 further individuals. The A. jubatus mitochondrial genome is 17,047-bp long and shows a high sequence similarity (91%) to the domestic cat. Based on single nucleotide polymorphisms (SNPs) in the control region (CR) and pedigree information, the 18 myelopathic and 12 non-myelopathic cheetahs included in this study were classified into haplotypes I, II and III. In view of the phenotypic comparability of the neurodegenerative disease observed in cheetahs and human mtDNA-associated diseases, specific coding regions including the tRNAs leucine UUR, lysine, serine UCN, and partial complex I and V sequences were screened. We identified a heteroplasmic and a homoplasmic SNP at codon 507 in the subunit 5 (MTND5) of complex I. The heteroplasmic haplotype I-specific valine to methionine substitution represents a nonconservative amino acid change and was found in 11 myelopathic and eight non-myelopathic cheetahs with levels ranging from 29% to 79%. The homoplasmic conservative amino acid substitution valine to alanine was identified in two myelopathic animals of haplotype II. In addition, a synonymous SNP in the codon 76 of the MTND4L gene was found in the single haplotype III animal. The amino acid exchanges in the MTND5 gene were not associated with the occurrence of neurodegenerative disease in captive cheetahs.
Pang, Erli; Wu, Xiaomei; Lin, Kui
2016-06-01
Protein evolution plays an important role in the evolution of each genome. Because of their functional nature, in general, most of their parts or sites are differently constrained selectively, particularly by purifying selection. Most previous studies on protein evolution considered individual proteins in their entirety or compared protein-coding sequences with non-coding sequences. Less attention has been paid to the evolution of different parts within each protein of a given genome. To this end, based on PfamA annotation of all human proteins, each protein sequence can be split into two parts: domains or unassigned regions. Using this rationale, single nucleotide polymorphisms (SNPs) in protein-coding sequences from the 1000 Genomes Project were mapped according to two classifications: SNPs occurring within protein domains and those within unassigned regions. With these classifications, we found: the density of synonymous SNPs within domains is significantly greater than that of synonymous SNPs within unassigned regions; however, the density of non-synonymous SNPs shows the opposite pattern. We also found there are signatures of purifying selection on both the domain and unassigned regions. Furthermore, the selective strength on domains is significantly greater than that on unassigned regions. In addition, among all of the human protein sequences, there are 117 PfamA domains in which no SNPs are found. Our results highlight an important aspect of protein domains and may contribute to our understanding of protein evolution.
Fine Mapping and Functional Analysis of the Multiple Sclerosis Risk Gene CD6
Swaminathan, Bhairavi; Cuapio, Angélica; Alloza, Iraide; Matesanz, Fuencisla; Alcina, Antonio; García-Barcina, Maria; Fedetz, Maria; Fernández, Óscar; Lucas, Miguel; Órpez, Teresa; Pinto-Medel, Mª Jesus; Otaegui, David; Olascoaga, Javier; Urcelay, Elena; Ortiz, Miguel A.; Arroyo, Rafael; Oksenberg, Jorge R.; Antigüedad, Alfredo; Tolosa, Eva; Vandenbroeck, Koen
2013-01-01
CD6 has recently been identified and validated as a risk gene for multiple sclerosis (MS), based on the association of a single nucleotide polymorphism (SNP), rs17824933, located in intron 1. CD6 is a cell surface scavenger receptor involved in T-cell activation and proliferation, as well as in thymocyte differentiation. In this study, we performed a haptag SNP screen of the CD6 gene locus using a total of thirteen tagging SNPs, of which three were non-synonymous SNPs, and replicated the recently reported GWAS SNP rs650258 in a Spanish-Basque collection of 814 controls and 823 cases. Validation of the six most strongly associated SNPs was performed in an independent collection of 2265 MS patients and 2600 healthy controls. We identified association of haplotypes composed of two non-synonymous SNPs [rs11230563 (R225W) and rs2074225 (A257V)] in the 2nd SRCR domain with susceptibility to MS (max(T) permutation P = 1 × 10⁻⁴). The effect of these haplotypes on CD6 surface expression and cytokine secretion was also tested. The analysis showed significantly different CD6 expression patterns in the distinct cell subsets, i.e., CD4+ naïve cells, P = 0.0001; CD8+ naïve cells, P < 0.0001; CD4+ and CD8+ central memory cells, P = 0.01 and 0.05, respectively; and natural killer T (NKT) cells, P = 0.02; with the protective haplotype (RA) showing higher expression of CD6. However, no significant changes were observed in natural killer (NK) cells, effector memory and terminally differentiated effector memory T cells. Our findings reveal that this new MS-associated CD6 risk haplotype significantly modifies expression of CD6 on CD4+ and CD8+ T cells. PMID:23638056
Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population.
Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P
2004-10-01
The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 × 10⁻⁴ and polymorphism parameter of 11.5 × 10⁻⁴ are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
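The two diversity summaries quoted in the abstract above can be illustrated with a minimal sketch. It assumes that "nucleotide diversity" means the average pairwise difference per site (π) and that the "polymorphism parameter" refers to Watterson's θ, which the abstract does not state explicitly. The toy alignment and all function names are hypothetical and are not taken from the study.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Count alignment columns that carry more than one allele."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L), a_n = sum_{i<n} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * len(seqs[0]))


# Hypothetical toy alignment: four equal-length, gap-free sequences.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTTC"]

if __name__ == "__main__":
    print("S =", segregating_sites(toy))
    print("pi =", nucleotide_diversity(toy))
    print("theta_W =", watterson_theta(toy))
```

When θ exceeds π, as in the values reported above, the sample contains an excess of low-frequency variants relative to the neutral expectation, which is consistent with the abstract's remark that much of the variation came from low-frequency alleles.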
Dereeper, Alexis; Nicolas, Stéphane; Le Cunff, Loïc; Bacilieri, Roberto; Doligez, Agnès; Peros, Jean-Pierre; Ruiz, Manuel; This, Patrice
2011-05-05
High-throughput re-sequencing, new genotyping technologies and the availability of reference genomes allow the extensive characterization of Single Nucleotide Polymorphisms (SNPs) and insertion/deletion events (indels) in many plant species. The rapidly increasing amount of re-sequencing and genotyping data generated by large-scale genetic diversity projects requires the development of integrated bioinformatics tools able to efficiently manage, analyze, and combine these genetic data with genome structure and external data. In this context, we developed SNiPlay, a flexible, user-friendly and integrative web-based tool dedicated to polymorphism discovery and analysis. It integrates: 1) a pipeline, freely accessible through the internet, combining existing software with new tools to detect SNPs and to compute different types of statistical indices and graphical layouts for SNP data. From standard sequence alignments, genotyping data or Sanger sequencing traces given as input, SNiPlay detects SNPs and indel events and outputs submission files for the design of Illumina's SNP chips. Subsequently, it sends sequences and genotyping data into a series of modules in charge of various processes: physical mapping to a reference genome, annotation (genomic position, intron/exon location, synonymous/non-synonymous substitutions), SNP frequency determination in user-defined groups, haplotype reconstruction and network, linkage disequilibrium evaluation, and diversity analysis (Pi, Watterson's Theta, Tajima's D). Furthermore, the pipeline allows the use of external data (such as phenotype, geographic origin, taxa, stratification) to define groups and compare statistical indices. 2) a database storing polymorphisms, genotyping data and grapevine sequences released by public and private projects. It allows the user to retrieve SNPs using various filters (such as genomic position, missing data, polymorphism type, allele frequency), to compare SNP patterns between populations, and to export genotyping data or sequences in various formats. Our experiments on grapevine genetic projects showed that SNiPlay allows geneticists to rapidly obtain advanced results in several key research areas of plant genetic diversity. Both the management and treatment of large amounts of SNP data are rendered considerably easier for end-users through automation and integration. Current developments are taking into account new advances in high-throughput technologies. SNiPlay is available at: http://sniplay.cirad.fr/.
Van Rechem, Capucine; Black, Joshua C; Greninger, Patricia; Zhao, Yang; Donado, Carlos; Burrowes, Paul D; Ladd, Brendon; Christiani, David C; Benes, Cyril H; Whetstine, Johnathan R
2015-03-01
SNPs occur within chromatin-modulating factors; however, little is known about how these variants within the coding sequence affect cancer progression or treatment. Therefore, there is a need to establish their biochemical and/or molecular contribution, their use in subclassifying patients, and their impact on therapeutic response. In this report, we demonstrate that coding SNP-A482 within the lysine tridemethylase gene KDM4A/JMJD2A has different allelic frequencies across ethnic populations, associates with differential outcome in patients with non-small cell lung cancer (NSCLC), and promotes KDM4A protein turnover. Using an unbiased drug screen against 87 preclinical and clinical compounds, we demonstrate that homozygous SNP-A482 cells have increased mTOR inhibitor sensitivity. mTOR inhibitors significantly reduce SNP-A482 protein levels, which parallels the increased drug sensitivity observed with KDM4A depletion. Our data emphasize the importance of using variant status as candidate biomarkers and highlight the importance of studying SNPs in chromatin modifiers to achieve better targeted therapy. This report documents the first coding SNP within a lysine demethylase that associates with worse outcome in patients with NSCLC. We demonstrate that this coding SNP alters the protein turnover and associates with increased mTOR inhibitor sensitivity, which identifies a candidate biomarker for mTOR inhibitor therapy and a therapeutic target for combination therapy. ©2015 American Association for Cancer Research.
Chen, Xing; Zhang, Shujun; Cheng, Zhangrui; Cooke, Jessica S.; Werling, Dirk
2017-01-01
Selectins are adhesion molecules, which mediate attachment between leucocytes and endothelium. They aid extravasation of leucocytes from blood into inflamed tissue during the mammary gland’s response to infection. Selectins are also involved in attachment of the conceptus to the endometrium and subsequent placental development. Poor fertility and udder health are major causes for culling dairy cows. The three identified bovine selectin genes SELP, SELL and SELE are located in a gene cluster. SELP is the most polymorphic of these genes. Several SNP in SELP and SELE are associated with human vascular disease, while SELP SNP rs6127 has been associated with recurrent pregnancy loss in women. This study describes the results of a gene association study for SNP in SELP (n = 5), SELL (n = 2) and SELE (n = 1) with fertility, milk production and longevity traits in a population of 337 Holstein Friesian dairy cows. Blood samples for PCR-RFLP were collected at 6 months of age and animals were monitored until either culling or 2,340 days from birth. Three SNP in SELPEx4-6 formed a haplotype block containing a Glu/Ala substitution at rs42312260. This region was associated with poor fertility and reduced survival times. SELPEx8 (rs378218397) coded for a Val475Met variant locus in the linking region between consensus repeats 4 and 5, which may influence glycosylation. The synonymous SNP rs110045112 in SELEEx14 deviated from Hardy Weinberg equilibrium. For both this SNP and rs378218397 there were too few AA homozygotes present in the population and AG heterozygotes had significantly worse fertility than GG homozygotes. Small changes in milk production associated with some SNP could not account for the reduced fertility and only SELPEx6 showed any association with somatic cell count. These results suggest that polymorphisms in SELP and SELE are associated with the likelihood of successful pregnancy, potentially through compromised implantation and placental development. 
PMID:28419109
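The Hardy-Weinberg deviation mentioned in the abstract above can be checked with a one-degree-of-freedom chi-square goodness-of-fit test. The following is a minimal sketch of that standard test, not the authors' actual procedure; the genotype counts are hypothetical and the function name is an assumption made for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ag, n_gg):
    """Chi-square test of Hardy-Weinberg proportions for a biallelic SNP.

    Returns (frequency of allele A, expected genotype counts, chi-square,
    p-value with 1 degree of freedom).
    """
    n = n_aa + n_ag + n_gg
    p = (2 * n_aa + n_ag) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p in (0.0, 1.0):
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ag, n_gg)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, expected, chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts with too few AA homozygotes and excess heterozygotes.
    freq_a, expected, chi2, p_value = hwe_chi_square(10, 80, 10)
    print(f"freq(A)={freq_a:.2f} expected={expected} "
          f"chi2={chi2:.2f} p={p_value:.2e}")
```

A deficit of one homozygote class, as described for rs110045112 and rs378218397, inflates the chi-square statistic because the observed count falls well below the count expected from the allele frequencies.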
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.
Zhang, Wei; Qi, Weihong; Albert, Thomas J; Motiwala, Alifiya S; Alland, David; Hyytia-Trees, Eija K; Ribot, Efrain M; Fields, Patricia I; Whittam, Thomas S; Swaminathan, Bala
2006-06-01
Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary beta-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens.
Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms
Zhang, Wei; Qi, Weihong; Albert, Thomas J.; Motiwala, Alifiya S.; Alland, David; Hyytia-Trees, Eija K.; Ribot, Efrain M.; Fields, Patricia I.; Whittam, Thomas S.; Swaminathan, Bala
2006-01-01
Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary β-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens. PMID:16606700
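The dating argument in the abstract above rests on a molecular clock. A minimal sketch of the usual relation follows, assuming that two lineages each accumulate synonymous substitutions at the stated rate, so that time equals divergence divided by twice the rate. The abstract does not give the formula or the divergence value actually used; the divergence figure below is hypothetical and was chosen only so that the arithmetic lands near the reported 40 thousand years.

```python
def divergence_time_years(syn_divergence_per_site, rate_per_site_per_year):
    """Molecular-clock estimate of time since a common ancestor.

    Assumes both lineages accumulate substitutions independently at the
    same constant rate, hence the factor of two in the denominator.
    """
    return syn_divergence_per_site / (2.0 * rate_per_site_per_year)


# Rate quoted in the abstract: 4.7e-9 synonymous substitutions/site/year.
RATE = 4.7e-9
# Hypothetical pairwise synonymous divergence (not a value from the study).
HYPOTHETICAL_KS = 3.76e-4

if __name__ == "__main__":
    years = divergence_time_years(HYPOTHETICAL_KS, RATE)
    print(f"Estimated time to common ancestor: {years:,.0f} years")
```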
Alvarado, David M; Yang, Ping; Druley, Todd E; Lovett, Michael; Gurnett, Christina A
2014-06-01
Despite declining sequencing costs, few methods are available for cost-effective single-nucleotide polymorphism (SNP), insertion/deletion (INDEL) and copy number variation (CNV) discovery in a single assay. Commercially available methods require a high investment to a specific region and are only cost-effective for large samples. Here, we introduce a novel, flexible approach for multiplexed targeted sequencing and CNV analysis of large genomic regions called multiplexed direct genomic selection (MDiGS). MDiGS combines biotinylated bacterial artificial chromosome (BAC) capture and multiplexed pooled capture for SNP/INDEL and CNV detection of 96 multiplexed samples on a single MiSeq run. MDiGS is advantageous over other methods for CNV detection because pooled sample capture and hybridization to large contiguous BAC baits reduces sample and probe hybridization variability inherent in other methods. We performed MDiGS capture for three chromosomal regions consisting of ∼ 550 kb of coding and non-coding sequence with DNA from 253 patients with congenital lower limb disorders. PITX1 nonsense and HOXC11 S191F missense mutations were identified that segregate in clubfoot families. Using a novel pooled-capture reference strategy, we identified recurrent chromosome chr17q23.1q23.2 duplications and small HOXC 5' cluster deletions (51 kb and 12 kb). Given the current interest in coding and non-coding variants in human disease, MDiGS fulfills a niche for comprehensive and low-cost evaluation of CNVs, coding, and non-coding variants across candidate regions of interest. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Pinto, João; Gribaldo, Simonetta; Legrand, Eric; Niang, Makhtar; Kim, Nimol; Pharath, Lim; Volnay, Béatrice; Ekala, Marie Therese; Bouchier, Christiane; Fandeur, Thierry; Berzosa, Pedro; Benito, Agustin; Ferreira, Isabel Dinis; Ferreira, Cynthia; Vieira, Pedro Paulo; Alecrim, Maria das Graças; Mercereau-Puijalon, Odile; Cravo, Pedro
2010-01-01
Artemisinin, a thapsigargin-like sesquiterpene has been shown to inhibit the Plasmodium falciparum sarco/endoplasmic reticulum calcium-ATPase PfSERCA. To collect baseline pfserca sequence information before field deployment of Artemisinin-based Combination therapies that may select mutant parasites, we conducted a sequence analysis of 100 isolates from multiple sites in Africa, Asia and South America. Coding sequence diversity was large, with 29 mutated codons, including 32 SNPs (average of one SNP/115 bp), of which 19 were novel mutations. Most SNP detected in this study were clustered within a region in the cytosolic head of the protein. The PfSERCA functional domains were very well conserved, with non synonymous mutations located outside the functional domains, except for the S769N mutation associated in French Guiana with elevated IC50 for artemether. The S769N mutation is located close to the hinge of the headpiece, which in other species modulates calcium affinity and in consequence efficacy of inhibitors, possibly linking calcium homeostasis to drug resistance. Genetic diversity was highest in Senegal, Brazil and French Guiana, and few mutations were identified in Asia. Population genetic analysis was conducted for a partial fragment of the gene encompassing nucleotide coordinates 87-2862 (unambiguous sequence available for 96 isolates). This supported a geographic clustering, with a separation between Old and New World samples and one dominant ancestral haplotype. Genetic drift alone cannot explain the observed polymorphism, suggesting that other evolutionary mechanisms are operating. One possible contributor could be the frequency of haemoglobinopathies that are associated with calcium dysregulation in the erythrocyte. PMID:20195531
Albalawi, Fadwa S; Daghestani, Maha H; Daghestani, Mazin H; Eldali, Abdelmoneim; Warsy, Arjumand S
2018-05-30
Kisspeptin is involved in female reproduction. This study was designed to (i) estimate kisspeptin levels in women with polycystic ovary syndrome (PCOS), in comparison with controls, (ii) study the correlations between kisspeptin and PCOS-related reproductive hormones, and (iii) investigate the relation between KISS1 gene polymorphisms and hormone levels in women suffering from PCOS. The investigation was a clinically designed study on 28 women with PCOS, and 30 normal, healthy women with no signs of PCOS as controls. Blood samples were collected between day 3 and day 6 of the menstrual cycle in both groups at 8:00 a.m., and circulating levels of LH, FSH and kisspeptin were estimated. DNA was extracted from whole blood and all coding exons of the KISS1 gene were sequenced. Women with PCOS had higher LH levels and BMI compared to controls. Plasma kisspeptin levels were positively correlated with LH levels. There was no statistically significant difference between the groups in terms of kisspeptin and FSH levels. The SNP rs4889 C/G, a non-synonymous SNP, was investigated in the PCOS group. The frequency of the GG genotype was significantly higher in the PCOS group compared to the controls. These patients were more obese and had higher kisspeptin and FSH levels. The results of the study show that genetic variation in the KISS1 gene may be a factor contributing to PCOS development. The association between the gene, its variation and PCOS needs further validation in large-scale and functional studies.
Wang, Liyong; Rundek, Tatjana; Beecham, Ashley; Hudson, Barry; Blanton, Susan H; Zhao, Hongyu; Sacco, Ralph L; Dong, Chuanhui
2014-01-01
Carotid intima-media thickness (cIMT), a marker for atherosclerosis, is affected by smoking and has substantial interindividual variation. We sought to identify the genetic moderators influencing the effect of smoking on cIMT. With a multistage design using 722 379 single nucleotide polymorphisms (SNPs), a genome-wide interaction study was performed in a discovery sample of 669 Hispanics, followed by replication in 589 subjects (264 Hispanics, 172 non-Hispanic blacks, 153 non-Hispanic whites). Assuming an additive genetic model, regression analysis was performed to test for smoking-SNP interaction on cIMT while controlling for age, sex, and the top 3 principal components of ancestry. The strongest interaction in Hispanics was found with a synonymous splicing SNP (rs3751383) in exon 9 of RCBTB1 (P = 2.5 × 10⁻⁶ in the discovery sample; P = 0.01 in the Hispanic replication sample; P < 8.8 × 10⁻⁹ in the combined Hispanic sample). Stratification analysis in the combined Hispanic sample showed that smoking had no effect on cIMT among rs3751383 G homozygotes (P = 0.15), a moderate effect among rs3751383 heterozygotes (P = 0.01), and a strong effect among rs3751383 A homozygotes (P = 2.1 × 10⁻⁷). A consistent trend was observed in the non-Hispanic white and black data sets, leading to an interaction effect of P < 2.9 × 10⁻⁹ in the meta-analysis of all 1258 subjects. Our study represents the first genome-wide smoking-SNP interaction study of cIMT and identifies RCBTB1 as a modifier of the smoking effect on cIMT. Testing for gene-environment interactions can help uncover genetic factors that contribute to the interindividual variation in response to the same environmental exposure.
Haralambieva, Iana H.; Ovsyannikova, Inna G.; Umlauf, Benjamin J.; Vierkant, Robert A.; Pankratz, V. Shane; Jacobson, Robert M.; Poland, Gregory A.
2014-01-01
Host antiviral genes are important regulators of antiviral immunity and plausible genetic determinants of immune response heterogeneity after vaccination. We genotyped and analyzed 307 common candidate tagSNPs from 12 antiviral genes in a cohort of 745 schoolchildren immunized with two doses of measles-mumps-rubella vaccine. Associations between SNPs/haplotypes and measles virus-specific immune outcomes were assessed using linear regression methodologies in Caucasians and African-Americans. Genetic variants within the DDX58/RIG-I gene, including a coding polymorphism (rs3205166/Val800Val), were associated as single-SNPs (p≤0.017; although these SNPs did not remain significant after correction for false discovery rate/FDR) and in haplotype-level analysis, with measles-specific antibody variations in Caucasians (haplotype allele p-value=0.021; haplotype global p-value=0.076). Four DDX58 polymorphisms, in high LD, also demonstrated associations (after correction for FDR) with variations in both measles-specific IFN-γ and IL-2 secretion in Caucasians (p≤0.001, q=0.193). Two intronic OAS1 polymorphisms, including the functional OAS1 SNP rs10774671 (p=0.003), demonstrated evidence of association with a significant allele-dose-related increase in neutralizing antibody levels in African-Americans. Genotype and haplotype-level associations demonstrated the role of ADAR genetic variants, including a non-synonymous SNP (rs2229857/Arg384Lys; p=0.01), in regulating measles virus-specific IFN-γ Elispot responses in Caucasians (haplotype global p-value=0.017). After correction for FDR, 15 single-SNP associations (11 SNPs in Caucasians and 4 SNPs in African-Americans) still remained significant at the q-value<0.20. In conclusion, our findings strongly point to genetic variants/genes, involved in antiviral sensing and antiviral control, as critical determinants, differentially modulating the adaptive immune responses to live attenuated measles vaccine in Caucasians and African-Americans. PMID:21939710
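The abstract above reports q-values after false discovery rate correction but does not name the procedure. As an illustration only, the sketch below uses the Benjamini-Hochberg step-up adjustment, a common choice that is swapped in here as an assumption; the p-values in the example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each adjusted value is min over ranks k >= rank(i) of p_(k) * m / k,
    capped at 1.0, which keeps the adjusted values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical single-SNP p-values, not taken from the study.
    example = [0.01, 0.04, 0.03, 0.20]
    for p, q in zip(example, benjamini_hochberg(example)):
        flag = "kept" if q < 0.20 else "dropped"
        print(f"p={p:.3f} adjusted={q:.4f} {flag} at threshold 0.20")
```

A lenient threshold such as the 0.20 used in the abstract trades a higher expected share of false positives for more candidate associations to follow up.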
Jia, Dongjie; Shen, Fei; Wang, Yi; Wu, Ting; Xu, Xuefeng; Zhang, Xinzhong; Han, Zhenhai
2018-05-11
Many efforts have been made to map quantitative trait loci (QTLs) to facilitate practical marker-assisted selection (MAS) in plants. In the present study, we identified four genome-wide major QTLs responsible for apple fruit acidity by MapQTL and BSA-seq analyses using two independent pedigree-based populations. Candidate genes were screened in major QTL regions, and three functional gene markers, including a non-synonymous A/G single nucleotide polymorphism (SNP) in the coding region of MdPP2CH, a 36-bp insertion in the promoter of MdSAUR37, and a previously reported SNP in MdALMTII, were validated to influence the malate content of apple fruits. In addition, MdPP2CH inactivated three vacuolar H+-ATPases (MdVHA-A3, MdVHA-B2 and MdVHA-D2) and one aluminium-activated malate transporter (MdALMTII) via dephosphorylation and negatively influenced fruit malate accumulation. The dephosphotase activity of MdPP2CH was suppressed by MdSAUR37, which implied a higher hierarchy of genetic interaction. Therefore, the MdSAUR37/MdPP2CH/MdALMTII chain cascaded hierarchical epistatic genetic effects to precisely determine apple fruit malate content. An A/G SNP (-1010) on MdMYB44 promoter region from a major QTL (qtl08.1) was closely associated with fruit malate content. The predicted phenotype values (PPVs) were estimated using the tentative genotype values of the gene markers, and the PPVs were significantly correlated with the observed phenotype values. Our findings provide an insight into plant genome-based selection in apples and will aid in conducting research to understand the physiological fundamentals of quantitative genetics. This article is protected by copyright. All rights reserved.
Wang, Zhifu; Feng, Kai; Yue, Maoxing; Lu, Xiaoguang; Zheng, Qihan; Zhang, Hongxing; Zhai, Yun; Li, Peiyao; Yu, Lixia; Cai, Mi; Zhang, Xiumei; Kang, Xin; Shi, Weihai; Xia, Xia; Chen, Xi; Cao, Pengbo; Li, Yuanfeng; Chen, Huipeng; Ling, Yan; Li, Yuxia; He, Fuchu; Zhou, Gangqiao
2013-03-01
Sepsis represents a systemic inflammatory response to infection and its sequelae include severe sepsis, septic shock, multiple organ dysfunction syndrome (MODS) and death. Studies in mice and humans indicate that the inducible nitric oxide synthase (iNOS, NOS2) plays an important role in the development of sepsis and its sequelae. It was reported that several single nucleotide polymorphisms (SNPs) within NOS2 could influence the production or activity of NOS2. In this study, we assessed whether SNPs within NOS2 gene were associated with severity of sepsis in Chinese populations. A case-control study was conducted, which included 299 and 280 unrelated patients with sepsis recruited from Liaoning and Jiangsu provinces in China, respectively. Six SNPs within NOS2 were genotyped using Sequenom MassARRAY system. The associations between the SNPs and risk of sepsis complications were estimated by a binary logistic regression model adjusted for confounding factors. Functional assay was performed to assess the biological significance. The GA + AA genotype of a non-synonymous SNP in the exon 16 of NOS2 (rs2297518: G>A) was significantly associated with increased susceptibility to septic shock compared with GG genotype in Liaoning population (OR = 3.29, 95% CI = 1.40-7.72, P = 0.0047). This association was confirmed in the Jiangsu population (OR = 3.49, 95% CI = 1.57-7.79, P = 0.0019). Furthermore, the functional assay performed in the immortalized lymphocyte cell lines indicated that the at-risk GA genotype had a tendency of higher NOS2 activity than the GG genotype (P = 0.32). Our findings suggest that the NOS2 rs2297518 may play a role in mediating the susceptibility to septic shock in patients with sepsis in Chinese populations.
Sheynkman, Gloria M.; Shortreed, Michael R.; Frey, Brian L.; Scalf, Mark; Smith, Lloyd M.
2013-01-01
Each individual carries thousands of non-synonymous single nucleotide variants (nsSNVs) in their genome, each corresponding to a single amino acid polymorphism (SAP) in the encoded proteins. It is important to be able to directly detect and quantify these variations at the protein level in order to study post-transcriptional regulation, differential allelic expression, and other important biological processes. However, such variant peptides are not generally detected in standard proteomic analyses, due to their absence from the generic databases that are employed for mass spectrometry searching. Here, we extend previous work that demonstrated the use of customized SAP databases constructed from sample-matched RNA-Seq data. We collected deep coverage RNA-Seq data from the Jurkat cell line, compiled the set of nsSNVs that are expressed, used this information to construct a customized SAP database, and searched it against deep coverage shotgun MS data obtained from the same sample. This approach enabled detection of 421 SAP peptides mapping to 395 nsSNVs. We compared these peptides to peptides identified from a large generic search database containing all known nsSNVs (dbSNP) and found that more than 70% of the SAP peptides from this dbSNP-derived search were not supported by the RNA-Seq data, and thus are likely false positives. Next, we increased the SAP coverage from the RNA-Seq derived database by utilizing multiple protease digestions, thereby increasing variant detection to 695 SAP peptides mapping to 504 nsSNV sites. These detected SAP peptides corresponded to moderate to high abundance transcripts (30+ transcripts per million, TPM). The SAP peptides included 192 allelic pairs; the relative expression levels of the two alleles were evaluated for 51 of those pairs, and found to be comparable in all cases. PMID:24175627
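The idea of a customized SAP database can be illustrated with a minimal sketch: apply one amino acid substitution to a protein sequence, digest the reference and variant sequences in silico, and keep the peptides that exist only in the variant. The protein sequence, the substitution and the function names below are hypothetical, and only the common trypsin rule (cleave after K or R unless followed by P) is modelled. This is not the authors' pipeline.

```python
import re

def apply_sap(protein: str, position: int, ref: str, alt: str) -> str:
    """Return the protein with a single amino acid polymorphism applied (1-based position)."""
    if protein[position - 1] != ref:
        raise ValueError("reference residue does not match the sequence")
    return protein[:position - 1] + alt + protein[position:]

def trypsin_digest(protein: str) -> list:
    """Cleave after K or R, except when the next residue is P (no missed cleavages)."""
    # Zero-width split points; requires Python 3.7+ for splitting on empty matches.
    return [pep for pep in re.split(r"(?<=[KR])(?!P)", protein) if pep]

def variant_peptides(protein: str, position: int, ref: str, alt: str) -> list:
    """Peptides present in the variant digest but absent from the reference digest."""
    reference = set(trypsin_digest(protein))
    variant = trypsin_digest(apply_sap(protein, position, ref, alt))
    return [pep for pep in variant if pep not in reference]

# Hypothetical protein and substitution (Y5H), for illustration only.
toy_protein = "MKTAYIAKQRPLSDEGK"
print(trypsin_digest(toy_protein))
print(variant_peptides(toy_protein, 5, "Y", "H"))
```

Only the variant-specific peptides would need to be added to a search database, which is one reason a sample-matched list of expressed nsSNVs keeps such a database small.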
Milivojevic, Verica; Kranzler, Henry R.; Gelernter, Joel; Burian, Linda; Covault, Jonathan
2010-01-01
Background Studies of alcohol effects in rodents and in vitro implicate endogenous neuroactive steroids as key mediators of alcohol effects at GABAA receptors. We used a case-control sample to test the association with alcohol dependence (AD) of single nucleotide polymorphisms (SNPs) in the genes encoding two key enzymes required for the generation of endogenous neuroactive steroids: 5α-reductase, type I (5α-R) and 3α-hydroxysteroid dehydrogenase, type 2 (3α-HSD), both of which are expressed in human brain. Methods We focused on markers previously associated with a biological phenotype. For 5α-R, we examined the synonymous SRD5A1 exon 1 SNP rs248793, which has been associated with the ratio of dihydrotestosterone to testosterone. For 3α-HSD, we examined the non-synonymous AKR1C3 SNP rs12529 (H5Q), which has been associated with bladder cancer. The SNPs were genotyped in a sample of 1,083 non-Hispanic Caucasians including 552 controls and 531 subjects with AD. Results The minor allele for both SNPs was more common among controls than subjects with AD: SRD5A1 rs248793 C-allele (χ²(1)=7.6, p=0.006) and AKR1C3 rs12529 G-allele (χ²(1)=14.6, p=0.0001). There was also an interaction of these alleles such that the “protective” effect of the minor allele at each marker for AD was conditional on the genotype of the second marker. Conclusions We found evidence of an association with AD of polymorphisms in two genes encoding neuroactive steroid biosynthetic enzymes, providing indirect evidence that neuroactive steroids are important mediators of alcohol effects in humans. PMID:21323680
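As a quick consistency check on the reported statistics, the p-value of a chi-square statistic with one degree of freedom can be recomputed from the statistic alone. The sketch below uses only the two values quoted in the abstract; the function name is illustrative, and the underlying allele counts are not given in the abstract, so they are not reconstructed here.

```python
import math

def chi2_sf_df1(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With 1 df the statistic is the square of a standard normal variable,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))

for label, stat in [("SRD5A1 rs248793", 7.6), ("AKR1C3 rs12529", 14.6)]:
    print(f"{label}: chi2(1) = {stat}, p = {chi2_sf_df1(stat):.2g}")
```

Both recomputed values round to the p-values reported above (0.006 and 0.0001).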
Genome-Wide SNP Genotyping to Infer the Effects on Gene Functions in Tomato
Hirakawa, Hideki; Shirasawa, Kenta; Ohyama, Akio; Fukuoka, Hiroyuki; Aoki, Koh; Rothan, Christophe; Sato, Shusei; Isobe, Sachiko; Tabata, Satoshi
2013-01-01
The genotype data of 7054 single nucleotide polymorphism (SNP) loci in 40 tomato lines, including inbred lines, F1 hybrids, and wild relatives, were collected using Illumina's Infinium and GoldenGate assay platforms, the latter of which was utilized in our previous study. The dendrogram based on the genotype data corresponded well to the breeding types of tomato and wild relatives. The SNPs were classified into six categories according to their positions in the genes predicted on the tomato genome sequence. The genes with SNPs were annotated by homology searches against the nucleotide and protein databases, as well as by domain searches, and they were classified into the functional categories defined by the NCBI's eukaryotic orthologous groups (KOG). To infer the SNPs' effects on the gene functions, the three-dimensional structures of the 843 proteins that were encoded by the genes with SNPs causing missense mutations were constructed by homology modelling, and 200 of these proteins were considered to carry non-synonymous amino acid substitutions in the predicted functional sites. The SNP information obtained in this study is available at the Kazusa Tomato Genomics Database (http://plant1.kazusa.or.jp/tomato/). PMID:23482505
Singh, Kh Dhanachandra; Karthikeyan, Muthusamy
2014-12-01
The renin-angiotensin-aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS play a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. The identification of such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Analyzing the huge dataset of SNPs, which contains both functional and neutral SNPs, is challenging when every SNP must be tested experimentally to determine its biological significance. To explore the functional significance of these genetic mutations (SNPs), we adopted a combined sequence-based and sequence-structure-based SNP analysis algorithm. Out of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region, with the remainder in the non-coding region. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and the variant shows a high RMSD from the native structure. Based on these analyses, we identified two SNPs of the REN gene, eight SNPs of the AGT gene, three SNPs of the ACE gene, two SNPs of the AT1R gene, three SNPs of the CYP11B2 gene and three SNPs of the CMA1 gene in the coding region as deleterious. This type of study will help reduce the cost and time of identifying potential SNPs and will help select SNPs from the SNP pool for experimental study.
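The selection rule described in this abstract (at least three tools agree, plus a high RMSD from the native structure) amounts to a consensus filter. The sketch below shows one way such a filter could look. The tool names, SNP identifiers, RMSD values and the 2.0 Å cutoff are all hypothetical placeholders, since the abstract does not state them.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class SnpPrediction:
    snp_id: str                  # hypothetical identifier
    tool_calls: Dict[str, bool]  # tool name -> "predicted deleterious?"
    rmsd_angstrom: float         # RMSD of the mutant model from the native structure

def is_deleterious(pred: SnpPrediction, min_tools: int = 3, rmsd_cutoff: float = 2.0) -> bool:
    """Flag a SNP only if enough tools agree AND the structural deviation is large.

    Both thresholds are assumptions for illustration, not values from the study.
    """
    votes = sum(pred.tool_calls.values())
    return votes >= min_tools and pred.rmsd_angstrom >= rmsd_cutoff

# Made-up example records.
candidates = [
    SnpPrediction("snp_a", {"tool1": True, "tool2": True, "tool3": True, "tool4": False}, 2.6),
    SnpPrediction("snp_b", {"tool1": True, "tool2": True, "tool3": False, "tool4": False}, 3.1),
    SnpPrediction("snp_c", {"tool1": True, "tool2": True, "tool3": True, "tool4": True}, 0.4),
]
print([c.snp_id for c in candidates if is_deleterious(c)])
```

Requiring both criteria trades sensitivity for specificity: a SNP flagged by every tool is still dropped if the modelled structure barely changes.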
Impact of SNPs on Protein Phosphorylation Status in Rice (Oryza sativa L.).
Lin, Shoukai; Chen, Lijuan; Tao, Huan; Huang, Jian; Xu, Chaoqun; Li, Lin; Ma, Shiwei; Tian, Tian; Liu, Wei; Xue, Lichun; Ai, Yufang; He, Huaqin
2016-11-11
Single nucleotide polymorphisms (SNPs) are widely used in functional genomics and genetics research. The high-quality sequence of the rice genome has provided a genome-wide SNP and proteome resource. However, the impact of SNPs on protein phosphorylation status in rice is not fully understood. In this paper, we first updated the rice SNP resource based on the new rice genome Ver. 7.0, then systematically analyzed the potential impact of non-synonymous SNPs (nsSNPs) on protein phosphorylation status. There were 3,897,312 SNPs in the Ver. 7.0 rice genome, of which 9.9% were nsSNPs. In addition, a total of 2,508,261 phosphorylated sites were predicted in the rice proteome. Interestingly, we observed that 150,197 (39.1%) nsSNPs could influence protein phosphorylation status, among which 52.2% might induce changes of protein kinase (PK) types for adjacent phosphorylation sites. We constructed a database, SNP_rice, to deposit the updated rice SNP resource and phosSNPs information. It was freely available to academic researchers at http://bioinformatics.fafu.edu.cn. As a case study, we detected five nsSNPs that potentially influenced the phosphorylation status of heterotrimeric G proteins in rice, indicating that genetic polymorphisms affect signal transduction by influencing the phosphorylation status of heterotrimeric G proteins. The results of this work could be a useful resource for future experimental identification and provide useful information for rice breeding.
Lata, Charu; Bhutty, Sarita; Bahadur, Ranjit Prasad; Majee, Manoj; Prasad, Manoj
2011-06-01
The DREB genes code for important plant transcription factors involved in the abiotic stress response and signal transduction. Characterization of DREB genes and development of functional markers for effective alleles is important for marker-assisted selection in foxtail millet. Here the characterization of a cDNA (SiDREB2) encoding a putative dehydration-responsive element-binding protein 2 from foxtail millet and the development of an allele-specific marker (ASM) for dehydration tolerance is reported. A cDNA clone (GenBank accession no. GT090998) coding for a putative DREB2 protein was isolated as a differentially expressed gene from a 6 h dehydration stress SSH library. A 5' RACE (rapid amplification of cDNA ends) was carried out to obtain the full-length cDNA, and sequence analysis showed that SiDREB2 encoded a polypeptide of 234 amino acids with a predicted mol. wt of 25.72 kDa and a theoretical pI of 5.14. A theoretical model of the tertiary structure shows that it has a highly conserved GCC-box-binding N-terminal domain, and an acidic C-terminus that acts as an activation domain for transcription. Based on its similarity to AP2 domains, SiDREB2 was classified into the A-2 subgroup of the DREB subfamily. Quantitative real-time PCR analysis showed significant up-regulation of SiDREB2 by dehydration (polyethylene glycol) and salinity (NaCl), while its expression was less affected by other stresses. A synonymous single nucleotide polymorphism (SNP) associated with dehydration tolerance was detected at the 558th base pair (an A/G transition) in the SiDREB2 gene in a core set of 45 foxtail millet accessions used. Based on the identified SNP, three primers were designed to develop an ASM for dehydration tolerance. The ASM produced a 261 bp fragment in all the tolerant accessions and produced no amplification in the sensitive accessions. 
The use of this ASM might be faster, cheaper, and more reproducible than other SNP genotyping methods, and thus will enable marker-aided breeding of foxtail millet for dehydration tolerance.
Variation analysis and gene annotation of eight MHC haplotypes: The MHC Haplotype Project
Horton, Roger; Gibson, Richard; Coggill, Penny; Miretti, Marcos; Allcock, Richard J.; Almeida, Jeff; Forbes, Simon; Gilbert, James G. R.; Halls, Karen; Harrow, Jennifer L.; Hart, Elizabeth; Howe, Kevin; Jackson, David K.; Palmer, Sophie; Roberts, Anne N.; Sims, Sarah; Stewart, C. Andrew; Traherne, James A.; Trevanion, Steve; Wilming, Laurens; Rogers, Jane; de Jong, Pieter J.; Elliott, John F.; Sawcer, Stephen; Todd, John A.; Trowsdale, John
2008-01-01
The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences, including four haplotypes not previously analysed, resulted in the identification of >44,000 variations, both substitutions and indels (insertions and deletions), which have been submitted to the dbSNP database. The gene annotation uncovered haplotype-specific differences and confirmed the presence of more than 300 loci, including over 160 protein-coding genes. Combined analysis of the variation and annotation datasets revealed 122 gene loci with coding substitutions of which 97 were non-synonymous. The haplotype (A3-B7-DR15; PGF cell line) designated as the new MHC reference sequence, has been incorporated into the human genome assembly (NCBI35 and subsequent builds), and constitutes the largest single-haplotype sequence of the human genome to date. The extensive variation and annotation data derived from the analysis of seven further haplotypes have been made publicly available and provide a framework and resource for future association studies of all MHC-associated diseases and transplant medicine. PMID:18193213
Helicobacter pylori genetic diversification in the Mongolian gerbil model.
Beckett, Amber C; Loh, John T; Chopra, Abha; Leary, Shay; Lin, Aung Soe; McDonnell, Wyatt J; Dixon, Beverly R E A; Noto, Jennifer M; Israel, Dawn A; Peek, Richard M; Mallal, Simon; Algood, Holly M Scott; Cover, Timothy L
2018-01-01
Helicobacter pylori requires genetic agility to infect new hosts and establish long-term colonization of changing gastric environments. In this study, we analyzed H. pylori genetic adaptation in the Mongolian gerbil model. This model is of particular interest because H. pylori-infected gerbils develop a high level of gastric inflammation and often develop gastric adenocarcinoma or gastric ulceration. We analyzed the whole genome sequences of H. pylori strains cultured from experimentally infected gerbils, in comparison to the genome sequence of the input strain. The mean annualized single nucleotide polymorphism (SNP) rate per site was 1.5e-5, which is similar to the rates detected previously in H. pylori-infected humans. Many of the mutations occurred within or upstream of genes associated with iron-related functions (fur, tonB1, fecA2, fecA3, and frpB3) or encoding outer membrane proteins (alpA, oipA, fecA2, fecA3, frpB3 and cagY). Most of the SNPs within coding regions (86%) were non-synonymous mutations. Several deletion or insertion mutations led to disruption of open reading frames, suggesting that the corresponding gene products are not required or are deleterious during chronic H. pylori colonization of the gerbil stomach. Five variants (three SNPs and two deletions) were detected in isolates from multiple animals, which suggests that these mutations conferred a selective advantage. One of the mutations (FurR88H) detected in isolates from multiple animals was previously shown to confer increased resistance to oxidative stress, and we now show that this SNP also confers a survival advantage when H. pylori is co-cultured with neutrophils. Collectively, these analyses allow the identification of mutations that are positively selected during H. pylori colonization of the gerbil model.
Drögemüller, Cord; Reichart, Ursula; Seuberlich, Torsten; Oevermann, Anna; Baumgartner, Martin; Kühni Boghenbor, Kathrin; Stoffel, Michael H.; Syring, Claudia; Meylan, Mireille; Müller, Simone; Müller, Mathias; Gredler, Birgit
2011-01-01
Tyrolean Grey cattle represent a local breed with a population size of ∼5000 registered cows. In 2003, a previously unknown neurological disorder was recognized in Tyrolean Grey cattle. The clinical signs of the disorder are similar to those of bovine progressive degenerative myeloencephalopathy (weaver syndrome) in Brown Swiss cattle but occur much earlier in life. The neuropathological investigation of an affected calf showed axonal degeneration in the central nervous system (CNS) and femoral nerve. The pedigrees of the affected calves suggested a monogenic autosomal recessive inheritance. We localized the responsible mutation to a 1.9 Mb interval on chromosome 16 by genome-wide association and haplotype mapping. The MFN2 gene located in this interval encodes mitofusin 2, a mitochondrial membrane protein. A heritable human axonal neuropathy, Charcot-Marie-Tooth disease-2A2 (CMT2A2), is caused by MFN2 mutations. Therefore, we considered MFN2 a positional and functional candidate gene and performed mutation analysis in affected and control Tyrolean Grey cattle. We did not find any non-synonymous variants. However, we identified a perfectly associated silent SNP in the coding region of exon 20 of the MFN2 gene. This SNP is located within a putative exonic splice enhancer (ESE) and the variant allele leads to partial retention of the entire intron 19 and a premature stop codon in the aberrant MFN2 transcript. Thus we have identified a highly unusual splicing defect, where an exonic single base exchange leads to the retention of the preceding intron. This splicing defect represents a potential explanation for the observed degenerative axonopathy. Marker assisted selection can now be used to eliminate degenerative axonopathy from Tyrolean Grey cattle. PMID:21526202
Liu, Jun-Jun; Sniezko, Richard; Murray, Michael; Wang, Ning; Chen, Hao; Zamany, Arezoo; Sturrock, Rona N.; Savin, Douglas; Kegley, Angelia
2016-01-01
Whitebark pine (WBP, Pinus albicaulis Engelm.) is an endangered conifer species due to heavy mortality from white pine blister rust (WPBR, caused by Cronartium ribicola) and mountain pine beetle (Dendroctonus ponderosae). Information about genetic diversity and population structure is of fundamental importance for its conservation and restoration. However, current knowledge on the genetic constitution and genomic variation is still limited for WBP. In this study, an integrated genomics approach was applied to characterize seed collections from WBP breeding programs in western North America. RNA-seq analysis was used for de novo assembly of the WBP needle transcriptome, which contains 97,447 protein-coding transcripts. Within the transcriptome, single nucleotide polymorphisms (SNPs) were discovered, and more than 22,000 of them were non-synonymous SNPs (ns-SNPs). Following the annotation of genes with ns-SNPs, 216 ns-SNPs within candidate genes with putative functions in disease resistance and plant defense were selected to design SNP arrays for high-throughput genotyping. Among these SNP loci, 71 were highly polymorphic, with sufficient variation to identify a unique genotype for each of the 371 individuals originating from British Columbia (Canada), Oregon and Washington (USA). A clear genetic differentiation was evident among seed families. Analyses of genetic spatial patterns revealed varying degrees of diversity and the existence of several genetic subgroups in the WBP breeding populations. Genetic components were associated with geographic variables and phenotypic rating of WPBR disease severity across landscapes, which may facilitate further identification of WBP genotypes and gene alleles contributing to local adaptation and quantitative resistance to WPBR. 
The WBP genomic resources developed here provide an invaluable tool for further studies and for exploitation and utilization of the genetic diversity preserved within this endangered conifer and other five-needle pines. PMID:27992468
Nudel, R; Simpson, N H; Baird, G; O’Hare, A; Conti-Ramsden, G; Bolton, P F; Hennessy, E R; Ring, S M; Davey Smith, G; Francks, C; Paracchini, S; Monaco, A P; Fisher, S E; Newbury, D F
2014-01-01
Specific language impairment (SLI) is a neurodevelopmental disorder that affects linguistic abilities when development is otherwise normal. We report the results of a genome-wide association study of SLI which included parent-of-origin effects and child genotype effects and used 278 families of language-impaired children. The child genotype effects analysis did not identify significant associations. We found genome-wide significant paternal parent-of-origin effects on chromosome 14q12 (P = 3.74 × 10⁻⁸) and suggestive maternal parent-of-origin effects on chromosome 5p13 (P = 1.16 × 10⁻⁷). A subsequent targeted association of six single-nucleotide polymorphisms (SNPs) on chromosome 5 in 313 language-impaired individuals and their mothers from the ALSPAC cohort replicated the maternal effects, albeit in the opposite direction (P = 0.001); as fathers’ genotypes were not available in the ALSPAC study, the replication analysis did not include paternal parent-of-origin effects. The paternally-associated SNP on chromosome 14 yields a non-synonymous coding change within the NOP9 gene. This gene encodes an RNA-binding protein that has been reported to be significantly dysregulated in individuals with schizophrenia. The region of maternal association on chromosome 5 falls between the PTGER4 and DAB2 genes, in a region previously implicated in autism and ADHD. The top SNP in this association locus is a potential expression QTL of ARHGEF19 (also called WGEF) on chromosome 1. Members of this protein family have been implicated in intellectual disability. In summary, this study implicates parent-of-origin effects in language impairment, and adds an interesting new dimension to the emerging picture of shared genetic etiology across various neurodevelopmental disorders. PMID:24571439
Lourenco-Jaramillo, Diana Lelidett; Sifuentes-Rincón, Ana María; Parra-Bracamonte, Gaspar Manuel; de la Rosa-Reyna, Xochitl Fabiola; Segura-Cabrera, Aldo; Arellano-Vera, Williams
2012-01-01
DNA from four cattle breeds was used to re-sequence all of the exons and 56% of the introns of the bovine tyrosine hydroxylase (TH) gene and 97% and 13% of the bovine dopamine β-hydroxylase (DBH) coding and non-coding sequences, respectively. Two novel single nucleotide polymorphisms (SNPs) and a microsatellite motif were found in the TH sequences. The DBH sequences contained 62 nucleotide changes, including eight non-synonymous SNPs (nsSNPs) that are of particular interest because they may alter protein function and therefore affect the phenotype. These DBH nsSNPs resulted in amino acid substitutions that were predicted to destabilize the protein structure. Six SNPs (one from TH and five from DBH non-synonymous SNPs) were genotyped in 140 animals; all of them were polymorphic and had a minor allele frequency of > 9%. There were significant differences in the intra- and inter-population haplotype distributions. The haplotype differences between Brahman cattle and the three B. t. taurus breeds (Charolais, Holstein and Lidia) were interesting from a behavioural point of view because of the differences in temperament between these breeds. PMID:22888292
Genome-scale investigation of phenotypically distinct but nearly clonal Trichoderma strains
Weld, Richard J.; Cox, Murray P.; Bradshaw, Rosie E.; McLean, Kirstin L.; Stewart, Alison; Steyaert, Johanna M.
2016-01-01
Biological control agents (BCA) are beneficial organisms that are applied to protect plants from pests. Many fungi of the genus Trichoderma are successful BCAs but the underlying mechanisms are not yet fully understood. Trichoderma cf. atroviride strain LU132 is a remarkably effective BCA compared to T. cf. atroviride strain LU140 but these strains were found to be highly similar at the DNA sequence level. This unusual combination of phenotypic variability and high DNA sequence similarity between separately isolated strains prompted us to undertake a genome comparison study in order to identify DNA polymorphisms. We further investigated if the polymorphisms had functional effects on the phenotypes. The two strains were clearly identified as individuals, exhibiting different growth rates, conidiation and metabolism. Superior pathogen control demonstrated by LU132 depended on its faster growth, which is a prerequisite for successful distribution and competition. Genome sequencing identified only one non-synonymous single nucleotide polymorphism (SNP) between the strains. Based on this SNP, we successfully designed and validated an RFLP protocol that can be used to differentiate LU132 from LU140 and other Trichoderma strains. This SNP changed the amino acid sequence of SERF, encoded by the previously undescribed single copy gene “small EDRK-rich factor” (serf). A deletion of serf in the two strains did not lead to identical phenotypes, suggesting that, in addition to the single functional SNP between the nearly clonal Trichoderma cf. atroviride strains, other non-genomic factors contribute to their phenotypic variation. This finding is significant as it shows that genomics is an extremely useful but not exhaustive tool for the study of biocontrol complexity and for strain typing. PMID:27190719
USDA-ARS's Scientific Manuscript database
Single-nucleotide Polymorphism (SNP) markers are by far the most common form of DNA polymorphism in a genome. The objectives of this study were to discover SNPs in common bean comparing sequences from coding and non-coding regions obtained from Genbank and genomic DNA and to compare sequencing resu...
Rodrigue, Nicolas; Lartillot, Nicolas
2017-01-01
Codon substitution models have traditionally attempted to uncover signatures of adaptation within protein-coding genes by contrasting the rates of synonymous and non-synonymous substitutions. Another modeling approach, known as the mutation-selection framework, attempts to explicitly account for selective patterns at the amino acid level, with some approaches allowing for heterogeneity in these patterns across codon sites. Under such a model, substitutions at a given position occur at the neutral or nearly neutral rate when they are synonymous, or when they correspond to replacements between amino acids of similar fitness; substitutions from high to low (low to high) fitness amino acids have comparatively low (high) rates. Here, we study the use of such a mutation-selection framework as a null model for the detection of adaptation. Following previous works in this direction, we include a deviation parameter that has the effect of capturing the surplus, or deficit, in non-synonymous rates, relative to what would be expected under a mutation-selection modeling framework that includes a Dirichlet process approach to account for across-codon-site variation in amino acid fitness profiles. We use simulations, along with a few real data sets, to study the behavior of the approach, and find it to have good power with a low false-positive rate. Altogether, we emphasize the potential of recent mutation-selection models in the detection of adaptation, calling for further model refinements as well as large-scale applications. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
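The contrast between synonymous and non-synonymous substitutions that underlies these codon models can be made concrete with the standard genetic code. The sketch below classifies a codon change and counts how many single-nucleotide neighbours of a codon preserve the amino acid. It is a simple illustration, not a mutation-selection model; the function names are illustrative, and changes to or from a stop codon are simply counted as non-synonymous.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change by whether the encoded amino acid is preserved."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"

def synonymous_fraction(codon: str) -> float:
    """Fraction of the 9 single-nucleotide neighbours that encode the same amino acid."""
    codon = codon.upper()
    same = 0
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                neighbour = codon[:pos] + base + codon[pos + 1:]
                same += CODON_TABLE[neighbour] == CODON_TABLE[codon]
    return same / 9

print(classify_substitution("TAT", "TAC"))  # Tyr -> Tyr
print(classify_substitution("GCT", "GTT"))  # Ala -> Val
print(synonymous_fraction("GCT"), synonymous_fraction("ATG"))
```

Because the fraction of synonymous neighbours differs between codons, rate comparisons of this kind normalize observed changes by the number of synonymous and non-synonymous sites rather than comparing raw counts.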
Melchardt, Thomas; Hufnagl, Clemens; Weinstock, David M; Kopp, Nadja; Neureiter, Daniel; Tränkenschuh, Wolfgang; Hackl, Hubert; Weiss, Lukas; Rinnerthaler, Gabriel; Hartmann, Tanja N; Greil, Richard; Weigert, Oliver; Egle, Alexander
2016-08-09
Little information is available about the role of certain mutations for clonal evolution and the clinical outcome during relapse in diffuse large B-cell lymphoma (DLBCL). Therefore, we analyzed formalin-fixed-paraffin-embedded tumor samples from first diagnosis, relapsed or refractory disease from 28 patients using next-generation sequencing of the exons of 104 coding genes. Non-synonymous mutations were present in 74 of the 104 genes tested. Primary tumor samples showed a median of 8 non-synonymous mutations (range: 0-24) with the used gene set. Lower numbers of non-synonymous mutations in the primary tumor were associated with a better median OS compared with higher numbers (28 versus 15 months, p=0.031). We observed three patterns of clonal evolution during relapse of disease: large global change, subclonal selection and no or minimal change possibly suggesting preprogrammed resistance. We conclude that targeted re-sequencing is a feasible and informative approach to characterize the molecular pattern of relapse and it creates novel insights into the role of dynamics of individual genes.
Locke, Jonathan M; Wei, Fan-Yan; Tomizawa, Kazuhito; Weedon, Michael N; Harries, Lorna W
2015-04-01
Intronic single nucleotide polymorphisms (SNPs) in the CDKAL1 gene are associated with risk of developing type 2 diabetes. A strong correlation between risk alleles and lower levels of the non-coding RNA, CDKAL1-v1, has recently been reported in whole blood extracted from Japanese individuals. We sought to replicate this association in two independent cohorts: one using whole blood from white UK-resident individuals, and one using a collection of human pancreatic islets, a more relevant tissue type to study with respect to the aetiology of diabetes. Levels of CDKAL1-v1 were measured by real-time PCR using RNA extracted from human whole blood (n = 70) and human pancreatic islets (n = 48). Expression with respect to genotype was then determined. In a simple linear regression model, expression of CDKAL1-v1 was associated with the lead type 2 diabetes-associated SNP, rs7756992, in whole blood and islets. However, these associations were abolished or substantially reduced in multiple regression models taking into account rs9366357 genotype: a moderately linked SNP explaining a much larger amount of the variation in CDKAL1-v1 levels, but not strongly associated with risk of type 2 diabetes. Contrary to previous findings, we provide evidence against a role for dysregulated expression of CDKAL1-v1 in mediating the association between intronic SNPs in CDKAL1 and susceptibility to type 2 diabetes. The results of this study illustrate how caution should be exercised when inferring causality from an association between disease-risk genotype and non-coding RNA expression.
Abo-Al-Ela, Haitham G; El-Magd, Mohammed Abu; El-Nahas, Abeer F; Mansour, Ali A
2014-08-01
Insulin-like growth factor 2 (IGF2) plays an important role in muscle growth and it might be used as a marker for the growth traits selection strategies in farm animals. The objectives of this study were to detect polymorphisms in exon 10 of IGF2 and to determine associations between these polymorphisms and growth traits in Egyptian water buffalo. PCR-single-strand conformation polymorphism (SSCP) and DNA sequencing methods were used to detect any prospective polymorphism. A novel single nucleotide polymorphism (SNP), C287A, was detected. It was a non-synonymous mutation and led to replacement of glutamine (Q) amino acid (aa) by histidine (H) aa. Three different SSCP patterns were observed: AA, AC, and CC, with frequencies of 0.540, 0.325, and 0.135, respectively. Association analyses revealed that the AA individuals had a higher average daily gain (ADG) than other individuals (CC and AC) from birth to 9 months of age. We conclude that the AA genotype in C287A SNP in the exon 10 of the IGF2 gene is associated with the ADG during the age from birth to 9 months and could be used as a potential genetic marker for selection of growth traits in Egyptian buffalo.
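The genotype frequencies reported in this abstract (AA 0.540, AC 0.325, CC 0.135) imply allele frequencies that can be checked with a short calculation. The sketch below is an illustration rather than the authors' analysis; the function names are hypothetical, and the Hardy-Weinberg expectations are shown only for comparison with the observed values.

```python
def allele_frequencies(f_aa, f_ac, f_cc):
    """Return (p_A, p_C) for a biallelic locus from genotype frequencies."""
    total = f_aa + f_ac + f_cc
    if abs(total - 1.0) > 1e-6:
        raise ValueError("genotype frequencies must sum to 1")
    # Each heterozygote contributes half of its frequency to each allele.
    return f_aa + 0.5 * f_ac, f_cc + 0.5 * f_ac


def hardy_weinberg_expected(p_a, p_c):
    """Expected (AA, AC, CC) genotype frequencies under Hardy-Weinberg equilibrium."""
    return p_a ** 2, 2.0 * p_a * p_c, p_c ** 2


# Frequencies taken from the abstract above.
p_a, p_c = allele_frequencies(0.540, 0.325, 0.135)
# p_A = 0.7025 and p_C = 0.2975, up to floating-point rounding
expected = hardy_weinberg_expected(p_a, p_c)
```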
Li, Jiqin; Bao, Zhenmin; Li, Ling; Wang, Xiaojian; Wang, Shi; Hu, Xiaoli
2013-09-01
Zhikong scallop (Chlamys farreri) is an important maricultured species in China. Much research on this species, such as population genetics and QTL fine-mapping, requires a large number of molecular markers. In this study, based on expressed sequence tags (ESTs), a total of 300 putative single nucleotide polymorphisms (SNPs) were selected and validated using high resolution melting (HRM) technology with unlabeled probes. Of them, 101 (33.7%) were found to be polymorphic in 48 individuals from 4 populations. Further evaluation with 48 individuals from the Qingdao population showed that all the polymorphic loci had two alleles, with the minor allele frequency ranging from 0.046 to 0.500. The observed and expected heterozygosities ranged from 0.000 to 0.925 and from 0.089 to 0.505, respectively. Fifteen loci deviated significantly from Hardy-Weinberg equilibrium, and significant linkage disequilibrium was detected in one pair of markers. BLASTx gave significant hits for 72 of the 101 polymorphic SNP-containing ESTs. Thirty-four polymorphic SNP loci were predicted to be non-synonymous substitutions, as they caused either a change of codon (33 SNPs) or premature termination of translation (1 SNP). The markers developed can be used for population studies and genetic improvement of Zhikong scallop.
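The per-locus statistics named in this abstract (minor allele frequency, observed and expected heterozygosity) can be computed from genotype counts as sketched below. The genotype counts are hypothetical, and the abstract does not say which estimator of expected heterozygosity was used; the optional small-sample correction is included only because an upper value slightly above 0.5 would be consistent with it.

```python
def locus_summary(n_aa, n_ab, n_bb, unbiased=False):
    """Summarize a biallelic locus from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    h_exp = 2.0 * p * q                  # expected heterozygosity
    if unbiased:
        # Small-sample correction, 2n / (2n - 1); an assumption, see lead-in.
        h_exp *= (2 * n) / (2 * n - 1)
    return {
        "maf": min(p, q),                # minor allele frequency
        "h_obs": n_ab / n,               # observed heterozygosity
        "h_exp": h_exp,
    }


# Hypothetical counts for 48 individuals, matching the sample size in the text.
summary = locus_summary(12, 24, 12)
corrected = locus_summary(12, 24, 12, unbiased=True)
```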
Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon
2015-02-01
There are five native chicken lines in Korea, which are mainly classified by plumage color (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Using next-generation sequencing technology, whole genomes from these lines were sequenced and assembled against the Gallus_gallus_4.0 reference (NCBI) to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and an average of 26.6-fold of 25-29 billion reference assembly sequences, representing 97.288% coverage. In addition, 4,006,068 ± 97,534 SNPs were observed on 29 autosomes and the Z chromosome and, of these, 752,309 SNPs were common across lines. Among the identified SNPs, the numbers of novel and known location-assigned SNPs were 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and that of unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes numbered 26,266 ± 1,456, 11,467 ± 604, and 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparison with dbSNP in NCBI. The genome sequence and SNP information obtained here for Korean native chickens have wide applications in further genome studies, such as genetic diversity studies and the detection of causative mutations for economic and disease-related traits.
Genome-wide re-sequencing of multidrug-resistant Mycobacterium leprae Airaku-3.
Singh, P; Benjak, A; Carat, S; Kai, M; Busso, P; Avanzi, C; Paniz-Mondolfi, A; Peter, C; Harshman, K; Rougemont, J; Matsuoka, M; Cole, S T
2014-10-01
Genotyping and molecular characterization of drug resistance mechanisms in Mycobacterium leprae enables disease transmission and drug resistance trends to be monitored. In the present study, we performed genome-wide analysis of Airaku-3, a multidrug-resistant strain with an unknown mechanism of resistance to rifampicin. We identified 12 unique non-synonymous single-nucleotide polymorphisms (SNPs) including two in the transporter-encoding ctpC and ctpI genes. In addition, two SNPs were found that improve the resolution of SNP-based genotyping, particularly for Venezuelan and South East Asian strains of M. leprae. © 2014 The Authors Clinical Microbiology and Infection © 2014 European Society of Clinical Microbiology and Infectious Diseases.
Song, Jiangning; Wang, Minglei; Burrage, Kevin
2006-07-21
High-quality data about protein structures and their gene sequences are essential to the understanding of the relationship between protein folding and protein coding sequences. Firstly we constructed the EcoPDB database, which is a high-quality database of Escherichia coli genes and their corresponding PDB structures. Based on EcoPDB, we presented a novel approach based on information theory to investigate the correlation between cysteine synonymous codon usages and local amino acids flanking cysteines, the correlation between cysteine synonymous codon usages and synonymous codon usages of local amino acids flanking cysteines, as well as the correlation between cysteine synonymous codon usages and the disulfide bonding states of cysteines in the E. coli genome. The results indicate that the nearest neighboring residues and their synonymous codons of the C-terminus have the greatest influence on the usages of the synonymous codons of cysteines and the usage of the synonymous codons has a specific correlation with the disulfide bond formation of cysteines in proteins. The correlations may result from the regulation mechanism of protein structures at gene sequence level and reflect the biological function restriction that cysteines pair to form disulfide bonds. The results may also be helpful in identifying residues that are important for synonymous codon selection of cysteines to introduce disulfide bridges in protein engineering and molecular biology. The approach presented in this paper can also be utilized as a complementary computational method and be applicable to analyse the synonymous codon usages in other model organisms.
A second generation human haplotype map of over 3.1 million SNPs.
Frazer, Kelly A; Ballinger, Dennis G; Cox, David R; Hinds, David A; Stuve, Laura L; Gibbs, Richard A; Belmont, John W; Boudreau, Andrew; Hardenbol, Paul; Leal, Suzanne M; Pasternak, Shiran; Wheeler, David A; Willis, Thomas D; Yu, Fuli; Yang, Huanming; Zeng, Changqing; Gao, Yang; Hu, Haoran; Hu, Weitao; Li, Chaohua; Lin, Wei; Liu, Siqi; Pan, Hao; Tang, Xiaoli; Wang, Jian; Wang, Wei; Yu, Jun; Zhang, Bo; Zhang, Qingrun; Zhao, Hongbin; Zhao, Hui; Zhou, Jun; Gabriel, Stacey B; Barry, Rachel; Blumenstiel, Brendan; Camargo, Amy; Defelice, Matthew; Faggart, Maura; Goyette, Mary; Gupta, Supriya; Moore, Jamie; Nguyen, Huy; Onofrio, Robert C; Parkin, Melissa; Roy, Jessica; Stahl, Erich; Winchester, Ellen; Ziaugra, Liuda; Altshuler, David; Shen, Yan; Yao, Zhijian; Huang, Wei; Chu, Xun; He, Yungang; Jin, Li; Liu, Yangfan; Shen, Yayun; Sun, Weiwei; Wang, Haifeng; Wang, Yi; Wang, Ying; Xiong, Xiaoyan; Xu, Liang; Waye, Mary M Y; Tsui, Stephen K W; Xue, Hong; Wong, J Tze-Fei; Galver, Luana M; Fan, Jian-Bing; Gunderson, Kevin; Murray, Sarah S; Oliphant, Arnold R; Chee, Mark S; Montpetit, Alexandre; Chagnon, Fanny; Ferretti, Vincent; Leboeuf, Martin; Olivier, Jean-François; Phillips, Michael S; Roumy, Stéphanie; Sallée, Clémentine; Verner, Andrei; Hudson, Thomas J; Kwok, Pui-Yan; Cai, Dongmei; Koboldt, Daniel C; Miller, Raymond D; Pawlikowska, Ludmila; Taillon-Miller, Patricia; Xiao, Ming; Tsui, Lap-Chee; Mak, William; Song, You Qiang; Tam, Paul K H; Nakamura, Yusuke; Kawaguchi, Takahisa; Kitamoto, Takuya; Morizono, Takashi; Nagashima, Atsushi; Ohnishi, Yozo; Sekine, Akihiro; Tanaka, Toshihiro; Tsunoda, Tatsuhiko; Deloukas, Panos; Bird, Christine P; Delgado, Marcos; Dermitzakis, Emmanouil T; Gwilliam, Rhian; Hunt, Sarah; Morrison, Jonathan; Powell, Don; Stranger, Barbara E; Whittaker, Pamela; Bentley, David R; Daly, Mark J; de Bakker, Paul I W; Barrett, Jeff; Chretien, Yves R; Maller, Julian; McCarroll, Steve; Patterson, Nick; Pe'er, Itsik; Price, Alkes; Purcell, Shaun; Richter, 
Daniel J; Sabeti, Pardis; Saxena, Richa; Schaffner, Stephen F; Sham, Pak C; Varilly, Patrick; Altshuler, David; Stein, Lincoln D; Krishnan, Lalitha; Smith, Albert Vernon; Tello-Ruiz, Marcela K; Thorisson, Gudmundur A; Chakravarti, Aravinda; Chen, Peter E; Cutler, David J; Kashuk, Carl S; Lin, Shin; Abecasis, Gonçalo R; Guan, Weihua; Li, Yun; Munro, Heather M; Qin, Zhaohui Steve; Thomas, Daryl J; McVean, Gilean; Auton, Adam; Bottolo, Leonardo; Cardin, Niall; Eyheramendy, Susana; Freeman, Colin; Marchini, Jonathan; Myers, Simon; Spencer, Chris; Stephens, Matthew; Donnelly, Peter; Cardon, Lon R; Clarke, Geraldine; Evans, David M; Morris, Andrew P; Weir, Bruce S; Tsunoda, Tatsuhiko; Mullikin, James C; Sherry, Stephen T; Feolo, Michael; Skol, Andrew; Zhang, Houcan; Zeng, Changqing; Zhao, Hui; Matsuda, Ichiro; Fukushima, Yoshimitsu; Macer, Darryl R; Suda, Eiko; Rotimi, Charles N; Adebamowo, Clement A; Ajayi, Ike; Aniagwu, Toyin; Marshall, Patricia A; Nkwodimmah, Chibuzor; Royal, Charmaine D M; Leppert, Mark F; Dixon, Missy; Peiffer, Andy; Qiu, Renzong; Kent, Alastair; Kato, Kazuto; Niikawa, Norio; Adewole, Isaac F; Knoppers, Bartha M; Foster, Morris W; Clayton, Ellen Wright; Watkin, Jessica; Gibbs, Richard A; Belmont, John W; Muzny, Donna; Nazareth, Lynne; Sodergren, Erica; Weinstock, George M; Wheeler, David A; Yakub, Imtaz; Gabriel, Stacey B; Onofrio, Robert C; Richter, Daniel J; Ziaugra, Liuda; Birren, Bruce W; Daly, Mark J; Altshuler, David; Wilson, Richard K; Fulton, Lucinda L; Rogers, Jane; Burton, John; Carter, Nigel P; Clee, Christopher M; Griffiths, Mark; Jones, Matthew C; McLay, Kirsten; Plumb, Robert W; Ross, Mark T; Sims, Sarah K; Willey, David L; Chen, Zhu; Han, Hua; Kang, Le; Godbout, Martin; Wallenburg, John C; L'Archevêque, Paul; Bellemare, Guy; Saeki, Koji; Wang, Hongguang; An, Daochang; Fu, Hongbo; Li, Qing; Wang, Zhen; Wang, Renwu; Holden, Arthur L; Brooks, Lisa D; McEwen, Jean E; Guyer, Mark S; Wang, Vivian Ota; Peterson, Jane L; Shi, Michael; 
Spiegel, Jack; Sung, Lawrence M; Zacharia, Lynn F; Collins, Francis S; Kennedy, Karen; Jamieson, Ruth; Stewart, John
2007-10-18
We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
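The tagging statistic used in this abstract is the squared correlation between alleles at two SNPs. A minimal sketch of the standard definition is given below; the haplotype and allele frequencies in the usage lines are hypothetical, and the function name is an illustrative choice rather than anything from the HapMap tooling.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared allelic correlation between two biallelic SNPs.

    p_ab is the frequency of the haplotype carrying allele A at the first
    SNP and allele B at the second; p_a and p_b are the allele frequencies.
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom <= 0.0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b                 # linkage disequilibrium coefficient D
    return d * d / denom


# Hypothetical frequencies: perfect tagging versus independent SNPs.
perfect = ld_r2(p_ab=0.5, p_a=0.5, p_b=0.5)
independent = ld_r2(p_ab=0.25, p_a=0.5, p_b=0.5)
```

An "average maximum" value, as reported above, would then be obtained by taking for each untyped SNP the largest value over all typed SNPs and averaging those maxima.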
Regions of extreme synonymous codon selection in mammalian genes
Schattner, Peter; Diekhans, Mark
2006-01-01
Recently there has been increasing evidence that purifying selection occurs among synonymous codons in mammalian genes. This selection appears to be a consequence of either cis-regulatory motifs, such as exonic splicing enhancers (ESEs), or mRNA secondary structures, being superimposed on the coding sequence of the gene. We have developed a program to identify regions likely to be enriched for such motifs by searching for extended regions of extreme codon conservation between homologous genes of related species. Here we present the results of applying this approach to five mammalian species (human, chimpanzee, mouse, rat and dog). Even with very conservative selection criteria, we find over 200 regions of extreme codon conservation, ranging in length from 60 to 178 codons. The regions are often found within genes involved in DNA-binding, RNA-binding or zinc-ion-binding. They are highly depleted for synonymous single nucleotide polymorphisms (SNPs) but not for non-synonymous SNPs, further indicating that the observed codon conservation is being driven by negative selection. Forty-three percent of the regions overlap conserved alternative transcript isoforms and are enriched for known ESEs. Other regions are enriched for TpA dinucleotides and may contain conserved motifs/structures relating to mRNA stability and/or degradation. We anticipate that this tool will be useful for detecting regions enriched in other classes of coding-sequence motifs and structures as well. PMID:16556911
Manaffar, R; Zare, S; Agh, N; Abdolahzadeh, N; Soltanian, S; Sorgeloos, P; Bossier, P; Van Stappen, G
2011-01-01
In order to find a marker for differentiating between a bisexual and a parthenogenetic Artemia strain, exon 7 of the Na/K ATPase α1 subunit gene was screened by the RFLP technique. Digestion with the Tru1I enzyme revealed a synonymous SNP (single nucleotide polymorphism) that consistently distinguished these two types of Artemia. This SNP was identified as an accurate molecular marker for discrimination between bisexual and parthenogenetic Artemia. According to Nei's genetic distance (1973), the lowest genetic distance was found between individuals from Artemia urmiana Günther 1890 and parthenogenetic populations, making the described marker the first marker to easily distinguish between these two co-occurring species. © 2010 Blackwell Publishing Ltd.
Kumar, Ambuj; Rajendran, Vidya; Sethumadhavan, Rao; Purohit, Rituraj
2012-01-01
Human STIL (SCL/TAL1 interrupting locus) protein maintains centriole stability and spindle pole localisation. It helps in recruitment of CENPJ (Centromere protein J)/CPAP (centrosomal P4.1-associated protein) and other centrosomal proteins. Mutations in STIL protein are reported in several disorders, especially in deregulation of cell cycle cascades. In this work, we examined the non-synonymous single nucleotide polymorphisms (nsSNPs) reported in STIL protein for their disease association. Different SNP prediction tools were used to predict disease-associated nsSNPs. Our evaluation technique predicted rs147744459 (R242C) as a highly deleterious disease-associated nsSNP and its interaction behaviour with CENPJ protein. Molecular modelling, docking and molecular dynamics simulation were conducted to examine the structural consequences of the predicted disease-associated mutation. By molecular dynamic simulation we observed structural consequences of R242C mutation which affects interaction of STIL and CENPJ functional domains. The result obtained in this study will provide a biophysical insight into future investigations of pathological nsSNPs using a computational platform.
Nagasundaram, N; Priya Doss, C George
2011-01-01
Distinguishing the deleterious from the massive number of non-functional nsSNPs that occur within a single genome is a considerable challenge in mutation research. In this work, we used existing in silico methods to explore the mutation-structure-function relationship in the XPA gene. We used the Sorting Intolerant From Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), I-Mutant 2.0, and Protein Analysis THrough Evolutionary Relationships methods to predict the effects of deleterious nsSNPs on protein function, and evaluated the impact of mutation on protein stability by molecular dynamics simulations. By comparing the scores of all four in silico methods, the nsSNP rs104894131 (C108F) was predicted to be highly deleterious. We extended our molecular dynamics approach to gain insight into the impact of this non-synonymous polymorphism on structural changes that may affect the activity of the XPA gene. Based on the in silico method scores, potential energy, root-mean-square deviation, and root-mean-square fluctuation, we predict that the deleterious nsSNP C108F would play a significant role in disease caused by the XPA gene. Our approach illustrates the application of in silico tools in understanding functional variation from the perspective of structure, evolution, and phenotype.
Sequence variations of the bovine prion protein gene (PRNP) in native Korean Hanwoo cattle
Choi, Sangho
2012-01-01
Bovine spongiform encephalopathy (BSE) is one of the fatal neurodegenerative diseases known as transmissible spongiform encephalopathies (TSEs) caused by infectious prion proteins. Genetic variations correlated with susceptibility or resistance to TSE in humans and sheep have not been reported for bovine strains including those from Holstein, Jersey, and Japanese Black cattle. Here, we investigated bovine prion protein gene (PRNP) variations in Hanwoo cattle [Bos (B.) taurus coreanae], a native breed in Korea. We identified mutations and polymorphisms in the coding region of PRNP, determined their frequency, and evaluated their significance. We identified four synonymous polymorphisms and two non-synonymous mutations in PRNP, but found no novel polymorphisms. The sequence and number of octapeptide repeats were completely conserved, and the haplotype frequency of the coding region was similar to that of other B. taurus strains. When we examined the 23-bp and 12-bp insertion/deletion (indel) polymorphisms in the non-coding region of PRNP, Hanwoo cattle had a lower deletion allele and 23-bp del/12-bp del haplotype frequency than healthy and BSE-affected animals of other strains. Thus, Hanwoo are seemingly less susceptible to BSE than other strains due to the 23-bp and 12-bp indel polymorphisms. PMID:22705734
Hayashida, Kyoko; Abe, Takashi; Weir, William; Nakao, Ryo; Ito, Kimihito; Kajino, Kiichi; Suzuki, Yutaka; Jongejan, Frans; Geysen, Dirk; Sugimoto, Chihiro
2013-01-01
The disease caused by the apicomplexan protozoan parasite Theileria parva, known as East Coast fever or Corridor disease, is one of the most serious cattle diseases in Eastern, Central, and Southern Africa. We performed whole-genome sequencing of nine T. parva strains, including one of the vaccine strains (Kiambu 5), field isolates from Zambia, Uganda, Tanzania, or Rwanda, and two buffalo-derived strains. Comparison with the reference Muguga genome sequence revealed 34 814–121 545 single nucleotide polymorphisms (SNPs) that were more abundant in buffalo-derived strains. High-resolution phylogenetic trees were constructed with selected informative SNPs that allowed the investigation of possible complex recombination events among ancestors of the extant strains. We further analysed the dN/dS ratio (non-synonymous substitutions per non-synonymous site divided by synonymous substitutions per synonymous site) for 4011 coding genes to estimate potential selective pressure. Genes under possible positive selection were identified that may, in turn, assist in the identification of immunogenic proteins or vaccine candidates. This study elucidated the phylogeny of T. parva strains based on genome-wide SNPs analysis with prediction of possible past recombination events, providing insight into the migration, diversification, and evolution of this parasite species in the African continent. PMID:23404454
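The dN/dS ratio defined in this abstract can be illustrated with a minimal calculation. The abstract does not state which estimator was used, so the sketch below assumes the final step of a counting approach: proportions of differences per site are corrected for multiple hits with the Jukes-Cantor formula and then divided. The site and difference counts are hypothetical inputs, not values from the study.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differing sites p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of non-synonymous to synonymous substitutions per site."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Hypothetical gene: 5 differences at 700 non-synonymous sites,
# 10 differences at 300 synonymous sites.
omega = dn_ds(5, 700, 10, 300)
```

Under this convention a ratio below 1 is usually read as purifying selection and a ratio above 1 as possible positive selection, which is the sense in which the abstract screens its 4011 coding genes.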
Kai, M; Nakata, N; Matsuoka, M; Sekizuka, T; Kuroda, M; Makino, M
2013-10-01
Genome analysis of Mycobacterium leprae strain Kyoto-2 in this study revealed characteristic nucleotide substitutions in gene ML0411, compared to the reference genome M. leprae strain TN. The ML0411 gene of Kyoto-2 had six SNPs compared to that of TN. All SNPs in ML0411 were non-synonymous mutations that result in amino acid replacements. In addition, a seventh SNP was found 41 bp upstream of the start codon in the regulatory region. The seven SNP sites in the ML0411 region were investigated by sequencing in 36 M. leprae isolates from the Leprosy Research Center in Japan. The SNP pattern in 14 of the 36 isolates showed similarity to that of Kyoto-2. Determination of the standard SNP types within the 36 stocked isolates revealed that almost all of the Japanese strains belonged to SNP type III, with nucleotide substitutions at positions 14676, 164275, and 2935685 of the M. leprae TN genome. The geographical distribution pattern of East Asian M. leprae isolates based on discrimination of ML0411 SNPs was investigated and, interestingly, turned out to be similar to that of the tandem repeat number of GACATC in the rpoT gene (3 copies or 4 copies), which has been established as a tool for M. leprae genotyping. All seven Korean M. leprae isolates examined in this study, as well as those derived from Honshu Island of Japan, showed 4 copies of the 6-base tandem repeat plus the ML0411 SNPs observed in M. leprae Kyoto-2. These are termed the Northeast Asian (NA) strain of M. leprae. On the other hand, many of the isolates derived from the Okinawa Islands of Japan and from the Philippines showed 3 copies of the 6-base tandem repeat in addition to the M. leprae TN ML0411 type of SNPs. These results demonstrate the existence of M. leprae strains in the Northeast Asian region having characteristic SNP patterns. Copyright © 2013 Elsevier B.V. All rights reserved.
MMP9 polymorphisms and breast cancer risk: a report from the Shanghai Breast Cancer Genetics Study.
Beeghly-Fadiel, Alicia; Lu, Wei; Shu, Xiao-Ou; Long, Jirong; Cai, Qiuyin; Xiang, Yongbin; Gao, Yu-Tang; Zheng, Wei
2011-04-01
In addition to tumor invasion and angiogenesis, matrix metalloproteinase (MMP)9 also contributes to carcinogenesis and tumor growth. Genetic variation that may influence MMP9 expression was evaluated among participants of the Shanghai Breast Cancer Genetics Study (SBCGS) for associations with breast cancer susceptibility. In Stage 1, 11 MMP9 single nucleotide polymorphisms (SNPs) were genotyped by the Affymetrix Targeted Genotyping System and/or the Affymetrix Genome-Wide Human SNP Array 6.0 among 4,227 SBCGS participants. One SNP was further genotyped using the Sequenom iPLEX MassARRAY platform among an additional 6,270 SBCGS participants. Associations with breast cancer risk were evaluated by odds ratios (OR) and 95% confidence intervals (CI) from logistic regression models that included adjustment for age, education, and genotyping stage when appropriate. In Stage 1, rare allele homozygotes for a promoter SNP (rs3918241) or a non-synonymous SNP (rs2274756, R668Q) tended to occur more frequently among breast cancer cases (P value = 0.116 and 0.056, respectively). Given their high linkage disequilibrium (D' = 1.0, r² = 0.97), one (rs3918241) was selected for additional analysis. An association with breast cancer risk was not supported by additional Stage 2 genotyping. In combined analysis, no elevated risk of breast cancer among homozygotes was found (OR: 1.2, 95% CI: 0.8-1.8). Common genetic variation in MMP9 was not found to be significantly associated with breast cancer susceptibility among participants of the Shanghai Breast Cancer Genetics Study.
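The odds ratio and 95% confidence interval reported in this abstract come from adjusted logistic regression models. As a simpler illustration of how such an interval is formed, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. The counts are hypothetical and the result is not expected to reproduce the published estimate.

```python
import math


def odds_ratio_ci(case_exposed, case_unexposed,
                  control_exposed, control_unexposed, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    z = 1.96 gives an approximate 95% interval.
    """
    cells = (case_exposed, case_unexposed, control_exposed, control_unexposed)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical table: rare-allele homozygotes versus all other genotypes.
or_value, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

An interval that spans 1.0, as in the combined analysis quoted above (0.8-1.8), indicates that the data do not support an elevated risk at the conventional 5% level.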
Dubey, Bhawna; Meganathan, P R; Haque, Ikramul
2012-07-01
This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17258 bp, comprising 37 genes (13 protein-coding genes, 22 tRNA genes, and 2 ribosomal RNA genes) along with duplicate control regions, is described herein. The P. molurus molurus mt genome is relatively similar to other snake mt genomes with respect to gene arrangement, composition, tRNA structures and skews of AT/GC bases. The nucleotide composition of the genome shows that there are more A and C than T and G on the positive strand, as revealed by positive AT and CG skews. Comparison of individual protein-coding genes with other snake genomes suggests that the ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference for NNC codons over NNG codons in the mt genome of P. molurus. Also, the synonymous and non-synonymous substitution rates (Ka/Ks) suggest that most of the protein-coding genes are under purifying selection pressure. The phylogenetic analyses involving the concatenated 13 protein-coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
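The strand skews mentioned in this abstract follow the usual definitions, AT skew = (A - T) / (A + T) and GC skew = (G - C) / (G + C). The sketch below applies them to a short made-up sequence; the function name and the test sequence are illustrative and are not taken from the Python molurus genome.

```python
def base_skews(sequence):
    """Return (AT skew, GC skew) for a DNA sequence given as a string."""
    seq = sequence.upper()
    a, t = seq.count("A"), seq.count("T")
    g, c = seq.count("G"), seq.count("C")
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_skew, gc_skew


# Made-up strand with more A than T and more C than G.
at_skew, gc_skew = base_skews("AAATGCCC")
# at_skew = 0.5, gc_skew = -0.5
```

A negative GC skew is the same observation as the positive CG skew described in the abstract, since the two differ only in sign.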
El-Magd, Mohammed Abu; Saleh, Ayman A; Abdel-Hamid, Tamer M; Saleh, Rasha M; Afifi, Mohammed A
2016-10-01
Chicken growth hormone secretagogue receptor (GHSR) is a receptor for ghrelin (GHRL), a peptide hormone produced by the chicken proventriculus, which stimulates growth hormone (GH) release and food intake. The purpose of this study was to search for single nucleotide polymorphisms (SNPs) in exon 2 of the GHSR gene and to analyze their effect on appetite, growth traits and expression levels of the GHSR, GHRL, and GH genes, as well as serum levels of GH and GHRL, in Mandara chicken. Two adjacent SNPs, A239G and G244A, were detected in exon 2 of the GHSR gene. The G244A SNP was a non-synonymous mutation and led to replacement of a lysine amino acid (aa) by an arginine aa, while the A239G SNP was a synonymous mutation. The combined genotypes of the A239G and G244A SNPs produced three haplotypes, GG/GG, GG/AG, and AG/AG, which were significantly associated (P<0.05) with growth traits (body weight, average daily gain, shank length, keel length, chest circumference) at ages from >4 to 16 weeks. Chickens with the homozygous GG/GG haplotype showed higher growth performance than other chickens. The two SNPs were also correlated with mRNA levels of GHSR and GH (in the pituitary gland) and GHRL (in the proventriculus and hypothalamus), as well as with serum levels of GH and GHRL. Chickens with the GG/GG haplotype also showed higher mRNA and serum levels. This is the first study to demonstrate that SNPs in GHSR can increase appetite, growth traits, and the expression and level of GHRL, suggesting a hunger signal role for endogenous GHRL. Copyright © 2016 Elsevier Inc. All rights reserved.
Carter, Tamar E.; Boulter, Alexis; Existe, Alexandre; Romain, Jean R.; St. Victor, Jean Yves; Mulligan, Connie J.; Okech, Bernard A.
2015-01-01
Antimalarial drugs are a key tool in malaria elimination programs. With the emergence of artemisinin resistance in southeast Asia, an effort to identify molecular markers for surveillance of resistant malaria parasites is underway. Non-synonymous mutations in the kelch propeller domain (K13-propeller) in Plasmodium falciparum have been associated with artemisinin resistance in samples from southeast Asia, but additional studies are needed to characterize this locus in other P. falciparum populations with different levels of artemisinin use. Here, we sequenced the K13-propeller locus in 82 samples from Haiti, where limited government oversight of non-governmental organizations may have resulted in low-level use of artemisinin-based combination therapies. We detected a single-nucleotide polymorphism (SNP) at nucleotide 1,359 in a single isolate. Our results contribute to our understanding of the global genomic diversity of the K13-propeller locus in P. falciparum populations. PMID:25646258
Aris-Brosou, Stéphane; Bielawski, Joseph P
2006-08-15
A popular approach to examine the roles of mutation and selection in the evolution of genomes has been to consider the relationship between codon bias and synonymous rates of molecular evolution. A significant relationship between these two quantities is taken to indicate the action of weak selection on substitutions among synonymous codons. The neutral theory predicts that the rate of evolution is inversely related to the level of functional constraint. Therefore, selection against the use of non-preferred codons among those coding for the same amino acid should result in lower rates of synonymous substitution as compared with sites not subject to such selection pressures. However, reliably measuring the extent of such a relationship is problematic, as estimates of synonymous rates are sensitive to our assumptions about the process of molecular evolution. Previous studies showed the importance of accounting for unequal codon frequencies, in particular when synonymous codon usage is highly biased. Yet, unequal codon frequencies can be modeled in different ways, making different assumptions about the mutation process. Here we conduct a simulation study to evaluate two different ways of modeling uneven codon frequencies and show that both model parameterizations can have a dramatic impact on rate estimates and affect biological conclusions about genome evolution. We reanalyze three large data sets to demonstrate the relevance of our results to empirical data analysis.
Genomic Footprints in Selected and Unselected Beef Cattle Breeds in Korea.
Lim, Dajeong; Strucken, Eva M; Choi, Bong Hwan; Chai, Han Ha; Cho, Yong Min; Jang, Gul Won; Kim, Tae-Hun; Gondro, Cedric; Lee, Seung Hwan
2016-01-01
Korean Hanwoo cattle have been subjected to intensive artificial selection over the past four decades to improve meat production traits. Another three cattle varieties very closely related to Hanwoo reside in Korea (Jeju Black and Brindle) and in China (Yanbian). These breeds have not been part of a breeding scheme to improve production traits. Here, we compare the selected Hanwoo against these similar but presumed to be unselected populations to identify genomic regions that have been under recent selection pressure due to the breeding program. Rsb statistics were used to contrast the genomes of Hanwoo versus a pooled sample of the three unselected populations (UN). We identified 37 significant SNPs (FDR corrected) in the HW/UN comparison and 21 known protein coding genes were within 1 Mb of the identified SNPs. These genes were previously reported to affect traits important for meat production (14 genes), reproduction including mammary gland development (3 genes), coat color (2 genes), and genes affecting behavioral traits in a broader sense (2 genes). We subsequently sequenced (Illumina HiSeq 2000 platform) 10 individuals of the brown Hanwoo and the Chinese Yanbian to identify SNPs within the candidate genomic regions. Based on allele frequency differences, haplotype structures, and literature research, we singled out one non-synonymous SNP in the APP gene (APP: c.569C>T, Ala199Val) and predicted the mutational effect on the protein structure. We found that protein-protein interactions might be impaired due to increased exposed hydrophobic surfaces of the mutated protein. The APP gene has also been reported to affect meat tenderness in pigs and obesity in humans. Meat tenderness has been linked to intramuscular fat content, which is one of the main breeding goals for brown Hanwoo, potentially supporting a causal influence of the herein described nsSNP in the APP gene.
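The abstract reports "FDR corrected" significance without naming the procedure. As an illustration only, the sketch below assumes the Benjamini-Hochberg step-up adjustment, which is one common choice; the p-values are hypothetical and the function name is not from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-SNP p-values, for illustration only.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```

A SNP would then be called significant when its adjusted value falls below the chosen FDR level, here 0.05.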
Langeberg, Wendy J.; Kwon, Erika M.; Koopmeiners, Joseph S.; Ostrander, Elaine A.; Stanford, Janet L.
2009-01-01
Background Mismatch repair (MMR) gene activity may be associated with prostate cancer (PC) risk and outcomes. This study evaluated whether single nucleotide polymorphisms (SNPs) in key MMR genes are related to PC outcomes. Methods Data from two population-based case-control studies of PC among Caucasian and African-American men residing in King County, Washington were combined for this analysis. Cases (n=1,458) were diagnosed with PC in 1993–96 or 2002–05 and identified via the Seattle-Puget Sound SEER cancer registry. Controls (n=1,351) were age-matched to cases and identified via random digit dialing. Logistic regression was used to assess the relationship between haplotype-tagging SNPs and PC risk and disease aggressiveness. Cox proportional hazards regression was used to assess the relationship between SNPs and PC recurrence and PC-specific death. Results Nineteen SNPs were evaluated in the key MMR genes: five in MLH1, 10 in MSH2, and 4 in PMS2. Among Caucasian men, one SNP in MLH1 (rs9852810) was associated with: overall PC risk (OR=1.21, 95% CI=1.02, 1.44; p=0.03), more aggressive PC (OR=1.49, 95% CI=1.15–1.91; p<0.01), and PC recurrence (HR=1.83, 95% CI=1.18, 2.86; p<0.01), but not PC-specific mortality. A non-synonymous coding SNP in MLH1, rs1799977 (I219V), was also found to be associated with more aggressive disease. These results did not remain significant after adjusting for multiple comparisons. Conclusion This population-based case-control study provides evidence for a possible association with a gene variant in MLH1 in relation to risk of overall PC, more aggressive disease, and PC recurrence, which warrants replication. PMID:20056646
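The odds ratios and 95% confidence intervals quoted above came from logistic regression. As a simplified illustration of how such figures are read, the following sketch computes an unadjusted odds ratio with a Wald interval from a 2x2 table; the counts are hypothetical, and this is not the authors' model, which may have included covariates.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers among cases,    b: non-carriers among cases,
    c: carriers among controls, d: non-carriers among controls.
    z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1 corresponds to a nominally significant association before any correction for multiple comparisons.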
Nudel, R; Simpson, N H; Baird, G; O'Hare, A; Conti-Ramsden, G; Bolton, P F; Hennessy, E R; Ring, S M; Davey Smith, G; Francks, C; Paracchini, S; Monaco, A P; Fisher, S E; Newbury, D F
2014-04-01
Specific language impairment (SLI) is a neurodevelopmental disorder that affects linguistic abilities when development is otherwise normal. We report the results of a genome-wide association study of SLI which included parent-of-origin effects and child genotype effects and used 278 families of language-impaired children. The child genotype effects analysis did not identify significant associations. We found genome-wide significant paternal parent-of-origin effects on chromosome 14q12 (P = 3.74 × 10⁻⁸) and suggestive maternal parent-of-origin effects on chromosome 5p13 (P = 1.16 × 10⁻⁷). A subsequent targeted association of six single-nucleotide-polymorphisms (SNPs) on chromosome 5 in 313 language-impaired individuals and their mothers from the ALSPAC cohort replicated the maternal effects, albeit in the opposite direction (P = 0.001); as fathers' genotypes were not available in the ALSPAC study, the replication analysis did not include paternal parent-of-origin effects. The paternally-associated SNP on chromosome 14 yields a non-synonymous coding change within the NOP9 gene. This gene encodes an RNA-binding protein that has been reported to be significantly dysregulated in individuals with schizophrenia. The region of maternal association on chromosome 5 falls between the PTGER4 and DAB2 genes, in a region previously implicated in autism and ADHD. The top SNP in this association locus is a potential expression QTL of ARHGEF19 (also called WGEF) on chromosome 1. Members of this protein family have been implicated in intellectual disability. In summary, this study implicates parent-of-origin effects in language impairment, and adds an interesting new dimension to the emerging picture of shared genetic etiology across various neurodevelopmental disorders. © 2014 The Authors. Genes, Brain and Behavior published by International Behavioural and Neural Genetics Society and John Wiley & Sons Ltd.
Explaining the disease phenotype of intergenic SNP through predicted long range regulation
Chen, Jingqi; Tian, Weidong
2016-01-01
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it is likely that IGR daSNPs contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. PMID:27280978
Mutation Analysis of COL1A1 and COL1A2 in Fetuses with Osteogenesis Imperfecta Type II/III.
Wang, Wenbo; Wu, Qichang; Cao, Lin; Sun, Li; Xu, Yasong; Guo, Qiwei
2015-01-27
Aim: To analyze COL1A1/2 mutations in prenatal-onset OI in order to determine the proportion of mutations in type I collagen genes among prenatal-onset OI cases and to provide additional data for genotype-phenotype analyses. Material and Methods: Ten cases of severe fetal short-limb dwarfism detected by antenatal ultrasonography were referred to our center. Before the termination of pregnancy, cordocentesis was performed for fetal karyotype and COL1A1/2 gene sequencing analysis. Postmortem radiographic examination was performed in all instances for definitive diagnosis. Results: COL1A1 and COL1A2 SNPs and mutations were identified in all the cases. Among these, one synonymous SNP and four synonymous SNPs were recognized in COL1A1 and COL1A2, respectively; seven cases had distinct heterozygous mutations, and six new COL1A1/2 gene mutations were identified. Conclusion: There has been substantial progress in the identification of the molecular defects responsible for skeletal dysplasias. With the constant increase in the number of identified mutations in COL1A1 and COL1A2, genotype-phenotype correlation is becoming increasingly pertinent. © 2015 S. Karger AG, Basel.
Protein-based forensic identification using genetically variant peptides in human bone.
Mason, Katelyn Elizabeth; Anex, Deon; Grey, Todd; Hart, Bradley; Parker, Glendon
2018-04-22
Bone tissue contains organic material that is useful for forensic investigations and may contain preserved endogenous protein that can persist in the environment for extended periods of time over a range of conditions. Single amino acid polymorphisms in these proteins reflect genetic information since they result from non-synonymous single nucleotide polymorphisms (SNPs) in DNA. Detection of genetically variant peptides (GVPs) - those peptides that contain amino acid polymorphisms - in digests of bone proteins allows for the corresponding SNP alleles to be inferred. Resulting genetic profiles can be used to calculate statistical measures of association between a bone sample and an individual. In this study proteomic analysis on rib cortical bone samples from 10 recently deceased individuals demonstrates this concept. A straightforward acidic demineralization protocol yielded proteins that were digested with trypsin. Tryptic digests were analyzed by liquid chromatography mass spectrometry. A total of 1736 different proteins were identified across all resulting datasets. On average, individual samples contained 454 ± 121 (x̄ ± σ) proteins. Thirty-five genetically variant peptides were identified from 15 observed proteins. Overall, 134 SNP inferences were made based on proteomically detected GVPs, which were confirmed by sequencing of subject DNA. Inferred individual SNP genetic profiles ranged in random match probability (RMP) from 1/6 to 1/42,472 when calculated with European population frequencies in the 1000 Genomes Project, Phase 3. Similarly, RMPs based on African population frequencies were calculated for each SNP genetic profile and likelihood ratios (LR) were obtained by dividing each European RMP by the corresponding African RMP. Resulting LR values ranged from 1.4 to 825 with a median value of 16. GVP markers offer a basis for the identification of compromised skeletal remains independent of the presence of DNA template. Published by Elsevier B.V.
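The RMP and LR arithmetic described above can be sketched as follows. The abstract states only that each LR is the European RMP divided by the African RMP; the product rule across independent loci and Hardy-Weinberg genotype frequencies used here are simplifying assumptions, and the SNP names and allele frequencies are hypothetical. A real GVP profile may only establish that at least one copy of an allele is present, which this sketch does not model.

```python
from math import prod


def genotype_frequency(alt_freq, alt_copies):
    """Hardy-Weinberg genotype frequency for 0, 1 or 2 copies of the alternate allele."""
    p, q = alt_freq, 1.0 - alt_freq
    return {0: q * q, 1: 2 * p * q, 2: p * p}[alt_copies]


def random_match_probability(profile, alt_freqs):
    """Product of genotype frequencies over loci, assuming the loci are independent."""
    return prod(genotype_frequency(alt_freqs[snp], copies) for snp, copies in profile.items())


# Hypothetical profile (alternate-allele copies per SNP) and population frequencies.
profile = {"snpA": 1, "snpB": 2}
eur_freqs = {"snpA": 0.5, "snpB": 0.5}
afr_freqs = {"snpA": 0.1, "snpB": 0.2}

rmp_eur = random_match_probability(profile, eur_freqs)
rmp_afr = random_match_probability(profile, afr_freqs)
likelihood_ratio = rmp_eur / rmp_afr  # European RMP divided by African RMP, as in the abstract
print(rmp_eur, rmp_afr, likelihood_ratio)
```

Under this definition an LR above 1 means the profile is more probable given European than African allele frequencies.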
NagaSundaram, N; Priya Doss, C George
2011-01-01
Background: Distinguishing the deleterious nsSNPs from the massive number of non-functional nsSNPs that occur within a single genome is a considerable challenge in mutation research. In this approach, we have used the existing in silico methods to explore the mutation-structure-function relationship in the XPA gene. Materials and Methods: We used the Sorting Intolerant From Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), I-Mutant 2.0, and the Protein Analysis THrough Evolutionary Relationships methods to predict the effects of deleterious nsSNPs on protein function and evaluated the impact of mutation on protein stability by Molecular Dynamics simulations. Results: By comparing the scores of all four in silico methods, the nsSNP with ID rs104894131 at position C108F was predicted to be highly deleterious. We extended our Molecular Dynamics approach to gain insight into the impact of this non-synonymous polymorphism on structural changes that may affect the activity of the XPA gene. Conclusion: Based on the in silico method scores, potential energy, root-mean-square deviation, and root-mean-square fluctuation, we predict that the deleterious nsSNP at position C108F would play a significant role in causing disease by the XPA gene. Our approach illustrates the application of in silico tools in understanding functional variation from the perspective of structure, evolution, and phenotype. PMID:22190868
SNPConvert: SNP Array Standardization and Integration in Livestock Species.
Nicolazzi, Ezequiel Luis; Marras, Gabriele; Stella, Alessandra
2016-06-09
One of the main advantages of single nucleotide polymorphism (SNP) array technology is providing genotype calls for a specific number of SNP markers at a relatively low cost. Since its first application in animal genetics, the number of available SNP arrays for each species has been constantly increasing. However, in contrast to what is observed in whole genome sequence data analysis, SNP array data does not have a common set of file formats or coding conventions for allele calling. Therefore, the standardization and integration of SNP array data from multiple sources have become an obstacle, especially for users with basic or no programming skills. Here, we describe the difficulties related to handling SNP array data, focusing on file formats, SNP allele coding, and mapping. We also present SNPConvert suite, a multi-platform, open-source, and user-friendly set of tools to overcome these issues. This tool, which can be integrated with open-source and open-access tools already available, is a first step towards an integrated system to standardize and integrate any type of raw SNP array data. The tool is available at: https://github.com/nicolazzie/SNPConvert.git.
Inheritance-mode specific pathogenicity prioritization (ISPP) for human protein coding genes.
Hsu, Jacob Shujui; Kwan, Johnny S H; Pan, Zhicheng; Garcia-Barcelo, Maria-Mercè; Sham, Pak Chung; Li, Miaoxin
2016-10-15
Exome sequencing studies have facilitated the detection of causal genetic variants in yet-unsolved Mendelian diseases. However, the identification of disease causal genes among a list of candidates in an exome sequencing study is still not fully settled, and it is often difficult to prioritize candidate genes for follow-up studies. The inheritance mode provides crucial information for understanding Mendelian diseases, but none of the existing gene prioritization tools fully utilize this information. We examined the characteristics of Mendelian disease genes under different inheritance modes. The results suggest that Mendelian disease genes with autosomal dominant (AD) inheritance mode are more sensitive to haploinsufficiency and de novo mutation, whereas autosomal recessive (AR) genes have significantly more non-synonymous variants and regulatory transcript isoforms. In addition, the X-linked (XL) Mendelian disease genes have fewer non-synonymous and synonymous variants. As a result, we derived a new scoring system for prioritizing candidate genes for Mendelian diseases according to the inheritance mode. Our scoring system assigned to each annotated protein-coding gene (N = 18 859) three pathogenic scores according to the inheritance mode (AD, AR and XL). This inheritance mode-specific framework achieved higher accuracy (area under curve = 0.84) in XL mode. The inheritance-mode specific pathogenicity prioritization (ISPP) outperformed other well-known methods including Haploinsufficiency, Recessive, Network centrality, Genic Intolerance, Gene Damage Index and Gene Constraint scores. This systematic study suggests that genes manifesting disease inheritance modes tend to have unique characteristics. ISPP is included in KGGSeq v1.0 (http://grass.cgs.hku.hk/limx/kggseq/), and source code is available from https://github.com/jacobhsu35/ISPP.git. Contact: mxli@hku.hk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Evans, Perry; Avey, Stefan; Kong, Yong; Krauthammer, Michael
2013-09-01
A common goal of tumor sequencing projects is finding genes whose mutations are selected for during tumor development. This is accomplished by choosing genes that have more non-synonymous mutations than expected from an estimated background mutation frequency. While this background frequency is unknown, it can be estimated using both the observed synonymous mutation frequency and the non-synonymous to synonymous mutation ratio. The synonymous mutation frequency can be determined across all genes or in a gene-specific manner. This choice introduces an interesting trade-off. A gene-specific frequency adjusts for an underlying mutation bias, but is difficult to estimate given missing synonymous mutation counts. Using a genome-wide synonymous frequency is more robust, but is less suited for adjusting biases. Studying four evaluation criteria for identifying genes with high non-synonymous mutation burden (reflecting preferential selection of expressed genes, genes with mutations in conserved bases, genes with many protein interactions, and genes that show loss of heterozygosity), we find that the gene-specific synonymous frequency is superior in the gene expression and protein interaction tests. In conclusion, the use of the gene-specific synonymous mutation frequency is well suited for assessing a gene's non-synonymous mutation burden.
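The background estimate described above, a synonymous mutation frequency multiplied by a non-synonymous to synonymous ratio, can be sketched as below. The abstract does not say how an excess is tested, so the Poisson upper-tail test, the function names, and all numbers are assumptions for illustration; `syn_freq` is taken to mean synonymous mutations per base per sample.

```python
import math


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    cdf_below_k = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf_below_k


def nonsyn_burden(n_nonsyn, gene_length_bp, n_samples, syn_freq, ns_to_s_ratio):
    """Expected non-synonymous count under the background model, and a one-sided p-value."""
    expected = syn_freq * ns_to_s_ratio * gene_length_bp * n_samples
    return expected, poisson_upper_tail(n_nonsyn, expected)


# Hypothetical gene: 1,000 coding bases, 100 tumors, 4 observed non-synonymous mutations.
genome_wide = nonsyn_burden(4, 1000, 100, syn_freq=1e-6, ns_to_s_ratio=3.0)
gene_specific = nonsyn_burden(4, 1000, 100, syn_freq=5e-6, ns_to_s_ratio=3.0)
print(genome_wide, gene_specific)
```

Swapping the genome-wide `syn_freq` for a gene-specific one changes only the expected count, which is the trade-off the abstract discusses: the gene-specific value adjusts for local mutation bias but is noisier when few synonymous mutations are observed.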
Identification of novel mutations in endometrial cancer patients by whole-exome sequencing.
Chang, Ya-Sian; Huang, Hsien-Da; Yeh, Kun-Tu; Chang, Jan-Gowth
2017-05-01
The aim of the present study was to identify genomic alterations in Taiwanese endometrial cancer patients. This information is vitally important in Taiwan, where endometrial cancer is the second most common gynecological cancer. We performed whole-exome sequencing on DNA from 14 tumor tissue samples from Taiwanese endometrial cancer patients. We used the Genome Analysis Toolkit software package for data analysis, and the dbSNP, Catalogue of Somatic Mutations in Cancer (COSMIC) and The Cancer Genome Atlas (TCGA) databases for comparisons. Variants were validated via Sanger sequencing. We identified 143 non-synonymous mutations in 756 canonical cancer-related genes and 1,271 non-synonymous mutations in non-canonical cancer-related genes in 14 endometrial samples. PTEN, KRAS and PIK3R1 were the most frequently mutated canonical cancer-related genes. Our results revealed nine potential driver genes (MAPT, IL24, MCM6, TSC1, BIRC2, CIITA, DST, CASP8 and NOTCH2) and 21 potential passenger genes (ARMCX4, IGSF10, VPS13C, DCT, DNAH14, TLN1, ZNF605, ZSCAN29, MOCOS, CMYA5, PCDH17, UGT1A8, CYFIP2, MACF1, NUDT5, JAKMIP1, PCDHGB4, FAM178A, SNX6, IMP4 and PCMTD1). The detected molecular aberrations led to putative activation of the mTOR, Wnt, MAPK, VEGF and ErbB pathways, as well as aberrant DNA repair, cell cycle control and apoptosis pathways. We characterized the mutational landscape and genetic alterations in multiple cellular pathways of endometrial cancer in the Taiwanese population.
Mei, C G; Gui, L S; Fu, C Z; Wang, H C; Wang, J L; Cheng, G; Zan, L S
2015-08-07
Previous studies have shown that the cell death-inducing DFF45-like effector-C (CIDEC) gene is involved in lipid storage and energy metabolism, suggesting that it is a potential candidate gene that affects body measurement traits (BMTs) and meat quality traits (MQTs). The aim of this study was to identify polymorphisms of the bovine CIDEC gene and analyze their possible associations with BMTs and MQTs in 531 randomly selected Qinchuan cattle aged between 18 and 24 months. DNA sequencing and polymerase chain reaction-restriction fragment length polymorphism were employed to detect CIDEC single nucleotide polymorphisms (SNPs). We found five SNPs: two in exon 5 (SNP1, g.9815G>A and SNP2, g.9924C>T) and three in the 3'-untranslated region (SNP3, g.13281C>T; SNP4, g.13297A>G; and SNP5, g.13307G>A). SNP1 was a missense mutation that resulted in an arginine to glutamine amino acid change, and exhibited two genotypes (GG and AG). SNP2 was a synonymous mutation that exhibited three genotypes (CC, CT, and TT). SNP3, 4, and 5 were completely linked, and only exhibited two genotypes (CC-AA-GG and CT-AG-GA). We found significant associations between these polymorphisms and BMTs and MQTs (P < 0.05); GG, CT, and CT-AG-GA appeared to be the most beneficial genotypes. Therefore, CIDEC may affect BMTs and MQTs in Qinchuan cattle, and could be used in marker-assisted selection.
Johnson, Matthew P.; Brennecke, Shaun P.; East, Christine E.; Dyer, Thomas D.; Roten, Linda T.; Proffitt, J. Michael; Melton, Phillip E.; Fenstad, Mona H.; Aalto-Viljakainen, Tia; Mäkikallio, Kaarin; Heinonen, Seppo; Kajantie, Eero; Kere, Juha; Laivuori, Hannele; Austgulen, Rigmor; Blangero, John; Moses, Eric K.; Pouta, Anneli; Kivinen, Katja; Ekholm, Eeva; Hietala, Reija; Sainio, Susanna; Saisto, Terhi; Uotila, Jukka; Klemetti, Miira; Inkeri Lokki, Anna; Georgiadis, Leena; Huovari, Elina; Kortelainen, Eija; Leminen, Satu; Lähdesmäki, Aija; Mehtälä, Susanna; Salmen, Christina
2013-01-01
Pre-eclampsia is an idiopathic pregnancy disorder that causes morbidity and mortality in both mother and child. Delivery of the fetus is the only means to resolve severe symptoms. Women with pre-eclamptic pregnancies demonstrate increased risk for later life cardiovascular disease (CVD) and good evidence suggests these two syndromes share several risk factors and pathophysiological mechanisms. To elucidate the genetic architecture of pre-eclampsia we have dissected our chromosome 2q22 susceptibility locus in an extended Australian and New Zealand familial cohort. Positional candidate genes were prioritized for exon-centric sequencing using bioinformatics, SNPing, transcriptional profiling and QTL-walking. In total, we interrogated 1598 variants from 52 genes. Four independent SNP associations satisfied our gene-centric multiple testing correction criteria: a missense LCT SNP (rs2322659, P = 0.0027), a synonymous LRP1B SNP (rs35821928, P = 0.0001), a 3′ UTR RND3 SNP (rs115015150, P = 0.0024) and a missense GCA SNP (rs17783344, P = 0.0020). We replicated the LCT SNP association (P = 0.02) and observed a borderline association for the GCA SNP (P = 0.07) in an independent Australian case–control population. The LRP1B and RND3 SNP associations were not replicated in this same Australian singleton cohort. Moreover, these four SNP associations could not be replicated in two additional case–control populations from Norway and Finland. These four SNPs, however, exhibit pleiotropic effects with several quantitative CVD-related traits. Our results underscore the genetic complexity of pre-eclampsia and present novel empirical evidence of possible shared genetic mechanisms underlying both pre-eclampsia and other CVD-related risk factors. PMID:23420841
Explaining the disease phenotype of intergenic SNP through predicted long range regulation.
Chen, Jingqi; Tian, Weidong
2016-10-14
Thousands of disease-associated SNPs (daSNPs) are located in intergenic regions (IGR), making it difficult to understand their association with disease phenotypes. Recent analysis found that non-coding daSNPs were frequently located in or close to regulatory elements, inspiring us to try to explain the disease phenotypes of IGR daSNPs through nearby regulatory sequences. Hence, after locating the nearest distal regulatory element (DRE) to a given IGR daSNP, we applied a computational method named INTREPID to predict the target genes regulated by the DRE, and then investigated their functional relevance to the IGR daSNP's disease phenotypes. 36.8% of all IGR daSNP-disease phenotype associations investigated were possibly explainable through the predicted target genes, which were enriched with, were functionally relevant to, or consisted of the corresponding disease genes. This proportion could be further increased to 60.5% if the LD SNPs of daSNPs were also considered. Furthermore, the predicted SNP-target gene pairs were enriched with known eQTL/mQTL SNP-gene relationships. Overall, it is likely that IGR daSNPs contribute to disease phenotypes by interfering with the regulatory function of their nearby DREs and causing abnormal expression of disease genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Prospecting for pig single nucleotide polymorphisms in the human genome: have we struck gold?
Grapes, L; Rudd, S; Fernando, R L; Megy, K; Rocha, D; Rothschild, M F
2006-06-01
Gene-to-gene variation in the frequency of single nucleotide polymorphisms (SNPs) has been observed in humans, mice, rats, primates and pigs, but a relationship across species in this variation has not been described. Here, the frequency of porcine coding SNPs (cSNPs) identified by in silico methods, and the frequency of murine cSNPs, were compared with the frequency of human cSNPs across homologous genes. From 150,000 porcine expressed sequence tag (EST) sequences, a total of 452 SNP-containing sequence clusters were found, totalling 1394 putative SNPs. All the clustered porcine EST annotations and SNP data have been made publicly available at http://sputnik.btk.fi/project?name=swine. Human and murine cSNPs were identified from dbSNP and were characterized as either validated or total number of cSNPs (validated plus non-validated) for comparison purposes. The correlation between in silico pig cSNP and validated human cSNP densities was found to be 0.77 (p < 0.00001) for a set of 25 homologous genes, while a correlation of 0.48 (p < 0.0005) was found for a primarily random sample of 50 homologous human and mouse genes. This is the first evidence of conserved gene-to-gene variability in cSNP frequency across species and indicates that site-directed screening of porcine genes that are homologous to cSNP-rich human genes may rapidly advance cSNP discovery in pigs.
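The cross-species comparison above reduces to a correlation between per-gene cSNP densities. The abstract does not name the coefficient; the sketch below assumes Pearson's r and treats density as cSNPs per kilobase of coding sequence. The gene values are hypothetical and do not come from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical cSNP densities (cSNPs per kb) for five homologous gene pairs.
pig_density = [0.5, 1.2, 2.0, 0.8, 3.1]
human_density = [0.7, 1.0, 2.4, 1.1, 2.9]
print(pearson_r(pig_density, human_density))
```

With real data, the per-gene densities would be computed from the SNP counts and coding lengths of each homologous pair before being passed to the function.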
Brennan, Marie-Luise; Pique, Lynn M; Schrijver, Iris
2016-01-01
Several lines of evidence suggest a role for the epithelial sodium channel (ENaC) in cystic fibrosis (CF). The purpose of our study was to assess the contribution of genetic variants in the ENaC subunits (α, β, γ) in nonwhite CF patients in whom CFTR molecular testing has been non-diagnostic. Samples were obtained from patients who were nonwhite and whose molecular CFTR testing did not identify two mutations. Sequencing of the SCNN1A, B, and G genes was performed and variants assessed for pathogenicity and association with CF using databases, protein and splice site mutation analysis software, and literature review. We identified four nonsynonymous amino acid variants in SCNN1A, three in SCNN1B and one in SCNN1G. There was no convincing evidence of pathogenicity. Whereas all have been reported in the dbSNP database, only p.Ala334Thr, p.Val573Ile, and p.Thr663Ala in SCNN1A, p.Gly442Val in SCNN1B and p.Gly183Ser in SCNN1G were previously reported in ENaC genetic studies of CF or CF-like patients. Synonymous substitutions were also observed but novel synonymous variants were not detected. There is no conclusive association of ENaC genetic variants with CF in nonwhite CF patients. Copyright © 2015 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.
Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.
2014-01-01
Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal versus breast tumor tissue; half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense-direction of protein-coding genes (14%), and commonly distributed at promoter-, transcription factor binding-, or enhancer-sites. Analyzing the most diverse mRNA breast cancer subtypes Basal-like versus Luminal A and B resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Seo, Seongjin; Solivan-Timpe, Frances; Roos, Ben R; Robin, Alan L; Stone, Edwin M; Kwon, Young H; Alward, Wallace L M; Fingert, John H
2013-02-01
Copy number variations (duplications) of TANK binding kinase 1 (TBK1) have been associated with normal tension glaucoma (NTG), a common cause of blindness worldwide. Mutations in other genes involved in autophagy (TLR4 and OPTN) have been associated with NTG. Here we report searching for additional proteins involved in autophagy that may also have roles in NTG. HEK-293T cells were transfected to produce synthetic TBK1 protein with FLAG and S tags. Proteins that associate with TBK1 were isolated from HEK-293T lysates using tandem affinity purification (TAP) and polyacrylamide gel electrophoresis (PAGE). Isolated proteins were identified with mass spectrometry. A cohort of 148 NTG patients and 77 controls from Iowa were tested for glaucoma-causing mutations in genes that encode identified proteins that interact with TBK1 using high resolution melt (HRM) analysis and DNA sequencing. TAP studies show that three proteins expressed in HEK-293T cells (NAP1, TANK and TBKBP1) interact with TBK1. Testing cohorts of NTG and normal controls for disease-causing mutations in TANK identified a total of nine unique variants, including three non-synonymous changes, one synonymous change and five intronic changes. When analyzed alone or as a group, the non-synonymous TANK coding sequence changes were not associated with either NTG or primary open angle glaucoma. TAP showed that NAP1, TANK and TBKBP1 interact with TBK1 and are good candidates for contributing to NTG. A mutation screen of TANK detected three non-synonymous variants. Although it remains possible that one or more of these TANK mutations may have a role in NTG, the data in this report do not provide statistical support for an association between TANK variants and NTG.
2013-01-01
Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680
Carter, Tamar E; Boulter, Alexis; Existe, Alexandre; Romain, Jean R; St Victor, Jean Yves; Mulligan, Connie J; Okech, Bernard A
2015-03-01
Antimalarial drugs are a key tool in malaria elimination programs. With the emergence of artemisinin resistance in southeast Asia, an effort to identify molecular markers for surveillance of resistant malaria parasites is underway. Non-synonymous mutations in the kelch propeller domain (K13-propeller) in Plasmodium falciparum have been associated with artemisinin resistance in samples from southeast Asia, but additional studies are needed to characterize this locus in other P. falciparum populations with different levels of artemisinin use. Here, we sequenced the K13-propeller locus in 82 samples from Haiti, where limited government oversight of non-governmental organizations may have resulted in low-level use of artemisinin-based combination therapies. We detected a single-nucleotide polymorphism (SNP) at nucleotide 1,359 in a single isolate. Our results contribute to our understanding of the global genomic diversity of the K13-propeller locus in P. falciparum populations. © The American Society of Tropical Medicine and Hygiene.
Association Analysis of the Ephrin-B2 Gene in African-Americans with End-Stage Renal Disease
Hicks, Pamela J.; Staten, Jennifer L.; Palmer, Nicholette D.; Langefeld, Carl D.; Ziegler, Julie T.; Keene, Keith L.; Sale, Michele M.; Bowden, Donald W.; Freedman, Barry I.
2008-01-01
Background Genome scans in African-Americans with end-stage renal disease (ESRD) identified linkage on chromosome 13q33 in the region containing the ephrin-B2 ligand (EFNB2) gene. Interactions between the ephrin-B2 receptor and ephrin-B2 ligand play essential roles in renal angiogenesis, blood vessel maturation, and kidney disease. Methods The EFNB2 gene was evaluated as a positional candidate for non-diabetic and diabetic ESRD susceptibility in 1,071 unrelated African-American subjects: 316 with non-diabetic etiologies of ESRD, 394 with type 2 diabetes-associated ESRD and 361 healthy controls. Single nucleotide polymorphism (SNP) genotyping was performed on the Sequenom Mass Array System. Statistical analyses were computed using Dandelion version 1.26, Snpaddmix version 1.4 and Haploview version 3.32. Results Twenty-eight HapMap tag SNPs were genotyped spanning the 39 kilobases (kb) of the EFNB2 coding region, with average spacing of 1.43 kb. Analysis of 710 ESRD patient samples and 361 controls provided no evidence of single SNP associations in either diabetic or non-diabetic ESRD, although nominal evidence of association with all-cause ESRD was observed with a two-SNP (p = 0.022) and a three-SNP (p = 0.023) haplotype, both containing SNPs rs7490924 and rs2391335 in intron 1. Conclusions Although an attractive positional candidate gene, polymorphisms in the EFNB2 gene do not appear to contribute in a substantial way to non-diabetic, diabetic or all-cause ESRD susceptibility in African-Americans. Additional genes within the chromosome 13q33 linkage interval are likely contributors to African-American non-diabetic ESRD. PMID:18580054
In silico SNP analysis of the breast cancer antigen NY-BR-1.
Kosaloglu, Zeynep; Bitzer, Julia; Halama, Niels; Huang, Zhiqin; Zapatka, Marc; Schneeweiss, Andreas; Jäger, Dirk; Zörnig, Inka
2016-11-18
Breast cancer is one of the most common malignancies, with incidence increasing every year, and a leading cause of death among women. Although early stage breast cancer can be effectively treated, there are limited numbers of treatment options available for patients with advanced and metastatic disease. The novel breast cancer associated antigen NY-BR-1 was identified by SEREX analysis and is expressed in the majority (>70%) of breast tumors as well as metastases, in normal breast tissue, in testis and occasionally in prostate tissue. The biological function and regulation of NY-BR-1 are to date unknown. We performed an in silico analysis on the genetic variations of the NY-BR-1 gene using data available in public SNP databases and the tools SIFT, Polyphen and Provean to find possible functional SNPs. Additionally, we considered the allele frequency of the damaging SNPs found and also analyzed data from an in-house sequencing project of 55 breast cancer samples for recurring SNPs recorded in dbSNP. Over 2800 SNPs are recorded in the dbSNP and NHLBI ESP databases for the NY-BR-1 gene. Of these, 65 (2.07%) are synonymous SNPs, 191 (6.09%) are non-synonymous SNPs, and 2430 (77.48%) are noncoding intronic SNPs. As a result, 69 non-synonymous SNPs were predicted to be damaging by at least two, and 16 SNPs were predicted as damaging by all three of the used tools. The SNPs rs200639888, rs367841401 and rs377750885 were categorized as highly damaging by all three tools. Eight damaging SNPs are located in the ankyrin repeat domain (ANK), a domain known for its frequent involvement in protein-protein interactions. No distinctive features could be observed in the allele frequency of the analyzed SNPs. Considering these results we expect to gain more insights into the variations of the NY-BR-1 gene and their possible impact on giving rise to splice variants and therefore influence the function of NY-BR-1 in healthy tissue as well as in breast cancer.
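The "damaging by at least two" and "damaging by all three" counts in the abstract above amount to a simple vote across the three prediction tools. The following is a minimal sketch of that tally, not the authors' pipeline: the SNP identifiers, the per-tool calls, and the helper names `damaging_votes` and `consensus` are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-SNP calls: True means the tool predicts "damaging".
# The SNP IDs and calls are made-up placeholders, not data from the study.
calls = {
    "snp_a": {"SIFT": True, "PolyPhen": True, "PROVEAN": True},
    "snp_b": {"SIFT": True, "PolyPhen": False, "PROVEAN": True},
    "snp_c": {"SIFT": False, "PolyPhen": False, "PROVEAN": True},
    "snp_d": {"SIFT": False, "PolyPhen": False, "PROVEAN": False},
}


def damaging_votes(tool_calls):
    """Count how many tools call this SNP damaging."""
    return sum(1 for is_damaging in tool_calls.values() if is_damaging)


def consensus(all_calls, min_votes):
    """Return the sorted SNP IDs called damaging by at least min_votes tools."""
    return sorted(
        snp for snp, tool_calls in all_calls.items()
        if damaging_votes(tool_calls) >= min_votes
    )


at_least_two = consensus(calls, 2)  # damaging by two or more tools
all_three = consensus(calls, 3)     # damaging by every tool
print(at_least_two, all_three)
```

In practice each tool reports a score with its own threshold, so the boolean calls used here would have to be derived from those scores first.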
Gillespie, Charles F; Almli, Lynn M; Smith, Alicia K; Bradley, Bekh; Kerley, Kimberly; Crain, Daniel F; Mercer, Kristina B; Weiss, Tamara; Phifer, Justine; Tang, Yilang; Cubells, Joseph F; Binder, Elisabeth B; Conneely, Karen N; Ressler, Kerry J
2013-04-01
A non-synonymous, single nucleotide polymorphism (SNP) in the gene coding for steroid 5-α-reductase type 2 (SRD5A2) is associated with reduced conversion of testosterone to dihydrotestosterone (DHT). Because SRD5A2 participates in the regulation of testosterone and cortisol metabolism, hormones shown to be dysregulated in patients with PTSD, we examined whether the V89L variant (rs523349) influences risk for post-traumatic stress disorder (PTSD). Study participants (N = 1,443) were traumatized African-American patients of low socioeconomic status with high rates of lifetime trauma exposure recruited from the primary care clinics of a large, urban hospital. PTSD symptoms were measured with the post-traumatic stress symptom scale (PSS). Subjects were genotyped for the V89L variant (rs523349) of SRD5A2. We initially found a significant sex-dependent effect of genotype in male but not female subjects on symptoms. Associations with PTSD symptoms were confirmed using a separate internal replication sample with identical methods of data analysis, followed by pooled analysis of the combined samples (N = 1,443, sex × genotype interaction P < 0.002; males: n = 536, P < 0.001). These data support the hypothesis that functional variation within SRD5A2 influences, in a sex-specific way, the severity of post-traumatic stress symptoms and risk for diagnosis of PTSD. Copyright © 2013 Wiley Periodicals, Inc.
Goettel, Wolfgang; Xia, Eric; Upchurch, Robert; Wang, Ming-Li; Chen, Pengyin; An, Yong-Qiang Charles
2014-04-23
Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality.
Identification of KCNJ15 as a Susceptibility Gene in Asian Patients with Type 2 Diabetes Mellitus
Okamoto, Koji; Iwasaki, Naoko; Nishimura, Chisa; Doi, Kent; Noiri, Eisei; Nakamura, Shinko; Takizawa, Miho; Ogata, Makiko; Fujimaki, Risa; Grarup, Niels; Pisinger, Charlotta; Borch-Johnsen, Knut; Lauritzen, Torsten; Sandbaek, Annelli; Hansen, Torben; Yasuda, Kazuki; Osawa, Haruhiko; Nanjo, Kishio; Kadowaki, Takashi; Kasuga, Masato; Pedersen, Oluf; Fujita, Toshiro; Kamatani, Naoyuki; Iwamoto, Yasuhiko; Tokunaga, Katsushi
2010-01-01
Recent advances in genome research have enabled the identification of new genomic variations that are associated with type 2 diabetes mellitus (T2DM). Via fine mapping of SNPs in a candidate region of chromosome 21q, the current study identifies potassium inwardly-rectifying channel, subfamily J, member 15 (KCNJ15) as a new T2DM susceptibility gene. KCNJ15 is expressed in the β cell of the pancreas, and a synonymous SNP, rs3746876, in exon 4 (C566T) of this gene, with T allele frequency among control subjects of 3.1%, showed a significant association with T2DM affecting lean individuals in three independent Japanese sample sets (p = 2.5 × 10⁻⁷, odds ratio [OR] = 2.54, 95% confidence interval [CI] = 1.76–3.67) and with unstratified T2DM (p = 6.7 × 10⁻⁶, OR = 1.76, 95% CI = 1.37–2.25). The diabetes risk allele frequency was, however, very low among Europeans in whom no association between this variant and T2DM could be shown. Functional analysis in human embryonic kidney 293 cells demonstrated that the risk allele of the synonymous SNP in exon 4 increased KCNJ15 expression via increased mRNA stability, which resulted in the higher expression of protein as compared to that of the nonrisk allele. We also showed that KCNJ15 is expressed in human pancreatic β cells. In conclusion, we demonstrated a significant association between a synonymous variant in KCNJ15 and T2DM in lean Japanese patients with T2DM, suggesting that KCNJ15 is a previously unreported susceptibility gene for T2DM among Asians. PMID:20085713
Hein, David W; Doll, Mark A
2012-01-01
Aim Humans exhibit genetic polymorphism in NAT2 resulting in rapid, intermediate and slow acetylator phenotypes. Over 65 NAT2 variants possessing one or more SNPs in the 870-bp NAT2 coding region have been reported. The seven most frequent SNPs are rs1801279 (191G>A), rs1041983 (282C>T), rs1801280 (341T>C), rs1799929 (481C>T), rs1799930 (590G>A), rs1208 (803A>G) and rs1799931 (857G>A). The majority of studies investigate the NAT2 genotype assay for three SNPs: 481C>T, 590G>A and 857G>A. A tag-SNP (rs1495741) recently identified in a genome-wide association study has also been proposed as a biomarker for the NAT2 phenotype. Materials & methods Sulfamethazine N-acetyltransferase catalytic activities were measured in cryopreserved human hepatocytes from a convenience sample of individuals in the USA with an ethnic frequency similar to the 2010 US population census. These activities were segregated by the tag-SNP rs1495741 and each of the seven SNPs described above. We assessed the accuracy of the tag-SNP and various two-, three-, four- and seven-SNP genotyping panels for their ability to accurately infer NAT2 phenotype. Results The accuracies of the various NAT2 SNP genotype panels in inferring NAT2 phenotype were as follows: seven-SNP: 98.4%; tag-SNP: 77.7%; two-SNP: 96.1%; three-SNP: 92.2%; and four-SNP: 98.4%. Conclusion A NAT2 four-SNP genotype panel of rs1801279 (191G>A), rs1801280 (341T>C), rs1799930 (590G>A) and rs1799931 (857G>A) infers NAT2 acetylator phenotype with high accuracy, and is recommended over the tag-, two-, three- and (for economy of scale) the seven-SNP genotyping panels, particularly in populations of non-European ancestry. PMID:22092036
Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.
2012-01-01
Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141
Donor single nucleotide polymorphism in the CCR9 gene affects the incidence of skin GVHD.
Inamoto, Y; Murata, M; Katsumi, A; Kuwatsuka, Y; Tsujimura, A; Ishikawa, Y; Sugimoto, K; Onizuka, M; Terakura, S; Nishida, T; Kanie, T; Taji, H; Iida, H; Suzuki, R; Abe, A; Kiyoi, H; Matsushita, T; Miyamura, K; Kodera, Y; Naoe, T
2010-02-01
The interactions between chemokines and their receptors may have an important role in initiating GVHD after allogeneic hematopoietic SCT (allo-HSCT). CCL25 and CCR9 are unique because they are exclusively expressed in epithelial cells and in Peyer's patches of the small intestine. We focused on rs12721497 (G926A), one of the non-synonymous single nucleotide polymorphisms (SNPs) in the CCR9 gene, and analyzed the SNP of donors in 167 consecutive patients who received allo-HSCT from an HLA-identical sibling donor. Genotypes were tested for associations with acute and chronic GVHD in each organ and transplant outcome. Multivariate analyses showed that the genotype 926AG was significantly associated with the incidence of acute stage ≥2 skin GVHD (hazard ratio: 3.2; 95% confidence interval (95% CI): 1.1-9.1; P=0.032) and chronic skin GVHD (hazard ratio: 4.1; 95% CI: 1.1-15; P=0.036), but not with GVHD in other organs or with relapse, non-relapse mortality or OS. To clarify the functional differences between genotypes, each SNP in retroviral vectors was transfected into Jurkat cells. In chemotaxis assays, the 926G transfectant showed greater response to CCL25 than the 926A transfectant. In conclusion, more active homing of CCR9-926AG T cells to Peyer's patches may produce changes in Ag presentation and result in increased incidence of skin GVHD.
Ahram, Dina F.; Grozdanic, Sinisa D.; Kecova, Helga; Henkes, Arjen; Collin, Rob W. J.; Kuehn, Markus H.
2015-01-01
Several dog breeds are susceptible to developing primary angle closure glaucoma (PACG), which suggests a genetic basis for the disease. We have identified a four-generation Basset Hound pedigree with characteristic autosomal recessive PACG that closely recapitulates PACG in humans. Our aim is to utilize gene mapping and whole exome sequencing approaches to identify PACG-causing sequence variants in the Basset. Extensive clinical phenotyping of all pedigree members was conducted. SNP-chip genotyping was carried out in 9 affected and 15 unaffected pedigree members. Two-point and multipoint linkage analyses of genome-wide SNP data were performed using Superlink-Online SNP-1.1 and a locus was mapped to chromosome 19q with a maximum LOD score of 3.24. The locus contains 12 Ensembl-predicted canine genes and is syntenic to a region on chromosome 2 in the human genome. Using exome-sequencing analysis, a possibly damaging, non-synonymous variant in the gene Nebulin (NEB), which alters a phylogenetically conserved lysine residue, was found to segregate with PACG. The association of this variant with PACG was confirmed in a secondary cohort of unrelated Basset Hounds (p = 3.4 × 10⁻⁴, OR = 15.3 for homozygosity). Nebulin, a protein that promotes the contractile function of sarcomeres, was found to be prominently expressed in the ciliary muscles of the anterior segment. Our findings may provide insight into the molecular mechanisms that underlie PACG. The phenotypic similarities of disease presentation in dogs and humans may enable the translation of findings made in this study to patients with PACG. PMID:25938837
Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction.
Do, Ron; Stitziel, Nathan O; Won, Hong-Hee; Jørgensen, Anders Berg; Duga, Stefano; Angelica Merlini, Pier; Kiezun, Adam; Farrall, Martin; Goel, Anuj; Zuk, Or; Guella, Illaria; Asselta, Rosanna; Lange, Leslie A; Peloso, Gina M; Auer, Paul L; Girelli, Domenico; Martinelli, Nicola; Farlow, Deborah N; DePristo, Mark A; Roberts, Robert; Stewart, Alexander F R; Saleheen, Danish; Danesh, John; Epstein, Stephen E; Sivapalaratnam, Suthesh; Hovingh, G Kees; Kastelein, John J; Samani, Nilesh J; Schunkert, Heribert; Erdmann, Jeanette; Shah, Svati H; Kraus, William E; Davies, Robert; Nikpay, Majid; Johansen, Christopher T; Wang, Jian; Hegele, Robert A; Hechter, Eliana; Marz, Winfried; Kleber, Marcus E; Huang, Jie; Johnson, Andrew D; Li, Mingyao; Burke, Greg L; Gross, Myron; Liu, Yongmei; Assimes, Themistocles L; Heiss, Gerardo; Lange, Ethan M; Folsom, Aaron R; Taylor, Herman A; Olivieri, Oliviero; Hamsten, Anders; Clarke, Robert; Reilly, Dermot F; Yin, Wu; Rivas, Manuel A; Donnelly, Peter; Rossouw, Jacques E; Psaty, Bruce M; Herrington, David M; Wilson, James G; Rich, Stephen S; Bamshad, Michael J; Tracy, Russell P; Cupples, L Adrienne; Rader, Daniel J; Reilly, Muredach P; Spertus, John A; Cresci, Sharon; Hartiala, Jaana; Tang, W H Wilson; Hazen, Stanley L; Allayee, Hooman; Reiner, Alex P; Carlson, Christopher S; Kooperberg, Charles; Jackson, Rebecca D; Boerwinkle, Eric; Lander, Eric S; Schwartz, Stephen M; Siscovick, David S; McPherson, Ruth; Tybjaerg-Hansen, Anne; Abecasis, Goncalo R; Watkins, Hugh; Nickerson, Deborah A; Ardissino, Diego; Sunyaev, Shamil R; O'Donnell, Christopher J; Altshuler, David; Gabriel, Stacey; Kathiresan, Sekar
2015-02-05
Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance. When MI occurs early in life, genetic inheritance is a major component to risk. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI risk in individual families, whereas common variants at more than 45 loci have been associated with MI risk in the population. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI at an early age (≤50 years in males and ≤60 years in females) along with MI-free controls. We identified two genes in which rare coding-sequence mutations were more frequent in MI cases versus controls at exome-wide significance. At low-density lipoprotein receptor (LDLR), carriers of rare non-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol. Among controls, about 1 in 217 carried an LDLR coding-sequence mutation and had plasma LDL cholesterol > 190 mg dl⁻¹. At apolipoprotein A-V (APOA5), carriers of rare non-synonymous mutations were at 2.2-fold increased risk for MI. When compared with non-carriers, LDLR mutation carriers had higher plasma LDL cholesterol, whereas APOA5 mutation carriers had higher plasma triglycerides. Recent evidence has connected MI risk with coding-sequence mutations at two genes functionally related to APOA5, namely lipoprotein lipase and apolipoprotein C-III (refs 18, 19). Combined, these observations suggest that, as well as LDL cholesterol, disordered metabolism of triglyceride-rich lipoproteins contributes to MI risk.
Ali, Shahin S.; Shao, Jonathan; Strem, Mary D.; Phillips-Mora, Wilberth; Zhang, Dapeng; Meinhardt, Lyndel W.; Bailey, Bryan A.
2015-01-01
Moniliophthora roreri is the fungal pathogen that causes frosty pod rot (FPR) disease of Theobroma cacao L., the source of chocolate. FPR occurs in most of the cacao producing countries in the Western Hemisphere, causing yield losses up to 80%. Genetic diversity within the FPR pathogen population may allow the population to adapt to changing environmental conditions and to enhanced resistance in the host plant. The present study developed single nucleotide polymorphism (SNP) markers from RNASeq results for 13 M. roreri isolates and validated the markers for their ability to reveal genetic diversity in an international M. roreri collection. The SNP resources reported herein represent the first study of RNA sequencing (RNASeq)-derived SNP validation in M. roreri and demonstrate the utility of RNASeq as an approach for de novo SNP identification in M. roreri. A total of 88 polymorphic SNPs were used to evaluate the genetic diversity of 172 M. roreri cacao isolates, resulting in 37 distinct genotypes (including 14 synonymous groups). Absence of heterozygosity for the 88 SNP markers indicates reproduction in M. roreri is clonal and likely due to a homothallic life style. The upper Magdalena Valley of Colombia showed the highest levels of genetic diversity, with 20 distinct genotypes of which 13 were limited to this region, which indicates this region as the possible center of origin for M. roreri. PMID:26379633
Jung, Bo Kyeung; Kim, Jeeyong; Cho, Chi Hyun; Kim, Ju Yeon; Nam, Myung Hyun; Shin, Bong Kyung; Rho, Eun Youn; Kim, Sollip; Sung, Heungsup; Kim, Shinyoung; Ki, Chang Seok; Park, Min Jung; Lee, Kap No; Yoon, Soo Young
2017-04-01
The National Health Information Standards Committee was established in 2004 in Korea. The practical subcommittee for laboratory test terminology was placed in charge of standardizing laboratory medicine terminology in Korean. We aimed to establish a standardized Korean laboratory terminology database, Korea-Logical Observation Identifier Names and Codes (K-LOINC) based on former products sponsored by this committee. The primary product was revised based on the opinions of specialists. Next, we mapped the electronic data interchange (EDI) codes that were revised in 2014, to the corresponding K-LOINC. We established a database of synonyms, including the laboratory codes of three reference laboratories and four tertiary hospitals in Korea. Furthermore, we supplemented the clinical microbiology section of K-LOINC using an alternative mapping strategy. We investigated other systems that utilize laboratory codes in order to investigate the compatibility of K-LOINC with statistical standards for a number of tests. A total of 48,990 laboratory codes were adopted (21,539 new and 16,330 revised). All of the LOINC synonyms were translated into Korean, and 39,347 Korean synonyms were added. Moreover, 21,773 synonyms were added from reference laboratories and tertiary hospitals. Alternative strategies were established for mapping within the microbiology domain. When we applied these to a smaller hospital, the mapping rate was successfully increased. Finally, we confirmed K-LOINC compatibility with other statistical standards, including a newly proposed EDI code system. This project successfully established an up-to-date standardized Korean laboratory terminology database, as well as an updated EDI mapping to facilitate the introduction of standard terminology into institutions. © 2017 The Korean Academy of Medical Sciences.
Amirian, E Susan; Scheurer, Michael E; Liu, Yanhong; D'Amelio, Anthony M; Houlston, Richard S; Etzel, Carol J; Shete, Sanjay; Swerdlow, Anthony J; Schoemaker, Minouk J; McKinney, Patricia A; Fleming, Sarah J; Muir, Kenneth R; Lophatananon, Artitaya; Bondy, Melissa L
2011-08-01
Despite extensive research on the topic, glioma etiology remains largely unknown. Exploration of potential interactions between single-nucleotide polymorphisms (SNP) of immune genes is a promising new area of glioma research. The case-only study design is a powerful and efficient design for exploring possible multiplicative interactions between factors that are independent of one another. The purpose of our study was to use this exploratory design to identify potential pairwise SNP-SNP interactions from genes involved in several different immune-related pathways for investigation in future studies. The study population consisted of two case groups: 1,224 histologically confirmed, non-Hispanic white glioma cases from the United States and a validation population of 634 glioma cases from the United Kingdom. Polytomous logistic regression, in which one SNP was coded as the outcome and the other SNP was included as the exposure, was utilized to calculate the ORs of the likelihood of cases simultaneously having the variant alleles of two different SNPs. Potential interactions were examined only between SNPs located in different genes or chromosomes. Using this data mining strategy, we found 396 significant SNP-SNP interactions among polymorphisms of immune-related genes that were present in both the U.S. and U.K. study populations. This exploratory study was conducted for the purpose of hypothesis generation, and thus has provided several new hypotheses that can be tested using traditional case-control study designs to obtain estimates of risk. This is the first study, to our knowledge, to take this novel approach to identifying SNP-SNP interactions relevant to glioma etiology. ©2011 AACR.
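The case-only logic described above can be illustrated with a minimal sketch. It assumes a simplification that the abstract does not use: each SNP is reduced to binary carrier status, so the interaction odds ratio comes from a 2x2 table of cases rather than from polytomous logistic regression. The function name and all counts are hypothetical.

```python
import math

def case_only_interaction_or(both, only_a, only_b, neither):
    """Case-only odds ratio for two binary factors, with a 95% Wald interval.

    Arguments are counts of CASES only (hypothetical illustration):
      both    - carriers of the variant allele at SNP A and SNP B
      only_a  - carriers at SNP A only
      only_b  - carriers at SNP B only
      neither - non-carriers at both SNPs
    If the two SNPs are independent in the source population, an odds
    ratio different from 1 among cases suggests a departure from
    multiplicative joint effects.
    """
    counts = (both, only_a, only_b, neither)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (both * neither) / (only_a * only_b)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the study.
print(case_only_interaction_or(50, 50, 50, 100))
```

The independence assumption is what restricts this kind of analysis to SNPs in different genes or chromosomes; linked SNPs would give an odds ratio away from 1 for reasons unrelated to interaction.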
Yakubu, Abdulmojeed; Salako, Adebowale E; De Donato, Marcos; Peters, Sunday O; Takeet, Michael I; Wheto, Mathew; Okpeku, Moses; Imumorin, Ikhide G
2017-02-01
Host defense in vertebrates depends on many secreted regulatory proteins such as major histocompatibility complex (MHC) class II, which provide important regulatory and effector functions of T cells. Gene polymorphism in the second exon of the Capra-DRB gene in three major Nigerian goat breeds [West African Dwarf (WAD), Red Sokoto (RS), and Sahel (SH)] was analyzed by restriction fragment length polymorphism (RFLP). Four restriction enzymes, BsaHI, AluI, HaeIII, and SacII, were utilized. The association between the polymorphic sites and some heat tolerance traits was also investigated in a total of 70 WAD, 90 RS, and 50 SH goats. Fourteen different types of alleles were identified in the Nigerian goats, four of which were found in the peptide coding region (A57G, Q89R, G104D, and T112I), indicating a high degree of polymorphism at the DRB locus in this species. An obvious excess (P < 0.01) of non-synonymous over synonymous substitutions (dN/dS) at this locus is a reflection of adaptive evolution and positive selection. The phylogenetic trees revealed largely species-wise clustering in the DRB gene. BsaHI, AluI, HaeIII, and SacII genotype frequencies were in Hardy-Weinberg equilibrium (P > 0.05), except AluI in RS goats and HaeIII in WAD goats (P < 0.05). The expected heterozygosity (H), which is a measure of gene diversity in the goat populations, ranged from 0.16 to 0.50. Genotypes AA (BsaHI), GG, GC and CC (AluI) and GG, GA, AA (HaeIII) appeared better in terms of heat tolerance. The heat-tolerant ability of SH and RS goats to the hot and humid tropical environment of Nigeria seemed better than that of the WAD goats. Sex effect (P < 0.05) was mainly on pulse rate and heat stress index, while there were varying interaction effects on heat tolerance. Variation at the DRB locus may prove to be important in possible selection and breeding for genetic resistance to heat stress in the tropics.
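For readers unfamiliar with the two population statistics reported above, the following sketch shows how a Hardy-Weinberg test and expected heterozygosity could be computed for one biallelic restriction site. It assumes a chi-square goodness-of-fit test with one degree of freedom; the abstract does not state which test the authors used, and the function name and genotype counts are hypothetical.

```python
import math

def hwe_test(n_aa, n_ab, n_bb):
    """Chi-square Hardy-Weinberg test for one biallelic locus.

    Returns (allele frequency p, expected heterozygosity 2pq,
    chi-square statistic, p-value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip cells with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, 2 * p * q, chi2, p_value

# Hypothetical genotype counts, not taken from the study.
print(hwe_test(30, 40, 20))
```

For a biallelic site the expected heterozygosity 2pq cannot exceed 0.50, which matches the upper end of the range reported in the abstract.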
Codon Usage Selection Can Bias Estimation of the Fraction of Adaptive Amino Acid Fixations.
Matsumoto, Tomotaka; John, Anoop; Baeza-Centurion, Pablo; Li, Boyang; Akashi, Hiroshi
2016-06-01
A growing number of molecular evolutionary studies are estimating the proportion of adaptive amino acid substitutions (α) from comparisons of ratios of polymorphic and fixed DNA mutations. Here, we examine how violations of two of the model assumptions, neutral evolution of synonymous mutations and stationary base composition, affect α estimation. We simulated the evolution of coding sequences assuming weak selection on synonymous codon usage bias and neutral protein evolution, α = 0. We show that weak selection on synonymous mutations can give polymorphism/divergence ratios that yield α-hat (estimated α) considerably larger than its true value. Nonstationary evolution (changes in population size, selection, or mutation) can exacerbate such biases or, in some scenarios, give biases in the opposite direction, α-hat < α. These results demonstrate that two factors that appear to be prevalent among taxa, weak selection on synonymous mutations and non-steady-state nucleotide composition, should be considered when estimating α. Estimates of the proportion of adaptive amino acid fixations from large-scale analyses of Drosophila melanogaster polymorphism and divergence data are positively correlated with codon usage bias. Such patterns are consistent with α-hat inflation from weak selection on synonymous mutations and/or mutational changes within the examined gene trees. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
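As a point of reference for the quantity discussed above, here is a minimal sketch of a widely used McDonald-Kreitman-type estimator, α-hat = 1 − (Ds·Pn)/(Dn·Ps). The abstract does not give the exact formula the authors used, so this form is an assumption; the function name and all counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Estimate the fraction of adaptive amino acid fixations.

    dn, ds - counts of non-synonymous and synonymous fixed differences
    pn, ps - counts of non-synonymous and synonymous polymorphisms
    Assumes synonymous changes are neutral, which is exactly the
    assumption whose violation the abstract examines.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)

# Hypothetical counts: ratios match, so the estimate is zero.
print(mk_alpha(dn=20, ds=40, pn=10, ps=20))  # 0.0
# Same data but fewer synonymous fixations relative to polymorphism.
print(mk_alpha(dn=20, ds=30, pn=10, ps=20))  # 0.25
```

The second call is purely arithmetic illustration: lowering Ds relative to Ps raises the estimate even though nothing about protein evolution changed, which is one way a bias of the kind described could arise.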
Oh, Chang Seok; Lee, Soong Deok; Kim, Yi-Suk; Shin, Dong Hoon
2015-01-01
A previous study showed that East Asian mtDNA haplogroups, especially those of Koreans, could be successfully assigned by the coupled use of analyses of coding region SNP markers and control region mutation motifs. In this study, we tested whether the same triple multiplex analysis of coding region SNPs is also applicable to ancient samples from East Asia as a complement to sequence analysis of the mtDNA control region. In Joseon skeleton samples, the mtDNA haplogroup determined by coding region SNP markers fell within the same haplogroup assigned by sequence analysis of the control region. Considering that control region mtDNA sequencing of ancient samples in previous studies produced a considerable number of errors, coding region SNP analysis can serve as a good complement to conventional haplogroup determination, especially for archaeological human bone samples buried underground over long periods. PMID:26345190
Genome Analysis of the Domestic Dog (Korean Jindo) by Massively Parallel Sequencing
Kim, Ryong Nam; Kim, Dae-Soo; Choi, Sang-Haeng; Yoon, Byoung-Ha; Kang, Aram; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Jong-Joo; Ha, Ji-Hong; Toyoda, Atsushi; Fujiyama, Asao; Kim, Aeri; Kim, Min-Young; Park, Kun-Hyang; Lee, Kang Seon; Park, Hong-Seog
2012-01-01
Although pioneering sequencing projects have shed light on the boxer and poodle genomes, a number of challenges need to be met before the sequencing and annotation of the dog genome can be considered complete. Here, we present the DNA sequence of the Jindo dog genome, sequenced to 45-fold average coverage using Illumina massively parallel sequencing technology. A comparison of the sequence to the reference boxer genome led to the identification of 4 675 437 single nucleotide polymorphisms (SNPs, including 3 346 058 novel SNPs), 71 642 indels and 8131 structural variations. Of these, 339 non-synonymous SNPs and 3 indels are located within coding sequences (CDS). In particular, 3 non-synonymous SNPs and a 26-bp deletion occur in the TCOF1 locus, implying that the difference observed in cranial facial morphology between Jindo and boxer dogs might be influenced by those variations. Through the annotation of the Jindo olfactory receptor gene family, we found 2 unique olfactory receptor genes and 236 olfactory receptor genes harbouring non-synonymous homozygous SNPs that are likely to affect smelling capability. In addition, we determined the DNA sequence of the Jindo dog mitochondrial genome and identified Jindo dog-specific mtDNA genotypes. These Jindo genome data improve our understanding of dog genomic architecture and will be a very valuable resource for investigating not only dog genetics and genomics but also human and dog disease genetics and comparative genomics. PMID:22474061
Decoding Mechanisms by which Silent Codon Changes Influence Protein Biogenesis and Function
Bali, Vedrana; Bebok, Zsuzsanna
2015-01-01
Scope Synonymous codon usage has been a focus of investigation since the discovery of the genetic code and its redundancy. The occurrences of synonymous codons vary between species and within genes of the same genome, known as codon usage bias. Today, bioinformatics and experimental data allow us to compose a global view of the mechanisms by which the redundancy of the genetic code contributes to the complexity of biological systems from affecting survival in prokaryotes, to fine tuning the structure and function of proteins in higher eukaryotes. Studies analyzing the consequences of synonymous codon changes in different organisms have revealed that they impact nucleic acid stability, protein levels, structure and function without altering amino acid sequence. As such, synonymous mutations inevitably contribute to the pathogenesis of complex human diseases. Yet, fundamental questions remain unresolved regarding the impact of silent mutations in human disorders. In the present review we describe developments in this area concentrating on mechanisms by which synonymous mutations may affect protein function and human health. Purpose This synopsis illustrates the significance of synonymous mutations in disease pathogenesis. We review the different steps of gene expression affected by silent mutations, and assess the benefits and possible harmful effects of codon optimization applied in the development of therapeutic biologics. Physiological and medical relevance Understanding mechanisms by which synonymous mutations contribute to complex diseases such as cancer, neurodegeneration and genetic disorders, including the limitations of codon-optimized biologics, provides insight concerning interpretation of silent variants and future molecular therapies. PMID:25817479
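To make the term "synonymous" concrete, the sketch below classifies a single codon change against the standard genetic code. It is an illustration only: the function name is hypothetical, the standard nuclear code is assumed, and the classification says nothing about the downstream effects on mRNA stability, translation or folding that the review discusses.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_codon_change(ref_codon, alt_codon):
    """Label a codon substitution by its effect on the encoded residue."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if ref not in CODON_TABLE or alt not in CODON_TABLE:
        raise ValueError("codons must be three DNA bases from A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate codons are identical")
    ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "stop-gained"
    return "non-synonymous"

print(classify_codon_change("TAT", "TAC"))  # Tyr -> Tyr
print(classify_codon_change("GCT", "GTT"))  # Ala -> Val
```

A "synonymous" label here means only that the amino acid is unchanged; the two codons may still differ in usage frequency, which is the property the review links to protein biogenesis.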
Shi, Wan; Quan, Mingyang; Du, Qingzhang; Zhang, Deqiang
2017-01-01
Long non-coding RNAs (lncRNAs) are important regulatory factors for plant growth and development, but little is known about the allelic interactions of lncRNAs with mRNA in perennial plants. Here, we analyzed the interaction of the NERD (Needed for RDR2-independent DNA methylation) Populus tomentosa gene PtoNERD with its putative regulator, the lncRNA NERDL (NERD-related lncRNA), which partially overlaps with the promoter region of this gene. Expression analysis in eight tissues showed a positive correlation between NERDL and PtoNERD (r = 0.62), suggesting that the interaction of NERDL with its putative target might be involved in wood formation. We conducted association mapping in a natural population of P. tomentosa (435 unrelated individuals) to evaluate genetic variation and the interaction of the lncRNA NERDL with PtoNERD. Using additive and dominant models, we identified 30 SNPs (P < 0.01) associated with five tree growth and wood property traits. Each SNP explained 3.90–8.57% of phenotypic variance, suggesting that NERDL and its putative target play a common role in wood formation. Epistasis analysis uncovered nine SNP-SNP association pairs between NERDL and PtoNERD, with an information gain of -7.55 to 2.16%, reflecting the strong interactions between NERDL and its putative target. This analysis provides a powerful method for deciphering the genetic interactions of lncRNAs with mRNA and dissecting the complex genetic network of quantitative traits in trees. PMID:28674544
Host-Parasite Interactions and Purifying Selection in a Microsporidian Parasite of Honey Bees
Huang, Qiang; Chen, Yan Ping; Wang, Rui Wu; Cheng, Shang; Evans, Jay D.
2016-01-01
To clarify the mechanisms of Nosema ceranae parasitism, we deep-sequenced both honey bee host and parasite mRNAs throughout a complete 6-day infection cycle. By time-series analysis, 1122 parasite genes were significantly differentially expressed during the reproduction cycle, clustering into 4 expression patterns. We found reactive mitochondrial oxygen species modulator 1 of the host to be significantly downregulated during the entire infection period. Our data support the hypothesis that apoptosis of honey bee cells was suppressed during infection. We further analyzed genome-wide genetic diversity of this parasite by comparing samples collected from the same site in 2007 and 2013. The number of SNP positions per gene and the proportion of non-synonymous substitutions per gene were significantly reduced over this time period, suggesting purifying selection on the parasite genome and supporting the hypothesis that a subset of N. ceranae strains might be dominating infection. PMID:26840596
Mendoza Lopez, Pablo; Golby, Paul; Wooff, Esen; Garcia, Javier Nunez; Garcia Pelayo, M. Carmen; Conlon, Kevin; Gema Camacho, Ana; Hewinson, R. Glyn; Polaina, Julio; Suárez García, Antonio; Gordon, Stephen V.
2010-01-01
A number of single-nucleotide polymorphisms (SNPs) have been identified in the genome of Mycobacterium bovis BCG Pasteur compared with the sequenced strain M. bovis 2122/97. The functional consequences of many of these mutations remain to be described; however, mutations in genes encoding regulators may be particularly relevant to global phenotypic changes such as loss of virulence, since alteration of a regulator's function will affect the expression of a wide range of genes. One such SNP falls in bcg3145, encoding a member of the AfsR/DnrI/SARP class of global transcriptional regulators, that replaces a highly conserved glutamic acid residue at position 159 (E159G) with glycine in a tetratricopeptide repeat (TPR) located in the bacterial transcriptional activation (BTA) domain of BCG3145. TPR domains are associated with protein–protein interactions, and a conserved core (helices T1–T7) of the BTA domain seems to be required for proper function of SARP-family proteins. Structural modelling predicted that the E159G mutation perturbs the third α-helix of the BTA domain and could therefore have functional consequences. The E159G SNP was found to be present in all BCG strains, but absent from virulent M. bovis and Mycobacterium tuberculosis strains. By overexpressing BCG3145 and Rv3124 in BCG and H37Rv and monitoring transcriptome changes using microarrays, we determined that BCG3145/Rv3124 acts as a positive transcriptional regulator of the molybdopterin biosynthesis moa1 locus, and we suggest that rv3124 be renamed moaR1. The SNP in bcg3145 was found to have a subtle effect on the activity of MoaR1, suggesting that this mutation is not a key event in the attenuation of BCG. PMID:20378651
Kujur, Alice; Bajaj, Deepak; Upadhyaya, Hari D.; Das, Shouvik; Ranjan, Rajeev; Shree, Tanima; Saxena, Maneesha S.; Badoni, Saurabh; Kumar, Vinod; Tripathi, Shailesh; Gowda, C. L. L.; Sharma, Shivali; Singh, Sube; Tyagi, Akhilesh K.; Parida, Swarup K.
2015-01-01
The genome-wide discovery and high-throughput genotyping of SNPs in chickpea natural germplasm lines is indispensable to extrapolate their natural allelic diversity, domestication, and linkage disequilibrium (LD) patterns leading to the genetic enhancement of this vital legume crop. We discovered 44,844 high-quality SNPs by sequencing of 93 diverse cultivated desi, kabuli, and wild chickpea accessions using reference genome- and de novo-based GBS (genotyping-by-sequencing) assays that were physically mapped across eight chromosomes of desi and kabuli. Of these, 22,542 SNPs were structurally annotated in different coding and non-coding sequence components of genes. Genes with 3296 non-synonymous and 269 regulatory SNPs could functionally differentiate accessions based on their contrasting agronomic traits. A high experimental validation success rate (92%) and reproducibility (100%) along with strong sensitivity (93–96%) and specificity (99%) of GBS-based SNPs was observed. This implies the robustness of GBS as a high-throughput assay for rapid large-scale mining and genotyping of genome-wide SNPs in chickpea with sub-optimal use of resources. With 23,798 genome-wide SNPs, a relatively high intra-specific polymorphic potential (49.5%) and broader molecular diversity (13–89%)/functional allelic diversity (18–77%) was apparent among 93 chickpea accessions, suggesting their tremendous applicability in rapid selection of desirable diverse accessions/inter-specific hybrids in chickpea crossbred varietal improvement program. The genome-wide SNPs revealed complex admixed domestication pattern, extensive LD estimates (0.54–0.68) and extended LD decay (400–500 kb) in a structured population inclusive of 93 accessions. These findings reflect the utility of our identified SNPs for subsequent genome-wide association study (GWAS) and selective sweep-based domestication trait dissection analysis to identify potential genomic loci (gene-associated targets) specifically regulating important complex quantitative agronomic traits in chickpea. The numerous informative genome-wide SNPs, natural allelic diversity-led domestication pattern, and LD-based information generated in our study have multidimensional applicability with respect to chickpea genomics-assisted breeding. PMID:25873920
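The LD estimates quoted above are pairwise statistics between SNPs. As a hedged illustration, the sketch below computes the squared correlation r² from phased haplotype counts for one SNP pair; the abstract does not say which LD measure or software was used, so r², the function name and the counts are all assumptions.

```python
def ld_r_squared(hap_AB, hap_Ab, hap_aB, hap_ab):
    """Squared allelic correlation (r^2) between two biallelic SNPs.

    Arguments are counts of the four phased haplotypes, where A/a are
    the alleles at the first SNP and B/b the alleles at the second.
    """
    total = hap_AB + hap_Ab + hap_aB + hap_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_ab_hap = hap_AB / total              # frequency of the A-B haplotype
    p_a = (hap_AB + hap_Ab) / total        # frequency of allele A
    p_b = (hap_AB + hap_aB) / total        # frequency of allele B
    d = p_ab_hap - p_a * p_b               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom

# Hypothetical haplotype counts, not taken from the study.
print(ld_r_squared(40, 10, 10, 40))
```

LD decay, as reported in the abstract, is usually summarised by averaging such pairwise values within bins of physical distance and finding where the average drops below a chosen threshold.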
Hypoxia adaptations in the grey wolf (Canis lupus chanco) from Qinghai-Tibet Plateau.
Zhang, Wenping; Fan, Zhenxin; Han, Eunjung; Hou, Rong; Zhang, Liang; Galaverni, Marco; Huang, Jie; Liu, Hong; Silva, Pedro; Li, Peng; Pollinger, John P; Du, Lianming; Zhang, XiuyYue; Yue, Bisong; Wayne, Robert K; Zhang, Zhihe
2014-07-01
The Tibetan grey wolf (Canis lupus chanco) occupies habitats on the Qinghai-Tibet Plateau, a high altitude (>3000 m) environment where low oxygen tension exerts unique selection pressure on individuals to adapt to hypoxic conditions. To identify genes involved in hypoxia adaptation, we generated complete genome sequences of nine Chinese wolves from high and low altitude populations at an average coverage of 25×. We found that, beginning about 55,000 years ago, the highland Tibetan grey wolf suffered a more substantial population decline than lowland wolves. Positively selected hypoxia-related genes in highland wolves are enriched in the HIF signaling pathway (P = 1.57E-6), ATP binding (P = 5.62E-5), and response to an oxygen-containing compound (P≤5.30E-4). Of these positively selected hypoxia-related genes, three genes (EPAS1, ANGPT1, and RYR2) had at least one specific fixed non-synonymous SNP in highland wolves based on the nine genome data. Our re-sequencing studies on a large panel of individuals showed a frequency difference greater than 58% between highland and lowland wolves for these specific fixed non-synonymous SNPs and a high degree of LD surrounding the three genes, which imply strong selection. Past studies have shown that EPAS1 and ANGPT1 are important in the response to hypoxic stress, and RYR2 is involved in heart function. These three genes also exhibited significant signals of natural selection in high altitude human populations, which suggest similar evolutionary constraints on natural selection in wolves and humans of the Qinghai-Tibet Plateau.
Behl, Jyotsna Dhingra; Mishra, Priyanka; Verma, N K; Niranjan, S K; Dangi, P S; Sharma, Rekha; Behl, Rahul
2016-03-15
The present study was undertaken to characterize the genetic variation present in the lymphotoxin A gene (LTA), which encodes the lymphotoxin A protein, also known as tumor necrosis factor beta, in 40 animals of 5 diverse Bos indicus Indian zebu cattle breeds. Lymphotoxin A is a cytokine produced by lymphocytes that is cytotoxic for a wide range of tumor cells both in vitro and in vivo and is essential for normal immunological development. These breeds survive under the harsh and tough tropical climatic conditions of various parts of the Indian subcontinent. The LTA gene in the present study was observed to contain 33 SNPs and 3 small insertion/deletion polymorphisms. Four SNPs occurred in the coding regions of the gene, viz. g.1327A>G and g.1400C>T in exon 2 and g.1840C>T and g.1942C>T in exon 3, of which the SNP g.1327A>G in exon 2 resulted in a non-synonymous amino acid change G38D. This amino acid change was, however, predicted not to affect protein function. The gene contained putative transcription factor binding sites for the c-Rel and Pax-4 transcription factors. A putative promoter region was also predicted on the reverse DNA strand from position 894 to 644. Several repeat elements and microsatellite repeats were detected across the 3.2 kb LTA gene sequence. The study showed the occurrence of 40 genotypes and 48 most probable haplotypes. The genotypes at the observed SNP positions in the LTA gene were in near Hardy-Weinberg equilibrium. A negative Tajima's D value that was not statistically significant (P>0.10) indicated that the neutral mutation hypothesis could not be excluded. The genetic variations observed in the LTA gene in the present study have not been reported earlier and could possibly be used as molecular markers for further studies of the association between variability in this gene and disease resistance/tolerance traits. Copyright © 2015 Elsevier B.V. All rights reserved.
A benchmark study of scoring methods for non-coding mutations.
Drubay, Damien; Gautheret, Daniel; Michiels, Stefan
2018-05-15
Detailed knowledge of coding sequences has led to different candidate models for pathogenic variant prioritization. Several deleteriousness scores have been proposed for the non-coding part of the genome, but no large-scale comparison has been performed to date to assess their performance. We compared the leading scoring tools (CADD, FATHMM-MKL, Funseq2 and GWAVA) and some recent competitors (DANN, SNP and SOM scores) for their ability to discriminate assumed pathogenic variants from assumed benign variants (using the ClinVar, COSMIC and 1000 genomes project databases). Using the ClinVar benchmark, CADD was the best tool for detecting the pathogenic variants that are mainly located in protein coding gene regions. Using the COSMIC benchmark, FATHMM-MKL, GWAVA and SOMliver outperformed the other tools for pathogenic variants that are typically located in lincRNAs, pseudogenes and other parts of the non-coding genome. However, all tools had low precision, which could potentially be improved by future non-coding genome feature discoveries. These results may have been influenced by the presence of potential benign variants in the COSMIC database. The development of a gold standard as consistent as ClinVar for these regions will be necessary to confirm our tool ranking. The Snakemake, C++ and R codes are freely available from https://github.com/Oncostat/BenchmarkNCVTools and supported on Linux. Contact: damien.drubay@gustaveroussy.fr or stefan.michiels@gustaveroussy.fr. Supplementary data are available at Bioinformatics online.
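A benchmark of this kind scores each variant and asks how well the scores separate the two labelled groups. The sketch below shows two generic measures, the area under the ROC curve and precision at a score cutoff, written from scratch. It is not the authors' pipeline (their code is in the repository cited above); the function names, scores and cutoff are hypothetical, and the abstract does not specify which discrimination metric was reported.

```python
def roc_auc(pathogenic_scores, benign_scores):
    """Probability that a random pathogenic variant outscores a random
    benign one, counting ties as one half (rank-based AUC)."""
    if not pathogenic_scores or not benign_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))

def precision_at(pathogenic_scores, benign_scores, cutoff):
    """Fraction of variants scoring at or above the cutoff that are
    labelled pathogenic; NaN if nothing passes the cutoff."""
    true_pos = sum(s >= cutoff for s in pathogenic_scores)
    false_pos = sum(s >= cutoff for s in benign_scores)
    called = true_pos + false_pos
    return true_pos / called if called else float("nan")

# Hypothetical scores for illustration only.
pathogenic = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2, 0.1]
print(roc_auc(pathogenic, benign))
print(precision_at(pathogenic, benign, cutoff=0.5))
```

A tool can rank variants well and still have low precision when benign variants greatly outnumber pathogenic ones, which is consistent with the caveat in the abstract.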
Johnson, Katherine A; Barry, Edwina; Lambert, David; Fitzgerald, Michael; McNicholas, Fiona; Kirley, Aiveen; Gill, Michael; Bellgrove, Mark A; Hawi, Ziarih
2013-12-01
A naturalistic, prospective study of the influence of genetic variation on dose prescribed, clinical response, and side effects related to stimulant medication in 77 children with attention-deficit/hyperactivity disorder (ADHD) was undertaken. The influence of genetic variation of the CES1 gene coding for carboxylesterase 1A1 (CES1A1), the major enzyme responsible for the first-pass, stereoselective metabolism of methylphenidate, was investigated. Parent- and teacher-rated behavioral questionnaires were collected at baseline when the children were medication naïve, and again at 6 weeks while they were on medication. Medication dose, prescribed at the discretion of the treating clinician, and side effects were recorded at week 6. Blood and saliva samples were collected for genotyping. Single nucleotide polymorphisms (SNPs) were selected in the coding, non-coding and 3' flanking regions of the CES1 gene. Genetic association between CES1 variants and ADHD was investigated in an expanded sample of 265 Irish ADHD families. Analyses were conducted using analysis of covariance (ANCOVA) and logistic regression models. None of the CES1 gene variants were associated with the dose of methylphenidate provided or the clinical response recorded at the 6 week time point. An association between two CES1 SNP markers and the occurrence of sadness as a side effect of short-acting methylphenidate was found. The two associated CES1 markers were in linkage disequilibrium and were significantly associated with ADHD in a larger sample of ADHD trios. The associated CES1 markers were also in linkage disequilibrium with two SNP markers of the noradrenaline transporter gene (SLC6A2). This study found an association between two CES1 SNP markers and the occurrence of sadness as a side effect of short-acting methylphenidate. These markers were in linkage disequilibrium with each other and with two SNP markers of the noradrenaline transporter gene.
Song, Yiqing; Hsu, Yi-Hsiang; Niu, Tianhua; Manson, Joann E; Buring, Julie E; Liu, Simin
2009-01-17
Ion channel transient receptor potential membrane melastatin 6 and 7 (TRPM6 and TRPM7) play a central role in magnesium homeostasis, which is critical for maintaining glucose and insulin metabolism. However, it is unclear whether common genetic variation in TRPM6 and TRPM7 contributes to risk of type 2 diabetes. We conducted a nested case-control study in the Women's Health Study. During a median of 10 years of follow-up, 359 incident diabetes cases were diagnosed and matched by age and ethnicity with 359 controls. We analyzed 20 haplotype-tagging single nucleotide polymorphisms (SNPs) in TRPM6 and 5 common SNPs in TRPM7 for their association with diabetes risk. Overall, there was no robust and significant association between any single SNP and diabetes risk. Neither was there any evidence of association between common TRPM6 and TRPM7 haplotypes and diabetes risk. Our haplotype analyses suggested a significant risk of type 2 diabetes among carriers of both the rare alleles from two non-synonymous SNPs in TRPM6 (Val1393Ile in exon 26 [rs3750425] and Lys1584Glu in exon 27 [rs2274924]) when their magnesium intake was lower than 250 mg per day. Compared with non-carriers, women who were carriers of the haplotype 1393Ile-1584Glu had an increased risk of type 2 diabetes (OR, 4.92, 95% CI, 1.05-23.0) only when they had low magnesium intake (<250 mg/day). Our results provide suggestive evidence that two common non-synonymous TRPM6 coding region variants, the Val1393Ile and Lys1584Glu polymorphisms, might confer susceptibility to type 2 diabetes in women with low magnesium intake. Further replication in large-scale studies is warranted.
Zhou, Daling; Du, Qingzhang; Chen, Jinhui; Wang, Qingshi; Zhang, Deqiang
2017-10-01
Long non-coding RNAs (lncRNAs) function in various biological processes. However, their roles in secondary growth of plants remain poorly understood. Here, 15,691 lncRNAs were identified from vascular cambium, developing xylem, and mature xylem of Populus tomentosa with high and low biomass using RNA-seq, including 1,994 lncRNAs that were differentially expressed (DE) among the six libraries. A total of 3,569 cis-regulated and 3,297 trans-regulated protein-coding genes were predicted as potential target genes (PTGs) of the DE lncRNAs to participate in biological regulation. Then, 476 and 28 lncRNAs were identified as putative targets and endogenous target mimics (eTMs) of Populus known microRNAs (miRNAs), respectively. Genome re-sequencing of 435 individuals from a natural population of P. tomentosa found 34,015 single nucleotide polymorphisms (SNPs) within 178 lncRNA loci and 522 PTGs. Single-SNP association analysis detected 2,993 associations with 10 growth and wood-property traits under additive and dominance models. Epistasis analysis identified 17,656 epistatic SNP pairs, providing evidence for potential regulatory interactions between lncRNAs and their PTGs. Furthermore, a reconstructed epistatic network, representing interactions of 8 lncRNAs and 15 PTGs, might enrich regulation roles of genes in the phenylpropanoid pathway. These findings may enhance our understanding of non-coding genes in plants. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Cole, Charles; Krampis, Konstantinos; Karagiannis, Konstantinos; Almeida, Jonas S; Faison, William J; Motwani, Mona; Wan, Quan; Golikov, Anton; Pan, Yang; Simonyan, Vahan; Mazumder, Raja
2014-01-27
Next-generation sequencing (NGS) technologies have resulted in petabytes of scattered data, decentralized in archives, databases and sometimes in isolated hard-disks which are inaccessible for browsing and analysis. It is expected that curated secondary databases will help organize some of this Big Data, thereby allowing users to better navigate, search and compute on it. To address the above challenge, we have implemented an NGS biocuration workflow and are analyzing short read sequences and associated metadata from cancer patients to better understand the human variome. Curation of variation and other related information from control (normal tissue) and case (tumor) samples will provide comprehensive background information that can be used in genomic medicine research and application studies. Our approach includes a CloudBioLinux Virtual Machine which is used upstream of an integrated High-performance Integrated Virtual Environment (HIVE) that encapsulates Curated Short Read archive (CSR) and a proteome-wide variation effect analysis tool (SNVDis). As a proof-of-concept, we have curated and analyzed control and case breast cancer datasets from the NCI cancer genomics program - The Cancer Genome Atlas (TCGA). Our efforts include reviewing and recording in CSR available clinical information on patients, mapping of the reads to the reference followed by identification of non-synonymous Single Nucleotide Variations (nsSNVs) and integrating the data with tools that allow analysis of the effect of nsSNVs on the human proteome. Furthermore, we have also developed a novel phylogenetic analysis algorithm that uses SNV positions and can be used to classify the patient population. The workflow described here lays the foundation for analysis of short read sequence data to identify rare and novel SNVs that are not present in dbSNP and therefore provides a more comprehensive understanding of the human variome. Variation results for single genes as well as the entire study are available from the CSR website (http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr). Availability of thousands of sequenced samples from patients provides a rich repository of sequence information that can be utilized to identify individual level SNVs and their effect on the human proteome beyond what the dbSNP database provides.
2014-01-01
Background Next-generation sequencing (NGS) technologies have resulted in petabytes of scattered data, decentralized in archives, databases and sometimes in isolated hard-disks which are inaccessible for browsing and analysis. It is expected that curated secondary databases will help organize some of this Big Data, thereby allowing users to better navigate, search and compute on it. Results To address the above challenge, we have implemented an NGS biocuration workflow and are analyzing short read sequences and associated metadata from cancer patients to better understand the human variome. Curation of variation and other related information from control (normal tissue) and case (tumor) samples will provide comprehensive background information that can be used in genomic medicine research and application studies. Our approach includes a CloudBioLinux Virtual Machine which is used upstream of an integrated High-performance Integrated Virtual Environment (HIVE) that encapsulates Curated Short Read archive (CSR) and a proteome-wide variation effect analysis tool (SNVDis). As a proof-of-concept, we have curated and analyzed control and case breast cancer datasets from the NCI cancer genomics program - The Cancer Genome Atlas (TCGA). Our efforts include reviewing and recording in CSR available clinical information on patients, mapping of the reads to the reference followed by identification of non-synonymous Single Nucleotide Variations (nsSNVs) and integrating the data with tools that allow analysis of the effect of nsSNVs on the human proteome. Furthermore, we have also developed a novel phylogenetic analysis algorithm that uses SNV positions and can be used to classify the patient population. The workflow described here lays the foundation for analysis of short read sequence data to identify rare and novel SNVs that are not present in dbSNP and therefore provides a more comprehensive understanding of the human variome.
Variation results for single genes as well as the entire study are available from the CSR website (http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr). Conclusions Availability of thousands of sequenced samples from patients provides a rich repository of sequence information that can be utilized to identify individual level SNVs and their effect on the human proteome beyond what the dbSNP database provides. PMID:24467687
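The nsSNV identification step mentioned in this abstract rests on a simple idea: translate the reference and variant codons and compare the encoded amino acids. The following is a minimal sketch of that idea only, not the HIVE or SNVDis implementation. The function name `classify_snv` and the example codons are hypothetical; the abstracts in this listing do not give the actual codons involved.

```python
# Standard genetic code, encoded compactly with bases in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_snv(ref_codon, position, alt_base):
    """Classify a single-base change inside one codon.

    ref_codon: three-letter DNA codon on the coding strand.
    position:  0-based index of the changed base within the codon.
    alt_base:  the variant base.
    Returns (kind, ref_amino_acid, alt_amino_acid), with '*' meaning stop.
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        # Includes missense changes and stop-loss changes.
        kind = "non-synonymous"
    return kind, ref_aa, alt_aa


# Illustrative codons only (hypothetical, not taken from any study above).
print(classify_snv("GCT", 1, "T"))  # an Ala codon changed to a Val codon
print(classify_snv("TAT", 2, "C"))  # a Tyr codon changed to another Tyr codon
```

A real pipeline would also need the transcript annotation, strand and reading frame to locate the codon, which this sketch leaves out.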
Raadsma, H W; Jonas, E; Fleet, M R; Fullard, K; Gongora, J; Cavanagh, C R; Tammen, I; Thomson, P C
2013-08-01
The pursuit of white features and white fleeces free of pigmented fibre has been an important selection objective for many sheep breeds. The cause and inheritance of non-white colour patterns in sheep have been studied since the early 19th century. Discovery of genetic causes, especially those which predispose pigmentation in white sheep, may lead to more accurate selection tools for improved apparel wool. This article describes an extended QTL study for 13 skin and fibre pigmentation traits in sheep. A total of 19 highly significant, 10 significant and seven suggestive QTL were identified in a QTL mapping experiment using an Awassi × Merino × Merino backcross sheep population. All QTL on chromosome 2 had LOD scores greater than 4 (range 4.4-30.1), giving very strong support for a major gene for pigmentation on this chromosome. Evidence of epistatic interactions was found for QTL for four traits on chromosomes 2 and 19. The ovine TYRP1 gene on OAR 2 was sequenced as a strong positional candidate gene. A highly significant association (P < 0.01) of grandparental haplotypes across nine segregating SNP/microsatellite markers, including one non-synonymous SNP, with pigmentation traits could be shown. Up to 47% of the observed variation in pigmentation was accounted for by models using TYRP1 haplotypes and 83% for models with interactions between two QTL probabilities, offering scope for marker-assisted selection for these traits. © 2013 The Authors, Animal Genetics © 2013 Stichting International Foundation for Animal Genetics.
Ma, Xiaoyin; Ma, Zhiwei; Jiao, Xiaodong; Hejtmancik, J Fielding
2017-08-30
To identify possible genetic variants influencing expression of EPHA2 (Ephrin-receptor Type-A2), a tyrosine kinase receptor that has been shown to be important for lens development and to contribute to both congenital and age-related cataract when mutated, the extended promoter region of EPHA2 was screened for variants. SNP rs6603883 lies in a PAX2 binding site in the EPHA2 promoter region. The C (minor) allele decreased EPHA2 transcriptional activity relative to the T allele by reducing the binding affinity of PAX2. Knockdown of PAX2 in human lens epithelial (HLE) cells decreased endogenous expression of EPHA2. Whole RNA sequencing showed that extracellular matrix (ECM), MAPK-AKT signaling pathway and cytoskeleton-related genes were dysregulated in EPHA2 knockdown HLE cells. Taken together, these results indicate that a functional non-coding SNP in the EPHA2 promoter affects PAX2 binding and reduces EPHA2 expression. They further suggest that decreasing EPHA2 levels alters MAPK and AKT signaling pathways and ECM and cytoskeletal genes in lens cells, which could contribute to cataract. These results demonstrate a direct role for PAX2 in EPHA2 expression and help delineate the role of EPHA2 in development and homeostasis required for lens transparency.
Li, C; Li, G L; Luo, Q; Li, S J; Wang, R B; Lou, Y L; Lyu, J X; Wan, K L
2017-02-10
Objective: To investigate the relationship between D-cycloserine resistance and mutations in the alrA, ddlA and cycA genes of Mycobacterium (M.) tuberculosis, as well as the association between D-cycloserine resistance and spoligotyping genotype. Methods: A total of 145 M. tuberculosis strains were selected from the strain bank. D-cycloserine resistant phenotypes of the strains were determined by the proportion method and the minimal inhibitory concentration was determined by resazurin microtiter assay. PCR amplification and direct DNA sequencing were used for the analysis of gene mutations. The relationship between resistance phenotype and genotype was analyzed by chi-square test. Results: Of the 145 clinically collected strains, 24 (16.6%) were D-cycloserine resistant and 121 (83.4%) were sensitive. Only synonymous mutations were found in alrA, ddlA and cycA in sensitive strains. Of the 24 D-cycloserine resistant strains, 3 (12.5%) isolates carried non-synonymous mutations in cycA and 1 (4.2%) isolate carried a non-synonymous mutation in alrA, at codons 188, 318 and 508 of cycA and codon 261 of alrA, respectively. Drug sensitivity tests confirmed that the minimal inhibitory concentrations of the mutant strains were all increased to some degree. The D-cycloserine resistance rates of the 88 Beijing genotype and 57 non-Beijing genotype strains were 20.5% and 10.5%, respectively, but the difference was not statistically significant (χ² = 2.47, P > 0.05). Conclusions: Non-synonymous mutations of alrA and cycA might be one of the mechanisms of D-cycloserine resistance in M. tuberculosis. The Beijing or non-Beijing genotype of M. tuberculosis was not found to be associated with D-cycloserine resistance.
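The chi-square comparison of resistance rates reported in this abstract can be reproduced as a worked example. The sketch below assumes an uncorrected Pearson chi-square on a 2×2 table. The resistant counts (18 of 88 Beijing, 6 of 57 non-Beijing) are not stated in the abstract; they are inferred here from the reported percentages and the total of 24 resistant strains. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p) where p is the upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: Beijing, non-Beijing. Columns: resistant, sensitive.
# Counts inferred from 20.5% of 88 and 10.5% of 57 (an assumption).
chi2, p = chi_square_2x2(18, 70, 6, 51)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

Under these inferred counts the statistic rounds to the 2.47 reported in the abstract, and the p value is above 0.05, in line with the stated lack of significance.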
2014-01-01
Background Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. Results In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. Conclusions As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality. PMID:24755115
Exome sequencing identifies complex I NDUFV2 mutations as a novel cause of Leigh syndrome.
Cameron, Jessie M; MacKay, Nevena; Feigenbaum, Annette; Tarnopolsky, Mark; Blaser, Susan; Robinson, Brian H; Schulze, Andreas
2015-09-01
Two siblings with hypertrophic cardiomyopathy and brain atrophy were diagnosed with Complex I deficiency based on low enzyme activity in muscle and high lactate/pyruvate ratio in fibroblasts. Whole exome sequencing results of fibroblast gDNA from one sibling were narrowed down to 190 SNPs or In/Dels in 185 candidate genes by selecting non-synonymous coding sequence base pair changes that were not present in the SNP database. Two compound heterozygous mutations were identified in both siblings in NDUFV2, encoding the 24 kDa subunit of Complex I. The intronic mutation (c.IVS2 + 1delGTAA) is disease causing and has been reported before. The other mutation is novel (c.669_670insG, p.Ser224Valfs*3) and predicted to cause a pathogenic frameshift in the protein. Subsequent investigation of 10 probands with complex I deficiency from different families revealed homozygosity for the intronic c.IVS2 + 1delGTAA mutation in a second, consanguineous family. In this family three of five siblings were affected. Interestingly, they presented with Leigh syndrome but no cardiac involvement. The same genotype had been reported previously in two families, but presenting with hypertrophic cardiomyopathy, trunk hypotonia and encephalopathy. We have identified NDUFV2 mutations in two families with Complex I deficiency, including a novel mutation. The diagnosis of Leigh syndrome expands the clinical phenotypes associated with the c.IVS2 + 1delGTAA mutation in this gene. Copyright © 2015 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
Kumari, Priyanka; Singh, Subodh Kumar; Raman, Rajiva
2018-06-05
Genome-wide linkage analysis and whole genome sequencing in a Van der Woude syndrome (VWS) family revealed that the SNP rs539075, within intron 2 of the cadherin 2 gene (CDH2), co-segregated with the disease phenotype. A study with nonsyndromic cleft lip with or without cleft palate (NSCL ± P) cases (N = 292) and controls (N = 287) established association of this SNP with NSCL ± P as a risk factor. RT-PCR based expression analysis of the SNP-harbouring region of intron 2 of CDH2 in the clefted lip and/or palate tissues of 16 patients revealed that the mutant allele was expressed in all individuals carrying it (hetero-/homozygous), whereas the wild type allele was expressed in <50% of the samples in which it was present. The intronic transcript was also present in the prospective lip and palate region of the 13.5 dpc mouse embryo, as detected by RNA in situ hybridization and RT-PCR. These results, including the in silico characterization of the ~200 nt intronic transcript, showed that conformationally it fits best with a noncoding small RNA, possibly a precursor of miRNA. Its function in orofacial organogenesis remains to be elucidated, which will enable us to define the role of this mutant ncRNA in the clefting of lip and palate. Copyright © 2018 Elsevier B.V. All rights reserved.
Boucher, Gabrielle; Lo, Ken Sin; Rivas, Manuel A.; Stevens, Christine; Alikashani, Azadeh; Ladouceur, Martin; Ellinghaus, David; Törkvist, Leif; Goel, Gautam; Lagacé, Caroline; Annese, Vito; Bitton, Alain; Begun, Jakob; Brant, Steve R.; Bresso, Francesca; Cho, Judy H.; Duerr, Richard H.; Halfvarson, Jonas; McGovern, Dermot P. B.; Radford-Smith, Graham; Schreiber, Stefan; Schumm, Philip L.; Sharma, Yashoda; Silverberg, Mark S.; Weersma, Rinse K.; D'Amato, Mauro; Vermeire, Severine; Franke, Andre; Lettre, Guillaume; Xavier, Ramnik J.; Daly, Mark J.; Rioux, John D.
2013-01-01
Genome-wide association studies and follow-up meta-analyses in Crohn's disease (CD) and ulcerative colitis (UC) have recently identified 163 disease-associated loci that meet genome-wide significance for these two inflammatory bowel diseases (IBD). These discoveries have already had a tremendous impact on our understanding of the genetic architecture of these diseases and have directed functional studies that have revealed some of the biological functions that are important to IBD (e.g. autophagy). Nonetheless, these loci can only explain a small proportion of disease variance (∼14% in CD and 7.5% in UC), suggesting that not only are additional loci to be found but that the known loci may contain high effect rare risk variants that have gone undetected by GWAS. To test this, we have used a targeted sequencing approach in 200 UC cases and 150 healthy controls (HC), all of French Canadian descent, to study 55 genes in regions associated with UC. We performed follow-up genotyping of 42 rare non-synonymous variants in independent case-control cohorts (totaling 14,435 UC cases and 20,204 HC). Our results confirmed significant association to rare non-synonymous coding variants in both IL23R and CARD9, previously identified from sequencing of CD loci, as well as identified a novel association in RNF186. With the exception of CARD9 (OR = 0.39), the rare non-synonymous variants identified were of moderate effect (OR = 1.49 for RNF186 and OR = 0.79 for IL23R). RNF186 encodes a protein with a RING domain having predicted E3 ubiquitin-protein ligase activity and two transmembrane domains. Importantly, the disease-coding variant is located in the ubiquitin ligase domain. Finally, our results suggest that rare variants in genes identified by genome-wide association in UC are unlikely to contribute significantly to the overall variance for the disease. Rather, these are expected to help focus functional studies of the corresponding disease loci. PMID:24068945
Weng, Jianfeng; Li, Bo; Liu, Changlin; Yang, Xiaoyan; Wang, Hongwei; Hao, Zhuanfang; Li, Mingshun; Zhang, Degui; Ci, Xiaoke; Li, Xinhai; Zhang, Shihuang
2013-07-05
Kernel weight, controlled by quantitative trait loci (QTL), is an important component of grain yield in maize. Cytokinins (CKs) participate in determining grain morphology and final grain yield in crops. ZmIPT2, which is expressed mainly in the basal transfer cell layer, endosperm, and embryo during maize kernel development, encodes an isopentenyl transferase (IPT) that is involved in CK biosynthesis. The coding region of ZmIPT2 was sequenced across a panel of 175 maize inbred lines that are currently used in Chinese maize breeding programs. Only 16 single nucleotide polymorphisms (SNPs) and seven haplotypes were detected among these inbred lines. Nucleotide diversity (π) within the ZmIPT2 window and coding region were 0.347 and 0.0047, respectively, and they were significantly lower than the mean nucleotide diversity value of 0.372 for maize Chromosome 2 (P < 0.01). Association mapping revealed that a single nucleotide change from cytosine (C) to thymine (T) in the ZmIPT2 coding region, which converted a proline residue into a serine residue, was significantly associated with hundred kernel weight (HKW) in three environments (P < 0.05), and explained 4.76% of the total phenotypic variation. In vitro characterization suggests that the dimethylallyl diphosphate (DMAPP) IPT activity of ZmIPT2-T is higher than that of ZmIPT2-C, as the amounts of adenosine triphosphate (ATP), adenosine diphosphate (ADP), and adenosine monophosphate (AMP) consumed by ZmIPT2-T were 5.48-, 2.70-, and 1.87-fold, respectively, greater than those consumed by ZmIPT2-C. The effects of artificial selection on the ZmIPT2 coding region were evaluated using Tajima's D tests across six subgroups of Chinese maize germplasm, with the most frequent favorable allele identified in subgroup PB (Partner B). These results showed that ZmIPT2, which is associated with kernel weight, was subjected to artificial selection during the maize breeding process.
ZmIPT2-T had higher IPT activity than ZmIPT2-C, and this favorable allele for kernel weight could be used in molecular marker-assisted selection for improvement of grain yield components in Chinese maize breeding programs.
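Nucleotide diversity (π), as used in the maize abstract above, is commonly estimated as the average number of pairwise differences per site across a set of aligned sequences. The sketch below illustrates that estimator only. The function name `nucleotide_diversity` and the four short sequences are hypothetical, and the abstract does not say which software or gap handling the authors used.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site for equal-length aligned sequences.

    Gaps and ambiguous bases are treated like any other character, which is a
    simplification relative to dedicated population-genetics software.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    total_diff = sum(
        sum(x != y for x, y in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diff / (len(pairs) * length)


# Hypothetical toy alignment: 4 sequences, 10 sites.
alignment = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTTCGTAC",
    "ACGTTCGTAT",
]
print(nucleotide_diversity(alignment))
```

In this toy alignment the six pairs differ at 0, 1, 2, 1, 2 and 1 sites, so π is 7 differences over 6 pairs × 10 sites.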
New data and an old puzzle: the negative association between schizophrenia and rheumatoid arthritis.
Lee, S Hong; Byrne, Enda M; Hultman, Christina M; Kähler, Anna; Vinkhuyzen, Anna A E; Ripke, Stephan; Andreassen, Ole A; Frisell, Thomas; Gusev, Alexander; Hu, Xinli; Karlsson, Robert; Mantzioris, Vasilis X; McGrath, John J; Mehta, Divya; Stahl, Eli A; Zhao, Qiongyi; Kendler, Kenneth S; Sullivan, Patrick F; Price, Alkes L; O'Donovan, Michael; Okada, Yukinori; Mowry, Bryan J; Raychaudhuri, Soumya; Wray, Naomi R; Byerley, William; Cahn, Wiepke; Cantor, Rita M; Cichon, Sven; Cormican, Paul; Curtis, David; Djurovic, Srdjan; Escott-Price, Valentina; Gejman, Pablo V; Georgieva, Lyudmila; Giegling, Ina; Hansen, Thomas F; Ingason, Andrés; Kim, Yunjung; Konte, Bettina; Lee, Phil H; McIntosh, Andrew; McQuillin, Andrew; Morris, Derek W; Nöthen, Markus M; O'Dushlaine, Colm; Olincy, Ann; Olsen, Line; Pato, Carlos N; Pato, Michele T; Pickard, Benjamin S; Posthuma, Danielle; Rasmussen, Henrik B; Rietschel, Marcella; Rujescu, Dan; Schulze, Thomas G; Silverman, Jeremy M; Thirumalai, Srinivasa; Werge, Thomas; Agartz, Ingrid; Amin, Farooq; Azevedo, Maria H; Bass, Nicholas; Black, Donald W; Blackwood, Douglas H R; Bruggeman, Richard; Buccola, Nancy G; Choudhury, Khalid; Cloninger, Robert C; Corvin, Aiden; Craddock, Nicholas; Daly, Mark J; Datta, Susmita; Donohoe, Gary J; Duan, Jubao; Dudbridge, Frank; Fanous, Ayman; Freedman, Robert; Freimer, Nelson B; Friedl, Marion; Gill, Michael; Gurling, Hugh; De Haan, Lieuwe; Hamshere, Marian L; Hartmann, Annette M; Holmans, Peter A; Kahn, René S; Keller, Matthew C; Kenny, Elaine; Kirov, George K; Krabbendam, Lydia; Krasucki, Robert; Lawrence, Jacob; Lencz, Todd; Levinson, Douglas F; Lieberman, Jeffrey A; Lin, Dan-Yu; Linszen, Don H; Magnusson, Patrik K E; Maier, Wolfgang; Malhotra, Anil K; Mattheisen, Manuel; Mattingsdal, Morten; McCarroll, Steven A; Medeiros, Helena; Melle, Ingrid; Milanova, Vihra; Myin-Germeys, Inez; Neale, Benjamin M; Ophoff, Roel A; Owen, Michael J; Pimm, Jonathan; Purcell, Shaun M; Puri, Vinay; Quested, Digby J; Rossin, 
Lizzy; Ruderfer, Douglas; Sanders, Alan R; Shi, Jianxin; Sklar, Pamela; St Clair, David; Stroup, T Scott; Van Os, Jim; Visscher, Peter M; Wiersma, Durk; Zammit, Stanley; Bridges, S Louis; Choi, Hyon K; Coenen, Marieke J H; de Vries, Niek; Dieud, Philippe; Greenberg, Jeffrey D; Huizinga, Tom W J; Padyukov, Leonid; Siminovitch, Katherine A; Tak, Paul P; Worthington, Jane; De Jager, Philip L; Denny, Joshua C; Gregersen, Peter K; Klareskog, Lars; Mariette, Xavier; Plenge, Robert M; van Laar, Mart; van Riel, Piet
2015-10-01
A long-standing epidemiological puzzle is the reduced rate of rheumatoid arthritis (RA) in those with schizophrenia (SZ) and vice versa. Traditional epidemiological approaches to determine if this negative association is underpinned by genetic factors would test for reduced rates of one disorder in relatives of the other, but sufficiently powered data sets are difficult to achieve. The genomics era presents an alternative paradigm for investigating the genetic relationship between two uncommon disorders. We use genome-wide common single nucleotide polymorphism (SNP) data from independently collected SZ and RA case-control cohorts to estimate the SNP correlation between the disorders. We test a genotype X environment (GxE) hypothesis for SZ with environment defined as winter- vs summer-born. We estimate a small but significant negative SNP-genetic correlation between SZ and RA (-0.046, s.e. 0.026, P = 0.036). The negative correlation was stronger for the SNP set attributed to coding or regulatory regions (-0.174, s.e. 0.071, P = 0.0075). Our analyses led us to hypothesize a gene-environment interaction for SZ in the form of immune challenge. We used month of birth as a proxy for environmental immune challenge and estimated the genetic correlation between winter-born and non-winter born SZ to be significantly less than 1 for coding/regulatory region SNPs (0.56, s.e. 0.14, P = 0.00090). Our results are consistent with epidemiological observations of a negative relationship between SZ and RA reflecting, at least in part, genetic factors. Results of the month of birth analysis are consistent with pleiotropic effects of genetic variants dependent on environmental context.
New data and an old puzzle: the negative association between schizophrenia and rheumatoid arthritis
Lee, S Hong; Byrne, Enda M; Hultman, Christina M; Kähler, Anna; Vinkhuyzen, Anna AE; Ripke, Stephan; Andreassen, Ole A; Frisell, Thomas; Gusev, Alexander; Hu, Xinli; Karlsson, Robert; Mantzioris, Vasilis X; McGrath, John J; Mehta, Divya; Stahl, Eli A; Zhao, Qiongyi; Kendler, Kenneth S; Sullivan, Patrick F; Price, Alkes L; O’Donovan, Michael; Okada, Yukinori; Mowry, Bryan J; Raychaudhuri, Soumya; Wray, Naomi R; Byerley, William; Cahn, Wiepke; Cantor, Rita M; Cichon, Sven; Cormican, Paul; Curtis, David; Djurovic, Srdjan; Escott-Price, Valentina; Gejman, Pablo V; Georgieva, Lyudmila; Giegling, Ina; Hansen, Thomas F; Ingason, Andrés; Kim, Yunjung; Konte, Bettina; Lee, Phil H; McIntosh, Andrew; McQuillin, Andrew; Morris, Derek W; Nöthen, Markus M; O’Dushlaine, Colm; Olincy, Ann; Olsen, Line; Pato, Carlos N; Pato, Michele T; Pickard, Benjamin S; Posthuma, Danielle; Rasmussen, Henrik B; Rietschel, Marcella; Rujescu, Dan; Schulze, Thomas G; Silverman, Jeremy M; Thirumalai, Srinivasa; Werge, Thomas; Agartz, Ingrid; Amin, Farooq; Azevedo, Maria H; Bass, Nicholas; Black, Donald W; Blackwood, Douglas H R; Bruggeman, Richard; Buccola, Nancy G; Choudhury, Khalid; Cloninger, Robert C; Corvin, Aiden; Craddock, Nicholas; Daly, Mark J; Datta, Susmita; Donohoe, Gary J; Duan, Jubao; Dudbridge, Frank; Fanous, Ayman; Freedman, Robert; Freimer, Nelson B; Friedl, Marion; Gill, Michael; Gurling, Hugh; De Haan, Lieuwe; Hamshere, Marian L; Hartmann, Annette M; Holmans, Peter A; Kahn, René S; Keller, Matthew C; Kenny, Elaine; Kirov, George K; Krabbendam, Lydia; Krasucki, Robert; Lawrence, Jacob; Lencz, Todd; Levinson, Douglas F; Lieberman, Jeffrey A; Lin, Dan-Yu; Linszen, Don H; Magnusson, Patrik KE; Maier, Wolfgang; Malhotra, Anil K; Mattheisen, Manuel; Mattingsdal, Morten; McCarroll, Steven A; Medeiros, Helena; Melle, Ingrid; Milanova, Vihra; Myin-Germeys, Inez; Neale, Benjamin M; Ophoff, Roel A; Owen, Michael J; Pimm, Jonathan; Purcell, Shaun M; Puri, Vinay; Quested, Digby J; Rossin, 
Lizzy; Ruderfer, Douglas; Sanders, Alan R; Shi, Jianxin; Sklar, Pamela; St. Clair, David; Stroup, T Scott; Van Os, Jim; Visscher, Peter M; Wiersma, Durk; Zammit, Stanley; Bridges, S Louis; Choi, Hyon K; Coenen, Marieke JH; de Vries, Niek; Dieud, Philippe; Greenberg, Jeffrey D; Huizinga, Tom WJ; Padyukov, Leonid; Siminovitch, Katherine A; Tak, Paul P; Worthington, Jane; De Jager, Philip L; Denny, Joshua C; Gregersen, Peter K; Klareskog, Lars; Mariette, Xavier; Plenge, Robert M; van Laar, Mart; van Riel, Piet
2015-01-01
Background: A long-standing epidemiological puzzle is the reduced rate of rheumatoid arthritis (RA) in those with schizophrenia (SZ) and vice versa. Traditional epidemiological approaches to determine if this negative association is underpinned by genetic factors would test for reduced rates of one disorder in relatives of the other, but sufficiently powered data sets are difficult to achieve. The genomics era presents an alternative paradigm for investigating the genetic relationship between two uncommon disorders. Methods: We use genome-wide common single nucleotide polymorphism (SNP) data from independently collected SZ and RA case-control cohorts to estimate the SNP correlation between the disorders. We test a genotype X environment (GxE) hypothesis for SZ with environment defined as winter- vs summer-born. Results: We estimate a small but significant negative SNP-genetic correlation between SZ and RA (−0.046, s.e. 0.026, P = 0.036). The negative correlation was stronger for the SNP set attributed to coding or regulatory regions (−0.174, s.e. 0.071, P = 0.0075). Our analyses led us to hypothesize a gene-environment interaction for SZ in the form of immune challenge. We used month of birth as a proxy for environmental immune challenge and estimated the genetic correlation between winter-born and non-winter born SZ to be significantly less than 1 for coding/regulatory region SNPs (0.56, s.e. 0.14, P = 0.00090). Conclusions: Our results are consistent with epidemiological observations of a negative relationship between SZ and RA reflecting, at least in part, genetic factors. Results of the month of birth analysis are consistent with pleiotropic effects of genetic variants dependent on environmental context. PMID:26286434
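The estimates and standard errors quoted in this abstract can be related to their P values with a simple normal approximation. The sketch below assumes a one-sided Wald test, which is an assumption: the abstract does not say how the P values were computed, and the authors may have used a likelihood-ratio test, so the results are expected to be close to the reported values without matching them exactly. The function name `one_sided_p` is hypothetical.

```python
import math


def one_sided_p(estimate, se, null=0.0):
    """Wald z statistic and one-sided normal tail probability.

    z = (estimate - null) / se; the tail is taken in the direction of z.
    """
    z = (estimate - null) / se
    p = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return z, p


# (label, estimate, s.e., null value) as quoted in the abstract.
cases = [
    ("SZ-RA, all SNPs", -0.046, 0.026, 0.0),
    ("SZ-RA, coding/regulatory SNPs", -0.174, 0.071, 0.0),
    ("winter vs non-winter SZ, tested against 1", 0.56, 0.14, 1.0),
]
for label, est, se, null in cases:
    z, p = one_sided_p(est, se, null)
    print(f"{label}: z = {z:.2f}, one-sided p = {p:.4f}")
```

Note that the third comparison tests whether the genetic correlation is less than 1, not whether it differs from 0, which is why the null value is set to 1.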
NASA Astrophysics Data System (ADS)
Liu, Meng; Liu, Yuan; Hui, Min; Song, Chengwen; Cui, Zhaoxia
2017-03-01
Clip domain serine proteases (cSPs) and their homologs (SPHs) are essential components of extracellular signaling cascades and play an important role in various biological processes, especially in the innate immune responses of invertebrates. Here, polymorphisms of PtcSP and PtSPH from the swimming crab Portunus trituberculatus were investigated to explore their association with resistance/susceptibility to Vibrio alginolyticus. Polymorphic loci were identified using Clustal X and characterized with SPSS 16.0 software, and the significance of differences in genotype and allele frequencies between resistant and susceptible stocks was then determined by a χ² test. A total of 109 and 77 single nucleotide polymorphisms (SNPs) were identified in the genomic fragments of PtcSP and PtSPH, respectively. Notably, nearly half of the PtSPH polymorphisms were found in the non-coding exon 1. Fourteen of the SNPs investigated were significantly associated with susceptibility/resistance to V. alginolyticus (P < 0.05). Among them, eight SNPs were observed in introns, and one synonymous SNP, four non-synonymous SNPs and one ins-del were found in coding exons. In addition, five simple sequence repeats (SSRs) were detected in intron 3 of PtcSP. Although there was no statistically significant difference in allele frequencies, the SSRs showed different polymorphic alleles on the basis of the repeat number between resistant and susceptible stocks. After further validation, the polymorphisms investigated here might be applied to select potential molecular markers of P. trituberculatus with resistance to V. alginolyticus.
Semantic Relationships between Contextual Synonyms
ERIC Educational Resources Information Center
Zeng, Xian-mo
2007-01-01
Contextual synonyms are a linguistic phenomenon often applied but rarely discussed. This paper discusses the semantic relationships between contextual synonyms and the requirements under which words can be used as contextual synonyms of each other. The three basic relationships are embedment, intersection and non-coherence. The requirements…
Milivojevic, Verica; Feinn, Richard; Kranzler, Henry R.; Covault, Jonathan
2014-01-01
Rationale Animal models suggest that neuroactive steroids contribute to alcohol’s acute effects. We previously reported that a common non-synonymous polymorphism, AKR1C3*2, in the gene encoding the enzyme 3α-HSD2/17β-HSD5 and a synonymous SNP, rs248793, in SRD5A1, which encodes 5α-reductase, were associated with alcohol dependence (AD). Objectives To investigate whether these polymorphisms moderate subjective effects of alcohol in humans and whether AKR1C3*2 affects neuroactive steroid synthesis. Methods 65 Caucasian men (34 lighter and 31 heavier drinkers; mean age 26.2 y) participated in a double-blind laboratory study where they consumed drinks containing no ethanol or 0.8 g/kg of ethanol. Breath alcohol, heart rate (HR), and self-reported alcohol effects were measured at 40-min intervals and genotype was examined as a moderator of alcohol’s effects. Levels of the neuroactive steroid 5α-androstane-3α,17β-diol and its precursors, 3α,5α-androsterone and dihydrotestosterone, were measured at study entry using GC/MS. Results Initially, carriers of the AD-protective AKR1C3*2 G-allele had higher levels of 5α-androstane-3α,17β-diol relative to the precursor 3α,5α-androsterone than C-allele homozygotes. AKR1C3*2 G-allele carriers exhibited greater increases in heart rate and stimulant and sedative effects of alcohol than C-allele homozygotes. The genotype effects on sedation were observed only in heavier drinkers. The only effect of the SRD5A1 SNP was to moderate HR. There were no interactive effects of the two SNPs. Conclusions The observed effects of variation in a gene encoding a neuroactive steroid biosynthetic enzyme on the rate of 17β-reduction of androsterone relative to androstanediol and on alcohol’s sedative effects may help to explain the association of AKR1C3*2 with AD. PMID:24838369
Shortt, Katherine; Chaudhary, Suman; Grigoryev, Dmitry; Heruth, Daniel P.; Venkitachalam, Lakshmi; Zhang, Li Q.; Ye, Shui Q.
2014-01-01
Acute respiratory distress syndrome (ARDS) is a lung condition characterized by impaired gas exchange with systemic release of inflammatory mediators, causing pulmonary inflammation, vascular leak and hypoxemia. Existing biomarkers have limited effectiveness as diagnostic and therapeutic targets. To identify disease-associating variants in ARDS patients, whole-exome sequencing was performed on 96 ARDS patients, detecting 1,382,399 SNPs. By comparing these exome data to those of the 1000 Genomes Project, we identified a number of single nucleotide polymorphisms (SNPs) which are potentially associated with ARDS. 50,190 SNPs were found in all case subgroups and controls, of which 89 SNPs were associated with susceptibility. We validated three SNPs (rs78142040, rs9605146 and rs3848719) in additional ARDS patients to substantiate their associations with susceptibility, severity and outcome of ARDS. rs78142040 (C>T) occurs within a histone mark (intron 6) of the Arylsulfatase D gene. rs9605146 (G>A) causes a deleterious coding change (proline to leucine) in the XK, Kell blood group complex subunit-related family, member 3 gene. rs3848719 (G>A) is a synonymous SNP in the Zinc-Finger/Leucine-Zipper Co-Transducer NIF1 gene. rs78142040, rs9605146, and rs3848719 are associated significantly with susceptibility to ARDS. rs3848719 is associated with APACHE II score quartile. rs78142040 is associated with 60-day mortality in the overall ARDS patient population. Exome-seq is a powerful tool to identify potential new biomarkers for ARDS. We selectively validated three SNPs which have not been previously associated with ARDS and represent potential new genetic biomarkers for ARDS. Additional validation in larger patient populations and further exploration of underlying molecular mechanisms are warranted. PMID:25372662
Comprehensive Analysis of Non-Synonymous Natural Variants of G Protein-Coupled Receptors.
Kim, Hee Ryung; Duc, Nguyen Minh; Chung, Ka Young
2018-03-01
G protein-coupled receptors (GPCRs) are the largest superfamily of transmembrane receptors and have vital signaling functions in various organs. Because of their critical roles in physiology and pathology, GPCRs are the most commonly used therapeutic targets. It has been suggested that GPCRs undergo massive genetic variations such as genetic polymorphisms and DNA insertions or deletions. Among these genetic variations, non-synonymous natural variations change the amino acid sequence and could thus alter GPCR functions such as expression, localization, signaling, and ligand binding, which may be involved in disease development and altered responses to GPCR-targeting drugs. Despite the clinical importance of GPCRs, studies on the genotype-phenotype relationship of GPCR natural variants have been limited to a few GPCRs such as β-adrenergic receptors and opioid receptors. Comprehensive understanding of non-synonymous natural variations within GPCRs would help to predict the unknown genotype-phenotype relationship and yet-to-be-discovered natural variants. Here, we analyzed the non-synonymous natural variants of all non-olfactory GPCRs available from a public database, UniProt. The results suggest that non-synonymous natural variations occur extensively within the GPCR superfamily, especially in the N-terminus and transmembrane domains. Within the transmembrane domains, natural variations were observed more frequently in the conserved residues, which leads to disruption of the receptor function. Our analysis also suggests that only a few non-synonymous natural variations have been studied in efforts to link the variations with functional consequences.
Bowman, Larry L; Kondrateva, Elizaveta S; Timofeyev, Maxim A; Yampolsky, Lev Y
2018-06-01
Local adaptation and phenotypic plasticity are main mechanisms of organisms' resilience in changing environments. Both are affected by gene flow and are expected to be weak in zooplankton populations inhabiting large continuous water bodies and strongly affected by currents. Lake Baikal, the deepest and one of the coldest lakes on Earth, experienced epilimnion temperature increase during the last 100 years, exposing Baikal's zooplankton to novel selective pressures. We obtained a partial transcriptome of Epischura baikalensis (Copepoda: Calanoida), the dominant component of Baikal's zooplankton, and estimated SNP allele frequencies and transcript abundances in samples from regions of Baikal that differ in multiyear average surface temperatures. The strongest signal in both SNP and transcript abundance differentiation is the SW-NE gradient along the 600+ km long axis of the lake, suggesting isolation by distance. SNP differentiation is stronger for nonsynonymous than synonymous SNPs and is paralleled by differential survival during a laboratory exposure to increased temperature, indicating directional selection operating on the temperature gradient. Transcript abundance, generally collinear with the SNP differentiation, shows samples from the warmest, less deep location clustering together with the southernmost samples. Differential expression is more frequent among transcripts orthologous to candidate thermal response genes previously identified in model arthropods, including genes encoding cytoskeleton proteins, heat-shock proteins, proteases, enzymes of central energy metabolism, lipid and antioxidant pathways. We conclude that the pivotal endemic zooplankton species in Lake Baikal exists under temperature-mediated selection and possesses both genetic variation and plasticity to respond to novel temperature-related environmental pressures. © 2018 John Wiley & Sons Ltd.
A 48 SNP set for grapevine cultivar identification
2011-01-01
Background Rapid and consistent genotyping is an important requirement for cultivar identification in many crop species. Among them grapevine cultivars have been the subject of multiple studies given the large number of synonyms and homonyms generated during many centuries of vegetative multiplication and exchange. Simple sequence repeat (SSR) markers have been preferred until now because of their high level of polymorphism, their codominant nature and their high profile repeatability. However, the rapid application of partial or complete genome sequencing approaches is identifying thousands of single nucleotide polymorphisms (SNP) that can be very useful for such purposes. Although SNP markers are bi-allelic, and therefore not as polymorphic as microsatellites, the high number of loci that can be multiplexed and the possibilities of automation as well as their highly repeatable results under any analytical procedure make them the future markers of choice for any type of genetic identification. Results We analyzed over 300 SNP in the genome of grapevine using a re-sequencing strategy in a selection of 11 genotypes. Among the identified polymorphisms, we selected 48 SNP spread across all grapevine chromosomes with allele frequencies balanced enough as to provide sufficient information content for genetic identification in grapevine allowing for good genotyping success rate. Marker stability was tested in repeated analyses of a selected group of cultivars obtained worldwide to demonstrate their usefulness in genetic identification. Conclusions We have selected a set of 48 stable SNP markers with a high discrimination power and a uniform genome distribution (2-3 markers/chromosome), which is proposed as a standard set for grapevine (Vitis vinifera L.) genotyping. Any previous problems derived from microsatellite allele confusion between labs or the need to run reference cultivars to identify allele sizes disappear using this type of marker. 
Furthermore, because SNP markers are bi-allelic, allele identification and genotype naming are extremely simple and genotypes obtained with different equipment and by different laboratories are always fully comparable. PMID:22060012
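The link between balanced allele frequencies and discrimination power in the abstract above can be illustrated with a short calculation. The sketch below is not taken from the paper: it assumes Hardy-Weinberg genotype proportions and independent markers (both rough approximations for clonally propagated cultivars), and the function names are hypothetical. It computes the probability that two unrelated samples share a genotype at one bi-allelic SNP, and the product of that probability over a marker set.

```python
def genotype_match_probability(p):
    """Chance that two random samples share a genotype at one bi-allelic SNP.

    p is the frequency of one allele. Assumes Hardy-Weinberg proportions
    (an illustrative assumption, not a claim from the abstract).
    """
    q = 1.0 - p
    # Genotype frequencies are p^2, 2pq and q^2; a match means both samples
    # fall in the same genotype class.
    return p**4 + (2 * p * q) ** 2 + q**4


def combined_match_probability(allele_freqs):
    """Product of per-marker match probabilities, assuming independent markers."""
    result = 1.0
    for p in allele_freqs:
        result *= genotype_match_probability(p)
    return result


if __name__ == "__main__":
    balanced = [0.5] * 48      # hypothetical perfectly balanced 48-SNP set
    unbalanced = [0.1] * 48    # hypothetical set with rare minor alleles
    print(combined_match_probability(balanced))
    print(combined_match_probability(unbalanced))
```

Under these assumptions a marker with allele frequency 0.5 has the lowest per-marker match probability (0.375), which is why balanced frequencies are preferred when the number of markers is fixed.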
Radha Rama Devi, A; Ramesh, Vakkalagadda A; Nagarajaram, H A; Satish, S P S; Jayanthi, U; Lingappa, Lokesh
2016-01-01
Glutaric aciduria type I is an autosomal recessive organic acid disorder. The primary defect is the deficiency of Glutaryl-CoA dehydrogenase (EC number 1.3.99.7) enzyme that is involved in the catabolic pathways of the amino acids l-lysine, l-hydroxylysine, and l-tryptophan. It is a treatable neuro-metabolic disorder. Early diagnosis and treatment help in preventing brain damage. The Glutaryl-CoA dehydrogenase (GCDH) gene was sequenced to identify disease causing mutations by direct sequencing of all the exons in twelve patients who were biochemically confirmed with GA I. We identified eleven mutations of which nine are homozygous mutations, one heterozygous and two synonymous mutations. Among the eleven mutations, four mutations p.Q162R, p.P286S, p.W225X in two families and p.V410M are novel. A milder clinical presentation is observed in those families who are either heterozygous or with a benign synonymous SNP. Multiple sequence alignment (MSA) of GCDH with its homologues revealed that the observed novel mutations are not tolerated by protein structure and function. The present study indicates genetic heterogeneity in GCDH gene mutations among South Indian population. Genetic analysis is useful in prenatal diagnosis and prevention. Mutation analysis is a useful tool when an enzyme assay is not available in GA I. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Linkage and association study of late-onset Alzheimer disease families linked to 9p21.3.
Züchner, S; Gilbert, J R; Martin, E R; Leon-Guerrero, C R; Xu, P-T; Browning, C; Bronson, P G; Whitehead, P; Schmechel, D E; Haines, J L; Pericak-Vance, M A
2008-11-01
A chromosomal locus for late-onset Alzheimer disease (LOAD) has previously been mapped to 9p21.3. The most significant results were reported in a sample of autopsy-confirmed families. Linkage to this locus has been independently confirmed in AD families from a consanguineous Israeli-Arab community. In the present study we analyzed an expanded clinical sample of 674 late-onset AD families, independently ascertained by three different consortia. Sample subsets were stratified by site and autopsy-confirmation. Linkage analysis of a dense array of SNPs across the chromosomal locus revealed the most significant results in the 166 autopsy-confirmed families of the NIMH sample. Peak HLOD scores of 4.95 at D9S741 and 2.81 at the nearby SNP rs2772677 were obtained in a dominant model. The linked region included the cyclin-dependent kinase inhibitor 2A gene (CDKN2A), which has been suggested as an AD candidate gene. By re-sequencing all exons in the vicinity of CDKN2A in 48 AD cases, we identified and genotyped four novel SNPs, including a non-synonymous, a synonymous, and two variations located in untranslated RNA sequences. Family-based allelic and genotypic association analysis yielded significant results in CDKN2A (rs11515: PDT p = 0.003, genotype-PDT p = 0.014). We conclude that CDKN2A is a promising new candidate gene potentially contributing to AD susceptibility on chromosome 9p.
Codon Optimizing for Increased Membrane Protein Production: A Minimalist Approach.
Mirzadeh, Kiavash; Toddo, Stephen; Nørholm, Morten H H; Daley, Daniel O
2016-01-01
Reengineering a gene with synonymous codons is a popular approach for increasing production levels of recombinant proteins. Here we present a minimalist alternative to this method, which samples synonymous codons only at the second and third positions rather than the entire coding sequence. As demonstrated with two membrane-embedded transporters in Escherichia coli, the method was more effective than optimizing the entire coding sequence. The method we present is PCR based and requires three simple steps: (1) the design of two PCR primers, one of which is degenerate; (2) the amplification of a mini-library by PCR; and (3) screening for high-expressing clones.
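To make the idea of sampling synonymous codons at only a few positions concrete, the following sketch enumerates every synonymous variant of a coding sequence in which only the second and third codons are allowed to change. It is an in-silico illustration only, not the authors' PCR protocol: reading "second and third positions" as the second and third codons is an assumption, the example sequence is made up, and the function names are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Map each amino acid to the list of codons that encode it.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(cds):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def synonymous_mini_library(cds, codon_indices=(1, 2)):
    """Enumerate variants that differ only by synonymous codons at the given
    0-based codon indices (default: second and third codons, an assumption)."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    options = [
        SYNONYMS[CODON_TABLE[codon]] if idx in codon_indices else [codon]
        for idx, codon in enumerate(codons)
    ]
    return ["".join(choice) for choice in product(*options)]


if __name__ == "__main__":
    example = "ATGAAACTGGCT"  # hypothetical sequence: Met-Lys-Leu-Ala
    library = synonymous_mini_library(example)
    print(len(library), "variants, all encoding", translate(example))
```

For the made-up example, lysine has two codons and leucine six, so the library holds twelve sequences; a degenerate primer covering the same positions would in practice encode a comparable small set.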
Bahri, Bochra A; Daverdin, Guillaume; Xu, Xiangyang; Cheng, Jan-Fang; Barry, Kerrie W; Brummer, E Charles; Devos, Katrien M
2018-06-14
Advances in genomic technologies have expanded our ability to accurately and exhaustively detect natural genomic variants that can be applied in crop improvement and to increase our knowledge of plant evolution and adaptation. Switchgrass (Panicum virgatum L.), an allotetraploid (2n = 4× = 36) perennial C4 grass (Poaceae family) native to North America and a feedstock crop for cellulosic biofuel production, has a large potential for genetic improvement due to its high genotypic and phenotypic variation. In this study, we analyzed single nucleotide polymorphism (SNP) variation in 372 switchgrass genotypes belonging to 36 accessions for 12 genes putatively involved in biomass production to investigate signatures of selection that could have led to ecotype differentiation and to population adaptation to geographic zones. A total of 11,682 SNPs were mined from ~ 15 Gb of sequence data, out of which 251 SNPs were retained after filtering. Population structure analysis largely grouped upland accessions into one subpopulation and lowland accessions into two additional subpopulations. The most frequent SNPs were in homozygous state within accessions. Sixty percent of the exonic SNPs were non-synonymous and, of these, 45% led to non-conservative amino acid changes. The non-conservative SNPs were largely in linkage disequilibrium with one haplotype being predominantly present in upland accessions while the other haplotype was commonly present in lowland accessions. Tajima's test of neutrality indicated that PHYB, a gene involved in photoperiod response, was under positive selection in the switchgrass population. PHYB carried a SNP leading to a non-conservative amino acid change in the PAS domain, a region that acts as a sensor for light and oxygen in signal transduction. Several non-conservative SNPs in genes potentially involved in plant architecture and adaptation have been identified and led to population structure and genetic differentiation of ecotypes in switchgrass. 
We suggest here that PHYB is a key gene involved in switchgrass natural selection. Further analyses are needed to determine whether any of the non-conservative SNPs identified play a role in the differential adaptation of upland and lowland switchgrass.
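The neutrality test mentioned in the abstract above compares two estimates of diversity: mean pairwise differences and the number of segregating sites. The sketch below implements the standard Tajima's D statistic (Tajima, 1989) for a small set of aligned sequences. It is an illustration under simple assumptions (equal-length sequences, no gaps or missing data), the toy sequences are invented, and it is not the pipeline used in the study.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences without missing data."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences for a non-zero variance")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


if __name__ == "__main__":
    rare_variants = ["AAAA", "AAAT", "AATA", "ATAA"]   # toy data: singletons only
    two_haplotypes = ["AAA", "AAA", "TTT", "TTT"]      # toy data: balanced haplotypes
    print(tajimas_d(rare_variants))    # negative: excess of rare variants
    print(tajimas_d(two_haplotypes))   # positive: excess of intermediate frequencies
```

The sign of D only indicates a departure from the neutral expectation; which selective or demographic process produced it has to be argued from other evidence.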
Prechl, József; Papp, Krisztián; Hérincs, Zoltán; Péterfy, Hajna; Lóránd, Veronika; Szittner, Zoltán; Estonba, Andone; Rovero, Paolo; Paolini, Ilaria; Del Amo, Jokin; Uribarri, Maria; Alcaro, Maria Claudia; Ruiz-Larrañaga, Otsanda; Migliorini, Paola; Czirják, László
2016-01-01
Systemic lupus erythematosus is a chronic autoimmune disease with multifactorial etiopathogenesis. The complement system is involved in both the early and late stages of disease development and organ damage. To better understand autoantibody mediated complement consumption we examined ex vivo immune complex formation on autoantigen arrays. We recruited patients with SLE (n = 211), with other systemic autoimmune diseases (n = 65) and non-autoimmune control subjects (n = 149). Standard clinical and laboratory data were collected and serum complement levels were determined. The genotype of SNP rs1143679 in the ITGAM gene was also determined. Ex vivo formation of immune complexes, with respect to IgM, IgG, complement C4 and C3 binding, was examined using a functional immunoassay on autoantigen microarray comprising nucleic acids, proteins and lipids. Complement consumption of nucleic acids increased upon binding of IgM and IgG even when serum complement levels were decreased due to consumption in SLE patients. A negative correlation between serum complement levels and ex vivo complement deposition on nucleic acid autoantigens is demonstrated. On the contrary, complement deposition on tested protein and lipid autoantigens showed positive correlation with C4 levels. Genetic analysis revealed that the non-synonymous variant rs1143679 in complement receptor type 3 is associated with an increased production of anti-dsDNA IgG antibodies. Notwithstanding, homozygous carriers of the previously reported susceptible allele (AA) had lower levels of dsDNA specific IgM among SLE patients. Both the non-synonymous variant rs1143679 and the high ratio of nucleic acid specific IgG/IgM were associated with multiple organ involvement. In summary, secondary complement deficiency in SLE does not impair opsonization of nucleic-acid-containing autoantigens but does affect other antigens and potentially other complement dependent processes. 
Dysfunction of the receptor recognizing complement opsonized immune complexes promotes the development of class-switched autoantibodies targeting nucleic acids.
Alsaif, Mohammed A.; Al Shammari, Sulaiman A.; Alhamdan, Adel A.
2012-01-01
Introduction Single-nucleotide polymorphisms (SNPs) are biomarkers for exploring the genetic basis of many complex human diseases. The prediction of SNPs is promising in modern genetic analysis but it is still a great challenge to identify the functional SNPs in a disease-related gene. The computational approach has overcome this challenge and an increase in the success rate of genetic association studies and reduced cost of genotyping have been achieved. The objective of this study is to identify deleterious non-synonymous SNPs (nsSNPs) associated with the COL1A1 gene. Material and methods The SNPs were retrieved from the Single Nucleotide Polymorphism Database (dbSNP). Using I-Mutant, protein stability change was calculated. The potentially functional nsSNPs and their effect on proteins were predicted by PolyPhen and SIFT respectively. FASTSNP was used for estimation of risk score. Results Our analysis revealed 247 SNPs as non-synonymous, out of which 5 nsSNPs were found to be least stable by I-Mutant 2.0 with a DDG value of > –1.0. Four nsSNPs, namely rs17853657, rs17857117, rs57377812 and rs1059454, showed a highly deleterious tolerance index score of 0.00 with a change in their physicochemical properties by the SIFT server. Seven nsSNPs, namely rs1059454, rs8179178, rs17853657, rs17857117, rs72656340, rs72656344 and rs72656351, were found to be probably damaging with a PSIC score difference between 2.0 and 3.5 by the PolyPhen server. Three nsSNPs, namely rs1059454, rs17853657 and rs17857117, were found to be highly polymorphic with a risk score of 3-4 with a possible effect of non-conservative change and splicing regulation by FASTSNP. Conclusions Three nsSNPs, namely rs1059454, rs17853657 and rs17857117, are potential functional polymorphisms that are likely to have a functional impact on the COL1A1 gene. PMID:24273577
IL6R Variation Asp358Ala Is a Potential Modifier of Lung Function in Asthma
Hawkins, Gregory A; Robinson, Mac B; Hastie, Annette T; Li, Xingnan; Li, Huashi; Moore, Wendy C; Howard, Timothy D; Busse, William W.; Erzurum, Serpil C.; Wenzel, Sally E.; Peters, Stephen P; Meyers, Deborah A; Bleecker, Eugene R
2012-01-01
Background The IL6R SNP rs4129267 has recently been identified as an asthma susceptibility locus in subjects of European ancestry but has not been characterized with respect to asthma severity. The SNP rs4129267 is in linkage disequilibrium (r²=1) with the IL6R coding SNP rs2228145 (Asp358Ala). This IL6R coding change increases IL6 receptor shedding and promotes IL6 transsignaling. Objectives To evaluate the IL6R SNP rs2228145 with respect to asthma severity phenotypes. Methods The IL6R SNP rs2228145 was evaluated in subjects of European ancestry with asthma from the Severe Asthma Research Program (SARP). Lung function associations were replicated in the Collaborative Study on the Genetics of Asthma (CSGA) cohort. Serum soluble IL6 receptor (sIL6R) levels were measured in subjects from SARP. Immunohistochemistry was used to qualitatively evaluate IL6R protein expression in BAL cells and endobronchial biopsies. Results The minor C allele of IL6R SNP rs2228145 was associated with lower ppFEV1 in the SARP cohort (p=0.005), the CSGA cohort (p=0.008), and in combined cohort analysis (p=0.003). Additional associations with ppFVC, FEV1/FVC, and PC20 were observed. The rs2228145 C allele (Ala358) was more frequent in severe asthma phenotypic clusters. Elevated serum sIL6R was associated with lower ppFEV1 (p=0.02) and lower ppFVC (p=0.008) (N=146). IL6R protein expression was observed in BAL macrophages, airway epithelium, vascular endothelium, and airway smooth muscle. Conclusions The IL6R coding SNP rs2228145 (Asp358Ala) is a potential modifier of lung function in asthma and may identify subjects at risk for more severe asthma. IL6 transsignaling may have a pathogenic role in the lung. PMID:22554704
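The linkage disequilibrium measure quoted in the abstract above (the squared correlation between alleles at two loci) follows directly from haplotype and allele frequencies. The sketch below shows the standard formula; the frequencies in the example are hypothetical and are not taken from the study, and the function name is an assumption.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Squared allelic correlation between two bi-allelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and
          allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    for name, value in (("p_a", p_a), ("p_b", p_b)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


if __name__ == "__main__":
    # Hypothetical frequencies: the two minor alleles always travel together.
    print(ld_r_squared(p_ab=0.4, p_a=0.4, p_b=0.4))
    # Hypothetical frequencies: alleles combine at random (no LD).
    print(ld_r_squared(p_ab=0.16, p_a=0.4, p_b=0.4))
```

A value of 1 means each SNP is a perfect proxy for the other, which is why an association reported for a tag SNP can be carried over to a coding SNP in complete LD with it.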
Ling, Kai-Shu; Harris, Karen R; Meyer, Jenelle D F; Levi, Amnon; Guner, Nihat; Wehner, Todd C; Bendahmane, Abdelhafid; Havey, Michael J
2009-12-01
Zucchini yellow mosaic virus (ZYMV) is one of the most economically important potyviruses infecting cucurbit crops worldwide. Using a candidate gene approach, we cloned and sequenced eIF4E and eIF(iso)4E gene segments in watermelon. Analysis of the nucleotide sequences between the ZYMV-resistant watermelon plant introduction PI 595203 (Citrullus lanatus var. lanatus) and the ZYMV-susceptible watermelon cultivar 'New Hampshire Midget' ('NHM') showed the presence of single nucleotide polymorphisms (SNPs). Initial analysis of the identified SNPs in association studies indicated that SNPs in the eIF4E, but not eIF(iso)4E, were closely associated with the phenotype of ZYMV-resistance in 70 F(2) and 114 BC(1R) progenies. Subsequently, we focused our efforts on obtaining the entire genomic sequence of watermelon eIF4E. Three SNPs were identified between PI 595203 and NHM. One of the SNPs (A241C) was in exon 1 and the other two SNPs (C309A and T554G) were in the first intron of the gene. SNP241, which resulted in an amino acid substitution (proline to threonine), was shown to be located in the critical cap recognition and binding area, similar to that of several plant species resistant to potyviruses. Analysis of a cleaved amplified polymorphism sequence (CAPS) marker derived from this SNP in F(2) and BC(1R) populations demonstrated a cosegregation between the CAPS-2 marker and their ZYMV resistance or susceptibility phenotype. When we investigated whether such SNP mutation in the eIF4E was also conserved in several other PIs of C. lanatus var. citroides, we identified a different SNP (A171G) resulting in another amino acid substitution (D71G) from four ZYMV-resistant C. lanatus var. citroides (PI 244018, PI 482261, PI 482299, and PI 482322). Additional CAPS markers were also identified. Availability of all these CAPS markers will enable marker-aided breeding of watermelon for ZYMV resistance.
Huang, Jie; Huffman, Jennifer E.; Yamkauchi, Munekazu; Trompet, Stella; Asselbergs, Folkert W.; Sabater-Lleal, Maria; Trégouët, David-Alexandre; Chen, Wei-Min; Smith, Nicholas L.; Kleber, Marcus E.; Shin, So-Youn; Becker, Diane M.; Tang, Weihong; Dehghan, Abbas; Johnson, Andrew D.; Truong, Vinh; Folkersen, Lasse; Yang, Qiong; Oudot-Mellakh, Tiphaine; Buckley, Brendan M.; Moore, Jason H.; Williams, Frances M.K.; Campbell, Harry; Silbernagel, Günther; Vitart, Veronique; Rudan, Igor; Tofler, Geoffrey H.; Navis, Gerjan J.; DeStefano, Anita; Wright, Alan F.; Chen, Ming-Huei; de Craen, Anton J.M.; Worrall, Bradford B.; Rudnicka, Alicja R.; Rumley, Ann; Bookman, Ebony B.; Psaty, Bruce M.; Chen, Fang; Keene, Keith L.; Franco, Oscar H.; Böhm, Bernhard O.; Uitterlinden, Andre G.; Carter, Angela M.; Jukema, J. Wouter; Sattar, Naveed; Bis, Joshua C.; Ikram, Mohammad A.; Sale, Michèle M.; McKnight, Barbara; Fornage, Myriam; Ford, Ian; Taylor, Kent; Slagboom, P. Eline; McArdle, Wendy L.; Hsu, Fang-Chi; Franco-Cereceda, Anders; Goodall, Alison H.; Yanek, Lisa R.; Furie, Karen L.; Cushman, Mary; Hofman, Albert; Witteman, Jacqueline CM.; Folsom, Aaron R.; Basu, Saonli; Matijevic, Nena; van Gilst, Wiek H.; Wilson, James F.; Westendorp, Rudi G.J.; Kathiresan, Sekar; Reilly, Muredach P.; Tracy, Russell P.; Polasek, Ozren; Winkelmann, Bernhard R.; Grant, Peter J.; Hillege, Hans L.; Cambien, Francois; Stott, David J.; Lowe, Gordon D.; Spector, Timothy D.; Meigs, James B.; Marz, Winfried; Eriksson, Per; Becker, Lewis C.; Morange, Pierre-Emmanuel; Soranzo, Nicole; Williams, Scott M.; Hayward, Caroline; van der Harst, Pim; Hamsten, Anders; Lowenstein, Charles J.; Strachan, David P.; O'Donnell, Christopher J.
2014-01-01
Objective Tissue plasminogen activator (tPA), a serine protease, catalyzes the conversion of plasminogen to plasmin, the major enzyme responsible for endogenous fibrinolysis. In some populations, elevated plasma levels of tPA have been associated with myocardial infarction and other cardiovascular diseases (CVD). We conducted a meta-analysis of genome-wide association studies (GWAS) to identify novel correlates of circulating levels of tPA. Approach and Results Fourteen cohort studies with tPA measures (N=26,929) contributed to the meta-analysis. Three loci were significantly associated with circulating tPA levels (P < 5.0×10^−8). The first locus is on 6q24.3, with the lead SNP (rs9399599, P=2.9×10^−14) within STXBP5. The second locus is on 8p11.21. The lead SNP (rs3136739, P=1.3×10^−9) is intronic to POLB and less than 200kb away from the tPA encoding gene PLAT. We identified a non-synonymous SNP (rs2020921) in modest LD with rs3136739 (r² = 0.50) within exon 5 of PLAT (P=2.0×10^−8). The third locus is on 12q24.33, with the lead SNP (rs7301826, P=1.0×10^−9) within intron 7 of STX2. We further found evidence for association of lead SNPs in STXBP5 and STX2 with expression levels of the respective transcripts. In in vitro cell studies, silencing STXBP5 decreased release of tPA from vascular endothelial cells, while silencing of STX2 increased tPA release. Through an in-silico lookup, we found no associations of the three lead SNPs with coronary artery disease or stroke. Conclusions We identified three loci associated with circulating tPA levels, the PLAT region, STXBP5 and STX2. Our functional studies implicate a novel role for STXBP5 and STX2 in regulating tPA release. PMID:24578379
Shannon Entropy of the Canonical Genetic Code
NASA Astrophysics Data System (ADS)
Nemzer, Louis
The probability that a non-synonymous point mutation in DNA will adversely affect the functionality of the resultant protein is greatly reduced if the substitution is conservative. In that case, the amino acid coded by the mutated codon has similar physico-chemical properties to the original. Many simplified alphabets, which group the 20 common amino acids into families, have been proposed. To evaluate these schema objectively, we introduce a novel, quantitative method based on the inherent redundancy in the canonical genetic code. By calculating the Shannon information entropy carried by 1- or 2-bit messages, groupings that best leverage the robustness of the code are identified. The relative importance of properties related to protein folding - like hydropathy and size - and function, including side-chain acidity, can also be estimated. In addition, this approach allows us to quantify the average information value of nucleotide codon positions, and explore the physiological basis for distinguishing between transition and transversion mutations. Supported by NSU PFRDG Grant #335347.
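As a rough illustration of the kind of quantity involved, the sketch below computes the Shannon entropy of the amino-acid (or amino-acid group) distribution obtained when all 64 codons of the canonical code are taken as equally likely. This is not the author's method, which works with 1- or 2-bit messages. The uniform-codon assumption and the two-class grouping in the example are illustrative only, and the function names are hypothetical.

```python
import math
from collections import Counter

# Standard genetic code, bases ordered T, C, A, G ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def code_entropy(grouping=None):
    """Entropy of the symbol encoded by a uniformly random codon.

    grouping optionally maps one-letter amino acids (and "*") to a class
    label, modelling a simplified alphabet; unmapped symbols stay as-is.
    """
    grouping = grouping or {}
    labels = Counter(grouping.get(aa, aa) for aa in CODON_TABLE.values())
    return shannon_entropy(labels.values())


if __name__ == "__main__":
    print(round(code_entropy(), 3))  # full 20-amino-acid alphabet plus stop

    # Illustrative two-class alphabet (hydrophobic vs. everything else);
    # the class membership below is an assumption made for this example.
    hydrophobic = set("AVLIMFWC")
    two_class = {aa: ("H" if aa in hydrophobic else "P") for aa in set(AMINO)}
    print(round(code_entropy(two_class), 3))
```

Because degenerate codons collapse onto the same symbol, the full-alphabet value falls below log2(21), the maximum for 21 equally likely symbols, and coarser groupings lower it further.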
A PYY Q62P variant linked to human obesity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahituv, Nadav; Kavaslar, Nihan; Schackwitz, Wendy
2005-06-27
Members of the pancreatic polypeptide family and their receptors have been implicated in the control of food intake in rodents and humans. To investigate whether nucleotide changes in these candidate genes result in abnormal weight in humans, we sequenced the coding exons and splice sites of seven family members (NPY, PYY, PPY, NPY1R, NPY2R, NPY4R, and NPY5R) in a large cohort of extremely obese (n=379) and lean (n=378) individuals. In total we found eleven rare non-synonymous variants, four of which exhibited familial segregation, NPY1R L53P and PPY P63L with leanness and NPY2R D42G and PYY Q62P with obesity. Functional analysis of the obese variants revealed NPY2R D42G to have reduced cell surface expression, while previous cell culture based studies indicated variant PYY Q62P to have altered receptor binding selectivity and we show that it fails to reduce food intake through mouse peptide injection experiments. These results support that rare non-synonymous variants within these genes can alter susceptibility to human body mass index extremes.
Miyakawa, Hiroe; Miyamoto, Toshinobu; Koh, Eitetsu; Tsujimura, Akira; Miyagawa, Yasushi; Saijo, Yasuaki; Namiki, Mikio; Sengoku, Kazuo
2012-01-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, 10 novel genes involved in human spermatogenesis, including human SEPTIN12, were identified by expression microarray analysis of human testicular tissue. Septin12 is a member of the septin family of conserved cytoskeletal GTPases that form heteropolymeric filamentous structures in interphase cells. It is expressed specifically in the testis. Therefore, we hypothesized that mutation or polymorphisms of SEPTIN12 participate in male infertility, especially Sertoli cell-only syndrome (SCOS). To investigate whether SEPTIN12 gene defects are associated with azoospermia caused by SCOS, mutational analysis was performed in 100 Japanese patients by direct sequencing of coding regions. Statistical analysis was performed in patients with SCOS and in 140 healthy control men. No mutations were found in SEPTIN12; however, 8 coding single-nucleotide polymorphisms (SNP1-SNP8) could be detected in the patients with SCOS. The genotype and allele frequencies in SNP3, SNP4, and SNP6 were notably higher in the SCOS group than in the control group (P < .001). These results suggest that SEPTIN12 might play a critical role in human spermatogenesis.
Sinha, Siddharth; Verma, Sharad; Singh, Aditi; Somvanshi, Pallavi; Grover, Abhinav
2018-01-01
Spinocerebellar degeneration, termed ataxia, is a neurological disorder of the central nervous system, characterized by limb in-coordination and a progressive gait. The patient also demonstrates specific symptoms of muscle weakness, slurring of speech, and decreased vibration senses. Expansion of polyglutamine trinucleotide (CAG) within the ATXN2 gene with 35 or more repeats results in spinocerebellar ataxia type-2. Protein ataxin-2 coded by the ATXN2 gene has been reported to have a crucial role in translation of the genetic information through sequestering the histone acetyl transferases (HAT), resulting in a state of hypo-acetylation. In the present study, we have evaluated the outcome for 122 non-synonymous single nucleotide polymorphisms (nsSNPs) reported within the ATXN2 gene through computational tools such as SIFT, PolyPhen 2.0, PANTHER, I-mutant 2.0, PhD-SNP, Pmut, MutPred. The apo and mutant (L305V and Q339L) forms of the structure of the ataxin-2 protein were modeled for gaining insights toward 3D spatial arrangement. Further, molecular dynamics simulations and structural analysis were performed to observe the brunt of disease associated nsSNPs toward the strength and secondary properties of ataxin-2 protein structure. Our results showed that L305V is a highly deleterious and disease causing point substitution. Analysis based on RMSD, RMSF, Rg, SASA, number of hydrogen bonds (NH bonds), covariance matrix trace, and projection analysis for eigen vector demonstrated a significant instability and conformation along with rise in mutant flexibility values in comparison to the apo form of ataxin-2 protein. The study provides a blue print of computational methodologies to examine the ataxin-blend SNPs. J. Cell. Biochem. 119: 499-510, 2018. © 2017 Wiley Periodicals, Inc.
Shen, Yu-Chih; Liao, Ding-Lieh; Lu, Chao-Lin; Chen, Jen-Yeu; Liou, Ying-Jay; Chen, Tzu-Ting; Chen, Chia-Hsiang
2010-08-01
Vesicular glutamate transporters (VGLUT1-3) package glutamate into vesicles in the presynaptic terminal and regulate the release of glutamate. In mesencephalic dopamine neuron culture, the majority of isolated dopamine neurons have been shown to express VGLUT2, but not VGLUT1 or 3. In relation to the dysregulated glutamatergic hypothesis of schizophrenia, the gene encoding VGLUT2 is the most plausible candidate involved in the pathogenesis of this illness. We searched for genetic variants in the promoter region and 12 exons (including UTR ends) of the VGLUT2 gene using direct sequencing in a sample of Han Chinese schizophrenic patients (n=375) and non-psychotic controls (n=366) from Taiwan, and conducted a case-control association study. We identified 8 common SNPs in the VGLUT2 gene. SNP and haplotype-based analyses showed no association with schizophrenia. In addition, we identified 9 rare variants in 13 out of 375 patients, including 3 variants located at the promoter region, 2 synonymous variants located at protein coding regions, and 4 variants located at UTR ends. No rare variants were found in the control subjects. Collectively, these rare variants were significantly overrepresented in the patient group (3.5% versus 0, p value of Fisher's exact test=2.3×10^-5), suggesting they may contribute to the pathogenesis of schizophrenia. Although the functional significance of these rare variants remains to be characterized, our study may lend support to the multiple rare mutations hypothesis of schizophrenia, and may provide genetic clues to indicate the involvement of the glutamate transmission pathway in the pathogenesis of schizophrenia. Copyright 2010 Elsevier B.V. All rights reserved.
Tindall, B J; Sutton, G; Garrity, G M
2017-02-01
Enterobacter aerogenes Hormaeche and Edwards 1960 (Approved Lists 1980) and Klebsiella mobilis Bascomb et al. 1971 (Approved Lists 1980) were placed on the Approved Lists of Bacterial Names and were based on the same nomenclatural type, ATCC 13048. Consequently they are to be treated as homotypic synonyms. However, the names of homotypic synonyms at the rank of species normally are based on the same epithet. Examination of the Rules of the International Code of Nomenclature of Bacteria in force at the time indicates that the epithet mobilis in Klebsiella mobilis Bascomb et al. 1971 (Approved Lists 1980) was illegitimate at the time the Approved Lists were published and according to the Rules of the current International Code of Nomenclature of Prokaryotes continues to be illegitimate.
Kaya, Hilal Betul; Cetin, Oznur; Kaya, Hulya; Sahin, Mustafa; Sefer, Filiz; Kahraman, Abdullah; Tanyolac, Bahattin
2013-01-01
Background The olive tree (Olea europaea L.) is a diploid (2n = 2x = 46) outcrossing species mainly grown in the Mediterranean area, where it is the most important oil-producing crop. Because of its economic, cultural and ecological importance, various DNA markers have been used in the olive to characterize and elucidate homonyms, synonyms and unknown accessions. However, a comprehensive characterization and a full sequence of its transcriptome are unavailable, leading to the importance of an efficient large-scale single nucleotide polymorphism (SNP) discovery in olive. The objectives of this study were (1) to discover olive SNPs using next-generation sequencing and to identify SNP primers for cultivar identification and (2) to characterize 96 olive genotypes originating from different regions of Turkey. Methodology/Principal Findings Next-generation sequencing technology was used with five distinct olive genotypes and generated cDNA, producing 126,542,413 reads using an Illumina Genome Analyzer IIx. Following quality and size trimming, the high-quality reads were assembled into 22,052 contigs with an average length of 1,321 bases and 45 singletons. The SNPs were filtered and 2,987 high-quality putative SNP primers were identified. The assembled sequences and singletons were subjected to BLAST similarity searches and annotated with a Gene Ontology identifier. To identify the 96 olive genotypes, these SNP primers were applied to the genotypes in combination with amplified fragment length polymorphism (AFLP) and simple sequence repeats (SSR) markers. Conclusions/Significance This study marks the highest number of SNP markers discovered to date from olive genotypes using transcriptome sequencing. The developed SNP markers will provide a useful source for molecular genetic studies, such as genetic diversity and characterization, high density quantitative trait locus (QTL) analysis, association mapping and map-based gene cloning in the olive. 
High levels of genetic variation among Turkish olive genotypes revealed by SNPs, AFLPs and SSRs allowed us to characterize the Turkish olive genotype. PMID:24058483
A novel mutation in TFL1 homolog affecting determinacy in cowpea (Vigna unguiculata).
Dhanasekar, P; Reddy, K S
2015-02-01
Mutations in the widely conserved Arabidopsis Terminal Flower 1 (TFL1) gene and its homologs have been demonstrated to result in determinacy across genera, the knowledge of which is lacking in cowpea. Understanding the molecular events leading to determinacy of apical meristems could hasten development of cowpea varieties with suitable ideotypes. Isolation and characterization of a novel mutation in cowpea TFL1 homolog (VuTFL1) affecting determinacy is reported here for the first time. Cowpea TFL1 homolog was amplified using primers designed based on conserved sequences in related genera and sequence variation was analysed in three gamma ray-induced determinate mutants, their indeterminate parent "EC394763" and two indeterminate varieties. The analyses of sequence variation exposed a novel SNP distinguishing the determinate mutants from the indeterminate types. The non-synonymous point mutation in exon 4 at position 1,176 resulted from transversion of cytosine (C) to adenine (A) leading to an amino acid change (Pro-136 to His) in determinate mutants. The effect of the mutation on protein function and stability was predicted to be detrimental using different bioinformatics/computational tools. The functionally significant novel substitution mutation is hypothesized to affect determinacy in the cowpea mutants. Development of suitable regeneration protocols in this hitherto recalcitrant crop and subsequent complementation assay in mutants or over-expressing assay in parents could decisively conclude the role of the SNP in regulating determinacy in these cowpea mutants.
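The abstract describes the C to A change as a transversion, that is, a substitution between a pyrimidine and a purine. A minimal sketch of that classification follows; the function name is hypothetical and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Classify a single-base substitution as a transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    # Transitions stay within one chemical class (purine or pyrimidine);
    # transversions switch between the two classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# The cowpea SNP reported above: cytosine to adenine.
print(substitution_class("C", "A"))
```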
Xu, Xuewen; Ji, Jing; Xu, Qiang; Qi, Xiaohua; Weng, Yiqun; Chen, Xuehao
2018-03-01
In plants, the formation of hypocotyl-derived adventitious roots (ARs) is an important morphological acclimation to waterlogging stress; however, its genetic basis remains fragmentary. Here, through combined use of bulked segregant analysis-based whole-genome sequencing, SNP haplotyping and fine genetic mapping, we identified a candidate gene for a major-effect QTL, ARN6.1, that was responsible for waterlogging tolerance due to increased AR formation in the cucumber line Zaoer-N. Through multiple lines of evidence, we show that CsARN6.1, which encodes an AAA ATPase, is the most likely candidate for ARN6.1. The increased formation of ARs under waterlogging in Zaoer-N could be attributed to a non-synonymous SNP in the coiled-coil domain region of this gene. CsARN6.1 increases the number of ARs via its ATPase activity. Ectopic expression of CsARN6.1 in Arabidopsis resulted in better rooting ability and lateral root development in transgenic plants. Transgenic cucumber expressing the CsARN6.1 Asp allele from Zaoer-N exhibited a significant increase in number of ARs compared with the wild type expressing the allele from Pepino under waterlogging conditions. Taken together, these data support an important role for the AAA ATPase gene CsARN6.1 in increasing cucumber AR formation and waterlogging tolerance. © 2018 The Authors The Plant Journal © 2018 John Wiley & Sons Ltd.
Characterization of phenylpropanoid pathway genes within European maize (Zea mays L.) inbreds
Andersen, Jeppe Reitan; Zein, Imad; Wenzel, Gerhard; Darnhofer, Birte; Eder, Joachim; Ouzunova, Milena; Lübberstedt, Thomas
2008-01-01
Background Forage quality of maize is influenced by both the content and structure of lignins in the cell wall. Biosynthesis of monolignols, constituting the complex structure of lignins, is catalyzed by enzymes in the phenylpropanoid pathway. Results In the present study we have amplified partial genomic fragments of six putative phenylpropanoid pathway genes in a panel of elite European inbred lines of maize (Zea mays L.) contrasting in forage quality traits. Six loci, encoding C4H, 4CL1, 4CL2, C3H, F5H, and CAD, displayed different levels of nucleotide diversity and linkage disequilibrium (LD), possibly reflecting different levels of selection. Associations with forage quality traits were identified for several individual polymorphisms within the 4CL1, C3H, and F5H genomic fragments when controlling for both overall population structure and relative kinship. A 1-bp indel in 4CL1 was associated with in vitro digestibility of organic matter (IVDOM), a non-synonymous SNP in C3H was associated with IVDOM, and an intron SNP in F5H was associated with neutral detergent fiber. However, the C3H and F5H associations did not remain significant when controlling for multiple testing. Conclusion While the number of lines included in this study limits the power of the association analysis, our results imply that genetic variation for forage quality traits can be mined in phenylpropanoid pathway genes of elite breeding lines of maize. PMID:18173847
Energy efficiency trade-offs drive nucleotide usage in transcribed regions
Chen, Wei-Hua; Lu, Guanting; Bork, Peer; Hu, Songnian; Lercher, Martin J.
2016-01-01
Efficient nutrient usage is a trait under universal selection. A substantial part of cellular resources is spent on making nucleotides. We thus expect preferential use of cheaper nucleotides especially in transcribed sequences, which are often amplified thousand-fold compared with genomic sequences. To test this hypothesis, we derive a mutation-selection-drift equilibrium model for nucleotide skews (strand-specific usage of ‘A' versus ‘T' and ‘G' versus ‘C'), which explains nucleotide skews across 1,550 prokaryotic genomes as a consequence of selection on efficient resource usage. Transcription-related selection generally favours the cheaper nucleotides ‘U' and ‘C' at synonymous sites. However, the information encoded in mRNA is further amplified through translation. Due to unexpected trade-offs in the codon table, cheaper nucleotides encode on average energetically more expensive amino acids. These trade-offs apply to both strand-specific nucleotide usage and GC content, causing a universal bias towards the more expensive nucleotides ‘A' and ‘G' at non-synonymous coding sites. PMID:27098217
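The nucleotide skews mentioned in this abstract compare the usage of 'A' versus 'T' and 'G' versus 'C' on one strand. A minimal sketch, assuming the commonly used definitions (A - T)/(A + T) and (G - C)/(G + C), is given below; the abstract does not spell out the formula, and the function name and toy sequence are hypothetical.

```python
from collections import Counter


def nucleotide_skews(seq):
    """Return (AT skew, GC skew) for one strand of a DNA sequence.

    Assumed definitions: (A - T) / (A + T) and (G - C) / (G + C).
    A skew is reported as 0.0 when the relevant bases are absent.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew


# Toy sequence for illustration only: 4 A, 2 T, 3 G, 1 C.
at_skew, gc_skew = nucleotide_skews("AAAATTGGGC")
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

Positive values indicate an excess of 'A' over 'T' (or 'G' over 'C') on the strand analysed; negative values indicate the reverse.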
Cui, Peng; Liu, Huitao; Lin, Qiang; Ding, Feng; Zhuo, Guoyin; Hu, Songnian; Liu, Dongcheng; Yang, Wenlong; Zhan, Kehui; Zhang, Aimin; Yu, Jun
2009-12-01
Plant mitochondrial genomes, encoding necessary proteins involved in the system of energy production, play an important role in the development and reproduction of the plant. They occupy a specific evolutionary pattern relative to their nuclear counterparts. Here, we determined the winter wheat (Triticum aestivum cv. Chinese Yumai) mitochondrial genome, 452,526 bp in length, by shotgun sequencing its BAC library. It contains 202 genes, including 35 known protein-coding genes, three rRNA and 17 tRNA genes, as well as 149 open reading frames (ORFs; greater than 300 bp in length). The sequence is almost identical to the previously reported sequence of the spring wheat (T. aestivum cv. Chinese Spring); we only identified seven SNPs (three transitions and four transversions) and 10 indels (insertions and deletions) between the two independently acquired sequences, and all variations were found in non-coding regions. This result confirmed the accuracy of the previously reported mitochondrial sequence of the Chinese Spring wheat. The nucleotide frequency and codon usage of wheat are common among the lineage of higher plants, with a high AT-content of 58%. Molecular evolutionary analysis demonstrated that plant mitochondrial genomes evolved at different rates, which may correlate with substantial variations in metabolic rate and generation time among plant lineages. In addition, through the estimation of the ratio of non-synonymous to synonymous substitution rates between orthologous mitochondrion-encoded genes of higher plants, we found an accelerated evolutionary rate that seems to be the result of relaxed selection.
Singh, Gajinder Pal; Sharma, Amit
2016-01-01
Resistance to frontline anti-malarial drugs, including artemisinin, has repeatedly arisen in South-East Asia, but the reasons for this are not understood. Here we test whether evolutionary constraints on Plasmodium falciparum strains from South-East Asia differ from African strains. We find a significantly higher ratio of non-synonymous to synonymous polymorphisms in P. falciparum from South-East Asia compared to Africa, suggesting differences in the selective constraints on P. falciparum genome in these geographical regions. Furthermore, South-East Asian strains showed a higher proportion of non-synonymous polymorphism at conserved positions, suggesting reduced negative selection. There was a lower rate of mixed infection by multiple genotypes in samples from South-East Asia compared to Africa. We propose that a lower mixed infection rate in South-East Asia reduces intra-host competition between the parasite clones, reducing the efficiency of natural selection. This might increase the probability of fixation of fitness-reducing mutations including drug resistant ones. PMID:27853513
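Comparing non-synonymous and synonymous polymorphisms starts with deciding whether a codon change alters the encoded amino acid. The sketch below does this with the standard genetic code and then forms a raw count ratio; the function names and the four toy codon pairs are hypothetical, and the raw ratio is not normalised by the number of synonymous and non-synonymous sites, which a full analysis would usually take into account.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Label a codon change as 'synonymous' or 'non-synonymous'."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


def nonsyn_to_syn_ratio(codon_changes):
    """Raw ratio of non-synonymous to synonymous changes (no site correction)."""
    labels = [classify_codon_change(ref, alt) for ref, alt in codon_changes]
    syn = labels.count("synonymous")
    nonsyn = labels.count("non-synonymous")
    return nonsyn / syn if syn else float("inf")


# Toy codon pairs for illustration only (reference codon, variant codon).
toy_changes = [("CCT", "CAT"), ("TAT", "TAC"), ("GCT", "GTT"), ("GGA", "GGG")]
for ref, alt in toy_changes:
    print(ref, alt, classify_codon_change(ref, alt))
print("ratio:", nonsyn_to_syn_ratio(toy_changes))
```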
Wang, Xiaodan; Ma, Dehong; Huang, Xinwei; Li, Lihua; Li, Duo; Zhao, Yujiao; Qiu, Lijuan; Pan, Yue; Chen, Junying; Xi, Juemin; Shan, Xiyun; Sun, Qiangming
2017-06-15
In the past few decades, dengue has spread rapidly and is an emerging disease in China. An unexpected dengue outbreak occurred in Xishuangbanna, Yunnan, China, resulting in 1331 patients in 2013. To obtain the complete genome information and perform mutation and evolutionary analysis of the causative agent of this largest outbreak of dengue fever, the viruses were isolated by cell culture and evaluated by genome sequence analysis. Phylogenetic trees were then constructed by Neighbor-Joining methods (MEGA6.0), followed by analysis of nucleotide mutation and amino acid substitution. Analysis of the diversity of secondary structure for the E and NS1 proteins was also performed. Selection pressures acting on the coding sequences were then estimated with PAML software. The complete genome sequences of two isolated strains (YNSW1, YNSW2) were 10,710 and 10,702 nucleotides in length, respectively. Phylogenetic analysis revealed that both strains were classified as genotype II of DENV-3. The results indicated that both isolated strains of Xishuangbanna in 2013 and the Laos 2013 strains (KF816161.1, KF816158.1, LC147061.1, LC147059.1, KF816162.1) were most similar to Bangladesh (AY496873.2) in 2002. Compared with DENV-3SS (H87), 62 amino acid substitutions were identified in translated regions, and 38 amino acid substitutions were identified in translated regions compared with the DENV-3 genotype II strain Bangladesh (AY496873.2). 27 (YNSW1) or 28 (YNSW2) single nucleotide changes were observed in structural protein sequences, with 7 (YNSW1) or 8 (YNSW2) non-synonymous mutations compared with AY496873.2. Of them, 4 non-synonymous mutations were identified in E protein sequences (2 in the β-sheet, 2 in the coil). Meanwhile, 117 (YNSW1) or 115 (YNSW2) single nucleotide changes were observed in non-structural protein sequences, with 31 (YNSW1) or 30 (YNSW2) non-synonymous mutations. 
Particularly, 14 single nucleotide changes were observed in NS1 sequences with 4/14 non-synonymous substitutions (4 in the coil). Selection pressure analysis revealed no positive selection in the amino acid sites of the genes encoding for structural and non-structural proteins. This study may help understand the intrinsic geographical relatedness of dengue virus 3 and contributes further to research on their infectivity, pathogenicity and vaccine development. Copyright © 2017 Elsevier B.V. All rights reserved.
Saxena, Vijay Kumar; Kumar, Davendra; Naqvi, S. M. K.
2017-04-01
GPR50, formerly known as a melatonin-related receptor, is one of the three subtypes of the melatonin receptor subfamily, together with MTNR1A and MTNR1B. GPR50, despite its high identity with the melatonin receptor family, does not bind melatonin and is considered to be an ortholog of MTNR1C in mammals. GPR50-expressing cells have been found in the dorsomedial nucleus of the hypothalamus, the periventricular nucleus, and the median eminence. Genetic and functional evidence linking GPR50 to adaptive thermogenesis and torpor has recently been investigated, but it is still an orphan receptor and is yet to be studied conclusively. The aims of the study were to characterize the GPR50 gene of sheep and to study the sequence variability of the gene in Indian sheep breeds of two different thermo-varied agroclimatic conditions. Genomic DNA was isolated, a 791-bp sequence was amplified using self-designed primers, and SNP profiling was performed on samples from all the breeds to study the relative frequency of SNPs in each breed. Five important non-synonymous mutations were observed in the various breeds studied. T698G, G1097A, G1270A, G1318A, and C1334G lead to the following substitutions: valine by glycine, arginine by glutamine, threonine by alanine, isoleucine by valine, and serine by cysteine, respectively. Two synonymous mutations (T663G and C888T) were also observed in some of the studied breeds. G1270A and C888T were the most prevalent SNPs observed in nearly all of the breeds. C888T SNPs were observed in higher prevalence in Chokla, Marwari, and Magra in comparison to Gaddi and Bharat Merino. A PolyPhen-2 analysis, which is used to assess the potential damaging nature of an SNP, revealed that mutations T698G and G1270A were benign while G1097A, G1318A, and C1334G were damaging with scores of 0.987, 0.993, and 0.739, respectively. 
A 3-D homology model of the protein was prepared using c4zwjA (UniProt sequence ID) as a template using the online version of Phyre2 protein modeling software. The structure demonstrated close similarity to other G protein-coupled receptors and it had a 45% α-helical content. G1270A and C888T may be taken up for SNP correlation in a larger population study for their association with heat stress protection.
Lijavetzky, Diego; Cabezas, José Antonio; Ibáñez, Ana; Rodríguez, Virginia; Martínez-Zapater, José M
2007-01-01
Background Single-nucleotide polymorphisms (SNPs) are the most abundant type of DNA sequence polymorphisms. Their higher availability and stability when compared to simple sequence repeats (SSRs) provide enhanced possibilities for genetic and breeding applications such as cultivar identification, construction of genetic maps, the assessment of genetic diversity, the detection of genotype/phenotype associations, or marker-assisted breeding. In addition, the efficiency of these activities can be improved thanks to the ease with which SNP genotyping can be automated. Expressed sequence tag (EST) sequencing projects in grapevine are allowing for the in silico detection of multiple putative sequence polymorphisms within and among a reduced number of cultivars. In parallel, the sequence of the grapevine cultivar Pinot Noir is also providing thousands of polymorphisms present in this highly heterozygous genome. Still, the general application of those SNPs requires further validation since their use could be restricted to those specific genotypes. Results In order to develop a large SNP set of wide application in grapevine we followed a systematic re-sequencing approach in a group of 11 grape genotypes corresponding to ancient unrelated cultivars as well as wild plants. Using this approach, we have sequenced 230 gene fragments, which represents the analysis of over 1 Mb of grape DNA sequence. This analysis has allowed the discovery of 1573 SNPs with an average of one SNP every 64 bp (one SNP every 47 bp in non-coding regions and every 69 bp in coding regions). Nucleotide diversity in grape (π = 0.0051) was found to be similar to values observed in highly polymorphic plant species such as maize. The average number of haplotypes per gene sequence was estimated as six, with three haplotypes representing over 83% of the analyzed sequences. 
Short-range linkage disequilibrium (LD) studies within the analyzed sequences indicate the existence of a rapid decay of LD within the selected grapevine genotypes. To validate the use of the detected polymorphisms in genetic mapping, cultivar identification and genetic diversity studies we have used the SNPlex™ genotyping technology in a sample of grapevine genotypes and segregating progenies. Conclusion These results provide accurate values for nucleotide diversity in coding sequences and a first estimate of short-range LD in grapevine. Using SNPlex™ genotyping we have shown the application of a set of discovered SNPs as molecular markers for cultivar identification, linkage mapping and genetic diversity studies. Thus, the combination of a highly efficient re-sequencing approach and the SNPlex™ high throughput genotyping technology provides a powerful tool for grapevine genetic analysis. PMID:18021442
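Nucleotide diversity (π), reported in this abstract, is commonly computed as the average number of differences per site between all pairs of aligned sequences. The following is a minimal sketch under that assumption; the function name and the three toy sequences are hypothetical, and gaps or missing data are not handled.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average pairwise differences per site across aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if length == 0 or any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned, equal length, non-empty")

    pairs = list(combinations(sequences, 2))
    # Count mismatching positions over every pair of sequences.
    total_differences = sum(
        sum(x != y for x, y in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment for illustration only: 3 sequences of 10 bp.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}")
```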
COOLAIR Antisense RNAs Form Evolutionarily Conserved Elaborate Secondary Structures
Hawkes, Emily J.; Hennelly, Scott P.; Novikova, Irina V.; ...
2016-09-20
There is considerable debate about the functionality of long non-coding RNAs (lncRNAs). Lack of sequence conservation has been used to argue against functional relevance. Here, we investigated antisense lncRNAs, called COOLAIR, at the A. thaliana FLC locus and experimentally determined their secondary structure. The major COOLAIR variants are highly structured, organized by exon. The distally polyadenylated transcript has a complex multi-domain structure, altered by a single non-coding SNP defining a functionally distinct A. thaliana FLC haplotype. The A. thaliana COOLAIR secondary structure was used to predict COOLAIR exons in evolutionarily divergent Brassicaceae species. These predictions were validated through chemical probing and cloning. Despite the relatively low nucleotide sequence identity, the structures, including multi-helix junctions, show remarkable evolutionary conservation. In a number of places, the structure is conserved through covariation of a non-contiguous DNA sequence. This structural conservation supports a functional role for COOLAIR transcripts rather than, or in addition to, antisense transcription.
Koning-Boucoiran, Carole F S; Esselink, G Danny; Vukosavljev, Mirjana; van 't Westende, Wendy P C; Gitonga, Virginia W; Krens, Frans A; Voorrips, Roeland E; van de Weg, W Eric; Schulz, Dietmar; Debener, Thomas; Maliepaard, Chris; Arens, Paul; Smulders, Marinus J M
2015-01-01
In order to develop a versatile and large SNP array for rose, we set out to mine ESTs from diverse sets of rose germplasm. For this, RNA-Seq libraries containing about 700 million reads were generated from tetraploid cut and garden roses using Illumina paired-end sequencing, and from diploid Rosa multiflora using 454 sequencing. Separate de novo assemblies were performed in order to identify single nucleotide polymorphisms (SNPs) within and between rose varieties. SNPs among tetraploid roses were selected for constructing a genotyping array that can be employed for genetic mapping and marker-trait association discovery in breeding programs based on tetraploid germplasm, both from cut roses and from garden roses. In total, 68,893 SNPs were included on the WagRhSNP Axiom array. Next, an orthology-guided assembly was performed for the construction of a non-redundant rose transcriptome database. A total of 21,740 transcripts had significant hits with orthologous genes in the strawberry (Fragaria vesca L.) genome. Of these, 13,390 appeared to contain the full-length coding regions. This newly established transcriptome resource adds considerably to the currently available sequence resources for the Rosaceae family in general and the genus Rosa in particular.
New insights on the evolution of Leafy cotyledon1 (LEC1) type genes in vascular plants.
Cagliari, Alexandro; Turchetto-Zolet, Andreia Carina; Korbes, Ana Paula; Maraschin, Felipe Dos Santos; Margis, Rogerio; Margis-Pinheiro, Marcia
2014-01-01
NF-Y is a conserved oligomeric transcription factor found in all eukaryotes. In plants, this regulator evolved with a broad diversification of the genes coding for its three subunits (NF-YA, NF-YB and NF-YC). The NF-YB members can be divided into Leafy Cotyledon1 (LEC1) and non-LEC1 types. Here we present a comparative genomic study using phylogenetic analyses to validate an evolutionary model for the origin of LEC1-type genes in plants and their emergence from non-LEC1-type genes. We identified LEC1-type members in all vascular plant genomes, but not in amoebozoa, algae, fungi, metazoa and non-vascular plant representatives, which present exclusively non-LEC1-type genes as constituents of their NF-YB subunits. The non-synonymous to synonymous nucleotide substitution rates (Ka/Ks) between LEC1 and non-LEC1-type genes indicate the presence of positive selection acting on LEC1-type members, leading to the fixation of LEC1-specific amino acid residues. The phylogenetic analyses demonstrated that plant LEC1-type genes are evolutionarily divergent from the non-LEC1-type genes of plants, fungi, amoebozoa, algae and animals. Our results point to a scenario in which LEC1-type genes originated in vascular plants after gene expansion in plants. We suggest that processes of neofunctionalization and/or subfunctionalization were responsible for the emergence of a versatile role for LEC1-type genes in vascular plants, especially in seed plants. LEC1-type genes, besides being phylogenetically divergent, also present a different expression profile when compared with non-LEC1-type genes. Altogether, our data provide new insights into the LEC1 and non-LEC1 evolutionary relationship during vascular plant evolution. Copyright © 2014 Elsevier Inc. All rights reserved.
Kos, Mark Z; Carless, Melanie A; Peralta, Juan; Curran, Joanne E; Quillen, Ellen E; Almeida, Marcio; Blackburn, August; Blondell, Lucy; Roalf, David R; Pogue-Geile, Michael F; Gur, Ruben C; Göring, Harald H H; Nimgaonkar, Vishwajit L; Gur, Raquel E; Almasy, Laura
2017-12-01
Schizophrenia is a serious mental illness, involving disruptions in thought and behavior, with a worldwide prevalence of about one percent. Although highly heritable, much of the genetic liability of schizophrenia is yet to be explained. We searched for susceptibility loci in multiplex, multigenerational families affected by schizophrenia, targeting protein-altering variation with in silico predicted functional effects. Exome sequencing was performed on 136 samples from eight European-American families, including 23 individuals diagnosed with schizophrenia or schizoaffective disorder. In total, 11,878 non-synonymous variants from 6,396 genes were tested for their association with schizophrenia spectrum disorders. Pathway enrichment analyses were conducted on gene-based test results, protein-protein interaction (PPI) networks, and epistatic effects. Using a significance threshold of FDR < 0.1, association was detected for rs10941112 (p = 2.1 × 10^-5; q-value = 0.073) in AMACR, a gene involved in fatty acid metabolism and previously implicated in schizophrenia, with significant cis effects on gene expression (p = 5.5 × 10^-4), including brain tissue data from the Genotype-Tissue Expression project (minimum p = 6.0 × 10^-5). A second SNP, rs10378 located in TMEM176A, also shows risk effects in the exome data (p = 2.8 × 10^-5; q-value = 0.073). PPIs among our top gene-based association results (p < 0.05; n = 359 genes) reveal significant enrichment of genes involved in NCAM-mediated neurite outgrowth (p = 3.0 × 10^-5), while exome-wide SNP-SNP interaction effects for rs10941112 and rs10378 indicate a potential role for kinase-mediated signaling involved in memory and learning. In conclusion, these association results implicate AMACR and TMEM176A in schizophrenia risk, whose effects may be modulated by genes involved in synaptic plasticity and neurocognitive performance. © 2017 Wiley Periodicals, Inc.
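The FDR threshold and q-values quoted in this abstract come from a multiple-testing correction. A minimal sketch of one widely used procedure, the Benjamini-Hochberg adjustment, is shown below; the abstract does not say which q-value method was used, so this choice, the function name and the toy p-values are all assumptions for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy p-values for illustration only.
toy_pvalues = [0.001, 0.008, 0.039, 0.041, 0.60]
qvalues = benjamini_hochberg(toy_pvalues)
significant = [p for p, q in zip(toy_pvalues, qvalues) if q < 0.1]
print(qvalues)
print(f"{len(significant)} of {len(toy_pvalues)} tests pass FDR < 0.1")
```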
Aravind Kumar, M; Singh, Vineeta; Naushad, Shaik Mohammad; Shanker, Uday; Lakshmi Narasu, M
2018-05-01
In view of the aggressive nature of triple-negative breast cancer (TNBC), due to the lack of receptors (ER, PR, HER2) and the high incidence of drug resistance associated with it, a case-control association study was conducted to identify the contributing genetic risk factors for TNBC. A total of 30 TNBC patients and 50 age- and gender-matched controls of Indian origin were screened for 900,000 SNP markers using a microarray-based SNP genotyping approach. The initial PLINK association analysis (p < 0.01, MAF 0.14-0.44, OR 10-24) identified 28 non-synonymous SNPs and one stop-gain mutation in the exonic region as possible determinants of TNBC risk. All 29 SNPs were annotated using ANNOVAR. The interactions between these markers were evaluated using multifactor dimensionality reduction (MDR) analysis. The interactions were in the following order: exm408776 > exm1278309 > rs316389 > rs1651654 > rs635538 > exm1292477. Recursive partitioning analysis (RPA) was performed to construct a decision tree useful in predicting TNBC risk. In this analysis, the rs1651654 and exm585172 SNPs were found to be determinants of TNBC risk. An artificial neural network model was used to generate receiver operating characteristic (ROC) curves, which showed high sensitivity and specificity (AUC = 0.94) of these markers. To conclude, among the 900,000 SNPs tested, CCDC42 exm1292477, ANXA3 exm408776, and SASH1 exm585172 were found to be the most significant genetic predictors of TNBC. The interactions among the exm408776, exm1278309, rs316389, rs1651654, rs635538, and exm1292477 SNPs inflate the risk for TNBC further. Targeted analysis of these SNPs and genes alone will also have similar clinical utility in predicting TNBC.
Adaptive evolution of the matrix extracellular phosphoglycoprotein in mammals
2011-01-01
Background Matrix extracellular phosphoglycoprotein (MEPE) belongs to a family of small integrin-binding ligand N-linked glycoproteins (SIBLINGs) that play a key role in skeleton development, particularly in mineralization, phosphate regulation and osteogenesis. MEPE associated disorders cause various physiological effects, such as loss of bone mass, tumors and disruption of renal function (hypophosphatemia). The study of this developmental gene from an evolutionary perspective could provide valuable insights on the adaptive diversification of morphological phenotypes in vertebrates. Results Here we studied the adaptive evolution of the MEPE gene in 26 Eutherian mammals and three birds. The comparative genomic analyses revealed a high degree of evolutionary conservation of some coding and non-coding regions of the MEPE gene across mammals indicating a possible regulatory or functional role likely related with mineralization and/or phosphate regulation. However, the majority of the coding region had a fast evolutionary rate, particularly within the largest exon (1467 bp). Rodentia and Scandentia had distinct substitution rates with an increased accumulation of both synonymous and non-synonymous mutations compared with other mammalian lineages. Characteristics of the gene (e.g. biochemical, evolutionary rate, and intronic conservation) differed greatly among lineages of the eight mammalian orders. We identified 20 sites with significant positive selection signatures (codon and protein level) outside the main regulatory motifs (dentonin and ASARM) suggestive of an adaptive role. Conversely, we find three sites under selection in the signal peptide and one in the ASARM motif that were supported by at least one selection model. The MEPE protein tends to accumulate amino acids promoting disorder and potential phosphorylation targets. 
Conclusion MEPE shows a high number of selection signatures, revealing the crucial role of positive selection in the evolution of this SIBLING member. The selection signatures were found mainly outside the functional motifs, reinforcing the idea that other regions outside the dentonin and the ASARM might be crucial for the function of the protein and future studies should be undertaken to understand its importance. PMID:22103247
Molecular characterization of Mycobacterium tuberculosis isolates from elephants of Nepal.
Paudel, Sarad; Mikota, Susan K; Nakajima, Chie; Gairhe, Kamal P; Maharjan, Bhagwan; Thapa, Jeewan; Poudel, Ajay; Shimozuru, Michito; Suzuki, Yasuhiko; Tsubota, Toshio
2014-05-01
Mycobacterium tuberculosis was cultured from the lung tissues of 3 captive elephants in Nepal that died with extensive lung lesions. Spoligotyping, TbD1 detection and multi-locus variable number of tandem repeat analysis (MLVA) results suggested that the 3 isolates belonged to a specific lineage of the Indo-Oceanic clade, EAI5 SIT 138. One of the elephant isolates had a new synonymous single nucleotide polymorphism (SNP), T231C, in the gyrA sequence, and the same SNP was also found in human isolates in Nepal. MLVA results and the transfer history of the elephants suggested that 2 of them might have been infected with M. tuberculosis from the same source. These findings indicated that the source of M. tuberculosis infection in those elephants was local residents, presumably their handlers. Further investigation, including detailed genotyping of elephant and human isolates, is needed to clarify the infection route and eventually prevent the transmission of tuberculosis to susceptible hosts. Copyright © 2014 Elsevier Ltd. All rights reserved.
Genetic variations in NADPH-CYP450 oxidoreductase in a Czech Slavic cohort.
Tomková, Mária; Panda, Satya Prakash; Šeda, Ondřej; Baxová, Alice; Hůlková, Martina; Siler Masters, Bettie Sue; Martásek, Pavel
2015-01-01
The aim of this study was to estimate polymorphic allele frequencies of the NADPH-CYP450 oxidoreductase (POR) gene in a Czech Slavic population. The POR gene was analyzed in 322 individuals from a control cohort by sequencing and high resolution melting analysis. We identified seven previously unreported SNPs, including two in the 5' flanking region (g.4965C>T and g.4994G>T), one intronic variant (c.1899-20C>T), one synonymous SNP (p.20Ala=) and three nonsynonymous SNPs (p.Thr29Ser, p.Pro384Leu and p.Thr529Met). The p.Pro384Leu variant exhibited reduced enzymatic activities compared with wild-type. The identification of new POR variants indicates that the number of uncommon variants might be specific to each subpopulation investigated, which is particularly germane given the singular role that POR plays in providing reducing equivalents to all CYP450s in the endoplasmic reticulum. Original submitted 15 September 2014; Revision submitted 17 November 2014.
Gomes, Sónia; Castro, Cláudia; Barrias, Sara; Pereira, Leonor; Jorge, Pedro; Fernandes, José R; Martins-Lopes, Paula
2018-04-11
The wine sector requires quick and reliable methods for Vitis vinifera L. varietal identification. The number of V. vinifera varieties is estimated at about 5,000 worldwide. Single Nucleotide Polymorphisms (SNPs) represent the most basic and abundant form of genetic sequence variation and are adequate for varietal discrimination. The aim of this work was to develop DNA-based assays suitable for detecting SNP variation in V. vinifera, allowing varietal discrimination. Genotyping by sequencing allowed the detection of eleven SNPs in two genes of the anthocyanin pathway, flavanone 3-hydroxylase (F3H, EC 1.14.11.9) and leucoanthocyanidin dioxygenase (LDOX, EC 1.14.11.19; synonym anthocyanidin synthase, ANS), in twenty V. vinifera varieties. Three High Resolution Melting (HRM) assays were designed based on the sequencing information, discriminating five of the 20 varieties: Alicante Bouschet, Donzelinho Tinto, Merlot, Moscatel Galego and Tinta Roriz. Sanger sequencing of the HRM assay products confirmed the HRM profiles. Three probes, with different lengths and sequences, were used as bio-recognition elements in an optical biosensor platform based on a long period grating (LPG) fiber optic sensor. The label-free platform detected a difference of a single SNP using genomic DNA samples. The two different platforms were successfully applied for grapevine varietal identification.
Gao, Li; Rafaels, Nicholas M; Huang, Lili; Potee, Joseph; Ruczinski, Ingo; Beaty, Terri H.; Paller, Amy S.; Schneider, Lynda C.; Gallo, Rich; Hanifin, Jon M.; Beck, Lisa A.; Geha, Raif S.; Mathias, Rasika A.; Leung, Donald Y. M.
2015-01-01
Background A subset of atopic dermatitis (AD) is associated with increased susceptibility to eczema herpeticum (ADEH+). We previously reported that common single nucleotide polymorphisms (SNPs) in interferon-gamma (IFNG) and interferon-gamma receptor 1 (IFNGR1) were associated with the ADEH+ phenotype. Objective To interrogate the role of rare variants in IFN-pathway genes for risk of ADEH+. Methods We performed targeted sequencing of interferon-pathway genes (IFNG, IFNGR1, IFNAR1 and IL12RB1) in 228 European American (EA) AD patients selected according to their EH status and severity measured by the Eczema Area and Severity Index (EASI). Replication genotyping was performed in independent samples of 219 EA and 333 African Americans (AA). Functional investigation of ‘loss-of-function’ variants was conducted using site-directed mutagenesis. Results We identified 494 single nucleotide variants (SNVs) encompassing 105 kb of sequence, including 145 common, 349 (70.6%) rare (minor allele frequency (MAF) <5%) and 86 (17.4%) novel variants, of which 2.8% were coding-synonymous, 93.3% were non-coding (64.6% intronic), and 3.8% were missense. We identified six rare IFNGR1 missense variants, including three damaging variants (Val14Met (V14M), Val61Ile and Tyr397Cys (Y397C)), conferring a higher risk for ADEH+ (P=0.031). Variants V14M and Y397C were confirmed to be deleterious, leading to partial IFNGR1 deficiency. Seven common IFNGR1 SNPs, along with common protective haplotypes (2- to 7-SNP), conferred a reduced risk of ADEH+ (P=0.015-0.002, P=0.0015-0.0004, respectively), and both SNP and haplotype associations were replicated in an independent AA sample (P=0.004-0.0001 and P=0.001-0.0001, respectively). Conclusion Our results provide evidence that both common and rare genetic variants in the gene encoding IFNGR1 are implicated in susceptibility to the ADEH+ phenotype.
CAPSULE SUMMARY We provided the first evidence that rare functional IFNGR1 mutations contribute to a defective systemic IFN-γ immune response that accounts for the propensity of AD patients to disseminated viral skin infections. PMID:26343451
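The variant breakdown reported in the Results above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the counts stated in the abstract (494 SNVs, 145 common, 349 rare, 86 novel), and the helper name `percent` is hypothetical rather than anything used in the study.

```python
# Counts as stated in the abstract.
TOTAL_SNVS = 494
COMMON = 145
RARE = 349
NOVEL = 86


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Common and rare variants together account for every SNV.
assert COMMON + RARE == TOTAL_SNVS

rare_pct = percent(RARE, TOTAL_SNVS)    # share of rare variants
novel_pct = percent(NOVEL, TOTAL_SNVS)  # share of novel variants
print(rare_pct, novel_pct)
```

Both values agree with the percentages quoted in the abstract (70.6% rare, 17.4% novel).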
Uemoto, Yoshinobu; Ohtake, Tsuyoshi; Sasago, Nanae; Takeda, Masayuki; Abe, Tsuyoshi; Sakuma, Hironori; Kojima, Takatoshi; Sasaki, Shinji
2017-11-13
Umami is a Japanese term for the fifth basic taste and is an important sensory property of beef palatability. Inosine 5'-monophosphate (IMP) contributes to umami taste in beef. Thus, the overall change in the concentration of IMP and its degradation products can potentially affect beef palatability. In this study, we investigated the genetic architecture of IMP and its degradation products in Japanese Black beef. First, we performed a genome-wide association study (GWAS), candidate gene analysis, and functional analysis to detect the causal variants that affect IMP, inosine, and hypoxanthine. Second, we evaluated the allele frequencies in the different breeds, the contribution to genetic variance, and the effect on other economic traits using the detected variants. A total of 574 Japanese Black cattle were genotyped using the Illumina BovineSNP50 BeadChip and were then used for GWAS. The GWAS detected genome-wide significant single nucleotide polymorphisms (SNPs) on BTA9 for IMP, inosine, and hypoxanthine. The ecto-5'-nucleotidase (NT5E) gene, which encodes the enzyme NT5E for the extracellular degradation of IMP to inosine, was located near the significant region on BTA9. The results of candidate gene analysis and functional analysis showed that two non-synonymous SNPs (c.1318C>T and c.1475T>A) in NT5E affected the amount of IMP and its degradation products in beef by regulating the enzymatic activity of NT5E. The Q haplotype showed a positive effect on IMP and a negative effect on the enzymatic activity of NT5E in IMP degradation. The two SNPs were in perfect linkage disequilibrium in five different breeds, and different haplotype frequencies were seen among breeds. The two SNPs contribute about half of the total genetic variance in IMP, and the genetic relationship between IMP and its degradation products showed that NT5E affected the overall concentration balance of IMP and its degradation products.
In addition, the SNPs in NT5E did not have an unfavorable effect on the other economic traits. Taken together, these findings indicate that the two non-synonymous SNPs in NT5E would be useful for improving IMP and its degradation products by marker-assisted selection in Japanese Black cattle.
Haplotype analysis of sucrose synthase gene family in three Saccharum species
2013-01-01
Background Sugarcane is an economically important crop contributing about 80% and 40% to the world sugar and ethanol production, respectively. The complicated genetics consequential to its complex polyploid genome, however, have impeded efforts to improve sugar yield and related important agronomic traits. Modern sugarcane cultivars are complex hybrids derived mainly from crosses among its progenitor species, S. officinarum and S. spontaneum, and to a lesser degree, S. robustum. Atypical of higher plants, sugarcane stores its photoassimilates as sucrose rather than as starch in its parenchymous stalk cells. In the sugar biosynthesis pathway, sucrose synthase (SuSy, UDP-glucose: D-fructose 2-α-D-glucosyltransferase, EC 2.4.1.13) is a key enzyme in the regulation of sucrose accumulation and partitioning by catalyzing the reversible conversion of sucrose and UDP into UDP-glucose and fructose. However, little is known about the sugarcane SuSy gene family members and hence no definitive studies have been reported regarding allelic diversity of SuSy gene families in Saccharum species. Results We identified and characterized a total of five sucrose synthase genes in the three sugarcane progenitor species through gene annotation and PCR haplotype analysis by analyzing 70 to 119 PCR fragments amplified from intron-containing target regions. We detected all but one (i.e. ScSuSy5) of ScSuSy transcripts in five tissue types of three Saccharum species. The average SNP frequency was one SNP per 108 bp, 81 bp, and 72 bp in S. officinarum, S. robustum, and S. spontaneum, respectively. The average number of shared SNPs is 15 between S. officinarum and S. robustum, 7 between S. officinarum and S. spontaneum, and 11 between S. robustum and S. spontaneum. We identified 27, 35, and 32 haplotypes from the five ScSuSy genes in S. officinarum, S. robustum, and S. spontaneum, respectively. Also, 12, 11, and 9 protein sequences were translated from the haplotypes in S. officinarum, S. robustum, and S. spontaneum, respectively.
Phylogenetic analysis showed three separate clusters composed of SbSuSy1 and SbSuSy2, SbSuSy3 and SbSuSy5, and SbSuSy4. Conclusions The five members of the SuSy gene family evolved before the divergence of the genera in the tribe Andropogoneae at least 12 MYA. Each ScSuSy gene showed at least one non-synonymous substitution in SNP haplotypes. The SNP frequency is the lowest in S. officinarum, intermediate in S. robustum, and the highest in S. spontaneum, which may reflect the timing of the two rounds of whole genome duplication in these octoploids. The higher rate of shared SNP frequency between S. officinarum and S. robustum than between S. officinarum and S. spontaneum confirmed that the speciation event separating S. officinarum and S. robustum occurred after their common ancestor diverged from S. spontaneum. The SNP and haplotype frequencies in three Saccharum species provide fundamental information for designing strategies to sequence these autopolyploid genomes. PMID:23663250
Haplotype analysis of sucrose synthase gene family in three Saccharum species.
Zhang, Jisen; Arro, Jie; Chen, Youqiang; Ming, Ray
2013-05-10
Sugarcane is an economically important crop contributing about 80% and 40% to the world sugar and ethanol production, respectively. The complicated genetics consequential to its complex polyploid genome, however, have impeded efforts to improve sugar yield and related important agronomic traits. Modern sugarcane cultivars are complex hybrids derived mainly from crosses among its progenitor species, S. officinarum and S. spontaneum, and to a lesser degree, S. robustum. Atypical of higher plants, sugarcane stores its photoassimilates as sucrose rather than as starch in its parenchymous stalk cells. In the sugar biosynthesis pathway, sucrose synthase (SuSy, UDP-glucose: D-fructose 2-α-D-glucosyltransferase, EC 2.4.1.13) is a key enzyme in the regulation of sucrose accumulation and partitioning by catalyzing the reversible conversion of sucrose and UDP into UDP-glucose and fructose. However, little is known about the sugarcane SuSy gene family members and hence no definitive studies have been reported regarding allelic diversity of SuSy gene families in Saccharum species. We identified and characterized a total of five sucrose synthase genes in the three sugarcane progenitor species through gene annotation and PCR haplotype analysis by analyzing 70 to 119 PCR fragments amplified from intron-containing target regions. We detected all but one (i.e. ScSuSy5) of ScSuSy transcripts in five tissue types of three Saccharum species. The average SNP frequency was one SNP per 108 bp, 81 bp, and 72 bp in S. officinarum, S. robustum, and S. spontaneum, respectively. The average number of shared SNPs is 15 between S. officinarum and S. robustum, 7 between S. officinarum and S. spontaneum, and 11 between S. robustum and S. spontaneum. We identified 27, 35, and 32 haplotypes from the five ScSuSy genes in S. officinarum, S. robustum, and S. spontaneum, respectively. Also, 12, 11, and 9 protein sequences were translated from the haplotypes in S. officinarum, S. robustum, and S. spontaneum, respectively.
Phylogenetic analysis showed three separate clusters composed of SbSuSy1 and SbSuSy2, SbSuSy3 and SbSuSy5, and SbSuSy4. The five members of the SuSy gene family evolved before the divergence of the genera in the tribe Andropogoneae at least 12 MYA. Each ScSuSy gene showed at least one non-synonymous substitution in SNP haplotypes. The SNP frequency is the lowest in S. officinarum, intermediate in S. robustum, and the highest in S. spontaneum, which may reflect the timing of the two rounds of whole genome duplication in these octoploids. The higher rate of shared SNP frequency between S. officinarum and S. robustum than between S. officinarum and S. spontaneum confirmed that the speciation event separating S. officinarum and S. robustum occurred after their common ancestor diverged from S. spontaneum. The SNP and haplotype frequencies in three Saccharum species provide fundamental information for designing strategies to sequence these autopolyploid genomes.
E6 and E7 Gene Polymorphisms in Human Papillomavirus Types-58 and 33 Identified in Southwest China
Wen, Qiang; Wang, Tao; Mu, Xuemei; Chenzhang, Yuwei; Cao, Man
2017-01-01
Cancer of the cervix is associated with infection by certain types of human papillomavirus (HPV). The gene variants differ in immune responses and oncogenic potential. The E6 and E7 proteins encoded by high-risk HPV play a key role in cellular transformation. HPV-33 and HPV-58 types are highly prevalent among Chinese women. To study the intratypic variations, polymorphisms and positive selection of HPV-33 and HPV-58 E6/E7 in southwest China, HPV-33 (E6, E7: n = 216) and HPV-58 (E6, E7: n = 405) E6 and E7 genes were sequenced and compared to others submitted to GenBank. Phylogenetic trees were constructed by the maximum-likelihood and Kimura 2-parameter methods in MEGA 6 (Molecular Evolutionary Genetics Analysis version 6.0). The diversity of secondary structure was analyzed with PSIPred software. The selection pressures acting on the E6/E7 genes were estimated with PAML 4.8 (Phylogenetic Analysis by Maximum Likelihood version 4.8) software. The positive sites of HPV-33 and HPV-58 E6/E7 were contrasted by ClustalX 2.1. Among 216 HPV-33 E6 sequences, 8 single nucleotide mutations were observed with 6/8 non-synonymous and 2/8 synonymous mutations. The 216 HPV-33 E7 sequences showed 3 single nucleotide mutations that were non-synonymous. The 405 HPV-58 E6 sequences revealed 8 single nucleotide mutations with 4/8 non-synonymous and 4/8 synonymous mutations. Among 405 HPV-58 E7 sequences, 13 single nucleotide mutations were observed with 10/13 non-synonymous mutations and 3/13 synonymous mutations. The selective pressure analysis showed that all HPV-33 and 4/6 HPV-58 E6/E7 major non-synonymous mutations were sites of positive selection. All variations were observed in sites belonging to major histocompatibility complex and/or B-cell predicted epitopes. K93N and R145 (I/N) were observed in both HPV-33 and HPV-58 E6. PMID:28141822
Blasi, Francesca; Bacchelli, Elena; Pesaresi, Giulia; Carone, Simona; Bailey, Anthony J; Maestrini, Elena
2006-04-05
Neuroligin abnormalities have recently been implicated in the aetiology of autism spectrum disorders (ASD), given the finding of point mutations in the two X-linked genes NLGN3 and NLGN4X and the important role of neuroligins in synaptogenesis. To investigate the relevance and frequency of neuroligin mutations in ASD, we performed a mutation screening of NLGN3 and NLGN4X in a sample of 124 autism probands from the International Molecular Genetic Study of Autism Consortium (IMGSAC). We identified a new non-synonymous variant in NLGN3 (Thr632Ala), which is likely to be a rare polymorphism. Our data indicate that coding mutations in these genes are very rarely associated with ASD. Copyright 2006 Wiley-Liss, Inc.
2012-01-01
Background A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. 
In agreement with a recent study we find that the synonymous substitution rate per year (0.6 × 10^-9 and 1.1 × 10^-9) is an order of magnitude smaller than values reported for angiosperm herbs. However, if one takes generation time into account, most of this difference disappears. The estimates of the dN/dS ratio (non-synonymous over synonymous divergence) reported here are in general much lower than 1 and only a few genes showed a ratio larger than 1. PMID:23122049
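The generation-time argument above amounts to a unit conversion: a per-year substitution rate multiplied by the generation time in years gives a per-generation rate. A minimal sketch follows. The two per-year spruce rates are those reported in the abstract, while the generation times and the herb rate are hypothetical placeholders chosen only to illustrate the calculation, not values from the study.

```python
def per_generation_rate(rate_per_year: float, generation_time_years: float) -> float:
    """Convert a per-site, per-year substitution rate to per-site, per-generation."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return rate_per_year * generation_time_years


# Per-year synonymous substitution rates reported in the abstract.
CONIFER_RATES_PER_YEAR = (0.6e-9, 1.1e-9)

# ASSUMPTIONS (not from the source): illustrative generation times and herb rate.
CONIFER_GENERATION_YEARS = 25.0   # hypothetical long-lived tree
HERB_GENERATION_YEARS = 1.0       # hypothetical annual herb
HERB_RATE_PER_YEAR = 6.0e-9       # hypothetical, ten times the lower conifer rate

conifer_per_generation = [
    per_generation_rate(rate, CONIFER_GENERATION_YEARS)
    for rate in CONIFER_RATES_PER_YEAR
]
herb_per_generation = per_generation_rate(HERB_RATE_PER_YEAR, HERB_GENERATION_YEARS)
print(conifer_per_generation, herb_per_generation)
```

Under these placeholder inputs the per-generation rates land within the same order of magnitude, which is the kind of effect the abstract describes; real conclusions would depend on measured generation times.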
Black-tie dress code: two new species of the genus Toxomerus (Diptera, Syrphidae)
Mengual, Ximo
2011-01-01
Abstract Toxomerus hauseri Mengual sp. n. and Toxomerus picudus Mengual sp. n. are described from Peru and Ecuador respectively. Toxomerus circumcintus (Enderlein, 1938) is treated as a valid species and not considered synonym of Toxomerus marginatus, and Toxomerus ovatus (Hull, 1942) is considered junior synonym of Toxomerus nitidus (Schiner, 1868). An identification key for the Toxomerus species with dark abdomens is given along with diagnoses for each studied species. PMID:22144857
SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate
Gretchen H. Roffler; Stephen J. Amish; Seth Smith; Ted Cosart; Marty Kardos; Michael K. Schwartz; Gordon Luikart
2016-01-01
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding...
Moralli, Daniela; Nudel, Ron; Chan, May T M; Green, Catherine M; Volpi, Emanuela V; Benítez-Burraco, Antonio; Newbury, Dianne F; García-Bellido, Paloma
2015-01-01
We report on a young female, who presents with a severe speech and language disorder and a balanced de novo complex chromosomal rearrangement, likely to have resulted from a chromosome 7 pericentromeric inversion, followed by a chromosome 7 and 11 translocation. Using molecular cytogenetics, we mapped the four breakpoints to 7p21.1-15.3 (chromosome position: 20,954,043-21,001,537, hg19), 7q31 (chromosome position: 114,528,369-114,556,605, hg19), 7q21.3 (chromosome position: 93,884,065-93,933,453, hg19) and 11p12 (chromosome position: 38,601,145-38,621,572, hg19). These regions contain only non-coding transcripts (ENSG00000232790 on 7p21.1 and TCONS_00013886, TCONS_00013887, TCONS_00014353, TCONS_00013888 on 7q21) indicating that no coding sequences are directly disrupted. The breakpoint on 7q31 mapped 200 kb downstream of FOXP2, a well-known language gene. No splice site or non-synonymous coding variants were found in the FOXP2 coding sequence. We were unable to detect any changes in the expression level of FOXP2 in fibroblast cells derived from the proband, although this may be the result of the low expression level of FOXP2 in these cells. We conclude that the phenotype observed in this patient either arises from a subtle change in FOXP2 regulation due to the disruption of a downstream element controlling its expression, or from the direct disruption of non-coding RNAs.
Yokoyama, Eiji; Hirai, Shinichiro; Ishige, Taichiro; Murakami, Satoshi
2018-01-02
Seventeen clusters of Shiga toxin-producing Escherichia coli O157:H7/- (O157) strains, determined by cluster analysis of pulsed-field gel electrophoresis patterns, were analyzed using whole genome sequence (WGS) data to investigate this pathogen's molecular epidemiology. The 17 clusters included 136 strains containing strains from nine outbreaks, with each outbreak caused by a single source contaminated with the organism, as shown by epidemiological contact surveys. WGS data of these strains were used to identify single nucleotide polymorphisms (SNPs) by two methods: short read data were directly mapped to a reference genome (mapping derived SNPs) and common SNPs between the mapping derived SNPs and SNPs in assembled data of short read data (common SNPs). Among both SNPs, those that were detected in genes with a gap were excluded to remove ambiguous SNPs from further analysis. The effectiveness of both SNPs was investigated among all the concatenated SNPs that were detected (whole SNP set); SNPs were divided into three categories based on the genes in which they were located (i.e., backbone SNP set, O-island SNP set, and mobile element SNP set); and SNPs in non-coding regions (intergenic region SNP set). When SNPs from strains isolated from the nine single source derived outbreaks were analyzed using an unweighted pair group method with arithmetic mean tree (UPGMA) and a minimum spanning tree (MST), the maximum pair-wise distances of the backbone SNP set of the mapping derived SNPs were significantly smaller than those of the whole and intergenic region SNP set on both UPGMAs and MSTs. This significant difference was also observed when the backbone SNP set of the common SNPs were examined (Steel-Dwass test, P≤0.01). When the maximum pair-wise distances were compared between the mapping derived and common SNPs, significant differences were observed in those of the whole, mobile element, and intergenic region SNP set (Wilcoxon signed rank test, P≤0.01). 
When all the strains included in one complex on an MST or one cluster on a UPGMA tree were designated as the same genotype, the values of the Hunter-Gaston Discriminatory Power Index for the backbone SNP set of the mapping derived and common SNPs were higher than those of the other SNP sets. In contrast, the mobile element SNP set could not robustly subdivide lineage I strains among the tested O157 strains using either the mapping derived or the common SNPs. These results suggested that the backbone SNP set was the most effective for analysis of WGS data for O157 in enabling an appropriation of its molecular epidemiology. Copyright © 2017 Elsevier B.V. All rights reserved.
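The Hunter-Gaston Discriminatory Power Index mentioned above is commonly computed as D = 1 - [1 / (N(N - 1))] Σ n_j(n_j - 1), where N is the number of strains and n_j is the number of strains assigned to genotype j. The sketch below assumes that standard definition. The function name and the toy genotype labels are hypothetical, and the abstract does not state how the authors implemented the calculation.

```python
from collections import Counter
from typing import Hashable, Sequence


def hunter_gaston_index(genotypes: Sequence[Hashable]) -> float:
    """Discriminatory power: probability that two strains drawn at random
    without replacement are assigned to different genotypes."""
    n = len(genotypes)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(genotypes).values()
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical genotype assignments for four strains (illustration only).
example = ["type_A", "type_A", "type_B", "type_C"]
print(hunter_gaston_index(example))
```

A value near 1 means the typing scheme separates almost every pair of strains, while 0 means all strains share one genotype.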
Karayasheva, Dobrina; Glushkova, Maria; Boteva, Ekaterina; Mitev, Vanyo; Kadiyska, Tanya
2016-08-01
Various exogenous and endogenous risk factors have been described as contributing to dental caries susceptibility. In the last decade it has been established that both pro and active forms of host-derived matrix metalloproteinases (MMPs) are present in the oral cavity. A role for MMPs in caries development has been hypothesized. The aim of this study was to analyse the MMP2 (rs2287074) and MMP3 (rs679620) single nucleotide polymorphisms (SNPs) and their role in caries susceptibility. The two SNPs were analysed by PCR-restriction fragment length polymorphism (RFLP) in a sample of 102 ethnic Bulgarian volunteers (42 males and 60 females), all students at Sofia Medical University. Statistical analysis of the MMP2 SNP showed significant differences in genotype frequencies between the caries-free (CF, DMFT=0) and low caries experience (LCE, DMFT≤5) groups. Analysis of the non-synonymous MMP3 SNP found significant differences both between the CF and caries experience groups (LCE + high caries experience (HCE, DMFT≥5)) and between the LCE and HCE groups. The presence of allele G decreased the risk of HCE about 4-fold. The MMP2 and MMP3 genes are likely to be involved in caries susceptibility in our population. However, as dental caries is a multifactorial disorder and several genes are likely to influence it, it is reasonable to expect that SNPs, even those proven to be functional like rs679620, potentially play a significant, but not major, role in the disease outcome. Copyright © 2016 Elsevier Ltd. All rights reserved.
Ricaño-Ponce, Isis; Zhernakova, Daria V; Deelen, Patrick; Luo, Oscar; Li, Xingwang; Isaacs, Aaron; Karjalainen, Juha; Di Tommaso, Jennifer; Borek, Zuzanna Agnieszka; Zorro, Maria M; Gutierrez-Achury, Javier; Uitterlinden, Andre G; Hofman, Albert; van Meurs, Joyce; Netea, Mihai G; Jonkers, Iris H; Withoff, Sebo; van Duijn, Cornelia M; Li, Yang; Ruan, Yijun; Franke, Lude; Wijmenga, Cisca; Kumar, Vinod
2016-04-01
Genome-wide association and fine-mapping studies in 14 autoimmune diseases (AID) have implicated more than 250 loci in one or more of these diseases. As more than 90% of AID-associated SNPs are intergenic or intronic, pinpointing the causal genes is challenging. We performed a systematic analysis to link 460 SNPs that are associated with 14 AID to causal genes using transcriptomic data from 629 blood samples. We were able to link 71 (39%) of the AID-SNPs to two or more nearby genes, providing evidence that multiple causal genes exist for some of the AID loci. While 54 of the AID loci are shared by one or more AID, 17% of them do not share candidate causal genes. In addition to finding novel genes such as ULK3, we also implicate novel disease mechanisms and pathways like autophagy in celiac disease pathogenesis. Furthermore, 42 of the AID SNPs specifically affected the expression of 53 non-coding RNA genes. To further understand how the non-coding genome contributes to AID, the SNPs were linked to functional regulatory elements, suggesting a model in which AID genes are regulated by a network of chromatin looping/non-coding RNA interactions. The looping model also explains how a causal candidate gene is not necessarily the gene closest to the AID SNP, which was the case in nearly 50% of cases. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Junk DNA and the long non-coding RNA twist in cancer genetics
Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A
2015-01-01
The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNAs in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNAs, often via interaction with proteins, function at specific genomic loci or use their own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphisms (SNPs) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839
Complex nature of SNP genotype effects on gene expression in primary human leucocytes.
Heap, Graham A; Trynka, Gosia; Jansen, Ritsert C; Bruinenberg, Marcel; Swertz, Morris A; Dinesen, Lotte C; Hunt, Karen A; Wijmenga, Cisca; Vanheel, David A; Franke, Lude
2009-01-07
Genome wide association studies have been hugely successful in identifying disease risk variants, yet most variants do not lead to coding changes and how variants influence biological function is usually unknown. We correlated gene expression and genetic variation in untouched primary leucocytes (n = 110) from individuals with celiac disease - a common condition with multiple risk variants identified. We compared our observations with an EBV-transformed HapMap B cell line dataset (n = 90), and performed a meta-analysis to increase power to detect non-tissue specific effects. In celiac peripheral blood, 2,315 SNP variants influenced gene expression at 765 different transcripts (< 250 kb from SNP, at FDR = 0.05, cis expression quantitative trait loci, eQTLs). 135 of the detected SNP-probe effects (reflecting 51 unique probes) were also detected in a HapMap B cell line published dataset, all with effects in the same allelic direction. Overall gene expression differences within the two datasets predominantly explain the limited overlap in observed cis-eQTLs. Celiac associated risk variants from two regions, containing genes IL18RAP and CCR3, showed significant cis genotype-expression correlations in the peripheral blood but not in the B cell line datasets. We identified 14 genes where a SNP affected the expression of different probes within the same gene, but in opposite allelic directions. By incorporating genetic variation in co-expression analyses, functional relationships between genes can be more significantly detected. In conclusion, the complex nature of genotypic effects in human populations makes the use of a relevant tissue, large datasets, and analysis of different exons essential to enable the identification of the function for many genetic risk variants in common diseases.
Cruz, M; Valladares-Salgado, A; Garcia-Mena, J; Ross, K; Edwards, M; Angeles-Martinez, J; Ortega-Camarillo, C; de la Peña, J Escobedo; Burguete-Garcia, A I; Wacher-Rodarte, N; Ambriz, R; Rivera, R; D'artote, A L; Peralta, J; Parra, Esteban J; Kumate, J
2010-05-01
Type 2 diabetes (T2D) is influenced by diverse environmental and genetic risk factors. Metabolic syndrome (MS) increases the risk of cardiovascular disease and diabetes. We analysed 14 polymorphisms located in 10 candidate loci in a sample of patients with T2D and controls from Mexico City. We analysed the association of 14 polymorphisms located within 10 genes (TCF7L2, ENPP1, ADRB3, KCNJ11, LEPR, PPARgamma, FTO, CDKAL1, SIRT1 and HHEX) with T2D and MS. The analysis included 519 subjects with T2D defined according to the ADA criteria, 389 with MS defined according to the AHA/NHLBI criteria and 547 controls. Association was tested with the program ADMIXMAP including individual ancestry, age, sex, education and in some cases body mass index (BMI), in a logistic regression model. The two markers located within the TCF7L2 gene showed strong associations with T2D (rs7903146, T allele, odds ratio (OR) = 1.76, p = 0.001 and rs12255372, T allele, OR = 1.78, p = 0.002), but did not show significant association with MS. The non-synonymous rs4994 polymorphism of the ADRB3 gene was associated with T2D (Trp allele, OR = 0.62, p = 0.001) and MS (Trp allele, OR = 0.74, p = 0.018). Nominally significant associations were also observed between T2D and the SIRT1 rs3758391 SNP and between MS and the HHEX rs5015480 polymorphism. Variants located within the gene TCF7L2 are strongly associated with T2D but not with MS, providing support to previous evidence indicating that polymorphisms at the TCF7L2 gene increase T2D risk. In contrast, the non-synonymous ADRB3 rs4994 polymorphism is associated with T2D and MS.
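As a reading aid for the odds ratios reported above, the following minimal sketch shows how an OR from a logistic regression model relates to the model coefficient and to an absolute risk. The function names and the 10% baseline risk are hypothetical choices made for illustration; they are not taken from the study, and the sketch does not reproduce the ADMIXMAP analysis.

```python
import math


def log_odds_coefficient(odds_ratio):
    """Logistic regression coefficient (log-odds) that corresponds to an OR."""
    return math.log(odds_ratio)


def risk_given_baseline(baseline_prob, odds_ratio):
    """Risk for carriers, given a baseline risk and a per-allele odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    carrier_odds = baseline_odds * odds_ratio
    return carrier_odds / (1.0 + carrier_odds)


# ORs reported in the abstract; the 0.10 baseline risk is a made-up value.
for label, odds_ratio in [("TCF7L2 rs7903146 T", 1.76), ("ADRB3 rs4994 Trp", 0.62)]:
    beta = log_odds_coefficient(odds_ratio)
    risk = risk_given_baseline(0.10, odds_ratio)
    print(f"{label}: beta = {beta:.3f}, risk at 10% baseline = {risk:.3f}")
```

An OR above 1 raises the risk relative to the baseline and an OR below 1 lowers it, which is why the TCF7L2 T allele reads as a risk allele and the ADRB3 Trp allele as protective in this sample.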
NASA Astrophysics Data System (ADS)
Zhang, Zhe; Schwatz, Charles; Alexov, Emil
2011-03-01
Creatine transporter (CT) protein, which is encoded by the SLC6A8 gene, is essential for taking up creatine into the cell, which in turn plays a key role in the spatial and temporal maintenance of energy in skeletal and cardiac muscle cells. It was shown that some missense mutations in CT cause mental retardation, while others are harmless non-synonymous single nucleotide polymorphisms (nsSNPs). Currently fifteen missense mutations in CT are known, among which twelve are disease-causing. Sequence analysis reveals that there is no clear trend distinguishing disease-causing from harmless missense mutations. Because of that, we built a 3D model of the CT using a highly homologous template and used the model to investigate the effects of mutations on CT stability and the hydrogen bond network. It is demonstrated that disease-causing mutations affect the folding free energy and ionization states of titratable groups to a much greater extent than harmless mutations. Supported by grants from NLM, NIH, grant numbers 1R03LM009748 and 1R03LM009748-S1.
RNA editing differently affects protein-coding genes in D. melanogaster and H. sapiens.
Grassi, Luigi; Leoni, Guido; Tramontano, Anna
2015-07-14
When an RNA editing event occurs within a coding sequence it can lead to a different encoded amino acid. The biological significance of these events remains an open question: they can modulate protein functionality, increase the complexity of transcriptomes or arise from a loose specificity of the involved enzymes. We analysed the editing events in coding regions that do or do not change the encoded amino acid (nonsynonymous and synonymous events, respectively) in D. melanogaster and in H. sapiens and compared them with the appropriate random models. Interestingly, our results show that the phenomenon has rather different characteristics in the two organisms. For example, we confirm the observation that editing events occur more frequently in non-coding than in coding regions, and report that this effect is much more evident in H. sapiens. Additionally, in this latter organism, editing events tend to affect less conserved residues. The less frequently occurring editing events in Drosophila tend to avoid drastic amino acid changes. Interestingly, we find that, in Drosophila, changes from less frequently used codons to more frequently used ones are favoured, while this is not the case in H. sapiens.
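The distinction between synonymous and nonsynonymous events used above can be sketched as follows. This is a minimal illustration that assumes the standard nuclear genetic code and models an editing event as a single-base substitution within one codon (A-to-I editing is read as A to G); the function name `classify_edit` is hypothetical and the sketch is not the authors' pipeline.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_edit(codon, position, new_base):
    """Classify a single-base change at `position` (0, 1 or 2) of `codon`."""
    codon = codon.upper().replace("U", "T")
    new_base = new_base.upper().replace("U", "T")
    edited = codon[:position] + new_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[edited]:
        return "synonymous"
    return "nonsynonymous"


# AAA (Lys) -> AAG (Lys): the encoded amino acid is unchanged.
print(classify_edit("AAA", 2, "G"))
# CAG (Gln) -> CGG (Arg): the encoded amino acid changes.
print(classify_edit("CAG", 1, "G"))
```

Comparing observed counts of the two classes against a random model, as the abstract describes, would additionally require a null distribution over editable sites, which is outside the scope of this sketch.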
Castellana, Stefano; Fusilli, Caterina; Mazzoccoli, Gianluigi; Biagini, Tommaso; Capocefalo, Daniele; Carella, Massimo; Vescovi, Angelo Luigi; Mazza, Tommaso
2017-06-01
There are 24,189 possible non-synonymous amino acid changes that could affect the human mitochondrial DNA. Only a tiny subset has been functionally evaluated with certainty so far, while the pathogenicity of the vast majority has only been assessed in silico by software predictors. Since these tools proved to be rather incongruent, we have designed and implemented APOGEE, a machine-learning algorithm that outperforms all existing prediction methods in estimating the harmfulness of mitochondrial non-synonymous genome variations. We provide a detailed description of the underlying algorithm, of the selected and manually curated training and test sets of variants, as well as of its classification ability.
Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J
2014-01-01
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods, namely Gini, absolute probability difference (APD), and entropy, to develop two novel summary scores, the principal component score (PCS) and the Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and the multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I-errors (<0.05) compared to GS, APDS, ES (>0.05). A real data study using the genetic-analysis-workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.
Are Synonymous Substitutions in Flowering Plant Mitochondria Neutral?
Wynn, Emily L; Christensen, Alan C
2015-10-01
Angiosperm mitochondrial genes appear to have very low mutation rates, while non-gene regions expand, diverge, and rearrange quickly. One possible explanation for this disparity is that synonymous substitutions in plant mitochondrial genes are not truly neutral and selection keeps their occurrence low. If this were true, the explanation for the disparity in mutation rates in genes and non-genes needs to consider selection as well as mechanisms of DNA repair. Rps14 is co-transcribed with cob and rpl5 in most plant mitochondrial genomes, but in some genomes, rps14 has been duplicated to the nucleus leaving a pseudogene in the mitochondria. This provides an opportunity to compare neutral substitution rates in pseudogenes with synonymous substitution rates in the orthologs. Genes and pseudogenes of rps14 have been aligned among different species and the mutation rates have been calculated. Neutral substitution rates in pseudogenes and synonymous substitution rates in genes are significantly different, providing evidence that synonymous substitutions in plant mitochondrial genes are not completely neutral. The non-neutrality is not sufficient to completely explain the exceptionally low mutation rates in land plant mitochondrial genomes, but selective forces appear to play a small role.
Kämpfer, Peter; Rückert, Christian; Blom, Jochen; Goesmann, Alexander; Wink, Joachim; Kalinowski, Jörn; Glaeser, Stefanie P
2018-01-01
Streptomyces canus was described in 1953 and the name was listed in the Approved List of Bacterial Names in 1980. Three years later, Streptomyces ciscaucasicus was published and the name was subsequently validated in Validation List no. 22 in 1986. On the basis of genome comparison and multilocus sequence analysis of the type strains of Streptomyces canus and Streptomyces ciscaucasicus it can now be shown that these two species, despite some phenotypic differences, are subjective synonyms. In such a case Rule 24 of the Bacteriological Code applies, in which priority of names is determined by the date of the original publication. Hence, we propose that S. ciscaucasicus is a later subjective synonym of S. canus.
Gregory, Michael D; Kolachana, Bhaskar; Yao, Yin; Nash, Tiffany; Dickinson, Dwight; Eisenberg, Daniel P; Mervis, Carolyn B; Berman, Karen F
2018-04-04
Williams syndrome ([WS], 7q11.23 hemideletion) and 7q11.23 duplication syndrome (Dup7) show contrasting syndromic symptoms. However, within each group there is considerable interindividual variability in the degree to which these phenotypes are expressed. Though software exists to identify areas of copy number variation (CNV) from commonly-available SNP-chip data, this software does not provide non-diploid genotypes in CNV regions. Here, we describe a method for identifying haploid and triploid genotypes in CNV regions, and then, as a proof-of-concept for applying this information to explain clinical variability, we test for genotype-phenotype associations. Blood samples for 25 individuals with WS and 13 individuals with Dup7 were genotyped with Illumina-HumanOmni5M SNP-chips. PennCNV and in-house code were used to make genotype calls for each SNP in the 7q11.23 locus. We tested for association between the presence of aortic arteriopathy and genotypes of the remaining (haploid in WS) or duplicated (triploid in Dup7) alleles. Haploid calls in the 7q11.23 region were made for 99.0% of SNPs in the WS group, and triploid calls for 98.8% of SNPs in those with Dup7. The G allele of SNP rs2528795 in the ELN gene was associated with aortic stenosis in WS participants (p < 0.0049) while the A allele of the same SNP was associated with aortic dilation in Dup7. Commonly available SNP-chip information can be used to make haploid and triploid calls in individuals with CNVs and then to relate variability in specific genes to variability in syndromic phenotypes, as demonstrated here using aortic arteriopathy. This work sets the stage for similar genotype-phenotype analyses in CNVs where phenotypes may be more complex and/or where there is less information about genetic mechanisms.
Widespread signatures of local mRNA folding structure selection in four Dengue virus serotypes
2015-01-01
Background It is known that mRNA folding can affect and regulate various gene expression steps both in living organisms and in viruses. Previous studies have recognized functional RNA structures in the genome of the Dengue virus. However, these studies usually focused either on the viral untranslated regions or on very specific and limited regions at the beginning of the coding sequences, in a limited number of strains, and without considering evolutionary selection. Results Here we performed the first large scale comprehensive genomics analysis of selection for local mRNA folding strength in the Dengue virus coding sequences, based on a total of 1,670 genomes and 4 serotypes. Our analysis identified clusters of positions along the coding regions that may undergo a conserved evolutionary selection for strong or weak local folding maintained across different viral variants. Specifically, 53-66 clusters for strong folding and 49-73 clusters for weak folding (depending on serotype) aggregated of positions with a significant conservation of folding energy signals (related to partially overlapping local genomic regions) were recognized. In addition, up to 7% of these positions were found to be conserved in more than 90% of the viral genomes. Although some of the identified positions undergo frequent synonymous / non-synonymous substitutions, the selection for folding strength therein is preserved, and thus cannot be trivially explained based on sequence conservation alone. Conclusions The fact that many of the positions with significant folding related signals are conserved among different Dengue variants suggests that a better understanding of the mRNA structures in the corresponding regions may promote the development of prospective anti-Dengue vaccination strategies. The comparative genomics approach described here can be employed in the future for detecting functional regions in other pathogens with very high mutation rates. PMID:26449467
Zhang, Honghai; Chen, Lei
2011-03-01
The dhole (Cuon alpinus) is the only extant species in the genus Cuon (Carnivora: Canidae). In the present study, the complete mitochondrial genome of the dhole was sequenced. The total length is 16672 base pairs, which is the shortest in Canidae. Sequence analysis revealed that most mitochondrial genomic functional regions were highly consistent among canid animals except the CSB domain of the control region. The difference in length among the Canidae mitochondrial genome sequences is mainly due to the number of short tandem repeat segments in the CSB domain. Phylogenetic analysis was performed based on the concatenated data set of 14 mitochondrial genes of 8 canid animals by using maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) methods. The genera Vulpes and Nyctereutes formed a sister group and split first within Canidae, followed by the split of Cuon. The divergence in the genus Canis was the latest. The divergence of domestic dogs after that of Canis lupus laniger is completely supported by all three topologies. Pairwise sequence divergence data of different mitochondrial genes among canid animals were also determined. Except for the synonymous substitutions in protein-coding genes, the control region exhibits the highest sequence divergences. The synonymous rates are approximately two to six times higher than those of the non-synonymous sites except for a slightly higher rate in the non-synonymous substitution between Cuon alpinus and Vulpes vulpes. 16S rRNA genes have a slightly faster sequence divergence than 12S rRNA and tRNA genes. Based on nucleotide substitutions of tRNA genes and rRNA genes, the times since divergence between dhole and other canid animals, and between domestic dogs and three subspecies of wolves were evaluated. The result indicates that Vulpes and Nyctereutes have a close phylogenetic relationship and the divergence of Nyctereutes is a little earlier.
The Tibetan wolf may be an archaic pedigree within wolf subspecies. The genetic distance between wolves and domestic dogs is less than that among different subspecies of wolves. The domestication of dogs was about 1.56-1.92 million years ago or even earlier.
Chograni, Manèl; Rejeb, Imen; Jemaa, Lamia Ben; Châabouni, Myriam; Bouhamed, Habiba Chaabouni
2011-01-01
Nance-Horan Syndrome (NHS) or X-linked cataract-dental syndrome is a disease of unknown gene action mechanism, characterized by congenital cataract, dental anomalies, dysmorphic features and, in some cases, mental retardation. We performed linkage analysis in a Tunisian family with NHS in which affected males and obligate carrier female share a common haplotype in the Xp22.32-p11.21 region that contains the NHS gene. Direct sequencing of NHS coding exons and flanking intronic sequences allowed us to identify the first missense mutation (P551S) and a reported SNP-polymorphism (L1319F) in exon 6, a reported UTR–SNP (c.7422 C>T) and a novel one (c.8239 T>A) in exon 8. Both variations P551S and c.8239 T>A segregate with NHS phenotype in this family. Although truncations, frame-shift and copy number variants have been reported in this gene, no missense mutations have been found to segregate previously. This is the first report of a missense NHS mutation causing NHS phenotype (including cardiac defects). We hypothesize also that the non-reported UTR–SNP of the exon 8 (3′-UTR) is specific to the Tunisian population. PMID:21559051
Chograni, Manèl; Rejeb, Imen; Jemaa, Lamia Ben; Châabouni, Myriam; Bouhamed, Habiba Chaabouni
2011-08-01
Nance-Horan Syndrome (NHS) or X-linked cataract-dental syndrome is a disease of unknown gene action mechanism, characterized by congenital cataract, dental anomalies, dysmorphic features and, in some cases, mental retardation. We performed linkage analysis in a Tunisian family with NHS in which affected males and obligate carrier female share a common haplotype in the Xp22.32-p11.21 region that contains the NHS gene. Direct sequencing of NHS coding exons and flanking intronic sequences allowed us to identify the first missense mutation (P551S) and a reported SNP-polymorphism (L1319F) in exon 6, a reported UTR-SNP (c.7422 C>T) and a novel one (c.8239 T>A) in exon 8. Both variations P551S and c.8239 T>A segregate with NHS phenotype in this family. Although truncations, frame-shift and copy number variants have been reported in this gene, no missense mutations have been found to segregate previously. This is the first report of a missense NHS mutation causing NHS phenotype (including cardiac defects). We hypothesize also that the non-reported UTR-SNP of the exon 8 (3'-UTR) is specific to the Tunisian population.
Linkage Analysis in Autoimmune Addison's Disease: NFATC1 as a Potential Novel Susceptibility Locus.
Mitchell, Anna L; Bøe Wolff, Anette; MacArthur, Katie; Weaver, Jolanta U; Vaidya, Bijay; Erichsen, Martina M; Darlay, Rebecca; Husebye, Eystein S; Cordell, Heather J; Pearce, Simon H S
2015-01-01
Autoimmune Addison's disease (AAD) is a rare, highly heritable autoimmune endocrinopathy. It is possible that there may be some highly penetrant variants which confer disease susceptibility that have yet to be discovered. DNA samples from 23 multiplex AAD pedigrees from the UK and Norway (50 cases, 67 controls) were genotyped on the Affymetrix SNP 6.0 array. Linkage analysis was performed using Merlin. EMMAX was used to carry out a genome-wide association analysis comparing the familial AAD cases to 2706 UK WTCCC controls. To explore some of the linkage findings further, a replication study was performed by genotyping 64 SNPs in two of the four linked regions (chromosomes 7 and 18), on the Sequenom iPlex platform in three European AAD case-control cohorts (1097 cases, 1117 controls). The data were analysed using a meta-analysis approach. In a parametric analysis, applying a rare dominant model, loci on chromosomes 7, 9 and 18 had LOD scores >2.8. In a non-parametric analysis, a locus corresponding to the HLA region on chromosome 6, known to be associated with AAD, had a LOD score >3.0. In the genome-wide association analysis, a SNP cluster on chromosome 2 and a pair of SNPs on chromosome 6 were associated with AAD (P < 5 × 10^-7). A meta-analysis of the replication study data demonstrated that three chromosome 18 SNPs were associated with AAD, including a non-synonymous variant in the NFATC1 gene. This linkage study has implicated a number of novel chromosomal regions in the pathogenesis of AAD in multiplex AAD families and adds further support to the role of HLA in AAD. The genome-wide association analysis has also identified a region of interest on chromosome 2. A replication study has demonstrated that the NFATC1 gene is worthy of future investigation; however, each of the regions identified requires further, systematic analysis.
Polymorphisms of 20 regulatory proteins between Mycobacterium tuberculosis and Mycobacterium bovis.
Bigi, María M; Blanco, Federico Carlos; Araújo, Flabio R; Thacker, Tyler C; Zumárraga, Martín J; Cataldi, Angel A; Soria, Marcelo A; Bigi, Fabiana
2016-08-01
Mycobacterium tuberculosis and Mycobacterium bovis are responsible for tuberculosis in humans and animals, respectively. Both species are closely related and belong to the Mycobacterium tuberculosis complex (MTC). M. tuberculosis is the most ancient species from which M. bovis and other members of the MTC evolved. The genome of M. bovis is >99.95% identical to that of M. tuberculosis but with seven deletions ranging in size from 1 to 12.7 kb. In addition, 1200 single nucleotide mutations in coding regions distinguish M. bovis from M. tuberculosis. In the present study, we assessed 75 M. tuberculosis genomes and 23 M. bovis genomes to identify non-synonymous mutations in 202 coding sequences of regulatory genes between both species. We identified species-specific variants in 20 regulatory proteins and confirmed differential expression of hypoxia-related genes between M. bovis and M. tuberculosis. © 2016 The Societies and John Wiley & Sons Australia, Ltd.
Mongini, Patricia K. A.; Kramer, Jill M.; Ishikawa, Tomo-o; Herschman, Harvey; Esposito, Donna
2014-01-01
Sjogren’s syndrome (SS) is characterized by salivary gland leukocytic infiltrates and impaired salivation (xerostomia). Cox-2 (Ptgs2) is located on chromosome 1 within the span of the Aec2 region. In an attempt to demonstrate that COX-2 drives antibody-dependent hyposalivation, NOD.B10 congenic mice bearing a Cox-2flox gene were generated. A congenic line with non-NOD alleles in Cox-2-flanking genes failed to manifest xerostomia. Further backcrossing yielded disease-susceptible NOD.B10 Cox-2flox lines; fine genetic mapping determined that critical Aec2 genes lie within a 1.56 to 2.17 Mb span of DNA downstream of Cox-2. Bioinformatics analysis revealed that susceptible and non-susceptible lines exhibit non-synonymous coding SNPs in 8 protein-encoding genes of this region, thereby better delineating candidate Aec2 alleles needed for SS xerostomia. PMID:24685748
Genetic variations in NADPH-CYP450 oxidoreductase in a Czech Slavic cohort
Tomková, Mária; Panda, Satya Prakash; Šeda, Ondřej; Baxová, Alice; Hůlková, Martina; Masters, Bettie Sue Siler; Martásek, Pavel
2015-01-01
Background Gene polymorphisms encoding the enzyme NADPH–cytochrome P450 oxidoreductase (POR) contribute to inter-individual differences in drug response. Aim To estimate polymorphic allele frequencies of the POR gene in a Czech Slavic population. Materials & Methods The gene POR was analyzed in 322 Czech Slavic individuals from a control cohort by sequencing and HRM analysis. Results Twenty-five SNP genetic variations were identified. Of these variants, 7 were new, unreported SNPs, including two SNPs in the 5´flanking region (g.4965 C>T and g.4994 G>T), one intronic variant (c.1899 −20C>T), one synonymous SNP (p.20Ala=) and three nonsynonymous SNPs (p.Thr29Ser, p.Pro384Leu and p.Thr529Met). The p.Pro384Leu variant exhibited reduced enzymatic activities compared to wild type. Conclusion New POR variant identification indicates that the number of uncommon variants might be specific for each subpopulation being investigated, particularly germane to the singular role that POR plays in providing reducing equivalents to all CYPs in the endoplasmic reticulum. PMID:25712184
Synonymous codon usage patterns in different parasitic platyhelminth mitochondrial genomes.
Chen, L; Yang, D Y; Liu, T F; Nong, X; Huang, X; Xie, Y; Fu, Y; Zheng, W P; Zhang, R H; Wu, X H; Gu, X B; Wang, S X; Peng, X R; Yang, G Y
2013-02-27
We analyzed synonymous codon usage patterns of the mitochondrial genomes of 43 parasitic platyhelminth species. The relative synonymous codon usage, the effective number of codons (NC) and the frequency of G+C at the third synonymously variable coding position were calculated. Correspondence analysis was used to determine the major variation trends shaping the codon usage patterns. Among the mitochondrial genomes of 19 trematode species, the GC content of third codon positions varied from 0.151 to 0.592, with a mean of 0.295 ± 0.116. In cestodes, the mean GC content of third codon positions was 0.254 ± 0.044. A comparison of the nucleotide composition at 4-fold synonymous sites revealed that, on average, there was a greater abundance of codons ending in U (51.9%) or A (22.7%) than in C (6.3%) or G (19.14%). Twenty-two codons, including UUU, UUA and UUG, were frequently used. In the NC-plot, most of the points were distributed well below or around the expected NC curve. In addition to compositional constraints, the degree of hydrophobicity and the aromatic amino acids also influenced codon usage in the mitochondrial genomes of these 43 parasitic platyhelminth species.
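Two of the statistics named above can be sketched briefly. Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally; GC3 is the G+C fraction at third codon positions. The sketch below is illustrative only: it uses the standard genetic code as a placeholder (flatworm mitochondria use a different translation table, which would have to be passed in as `code`), it computes plain GC3 over all third positions rather than only synonymously variable ones, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Placeholder: standard genetic code, codons enumerated in TCAG order.
STANDARD_CODE = dict(zip(
    ("".join(codon) for codon in product(BASES, repeat=3)),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
))


def split_codons(seq):
    """Split an in-frame coding sequence into codons, dropping a trailing partial codon."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def rscu(seq, code=STANDARD_CODE):
    """Return {codon: RSCU} for amino acids that occur at least once in seq."""
    counts = Counter(c for c in split_codons(seq) if c in code)
    families = defaultdict(list)
    for codon, aa in code.items():
        families[aa].append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stop codons and unobserved amino acids
        expected = total / len(family)  # equal usage of synonymous codons
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


def gc3(seq):
    """Fraction of G or C at third codon positions (sequence must contain a codon)."""
    thirds = [codon[2] for codon in split_codons(seq)]
    return sum(base in "GC" for base in thirds) / len(thirds)


toy = "TTTTTTTTCGGA"  # made-up sequence: TTT TTT TTC GGA
print(rscu(toy)["TTT"], rscu(toy)["TTC"], gc3(toy))
```

An RSCU above 1 marks a codon used more often than expected under equal usage, which is the sense in which the abstract describes codons such as UUU as frequently used.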
SNP discovery in the bovine milk transcriptome using RNA-Seq technology.
Cánovas, Angela; Rincon, Gonzalo; Islas-Trejo, Alma; Wickramasinghe, Saumya; Medrano, Juan F
2010-12-01
High-throughput sequencing of RNA (RNA-Seq) was developed primarily to analyze global gene expression in different tissues. However, it also is an efficient way to discover coding SNPs. The objective of this study was to perform a SNP discovery analysis in the milk transcriptome using RNA-Seq. Seven milk samples from Holstein cows were analyzed by sequencing cDNAs using the Illumina Genome Analyzer system. We detected 19,175 genes expressed in milk samples corresponding to approximately 70% of the total number of genes analyzed. The SNP detection analysis revealed 100,734 SNPs in Holstein samples, and a large number of those corresponded to differences between the Holstein breed and the Hereford bovine genome assembly Btau4.0. The number of polymorphic SNPs within Holstein cows was 33,045. The accuracy of RNA-Seq SNP discovery was tested by comparing SNPs detected in a set of 42 candidate genes expressed in milk that had been resequenced earlier using Sanger sequencing technology. Seventy of 86 SNPs were detected using both RNA-Seq and Sanger sequencing technologies. The KASPar Genotyping System was used to validate unique SNPs found by RNA-Seq but not observed by Sanger technology. Our results confirm that analyzing the transcriptome using RNA-Seq technology is an efficient and cost-effective method to identify SNPs in transcribed regions. This study creates guidelines to maximize the accuracy of SNP discovery and prevention of false-positive SNP detection, and provides more than 33,000 SNPs located in coding regions of genes expressed during lactation that can be used to develop genotyping platforms to perform marker-trait association studies in Holstein cattle.
Chiusano, M L; D'Onofrio, G; Alvarez-Valin, F; Jabbari, K; Colonna, G; Bernardi, G
1999-09-30
We investigated the relationships between the nucleotide substitution rates and the predicted secondary structures in the three-state representation (alpha-helix, beta-sheet, and coil). The analysis was carried out on 34 alignments, each of which comprised sequences belonging to at least four different mammalian orders. The rates of synonymous substitution were found to be significantly different in regions predicted to be alpha-helix, beta-sheet, or coil. Likewise, the nonsynonymous rates also differ, although, as expected, to a lesser extent, in the three types of secondary structure, suggesting that different selective constraints associated with the different structures are affecting in a similar way the synonymous and nonsynonymous rates. Moreover, the base composition of the third codon positions is different in coding sequence regions corresponding to different secondary structures of proteins.
AA9int: SNP Interaction Pattern Search Using Non-Hierarchical Additive Model Set.
Lin, Hui-Yi; Huang, Po-Yu; Chen, Dung-Tsa; Tung, Heng-Yuan; Sellers, Thomas A; Pow-Sang, Julio; Eeles, Rosalind; Easton, Doug; Kote-Jarai, Zsofia; Amin Al Olama, Ali; Benlloch, Sara; Muir, Kenneth; Giles, Graham G; Wiklund, Fredrik; Gronberg, Henrik; Haiman, Christopher A; Schleutker, Johanna; Nordestgaard, Børge G; Travis, Ruth C; Hamdy, Freddie; Neal, David E; Pashayan, Nora; Khaw, Kay-Tee; Stanford, Janet L; Blot, William J; Thibodeau, Stephen N; Maier, Christiane; Kibel, Adam S; Cybulski, Cezary; Cannon-Albright, Lisa; Brenner, Hermann; Kaneva, Radka; Batra, Jyotsna; Teixeira, Manuel R; Pandha, Hardev; Lu, Yong-Jie; Park, Jong Y
2018-06-07
The use of single nucleotide polymorphism (SNP) interactions to predict complex diseases has received more attention over the past decade, but related statistical methods are still immature. We previously proposed the SNP Interaction Pattern Identifier (SIPI) approach to evaluate 45 SNP interaction patterns. SIPI is statistically powerful but suffers from a large computation burden. For large-scale studies, it is necessary to use a powerful and computation-efficient method. The objective of this study is to develop an evidence-based mini-version of SIPI as a screening tool or for standalone use and to evaluate the impact of inheritance mode and model structure on detecting SNP-SNP interactions. We tested two candidate approaches: the 'Five-Full' and 'AA9int' method. The Five-Full approach is composed of the five full interaction models considering three inheritance modes (additive, dominant and recessive). The AA9int approach is composed of nine interaction models by considering non-hierarchical model structure and the additive mode. Our simulation results show that AA9int has similar statistical power compared to SIPI and is superior to the Five-Full approach, and the impact of the non-hierarchical model structure is greater than that of the inheritance mode in detecting SNP-SNP interactions. In summary, it is recommended that AA9int is a powerful tool to be used either alone or as the screening stage of a two-stage approach (AA9int+SIPI) for detecting SNP-SNP interactions in large-scale studies. The 'AA9int' and 'parAA9int' functions (standard and parallel computing version) are added in the SIPI R package, which is freely available at https://linhuiyi.github.io/LinHY_Software/. Contact: hlin1@lsuhsc.edu. Supplementary data are available at Bioinformatics online.
Bruun, Camilla S.; Jäderlund, Karin H.; Berendt, Mette; Jensen, Kristine B.; Spodsberg, Eva H.; Gredal, Hanne; Shelton, G. Diane; Mickelson, James R.; Minor, Katie M.; Lohi, Hannes; Bjerkås, Inge; Stigen, Øyvind; Espenes, Arild; Rohdin, Cecilia; Edlund, Rebecca; Ohlsson, Jennie; Cizinauskas, Sigitas; Leifsson, Páll S.; Drögemüller, Cord; Moe, Lars; Cirera, Susanna; Fredholm, Merete
2013-01-01
The first cases of early-onset progressive polyneuropathy appeared in the Alaskan Malamute population in Norway in the late 1970s. Affected dogs were of both sexes and were ambulatory paraparetic, progressing to non-ambulatory tetraparesis. On neurologic examination, affected dogs displayed predominantly laryngeal paresis, decreased postural reactions, decreased spinal reflexes and muscle atrophy. The disease was considered eradicated through breeding programmes but recently new cases have occurred in the Nordic countries and the USA. The N-myc downstream-regulated gene (NDRG1) is implicated in neuropathies with comparable symptoms or clinical signs both in humans and in Greyhound dogs. This gene was therefore considered a candidate gene for the polyneuropathy in Alaskan Malamutes. The coding sequence of the NDRG1 gene derived from one healthy and one affected Alaskan Malamute revealed a non-synonymous G>T mutation in exon 4 in the affected dog that causes a Gly98Val amino acid substitution. This substitution was categorized to be “probably damaging” to the protein function by PolyPhen2 (score: 1.000). Subsequently, 102 Alaskan Malamutes from the Nordic countries and the USA known to be either affected (n = 22), obligate carriers (n = 7) or healthy (n = 73) were genotyped for the SNP using TaqMan. All affected dogs had the T/T genotype, the obligate carriers had the G/T genotype and the healthy dogs had the G/G genotype except for 13 who had the G/T genotype. A protein alignment showed that residue 98 is conserved in mammals and also that the entire NDRG1 protein is highly conserved (94.7%) in mammals. We conclude that the G>T substitution is most likely the mutation that causes polyneuropathy in Alaskan Malamutes. Our characterization of a novel candidate causative mutation for polyneuropathy offers a new canine model that can provide further insight into pathobiology and therapy of human polyneuropathy. 
Furthermore, selection against this mutation can now be used to eliminate the disease in Alaskan Malamutes. PMID:23393557
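As a rough illustration of the genotype data above, the following Python sketch tallies the T allele among the 102 genotyped dogs. The counts are taken from the abstract; the helper name `allele_frequency` is hypothetical. Because the dogs were selected for known disease status, the result describes this cohort only and should not be read as a breed-wide frequency.

```python
# Genotype counts as reported: 22 affected (T/T), 7 obligate carriers (G/T),
# and 73 healthy dogs, of which 13 were G/T and the remaining 60 were G/G.
genotype_counts = {"T/T": 22, "G/T": 7 + 13, "G/G": 73 - 13}


def allele_frequency(counts, allele):
    """Fraction of all sampled alleles that are `allele` (hypothetical helper)."""
    total_alleles = 2 * sum(counts.values())
    copies = sum(genotype.split("/").count(allele) * n for genotype, n in counts.items())
    return copies / total_alleles


t_frequency = allele_frequency(genotype_counts, "T")
print(f"Dogs genotyped: {sum(genotype_counts.values())}")
print(f"T allele frequency in this cohort: {t_frequency:.3f}")
```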
Erixon, Per; Oxelman, Bengt
2008-01-01
Background Synonymous DNA substitution rates in the plant chloroplast genome are generally relatively slow and lineage dependent. Non-synonymous rates are usually even slower due to purifying selection acting on the genes. Positive selection is expected to speed up non-synonymous substitution rates, whereas synonymous rates are expected to be unaffected. Until recently, positive selection has seldom been observed in chloroplast genes, and large-scale structural rearrangements leading to gene duplications are hitherto supposed to be rare. Methodology/Principal Findings We found high substitution rates in the exons of the plastid clpP1 gene in Oenothera (the Evening Primrose family) and three separate lineages in the tribe Sileneae (Caryophyllaceae, the Carnation family). Introns have been lost in some of the lineages, but where present, the intron sequences have substitution rates similar to those found in other introns of their genomes. The elevated substitution rates of clpP1 are associated with statistically significant whole-gene positive selection in three branches of the phylogeny. In two of the lineages we found multiple copies of the gene. Neighboring genes present in the duplicated fragments do not show signs of elevated substitution rates or positive selection. Although non-synonymous substitutions account for most of the increase in substitution rates, synonymous rates are also markedly elevated in some lineages. Whereas plant clpP1 genes experiencing negative (purifying) selection are characterized by having very conserved lengths, genes under positive selection often have large insertions of more or less repetitive amino acid sequence motifs. Conclusions/Significance We found positive selection of the clpP1 gene in various plant lineages to be correlated with repeated duplication of the clpP1 gene and surrounding regions, repetitive amino acid sequences, and an increase in synonymous substitution rates. 
The present study sheds light on the controversial issue of whether negative or positive selection is to be expected after gene duplications by providing evidence for the latter alternative. The observed increase in synonymous substitution rates in some of the lineages indicates that the detection of positive selection may be obscured under such circumstances. Future studies are required to explore the functional significance of the large inserted repeated amino acid motifs, as well as the possibility that synonymous substitution rates may be affected by positive selection. PMID:18167545
USDA-ARS's Scientific Manuscript database
Codon bias deoptimization has been previously used to successfully attenuate human pathogens including polio, respiratory syncytial and influenza viruses. We have applied a similar technology to deoptimize the capsid coding region (P1 region) of the cDNA infectious clone of foot-and-mouth disease vi...
Hagen, Ingerid J; Billing, Anna M; Rønning, Bernt; Pedersen, Sindre A; Pärn, Henrik; Slate, Jon; Jensen, Henrik
2013-05-01
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non-model species. Here, we describe a successful approach to a genome-wide medium density Single Nucleotide Polymorphism (SNP) panel in a non-model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP-chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP-chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP-chip to demonstrate the ability of such genome-wide marker data to detect population sub-division, and compared these results to similar analyses using microsatellites. The SNP-chip will be used to map Quantitative Trait Loci (QTL) for fitness-related phenotypic traits in natural populations. © 2013 Blackwell Publishing Ltd.
Coding SNP in tenascin-C Fn-III-D domain associates with adult asthma.
Matsuda, Akira; Hirota, Tomomitsu; Akahoshi, Mitsuteru; Shimizu, Makiko; Tamari, Mayumi; Miyatake, Akihiko; Takahashi, Atsushi; Nakashima, Kazuko; Takahashi, Naomi; Obara, Kazuhiko; Yuyama, Noriko; Doi, Satoru; Kamogawa, Yumiko; Enomoto, Tadao; Ohshima, Koichi; Tsunoda, Tatsuhiko; Miyatake, Shoichiro; Fujita, Kimie; Kusakabe, Moriaki; Izuhara, Kenji; Nakamura, Yusuke; Hopkin, Julian; Shirakawa, Taro
2005-10-01
The extracellular matrix glycoprotein tenascin-C (TNC) has been accepted as a valuable histopathological subepithelial marker for evaluating the severity of asthmatic disease and the therapeutic response to drugs. We found an association between adult asthma and an SNP in the region encoding the TNC fibronectin type III-D (Fn-III-D) domain in a case-control study of a Japanese population including 446 adult asthmatic patients and 658 healthy controls. The SNP (44513A/T in exon 17) strongly associates with adult bronchial asthma (χ² test, P=0.00019, odds ratio=1.76, 95% confidence interval=1.31-2.36). This coding SNP induces an amino acid substitution (Leu1677Ile) within the Fn-III-D domain of the alternative splicing region. Computer-assisted protein structure modeling suggests that the substituted amino acid is located at the outer edge of the beta-sheet in the Fn-III-D domain and destabilizes this beta-sheet. As the TNC fibronectin-III domain has molecular elasticity, the structural change may affect the integrity and stiffness of asthmatic airways. In addition, TNC expression in lung fibroblasts increases with Th2 immune cytokine stimulation. Thus, Leu1677Ile may be a valuable marker for evaluating the risk of developing asthma and may play a role in its pathogenesis.
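The reported odds ratio and confidence interval can be cross-checked against each other. The sketch below assumes the interval is a Wald-type 95% interval, symmetric on the log scale; the abstract does not state how it was computed, so this is an assumption rather than the authors' method. Under that assumption the geometric mean of the bounds should recover the odds ratio, and the interval width gives the standard error of log(OR).

```python
import math

# Values reported in the abstract.
odds_ratio = 1.76
ci_low, ci_high = 1.31, 2.36

# Assumption: Wald-type 95% interval, i.e. log(OR) +/- 1.96 * SE.
implied_or = math.exp((math.log(ci_low) + math.log(ci_high)) / 2)
implied_se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

print(f"Odds ratio implied by the interval: {implied_or:.2f}")
print(f"Implied standard error of log(OR): {implied_se:.3f}")
```

If the two odds ratios agree to the reported precision, the numbers are at least internally consistent with a log-scale interval.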
Genetic variations of VDR/NR1I1 encoding vitamin D receptor in a Japanese population.
Ukaji, Maho; Saito, Yoshiro; Fukushima-Uesaka, Hiromi; Maekawa, Keiko; Katori, Noriko; Kaniwa, Nahoko; Yoshida, Teruhiko; Nokihara, Hiroshi; Sekine, Ikuo; Kunitoh, Hideo; Ohe, Yuichiro; Yamamoto, Noboru; Tamura, Tomohide; Saijo, Nagahiro; Sawada, Jun-ichi
2007-12-01
The vitamin D receptor (VDR) is a transcriptional factor responsive to 1alpha,25-dihydroxyvitamin D3 and lithocholic acid, and induces expression of drug metabolizing enzymes CYP3A4, CYP2B6 and CYP2C9. In this study, the promoter regions, 14 exons (including 6 exon 1's) and their flanking introns of VDR were comprehensively screened for genetic variations in 107 Japanese subjects. Sixty-one genetic variations including 25 novel ones were found: 9 in the 5'-flanking region, 2 in the 5'-untranslated region (UTR), 7 in the coding exons (5 synonymous and 2 nonsynonymous variations), 12 in the 3'-UTR, 19 in the introns between the exon 1's, and 12 in introns 2 to 8. Of these, one novel nonsynonymous variation, 154A>G (Met52Val), was detected with an allele frequency of 0.005. The single nucleotide polymorphisms (SNPs) that increase VDR expression or activity, -29649G>A, 2T>C and 1592(*308)C>A tagging linked variations in the 3'-UTR, were detected at 0.430, 0.636, and 0.318 allele frequencies, respectively. Another SNP, -26930A>G, with reduced VDR transcription was found at a 0.028 frequency. These findings would be useful for association studies on VDR variations in Japanese.
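To make the reported allele frequencies more concrete, the following sketch converts them into approximate allele copy numbers. It assumes every one of the 107 subjects was successfully genotyped at each site and that the loci are autosomal (two copies per subject); the abstract gives only the frequencies, so the copy numbers are back-calculated estimates.

```python
n_subjects = 107
n_chromosomes = 2 * n_subjects  # assumption: diploid autosomal locus, no missing calls

# Allele frequencies as reported in the abstract.
reported_frequencies = {
    "154A>G (Met52Val)": 0.005,
    "-29649G>A": 0.430,
    "2T>C": 0.636,
    "1592(*308)C>A": 0.318,
    "-26930A>G": 0.028,
}

# Back-calculated number of variant alleles among the sampled chromosomes.
implied_copies = {
    variant: round(freq * n_chromosomes)
    for variant, freq in reported_frequencies.items()
}

for variant, copies in implied_copies.items():
    print(f"{variant}: about {copies} of {n_chromosomes} chromosomes")
```

Under these assumptions the novel Met52Val variant corresponds to a single variant allele in the sample, which is why its frequency estimate carries considerable uncertainty.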
Van, K; Onoda, S; Kim, M Y; Kim, K D; Lee, S-H
2008-03-01
The Waxy (Wx) gene product controls the formation of a straight chain polymer of amylose in the starch pathway. Dominance/recessiveness of the Wx allele is associated with amylose content, leading to non-waxy/waxy phenotypes. For a total of 113 foxtail millet accessions, agronomic traits and the molecular differences of the Wx gene were surveyed to evaluate genetic diversity. Molecular types were associated with phenotypes determined by four specific primer sets (non-waxy, Type I; low amylose, Type VI; waxy, Type IV or V). Additionally, the insertion of a transposable element in waxy was confirmed by ex1/TSI2R, TSI2F/ex2, ex2int2/TSI7R and TSI7F/ex4r. Seventeen single nucleotide polymorphisms (SNPs) were observed in non-coding regions, while three SNPs in coding regions were non-synonymous. Interestingly, the phenotype of No. 88 was still non-waxy, although a seven-nucleotide (AATTGGT) insertion at 2,993 bp made the predicted protein 78 amino acids shorter. The rapid decline of r² in the sequenced region (exon 1-intron 1-exon 2) suggested a low level of linkage disequilibrium and limited haplotype structure. Ks values and estimation of evolutionary events indicate early divergence of S. italica among cereal crops. This study suggested the Wx gene was one of the targets in the selection process during domestication.
Lee, Hwan Young; Yoo, Ji-Eun; Park, Myung Jin; Chung, Ukhee; Kim, Chong-Youl; Shin, Kyoung-Jin
2006-11-01
The present study analyzed 21 coding region SNP markers and one deletion motif for the determination of East Asian mitochondrial DNA (mtDNA) haplogroups by designing three multiplex systems which apply single base extension methods. Using two multiplex systems, all 593 Korean mtDNAs were allocated into 15 haplogroups: M, D, D4, D5, G, M7, M8, M9, M10, M11, R, R9, B, A, and N9. As the D4 haplotypes occurred most frequently in Koreans, the third multiplex system was used to further define D4 subhaplogroups: D4a, D4b, D4e, D4g, D4h, and D4j. This method allowed the complementation of coding region information with control region mutation motifs and the resultant findings also suggest reliable control region mutation motifs for the assignment of East Asian mtDNA haplogroups. These three multiplex systems produce good results in degraded samples as they contain small PCR products (101-154 bp) for single base extension reactions. SNP scoring was performed in 101 old skeletal remains using these three systems to prove their utility in degraded samples. The sequence analysis of mtDNA control region with high incidence of haplogroup-specific mutations and the selective scoring of highly informative coding region SNPs using the three multiplex systems are useful tools for most applications involving East Asian mtDNA haplogroup determination and haplogroup-directed stringent quality control.
Milivojevic, Verica; Kranzler, Henry R; Gelernter, Joel; Burian, Linda; Covault, Jonathan
2011-05-01
Studies of alcohol effects in rodents and in vitro implicate endogenous neuroactive steroids as key mediators of alcohol effects at GABA(A) receptors. We used a case-control sample to test the association with alcohol dependence (AD) of single nucleotide polymorphisms in the genes encoding two key enzymes required for the generation of endogenous neuroactive steroids: 5α-reductase, type I (5α-R), and 3α-hydroxysteroid dehydrogenase, type 2 (3α-HSD), both of which are expressed in human brain. We focused on markers previously associated with a biological phenotype. For 5α-R, we examined the synonymous SRD5A1 exon 1 SNP rs248793, which has been associated with the ratio of dihydrotestosterone to testosterone. For 3α-HSD, we examined the nonsynonymous AKR1C3 SNP rs12529 (H5Q), which has been associated with bladder cancer. The SNPs were genotyped in a sample of 1,083 non-Hispanic Caucasians including 552 controls and 531 subjects with AD. The minor allele for both SNPs was more common among controls than subjects with AD: SRD5A1 rs248793 C-allele (χ²(1) = 7.6, p = 0.006) and AKR1C3 rs12529 G-allele (χ²(1) = 14.6, p = 0.0001). There was also an interaction of these alleles such that the "protective" effect of the minor allele at each marker for AD was conditional on the genotype of the second marker. We found evidence of an association with AD of polymorphisms in two genes encoding neuroactive steroid biosynthetic enzymes, providing indirect evidence that neuroactive steroids are important mediators of alcohol effects in humans. Copyright © 2011 by the Research Society on Alcoholism.
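The p-values quoted for the two allelic tests follow directly from the chi-square statistics with one degree of freedom. A minimal sketch using only the Python standard library is shown below; it relies on the identity that a chi-square variable with 1 df is the square of a standard normal variable, so the upper-tail probability equals erfc(sqrt(x/2)). The function name `chi2_sf_1df` is hypothetical.

```python
import math


def chi2_sf_1df(statistic):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics reported in the abstract.
p_srd5a1 = chi2_sf_1df(7.6)    # SRD5A1 rs248793 C-allele
p_akr1c3 = chi2_sf_1df(14.6)   # AKR1C3 rs12529 G-allele

print(f"SRD5A1 rs248793: p = {p_srd5a1:.4f}")
print(f"AKR1C3 rs12529:  p = {p_akr1c3:.5f}")
```

Rounded to the precision used in the abstract, both values match the reported p = 0.006 and p = 0.0001.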
Huertas-Vazquez, Adriana; Teodorescu, Carmen; Reinier, Kyndaron; Uy-Evanado, Audrey; Chugh, Harpriya; Jerger, Katherine; Ayala, Jo; Gunson, Karen; Jui, Jonathan; Newton-Cheh, Christopher; Albert, Christine M.; Chugh, Sumeet S.
2013-01-01
Background Both schizophrenia and epilepsy have been linked to increased risk of sudden cardiac death (SCD). We hypothesized that DNA variants within genes previously associated with schizophrenia and epilepsy may contribute to an increased risk of SCD. Objective To investigate the contribution to SCD susceptibility of DNA variants previously implicated in schizophrenia and epilepsy. Methods From the ongoing Oregon Sudden Unexpected Death Study, comparisons were performed among 340 SCD cases presenting with ventricular fibrillation and 342 controls. We tested for association between 17 SNPs mapped to 14 loci previously implicated in schizophrenia and epilepsy using logistic regression, assuming additive, dominant and recessive genetic models. Results The minor allele of the non-synonymous SNP rs10503929 within the Neuregulin 1 gene (NRG1) was associated with SCD under all three investigated models, with the strongest association for the recessive genetic model (recessive P=4.01×10⁻⁵, OR=4.04; additive P=2.84×10⁻⁷, OR=1.9 and dominant P=9.01×10⁻⁶, OR=2.06). To validate our findings, we further explored the association of this variant in the Harvard Cohort SCD study. The SNP rs10503929 was associated with an increased risk of SCD under the recessive genetic model (P=0.0005, OR=2.7). This missense variation causes a methionine to threonine change and functional effects are currently unknown. Conclusions The observed association between a schizophrenia-related NRG1 variant and SCD may represent the first evidence of coexisting genetic susceptibility between two conditions that have an established clinical overlap. Further investigation is warranted to explore the molecular mechanisms of this variant in the pathogenesis of SCD. PMID:23524320
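The additive, dominant and recessive models mentioned above differ only in how a genotype is turned into a regression predictor. The sketch below shows the conventional coding in terms of the number of minor alleles carried; the function name `encode_genotype` is hypothetical, and the abstract does not describe the authors' actual implementation.

```python
def encode_genotype(minor_allele_count, model):
    """Convert a minor-allele count (0, 1 or 2) into a model-specific predictor."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1 or 2")
    if model == "additive":
        # Each extra minor allele shifts the log-odds by the same amount.
        return minor_allele_count
    if model == "dominant":
        # One copy is enough to confer the full effect.
        return int(minor_allele_count >= 1)
    if model == "recessive":
        # Only minor-allele homozygotes carry the effect.
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


for model in ("additive", "dominant", "recessive"):
    print(model, [encode_genotype(count, model) for count in (0, 1, 2)])
```

The encoded value would then be used as the covariate in a logistic regression of case status, so the three models yield different odds ratios and p-values from the same genotypes.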
Haplotypes of CYP3A4 and their close linkage with CYP3A5 haplotypes in a Japanese population.
Fukushima-Uesaka, Hiromi; Saito, Yoshiro; Watanabe, Hidemi; Shiseki, Kisho; Saeki, Mayumi; Nakamura, Takahiro; Kurose, Kouichi; Sai, Kimie; Komamura, Kazuo; Ueno, Kazuyuki; Kamakura, Shiro; Kitakaze, Masafumi; Hanai, Sotaro; Nakajima, Toshiharu; Matsumoto, Kenji; Saito, Hirohisa; Goto, Yu-ichi; Kimura, Hideo; Katoh, Masaaki; Sugai, Kenji; Minami, Narihiro; Shirao, Kuniaki; Tamura, Tomohide; Yamamoto, Noboru; Minami, Hironobu; Ohtsu, Atsushi; Yoshida, Teruhiko; Saijo, Nagahiro; Kitamura, Yutaka; Kamatani, Naoyuki; Ozawa, Shogo; Sawada, Jun-ichi
2004-01-01
In order to identify single nucleotide polymorphisms (SNPs) and haplotype frequencies of CYP3A4 in a Japanese population, the distal enhancer and proximal promoter regions, all exons, and the surrounding introns were sequenced from genomic DNA of 416 Japanese subjects. We found 24 SNPs, including 17 novel ones: two in the distal enhancer, four in the proximal promoter, one in the 5'-untranslated region (UTR), seven in the introns, and three in the 3'-UTR. The most common SNP was c.1026+12G>A (IVS10+12G>A), with a 0.249 frequency. Four non-synonymous SNPs, c.554C>G (p.T185S, CYP3A4*16), c.830_831insA (p.E277fsX8, *6), c.878T>C (p.L293P, *18), and c.1088C>T (p.T363M, *11) were found with frequencies of 0.014, 0.001, 0.028, and 0.002, respectively. No SNP was found in the known nuclear transcriptional factor-binding sites in the enhancer and promoter regions. Using these 24 SNPs, 16 haplotypes were unambiguously identified, and nine haplotypes were inferred by aid of an expectation-maximization-based program. In addition, using data from 186 subjects enabled a close linkage to be found between CYP3A4 and CYP3A5 SNPs, especially among the SNPs at c.1026+12 in CYP3A4 and c.219-237 (IVS3-237, a key SNP site for CYP3A5*3), c.865+77 (IVS9+77) and c.1523 in CYP3A5. This result suggested that CYP3A4 and CYP3A5 are within the same gene block. Haplotype analysis between CYP3A4 and CYP3A5 revealed several major haplotype combinations in the CYP3A4-CYP3A5 block. Our findings provide fundamental and useful information for genotyping CYP3A4 (and CYP3A5) in the Japanese, and probably Asian populations. Copyright 2003 Wiley-Liss, Inc.
Glutamate decarboxylase genes and alcoholism in Han Taiwanese men.
Loh, El-Wui; Lane, Hsien-Yuan; Chen, Chien-Hsiun; Chang, Pi-Shan; Ku, Li-Wen; Wang, Kathy H T; Cheng, Andrew T A
2006-11-01
Glutamate decarboxylase (GAD), the rate-limiting enzyme in the synthesis of gamma-aminobutyric acid (GABA), may be involved in the development of alcoholism. This study examined the possible roles of the genes that code for 2 forms of GAD (GAD1 and GAD2) in the development of alcoholism. An association study was conducted among 140 male alcoholic subjects meeting the DSM-III-R criteria for alcohol dependence and 146 controls recruited from the Han Taiwanese in community and clinical settings. Psychiatric assessment of drinking conditions was conducted using a Chinese version of the Schedules for Clinical Assessment in Neuropsychiatry. The SHEsis and Haploview programs were used in statistical analyses. Nine single-nucleotide polymorphisms (SNPs) at the GAD1 gene were valid for further statistics. Between alcoholic subjects and controls, significant differences were found in genotype distributions of SNP1 (p=0.000), SNP2 (p=0.015), SNP4 (p=0.015), SNP5 (p=0.031), SNP6 (p=0.012), and SNP8 (p=0.004) and in allele distributions of SNP1 (p=0.001), SNP2 (p=0.009), and SNP8 (p=0.009). Permutation tests of SNP1, SNP2, and SNP8 demonstrated significant differences in allele frequencies but not in 2 major haplotype blocks. Three valid SNPs at the GAD2 gene demonstrated no associations with alcoholism. Further permutation tests in the only 1 haplotype block or individual SNPs demonstrated no significant differences. This is the first report indicating a possible significant role of the GAD1 gene in the development of alcohol dependence and/or the course of alcohol withdrawal and outcome of alcoholism.
TECPR2 Associated Neuroaxonal Dystrophy in Spanish Water Dogs
Jagannathan, Vidhya; Wohlsein, Peter; Baumgärtner, Wolfgang; Seehusen, Frauke; Spitzbarth, Ingo; Grandon, Rodrigo; Drögemüller, Cord; Jäderlund, Karin Hultin
2015-01-01
Clinical, pathological and genetic examination revealed an as yet uncharacterized juvenile-onset neuroaxonal dystrophy (NAD) in Spanish water dogs. Affected dogs presented with various neurological deficits including gait abnormalities and behavioral deficits. Histopathology demonstrated spheroid formation accentuated in the grey matter of the cerebral hemispheres, the cerebellum, the brain stem and in the sensory pathways of the spinal cord. Iron accumulation was absent. Ultrastructurally spheroids contained predominantly closely packed vesicles with a double-layered membrane, which were characterized as autophagosomes using immunohistochemistry. The family history of the four affected dogs suggested an autosomal recessive inheritance. SNP genotyping showed a single genomic region of extended homozygosity of 4.5 Mb in the four cases on CFA 8. Linkage analysis revealed a maximal parametric LOD score of 2.5 at this region. By whole genome re-sequencing of one affected dog, a perfectly associated, single, non-synonymous coding variant in the canine tectonin beta-propeller repeat-containing protein 2 (TECPR2) gene affecting a highly conserved region was detected (c.4009C>T or p.R1337W). This canine NAD form displays etiologic parallels to an inherited TECPR2 associated type of human hereditary spastic paraparesis (HSP). In contrast to the canine NAD, the spinal cord lesions in most types of human HSP involve the sensory and the motor pathways. Furthermore, the canine NAD form reveals similarities to cases of human NAD defined by widespread spheroid formation without iron accumulation in the basal ganglia. Thus TECPR2 should also be considered as candidate gene for human NAD. Immunohistochemistry and the ultrastructural findings further support the assumption, that TECPR2 regulates autophagosome accumulation in the autophagic pathways. 
Consequently, this report provides the first genetic characterization of juvenile canine NAD, describes the histopathological features associated with the TECPR2 mutation and provides evidence to emphasize the association between failure of autophagy and neurodegeneration. PMID:26555167
Ishikawa, Tetsuya
2017-05-26
To investigate genotype variation among induced pluripotent stem cell (iPSC) lines that were clonally generated from heterogeneous colon cancer tissues using next-generation sequencing. Human iPSC lines were clonally established by selecting independent single colonies expanded from heterogeneous primary cells of S-shaped colon cancer tissues by retroviral gene transfer (OCT3/4, SOX2, and KLF4). The ten iPSC lines, their starting cancer tissues, and the matched adjacent non-cancerous tissues were analyzed using next-generation sequencing and bioinformatics analysis using the human reference genome hg19. Non-synonymous single-nucleotide variants (SNVs) (missense, nonsense, and read-through) were identified within the target region of 612 genes related to cancer and the human kinome. All SNVs were annotated using dbSNP135, CCDS, RefSeq, GENCODE, and 1000 Genomes. The SNVs of the iPSC lines were compared with the genotypes of the cancerous and non-cancerous tissues. The putative genotypes were validated using allelic depth and genotype quality. For final confirmation, mutated genotypes were manually curated using the Integrative Genomics Viewer. In eight of the ten iPSC lines, one or two non-synonymous SNVs in EIF2AK2, TTN, ULK4, TSSK1B, FLT4, STK19, STK31, TRRAP, WNK1, PLK1 or PIK3R5 were identified as novel SNVs and were not identical to the genotypes found in the cancer and non-cancerous tissues. This result suggests that the SNVs were de novo or pre-existing mutations that originated from minor populations, such as multifocal pre-cancer (stem) cells or pre-metastatic cancer cells from multiple, different clonal evolutions, present within the heterogeneous cancer tissue. The genotypes of all ten iPSC lines were different from the mutated ERBB2 and MKNK2 genotypes of the cancer tissues and were identical to those of the non-cancerous tissues and that found in the human reference genome hg19. 
Furthermore, two of the ten iPSC lines did not have any confirmed mutated genotypes, despite being derived from cancerous tissue. These results suggest that the starting single cells can be traced to, and were preferentially derived from, pre-cancer (stem) cells, stromal cells such as cancer-associated fibroblasts, and immune cells that co-existed in the tissues alongside the mature cancer cells. The genotypes of iPSC lines derived from heterogeneous cancer tissues can provide information on the type of starting cell from which each iPSC line was generated.
Zhou, Xia; Tambo, Ernest; Su, Jing; Fang, Qiang; Ruan, Wei; Chen, Jun-Hu; Yin, Ming-Bo; Zhou, Xiao-Nong
2017-10-01
Plasmodium vivax merozoite surface protein-1 (PvMSP1) gene codes for a major malaria vaccine candidate antigen. However, its polymorphic nature represents an obstacle to the design of a protective vaccine. In this study, we analyzed the genetic polymorphism and natural selection of the C-terminal 42 kDa fragment within the PvMSP1 gene (PvMSP1-42) from 77 P. vivax isolates, collected from imported cases from China-Myanmar border (CMB) areas in Yunnan province and inland cases from Anhui, Yunnan, and Zhejiang provinces in China during 2009-2012. In total, 41 haplotypes were identified, 30 of which were new. The differences between the rates of non-synonymous and synonymous mutations suggest that PvMSP1-42 has evolved under natural selection, and that high selective pressure preferentially acted on regions within PvMSP1-33. Our results also demonstrated that PvMSP1-42 of P. vivax isolates collected in the China-Myanmar border areas displays higher genetic polymorphism than that of isolates collected from inland China. Such results have significant implications for understanding the dynamics of the P. vivax population and may provide useful information for China's malaria elimination campaign strategies.
Rapid evolution of avirulence genes in rice blast fungus Magnaporthe oryzae
2014-01-01
Background Rice blast fungus Magnaporthe oryzae is one of the most devastating pathogens in rice. Avirulence genes in this fungus share a gene-for-gene relationship with the resistance genes in its host rice. Although numerous studies have shown that rice blast R-genes are extremely diverse and evolve rapidly in their host populations, little is known about the evolutionary patterns of the Avr-genes in the pathogens. Results Here, six well-characterized Avr-genes and seven randomly selected non-Avr control genes were used to investigate the genetic variations in 62 rice blast strains from different parts of China. Frequent presence/absence polymorphisms, high levels of nucleotide variation (~10-fold higher than non-Avr genes), high non-synonymous to synonymous substitution ratios, and frequent shared non-synonymous substitution were observed in the Avr-genes of these diversified blast strains. In addition, most Avr-genes are closely associated with diverse repeated sequences, which may partially explain the frequent presence/absence polymorphisms in Avr-genes. Conclusion The frequent deletion and gain of Avr-genes and rapid non-synonymous variations might be the primary mechanisms underlying rapid adaptive evolution of pathogens toward virulence to their host plants, and these features can be used as the indicators for identifying additional Avr-genes. The high number of nucleotide polymorphisms among Avr-gene alleles could also be used to distinguish genetic groups among different strains. PMID:24725999
Vrancken, Bram; Suchard, Marc A; Lemey, Philippe
2017-07-01
Analyses of virus evolution in known transmission chains have the potential to elucidate the impact of transmission dynamics on the viral evolutionary rate and its difference within and between hosts. Lin et al. (2015, Journal of Virology, 89/7: 3512-22) recently investigated the evolutionary history of hepatitis B virus in a transmission chain and postulated that the 'colonization-adaptation-transmission' model can explain the differential impact of transmission on synonymous and non-synonymous substitution rates. Here, we revisit this dataset using a full probabilistic Bayesian phylogenetic framework that adequately accounts for the non-independence of sequence data when estimating evolutionary parameters. Examination of the transmission chain data under a flexible coalescent prior reveals a general inconsistency between the estimated timings and clustering patterns and the known transmission history, highlighting the need to incorporate host transmission information in the analysis. Using an explicit genealogical transmission chain model, we find strong support for a transmission-associated decrease of the overall evolutionary rate. However, in contrast to the initially reported larger transmission effect on non-synonymous substitution rate, we find a similar decrease in both non-synonymous and synonymous substitution rates that cannot be adequately explained by the colonization-adaptation-transmission model. An alternative explanation may involve a transmission/establishment advantage of hepatitis B virus variants that have accumulated fewer within-host substitutions, perhaps by spending more time in the covalently closed circular DNA state between each round of viral replication. More generally, this study illustrates that ignoring phylogenetic relationships can lead to misleading evolutionary estimates.
Valdisser, Paula A M R; Pereira, Wendell J; Almeida Filho, Jâneo E; Müller, Bárbara S F; Coelho, Gesimária R C; de Menezes, Ivandilson P P; Vianna, João P G; Zucchi, Maria I; Lanna, Anna C; Coelho, Alexandre S G; de Oliveira, Jaison P; Moraes, Alessandra da Cunha; Brondani, Claudio; Vianello, Rosana P
2017-05-30
Common bean is a legume of social and nutritional importance as a food crop, cultivated worldwide especially in developing countries, accounting for an important source of income for small farmers. The availability of the complete sequences of the two common bean genomes has dramatically accelerated genetic research and enabled new experimental strategies. DArTseq has been widely used as a method of SNP genotyping allowing comprehensive genome coverage with genetic applications in common bean breeding programs. Using this technology, 6286 SNPs (1 SNP/86.5 kbp) were genotyped in genic (43.3%) and non-genic regions (56.7%). Genetic subdivision associated with the common bean gene pools (K = 2) and related to grain types (K = 3 and K = 5) was reported. A total of 83% and 91% of all SNPs were polymorphic within the Andean and Mesoamerican gene pools, respectively, and 26% were able to differentiate the gene pools. Genetic diversity analysis revealed an average H_E of 0.442 for the whole collection, 0.102 for Andean and 0.168 for Mesoamerican gene pools (F_ST = 0.747 between gene pools), 0.440 for the group of cultivars and lines, and 0.448 for the group of landrace accessions (F_ST = 0.002 between cultivar/line and landrace groups). The SNP effects were predicted with predominance of impact on non-coding regions (77.8%). SNPs under selection were identified within gene pools comparing landrace and cultivar/line germplasm groups (Andean: 18; Mesoamerican: 69) and between the gene pools (59 SNPs), predominantly on chromosomes 1 and 9. The LD extension estimate corrected for population structure and relatedness (r²_SV) was ~ 88 kbp, while for the Andean gene pool it was ~ 395 kbp, and for the Mesoamerican ~ 130 kbp. For common bean, DArTseq provides an efficient and cost-effective strategy of generating SNPs for large-scale genome-wide studies. 
DArTseq yielded an operational panel of 560 polymorphic SNPs in linkage equilibrium, providing high genome coverage. This SNP set could be used in genotyping platforms with many applications, such as population genetics, phylogenetic relationships among common bean varieties, and support for molecular breeding approaches.
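For readers unfamiliar with the expected heterozygosity statistic quoted above, the sketch below computes it for a single locus as one minus the sum of squared allele frequencies. This is the textbook gene diversity formula without sample-size correction; the abstract does not say which estimator the authors used, so the function `expected_heterozygosity` and the example frequencies are illustrative assumptions, not values from the study.

```python
def expected_heterozygosity(allele_frequencies):
    """Gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    if abs(sum(allele_frequencies) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_frequencies)


# Hypothetical biallelic SNPs with different minor allele frequencies.
for minor in (0.05, 0.25, 0.50):
    h_e = expected_heterozygosity([1.0 - minor, minor])
    print(f"minor allele frequency {minor:.2f}: H_E = {h_e:.3f}")
```

For a biallelic SNP the statistic cannot exceed 0.5, which is reached when both alleles are equally frequent.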
NASA Astrophysics Data System (ADS)
Sun, Yanhong; Li, Qing; Wang, Guiying; Zhu, Dongmei; Chen, Jian; Li, Pei; Tong, Jingou
2017-05-01
Myostatin (MSTN) is a member of the transforming growth factor-β gene superfamily that negatively regulates skeletal muscle development and growth. In the present study, partial genomic fragments of Myostatin-1 (MSTN-1) in two commercial hatchery populations of Ancherythroculter nigrocauda, an economically important freshwater fish, were screened for single nucleotide polymorphisms (SNPs) and then genotyped by direct sequencing of PCR products. Five SNPs were identified in intron 1 and exon 2, including a non-synonymous mutation causing an amino acid change (Val to Ile) at position 180. Association analyses based on 300 individuals revealed that the g.1129T>C SNP locus was significantly associated with total length (TL), body length (BL), body height (BH) and body weight (BW) in 6- and 18-month-old populations, while the g.1289G>A locus was significantly associated with BH and BW in the 6-month-old population. Haplotype analyses revealed that fish with the genotype combinations TC/TC or TC/GA showed better growth performance. Our results suggest that g.1129T>C and g.1289G>A have positive effects on growth traits and may be candidate gene markers for marker-assisted selection in A. nigrocauda.
Shen, Qi; Zhang, Dong; Sun, Wei; Zhang, Yu-Jun; Shang, Zhi-Wei; Chen, Shi-Lin
2017-05-01
Perilla frutescens is one of 60 food and medicine plants in the initial directory announced by the health ministry of China. With the development of the Perilla field in recent years, the breeding and application of good varieties has become the main bottleneck of its development. This study applied systematic selection, supplemented by a marker-assisted method, to breed Perilla varieties. Through whole-genome sequencing and consistency matching, mutation loci were annotated according to genome data and compared with a database of common Perilla variants; finally, 30 non-synonymous SNPs were selected as characteristic markers of Zhongyan Feishu No.1. These SNP markers were used as the selection standard for Perilla varieties. The result was the new Perilla variety Zhongyan Feishu No.1, which is dual-use for leaf and seed, high-yielding and highly resistant, and can be used as green fertilizer. Zhongyan Feishu No.1 obtained new plant variety identification from Beijing city; the identification number is 2016054. Marker-assisted identification can guide the breeding of new plant varieties and provide a new reference for the breeding of medicinal plants. Copyright© by the Chinese Pharmaceutical Association.
Genetic contribution to 'theory of mind' in adolescence.
Warrier, Varun; Baron-Cohen, Simon
2018-02-22
Difficulties in 'theory of mind' (the ability to attribute mental states to oneself or others, and to make predictions about another's behaviour based on these attributions) have been observed in several psychiatric conditions. We investigate the genetic architecture of theory of mind in 4,577 13-year-olds who completed the Emotional Triangles Task (Triangles Task), a first-order test of theory of mind. We observe a small but significant female-advantage on the Triangles Task (Cohen's d = 0.19, P < 0.01), in keeping with previous work using other tests of theory of mind. Genome-wide association analyses did not identify any significant loci, and SNP heritability was non-significant. Polygenic scores for six psychiatric conditions (ADHD, anorexia, autism, bipolar disorder, depression, and schizophrenia), and empathy were not associated with scores on the Triangles Task. However, polygenic scores of cognitive aptitude, and cognitive empathy, a term synonymous with theory of mind and measured using the "Reading the Mind in the Eyes" Test, were significantly associated with scores on the Triangles Task at multiple P-value thresholds, suggesting shared genetics between different measures of theory of mind and cognition.
Erickson, Robert P.; Larson-Thome, Katherine; Weberg, Lyndon; Szybinska, Aleksandra; Mossakowska, Malgorzata; Styczynska, Maria; Barcikowska, Maria; Kuznicki, Jacek
2008-01-01
There is abundant evidence that cholesterol metabolism, especially as mediated by the intercellular transporter APOE, is involved in the pathogenesis of sporadic, late-onset Alzheimer disease (SLAD). Identification of other genes involved in SLAD pathogenesis has been hampered because gene association studies, whether individual or genome-wide, have difficulty finding appropriate controls, inasmuch as 25% or more of normal adults will develop SLAD. Using 152 centenarians as additional controls and 120 “regular,” 65- to 75-year-old controls, we show an association of genetic variation in NPC1 with SLAD and/or aging. In this preliminary study, we find gradients in the allele frequencies of two non-synonymous SNPs in NPC1 from centenarians through normal controls to SLAD in this non-stratified Polish population. An intervening intronic SNP is not in Hardy-Weinberg equilibrium and differs between centenarians and controls/SLAD. Haplotype frequencies determined by fastPHASE were somewhat different, and the predicted genotype frequencies were very different, among the 3 groups. These findings can also be interpreted as indicating a role for NPC1 in aging, a role also suggested by NPC1’s role in Dauer formation (hibernation, a longevity state) in C. elegans. PMID:18834923
Xu, Zhen-Hua; Thomae, Bianca A; Eckloff, Bruce W; Wieben, Eric D; Weinshilboum, Richard M
2003-06-01
3'-Phosphoadenosine 5'-phosphosulfate (PAPS) is the high-energy "sulfate donor" for reactions catalyzed by sulfotransferase (SULT) enzymes. The strict requirement of SULTs for PAPS suggests that PAPS synthesis might influence the rate of sulfate conjugation. In humans, PAPS is synthesized from ATP and SO(4)(2-) by two isoforms of PAPS synthetase (PAPSS): PAPSS1 and PAPSS2. As a step toward pharmacogenetic studies, we have resequenced the entire coding sequence of the human PAPSS1 gene, including exon-intron splice junctions, using DNA samples from 60 Caucasian-American and 58 African-American subjects. Twenty-one genetic polymorphisms were observed (1 insertion-deletion event and 20 single nucleotide polymorphisms [SNPs]), including two non-synonymous coding SNPs (cSNPs) that altered the following amino acids: Arg333Cys and Glu531Gln. Twelve pairs of these polymorphisms were tightly linked, and a total of twelve unequivocal haplotypes could be identified: two that were common to both ethnic groups and ten that were ethnic-specific. The Arg333Cys polymorphism, with an allele frequency of 2.5%, was observed only in DNA samples from Caucasian subjects. The Glu531Gln polymorphism was rare, with only a single copy of that allele in a DNA sample from an African-American subject. Transient expression in mammalian cells showed that neither of the non-synonymous cSNPs resulted in a change in the basal level of enzyme activity measured under optimal assay conditions. However, the Glu531Gln polymorphism altered the substrate kinetic properties of the enzyme. The Gln531 variant allozyme had a 5-fold higher K(m) value for SO(4)(2-) than did the wild-type allozyme and displayed monophasic kinetics for Na(2)SO(4). The wild-type allozyme (Glu531) showed biphasic kinetics for that substrate. These observations represent a step toward testing the hypothesis that genetic variation in PAPS synthesis catalyzed by PAPSS1 might alter in vivo sulfate conjugation.
Hamasaki-Katagiri, Nobuko; Lin, Brian C.; Simon, Jonathan; Hunt, Ryan C.; Schiller, Tal; Russek-Cohen, Estelle; Komar, Anton A.; Bar, Haim; Kimchi-Sarfaty, Chava
2016-01-01
Introduction Mutational analysis is commonly used to support the diagnosis and management of haemophilia. This has allowed for the generation of large mutation databases which provide unparalleled insight into genotype-phenotype relationships. Haemophilia is associated with inversions, deletions, insertions, nonsense and missense mutations. Both synonymous and non-synonymous mutations influence the base pairing of messenger RNA (mRNA), which can alter mRNA structure, cellular half-life and ribosome processivity/elongation. However, the role of mRNA structure in determining the pathogenicity of point mutations in haemophilia has not been evaluated. Aim To evaluate mRNA thermodynamic stability and associated RNA prediction software as a means to distinguish between neutral and disease-associated mutations in haemophilia. Methods Five mRNA structure prediction software programs were used to assess the thermodynamic stability of mRNA fragments carrying neutral vs. disease-associated and synonymous vs. non-synonymous point mutations in F8, F9 and a third X-linked gene, DMD (dystrophin). Results In F8 and DMD, disease-associated mutations tend to occur in more structurally stable mRNA regions, represented by lower MFE (minimum free energy) levels. In comparing multiple software packages for mRNA structure prediction, a 101–151 nucleotide fragment length appears to be a feasible range for structuring future studies. Conclusion mRNA thermodynamic stability is one predictive characteristic, which when combined with other RNA and protein features, may offer significant insight when screening sequencing data for novel disease-associated mutations. Our results also suggest potential utility in evaluating the mRNA thermodynamic stability profile of a gene when determining the viability of interchanging codons for biological and therapeutic applications. PMID:27933712
Miyamoto, T; Koh, E; Tsujimura, A; Miyagawa, Y; Saijo, Y; Namiki, M; Sengoku, K
2014-04-01
Genetic mechanisms have been implicated as a cause of some cases of male infertility. Recently, ten novel genes involved in human spermatogenesis, including human LRWD1, have been identified by expression microarray analysis of human testicular tissue. The human LRWD1 protein mediates the origin recognition complex in chromatin, which is critical for the initiation of pre-replication complex assembly in G1 and chromatin organization in post-G1 cells. Lrwd1 gene expression is specific to the testis in mice. Therefore, we hypothesized that mutations or polymorphisms of LRWD1 participate in male infertility, especially azoospermia. To investigate whether LRWD1 gene defects are associated with azoospermia caused by Sertoli cell-only syndrome (SCOS) and meiotic arrest (MA), mutational analysis was performed by direct sequencing of the coding regions in 100 Japanese patients with SCOS and 30 with MA. Statistical analysis was performed for patients with SCOS and MA and for 100 healthy control men. No mutations were found in LRWD1; however, three coding single-nucleotide polymorphisms (SNP1-SNP3) were detected in the patients. The genotype and allele frequencies of SNP1 and SNP2 were notably higher in the SCOS group than in the control group (P < 0.05). These results suggest a critical role for LRWD1 in human spermatogenesis. © 2013 Blackwell Verlag GmbH.
CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs.
Wang, Yi; Wang, Shuangshuang; Zhou, Dongjie; Yang, Shuai; Xu, Yongchao; Yang, Chao; Yang, Long
2016-07-01
SNPs (single nucleotide polymorphisms) are a popular tool for the study of genetic diversity, evolution, and other areas. It is therefore necessary to develop a convenient, useful, robust, rapid, and open-source SNP detection tool for all researchers. Because detecting SNPs requires special software and a series of steps, including alignment, detection, analysis and presentation, the study of SNPs is limited for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/ ) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative-segment SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. The platform contains the reference genomic sequences and coding sequences of 60 plant species and gives users new opportunities to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative-segment SNPs in their own sequences, gives users information about and analysis of the SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.
snpAD: An ancient DNA genotype caller.
Prüfer, Kay
2018-06-21
The study of ancient genomes can elucidate the evolutionary past. However, analyses are complicated by base-modifications in ancient DNA molecules that result in errors in DNA sequences. These errors are particularly common near the ends of sequences and pose a challenge for genotype calling. I describe an iterative method that estimates genotype frequencies and errors along sequences to allow for accurate genotype calling from ancient sequences. The implementation of this method, called snpAD, performs well on high-coverage ancient data, as shown by simulations and by subsampling the data of a high-coverage Neandertal genome. Although estimates for low-coverage genomes are less accurate, I am able to derive approximate estimates of heterozygosity from several low-coverage Neandertals. These estimates show that low heterozygosity, compared to modern humans, was common among Neandertals. The C++ code of snpAD is freely available at http://bioinf.eva.mpg.de/snpAD/. Supplementary data are available at Bioinformatics online.
Abdollahi-Arpanahi, Rostam; Morota, Gota; Valente, Bruno D; Kranis, Andreas; Rosa, Guilherme J M; Gianola, Daniel
2016-02-03
Genome-wide association studies in humans have found enrichment of trait-associated single nucleotide polymorphisms (SNPs) in coding regions of the genome and depletion of these in intergenic regions. However, a recent release of the ENCyclopedia of DNA elements showed that ~80 % of the human genome has a biochemical function. Similar studies on the chicken genome are lacking, thus assessing the relative contribution of its genic and non-genic regions to variation is relevant for biological studies and genetic improvement of chicken populations. A dataset including 1351 birds that were genotyped with the 600K Affymetrix platform was used. We partitioned SNPs according to genome annotation data into six classes to characterize the relative contribution of genic and non-genic regions to genetic variation as well as their predictive power using all available quality-filtered SNPs. Target traits were body weight, ultrasound measurement of breast muscle and hen house egg production in broiler chickens. Six genomic regions were considered: intergenic regions, introns, missense, synonymous, 5' and 3' untranslated regions, and regions that are located 5 kb upstream and downstream of coding genes. Genomic relationship matrices were constructed for each genomic region and fitted in the models, separately or simultaneously. Kernel-based ridge regression was used to estimate variance components and assess predictive ability. Contribution of each class of genomic regions to dominance variance was also considered. Variance component estimates indicated that all genomic regions contributed to marked additive genetic variation and that the class of synonymous regions tended to have the greatest contribution. The marked dominance genetic variation explained by each class of genomic regions was similar and negligible (~0.05). In terms of prediction mean-square error, the whole-genome approach showed the best predictive ability. 
All genic and non-genic regions contributed to phenotypic variation for the three traits studied. Overall, the contribution of additive genetic variance to the total genetic variance was much greater than that of dominance variance. Our results show that all genomic regions are important for the prediction of the targeted traits, and the whole-genome approach was reaffirmed as the best tool for genome-enabled prediction of quantitative traits.
Bečanović, Kristina; Nørremølle, Anne; Neal, Scott J; Kay, Chris; Collins, Jennifer A; Arenillas, David; Lilja, Tobias; Gaudenzi, Giulia; Manoharan, Shiana; Doty, Crystal N; Beck, Jessalyn; Lahiri, Nayana; Portales-Casamar, Elodie; Warby, Simon C; Connolly, Colúm; De Souza, Rebecca A G; Tabrizi, Sarah J; Hermanson, Ola; Langbehn, Douglas R; Hayden, Michael R; Wasserman, Wyeth W; Leavitt, Blair R
2015-06-01
Cis-regulatory variants that alter gene expression can modify disease expressivity, but none have previously been identified in Huntington disease (HD). Here we provide in vivo evidence in HD patients that cis-regulatory variants in the HTT promoter are bidirectional modifiers of HD age of onset. HTT promoter analysis identified a NF-κB binding site that regulates HTT promoter transcriptional activity. A non-coding SNP, rs13102260:G > A, in this binding site impaired NF-κB binding and reduced HTT transcriptional activity and HTT protein expression. The presence of the rs13102260 minor (A) variant on the HD disease allele was associated with delayed age of onset in familial cases, whereas the presence of the rs13102260 (A) variant on the wild-type HTT allele was associated with earlier age of onset in HD patients in an extreme case-based cohort. Our findings suggest a previously unknown mechanism linking allele-specific effects of rs13102260 on HTT expression to HD age of onset and have implications for HTT silencing treatments that are currently in development.
A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.
Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S
2016-01-01
Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings are validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. 
GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.
Clinical code set engineering for reusing EHR data for research: A review.
Williams, Richard; Kontopantelis, Evangelos; Buchan, Iain; Peek, Niels
2017-06-01
The construction of reliable, reusable clinical code sets is essential when re-using Electronic Health Record (EHR) data for research. Yet code set definitions are rarely transparent and their sharing is almost non-existent. There is a lack of methodological standards for the management (construction, sharing, revision and reuse) of clinical code sets which needs to be addressed to ensure the reliability and credibility of studies which use code sets. To review methodological literature on the management of sets of clinical codes used in research on clinical databases and to provide a list of best practice recommendations for future studies and software tools. We performed an exhaustive search for methodological papers about clinical code set engineering for re-using EHR data in research. This was supplemented with papers identified by snowball sampling. In addition, a list of e-phenotyping systems was constructed by merging references from several systematic reviews on this topic, and the processes adopted by those systems for code set management was reviewed. Thirty methodological papers were reviewed. Common approaches included: creating an initial list of synonyms for the condition of interest (n=20); making use of the hierarchical nature of coding terminologies during searching (n=23); reviewing sets with clinician input (n=20); and reusing and updating an existing code set (n=20). Several open source software tools (n=3) were discovered. There is a need for software tools that enable users to easily and quickly create, revise, extend, review and share code sets and we provide a list of recommendations for their design and implementation. Research re-using EHR data could be improved through the further development, more widespread use and routine reporting of the methods by which clinical codes were selected. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Bimolata, Waikhom; Kumar, Anirudh; Sundaram, Raman Meenakshi; Laha, Gouri Shankar; Qureshi, Insaf Ahmed; Reddy, Gajjala Ashok; Ghazi, Irfan Ahmad
2013-08-01
Xa27 is one of the important R-genes, effective against bacterial blight disease of rice caused by Xanthomonas oryzae pv. oryzae (Xoo). Using natural populations of Oryza, we analyzed the sequence variation in the functionally important domains of Xa27 across the Oryza species. DNA sequences of Xa27 alleles from 27 rice accessions revealed higher nucleotide diversity among the reported R-genes of rice. Sequence polymorphism analysis revealed synonymous and non-synonymous mutations in addition to a number of InDels in non-coding regions of the gene. High sequence variation was observed in the promoter region including the 5'UTR, with 'π' = 0.00916 and 'θw' = 0.01785. Comparative analysis of the identified Xa27 alleles with those of IRBB27 and IR24 indicated the operation of both positive selection (Ka/Ks > 1) and neutral selection (Ka/Ks ≈ 0). The genetic distances of alleles of the gene from Oryza nivara were nearer to IRBB27 than to IR24. We also found conserved and null UPT (upregulated by transcriptional activator) boxes in the isolated alleles. Considerable amino acid polymorphism was localized in the trans-membrane domain, the functional significance of which is yet to be elucidated. However, the absence of a functional UPT box in all the alleles except IRBB27 suggests the maintenance of a single resistant allele throughout the natural population.
Tumor taxonomy for the developmental lineage classification of neoplasms
Berman, Jules J
2004-01-01
Background The new "Developmental lineage classification of neoplasms" was described in a prior publication. The classification is simple (the entire hierarchy is described with just 39 classifiers), comprehensive (providing a place for every tumor of man), and consistent with recent attempts to characterize tumors by cytogenetic and molecular features. A taxonomy is a list of the instances that populate a classification. The taxonomy of neoplasia attempts to list every known term for every known tumor of man. Methods The taxonomy provides each concept with a unique code and groups synonymous terms under the same concept. A Perl script validated successive drafts of the taxonomy ensuring that: 1) each term occurs only once in the taxonomy; 2) each term occurs in only one tumor class; 3) each concept code occurs in one and only one hierarchical position in the classification; and 4) the file containing the classification and taxonomy is a well-formed XML (eXtensible Markup Language) document. Results The taxonomy currently contains 122,632 different terms encompassing 5,376 neoplasm concepts. Each concept has, on average, 23 synonyms. The taxonomy populates "The developmental lineage classification of neoplasms," and is available as an XML file, currently 9+ Megabytes in length. A representation of the classification/taxonomy listing each term followed by its code, followed by its full ancestry, is available as a flat-file, 19+ Megabytes in length. The taxonomy is the largest nomenclature of neoplasms, with more than twice the number of neoplasm names found in other medical nomenclatures, including the 2004 version of the Unified Medical Language System, the Systematized Nomenclature of Medicine Clinical Terminology, the National Cancer Institute's Thesaurus, and the International Classification of Diseases Oncology version.
Conclusions This manuscript describes a comprehensive taxonomy of neoplasia that collects synonymous terms under a unique code number and assigns each tumor to a single class within the tumor hierarchy. The entire classification and taxonomy are available as open access files (in XML and flat-file formats) with this article. PMID:15571625
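The four validation checks listed in the Methods above can be sketched as follows. The abstract states that the original validation was done with a Perl script; this Python version is only an illustrative reimplementation of the same rules, and the tuple layout (term, concept code, tumor class) and both function names are assumptions rather than the published file format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def validate_taxonomy(entries):
    """Check uniqueness rules on (term, concept_code, tumor_class) tuples.

    The tuple layout is hypothetical. Returns a list of error messages;
    an empty list means every check passed.
    """
    errors = []
    seen_terms = set()
    code_classes = defaultdict(set)
    for term, code, tumor_class in entries:
        key = term.strip().lower()
        # Rules 1 and 2: a term may appear once, hence in one class only.
        if key in seen_terms:
            errors.append(f"duplicate term: {term!r}")
        seen_terms.add(key)
        code_classes[code].add(tumor_class)
    # Rule 3: each concept code sits at exactly one position (class).
    for code, classes in sorted(code_classes.items()):
        if len(classes) > 1:
            errors.append(f"code {code} appears in {len(classes)} classes")
    return errors


def is_well_formed_xml(text):
    """Rule 4: the document must parse as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Toy entries only; the codes and class names are invented.
toy = [
    ("hepatoma", "C1", "endoderm"),
    ("liver cell carcinoma", "C1", "endoderm"),
    ("Hepatoma", "C2", "mesoderm"),
]
print(validate_taxonomy(toy))
print(is_well_formed_xml("<class><term>hepatoma</term></class>"))
```

Collecting all errors instead of stopping at the first one suits the use described in the abstract, where successive drafts of a large file are checked repeatedly.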
Improving coeliac disease risk prediction by testing non-HLA variants additional to HLA variants.
Romanos, Jihane; Rosén, Anna; Kumar, Vinod; Trynka, Gosia; Franke, Lude; Szperl, Agata; Gutierrez-Achury, Javier; van Diemen, Cleo C; Kanninga, Roan; Jankipersadsing, Soesma A; Steck, Andrea; Eisenbarth, Georges; van Heel, David A; Cukrowska, Bozena; Bruno, Valentina; Mazzilli, Maria Cristina; Núñez, Concepcion; Bilbao, Jose Ramon; Mearin, M Luisa; Barisani, Donatella; Rewers, Marian; Norris, Jill M; Ivarsson, Anneli; Boezen, H Marieke; Liu, Edwin; Wijmenga, Cisca
2014-03-01
The majority of coeliac disease (CD) patients are not being properly diagnosed and therefore remain untreated, leading to a greater risk of developing CD-associated complications. The major genetic risk heterodimer, HLA-DQ2 and DQ8, is already used clinically to help exclude disease. However, approximately 40% of the population carry these alleles and the majority never develop CD. We explored whether CD risk prediction can be improved by adding non-HLA-susceptible variants to common HLA testing. We developed an average weighted genetic risk score with 10, 26 and 57 single nucleotide polymorphisms (SNP) in 2675 cases and 2815 controls and assessed the improvement in risk prediction provided by the non-HLA SNP. Moreover, we assessed the transferability of the genetic risk model with 26 non-HLA variants to a nested case-control population (n=1709) and a prospective cohort (n=1245) and then tested how well this model predicted CD outcome for 985 independent individuals. Adding 57 non-HLA variants to HLA testing showed a statistically significant improvement compared to scores from models based on HLA only, HLA plus 10 SNP and HLA plus 26 SNP. With 57 non-HLA variants, the area under the receiver operator characteristic curve reached 0.854 compared to 0.823 for HLA only, and 11.1% of individuals were reclassified to a more accurate risk group. We show that the risk model with HLA plus 26 SNP is useful in independent populations. Predicting risk with 57 additional non-HLA variants improved the identification of potential CD patients. This demonstrates a possible role for combined HLA and non-HLA genetic testing in diagnostic work for CD.
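To make the idea of an average weighted genetic risk score concrete, here is a minimal Python sketch. It assumes the common formulation in which each SNP contributes its risk-allele count (0, 1 or 2) multiplied by the natural log of its odds ratio, and the sum is divided by the number of SNPs scored; the abstract does not give the exact formula used, and the SNP identifiers and odds ratios below are invented for illustration.

```python
import math


def weighted_risk_score(allele_counts, odds_ratios):
    """Average of risk-allele counts weighted by log odds ratios.

    allele_counts: dict of SNP id -> risk-allele count (0, 1 or 2).
    odds_ratios:   dict of SNP id -> per-allele odds ratio (hypothetical values).
    Only SNPs present in both dictionaries are scored.
    """
    shared = [snp for snp in allele_counts if snp in odds_ratios]
    if not shared:
        raise ValueError("no SNPs in common between genotypes and weights")
    total = sum(allele_counts[s] * math.log(odds_ratios[s]) for s in shared)
    return total / len(shared)


# Invented example: three non-HLA SNPs for one individual.
genotype = {"snpA": 2, "snpB": 1, "snpC": 0}
weights = {"snpA": 1.30, "snpB": 1.15, "snpC": 1.20}
print(weighted_risk_score(genotype, weights))
```

Dividing by the number of scored SNPs keeps scores comparable between individuals with different amounts of missing genotype data, which is one plausible reason for using an average rather than a plain sum.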
Xu, Qing; Mei, Gui; Sun, Dongxiao; Zhang, Qin; Zhang, Yuan; Yin, Cengceng; Chen, Huiyong; Ding, Xiangdong; Liu, Jianfeng
2012-11-02
We previously localized a quantitative trait locus (QTL) on bovine chromosome 6 affecting milk production traits to a 1.5-Mb region between BMS483 and MNB-209 via genome scanning followed by fine mapping. In total, 15 genes were mapped within this linkage region through bioinformatic analysis of the cattle-human comparative map and the bovine genome assembly. Of these, UDP-glucose dehydrogenase (UGDH) was suggested as a potential positional candidate gene for milk production traits based on its physiological and biochemical functions and genetic effects. By sequencing all the coding exons and the untranslated regions of UGDH in pooled DNA of 8 sires representing the separate families detected in our previous studies, a total of ten SNPs were identified and genotyped in 1417 Holstein cows of these 8 families. Individual SNP-based association analysis revealed 4 significant associations of SNP Ex1-1, SNP Int3-1, SNP Int5-1, and SNP Ex12-3 with milk yield (P < 0.05), and 2 significant associations of SNP Ex1-1 and SNP Ex12-3 with protein yield (P < 0.05). Furthermore, our haplotype-based association analyses indicated that haplotypes G-C-C, formed by SNP Ex12-2-SNP Int11-1-SNP Ex11-1, T-G, formed by SNP Int9-3-SNP Int9-2, and C-C, formed by SNP Int5-1-SNP Int3-1, are significantly associated with protein percentage (F=4.15; P=0.0418) and fat percentage (F=5.18~7.25; P=0.0072~0.0231). Finally, using an in vitro expression assay, we demonstrated that the A allele of SNP Ex1-1 and the T allele of SNP Ex11-1 of UGDH significantly decrease the expression of UGDH, by 68.0% at the RNA level and 50.1% at the protein level, suggesting that SNP Ex1-1 and Ex11-1 represent two functional polymorphisms affecting expression of UGDH and may partly contribute to the observed association of the gene with milk production traits in our samples. 
Taken together, our findings strongly indicate that UGDH gene could be involved in genetic variation underlying the QTL for milk production traits.
Simard, Frédéric; Licht, Monica; Besansky, Nora J.; Lehmann, Tovi
2007-01-01
Genetic variation in defensin, a gene encoding a major effector molecule of the insect immune response, was analyzed within and between populations of three members of the Anopheles gambiae complex. The species selected included the two anthropophilic species, An. gambiae and An. arabiensis, and the most zoophilic species of the complex, An. quadriannulatus. The first species was represented by four populations spanning its extreme genetic and geographical ranges, whereas each of the other two species was represented by a single population. We found (i) reduced overall polymorphism in the mature peptide region and in the total coding region, together with specific reductions in rare and moderately frequent mutations (sites) in the coding region compared with non-coding regions, (ii) markedly reduced rate of nonsynonymous diversity compared with synonymous variation in the mature peptide and virtually identical mature peptide across the three species, and (iii) increased divergence between species in the mature peptide together with reduced differentiation between populations of An. gambiae in the same DNA region. These patterns suggest strong purifying selection on the mature peptide and probably the whole coding region. Because An. quadriannulatus is not exposed to human pathogens, the identical mature peptide and similar pattern of polymorphism across species imply that human pathogens played no role as selective agents on this peptide. PMID:17161659
Ni, Tongtian; Chen, Min; Yang, Kang; Shao, Jianwei; Fu, Yi; Zhou, Weijun
2017-08-01
Given the important role of CD147 in the development of atherosclerosis, we speculated that CD147 genetic polymorphisms might influence the formation of carotid atherosclerotic plaques. This study aimed to investigate the association between CD147 gene polymorphisms and susceptibility to carotid atherosclerotic plaques in individuals with cerebral infarction (CI). Eight SNPs in the regulatory and coding regions of the CD147 gene were examined using polymerase chain reaction-ligase detection reaction (PCR-LDR) in DNA samples from 732 Chinese patients with CI, divided into a carotid plaque group (n=475) and a non-carotid plaque group (n=257). Significant differences were found in the genotype and allele frequencies of the rs4919862 SNP between the carotid plaque and non-carotid plaque groups of CI patients (P<0.05): the frequencies of the C allele and the CC genotype were significantly lower in the non-carotid plaque group than in the carotid plaque group, and the frequency of the T allele was significantly higher in the non-carotid plaque group than in the carotid plaque group (P<0.05). In addition, there was strong linkage disequilibrium among the rs4919862, rs8637 and rs8259 sites. In a haplotype analysis, the occurrence rate of the haplotype GATGCAGC was 2.095 times higher in the carotid plaque group than in the non-carotid plaque group (P<0.05). These results show that the rs4919862 SNP of CD147 was closely associated with carotid atherosclerotic plaque formation. Thus, polymorphisms of the CD147 gene may be related to the tendency to develop carotid atherosclerotic plaques. Copyright © 2017 Elsevier Ltd. All rights reserved.
Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta
Whittle, C. A.; Sun, Y.; Johannesson, H.
2011-01-01
Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
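Relative synonymous codon usage (RSCU), the statistic used above to define optimal codons, is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The following is a minimal Python sketch of that calculation, assuming the standard genetic code and an in-frame coding sequence; the function name `rscu` and the toy sequence are illustrative and are not taken from the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every codon whose amino acid occurs in cds."""
    cds = cds.upper()
    # Split into complete in-frame codons; a trailing partial codon is ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families (one family per amino acid).
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            # observed / (total / family size)
            values[c] = counts[c] * len(family) / total
    return values


# Toy sequence (hypothetical): Ala-Ala-Ala-Ala-Lys-Lys
example = rscu("GCTGCTGCTGCCAAAAAG")
```

An RSCU above 1 means a codon is used more often than expected under equal usage; comparing RSCU between highly and lowly expressed genes is one common way to nominate optimal codons.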
Association of Germline CHEK2 Gene Variants with Risk and Prognosis of Non-Hodgkin Lymphoma
Havranek, Ondrej; Kleiblova, Petra; Hojny, Jan; Lhota, Filip; Soucek, Pavel; Trneny, Marek; Kleibl, Zdenek
2015-01-01
The checkpoint kinase 2 gene (CHEK2) codes for the CHK2 protein, an important mediator of the DNA damage response pathway. The CHEK2 gene has been recognized as a multi-cancer susceptibility gene; however, its role in non-Hodgkin lymphoma (NHL) remains unclear. We performed mutation analysis of the entire CHEK2 coding sequence in 340 NHL patients using denaturing high-performance liquid chromatography (DHPLC) and multiplex ligation-dependent probe amplification (MLPA). Identified hereditary variants were genotyped in 445 non-cancer controls. The influence of CHEK2 variants on disease risk was statistically evaluated. Identified CHEK2 germline variants included four truncating mutations (found in five patients and no control; P = 0.02) and nine missense variants (found in 21 patients and 12 controls; P = 0.02). Carriers of non-synonymous variants had an increased risk of NHL development [odds ratio (OR) 2.86; 95% confidence interval (CI) 1.42–5.79] and an unfavorable prognosis [hazard ratio (HR) of progression-free survival (PFS) 2.1; 95% CI 1.12–4.05]. In contrast, the most frequent intronic variant c.319+43dupA (identified in 22% of patients and 31% of controls) was associated with a decreased NHL risk (OR = 0.62; 95% CI 0.45–0.86), but its positive prognostic effect was limited to NHL patients with diffuse large B-cell lymphoma (DLBCL) treated by conventional chemotherapy without rituximab (HR-PFS 0.4; 94% CI 0.17–0.74). Our results show that germ-line CHEK2 mutations affecting protein coding sequence confer a moderately-increased risk of NHL, they are associated with an unfavorable NHL prognosis, and they may represent a valuable predictive biomarker for patients with DLBCL. PMID:26506619
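As a rough check on the direction and size of the c.319+43dupA association, an unadjusted odds ratio can be approximated from the rounded carrier percentages quoted above (22% of patients, 31% of controls). This is only a sketch: the published OR and confidence interval were presumably computed from exact counts, which the abstract does not give, and the helper name below is illustrative.

```python
def odds_ratio_from_proportions(p_cases, p_controls):
    """Unadjusted odds ratio from carrier proportions in cases and controls."""
    odds_cases = p_cases / (1 - p_cases)
    odds_controls = p_controls / (1 - p_controls)
    return odds_cases / odds_controls


# Rounded percentages from the abstract: 22% of patients, 31% of controls.
approx_or = odds_ratio_from_proportions(0.22, 0.31)
print(round(approx_or, 2))
```

The result is about 0.63, in line with the reported OR of 0.62; the small difference is expected when working from rounded percentages.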
Diniz, Mariana C; Olivon, Vania C; Tavares, Lívia D; Simplicio, Janaina A; Gonzaga, Natália A; de Souza, Daniele G; Bendhack, Lusiane M; Tirapelli, Carlos R; Bonaventura, Daniella
2017-05-01
We sought to determine the role of reactive oxygen species (ROS) in sodium nitroprusside (SNP)-induced tolerance. Additionally, we evaluated the role of ROS in NF-κB activation and pro-inflammatory cytokine production during SNP-induced tolerance. To induce in vitro tolerance, endothelium-intact or -denuded aortic rings isolated from male Balb-c mice were incubated for 15, 30, 45 or 60 min with SNP (10 nmol/L). Tolerance to SNP was observed after incubation of endothelium-denuded, but not endothelium-intact, aortas for 60 min with this inorganic nitrate. Pre-incubation of denuded rings with tiron (superoxide anion (O2−) scavenger), and the NADPH oxidase inhibitors apocynin and atorvastatin reversed SNP-induced tolerance. L-NAME (non-selective NOS inhibitor) and L-arginine (NOS substrate) also prevented SNP-induced tolerance. Similarly, ibuprofen (non-selective cyclooxygenase (COX) inhibitor), nimesulide (selective COX-2 inhibitor), AH6809 (prostaglandin PGF2α receptor antagonist) or SQ29584 [PGH2/thromboxane TXA2 receptor antagonist] reversed SNP-induced tolerance. Increased ROS generation was detected in tolerant arteries and both tiron and atorvastatin reversed this response. Tiron prevented the tolerance-induced increase in O2− and hydrogen peroxide (H2O2) levels. The increase in p65/NF-κB expression and TNF-α production in tolerant arteries was prevented by tiron. The major new finding of our study is that SNP-induced tolerance is mediated by NADPH-oxidase derived ROS and vasoconstrictor prostanoids derived from COX-2, which are capable of reducing the vasorelaxation induced by SNP. Additionally, we found that ROS mediate the activation of NF-κB and the production of TNF-α in tolerant arteries. These findings identify putative molecular mechanisms whereby SNP induces tolerance in the vasculature. Copyright © 2017 Elsevier Inc. All rights reserved.
Genetics of Inflammatory Bowel Diseases
McGovern, Dermot; Kugathasan, Subra; Cho, Judy H.
2015-01-01
In this Review, we provide an update on genome-wide association studies (GWAS) in inflammatory bowel disease (IBD). In addition, we summarize progress in defining the functional consequences of associated alleles for coding and non-coding genetic variation. In the small minority of loci where major association signals correspond to non-synonymous variation, we summarize studies defining their functional effects and implications for therapeutic targeting. Importantly, the large majority of GWAS-associated loci involve non-coding variation, many of which modulate levels of gene expression. Recent expression quantitative trait loci (eQTL) studies have established that expression of the large majority of human genes is regulated by non-coding genetic variation. Significant advances in defining the epigenetic landscape have demonstrated that IBD GWAS signals are highly enriched within cell-specific active enhancer marks. Studies in European ancestry populations have dominated the landscape of IBD genetics studies, but increasingly, studies in Asian and African-American populations are being reported. Common variation accounts for only a modest fraction of the predicted heritability and the role of rare genetic variation of higher effects (i.e. odds ratios markedly deviating from one) is increasingly being identified through sequencing efforts. These sequencing studies have been particularly productive in very-early onset, more severe cases. A major challenge in IBD genetics will be harnessing the vast array of genetic discovery for clinical utility, through emerging precision medicine initiatives. We discuss the rapidly evolving area of direct to consumer genetic testing, as well as the current utility of clinical exome sequencing, especially in very early onset, severe IBD cases. We summarize recent progress in the pharmacogenetics of IBD with respect to partitioning patient responses to anti-TNF and thiopurine therapies.
Highly collaborative studies across research centers and across subspecialties and disciplines will be required to fully realize the promise of genetic discovery in IBD. PMID:26255561
Insights into HLA-G Genetics Provided by Worldwide Haplotype Diversity
Castelli, Erick C.; Ramalho, Jaqueline; Porto, Iane O. P.; Lima, Thálitta H. A.; Felício, Leandro P.; Sabbagh, Audrey; Donadi, Eduardo A.; Mendes-Junior, Celso T.
2014-01-01
Human leukocyte antigen G (HLA-G) belongs to the family of non-classical HLA class I genes, located within the major histocompatibility complex (MHC). HLA-G has been the target of most recent research regarding the function of class I non-classical genes. The main features that distinguish HLA-G from classical class I genes are (a) limited protein variability, (b) alternative splicing generating several membrane bound and soluble isoforms, (c) short cytoplasmic tail, (d) modulation of immune response (immune tolerance), and (e) restricted expression to certain tissues. In the present work, we describe the HLA-G gene structure and address the HLA-G variability and haplotype diversity among several populations around the world, considering each of its major segments [promoter, coding, and 3′ untranslated region (UTR)]. For this purpose, we developed a pipeline to reevaluate the 1000Genomes data and recover miscalled or missing genotypes and haplotypes. It became clear that the overall structure of the HLA-G molecule has been maintained during the evolutionary process and that most of the variation sites found in the HLA-G coding region are either coding synonymous or intronic mutations. In addition, only a few frequent and divergent extended haplotypes are found when the promoter, coding, and 3′UTRs are evaluated together. The divergence is particularly evident for the regulatory regions. The population comparisons confirmed that most of the HLA-G variability has originated before human dispersion from Africa and that the allele and haplotype frequencies have probably been shaped by strong selective pressures. PMID:25339953
Problem-Solving Test: The Effect of Synonymous Codons on Gene Expression
ERIC Educational Resources Information Center
Szeberenyi, Jozsef
2009-01-01
Terms to be familiar with before you start to solve the test: the genetic code, codon, degenerate codons, protein synthesis, aminoacyl-tRNA, anticodon, antiparallel orientation, wobble, unambiguous codons, ribosomes, initiation, elongation and termination of translation, peptidyl transferase, translocation, degenerate oligonucleotides, green…
Gershony, L C; Penedo, M C T; Davis, B W; Murphy, W J; Helps, C R; Lyons, L A
2014-12-01
Coat colours and patterns are highly variable in cats and are determined mainly by several genes with Mendelian inheritance. A 2-bp deletion in agouti signalling protein (ASIP) is associated with melanism in domestic cats. Bengal cats are hybrids between domestic cats and Asian leopard cats (Prionailurus bengalensis), and the charcoal coat colouration/pattern in Bengals presents as a possible incomplete melanism. The complete coding region of ASIP was directly sequenced in Asian leopard, domestic and Bengal cats. Twenty-seven variants were identified between domestic and leopard cats and were investigated in Bengals and Savannahs, a hybrid with servals (Leptailurus serval). The leopard cat ASIP haplotype was distinguished from domestic cat by four synonymous and four non-synonymous exonic SNPs, as well as 19 intronic variants, including a 42-bp deletion in intron 4. Fifty-six of 64 reported charcoal cats were compound heterozygotes at ASIP, with leopard cat agouti (A^Pbe) and domestic cat non-agouti (a) haplotypes. Twenty-four Bengals had an additional unique haplotype (A2) for exon 2 that was not identified in leopard cats, servals or jungle cats (Felis chaus). The compound heterozygote state suggests the leopard cat allele, in combination with the recessive non-agouti allele, influences Bengal markings, producing a darker, yet not completely melanistic coat. This is the first validation of a leopard cat allele segregating in the Bengal breed and likely affecting their overall pelage phenotype. Genetic testing services need to be aware of the possible segregation of wild felid alleles in all assays performed on hybrid cats. © 2014 The Authors. Animal Genetics published by John Wiley & Sons Ltd on behalf of Stichting International Foundation for Animal Genetics.
Gardner, Shea N.; Hall, Barry G.
2013-01-01
Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from Genbank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four “raw read” genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths. PMID:24349125
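The idea of alignment-free, k-mer based SNP discovery described above can be illustrated with a toy sketch: for an odd k, k-mers that share the same flanking bases but differ at the central base across genomes mark a candidate SNP. This is a simplified illustration under stated assumptions (forward strand only, no reverse complements, no filtering of repeats), not kSNP's actual implementation; all names and sequences below are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_snps(genomes, k):
    """Map (left flank, right flank) -> {central base: [genome names]}.

    Only contexts observed with more than one central base are returned.
    A context that is repeated within a single genome with different
    central bases would also be reported; a real tool has to filter these.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that there is a central base")
    mid = k // 2
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        for kmer in kmers(seq.upper(), k):
            context = (kmer[:mid], kmer[mid + 1:])
            loci[context][kmer[mid]].add(name)
    return {
        ctx: {base: sorted(names) for base, names in alleles.items()}
        for ctx, alleles in loci.items()
        if len(alleles) > 1
    }


# Two toy genomes differing at a single position (G vs A).
genomes = {
    "g1": "ACGTACGGTCA",
    "g2": "ACGTACAGTCA",
}
snps = candidate_snps(genomes, 5)
```

With k = 5, the only context shared by both toy genomes with different central bases is AC_GT, carrying G in g1 and A in g2. No alignment or reference genome is needed, which is the property the abstract emphasizes.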
Hoang, Van L T; Innes, David J; Shaw, P Nicholas; Monteith, Gregory R; Gidley, Michael J; Dietzgen, Ralf G
2015-07-30
Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three varieties of mango (Mangifera indica L., family Anacardiaceae), Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM), and to determine associations with gene expression and mango flavonoid profiles. A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera) was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine ammonia-lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties.
We discovered an association between the extent of sequence variation and position in the pathway for up-stream genes. The high expression of PAL, C4H and CHS genes in mango peel compared to flesh is associated with high amounts of total phenolic contents in peels, which suggest that these genes have an influence on total flavonoid levels in mango fruit peel and flesh. In addition, the particularly high expression levels of ANR in KP and NDM peels compared to IW peel and the significant accumulation of its product epicatechin gallate (ECG) in those extracts reflects the rate-limiting role of ANR on ECG biosynthesis in mango.
Evidence of translation efficiency adaptation of the coding regions of the bacteriophage lambda.
Goz, Eli; Mioduser, Oriah; Diament, Alon; Tuller, Tamir
2017-08-01
Deciphering the way gene expression regulatory aspects are encoded in viral genomes is a challenging mission with ramifications related to all biomedical disciplines. Here, we aimed to understand how evolution shapes the bacteriophage lambda genes by performing a high resolution analysis of ribosomal profiling data and gene expression related synonymous/silent information encoded in bacteriophage coding regions. We demonstrated evidence of selection for distinct compositions of synonymous codons in early and late viral genes related to the adaptation of translation efficiency to different bacteriophage developmental stages. Specifically, we showed that evolution of viral coding regions is driven, among others, by selection for codons with higher decoding rates; during the initial/progressive stages of infection the decoding rates in early/late genes were found to be superior to those in late/early genes, respectively. Moreover, we argued that selection for translation efficiency could be partially explained by adaptation to the Escherichia coli tRNA pool and the fact that it can change during the bacteriophage life cycle. An analysis of additional aspects related to the expression of viral genes, such as mRNA folding and more complex/longer regulatory signals in the coding regions, is also reported. The reported conclusions are likely to be relevant also to additional viruses. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Smura, Teemu; Blomqvist, Soile; Vuorinen, Tytti; Ivanova, Olga; Samoilovich, Elena; Al-Hello, Haider; Savolainen-Kopra, Carita; Hovi, Tapani; Roivainen, Merja
2014-01-01
Genus Enterovirus (Family Picornaviridae) consists of twelve species divided into genetically diverse types by their capsid protein VP1 coding sequences. Each enterovirus type can further be divided into intra-typic sub-clusters (genotypes). The aim of this study was to elucidate what leads to the emergence of novel enterovirus clades (types and genotypes). An evolutionary analysis was conducted for a sub-group of Enterovirus C species that contains types Coxsackievirus A21 (CVA-21), CVA-24, Enterovirus C95 (EV-C95), EV-C96 and EV-C99. VP1 gene datasets were collected and analysed to infer the phylogeny, rate of evolution, nucleotide and amino acid substitution patterns and signs of selection. In the VP1 coding gene, high intra-typic sequence diversities and robust grouping into distinct genotypes within each type were detected. Within each type the majority of nucleotide substitutions were synonymous and the non-synonymous substitutions tended to cluster in distinct highly polymorphic sites. Signs of positive selection were detected in some of these highly polymorphic sites, while strong negative selection was indicated in most of the codons. Despite robust clustering to intra-typic genotypes, only a few genotype-specific ‘signature’ amino acids were detected. In contrast, when different enterovirus types were compared, there was a clear tendency towards fixation of type-specific ‘signature’ amino acids. The results suggest that permanent fixation of type-specific amino acids is a hallmark associated with evolution of different enterovirus types, whereas neutral evolution and/or (frequency-dependent) positive selection in a few highly polymorphic amino acid sites are the dominant forms of evolution when strains within an enterovirus type are compared. PMID:24695547
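The distinction between synonymous and non-synonymous substitutions that underlies the analysis above can be sketched in a few lines of Python. The example assumes the standard genetic code and two aligned, in-frame coding sequences of equal length; it counts differing codons rather than individual nucleotide changes and is not a substitute for a proper dN/dS estimator. The function name and sequences are illustrative only.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_codon_changes(ref, alt):
    """Return (synonymous, non_synonymous) counts of differing codons."""
    if len(ref) != len(alt) or len(ref) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame and equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(ref), 3):
        a, b = ref[i:i + 3].upper(), alt[i:i + 3].upper()
        if a == b:
            continue  # identical codon, nothing to classify
        if CODON_TABLE[a] == CODON_TABLE[b]:
            synonymous += 1      # same amino acid encoded
        else:
            non_synonymous += 1  # amino acid replaced
    return synonymous, non_synonymous


# Toy pair: GCT->GCC (Ala->Ala), AAA unchanged, GTT->ATT (Val->Ile)
result = classify_codon_changes("GCTAAAGTT", "GCCAAAATT")
```

Here one codon change is synonymous and one is non-synonymous; tallying such changes per site across many strains is the starting point for the selection tests mentioned in the abstract.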
NASA Astrophysics Data System (ADS)
He, Feng; Wen, Haishen; Yu, Dahui; Li, Jifang; Shi, Bao; Chen, Caifang; Zhang, Jiaren; Jin, Guoxiong; Chen, Xiaoyan; Shi, Dan; Yang, Yanping
2010-12-01
Follicle stimulating hormone β (FSHβ) of Japanese flounder (Paralichthys olivaceus) plays a key role in the regulation of gonadal development. This study aimed to investigate molecular genetic characteristics of the FSHβ gene and elucidate the effects of single nucleotide polymorphisms (SNPs) of FSHβ on reproductive traits in Japanese flounder. We used polymerase chain reaction single-strand conformation polymorphism (PCR-SSCP) and sequencing of the FSHβ gene in 60 individuals. We identified only one SNP (T/C), located in the coding region of exon 3 of FSHβ at position 340 bp of the gene; it did not lead to an amino acid change. Statistical analysis showed that the SNP was significantly associated with testosterone (T) level and gonadosomatic index (GSI) (P < 0.05). Individuals with genotype TC had significantly higher serum T levels and GSI (P < 0.05) than those with genotype CC. Therefore, the FSHβ gene could be a useful molecular marker in selection for prominent reproductive traits in Japanese flounder.
SNP_tools: A compact tool package for analysis and conversion of genotype data for MS-Excel
Chen, Bowang; Wilkening, Stefan; Drechsel, Marion; Hemminki, Kari
2009-01-01
Background Single nucleotide polymorphism (SNP) genotyping is a major activity in biomedical research. Scientists prefer to have easy access to the results, which may require conversions between data formats. First-hand SNP data is often entered or saved in the MS-Excel format, but this software lacks genetic and epidemiological functions. A general tool for basic genetic and epidemiological analysis and data conversion in MS-Excel is needed. Findings The SNP_tools package is prepared as an add-in for MS-Excel. The code is written in Visual Basic for Applications, embedded in the Microsoft Office package. This add-in is an easy-to-use tool for users with basic computer knowledge (and requirements for basic statistical analysis). Conclusion Our implementation for Microsoft Excel 2000-2007 in Microsoft Windows 2000, XP, Vista and Windows 7 beta can handle files in different formats and convert them into other formats. It is free software. PMID:19852806
SNP_tools: A compact tool package for analysis and conversion of genotype data for MS-Excel.
Chen, Bowang; Wilkening, Stefan; Drechsel, Marion; Hemminki, Kari
2009-10-23
Single nucleotide polymorphism (SNP) genotyping is a major activity in biomedical research. Scientists prefer to have easy access to the results, which may require conversions between data formats. First-hand SNP data is often entered or saved in the MS-Excel format, but this software lacks genetic and epidemiological functions. A general tool for basic genetic and epidemiological analysis and data conversion in MS-Excel is needed. The SNP_tools package is prepared as an add-in for MS-Excel. The code is written in Visual Basic for Applications, embedded in the Microsoft Office package. This add-in is an easy-to-use tool for users with basic computer knowledge (and requirements for basic statistical analysis). Our implementation for Microsoft Excel 2000-2007 in Microsoft Windows 2000, XP, Vista and Windows 7 beta can handle files in different formats and convert them into other formats. It is free software.
Diehl, William E.; Johnson, Welkin E.; Hunter, Eric
2013-01-01
All genes in the TRIM6/TRIM34/TRIM5/TRIM22 locus are type I interferon inducible, with TRIM5 and TRIM22 possessing antiviral properties. Evolutionary studies involving the TRIM6/34/5/22 locus have predominantly focused on the coding sequence of the genes, finding that TRIM5 and TRIM22 have undergone high rates of both non-synonymous nucleotide replacements and in-frame insertions and deletions. We sought to understand if divergent evolutionary pressures on TRIM6/34/5/22 coding regions have selected for modifications in the non-coding regions of these genes and explore whether such non-coding changes may influence the biological function of these genes. The transcribed genomic regions, including the introns, of TRIM6, TRIM34, TRIM5, and TRIM22 from ten Haplorhini primates and one prosimian species were analyzed for transposable element content. In Haplorhini species, TRIM5 displayed an exaggerated interspecies variability, predominantly resulting from changes in the composition of transposable elements in the large first and fourth introns. Multiple lineage-specific endogenous retroviral long terminal repeats (LTRs) were identified in the first intron of TRIM5 and TRIM22. In the prosimian genome, we identified a duplication of TRIM5 with a concomitant loss of TRIM22. The transposable element content of the prosimian TRIM5 genes appears to largely represent the shared Haplorhini/prosimian ancestral state for this gene. Furthermore, we demonstrated that one such differentially fixed LTR provides for species-specific transcriptional regulation of TRIM22 in response to p53 activation. Our results identify a previously unrecognized source of species-specific variation in the antiviral TRIM genes, which can lead to alterations in their transcriptional regulation. These observations suggest that there has existed long-term pressure for exaptation of retroviral LTRs in the non-coding regions of these genes. 
This likely resulted from serial viral challenges and provided a mechanism for rapid alteration of transcriptional regulation. To our knowledge, this represents the first report of persistent evolutionary pressure for the capture of retroviral LTR insertions. PMID:23516500
Epidemiology of angina pectoris: role of natural language processing of the medical record
Pakhomov, Serguei; Hemingway, Harry; Weston, Susan A.; Jacobsen, Steven J.; Rodeheffer, Richard; Roger, Véronique L.
2007-01-01
Background The diagnosis of angina is challenging as it relies on symptom descriptions. Natural language processing (NLP) of the electronic medical record (EMR) can provide access to such information contained in free text that may not be fully captured by conventional diagnostic coding. Objective To test the hypothesis that NLP of the EMR improves angina pectoris (AP) ascertainment over diagnostic codes. Methods Billing records of in- and out-patients were searched for ICD-9 codes for AP, chronic ischemic heart disease and chest pain. EMR clinical reports were searched electronically for 50 specific non-negated natural language synonyms to these ICD-9 codes. The two methods were compared to a standardized assessment of angina by Rose questionnaire for three diagnostic levels: unspecified chest pain, exertional chest pain, and Rose angina. Results Compared to the Rose questionnaire, the true positive rate of EMR-NLP for unspecified chest pain was 62% (95%CI:55–67) vs. 51% (95%CI:44–58) for diagnostic codes (p<0.001). For exertional chest pain, the EMR-NLP true positive rate was 71% (95%CI:61–80) vs. 62% (95%CI:52–73) for diagnostic codes (p=0.10). Both approaches had 88% (95%CI:65–100) true positive rate for Rose angina. The EMR-NLP method consistently identified more patients with exertional chest pain over 28-month follow-up. Conclusion EMR-NLP method improves the detection of unspecified and exertional chest pain cases compared to diagnostic codes. These findings have implications for epidemiological and clinical studies of angina pectoris. PMID:17383310
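The abstract does not describe how the search for non-negated synonyms was implemented. Purely as an illustration of the idea, the toy Python sketch below flags a term only when no negation cue appears in the few words before it within the same sentence. The cue list, window size, function name and example note are all assumptions made for this sketch, not details from the study.

```python
import re

# Hypothetical, deliberately short list of negation cues.
NEGATION_CUES = ("no", "denies", "denied", "without", "negative for")


def find_non_negated(text, terms, window=4):
    """Return terms that occur without a negation cue in the preceding words.

    Only the first occurrence of each term in a sentence is checked, and
    matching is by plain substring, so this is far cruder than a real
    clinical NLP system.
    """
    hits = []
    for sentence in re.split(r"[.;\n]+", text.lower()):
        for term in terms:
            pos = sentence.find(term.lower())
            if pos == -1:
                continue
            # Look only at the last `window` words before the term.
            prefix = " ".join(sentence[:pos].split()[-window:])
            negated = any(
                re.search(r"\b" + re.escape(cue) + r"\b", prefix)
                for cue in NEGATION_CUES
            )
            if not negated:
                hits.append(term)
    return hits


note = "Patient denies chest pain at rest. Reports angina on exertion."
hits = find_non_negated(note, ["chest pain", "angina"])
```

In this toy note, "chest pain" is preceded by "denies" and is skipped, while "angina" is reported. Free-text evidence of this kind is what diagnostic codes alone can miss.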
Schnitzler, Fabian; Friedrich, Matthias; Wolf, Christiane; Angelberger, Marianne; Diegelmann, Julia; Olszak, Torsten; Beigel, Florian; Tillack, Cornelia; Stallhofer, Johannes; Göke, Burkhard; Glas, Jürgen; Lohse, Peter; Brand, Stephan
2014-01-01
Very recently, a sub-analysis of genome-wide association scans revealed that the non-coding single nucleotide polymorphism (SNP) rs12212067 in the FOXO3A gene is associated with a milder course of Crohn's disease (CD) (Cell 2013;155:57-69). The aim of our study was to evaluate the clinical value of the SNP rs12212067 in predicting the severity of CD by correlating CD patient genotype status with the most relevant complications of CD such as stenoses, fistulas, and CD-related surgery. We genotyped 550 CD patients for rs12212067 (FOXO3A) and the three common CD-associated NOD2 mutations rs2066844, rs2066845, and rs2066847 and performed genotype-phenotype analyses. No significant phenotypic differences were found between the wild-type genotype TT of the FOXO3A SNP rs12212067 and the minor genotypes TG and GG, independent of NOD2 variants. The allele frequency of the minor G allele was 12.7%. Age at diagnosis, disease duration, body mass index, surgery rate, stenoses, fistula, need for immunosuppressive therapy, and disease course were not significantly different. In contrast, the NOD2 mutant p.Leu1007fsX1008 (rs2066847) was highly associated with penetrating CD (p = 0.01), the development of fistulas (p = 0.01) and stenoses (p = 0.01), and ileal disease localization (p = 0.03). Importantly, the NOD2 SNP rs2066847 was a strong separator between an aggressive and a mild course of CD (p = 2.99×10(-5)), while the FOXO3A SNP rs12212067 did not separate between mild and aggressive CD behavior in our cohort (p = 0.35). 96.2% of the homozygous NOD2 p.Leu1007fsX1008 carriers had an aggressive disease behavior compared to 69.3% of the patients with the NOD2 wild-type genotype (p = 0.007).
In clinical practice, the NOD2 variant p.Leu1007fsX1008 (rs2066847), in particular in homozygous form, is a much stronger marker for a severe clinical phenotype than the FOXO3A rs12212067 SNP for a mild disease course on an individual patient level despite its important impact on the inflammatory response of monocytes.
Sallman, David A.; Basiorka, Ashley A.; Irvine, Brittany A.; Zhang, Ling; Epling-Burnette, P.K.; Rollison, Dana E.; Mallo, Mar; Sokol, Lubomir; Solé, Francesc; Maciejewski, Jaroslaw; List, Alan F.
2015-01-01
P53 is a key regulator of many cellular processes and is negatively regulated by the human homolog of murine double minute-2 (MDM2) E3 ubiquitin ligase. Single nucleotide polymorphisms (SNPs) of either gene alone, and in combination, are linked to cancer susceptibility, disease progression, and therapy response. We analyzed the interaction of TP53 R72P and MDM2 SNP309 SNPs in relationship to outcome in patients with myelodysplastic syndromes (MDS). Sanger sequencing was performed on DNA isolated from 208 MDS cases. Utilizing a novel functional SNP scoring system ranging from +2 to −2 based on predicted p53 activity, we found statistically significant differences in overall survival (OS) (p = 0.02) and progression-free survival (PFS) (p = 0.02) in non-del(5q) MDS patients with low functional scores. In univariate analysis, only IPSS and the functional SNP score predicted OS and PFS in non-del(5q) patients. In multivariate analysis, the functional SNP score was independent of IPSS for OS and PFS. These data underscore the importance of TP53 R72P and MDM2 SNP309 SNPs in MDS, and provide a novel scoring system independent of IPSS that is predictive for disease outcome. PMID:26416416
Keene, Keith L; Chen, Wei-Min; Chen, Fang; Williams, Stephen R; Elkhatib, Stacey D; Hsu, Fang-Chi; Mychaleckyj, Josyf C; Doheny, Kimberly F; Pugh, Elizabeth W; Ling, Hua; Laurie, Cathy C; Gogarten, Stephanie M; Madden, Ebony B; Worrall, Bradford B; Sale, Michele M
2014-01-01
B vitamins play an important role in homocysteine metabolism, with vitamin deficiencies resulting in increased levels of homocysteine and increased risk for stroke. We performed a genome-wide association study (GWAS) in 2,100 stroke patients from the Vitamin Intervention for Stroke Prevention (VISP) trial, a clinical trial designed to determine whether the daily intake of high-dose folic acid, vitamins B6, and B12 reduce recurrent cerebral infarction. Extensive quality control (QC) measures resulted in a total of 737,081 SNPs for analysis. Genome-wide association analyses for baseline quantitative measures of folate, Vitamins B12, and B6 were completed using linear regression approaches, implemented in PLINK. Six associations met or exceeded genome-wide significance (P ≤ 5 × 10(-08)). For baseline Vitamin B12, the strongest association was observed with a non-synonymous SNP (nsSNP) located in the CUBN gene (P = 1.76 × 10(-13)). Two additional CUBN intronic SNPs demonstrated strong associations with B12 (P = 2.92 × 10(-10) and 4.11 × 10(-10)), while a second nsSNP, located in the TCN1 gene, also reached genome-wide significance (P = 5.14 × 10(-11)). For baseline measures of Vitamin B6, we identified genome-wide significant associations for SNPs at the ALPL locus (rs1697421; P = 7.06 × 10(-10) and rs1780316; P = 2.25 × 10(-08)). In addition to the six genome-wide significant associations, nine SNPs (two for Vitamin B6, six for Vitamin B12, and one for folate measures) provided suggestive evidence for association (P ≤ 10(-07)). Our GWAS study has identified six genome-wide significant associations, nine suggestive associations, and successfully replicated 5 of 16 SNPs previously reported to be associated with measures of B vitamins. 
The six genome-wide significant associations are located in gene regions that have shown previous associations with measures of B vitamins; however, four of the nine suggestive associations represent novel findings and warrant further investigation in additional populations.
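As a quick check of the significance counts in this abstract, the sketch below compares the six reported P values against the conventional genome-wide threshold of 5 × 10(-08). The Bonferroni figure is an added point of comparison, not something the authors report, and the dictionary labels are informal names chosen here for readability.

```python
# Number of SNPs that passed quality control, as stated in the abstract.
n_snps = 737_081

# Bonferroni-corrected threshold for alpha = 0.05 (shown for comparison only).
bonferroni = 0.05 / n_snps

# Conventional genome-wide significance threshold used in the abstract.
GENOME_WIDE = 5e-8

# P values quoted in the abstract; the keys are informal labels.
reported = {
    "CUBN nsSNP (B12)": 1.76e-13,
    "CUBN intronic SNP 1 (B12)": 2.92e-10,
    "CUBN intronic SNP 2 (B12)": 4.11e-10,
    "TCN1 nsSNP (B12)": 5.14e-11,
    "ALPL rs1697421 (B6)": 7.06e-10,
    "ALPL rs1780316 (B6)": 2.25e-08,
}

significant = [name for name, p in reported.items() if p <= GENOME_WIDE]
print(f"Bonferroni threshold: {bonferroni:.2e}")
print(f"Genome-wide significant associations: {len(significant)}")
```

The Bonferroni value for this SNP count is slightly less strict than 5 × 10(-08), so all six associations pass under either threshold.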
Analysis and visualization of chromosomal abnormalities in SNP data with SNPscan
Ting, Jason C; Ye, Ying; Thomas, George H; Ruczinski, Ingo; Pevsner, Jonathan
2006-01-01
Background A variety of diseases are caused by chromosomal abnormalities such as aneuploidies (having an abnormal number of chromosomes), microdeletions, microduplications, and uniparental disomy. High density single nucleotide polymorphism (SNP) microarrays provide information on chromosomal copy number changes, as well as genotype (heterozygosity and homozygosity). SNP array studies generate multiple types of data for each SNP site, some with more than 100,000 SNPs represented on each array. The identification of different classes of anomalies within SNP data has been challenging. Results We have developed SNPscan, a web-accessible tool to analyze and visualize high density SNP data. It enables researchers (1) to visually and quantitatively assess the quality of user-generated SNP data relative to a benchmark data set derived from a control population, (2) to display SNP intensity and allelic call data in order to detect chromosomal copy number anomalies (duplications and deletions), (3) to display uniparental isodisomy based on loss of heterozygosity (LOH) across genomic regions, (4) to compare paired samples (e.g. tumor and normal), and (5) to generate a file type for viewing SNP data in the University of California, Santa Cruz (UCSC) Human Genome Browser. SNPscan accepts data exported from Affymetrix Copy Number Analysis Tool as its input. We validated SNPscan using data generated from patients with known deletions, duplications, and uniparental disomy. We also inspected previously generated SNP data from 90 apparently normal individuals from the Centre d'Étude du Polymorphisme Humain (CEPH) collection, and identified three cases of uniparental isodisomy, four females having an apparently mosaic X chromosome, two mislabelled SNP data sets, and one microdeletion on chromosome 2 with mosaicism from an apparently normal female. These previously unrecognized abnormalities were all detected using SNPscan. 
The microdeletion was independently confirmed by fluorescence in situ hybridization, and a region of homozygosity in a UPD case was confirmed by sequencing of genomic DNA. Conclusion SNPscan is useful to identify chromosomal abnormalities based on SNP intensity (such as chromosomal copy number changes) and heterozygosity data (including regions of LOH and some cases of UPD). The program and source code are available at the SNPscan website. PMID:16420694
Diversity in the Toll-Like Receptor Genes of the African Penguin (Spheniscus demersus).
Dalton, Desiré Lee; Vermaak, Elaine; Roelofse, Marli; Kotze, Antoinette
2016-01-01
The African penguin, Spheniscus demersus, is listed as Endangered by the IUCN Red List of Threatened Species due to the drastic reduction in population numbers over the last 20 years. To date, the only studies on immunogenetic variation in penguins have been conducted on the major histocompatibility complex (MHC) genes. It was shown in humans that up to half of the genetic variability in immune responses to pathogens is located in non-MHC genes. Toll-like receptors (TLRs) are now increasingly being studied in a variety of taxa as a broader approach to determine functional genetic diversity. In this study, we confirm low genetic diversity in the innate immune region of African penguins, similar to that observed in the New Zealand robin, which has undergone several severe population bottlenecks. Single nucleotide polymorphism (SNP) diversity across TLRs varied between ex situ and in situ penguins, with the number of non-synonymous alterations in ex situ populations (n = 14) being reduced in comparison to in situ populations (n = 16). Maintaining adaptive diversity is of vital importance in the assurance populations as these animals may potentially be used in the future for re-introductions. Therefore, this study provides essential data on immune gene diversity in penguins and will assist in providing an additional tool to monitor African penguins in the wild, to monitor diversity in ex situ populations, and to ensure that diversity found in the in situ populations is captured in the assurance populations.
Anaya, Juan-Manuel; Kim-Howard, Xana; Prahalad, Sampath; Cherñavsky, Alejandra; Cañas, Carlos; Rojas-Villarraga, Adriana; Bohnsack, John; Jonsson, Roland; Bolstad, Anne Isine; Brun, Johan G; Cobb, Beth; Moser, Kathy L; James, Judith A; Harley, John B; Nath, Swapan K
2012-02-01
Many autoimmune diseases (ADs) share similar underlying pathology and have a tendency to cluster within families, supporting the involvement of shared susceptibility genes. To date, most of the genetic variants associated with systemic lupus erythematosus (SLE) susceptibility also show association with other ADs. ITGAM and its associated 'predisposing' variant (rs1143679, Arg77His), predicted to alter the tertiary structures of the ligand-binding domain of ITGAM, may play a key role in SLE pathogenesis. The aim of this study is to examine whether the ITGAM variant is also associated with other ADs. We evaluated case-control association between rs1143679 and ADs (N=18,457) including primary Sjögren's syndrome, systemic sclerosis, multiple sclerosis, rheumatoid arthritis, juvenile idiopathic arthritis, celiac disease, and type-1 diabetes. We also performed meta-analyses using our data in addition to available published data. Although the risk allele 'A' is relatively more frequent among cases for each disease, it was not significantly associated with any other ADs tested in this study. However, in the meta-analysis, systemic sclerosis was associated with rs1143679 (p(meta)=0.008). In summary, this study explored the role of ITGAM in general autoimmunity in seven non-lupus ADs, and only found association for systemic sclerosis when our results were combined with published results. Thus ITGAM may not be a general autoimmunity gene but this variant may be specifically associated with SLE and systemic sclerosis.
Oyiga, Benedict C; Sharma, Ram C; Baum, Michael; Ogbonnaya, Francis C; Léon, Jens; Ballvora, Agim
2018-05-01
The increasing salinization of agricultural lands is a threat to global wheat production. Understanding the mechanistic basis of salt tolerance (ST) is essential for developing breeding and selection strategies that would allow for increased wheat production under saline conditions to meet the increasing global demand. We used a set of 150 internationally derived winter and facultative wheat cultivars genotyped with a 90K SNP chip and phenotyped for ST across three growth stages and for ionic (leaf K+ and Na+ contents) traits to dissect the genetic architecture regulating ST in wheat. Genome-wide association mapping revealed 187 single nucleotide polymorphisms (SNPs) (R² = 3.00-30.67%), representing 37 quantitative trait loci (QTL), significantly associated with the ST traits. Of these, four QTL on 1BS, 2AL, 2BS and 3AL were associated with ST across the three growth stages and with the ionic traits. Novel QTL were also detected on 1BS and 1DL. Candidate genes linked to these polymorphisms were uncovered, and expression analyses were performed and validated on them under saline and non-saline conditions using transcriptomics and qRT-PCR data. Expressed sequence comparisons in contrasting ST wheat genotypes identified several non-synonymous/missense mutation sites that contribute to the ST trait variations, indicating the biological relevance of these polymorphisms that can be exploited in breeding for ST in wheat. © 2017 The Authors. Plant, Cell & Environment published by John Wiley & Sons Ltd.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 7 Agriculture 8 2010-01-01 2010-01-01 false Lot number. 987.102 Section 987.102 Agriculture... RIVERSIDE COUNTY, CALIFORNIA Administrative Rules Definitions § 987.102 Lot number. Lot number is synonymous with code and means a combination of letters or numbers, or both, acceptable to the Committee, showing...
Code of Federal Regulations, 2011 CFR
2011-01-01
... 7 Agriculture 8 2011-01-01 2011-01-01 false Lot number. 987.102 Section 987.102 Agriculture... RIVERSIDE COUNTY, CALIFORNIA Administrative Rules Definitions § 987.102 Lot number. Lot number is synonymous with code and means a combination of letters or numbers, or both, acceptable to the Committee, showing...
Code of Federal Regulations, 2013 CFR
2013-01-01
... 7 Agriculture 8 2013-01-01 2013-01-01 false Lot number. 987.102 Section 987.102 Agriculture... RIVERSIDE COUNTY, CALIFORNIA Administrative Rules Definitions § 987.102 Lot number. Lot number is synonymous with code and means a combination of letters or numbers, or both, acceptable to the Committee, showing...
Code of Federal Regulations, 2014 CFR
2014-01-01
... 7 Agriculture 8 2014-01-01 2014-01-01 false Lot number. 987.102 Section 987.102 Agriculture... RIVERSIDE COUNTY, CALIFORNIA Administrative Rules Definitions § 987.102 Lot number. Lot number is synonymous with code and means a combination of letters or numbers, or both, acceptable to the Committee, showing...
Code of Federal Regulations, 2012 CFR
2012-01-01
... 7 Agriculture 8 2012-01-01 2012-01-01 false Lot number. 987.102 Section 987.102 Agriculture... RIVERSIDE COUNTY, CALIFORNIA Administrative Rules Definitions § 987.102 Lot number. Lot number is synonymous with code and means a combination of letters or numbers, or both, acceptable to the Committee, showing...
Gardner, Shea N; Wagner, Mark C
2005-01-01
Background Microbial forensics is important in tracking the source of a pathogen, whether the disease is a naturally occurring outbreak or part of a criminal investigation. Results A method and SPR Opt (SNP and PCR-RFLP Optimization) software to perform a comprehensive, whole-genome analysis to forensically discriminate multiple sequences is presented. Tools for the optimization of forensic typing using Single Nucleotide Polymorphism (SNP) and PCR-Restriction Fragment Length Polymorphism (PCR-RFLP) analyses across multiple isolate sequences of a species are described. The PCR-RFLP analysis includes prediction and selection of optimal primers and restriction enzymes to enable maximum isolate discrimination based on sequence information. SPR Opt calculates all SNP or PCR-RFLP variations present in the sequences, groups them into haplotypes according to their co-segregation across those sequences, and performs combinatoric analyses to determine which sets of haplotypes provide maximal discrimination among all the input sequences. Those set combinations requiring that membership in the fewest haplotypes be queried (i.e. the fewest assays be performed) are found. These analyses highlight variable regions based on existing sequence data. These markers may be heterogeneous among unsequenced isolates as well, and thus may be useful for characterizing the relationships among unsequenced as well as sequenced isolates. The predictions are multi-locus. Analyses of mumps and SARS viruses are summarized. Phylogenetic trees created based on SNPs, PCR-RFLPs, and full genomes are compared for SARS virus, illustrating that purported phylogenies based only on SNP or PCR-RFLP variations do not match those based on multiple sequence alignment of the full genomes. Conclusion This is the first software to optimize the selection of forensic markers to maximize information gained from the fewest assays, accepting whole or partial genome sequence data as input. 
As more sequence data becomes available for multiple strains and isolates of a species, automated, computational approaches such as those described here will be essential to make sense of large amounts of information, and to guide and optimize efforts in the laboratory. The software and source code for SPR Opt is publicly available and free for non-profit use at . PMID:15904493
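The grouping and combinatoric steps described for SPR Opt can be sketched roughly as follows. This is an illustrative reconstruction from the abstract only, not the SPR Opt source code: it assumes pre-aligned sequences of equal length, treats a "haplotype" as a set of alignment columns that partition the isolates identically, and uses a brute-force search. The function names are hypothetical.

```python
from itertools import combinations

def variable_patterns(seqs):
    """Group variable alignment columns by how they partition the sequences.

    Columns that co-segregate (split the isolates in the same way) share one
    pattern and are therefore interchangeable as markers."""
    groups = {}
    for pos in range(len(seqs[0])):
        column = [s[pos] for s in seqs]
        if len(set(column)) < 2:
            continue  # invariant column, carries no information
        # Relabel alleles by order of first appearance so that, for example,
        # A/A/C/C and G/G/T/T map to the same pattern (0, 0, 1, 1).
        labels = {}
        pattern = tuple(labels.setdefault(a, len(labels)) for a in column)
        groups.setdefault(pattern, []).append(pos)
    return groups

def minimal_discriminating_sets(seqs):
    """Return the smallest combinations of column groups that give every
    sequence a unique profile; each group is a list of equivalent columns."""
    groups = variable_patterns(seqs)
    patterns = list(groups)
    for k in range(1, len(patterns) + 1):
        hits = []
        for combo in combinations(patterns, k):
            profiles = {tuple(p[i] for p in combo) for i in range(len(seqs))}
            if len(profiles) == len(seqs):
                hits.append([groups[p] for p in combo])
        if hits:
            return hits
    return []  # some sequences are identical and cannot be separated
```

The exhaustive search grows combinatorially with the number of patterns, so a real tool working on whole genomes would need pruning or heuristics that this sketch omits.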
Norton, Heather L; Werren, Elizabeth; Friedlaender, Jonathan
2015-10-19
Variation in human skin pigmentation evolved in response to the selective pressure of ultra-violet radiation (UVR). Selection to maintain darker skin in high UVR environments is expected to constrain pigmentation phenotype and variation in pigmentation loci. Consistent with this hypothesis, the gene MC1R exhibits reduced diversity in African populations from high UVR regions compared to low-UVR non-African populations. However, MC1R diversity in non-African populations that have evolved under high-UVR conditions is not well characterized. In order to test the hypothesis that MC1R variation has been constrained in Melanesians the coding region of the MC1R gene was sequenced in 188 individuals from Northern Island Melanesia. The role of purifying selection was assessed using a modified McDonald Kreitman's test. Pairwise FST was calculated between Melanesian populations and populations from the 1000 Genomes Project. The SNP rs2228479 was genotyped in a larger sample (n = 635) of Melanesians and tested for associations with skin and hair pigmentation. We observe three nonsynonymous and two synonymous mutations. A modified McDonald Kreitman's test failed to detect a significant signal of purifying selection. Pairwise FST values calculated between the four islands sampled here indicate little regional substructure in MC1R. When compared to African, European, East and South Asian populations, Melanesians do not exhibit reduced population divergence (measured as FST) or a high proportion of haplotype sharing with Africans, as one might expect if ancestral haplotypes were conserved across high UVR populations in and out of Africa. The only common nonsynonymous polymorphism observed, rs2228479, is not significantly associated with skin or hair pigmentation in a larger sample of Melanesians. 
The pattern of sequence diversity here does not support a model of strong selective constraint on MC1R in Northern Island Melanesia. This absence of strong constraint, as well as the recent population history of the region, may explain the observed frequencies of the derived rs2228479 allele. These results emphasize the complex genetic architecture of pigmentation phenotypes, which are controlled by multiple, possibly interacting loci. They also highlight the role that population history can play in influencing phenotypic diversity in the absence of strong natural selection.
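For readers unfamiliar with the McDonald-Kreitman framework, the sketch below shows the standard (unmodified) form of the test as a 2x2 comparison of nonsynonymous and synonymous counts among polymorphisms and fixed differences. The polymorphism counts (three nonsynonymous, two synonymous) come from the abstract; the divergence counts are hypothetical placeholders, because the abstract does not report them, and the modified test the authors used may differ from this version.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); values above 1 suggest an excess of
    nonsynonymous polymorphism, as expected under purifying selection."""
    return (pn / ps) / (dn / ds)

# Pn and Ps are from the abstract; Dn and Ds are HYPOTHETICAL placeholders.
pn, ps = 3, 2
dn, ds = 1, 4

ni = neutrality_index(pn, ps, dn, ds)
p_value = fisher_exact_two_sided(pn, ps, dn, ds)
print(f"NI = {ni:.2f}, Fisher P = {p_value:.3f}")
```

With counts this small the test has very little power, which is consistent with the abstract's report that no significant signal of purifying selection was detected.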
Positive selection on the killer whale mitogenome.
Foote, Andrew D; Morin, Phillip A; Durban, John W; Pitman, Robert L; Wade, Paul; Willerslev, Eske; Gilbert, M Thomas P; da Fonseca, Rute R
2011-02-23
Mitochondria produce up to 95 per cent of the eukaryotic cell's energy. The coding genes of the mitochondrial DNA may therefore evolve under selection owing to metabolic requirements. The killer whale, Orcinus orca, is polymorphic, has a global distribution and occupies a range of ecological niches. It is therefore a suitable organism for testing this hypothesis. We compared a global dataset of the complete mitochondrial genomes of 139 individuals for amino acid changes that were associated with radical physico-chemical property changes and were influenced by positive selection. Two such selected non-synonymous amino acid changes were found; one in each of two ecotypes that inhabit the Antarctic pack ice. Both substitutions were associated with changes in local polarity, increased steric constraints and α-helical tendencies that could influence overall metabolic performance, suggesting a functional change.
Tantrawatpan, Chairat; Saijuntha, Weerachai; Sithithaworn, Paiboon; Andrews, Ross H; Petney, Trevor N
2013-01-01
Genetic differentiation between two synonymous echinostome species, Artyfechinostomum malayanum and Artyfechinostomum sufrartyfex, was determined using the first and second internal transcribed spacers (ITS1 and ITS2), non-coding regions of rDNA, as genetic markers. Of the 699 bp of combined ITS1 and ITS2 sequences examined, 18 variable nucleotide positions (2.58 %) were observed. Of these, 17 positions could be used as diagnostic positions between these two sibling species, whereas the remaining one was an intraspecific variation of A. malayanum. A clade of A. malayanum was closely aligned with A. sufrartyfex and clearly distant from the cluster of other echinostomes. Our results may be sufficient to suggest that the current synonymy of these species is not valid.
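The reported proportion of variable sites can be reproduced with a short calculation. The helper below is a generic sketch that assumes pre-aligned sequences of equal length; the function names and the toy alignment are illustrative and are not data from the study.

```python
def variable_positions(aligned):
    """Return the 0-based indices of alignment columns with more than one state."""
    return [i for i, column in enumerate(zip(*aligned)) if len(set(column)) > 1]

def percent_variable(n_variable, alignment_length):
    """Percentage of variable positions, rounded to two decimals."""
    return round(100 * n_variable / alignment_length, 2)

# Figures from the abstract: 18 variable positions in 699 bp.
print(percent_variable(18, 699))

# Toy alignment, for illustration only.
toy = ["ACGT", "ACGA", "ACGT"]
print(variable_positions(toy))
```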
Keel, B N; Nonneman, D J; Rohrer, G A
2017-08-01
Genetic variants detected from sequence have been used to successfully identify causal variants and map complex traits in several organisms. High and moderate impact variants, those expected to alter or disrupt the protein coded by a gene and those that regulate protein production, likely have a more significant effect on phenotypic variation than do other types of genetic variants. Hence, a comprehensive list of these functional variants would be of considerable interest in swine genomic studies, particularly those targeting fertility and production traits. Whole-genome sequence was obtained from 72 of the founders of an intensely phenotyped experimental swine herd at the U.S. Meat Animal Research Center (USMARC). These animals included all 24 of the founding boars (12 Duroc and 12 Landrace) and 48 Yorkshire-Landrace composite sows. Sequence reads were mapped to the Sscrofa10.2 genome build, resulting in a mean of 6.1 fold (×) coverage per genome. A total of 22 342 915 high confidence SNPs were identified from the sequenced genomes. These included 21 million previously reported SNPs and 79% of the 62 163 SNPs on the PorcineSNP60 BeadChip assay. Variation was detected in the coding sequence or untranslated regions (UTRs) of 87.8% of the genes in the porcine genome: loss-of-function variants were predicted in 504 genes, 10 202 genes contained nonsynonymous variants, 10 773 had variation in UTRs and 13 010 genes contained synonymous variants. Approximately 139 000 SNPs were classified as loss-of-function, nonsynonymous or regulatory, which suggests that over 99% of the variation detected in our pigs could potentially be ignored, allowing us to focus on a much smaller number of functional SNPs during future analyses. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
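The "over 99%" claim follows directly from the counts given in the abstract, as the short check below shows. Only the numbers quoted above are used; the variable names are illustrative.

```python
total_snps = 22_342_915      # high confidence SNPs identified
functional_snps = 139_000    # approximate count of loss-of-function, nonsynonymous or regulatory SNPs
chip_snps = 62_163           # SNPs on the PorcineSNP60 BeadChip assay

functional_pct = 100 * functional_snps / total_snps
ignorable_pct = 100 - functional_pct

# 79% of the chip SNPs were also detected in the sequence data.
chip_recovered = round(0.79 * chip_snps)

print(f"Functional share: {functional_pct:.2f}%")
print(f"Potentially ignorable share: {ignorable_pct:.2f}%")
print(f"Chip SNPs recovered: about {chip_recovered}")
```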
Stütz, Adrian M; Teran-Garcia, Margarita; Rao, D C; Rice, Treva; Bouchard, Claude; Rankinen, Tuomo
2009-11-01
The sodium bicarbonate cotransporter gene SLC4A5, associated earlier with cardiovascular phenotypes, was tested for associations in the HERITAGE Family Study, and possible mechanisms were investigated. Twelve tag-single nucleotide polymorphisms (SNPs) covering the SLC4A5 gene were analyzed in 276 Black and 503 White healthy, sedentary subjects. Associations were tested using a variance components-based (QTDT) method with data adjusted for age, sex and body size. In Whites, rs6731545 and rs7571842 were significantly associated with resting and submaximal exercise pulse pressure (PP) (0.0004
Stütz, Adrian M; Teran-Garcia, Margarita; Rao, D C; Rice, Treva; Bouchard, Claude; Rankinen, Tuomo
2009-01-01
The sodium bicarbonate cotransporter gene SLC4A5, associated earlier with cardiovascular phenotypes, was tested for associations in the HERITAGE Family Study, and possible mechanisms were investigated. Twelve tag-single nucleotide polymorphisms (SNPs) covering the SLC4A5 gene were analyzed in 276 Black and 503 White healthy, sedentary subjects. Associations were tested using a variance components-based (QTDT) method with data adjusted for age, sex and body size. In Whites, rs6731545 and rs7571842 were significantly associated with resting and submaximal exercise pulse pressure (PP) (0.0004
Allelic expression mapping across cellular lineages to establish impact of non-coding SNPs
Adoue, Veronique; Schiavi, Alicia; Light, Nicholas; Almlöf, Jonas Carlsson; Lundmark, Per; Ge, Bing; Kwan, Tony; Caron, Maxime; Rönnblom, Lars; Wang, Chuan; Chen, Shu-Huang; Goodall, Alison H; Cambien, Francois; Deloukas, Panos; Ouwehand, Willem H; Syvänen, Ann-Christine; Pastinen, Tomi
2014-01-01
Most complex disease-associated genetic variants are located in non-coding regions and are therefore thought to be regulatory in nature. Association mapping of differential allelic expression (AE) is a powerful method to identify SNPs with direct cis-regulatory impact (cis-rSNPs). We used AE mapping to identify cis-rSNPs regulating gene expression in 55 and 63 HapMap lymphoblastoid cell lines from a Caucasian and an African population, respectively, 70 fibroblast cell lines, and 188 purified monocyte samples and found 40–60% of these cis-rSNPs to be shared across cell types. We uncover a new class of cis-rSNPs, which disrupt footprint-derived de novo motifs that are predominantly bound by repressive factors and are implicated in disease susceptibility through overlaps with GWAS SNPs. Finally, we provide the proof-of-principle for a new approach for genome-wide functional validation of transcription factor–SNP interactions. By perturbing NFκB action in lymphoblasts, we identified 489 cis-regulated transcripts with altered AE after NFκB perturbation. Altogether, we perform a comprehensive analysis of cis-variation in four cell populations and provide new tools for the identification of functional variants associated to complex diseases. PMID:25326100
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa.
Shahin, Arwa; van Kaauwen, Martijn; Esselink, Danny; Bargsten, Joachim W; van Tuyl, Jaap M; Visser, Richard G F; Arens, Paul
2012-11-20
Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up the breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of lily and 39% of tulip estimated transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding and UTR regions. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2- to 3-fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs, lily and tulip (12,017 groups), and among the three monocot species, lily, tulip, and rice (6,900 groups), were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. Lily and tulip contigs generated were annotated and described according to Gene Ontology terminology.
Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies.
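The two marker sets described above differ only in how many neighbouring SNPs are tolerated within 50 bp of the target SNP. A minimal sketch of that filter is given below, assuming SNP positions on a single contig are available as unique integers; the function name, parameters, and example positions are hypothetical, and the authors' actual pipeline is not described in the abstract.

```python
from bisect import bisect_left, bisect_right

def select_markers(positions, flank=50, max_neighbours=0):
    """Keep SNPs that have at most `max_neighbours` other SNPs within
    `flank` bp on either side (positions on one contig, assumed unique)."""
    pos = sorted(positions)
    selected = []
    for p in pos:
        lo = bisect_left(pos, p - flank)
        hi = bisect_right(pos, p + flank)
        neighbours = hi - lo - 1  # exclude the target SNP itself
        if neighbours <= max_neighbours:
            selected.append(p)
    return selected

# Hypothetical SNP positions on one contig.
snps = [100, 130, 300, 500, 540, 560]
strict_set = select_markers(snps, max_neighbours=0)   # first set: clean flanks
relaxed_set = select_markers(snps, max_neighbours=1)  # second set: one flanking SNP allowed
print(strict_set, relaxed_set)
```

Relaxing the rule from zero to one flanking SNP enlarges the marker set in this toy example too, mirroring the increase reported in the abstract. The sketch does not check that 50 bp of flanking sequence actually exists near contig ends.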
Oh, Juliana J.; Koegel, Ashley; Phan, Diana T.; Razfar, Ali; Slamon, Dennis J.
2007-01-01
Summary Allele loss and genetic alteration in chromosome 3p, particularly in the 3p21.3 region, are the most frequent and the earliest genomic abnormalities found in lung cancer. Multiple 3p21.3 genes exhibit various degrees of tumour suppression activity, suggesting that 3p21.3 genes may function as an integrated tumour suppressor region through their diverse biological activities. We have previously demonstrated growth inhibitory effects and the tumour suppression mechanism of the H37/RBM5 gene, which is one of the 19 genes residing in the 370kb minimal overlap region at 3p21.3. In the current study, in an attempt to find any mutations in the H37 coding region in lung cancer cells, we compared nucleotide sequences of the entire H37 gene in tumour vs. adjacent normal tissues from 17 non-small cell lung cancer (NSCLC) patients. No mutations were detected; instead, we found two silent single nucleotide polymorphisms (SNPs), C1138T and C2185T, within the coding region of the H37 gene. In addition, we found that specific allele types at these SNP positions are correlated with different histological subtypes of NSCLC; tumours containing heterozygous alleles (C+T) at these SNP positions are more likely to be associated with adenocarcinoma (AC) whereas homozygous alleles (either C or T) are associated with squamous cell carcinoma (SCC) (p=0.0098). We postulate that these two silent polymorphisms may be in linkage disequilibrium (LD) with a disease causative allele in the 3p21.3 tumour suppressor region, which is packed with a large number of important genes affecting lung cancer development. In addition, because of the prevalent loss of heterozygosity (LOH) detected at 3p21.3, which precedes lung cancer initiation, these SNPs may be developed into a marker for screening high-risk individuals. PMID:17606309
Azarian, Taj; Ali, Afsar; Johnson, Judith A.; Mohr, David; Prosperi, Mattia; Veras, Nazle M.; Jubair, Mohammed; Strickland, Samantha L.; Rashid, Mohammad H.; Alam, Meer T.; Weppelmann, Thomas A.; Katz, Lee S.; Tarr, Cheryl L.; Colwell, Rita R.
2014-01-01
ABSTRACT Phylodynamic analysis of genome-wide single-nucleotide polymorphism (SNP) data is a powerful tool to investigate underlying evolutionary processes of bacterial epidemics. The method was applied to investigate a collection of 65 clinical and environmental isolates of Vibrio cholerae from Haiti collected between 2010 and 2012. Characterization of isolates recovered from environmental samples identified a total of four toxigenic V. cholerae O1 isolates, four non-O1/O139 isolates, and a novel nontoxigenic V. cholerae O1 isolate with the classical tcpA gene. Phylogenies of strains were inferred from genome-wide SNPs using coalescent-based demographic models within a Bayesian framework. A close phylogenetic relationship between clinical and environmental toxigenic V. cholerae O1 strains was observed. As cholera spread throughout Haiti between October 2010 and August 2012, the population size initially increased and then fluctuated over time. Selection analysis along internal branches of the phylogeny showed a steady accumulation of synonymous substitutions and a progressive increase of nonsynonymous substitutions over time, suggesting diversification likely was driven by positive selection. Short-term accumulation of nonsynonymous substitutions driven by selection may have significant implications for virulence, transmission dynamics, and even vaccine efficacy. PMID:25538191
Jameson-Lee, Max; Koparde, Vishal; Griffith, Phil; Scalora, Allison F.; Sampson, Juliana K.; Khalid, Haniya; Sheth, Nihar U.; Batalo, Michael; Serrano, Myrna G.; Roberts, Catherine H.; Hess, Michael L.; Buck, Gregory A.; Neale, Michael C.; Manjili, Masoud H.; Toor, Amir Ahmed
2014-01-01
Donor T-cell mediated graft versus host (GVH) effects may result from the aggregate alloreactivity to minor histocompatibility antigens (mHA) presented by the human leukocyte antigen (HLA) molecules in each donor–recipient pair undergoing stem-cell transplantation (SCT). Whole exome sequencing has previously demonstrated a large number of non-synonymous single nucleotide polymorphisms (SNP) present in HLA-matched recipients of SCT donors (GVH direction). The nucleotide sequence flanking each of these SNPs was obtained and the amino acid sequence determined. All the possible nonameric peptides incorporating the variant amino acid resulting from these SNPs were interrogated in silico for their likelihood to be presented by the HLA class I molecules using the Immune Epitope Database stabilized matrix method (SMM) and NetMHCpan algorithms. The SMM algorithm predicted that a median of 18,396 peptides weakly bound HLA class I molecules in individual SCT recipients, and 2,254 peptides displayed strong binding. A similar library of presented peptides was identified when the data were interrogated using the NetMHCpan algorithm. The bioinformatic algorithm presented here demonstrates that there may be a high level of mHA variation in HLA-matched individuals, constituting a HLA-specific alloreactivity potential. PMID:25414699
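The peptide enumeration step described above (all nonamers that incorporate the variant amino acid) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the 0-based `variant_index` convention, and the example sequence are assumptions made for illustration, and no binding prediction (SMM or NetMHCpan) is attempted here.

```python
def nonamers_with_variant(protein: str, variant_index: int, k: int = 9) -> list[str]:
    """Return every k-mer of `protein` that covers the residue at `variant_index`.

    `variant_index` is 0-based (an assumption of this sketch). Windows that
    would run past either end of the sequence are skipped, so a variant near
    a terminus yields fewer than k peptides.
    """
    if not 0 <= variant_index < len(protein):
        raise ValueError("variant_index is outside the protein sequence")
    first_start = max(0, variant_index - k + 1)          # leftmost window still covering the variant
    last_start = min(variant_index, len(protein) - k)    # rightmost window that fits in the sequence
    return [protein[s:s + k] for s in range(first_start, last_start + 1)]


# Hypothetical 20-residue sequence with the variant residue at index 10.
peptides = nonamers_with_variant("ACDEFGHIKLMNPQRSTVWY", 10)
```

A variant far from both termini gives nine overlapping nonamers, each of which would then be scored separately by a binding predictor.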
Ishikawa, Toshihisa; Aw, Wanping; Kaneko, Kiyoko
2013-11-04
In mammals, excess purine nucleosides are removed from the body by breakdown in the liver and excretion from the kidneys. Uric acid is the end product of purine metabolism in humans. Two-thirds of uric acid in the human body is normally excreted through the kidney, whereas one-third undergoes uricolysis (decomposition of uric acid) in the gut. Elevated serum uric acid levels result in gout and could be a risk factor for cardiovascular disease and diabetes. Recent studies have shown that the human ATP-binding cassette transporter ABCG2 plays a role in renal excretion of uric acid. Two non-synonymous single nucleotide polymorphisms (SNPs), i.e., 421C>A (major) and 376C>T (minor), in the ABCG2 gene result in impaired transport activity, owing to ubiquitination-mediated proteasomal degradation and truncation of ABCG2, respectively. These genetic polymorphisms are associated with hyperuricemia and gout. Allele frequencies of those SNPs are significantly higher in Asian populations than they are in African and Caucasian populations. A rapid and isothermal genotyping method has been developed to detect the SNP 421C>A, where one drop of peripheral blood is sufficient for the detection. Development of simple genotyping methods would serve to improve prevention and early therapeutic intervention for high-risk individuals in personalized healthcare.
Shamim, Z; Spellman, S; Haagenson, M; Wang, T; Lee, S J; Ryder, L P; Müller, K
2013-08-01
Interleukin-7 (IL-7) is essential for T cell development in the thymus and maintenance of peripheral T cells. The α-chain of the IL-7R is polymorphic, with SNPs that give rise to non-synonymous amino acid substitutions. We previously found an association between donor genotypes and increased treatment-related mortality (TRM) (rs1494555G) and acute graft versus host disease (aGvHD) (rs1494555G and rs1494558T) after hematopoietic cell transplantation (HCT). Some studies have confirmed an association between rs6897932C and multiple sclerosis. In this study, we evaluated the prognostic significance of IL-7Rα SNP genotypes in 590 recipient/donor pairs that received HLA-matched unrelated donor HCT for haematological malignancies. Consistent with the primary studies, the rs1494555GG and rs1494558TT genotypes of the donor were associated with aGvHD and chronic GvHD in the univariate analysis. The T allele of rs6897932 was suggestive of an association with increased frequency of relapse by univariate analysis (P = 0.017) and multivariate analysis (P = 0.015). In conclusion, this study provides further evidence of a role of the IL-7 pathway and IL-7Rα SNPs in HCT. © 2013 John Wiley & Sons Ltd.
Epistasis between polymorphisms in PCSK1 and DBH is associated with premature ovarian failure.
Pyun, Jung-A; Kim, Sunshin; Cha, Dong Hyun; Kwack, KyuBum
2014-11-01
This study examined whether epistasis between single nucleotide polymorphisms (SNPs) within proprotein convertase subtilisin/kexin type 1 (PCSK1) and dopamine β-hydroxylase (DBH) genes is associated with premature ovarian failure (POF). One hundred twenty women with POF and 222 female controls were recruited for this study. To genotype SNPs within PCSK1 and DBH, we used a GoldenGate assay with VeraCode technology, which uses an allele-specific primer extension method. Two SNPs (rs155979 and rs3762986) within PCSK1 and one SNP (rs1611114) within DBH, which were located in the 5' flanking region, were involved in synergistic interactions. The C allele in the rs155979 SNP showed an increased risk of POF in a dominant model when AA genotype in the rs1611114 SNP was present (odds ratio, 3.60; 95% CI, 1.82-7.14; P = 0.00024), whereas the G allele in the rs1611114 SNP showed a reduced risk of POF in a dominant model when at least one C allele at the rs155979 SNP was present (odds ratio, 0.24; 95% CI, 0.11-0.51; P = 0.00018) or one G allele at the rs3762986 SNP was present (odds ratio, 0.33; 95% CI, 0.19-0.60; P = 0.00023). Epistases between SNPs within PCSK1 and DBH genes are significantly associated with susceptibility or resistance to POF.
[Genetic diversity analysis of Andrographis paniculata in China based on SRAP and SNP].
Chen, Rong; Wang, Xiao-Yun; Song, Yu-Ning; Zhu, Yun-feng; Wang, Peng-liang; Li, Min; Zhong, Guo-Yue
2014-12-01
In order to reveal genetic diversity of domestic Andrographis paniculata and its impact on quality, genetic backgrounds of 103 samples from 7 provinces in China were analyzed using SRAP marker and SNP marker. Genetic structures of the A. paniculata populations were estimated with Powermarker V 3.25 and Mega 6.0 software, and polymorphic SNPs were identified with CodonCode Aligner software. The results showed that the genetic distances of domestic A. paniculata germplasm ranged from 0.01 to 0.09, and no polymorphic SNPs were discovered in coding sequence fragments of ent-copalyl diphosphate synthase. A. paniculata germplasm from various regions in China had poor genetic diversity. This phenomenon was closely related to strict self-fertilization and earlier introduction from the same origin. Therefore, genetic background had little impact on variable qualities of A. paniculata in domestic market. Mutation breeding, polyploid breeding and molecular breeding were proposed as promising strategies in germplasm innovation.
Wang, Y C; Jiang, R R; Kang, X T; Li, Z J; Han, R L; Geng, J; Fu, J X; Wang, J F; Wu, J P
2015-09-25
ASB15 is a member of the ankyrin repeat and suppressor of cytokine signaling box family, and is predominantly expressed in skeletal muscle. In the present study, an F2 resource population of Gushi chickens crossed with Anka broilers was used to investigate the genetic effects of the chicken ASB15 gene. Two single nucleotide polymorphisms (SNPs) (rs315759231 A>G and rs312619270 T>C) were identified in exon 7 of the ASB15 gene using forced chain reaction-restriction fragment length polymorphism and DNA sequencing. One was a missense SNP (rs315759231 A>G) and the other was a synonymous SNP (rs312619270 T>C). The rs315759231 A>G polymorphism was significantly associated with body weight at birth, 12-week body slanting length, semi-evisceration weight, evisceration weight, leg muscle weight, and carcass weight (P < 0.05). The rs312619270 T>C polymorphism was significantly associated with body weight at birth, 4, 8, and 12-week body weight, 8-week shank length, 12-week breast bone length, 8 and 12-week body slanting length, breast muscle weight, and carcass weight (P < 0.05). Our results suggest that the ASB15 gene profoundly affects chicken growth and carcass traits.
Nelson, Chase W; Moncla, Louise H; Hughes, Austin L
2015-11-15
New applications of next-generation sequencing technologies use pools of DNA from multiple individuals to estimate population genetic parameters. However, no publicly available tools exist to analyse single-nucleotide polymorphism (SNP) calling results directly for evolutionary parameters important in detecting natural selection, including nucleotide diversity and gene diversity. We have developed SNPGenie to fill this gap. The user submits a FASTA reference sequence(s), a Gene Transfer Format (.GTF) file with CDS information and a SNP report(s) in an increasing selection of formats. The program estimates nucleotide diversity, distance from the reference and gene diversity. Sites are flagged for multiple overlapping reading frames, and are categorized by polymorphism type: nonsynonymous, synonymous, or ambiguous. The results allow single nucleotide, single codon, sliding window, whole gene and whole genome/population analyses that aid in the detection of positive and purifying natural selection in the source population. SNPGenie version 1.2 is a Perl program with no additional dependencies. It is free, open-source, and available for download at https://github.com/hugheslab/snpgenie. Contact: nelsoncw@email.sc.edu or austin@biol.sc.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
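To make the idea of estimating nucleotide diversity from pooled SNP calls concrete, here is a minimal sketch based on the generic mean-pairwise-difference definition. It is not SNPGenie's code (SNPGenie is a Perl program), and the function names and the input layout (one dictionary of allele read counts per variable site) are assumptions made for illustration.

```python
from itertools import combinations


def site_pairwise_diversity(allele_counts: dict[str, int]) -> float:
    """Proportion of read pairs at one site that carry different nucleotides."""
    n = sum(allele_counts.values())
    if n < 2:
        return 0.0
    # Number of pairs drawn from two different alleles: sum of n_i * n_j over i < j.
    differing_pairs = sum(a * b for a, b in combinations(allele_counts.values(), 2))
    total_pairs = n * (n - 1) / 2
    return differing_pairs / total_pairs


def nucleotide_diversity(variable_sites: list[dict[str, int]], sequence_length: int) -> float:
    """Average per-site diversity over a region; invariant sites contribute zero."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return sum(site_pairwise_diversity(s) for s in variable_sites) / sequence_length


# Hypothetical region of 100 sites with two polymorphic positions.
pi = nucleotide_diversity([{"A": 5, "G": 5}, {"C": 9, "T": 1}], 100)
```

Splitting the differing pairs into nonsynonymous and synonymous classes, as the abstract describes, would additionally require the codon context from the annotation file and is left out of this sketch.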
Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra
2015-04-10
In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species: ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there is limited or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data was collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need of additional bioinformatics tools or pipelines. In recognition of the open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Availability at http://bioinformatics.tecnoparco.org/SNPchimp.
Håkansson, Anna; Westberg, Lars; Nilsson, Staffan; Buervenich, Silvia; Carmine, Andrea; Holmberg, Björn; Sydow, Olof; Olson, Lars; Johnels, Bo; Eriksson, Elias; Nissbrandt, Hans
2005-02-05
The multifunctional cytokine interleukin-6 (IL-6) is involved in inflammatory processes in the central nervous system and increased levels of IL-6 have been found in patients with Parkinson's disease (PD). It is known that estrogen inhibits the production of IL-6, via action on estrogen receptors, thereby pointing to an important influence of estrogen on IL-6. In a previous study, we reported an association between a G/A single nucleotide polymorphism (SNP) at position 1730 in the gene coding for estrogen receptor beta (ERbeta) and age of onset of PD. To investigate the influence of a G/C SNP at position 174 in the promoter of the IL-6 gene, and the possible interaction of this SNP and the ERbeta G-1730A SNP on the risk for PD, the G-174C SNP was genotyped, by pyrosequencing, in 258 patients with PD and 308 controls. A significantly elevated frequency of the GG genotype of the IL-6 SNP was found in the patient group and this was most obvious among patients with an early age of onset (≤50 years) of PD. When the GG genotypes of the IL-6 and ERbeta SNPs were combined, the combination was much more robustly associated with PD, and especially with PD with an early age of onset, than either GG genotype analyzed separately. Our results indicate that the G-174C SNP in the IL-6 promoter may influence the risk for developing PD, particularly regarding early age of onset PD, and that the effect is modified by interaction with the G-1730A SNP in the ERbeta gene. (c) 2004 Wiley-Liss, Inc.
Seligmann, Hervé; Warthi, Ganesh
2017-01-01
A new codon property, codon directional asymmetry in nucleotide content (CDA), reveals a biologically meaningful genetic code dimension: palindromic codons (first and last nucleotides identical, codon structure XZX) are symmetric (CDA = 0), codons with structures ZXX/XXZ are 5'/3' asymmetric (CDA = - 1/1; CDA = - 0.5/0.5 if Z and X are both purines or both pyrimidines, assigning negative/positive (-/+) signs is an arbitrary convention). Negative/positive CDAs associate with (a) Fujimoto's tetrahedral codon stereo-table; (b) tRNA synthetase class I/II (aminoacylate the 2'/3' hydroxyl group of the tRNA's last ribose, respectively); and (c) high/low antiparallel (not parallel) betasheet conformation parameters. Preliminary results suggest CDA-whole organism associations (body temperature, developmental stability, lifespan). Presumably, CDA impacts spatial kinetics of codon-anticodon interactions, affecting cotranslational protein folding. Some synonymous codons have opposite CDA sign (alanine, leucine, serine, and valine), putatively explaining how synonymous mutations sometimes affect protein function. Correlations between CDA and tRNA synthetase classes are weaker than between CDA and antiparallel betasheet conformation parameters. This effect is stronger for mitochondrial genetic codes, and potentially drives mitochondrial codon-amino acid reassignments. CDA reveals information ruling nucleotide-protein relations embedded in reversed (not reverse-complement) sequences (5'-ZXX-3'/5'-XXZ-3').
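The CDA values stated above for the XZX, ZXX and XXZ codon structures can be written out as a small function. This is a minimal sketch covering only the cases spelled out in the abstract: codons with three different nucleotides return `None` because their score is not defined here, and treating XXX codons as palindromic (first and last nucleotides identical) is an assumption. The function and constant names are hypothetical.

```python
from typing import Optional

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def same_chemical_class(a: str, b: str) -> bool:
    """True when both nucleotides are purines or both are pyrimidines."""
    return {a, b} <= PURINES or {a, b} <= PYRIMIDINES


def codon_cda(codon: str) -> Optional[float]:
    """CDA for the codon structures described in the abstract, else None."""
    codon = codon.upper()
    if len(codon) != 3 or any(n not in PURINES | PYRIMIDINES for n in codon):
        raise ValueError(f"not a valid codon: {codon!r}")
    first, middle, last = codon
    if first == last:
        return 0.0  # XZX (and, by assumption, XXX): symmetric
    if middle == last:
        # ZXX: 5' asymmetric, negative by the abstract's sign convention
        return -0.5 if same_chemical_class(first, middle) else -1.0
    if first == middle:
        # XXZ: 3' asymmetric, positive by the abstract's sign convention
        return 0.5 if same_chemical_class(last, middle) else 1.0
    return None  # three distinct nucleotides: not covered by this sketch


# TTA and CTT both encode leucine in the standard genetic code.
leucine_examples = (codon_cda("TTA"), codon_cda("CTT"))
```

Under these rules the two leucine codons score 1.0 and -0.5, which matches the abstract's remark that some synonymous codons carry opposite CDA signs.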
Gene-Centric Analysis of Serum Cotinine Levels in African and European American Populations
Hamidovic, Ajna; Goodloe, Robert J; Bergen, Andrew W; Benowitz, Neal L; Styn, Mindi A; Kasberger, Jay L; Choquet, Helene; Young, Taylor R; Meng, Yan; Palmer, Cameron; Pletcher, Mark; Kertesz, Stefan; Hitsman, Brian; Spring, Bonnie; Jorgenson, Eric
2012-01-01
To date, most genetic association studies of tobacco use have been conducted in European American subjects using the phenotype of smoking quantity (cigarettes per day). However, smoking quantity is a very imprecise measure of exposure to tobacco smoke constituents. Analyses of alternate phenotypes and populations may improve our understanding of tobacco addiction genetics. Cotinine is the major metabolite of nicotine, and measuring serum cotinine levels in smokers provides a more objective measure of nicotine dose than smoking quantity. Previous genetic association studies of serum cotinine have focused on individual genes. We conducted a genetic association study of the biomarker in African American (N=365) and European American (N=315) subjects from the Coronary Artery Risk Development in Young Adults study using a chip containing densely-spaced tag SNPs in ∼2100 genes. We found that rs11187065, located in the non-coding region (intron 1) of insulin-degrading enzyme (IDE), was the most strongly associated SNP (p=8.91 × 10^−6) in the African American cohort, whereas rs11763963, located on chromosome 7 outside of a gene transcript, was the most strongly associated SNP in European Americans (p=1.53 × 10^−6). We then evaluated how the top variant association in each population performed in the other group. We found that rs11187065 in IDE was also associated with the phenotype in European Americans (p=0.044). Our top SNP association in European Americans, rs11763963, was non-polymorphic in our African American sample. It has been previously shown that psychostimulant self-administration is reduced in animals with lower insulin because of interference with dopamine transmission in the brain reward centers. Our finding provides a platform for further investigation of this, or additional mechanisms, involving the relationship between insulin and self-administered nicotine dose. PMID:22089314
Polymorphism of the prion protein gene (PRNP) in two Chinese indigenous cattle breeds.
Qin, L H; Zhao, Y M; Bao, Y H; Bai, W L; Chong, J; Zhang, G L; Zhang, J B; Zhao, Z H
2011-08-01
The prion protein (PRNP) gene has been located at position q17 of chromosome 13 in cattle. Polymorphisms of the PRNP gene might be associated with BSE susceptibility. In the present work, we investigated polymorphisms of the PRNP gene, including SNPs in exon 3, a 23-bp indel in the promoter region, and a 12-bp indel in intron 1, in 2 indigenous cattle breeds of northeast China. Eighty-six animals from Yanbian (34) and Chinese Red Steppes (52) were genotyped at the PRNP locus by analyzing genomic DNA. A total of 4 single nucleotide polymorphism (SNP) sites were revealed in PRNP exon 3 of the 2 cattle breeds investigated. Three of these SNPs were non-synonymous mutations that resulted in amino acid exchanges (K119N, S154N, and M177V), and one was a silent nucleotide substitution (A234G). The two amino acid mutations S154N and M177V were detected only in Yanbian at a very low frequency (0.0147), and they appear to be absent in Chinese Red Steppes. The average gene heterozygosity (He), effective allele numbers (Ne), Shannon's information index (I) and polymorphism information content (PIC) were 0.3088, 1.5013, 0.3814 and 0.2000 in Yanbian, respectively, being relatively higher than those of Chinese Red Steppes (0.2885, 1.4985, 0.3462 and 0.1873, respectively). At the 23-bp indel and 12-bp indel loci, three different genotypes were identified in both Yanbian and Chinese Red Steppes breeds. Based on the 23- and 12-bp indels, four haplotypes were constructed in the 2 Chinese cattle breeds, of which 23-bp (-)/12-bp (-) was the main haplotype, accounting for more than 50% of the total in both Yanbian and Chinese Red Steppes breeds. These results might be useful in understanding the genetic characteristics of the PRNP gene in Chinese indigenous cattle breeds.
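The four diversity statistics reported above (He, Ne, I and PIC) can each be computed from the allele frequencies at a locus. The sketch below uses the common textbook definitions; the abstract does not say which software or sample-size corrections the authors applied, so the formulas and the function name are assumptions, and the reported averages across loci are not reproduced here.

```python
import math


def diversity_indices(allele_freqs: list[float]) -> dict[str, float]:
    """He, Ne, Shannon's I and PIC for one locus from its allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    p = [f for f in allele_freqs if f > 0]          # drop absent alleles (0 * ln 0 is taken as 0)
    homozygosity = sum(f * f for f in p)
    he = 1.0 - homozygosity                         # expected heterozygosity
    ne = 1.0 / homozygosity                         # effective number of alleles
    shannon = -sum(f * math.log(f) for f in p)      # Shannon's information index
    # PIC subtracts 2 * p_i^2 * p_j^2 for every pair of distinct alleles from He.
    cross = sum(2 * p[i] ** 2 * p[j] ** 2
                for i in range(len(p)) for j in range(i + 1, len(p)))
    return {"He": he, "Ne": ne, "I": shannon, "PIC": he - cross}


# Hypothetical biallelic locus with equal allele frequencies.
balanced = diversity_indices([0.5, 0.5])
```

For a biallelic locus all four statistics peak at equal allele frequencies, so a rare allele such as the 0.0147 frequency mentioned above contributes little to the averages.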
Órpez-Zafra, Teresa; Pinto-Medel, María Jesús; Oliver-Martos, Begoña; Ortega-Pinazo, Jesús; Arnáiz, Carlos; Guijarro-Castro, Cristina; Varadé, Jezabel; Álvarez-Lafuente, Roberto; Urcelay, Elena; Sánchez-Jiménez, Francisca
2013-01-01
TRAIL and TRAIL Receptor genes have been implicated in Multiple Sclerosis pathology as well as in the response to IFN beta therapy. The objective of our study was to evaluate the association of these genes in relation to the age at disease onset (AAO) and to the clinical response upon IFN beta treatment in Spanish MS patients. We carried out a candidate gene study of TRAIL, TRAILR-1, TRAILR-2, TRAILR-3 and TRAILR-4 genes. A total of 54 SNPs were analysed in 509 MS patients under IFN beta treatment, and an additional cohort of 226 MS patients was used to validate the results. Associations of rs1047275 in TRAILR-2 and rs7011559 in TRAILR-4 genes with AAO under an additive model did not withstand Bonferroni correction. In contrast, patients with the TRAILR-1 rs20576-CC genotype showed a better clinical response to IFN beta therapy compared with patients carrying the A-allele (recessive model: p = 8.88×10^−4, pc = 0.048, OR = 0.30). This SNP resulted in a non-synonymous substitution of Glutamic acid to Alanine in position 228 (E228A), a change previously associated with susceptibility to different cancer types and risk of metastases, suggesting a lack of functionality of TRAILR-1. In order to unravel how this amino acid change in TRAILR-1 would affect the death signal, we performed molecular modelling with both alleles. Neither TRAIL binding sites in the receptor nor the expression levels of TRAILR-1 in peripheral blood mononuclear cell subsets (monocytes, CD4+ and CD8+ T cells) were modified, suggesting that this SNP may be altering the death signal by some other mechanism. These findings show a role for TRAILR-1 gene variations in the clinical outcome of IFN beta therapy that might have relevance as a biomarker to predict the response to IFN beta in MS. PMID:23658636
Zhang, Chaowen; Chen, Feifan; Zhao, Ziyao; Hu, Liangliang; Liu, Hanqiang; Cheng, Zhihui; Weng, Yiqun; Chen, Peng; Li, Yuhong
2018-06-01
Two round-leaf mutants, rl-1 and rl-2, were identified from EMS-induced mutagenesis. High throughput sequencing and map-based cloning suggested CsPID, encoding a Ser/Thr protein kinase, as the most likely candidate for rl-1. Rl-2 was allelic to Rl-1. Leaf shape is an important plant architecture trait that is affected by plant hormones, especially auxin. In Arabidopsis, PINOID (PID), a regulator of the auxin polar transporter PIN (PIN-FORMED), affects leaf shape formation, but this function of PID in crop plants has not been well studied. From an EMS mutagenesis population, we identified two round-leaf (rl) mutants, C356 and C949. Segregation analysis suggested that both mutations were controlled by single recessive genes, rl-1 and rl-2, respectively. With map-based cloning, we show that CsPID is the candidate gene of rl-1; a non-synonymous SNP in the second exon of CsPID resulted in an amino acid substitution and the round leaf phenotype. Compared with the wild-type plant, CsPID had significantly lower expression in the root, leaf and female flowers in C356, which may result in the less developed roots, round leaves and abnormal female flowers, respectively, in the rl-1 mutant. Among the three copies of PID genes, CsPID, CsPID2 and CSPID2L (CsPID2-like), in the cucumber genome, CsPID was the only one with significantly differential expression in adult leaves between WT and C356, suggesting that CsPID plays a main role in leaf shape formation. The rl-2 mutation in C949 was also cloned; it was due to another SNP located near rl-1 in the same CsPID gene. The two round leaf mutants and the work presented herein provide a good foundation for understanding the molecular mechanisms of CsPID in cucumber leaf development.
In search of causal variants: refining disease association signals using cross-population contrasts.
Saccone, Nancy L; Saccone, Scott F; Goate, Alison M; Grucza, Richard A; Hinrichs, Anthony L; Rice, John P; Bierut, Laura J
2008-08-29
Genome-wide association (GWA) using large numbers of single nucleotide polymorphisms (SNPs) is now a powerful, state-of-the-art approach to mapping human disease genes. When a GWA study detects association between a SNP and the disease, this signal usually represents association with a set of several highly correlated SNPs in strong linkage disequilibrium. The challenge we address is to distinguish among these correlated loci to highlight potential functional variants and prioritize them for follow-up. We implemented a systematic method for testing association across diverse population samples having differing histories and LD patterns, using a logistic regression framework. The hypothesis is that important underlying biological mechanisms are shared across human populations, and we can filter correlated variants by testing for heterogeneity of genetic effects in different population samples. This approach formalizes the descriptive comparison of p-values that has typified similar cross-population fine-mapping studies to date. We applied this method to correlated SNPs in the cholinergic nicotinic receptor gene cluster CHRNA5-CHRNA3-CHRNB4, in a case-control study of cocaine dependence composed of 504 European-American and 583 African-American samples. Of the 10 SNPs genotyped in the r2 ≥ 0.8 bin for rs16969968, three demonstrated significant cross-population heterogeneity and are filtered from priority follow-up; the remaining SNPs include rs16969968 (heterogeneity p = 0.75). Though the power to filter out rs16969968 is reduced due to the difference in allele frequency in the two groups, the results nevertheless focus attention on a smaller group of SNPs that includes the non-synonymous SNP rs16969968, which retains a similar effect size (odds ratio) across both population samples. Filtering out SNPs that demonstrate cross-population heterogeneity enriches for variants more likely to be important and causative. Our approach provides an important and effective tool to help interpret results from the many GWA studies now underway.
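The filtering idea above rests on testing whether a SNP's effect size differs between two population samples. The authors did this within a logistic regression framework; the sketch below swaps in a simpler stand-in, a z-test on the difference between two log odds ratios estimated from 2x2 allele-count tables with Woolf standard errors. The function names, the table layout and the example counts are all hypothetical.

```python
import math


def log_odds_ratio(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Log odds ratio and its Woolf standard error for one 2x2 table.

    a, b: risk-allele and other-allele counts in cases;
    c, d: risk-allele and other-allele counts in controls.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def heterogeneity_p_value(table_1: tuple[int, int, int, int],
                          table_2: tuple[int, int, int, int]) -> float:
    """Two-sided p-value for equal odds ratios in two independent samples."""
    b1, se1 = log_odds_ratio(*table_1)
    b2, se2 = log_odds_ratio(*table_2)
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal


# Hypothetical counts: the same direction of effect vs. opposite directions.
p_consistent = heterogeneity_p_value((40, 60, 20, 80), (40, 60, 20, 80))
p_discordant = heterogeneity_p_value((40, 60, 20, 80), (20, 80, 40, 60))
```

A SNP with a small heterogeneity p-value would be filtered from priority follow-up under the abstract's logic, while one with a large p-value, like the 0.75 reported for rs16969968, would be retained.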
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-01-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
NASA Astrophysics Data System (ADS)
Liu, Siwei; Li, Qi; Yu, Hong; Kong, Lingfeng
2017-02-01
Glycogen is important not only for the energy supply of oysters, but also for human consumption. High glycogen content can improve the stress survival of oysters. A key enzyme in glycogenesis is glycogen synthase, which is encoded by the glycogen synthase gene GYS. In this study, the relationship between single nucleotide polymorphisms (SNPs) in coding regions of Crassostrea gigas GYS (Cg-GYS) and individual glycogen content was investigated with 321 individuals from five full-sib families. A single-strand conformation polymorphism (SSCP) procedure was combined with sequencing to confirm individual SNP genotypes of Cg-GYS. Least-square analysis of variance was performed to assess the relationship of variation in glycogen content of C. gigas with single SNP genotype and SNP haplotype. As a result, six SNPs in coding regions were found to be significantly associated with glycogen content (P < 0.01), from which we constructed four main haplotypes due to linkage disequilibrium. Furthermore, the most effective haplotype H2 (GAGGAT) had an extremely significant relationship with high glycogen content (P < 0.0001). These findings revealed the potential influence of Cg-GYS polymorphism on the glycogen content and provided molecular biological information for the selective breeding of good quality traits of C. gigas.
Röper, Andrea; Reichert, Walter; Mattern, Rainer
2007-01-01
In the field of forensic DNA typing, the analysis of Short Tandem Repeats (STRs) can fail in cases of degraded DNA. The typing of coding region Single Nucleotide Polymorphisms (SNPs) of the mitochondrial genome provides an approach to acquire additional information. In the examined case of aggravated theft, SNP typing excluded both suspects as the source of the analyzed hair left at the crime scene. This conclusion was not possible from STR typing alone. SNP typing of the trace on the torch light left at the crime scene increased the likelihood that suspect no. 2 was the origin of this trace. This finding was already indicated by STR analysis. Suspect no. 1 was excluded as the origin of this trace by SNP typing, which was also indicated by STR analysis. A limiting factor for the analysis of SNPs is the maternal inheritance of mitochondrial DNA. Individualisation is not possible. In conclusion, for traces that cause problems with conventional STR typing, the supplementary analysis of coding region SNPs from the mitochondrial genome is very reasonable and greatly contributes to the refinement of analysis methods in the field of forensic genetics.
SNPversity: a web-based tool for visualizing diversity
Schott, David A; Vinnakota, Abhinav G; Portwood, John L; Andorf, Carson M
2018-01-01
Abstract Many stand-alone desktop software suites exist to visualize single nucleotide polymorphism (SNP) diversity, but web-based software that can be easily implemented and used for biological databases is absent. SNPversity was created to answer this need by building an open-source visualization tool that can be implemented on a Unix-like machine and served through a web browser that can be accessible worldwide. SNPversity consists of an HDF5 database back-end for SNPs, a data exchange layer powered by TASSEL libraries that represent data in JSON format, and an interface layer using PHP to visualize SNP information. SNPversity displays data in real-time through a web browser in grids that are color-coded according to a given SNP’s allelic status and mutational state. SNPversity is currently available at MaizeGDB, the maize community’s database, and will soon be available at GrainGenes, the clade-oriented database for Triticeae and Avena species, including wheat, barley, rye, and oat. The code and documentation are uploaded to GitHub, and they are freely available to the public. We expect that the tool will be highly useful for other biological databases with a similar need to display SNP diversity through their web interfaces. Database URL: https://www.maizegdb.org/snpversity PMID:29688387
Chauke, Chesa G; Magwebu, Zandisiwe E; Sharma, Jyoti R; Arieff, Zainunisha; Seier, Jürgen V
2016-08-01
Non-ketotic hyperglycinaemia (NKH) is an autosomal recessive inborn error of glycine metabolism characterized by accumulation of glycine in body fluids and various neurological symptoms. This study describes the first screening of NKH in cataract captive-bred vervet monkeys (Chlorocebus aethiops). Glycine dehydrogenase (GLDC), aminomethyltransferase (AMT) and glycine cleavage system H protein (GCSH) were prioritized. Mutation analysis of the complete coding sequence of GLDC and AMT revealed six novel single-base substitutions, of which three were non-synonymous missense and three were silent nucleotide changes. Although deleterious effects of the three amino acid substitutions were not evaluated, one substitution of GLDC gene (S44R) could be disease-causing because of its drastic amino acid change, affecting amino acids conserved in different primate species. This study confirms the diagnosis of NKH for the first time in vervet monkeys with cataracts. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
DNA Fingerprinting of Chinese Melon Provides Evidentiary Support of Seed Quality Appraisal
Gao, Peng; Ma, Hongyan; Luan, Feishi; Song, Haibin
2012-01-01
Melon, Cucumis melo L., is an important vegetable crop worldwide. At present, homonyms and synonyms are common in the melon seed markets of China, which can cause variety authenticity issues that affect melon breeding, production, marketing and other aspects. Molecular markers, especially microsatellites or simple sequence repeats (SSRs), are playing increasingly important roles in cultivar identification. The aim of this study was to construct a DNA fingerprinting database of major melon cultivars, which could support the establishment of a technical standard system for purity and authenticity identification of melon seeds. In this study, to develop the core set of SSR markers, 470 polymorphic SSRs were selected as candidate markers from 1219 SSRs using 20 representative melon varieties (lines). Eighteen SSR markers, evenly distributed across the genome and with the highest polymorphism information content (PIC), were identified as the core marker set for melon DNA fingerprinting analysis. Fingerprint codes for 471 melon varieties (lines) were established. Fifty-one materials were classified into 17 groups based on sharing the same fingerprint code, and field trait surveys showed that the plants within each group were synonyms because of the same or similar field characters. Furthermore, DNA fingerprinting quick response (QR) codes of the 471 melon varieties (lines) were constructed. Because of their fast readability and large storage capacity, QR-coded melon DNA fingerprints are convenient to read and well suited to commercial applications. PMID:23285039
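The abstract above ranks SSR markers by polymorphism information content (PIC) but does not state the formula used. A minimal sketch, assuming the widely used Botstein-style definition PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i are allele frequencies at one marker; the function name and the example frequencies are illustrative and not taken from the study:

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content for one marker.

    Assumes the Botstein-style formula; allele_freqs must sum to 1.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two random alleles are identical.
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction for uninformative matings between identical heterozygotes.
    cross = sum(2 * (p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross


# Illustrative values only: a balanced two-allele marker.
print(round(pic([0.5, 0.5]), 3))
```

Markers with more, evenly distributed alleles score higher, which is why a high-PIC core set tends to separate cultivars with few markers.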
Hellwege, Jacklyn N; Palmer, Nicholette D; Brown, Mark W; Ziegler, Julie T; An, Sandy S; Guo, Xiuqing; Chen, Ida Y-D; Taylor, Kent; Hawkins, Gregory A; Ng, Maggie C Y; Speliotes, Elizabeth K; Lorenzo, Carlos; Norris, Jill M; Rotter, Jerome I; Wagenknecht, Lynne E; Langefeld, Carl D; Bowden, Donald W
2015-02-01
We previously identified a low-frequency (1.1%) coding variant (G45R; rs200573126) in the adiponectin gene (ADIPOQ) which was the basis for a multipoint microsatellite linkage signal (LOD = 8.2) for plasma adiponectin levels in Hispanic families. We have empirically evaluated the ability of data from targeted common variants, exome chip genotyping, and genome-wide association study data to detect linkage and association to adiponectin protein levels at this locus. Simple two-point linkage and association analyses were performed in 88 Hispanic families (1,150 individuals) using 10,958 SNPs on chromosome 3. Approaches were compared for their ability to map the functional variant, G45R, which was strongly linked (two-point LOD = 20.98) and powerfully associated (p value = 8.1 × 10^-50). Over 450 SNPs within a broad 61 Mb interval around rs200573126 showed nominal evidence of linkage (LOD > 3) but only four other SNPs in this region were associated with p values < 1.0 × 10^-4. When G45R was accounted for, the maximum LOD score across the interval dropped to 4.39 and the best p value was 1.1 × 10^-5. Linked and/or associated variants ranged in frequency (0.0018-0.50) and type (coding, non-coding) and had little detectable linkage disequilibrium with rs200573126 (r^2 < 0.20). In addition, the two-point linkage approach empirically outperformed multipoint microsatellite and multipoint SNP analysis. In the absence of data for rs200573126, family-based linkage analysis using a moderately dense SNP dataset, including both common and low-frequency variants, resulted in stronger evidence for an adiponectin locus than association data alone. Thus, linkage analysis can be a useful tool to facilitate identification of high-impact genetic variants.
Pattison, Jillian M.; Posternak, Valeriya; Cole, Michael D.
2016-01-01
It is well established that environmental toxins, such as exposure to arsenic, are risk factors in the development of urinary bladder cancer, yet recent genome-wide association studies (GWAS) provide compelling evidence that there is a strong genetic component associated with disease predisposition. A single nucleotide polymorphism (SNP), rs8102137, was identified on chromosome 19q12, residing 6 kb upstream of the important cell cycle regulator and proto-oncogene, Cyclin E1 (CCNE1). However, the functional role of this variant in bladder cancer predisposition has been unclear since it lies within a non-coding region of the genome. Here, it is demonstrated that bladder cancer cells heterozygous for this SNP exhibit biased allelic expression of CCNE1 with 1.5-fold more transcription occurring from the risk allele. Furthermore, using chromatin immunoprecipitation assays, a novel enhancer element was identified within the first intron of CCNE1 that binds Kruppel-like Factor 5 (KLF5), a known transcriptional activator in bladder cancer. Moreover, the data reveal that the presence of rs200996365, a SNP in high linkage disequilibrium with rs8102137 residing in the center of a KLF5 motif, alters KLF5 binding to this genomic region. Through luciferase assays and CRISPR-Cas9 genome editing, a novel polymorphic intronic regulatory element controlling CCNE1 transcription is characterized. These studies uncover how a cancer-associated polymorphism mechanistically contributes to an increased predisposition for bladder cancer development. Implications A polymorphic KLF5 binding site near the CCNE1 gene explains genetic risk identified through genome wide association studies. PMID:27514407
Lager, Malin; Mernelius, Sara; Löfgren, Sture; Söderman, Jan
2016-01-01
Healthcare-associated infections caused by Escherichia coli and antibiotic resistance due to extended-spectrum beta-lactamase (ESBL) production constitute a threat to patient safety. To identify, track, and control outbreaks and to detect emerging virulent clones, typing tools of sufficient discriminatory power that generate reproducible and unambiguous data are needed. A probe-based real-time PCR method targeting multiple single nucleotide polymorphisms (SNPs) was developed. The method was based on the multi-locus sequence typing scheme of the Institut Pasteur and on adaptation of previously described typing assays. An 8-SNP panel that reached a Simpson's diversity index of 0.95 was established, based on analysis of sporadic E. coli cases (ESBL n = 27 and non-ESBL n = 53). This multi-SNP assay was used to identify the sequence type 131 (ST131) complex according to Achtman's multi-locus sequence typing scheme. It did not fully discriminate within the complex, but it provided a diagnostic signature that outperformed a previously described detection assay. Pulsed-field gel electrophoresis typing of isolates from a presumed outbreak (n = 22) identified two outbreaks (ST127 and ST131) and three different non-outbreak-related isolates. Multi-SNP typing generated congruent data except for one non-outbreak-related ST131 isolate. We consider multi-SNP real-time PCR typing an accessible primary generic E. coli typing tool for rapid and uniform type identification.
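The Simpson's diversity index quoted above measures how likely two randomly chosen isolates are to be assigned different types. A minimal sketch, assuming the Hunter-Gaston form commonly used for typing methods, D = 1 - sum(n_j (n_j - 1)) / (N (N - 1)); the abstract does not say which variant was used, and the type labels below are made up for illustration:

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Discriminatory index of a typing scheme (Hunter-Gaston form, assumed).

    type_labels: one type assignment per isolate, e.g. SNP-profile strings.
    """
    counts = Counter(type_labels)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered isolate pairs that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Illustrative labels only, not data from the study.
profiles = ["P1", "P1", "P2", "P2"]
print(round(simpson_diversity(profiles), 3))
```

A value near 1 means almost every pair of unrelated isolates is separated, while a value of 0 means the panel assigns all isolates to one type.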
Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
2015-01-01
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs was selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6kSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. PMID:26070980
Xiao, P; Niu, L L; Zhao, Q J; Chen, X Y; Wang, L J; Li, L; Zhang, H P; Guo, J Z; Xu, H Y; Zhong, T
2017-11-16
The origins and phylogeny of different sheep breeds have been widely studied using polymorphisms within the mitochondrial hypervariable region. However, little is known about the mitochondrial DNA (mtDNA) content and phylogeny based on mtDNA protein-coding genes. In this study, we assessed the phylogeny and copy number of the mtDNA in eight indigenous (population size, n=184) and three introduced (n=66) sheep breeds in China based on five mitochondrial coding genes (COX1, COX2, ATP8, ATP6 and COX3). The mean haplotype and nucleotide diversities were 0.944 and 0.00322, respectively. We identified a correlation between the lineage distribution and the genetic distance, whereby Valley-type Tibetan sheep had a closer genetic relationship with introduced breeds (Dorper, Poll Dorset and Suffolk) than with other indigenous breeds. Similarly, the median-joining profile of haplotypes revealed the distribution of clusters according to genetic differences. Moreover, copy number analysis based on the five mitochondrial coding genes was affected by the genetic distance combined with genetic phylogeny; we also identified obvious non-synonymous mutations in ATP6 between the different levels of copy number expression. These results imply that differences in mitogenomic compositions resulting from geographical separation lead to differences in mitochondrial function.
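The haplotype and nucleotide diversities reported above are standard summary statistics for aligned sequences. A minimal sketch, assuming Nei's definitions (haplotype diversity h = n/(n-1) * (1 - sum(p_i^2)); nucleotide diversity as the mean number of pairwise differences per site) and equal-length, gap-free sequences; the toy sequences are invented and are not sheep mtDNA:

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity for a sample of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (assumes equal-length sequences)."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment for illustration only.
sample = ["AAAA", "AAAT", "AAAA", "TAAT"]
print(round(haplotype_diversity(sample), 4), round(nucleotide_diversity(sample), 4))
```

A high haplotype diversity together with a low nucleotide diversity, as in the abstract, indicates many distinct haplotypes that differ from one another at only a few sites.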
A novel approach to analyzing fMRI and SNP data via parallel independent component analysis
NASA Astrophysics Data System (ADS)
Liu, Jingyu; Pearlson, Godfrey; Calhoun, Vince; Windemuth, Andreas
2007-03-01
There is current interest in understanding genetic influences on brain function in both the healthy and the disordered brain. Parallel independent component analysis, a new method for analyzing multimodal data, is proposed in this paper and applied to functional magnetic resonance imaging (fMRI) and a single nucleotide polymorphism (SNP) array. The method aims to identify the independent components of each modality and the relationship between the two modalities. We analyzed 92 participants, including 29 schizophrenia (SZ) patients, 13 unaffected SZ relatives, and 50 healthy controls. We found a correlation of 0.79 between one fMRI component and one SNP component. The fMRI component consists of activations in cingulate gyrus, multiple frontal gyri, and superior temporal gyrus. Nine SNPs contribute significantly to the related SNP component; they are located in sets of genes including those coding for apolipoprotein A-I and C-III, malate dehydrogenase 1, and the gamma-aminobutyric acid alpha-2 receptor. A significant difference in the presence of this SNP component is found between the SZ group (SZ patients and their relatives) and the control group. In summary, we constructed a framework to identify the interactions between brain functional and genetic information; our findings provide new insight into understanding genetic influences on brain function in a common mental disorder.
Barnes, David J; Hookway, Edward; Athanasou, Nick; Kashima, Takeshi; Oppermann, Udo; Hughes, Simon; Swan, Daniel; Lueerssen, Dietrich; Anson, John; Hassan, A Bassim
2016-08-12
Melanotic neuroectodermal tumor of infancy (MNTI) is exceptionally rare and occurs predominantly in the head and neck (92.8 % cases). The patient reported here is only the eighth case of MNTI presenting in an extremity, and the first reported in the fibula. A 2-month-old female presented with a mass arising in the fibula. Exhaustive genomic, transcriptomic, epigenetic and pathological characterization was performed on the excised primary tumor and a derived cell line. Whole-exome analysis of genomic DNA from both the tumor and blood indicated no somatic, non-synonymous coding mutations within the tumor, but a heterozygous, unique germline, loss of function mutation in CDKN2A (p16(INK4A), D74A). SNP-array CGH on DNA samples revealed the tumor to be euploid, with no detectable gene copy number variants. Multiple chromosomal translocations were identified by RNA-Seq, and fusion genes included RPLP1-C19MC, potentially deregulating the C19MC cluster, an imprinted locus containing microRNA genes reactivated by gene fusion in embryonal tumors with multilayered rosettes. Since the presumed cell of origin of MNTI is from the neural crest, we also compared gene expression with a dataset from human neural crest cells and identified 185 genes with significantly different expression. Consistent with the melanotic phenotype of the tumor, elevated expression of tyrosinase was observed. Other highly expressed genes encoded muscle proteins and modulators of the extracellular matrix. A derived MNTI cell line was sensitive to inhibitors of lysine demethylase, but not to compounds targeting other epigenetic regulators. In the absence of somatic copy number variations or mutations, the fully transformed phenotype of the MNTI may have arisen in infancy because of the combined effects of a germline CDKN2A mutation, tumor promoting somatic fusion genes and epigenetic deregulation. 
Very little is known about the etiology of MNTI and this report advances knowledge of these rare tumors by providing the first comprehensive genomic, transcriptomic and epigenetic characterization of a case.
Das, Shouvik; Singh, Mohar; Srivastava, Rishi; Bajaj, Deepak; Saxena, Maneesha S.; Rana, Jai C.; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.
2016-01-01
The present study used a whole-genome, NGS resequencing-based mQTL-seq (multiple QTL-seq) strategy in two inter-specific mapping populations (Pusa 1103 × ILWC 46 and Pusa 256 × ILWC 46) to scan the major genomic region(s) underlying QTL(s) governing the pod number trait in chickpea. Essentially, the whole-genome resequencing of low and high pod number-containing parental accessions and homozygous individuals (constituting bulks) from each of these two mapping populations discovered >8 million high-quality homozygous SNPs with respect to the reference kabuli chickpea. The functional significance of the physically mapped SNPs was apparent from the identified 2,264 non-synonymous and 23,550 regulatory SNPs, with 8–10% of these SNP-carrying genes corresponding to transcription factors and disease resistance-related proteins. The utilization of these mined SNPs in Δ (SNP index)-led QTL-seq analysis and their correlation between two mapping populations based on mQTL-seq narrowed down two (CaqaPN4.1: 867.8 kb and CaqaPN4.2: 1.8 Mb) major genomic regions harbouring robust pod number QTLs into high-resolution short QTL intervals (CaqbPN4.1: 637.5 kb and CaqbPN4.2: 1.28 Mb) on chickpea chromosome 4. The integration of one novel robust mQTL-seq-derived QTL with QTL region-specific association analysis delineated one pentatricopeptide repeat (PPR) gene containing regulatory (C/T) and coding (C/A) SNPs at a major QTL region regulating pod number in chickpea. This target gene exhibited anther-, mature pollen- and pod-specific expression, including pronounced up-regulated (∼3.5-fold) transcript expression in high pod number-containing parental accessions and homozygous individuals of the two mapping populations, especially during pollen and pod development. 
The proposed mQTL-seq-driven combinatorial strategy has profound efficacy in rapid genome-wide scanning of potential candidate gene(s) underlying trait-associated high-resolution robust QTL(s), thereby expediting genomics-assisted breeding and genetic enhancement of crop plants, including chickpea. PMID:26685680
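The Δ (SNP index) statistic mentioned in this abstract can be illustrated with a small calculation. A minimal sketch, assuming the usual QTL-seq convention in which the SNP index of a bulk is the fraction of reads carrying the non-reference allele and Δ is the high-bulk index minus the low-bulk index; the function names and read counts are hypothetical, and the study's filtering and sliding-window steps are not reproduced here:

```python
def snp_index(alt_reads, total_reads):
    """Fraction of reads at a site that carry the non-reference allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def delta_snp_index(high_bulk, low_bulk):
    """Difference in SNP index between bulks (sign convention assumed).

    Each bulk is given as an (alt_reads, total_reads) pair for one SNP.
    """
    return snp_index(*high_bulk) - snp_index(*low_bulk)


# Hypothetical read counts for one SNP in the two bulks.
print(round(delta_snp_index((18, 20), (4, 20)), 2))
```

Sites where Δ stays far from zero across a contiguous window are the candidates for a QTL interval, since unlinked SNPs are expected to show similar allele fractions in both bulks.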
Multiple origins of resistance-conferring mutations in Plasmodium vivax dihydrofolate reductase
Hawkins, Vivian N; Auliff, Alyson; Prajapati, Surendra Kumar; Rungsihirunrat, Kanchana; Hapuarachchi, Hapuarachchige C; Maestre, Amanda; O'Neil, Michael T; Cheng, Qin; Joshi, Hema; Na-Bangchang, Kesara; Sibley, Carol Hopkins
2008-01-01
Background In order to maximize the useful therapeutic life of antimalarial drugs, it is crucial to understand the mechanisms by which parasites resistant to antimalarial drugs are selected and spread in natural populations. Recent work has demonstrated that pyrimethamine-resistance conferring mutations in Plasmodium falciparum dihydrofolate reductase (dhfr) have arisen rarely de novo, but spread widely in Asia and Africa. The origin and spread of mutations in Plasmodium vivax dhfr were assessed by constructing haplotypes based on sequencing dhfr and its flanking regions. Methods The P. vivax dhfr coding region, 792 bp upstream and 683 bp downstream were amplified and sequenced from 137 contemporary patient isolates from Colombia, India, Indonesia, Papua New Guinea, Sri Lanka, Thailand, and Vanuatu. A repeat motif located 2.6 kb upstream of dhfr was also sequenced from 75 of 137 patient isolates, and mutational relationships among the haplotypes were visualized using the programme Network. Results Synonymous and non-synonymous single nucleotide polymorphisms (SNPs) within the dhfr coding region were identified, as was the well-documented in-frame insertion/deletion (indel). SNPs were also identified upstream and downstream of dhfr, with an indel and a highly polymorphic repeat region identified upstream of dhfr. The regions flanking dhfr were highly variable. The double mutant (58R/117N) dhfr allele has evolved from several origins, because the 58R is encoded by at least 3 different codons. The triple (58R/61M/117T) and quadruple (57L/61M/117T/173F, 57I/58R/61M/117T and 57L/58R/61M/117T) mutant alleles had at least three independent origins in Thailand, Indonesia, and Papua New Guinea/Vanuatu. Conclusion It was found that the P. vivax dhfr coding region and its flanking intergenic regions are highly polymorphic and that mutations in P. vivax dhfr that confer antifolate resistance have arisen several times in the Asian region. 
This contrasts sharply with the selective sweep of rare antifolate resistant alleles observed in the P. falciparum populations in Asia and Africa. The finding of multiple origins of resistance-conferring mutations has important implications for drug policy. PMID:18442404
Multiple origins of resistance-conferring mutations in Plasmodium vivax dihydrofolate reductase.
Hawkins, Vivian N; Auliff, Alyson; Prajapati, Surendra Kumar; Rungsihirunrat, Kanchana; Hapuarachchi, Hapuarachchige C; Maestre, Amanda; O'Neil, Michael T; Cheng, Qin; Joshi, Hema; Na-Bangchang, Kesara; Sibley, Carol Hopkins
2008-04-28
In order to maximize the useful therapeutic life of antimalarial drugs, it is crucial to understand the mechanisms by which parasites resistant to antimalarial drugs are selected and spread in natural populations. Recent work has demonstrated that pyrimethamine-resistance conferring mutations in Plasmodium falciparum dihydrofolate reductase (dhfr) have arisen rarely de novo, but spread widely in Asia and Africa. The origin and spread of mutations in Plasmodium vivax dhfr were assessed by constructing haplotypes based on sequencing dhfr and its flanking regions. The P. vivax dhfr coding region, 792 bp upstream and 683 bp downstream were amplified and sequenced from 137 contemporary patient isolates from Colombia, India, Indonesia, Papua New Guinea, Sri Lanka, Thailand, and Vanuatu. A repeat motif located 2.6 kb upstream of dhfr was also sequenced from 75 of 137 patient isolates, and mutational relationships among the haplotypes were visualized using the programme Network. Synonymous and non-synonymous single nucleotide polymorphisms (SNPs) within the dhfr coding region were identified, as was the well-documented in-frame insertion/deletion (indel). SNPs were also identified upstream and downstream of dhfr, with an indel and a highly polymorphic repeat region identified upstream of dhfr. The regions flanking dhfr were highly variable. The double mutant (58R/117N) dhfr allele has evolved from several origins, because the 58R is encoded by at least 3 different codons. The triple (58R/61M/117T) and quadruple (57L/61M/117T/173F, 57I/58R/61M/117T and 57L/58R/61M/117T) mutant alleles had at least three independent origins in Thailand, Indonesia, and Papua New Guinea/Vanuatu. It was found that the P. vivax dhfr coding region and its flanking intergenic regions are highly polymorphic and that mutations in P. vivax dhfr that confer antifolate resistance have arisen several times in the Asian region. 
This contrasts sharply with the selective sweep of rare antifolate resistant alleles observed in the P. falciparum populations in Asia and Africa. The finding of multiple origins of resistance-conferring mutations has important implications for drug policy.
Ma, Li; Runesha, H Birali; Dvorkin, Daniel; Garbe, John R; Da, Yang
2008-01-01
Background Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) markers provide opportunities to detect epistatic SNPs associated with quantitative traits and to detect the exact mode of an epistasis effect. Computational difficulty is the main bottleneck for epistasis testing in large scale GWAS. Results The EPISNPmpi and EPISNP computer programs were developed for testing single-locus and epistatic SNP effects on quantitative traits in GWAS, including tests of three single-locus effects for each SNP (SNP genotypic effect, additive and dominance effects) and five epistasis effects for each pair of SNPs (two-locus interaction, additive × additive, additive × dominance, dominance × additive, and dominance × dominance) based on the extended Kempthorne model. EPISNPmpi is the parallel computing program for epistasis testing in large scale GWAS and achieved excellent scalability for large scale analysis and portability for various parallel computing platforms. EPISNP is the serial computing program based on the EPISNPmpi code for epistasis testing in small scale GWAS using commonly available operating systems and computer hardware. Three serial computing utility programs were developed for graphical viewing of test results and epistasis networks, and for estimating CPU time and disk space requirements. Conclusion The EPISNPmpi parallel computing program provides an effective computing tool for epistasis testing in large scale GWAS, and the epiSNP serial computing programs are convenient tools for epistasis analysis in small scale GWAS using commonly available computer hardware. PMID:18644146
Identification of Conflicting Selective Effects on Highly Expressed Genes
Higgs, Paul G.; Hao, Weilong; Golding, G. Brian
2007-01-01
Many different selective effects on DNA and proteins influence the frequency of codons and amino acids in coding sequences. Selection is often stronger on highly expressed genes. Hence, by comparing high- and low-expression genes it is possible to distinguish the factors that are selected by evolution. It has been proposed that highly expressed genes should (i) preferentially use codons matching abundant tRNAs (translational efficiency), (ii) preferentially use amino acids with low cost of synthesis, (iii) be under stronger selection to maintain the required amino acid content, and (iv) be selected for translational robustness. These effects act simultaneously and can be contradictory. We develop a model that combines these factors, and use Akaike’s Information Criterion for model selection. We consider pairs of paralogues that arose by whole-genome duplication in Saccharomyces cerevisiae. A codon-based model is used that includes asymmetric effects due to selection on highly expressed genes. The largest effect is translational efficiency, which is found to strongly influence synonymous, but not non-synonymous rates. Minimization of the cost of amino acid synthesis is implicated. However, when a more general measure of selection for amino acid usage is used, the cost minimization effect becomes redundant. Small effects that we attribute to selection for translational robustness can be identified as an improvement in the model fit on top of the effects of translational efficiency and amino acid usage. PMID:19430600
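Model selection by Akaike's Information Criterion, as used in this abstract, rewards fit while penalizing extra parameters. A minimal sketch, assuming the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L the maximized log-likelihood; the model names and numbers below are invented for illustration and do not come from the paper:

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def best_model(models):
    """Return the name of the model with the lowest AIC.

    models: dict mapping a model name to (log_likelihood, n_params).
    """
    return min(models, key=lambda name: aic(*models[name]))


# Hypothetical fits: adding one parameter must raise ln L by more than 1
# for the larger model to be preferred.
candidates = {
    "baseline": (-1050.0, 4),
    "with_translational_efficiency": (-1000.0, 5),
}
print(best_model(candidates))
```

This mirrors the logic described in the abstract, where an additional selective effect is retained only if it improves the fit beyond what its extra parameters would be expected to buy.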
TLR4 Asp299Gly polymorphism may be protective against chronic periodontitis.
Sellers, R M; Payne, J B; Yu, F; LeVan, T D; Walker, C; Mikuls, T R
2016-04-01
Periodontitis results from interplay between genetic and environmental factors. Single nucleotide polymorphisms (SNPs) in the coding region of the toll-like receptor 4 gene (TLR4) may be associated with periodontitis, although previous studies have been inconclusive. Moreover, the interactions of environmental factors, such as cigarette smoking (a major risk factor for periodontitis) and Porphyromonas gingivalis (a major periodontal pathogen), with the TLR4 coding region Asp299Gly SNP (rs4986790; a SNP associated with lipopolysaccharide-mediated inflammatory responses in periodontitis) have been largely ignored in previous reports. Therefore, the objective of this study was to examine the association of TLR4 Asp299Gly (rs4986790) with alveolar bone height loss (ABHL) and periodontitis, accounting for interactions between this SNP and smoking and P. gingivalis prevalence. The CD14/-260 SNP (rs2569190) served as a control, as a recent meta-analysis suggested no relationship between this SNP and periodontitis. This multicenter study included 617 participants who had rheumatoid arthritis or osteoarthritis. This report presents a secondary outcome from the primary case-control study examining the relationship of periodontitis with established rheumatoid arthritis. The Centers for Disease Control/American Academy of Periodontology case definitions of periodontitis were used for this analysis. Participants received a full-mouth clinical periodontal examination and panoramic radiograph. Percentage ABHL was measured on posterior teeth. The TLR4 Asp299Gly and CD14/-260 SNPs were selected a priori and genotypes were determined using the ImmunoChip array (Illumina®). Minor allele frequencies and associations with periodontitis and ABHL did not differ according to rheumatoid arthritis vs. osteoarthritis status; therefore, data from these two groups were pooled. The presence of P. gingivalis was detected in subgingival plaque by PCR. 
Multivariate ordinal logistic regression examined associations between the SNPs and periodontitis or ABHL. SNP interactions with smoking and P. gingivalis were analyzed. A significant, negative interaction was observed between the TLR4 SNP and the presence of P. gingivalis (p = 0.045) with respect to periodontitis. The TLR4 minor variant was also associated with less ABHL: 16.8% of individuals with low ABHL, 9.0% with moderate ABHL and 11.2% with high ABHL had the minor allele [p = 0.029; odds ratio = 0.58 (95% confidence interval: 0.36-0.95)]. The interaction between the TLR4 SNP and smoking was not significant with respect to periodontitis or ABHL. The CD14 SNP was not associated with periodontitis or ABHL. The TLR4 Asp299Gly SNP significantly interacted with P. gingivalis in conferring a decreased risk of periodontitis and may be protective against ABHL, a feature of periodontitis. Agents blocking TLR4 signaling, a strategy currently under investigation for the treatment of other inflammatory conditions, may warrant investigation in the context of periodontitis related to the presence of P. gingivalis. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Positive selection on the killer whale mitogenome
Foote, Andrew D.; Morin, Phillip A.; Durban, John W.; Pitman, Robert L.; Wade, Paul; Willerslev, Eske; Gilbert, M. Thomas P.; da Fonseca, Rute R.
2011-01-01
Mitochondria produce up to 95 per cent of the eukaryotic cell's energy. The coding genes of the mitochondrial DNA may therefore evolve under selection owing to metabolic requirements. The killer whale, Orcinus orca, is polymorphic, has a global distribution and occupies a range of ecological niches. It is therefore a suitable organism for testing this hypothesis. We compared a global dataset of the complete mitochondrial genomes of 139 individuals for amino acid changes that were associated with radical physico-chemical property changes and were influenced by positive selection. Two such selected non-synonymous amino acid changes were found; one in each of two ecotypes that inhabit the Antarctic pack ice. Both substitutions were associated with changes in local polarity, increased steric constraints and α-helical tendencies that could influence overall metabolic performance, suggesting a functional change. PMID:20810427
Whole genome survey of coding SNPs reveals a reproducible pathway determinant of Parkinson disease
Srinivasan, Balaji S; Doostzadeh, Jaleh; Absalan, Farnaz; Mohandessi, Sharareh; Jalili, Roxana; Bigdeli, Saharnaz; Wang, Justin; Mahadevan, Jaydev; Lee, Caroline LG; Davis, Ronald W; William Langston, J; Ronaghi, Mostafa
2009-01-01
It is quickly becoming apparent that situating human variation in a pathway context is crucial to understanding its phenotypic significance. Toward this end, we have developed a general method for finding pathways associated with traits that control for pathway size. We have applied this method to a new whole genome survey of coding SNP variation in 187 patients afflicted with Parkinson disease (PD) and 187 controls. We show that our dataset provides an independent replication of the axon guidance association recently reported by Lesnick et al. [PLoS Genet 2007;3:e98], and also indicates that variation in the ubiquitin-mediated proteolysis and T-cell receptor signaling pathways may predict PD susceptibility. Given this result, it is reasonable to hypothesize that pathway associations are more replicable than individual SNP associations in whole genome association studies. However, this hypothesis is complicated by a detailed comparison of our dataset to the second recent PD association study by Fung et al. [Lancet Neurol 2006;5:911–916]. Surprisingly, we find that the axon guidance pathway does not rank at the very top of the Fung dataset after controlling for pathway size. More generally, in comparing the studies, we find that SNP frequencies replicate well despite technologically different assays, but that both SNP and pathway associations are globally uncorrelated across studies. We thus have a situation in which an association between axon guidance pathway variation and PD has been found in 2 out of 3 studies. We conclude by relating this seeming inconsistency to the molecular heterogeneity of PD, and suggest future analyses that may resolve such discrepancies. PMID:18853455
Zonato, Valeria; Fedele, Giorgio; Kyriacou, Charalambos P
2016-01-01
couch potato (cpo) encodes an RNA binding protein that has been reported to be expressed in the peripheral and central nervous system of embryos, larvae and adults, including the major endocrine organ, the ring gland. A polymorphism in the D. melanogaster cpo gene coding region displays a latitudinal cline in frequency in North American populations, but as cpo lies within the inversion In(3R)Payne, which is at high frequencies and itself shows a strong cline on this continent, interpretation of the cpo cline is not straightforward. A second downstream SNP in strong linkage disequilibrium with the first has been claimed to be primarily responsible for the latitudinal cline in diapause incidence in USA populations. Here, we investigate the frequencies of these two cpo SNPs in populations of Drosophila throughout continental Europe. The advantage of studying cpo variation in Europe is the very low frequency of In(3R)Payne, which, as we reveal here, does not appear to be clinally distributed. We observe a very different geographical scenario for cpo variation from the one in North America, suggesting that the downstream SNP does not play a role in diapause. In an attempt to verify whether the SNPs influence diapause, we subsequently generated lines with different combinations of the two cpo SNPs on known timeless (tim) genetic backgrounds, because polymorphism in the clock gene tim plays a significant role in diapause inducibility. Our results reveal that the downstream cpo SNP does not seem to play any role in diapause induction in European populations, in contrast to the upstream coding cpo SNP. Consequently, all future diapause studies on strains of D. melanogaster should initially determine their tim and cpo status.
Scaltriti, Erika; Sassera, Davide; Comandatore, Francesco; Morganti, Marina; Mandalari, Carmen; Gaiarsa, Stefano; Bandi, Claudio; Zehender, Gianguglielmo; Bolzoni, Luca; Casadei, Gabriele
2015-01-01
We retrospectively analyzed a rare Salmonella enterica serovar Manhattan outbreak that occurred in Italy in 2009 to evaluate the potential of new genomic tools based on differential single nucleotide polymorphism (SNP) analysis in comparison with the gold standard genotyping method, pulsed-field gel electrophoresis. A total of 39 isolates were analyzed from patients (n = 15) and food, feed, animal, and environmental sources (n = 24), resulting in five different pulsed-field gel electrophoresis (PFGE) profiles. Isolates epidemiologically related to the outbreak clustered within the same pulsotype, SXB_BS.0003, without any further differentiation. Thirty-three isolates were considered for genomic analysis based on different sets of SNPs, core, synonymous, nonsynonymous, as well as SNPs in different codon positions, by Bayesian and maximum likelihood algorithms. Trees generated from core and nonsynonymous SNPs, as well as SNPs at the second and first plus second codon positions detailed four distinct groups of isolates within the outbreak pulsotype, discriminating outbreak-related isolates of human and food origins. Conversely, the trees derived from synonymous and third-codon-position SNPs clustered food and human isolates together, indicating that all outbreak-related isolates constituted a single clone, which was in line with the epidemiological evidence. Further experiments are in place to extend this approach within our regional enteropathogen surveillance system. PMID:25653407
Scaltriti, Erika; Sassera, Davide; Comandatore, Francesco; Morganti, Marina; Mandalari, Carmen; Gaiarsa, Stefano; Bandi, Claudio; Zehender, Gianguglielmo; Bolzoni, Luca; Casadei, Gabriele; Pongolini, Stefano
2015-04-01
We retrospectively analyzed a rare Salmonella enterica serovar Manhattan outbreak that occurred in Italy in 2009 to evaluate the potential of new genomic tools based on differential single nucleotide polymorphism (SNP) analysis in comparison with the gold standard genotyping method, pulsed-field gel electrophoresis. A total of 39 isolates were analyzed from patients (n=15) and food, feed, animal, and environmental sources (n=24), resulting in five different pulsed-field gel electrophoresis (PFGE) profiles. Isolates epidemiologically related to the outbreak clustered within the same pulsotype, SXB_BS.0003, without any further differentiation. Thirty-three isolates were considered for genomic analysis based on different sets of SNPs, core, synonymous, nonsynonymous, as well as SNPs in different codon positions, by Bayesian and maximum likelihood algorithms. Trees generated from core and nonsynonymous SNPs, as well as SNPs at the second and first plus second codon positions detailed four distinct groups of isolates within the outbreak pulsotype, discriminating outbreak-related isolates of human and food origins. Conversely, the trees derived from synonymous and third-codon-position SNPs clustered food and human isolates together, indicating that all outbreak-related isolates constituted a single clone, which was in line with the epidemiological evidence. Further experiments are in place to extend this approach within our regional enteropathogen surveillance system. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Naser, Sabri M; Vancanneyt, Marc; Hoste, Bart; Snauwaert, Cindy; Swings, Jean
2006-07-01
The applicability of a multilocus sequence analysis (MLSA)-based identification system for lactobacilli was evaluated. Two housekeeping genes that code for the phenylalanyl-tRNA synthase alpha-subunit (pheS) and RNA polymerase alpha-subunit (rpoA) were sequenced and analysed for members of the Lactobacillus salivarius species group. The type strains of Lactobacillus acidipiscis and Lactobacillus cypricasei were investigated further using a third gene that encodes the alpha-subunit of ATP synthase (atpA). The MLSA data revealed close relatedness between L. acidipiscis and L. cypricasei, with 99.8-100 % pheS, rpoA and atpA gene sequence similarities. Comparison of the 16S rRNA gene sequences of the type strains of the two species confirmed the close relatedness (99.8 % gene sequence similarity) between the two taxa. Similar phenotypes and high DNA-DNA binding values in the range of 84 to 97.5 % confirmed that L. acidipiscis and L. cypricasei are synonymous species. On the basis of the present study, it is proposed that Lactobacillus cypricasei is a later heterotypic synonym of Lactobacillus acidipiscis.
Kämpfer, Peter; Rückert, Christian; Blom, Jochen; Goesmann, Alexander; Wink, Joachim; Kalinowski, Jörn; Glaeser, Stefanie P
2017-08-01
On the basis of whole genome comparisons of Streptomyces griseorubiginosus and Streptomyces phaeopurpureus it could be shown that these two species are subjective synonyms. The names of both species have been published in the Approved Lists of Bacterial Names and, in such a case, normally Rule 24b (1) of the Prokaryotic Code applies, which reads: 'If two names compete for priority and if both names date from 1 January 1980 on an Approved List, the priority shall be determined by the date of the original publication of the name before 1 January 1980'. Streptomyces griseorubiginosus and Streptomyces phaeopurpureus were both effectively published in 1957, and for both publications, the exact date cannot be obtained. In this case a further statement of Rule 24 applies, which reads: 'If the names or epithets are of the same date, the author who first unites the taxa has the right to choose one of them, and his choice must be followed.' Hence we propose that Streptomyces phaeopurpureus is a later heterotypic subjective synonym of Streptomyces griseorubiginosus.
Emergent rules for codon choice elucidated by editing rare arginine codons in Escherichia coli
Napolitano, Michael G.; Landon, Matthieu; Gregg, Christopher J.; Lajoie, Marc J.; Govindarajan, Lakshmi; Mosberg, Joshua A.; Kuznetsov, Gleb; Goodman, Daniel B.; Vargas-Rodriguez, Oscar; Isaacs, Farren J.; Söll, Dieter; Church, George M.
2016-01-01
The degeneracy of the genetic code allows nucleic acids to encode amino acid identity as well as noncoding information for gene regulation and genome maintenance. The rare arginine codons AGA and AGG (AGR) present a case study in codon choice, with AGRs encoding important transcriptional and translational properties distinct from the other synonymous alternatives (CGN). We created a strain of Escherichia coli with all 123 instances of AGR codons removed from all essential genes. We readily replaced 110 AGR codons with the synonymous CGU codons, but the remaining 13 “recalcitrant” AGRs required diversification to identify viable alternatives. Successful replacement codons tended to conserve local ribosomal binding site-like motifs and local mRNA secondary structure, sometimes at the expense of amino acid identity. Based on these observations, we empirically defined metrics for a multidimensional “safe replacement zone” (SRZ) within which alternative codons are more likely to be viable. To evaluate synonymous and nonsynonymous alternatives to essential AGRs further, we implemented a CRISPR/Cas9-based method to deplete a diversified population of a wild-type allele, allowing us to evaluate exhaustively the fitness impact of all 64 codon alternatives. Using this method, we confirmed the relevance of the SRZ by tracking codon fitness over time in 14 different genes, finding that codons that fall outside the SRZ are rapidly depleted from a growing population. Our unbiased and systematic strategy for identifying unpredicted design flaws in synthetic genomes and for elucidating rules governing codon choice will be crucial for designing genomes exhibiting radically altered genetic codes. PMID:27601680
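The first step described in the abstract above, swapping in-frame AGA/AGG codons for a synonymous arginine codon, can be sketched roughly as follows. This is only an illustrative Python sketch, not the authors' pipeline: the function name `recode_agr` is hypothetical, the replacement is written as CGT (the DNA form of CGU), and it deliberately ignores the ribosomal binding site-like motifs and mRNA secondary structure that the study found to matter for the 13 "recalcitrant" codons.

```python
from typing import Tuple


def recode_agr(cds: str, replacement: str = "CGT") -> Tuple[str, int]:
    """Replace in-frame AGA/AGG codons with a synonymous arginine codon.

    `cds` is a DNA coding sequence read in frame from its first base.
    Returns the recoded sequence and the number of codons replaced.
    Naive sketch: no check of local motifs or mRNA structure is made.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    recoded = []
    replaced = 0
    # Walk the sequence codon by codon so out-of-frame AGA/AGG are untouched.
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon in ("AGA", "AGG"):
            recoded.append(replacement)
            replaced += 1
        else:
            recoded.append(codon)
    return "".join(recoded), replaced


if __name__ == "__main__":
    # ATG AGA AAG AGG TAA: two in-frame AGR codons are recoded.
    print(recode_agr("ATGAGAAAGAGGTAA"))
```

Working codon by codon rather than with a plain string replace keeps the reading frame intact, so an AGA that merely spans two neighbouring codons is left alone.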
2011-01-01
Background Integration of genomic variation with phenotypic information is an effective approach for uncovering genotype-phenotype associations. This requires an accurate identification of the different types of variation in individual genomes. Results We report the integration of the whole genome sequence of a single Holstein Friesian bull with data from single nucleotide polymorphism (SNP) and comparative genomic hybridization (CGH) array technologies to determine a comprehensive spectrum of genomic variation. The performance of resequencing SNP detection was assessed by combining SNPs that were identified to be either in identity by descent (IBD) or in copy number variation (CNV) with results from SNP array genotyping. Coding insertions and deletions (indels) were found to be enriched for size in multiples of 3 and were located near the N- and C-termini of proteins. For larger indels, a combination of split-read and read-pair approaches proved to be complementary in finding different signatures. CNVs were identified on the basis of the depth of sequenced reads, and by using SNP and CGH arrays. Conclusions Our results provide high resolution mapping of diverse classes of genomic variation in an individual bovine genome and demonstrate that structural variation surpasses sequence variation as the main component of genomic variability. Better accuracy of SNP detection was achieved with little loss of sensitivity when algorithms that implemented mapping quality were used. IBD regions were found to be instrumental for calculating resequencing SNP accuracy, while SNP detection within CNVs tended to be less reliable. CNV discovery was affected dramatically by platform resolution and coverage biases. The combined data for this study showed that at a moderate level of sequencing coverage, an ensemble of platforms and tools can be applied together to maximize the accurate detection of sequence and structural variants. PMID:22082336
RAD tag sequencing as a source of SNP markers in Cynara cardunculus L
2012-01-01
Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach generated a large and robust SNP dataset through the adoption of optimized filtering criteria. PMID:22214349
Estimate of within population incremental selection through branch imbalance in lineage trees
Liberman, Gilad; Benichou, Jennifer I.C.; Maman, Yaakov; Glanville, Jacob; Alter, Idan; Louzoun, Yoram
2016-01-01
Incremental selection within a population, defined as limited fitness changes following mutation, is an important aspect of many evolutionary processes. Strongly advantageous or deleterious mutations are detected using the synonymous to non-synonymous mutations ratio. However, there are currently no precise methods to estimate incremental selection. We here provide for the first time such a detailed method and show its precision in multiple cases of micro-evolution. The proposed method is a novel mixed lineage tree/sequence based method to detect within population selection as defined by the effect of mutations on the average number of offspring. Specifically, we propose to measure the log of the ratio between the number of leaves in lineage tree branches following synonymous and non-synonymous mutations. The method requires a high enough number of sequences, and a large enough number of independent mutations. It assumes that all mutations are independent events. It does not require a baseline model and is practically not affected by sampling biases. We show the method's wide applicability by testing it on multiple cases of micro-evolution. We show that it can detect genes and inter-genic regions using the selection rate and detect selection pressures in viral proteins and in the immune response to pathogens. PMID:26586802
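The statistic described above, the log of the ratio of leaf counts below branches carrying synonymous versus non-synonymous mutations, might be sketched in Python as below. Several details are assumptions made only for illustration and are not taken from the paper: the `Node` class and function names are hypothetical, each edge carries at most one mutation label ("S" or "NS"), leaf counts are averaged per mutation class, and the ratio is oriented as non-synonymous over synonymous so that negative values suggest purifying selection.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    # Mutation on the edge leading to this node: "S", "NS", or None.
    edge_mutation: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def count_leaves(node: Node) -> int:
    """Number of leaves in the subtree rooted at `node`."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


def leaf_counts_by_mutation(root: Node) -> Dict[str, List[int]]:
    """Collect, per mutation class, the leaf count below each mutated edge."""
    counts: Dict[str, List[int]] = {"S": [], "NS": []}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.edge_mutation in counts:
            counts[node.edge_mutation].append(count_leaves(node))
        stack.extend(node.children)
    return counts


def log_leaf_ratio(root: Node) -> float:
    """log(mean leaves after NS mutations / mean leaves after S mutations).

    The orientation and the use of means are assumptions of this sketch.
    """
    counts = leaf_counts_by_mutation(root)
    if not counts["S"] or not counts["NS"]:
        raise ValueError("need at least one S and one NS mutation")
    mean_s = sum(counts["S"]) / len(counts["S"])
    mean_ns = sum(counts["NS"]) / len(counts["NS"])
    return math.log(mean_ns / mean_s)


if __name__ == "__main__":
    # One synonymous branch with 4 leaves, one non-synonymous with 2 leaves.
    syn = Node("S", [Node() for _ in range(4)])
    nonsyn = Node("NS", [Node() for _ in range(2)])
    print(log_leaf_ratio(Node(None, [syn, nonsyn])))
```

Because the measure compares subtrees within the same lineage tree, it needs no external baseline model, which is consistent with the claim in the abstract.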
CAPRRESI: Chimera Assembly by Plasmid Recovery and Restriction Enzyme Site Insertion.
Santillán, Orlando; Ramírez-Romero, Miguel A; Dávila, Guillermo
2017-06-25
Here, we present chimera assembly by plasmid recovery and restriction enzyme site insertion (CAPRRESI). CAPRRESI benefits from many strengths of the original plasmid recovery method and introduces restriction enzyme digestion to ease DNA ligation reactions (required for chimera assembly). For this protocol, users clone wildtype genes into the same plasmid (pUC18 or pUC19). After the in silico selection of amino acid sequence regions where chimeras should be assembled, users obtain all the synonym DNA sequences that encode them. Ad hoc Perl scripts enable users to determine all synonym DNA sequences. After this step, another Perl script searches for restriction enzyme sites on all synonym DNA sequences. This in silico analysis is also performed using the ampicillin resistance gene (ampR) found on pUC18/19 plasmids. Users design oligonucleotides inside synonym regions to disrupt wildtype and ampR genes by PCR. After obtaining and purifying complementary DNA fragments, restriction enzyme digestion is accomplished. Chimera assembly is achieved by ligating appropriate complementary DNA fragments. pUC18/19 vectors are selected for CAPRRESI because they offer technical advantages, such as small size (2,686 base pairs), high copy number, advantageous sequencing reaction features, and commercial availability. The usage of restriction enzymes for chimera assembly eliminates the need for DNA polymerases yielding blunt-ended products. CAPRRESI is a fast and low-cost method for fusing protein-coding genes.
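The two in silico steps mentioned above, enumerating every synonymous DNA sequence for a chosen amino acid stretch and then scanning those sequences for restriction enzyme sites, could look roughly like this. The authors describe ad hoc Perl scripts; the Python below is an independent sketch with hypothetical function names, using the standard genetic code and two well-known recognition sites (EcoRI, GAATTC; BamHI, GGATCC) purely as examples.

```python
from itertools import product
from typing import Dict, Iterator, List, Tuple

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Reverse lookup: amino acid -> all codons that encode it.
CODONS_FOR: Dict[str, List[str]] = {}
for _codon, _aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(_aa, []).append(_codon)


def synonymous_sequences(peptide: str) -> Iterator[str]:
    """Yield every DNA sequence that encodes `peptide` (one-letter code)."""
    choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    for combo in product(*choices):
        yield "".join(combo)


def find_sites(
    peptide: str, sites: Dict[str, str]
) -> Dict[str, List[Tuple[str, int]]]:
    """Map each enzyme name to (sequence, position) pairs where its site occurs."""
    hits: Dict[str, List[Tuple[str, int]]] = {name: [] for name in sites}
    for seq in synonymous_sequences(peptide):
        for name, site in sites.items():
            pos = seq.find(site)
            while pos != -1:
                hits[name].append((seq, pos))
                pos = seq.find(site, pos + 1)
    return hits


if __name__ == "__main__":
    enzymes = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}
    # Glu-Phe can be encoded as GAATTC, which is an EcoRI site.
    print(find_sites("EF", enzymes))
```

The number of synonymous sequences grows multiplicatively with peptide length, so in practice this kind of exhaustive enumeration is only sensible for short regions.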
Development of a set of SNP markers present in expressed genes of the apple.
Chagné, David; Gasic, Ksenija; Crowhurst, Ross N; Han, Yuepeng; Bassett, Heather C; Bowatte, Deepa R; Lawrence, Timothy J; Rikkerink, Erik H A; Gardiner, Susan E; Korban, Schuyler S
2008-11-01
Molecular markers associated with gene coding regions are useful tools for bridging functional and structural genomics. Due to their high abundance in plant genomes, single nucleotide polymorphisms (SNPs) are present within virtually all genomic regions, including most coding sequences. The objective of this study was to develop a set of SNPs for the apple by taking advantage of the wealth of genomics resources available for the apple, including a large collection of expressed sequence tags (ESTs). Using bioinformatics tools, a search for SNPs within an EST database of approximately 350,000 sequences developed from a variety of apple accessions was conducted. This resulted in the identification of a total of 71,482 putative SNPs. As the apple genome is reported to be an ancient polyploid, attempts were made to verify whether those SNPs detected in silico were attributable either to allelic polymorphisms or to gene duplication or paralogous or homeologous sequence variations. To this end, a set of 464 PCR primer pairs was designed, PCR amplification was performed on two subsets of plants, and the PCR products were sequenced. The SNPs retrieved from these sequences were then mapped onto apple genetic maps, including a newly constructed map of a Royal Gala x A689-24 cross and a Malling 9 x Robusta 5 map, using a bin mapping strategy. The SNP genotyping was performed using the high-resolution melting (HRM) technique. A total of 93 new markers containing 210 coding SNPs were successfully mapped. This new set of SNP markers for the apple offers new opportunities for understanding the genetic control of important horticultural traits using quantitative trait loci (QTL) or linkage disequilibrium analysis. These also serve as useful markers for aligning physical and genetic maps, and as potential transferable markers across the Rosaceae family.
Jeukens, Julie; Bernatchez, Louis
2012-01-01
While gene expression divergence is known to be involved in adaptive phenotypic divergence and speciation, the relative importance of regulatory and structural evolution of genes is poorly understood. A recent next-generation sequencing experiment identified candidate genes potentially involved in the ongoing speciation of sympatric dwarf and normal lake whitefish (Coregonus clupeaformis), such as cytosolic malate dehydrogenase (MDH1), which showed both significant expression and sequence divergence. The main goal of this study was to investigate in more detail the signatures of natural selection in the regulatory and coding sequences of MDH1 in lake whitefish and test for parallelism of these signatures with other coregonine species. Sequencing of the two regions in 118 fish from four sympatric pairs of whitefish and two cisco species revealed a total of 35 single nucleotide polymorphisms (SNPs), with more genetic diversity in European compared to North American coregonine species. While the coding region was found to be under purifying selection, an SNP in the proximal promoter exhibited significant allele frequency divergence in a parallel manner among independent sympatric pairs of North American lake whitefish and European whitefish (C. lavaretus). According to transcription factor binding simulation for 22 regulatory haplotypes of MDH1, putative binding profiles were fairly conserved among species, except for the region around this SNP. Moreover, we found evidence for the role of this SNP in the regulation of MDH1 expression level. Overall, these results provide further evidence for the role of natural selection in gene regulation evolution among whitefish species pairs and suggest its possible link with patterns of phenotypic diversity observed in coregonine species. PMID:22408741
Wolf, Christiane; Angelberger, Marianne; Diegelmann, Julia; Olszak, Torsten; Beigel, Florian; Tillack, Cornelia; Stallhofer, Johannes; Göke, Burkhard; Glas, Jürgen; Lohse, Peter; Brand, Stephan
2014-01-01
Background Very recently, a sub-analysis of genome-wide association scans revealed that the non-coding single nucleotide polymorphism (SNP) rs12212067 in the FOXO3A gene is associated with a milder course of Crohn's disease (CD) (Cell 2013;155:57–69). The aim of our study was to evaluate the clinical value of the SNP rs12212067 in predicting the severity of CD by correlating CD patient genotype status with the most relevant complications of CD such as stenoses, fistulas, and CD-related surgery. Methodology/Principal Findings We genotyped 550 CD patients for rs12212067 (FOXO3A) and the three common CD-associated NOD2 mutations rs2066844, rs2066845, and rs2066847 and performed genotype-phenotype analyses. Results No significant phenotypic differences were found between the wild-type genotype TT of the FOXO3A SNP rs12212067 and the minor genotypes TG and GG independently from NOD2 variants. The allele frequency of the minor G allele was 12.7%. Age at diagnosis, disease duration, body mass index, surgery rate, stenoses, fistula, need for immunosuppressive therapy, and disease course were not significantly different. In contrast, the NOD2 mutant p.Leu1007fsX1008 (rs2066847) was highly associated with penetrating CD (p = 0.01), the development of fistulas (p = 0.01) and stenoses (p = 0.01), and ileal disease localization (p = 0.03). Importantly, the NOD2 SNP rs2066847 was a strong separator between an aggressive and a mild course of CD (p = 2.99×10−5), while the FOXO3A SNP rs12212067 did not separate between mild and aggressive CD behavior in our cohort (p = 0.35). 96.2% of the homozygous NOD2 p.Leu1007fsX1008 carriers had an aggressive disease behavior compared to 69.3% of the patients with the NOD2 wild-type genotype (p = 0.007).
Conclusion/Significance In clinical practice, the NOD2 variant p.Leu1007fsX1008 (rs2066847), in particular in homozygous form, is a much stronger marker for a severe clinical phenotype than the FOXO3A rs12212067 SNP for a mild disease course on an individual patient level despite its important impact on the inflammatory response of monocytes. PMID:25365249
Haplotype-based approach to known MS-associated regions increases the amount of explained risk
Khankhanian, Pouya; Gourraud, Pierre-Antoine; Lizee, Antoine; Goodin, Douglas S
2015-01-01
Genome-wide association studies (GWAS), using single nucleotide polymorphisms (SNPs), have yielded 110 non-human leucocyte antigen genomic regions that are associated with multiple sclerosis (MS). Despite this large number of associations, however, only 28% of MS-heritability can currently be explained. Here we compare the use of multi-SNP-haplotypes to the use of single-SNPs as alternative methods to describe MS genetic risk. SNP-haplotypes (of various lengths from 1 up to 15 contiguous SNPs) were constructed at each of the 110 previously identified, MS-associated, genomic regions. Even after correcting for the larger number of statistical comparisons made when using the haplotype-method, in 32 of the regions, the SNP-haplotype based model was markedly more significant than the single-SNP based model. By contrast, in no region was the single-SNP based model similarly more significant than the SNP-haplotype based model. Moreover, when we included the 932 MS-associated SNP-haplotypes (that we identified from 102 regions) as independent variables into a logistic linear model, the amount of MS-heritability, as assessed by Nagelkerke's R-squared, was 38%, which was considerably better than 29%, which was obtained by using only single-SNPs. This study demonstrates that SNP-haplotypes can be used to fine-map the genetic associations within regions of interest previously identified by single-SNP GWAS. Moreover, the amount of the MS genetic risk explained by the SNP-haplotype associations in the 110 MS-associated genomic regions was considerably greater when using SNP-haplotypes than when using single-SNPs. Also, the use of SNP-haplotypes can lead to the discovery of new regions of interest, which have not been identified by a single-SNP GWAS. PMID:26185143
Bolbase: a comprehensive genomics database for Brassica oleracea.
Yu, Jingyin; Zhao, Meixia; Wang, Xiaowu; Tong, Chaobo; Huang, Shunmou; Tehrim, Sadia; Liu, Yumei; Hua, Wei; Liu, Shengyi
2013-09-30
Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research. 
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase.
Araripe, Luciana O; Montenegro, Horácio; Lemos, Bernardo; Hartl, Daniel L
2010-12-14
Hybrid male sterility (HMS) is a common outcome of hybridization between closely related animal species. It arises because interactions between alleles that are functional within one species may be disrupted in hybrids. The identification of genes leading to hybrid sterility is of great interest for understanding the evolutionary process of speciation. In the current work we used marked P-element insertions as dominant markers to efficiently locate one genetic factor causing a severe reduction in fertility in hybrid males of Drosophila simulans and D. mauritiana. Our mapping effort identified a region of 9 kb on chromosome 3, containing three complete coding sequences and one partial coding sequence. Within this region, two annotated genes are suggested as candidates for the HMS factor, based on comparative molecular characterization and public-source information. Gene Taf1 is only partially contained in the region, yet shows high polymorphism with four fixed non-synonymous substitutions between the two species. Its molecular functions involve sequence-specific DNA binding and transcription factor activity. Gene agt is a small, intronless gene, whose molecular function is annotated as methylated-DNA-protein-cysteine S-methyltransferase activity. High polymorphism and one fixed non-synonymous substitution suggest this is a fast evolving gene. The gene trees of both genes perfectly separate D. simulans and D. mauritiana into monophyletic groups. Analysis of gene expression using microarrays revealed trends that were similar to those previously found in comparisons between whole-genome hybrids and parental species. Identification and subsequent confirmation of the HMS candidate gene will add another case study toward understanding the evolutionary process of hybrid incompatibility.
Chen, Shanyuan; Gomes, Rui; Costa, Vânia; Santos, Pedro; Charneca, Rui; Zhang, Ya-ping; Liu, Xue-hong; Wang, Shao-qing; Bento, Pedro; Nunes, Jose-Luis; Buzgó, József; Varga, Gyula; Anton, István; Zsolnai, Attila; Beja-Pereira, Albano
2013-10-01
The coexistence of wild boars and domestic pigs across Eurasia makes it feasible to conduct comparative genetic or genomic analyses for addressing how genetically different a domestic species is from its wild ancestor. To test whether there are differences in patterns of genetic variability between wild and domestic pigs at immunity-related genes and to detect outlier loci putatively under selection that may underlie differences in immune responses, here we analyzed 54 single-nucleotide polymorphisms (SNPs) of 19 immunity-related candidate genes on 11 autosomes in three pairs of wild boar and domestic pig populations from China, the Iberian Peninsula, and Hungary. Our results showed no statistically significant differences in allele frequency and heterozygosity across SNPs between the three pairs of wild and domestic populations. This observation is most likely due to the widespread and long-lasting gene flow between wild boars and domestic pigs across Eurasia. In addition, we detected eight coding SNPs from six genes as outliers under selection, consistently across three outlier tests (BayeScan2.1, FDIST2, and Arlequin3.5). Among four non-synonymous outlier SNPs, one from the TLR4 gene was identified as being subject to positive (diversifying) selection, and three, one each from the CD36, IFNW1, and IL1B genes, were suggested to be under balancing selection. All four of these non-synonymous variants were predicted to be benign by PolyPhen-2. Our results were supported by other independent lines of evidence for positive selection or balancing selection acting on these four immune genes (CD36, IFNW1, IL1B, and TLR4). Our study provides an example of applying a candidate gene approach to identify functionally important mutations (i.e., outlier loci) in wild and domestic pigs for subsequent functional experiments.
Shi, Yan-Hui; Wang, Bin; Xu, Bai-Ping; Jiang, Dan-Na; Zhao, Dong-Mei; Ji, Man-Ru; Zhou, Li; Li, Xue; Lu, Chang-Zhu
2016-11-01
Hepatocellular carcinoma is a complex polygenic disease. Despite the huge advances in genetic epidemiology, it still remains a challenge to unveil the genetic architecture of hepatocellular carcinoma. We therefore meta-analytically assessed the association of six non-synonymous coding variants from the XRCC1, XRCC3 and XPD genes with hepatocellular carcinoma risk by pooling the results of 20 English-language articles. This meta-analysis was conducted according to the PRISMA statement, and data collection was independently completed in duplicate. In overall analyses, the minor alleles of four variants, Arg280His (odds ratio, 95% confidence interval, P: 1.37, 1.13-1.66, 0.001), Thr241Met (1.93, 1.17-3.20, 0.011), Asp312Asn (1.22, 1.08-1.38, 0.001) and Lys751Gln (1.42, 1.02-1.97, 0.038), were significantly associated with increased risk of hepatocellular carcinoma. There were low probabilities of publication bias for all variants. Subgroup analyses revealed a significant association of XRCC1 gene Arg399Gln with hepatocellular carcinoma in Chinese populations, especially from south China (odds ratio, 95% confidence interval, P: 1.57, 1.16-2.14, 0.004), in larger studies (1.48, 1.11-1.98, 0.007) and in studies with population-based controls (1.33, 1.06-1.68, 0.016). Taken together, our findings demonstrated that XPD gene Asp312Asn and XRCC1 gene Arg399Gln might be candidate susceptibility loci for hepatocellular carcinoma. Considering the ubiquity of genetic heterogeneity, further validation in a broad range of ethnic populations is warranted. © 2016 The Authors. Journal of Cellular and Molecular Medicine published by John Wiley & Sons Ltd and Foundation for Cellular and Molecular Medicine.
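As a rough consistency check on pooled estimates like those above, a reported odds ratio and its 95% confidence interval can be converted back into a standard error, a z statistic and a two-sided P value on the log scale. The sketch below assumes the interval is a Wald interval symmetric on log(OR), which the abstract does not state. The function name `p_from_or_ci` is illustrative only.

```python
import math

def p_from_or_ci(or_value, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z and two-sided P from an odds ratio and its 95% CI.

    Assumption: the CI is a Wald interval, symmetric on the log scale.
    """
    # Width of the interval on the log scale is 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(or_value) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Arg280His values as reported in the abstract: OR 1.37, 95% CI 1.13-1.66.
se, z, p = p_from_or_ci(1.37, 1.13, 1.66)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.4f}")
```

Because the inputs are rounded to two decimals, the recovered P value can only be expected to agree with the published one approximately.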
Singh, Kanhaiya; Goyal, Prabhjot; Singh, Manju; Deshmukh, Sujit; Upadhyay, Divyesh; Kant, Sri; Agrawal, Neeraj K; Gupta, Sanjeev K; Singh, Kiran
2017-12-01
Retinal angiogenesis is a hallmark of diabetic retinopathy. Matrix metalloproteinases (MMPs) are involved in degradation of the extracellular matrix (ECM). The functional SNP -1562C>T in the promoter of the MMP-9 gene results in increased transcriptional activity. The present work was designed to evaluate the contribution of the functional SNP -1562C>T of the MMP-9 gene to the risk of proliferative diabetic retinopathy (PDR) in type 2 diabetes mellitus (T2DM) patients in a north Indian population. This case-control study comprised a total of 645 individuals: 320 T2DM patients, of whom 73 had PDR, 98 had non-proliferative diabetic retinopathy (NPDR) and 149 had no eye-related disease (DM), and 325 non-diabetic healthy individuals as controls (non-DM controls). Genotyping for SNP -1562C>T of MMP-9 was done by polymerase chain reaction followed by restriction analysis with specific endonucleases (PCR-RFLP). DNA sequencing was used to confirm the PCR-RFLP results. The T allele frequency was 32.1% in PDR patients, 20.4% in NPDR, 15.4% in DM and 13.7% in controls. Statistically significant differences were observed in both allele and genotype distributions between the PDR and non-DM control groups (p<0.0001 for the T allele; p=0.002 for the TT and p<0.0001 for the CT genotype). The present study suggests that the functional SNP -1562C>T in the promoter of the MMP-9 gene could be regarded as a major risk factor for PDR, as increased MMP-9 production from the high-expressing T allele may promote retinal angiogenesis. Copyright © 2017 Elsevier Inc. All rights reserved.
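The allele frequencies reported above can be turned into an approximate allelic odds ratio. The abstract does not report an odds ratio or the underlying allele counts, so the following is only a back-of-the-envelope sketch based on the rounded percentages. The helper name `allelic_odds_ratio` is hypothetical.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying an allele, from allele frequencies given as fractions."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

# T allele frequencies from the abstract: PDR 32.1%, non-DM controls 13.7%.
or_pdr_vs_controls = allelic_odds_ratio(0.321, 0.137)
print(round(or_pdr_vs_controls, 2))
```

A confidence interval would need the actual allele counts, which are not given in the abstract, so none is computed here.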
Generation and analysis of expressed sequence tags in the extreme large genomes Lilium and Tulipa
2012-01-01
Background Bulbous flowers such as lily and tulip (Liliaceae family) are monocot perennial herbs that are economically very important ornamental plants worldwide. However, hardly any genetic studies have been performed and genomic resources are lacking. To build genomic resources and develop tools to speed up breeding in both crops, next generation sequencing was implemented. We sequenced and assembled transcriptomes of four lily and five tulip genotypes using 454 pyro-sequencing technology. Results We developed the first set of 81,791 contigs with an average length of 514 bp for tulip, and enriched the very limited number of 3,329 available ESTs (Expressed Sequence Tags) for lily with 52,172 contigs with an average length of 555 bp. The contigs together with singletons covered on average 37% of the estimated lily transcriptome and 39% of the estimated tulip transcriptome. Mining lily and tulip sequence data for SSRs (Simple Sequence Repeats) showed that di-nucleotide repeats were twice as abundant in UTRs (UnTranslated Regions) as in coding regions, while tri-nucleotide repeats were equally spread over coding regions and UTRs. Two sets of single nucleotide polymorphism (SNP) markers suitable for high throughput genotyping were developed. In the first set, no SNPs flanking the target SNP (50 bp on either side) were allowed. In the second set, one SNP in the flanking regions was allowed, which resulted in a 2 to 3 fold increase in SNP marker numbers compared with the first set. Orthologous groups between the two flower bulbs, lily and tulip (12,017 groups), and among the three monocot species, lily, tulip, and rice (6,900 groups), were determined using OrthoMCL. Orthologous groups were screened for common SNP markers and EST-SSRs to study synteny between lily and tulip, which resulted in 113 common SNP markers and 292 common EST-SSRs. The lily and tulip contigs generated were annotated and described according to Gene Ontology terminology.
Conclusions Two transcriptome sets were built that are valuable resources for marker development, comparative genomic studies and candidate gene approaches. Next generation sequencing of leaf transcriptome is very effective; however, deeper sequencing and using more tissues and stages is advisable for extended comparative studies. PMID:23167289
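SSR mining of the kind described above can be sketched with a regular expression that looks for tandem repeats of a short unit. The abstract does not say which tool or thresholds were used. The function name `find_ssrs`, the minimum repeat count and the example sequence are all assumptions made for illustration.

```python
import re

def find_ssrs(seq, unit_len, min_repeats):
    """Return (start, unit, repeat_count) for tandem repeats of a unit_len-bp motif.

    Homopolymer runs (e.g. AA repeated) are skipped, since they are not
    true di- or tri-nucleotide repeats.
    """
    # Group 2 captures the repeat unit; \2 then requires further copies of it.
    pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(2)
        if len(set(unit)) == 1:
            continue
        hits.append((match.start(), unit, len(match.group(1)) // unit_len))
    return hits

# Hypothetical contig fragment containing an (AG)6 di-nucleotide repeat.
example = "TTAGAGAGAGAGAGCC"
print(find_ssrs(example, unit_len=2, min_repeats=6))
```

Comparing repeat counts between UTRs and coding regions would also require ORF coordinates for each contig, which this sketch does not attempt to model.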
Joseph, S; Schmidt, L M; Danquah, W B; Timper, P; Mekete, T
2017-02-01
The aim was to generate single spore lines from a population of Pasteuria penetrans, a bacterial parasite of root-knot nematode (RKN), isolated from Florida and to examine the genotypic variation and virulence characteristics that exist within the population. Six single spore lines (SSP), 16SSP, 17SSP, 18SSP, 25SSP, 26SSP and 30SSP, were generated. Genetic variability was evaluated by comparing single-nucleotide polymorphisms (SNPs) in six protein-coding genes and the 16S rRNA gene. An average of one SNP was observed for every 69 bp in the 16S rRNA, whereas no SNPs were observed in the protein-coding sequences. Hierarchical cluster analysis of 16S rRNA sequences placed the clones into three distinct clades. Bio-efficacy analysis revealed significant heterogeneity in the level of virulence and host specificity among the individual clones. The SNP markers developed for the 5' hypervariable region of the 16S rRNA gene may be useful in biotype differentiation within a population of P. penetrans. This study demonstrates an efficient method for generating single spore lines of P. penetrans and provides insight into the genetic heterogeneity and varying levels of virulence that exist within a population parasitizing a specific Meloidogyne sp. host. The results also suggest that the application of generalist spore lines in nematode management may achieve broad RKN control. © 2016 The Society for Applied Microbiology.
Evolution of the F-Box Gene Family in Euarchontoglires: Gene Number Variation and Selection Patterns
Wang, Ailan; Fu, Mingchuan; Jiang, Xiaoqian; Mao, Yuanhui; Li, Xiangchen; Tao, Shiheng
2014-01-01
F-box proteins are substrate adaptors used by the SKP1–CUL1–F-box protein (SCF) complex, a type of E3 ubiquitin ligase complex in the ubiquitin proteasome system (UPS). SCF-mediated ubiquitylation regulates proteolysis of hundreds of cellular proteins involved in key signaling and disease systems. However, our knowledge of the evolution of the F-box gene family in Euarchontoglires is limited. In the present study, 559 F-box genes and nine related pseudogenes were identified in eight genomes. Lineage-specific gene gain and loss events occurred during the evolution of Euarchontoglires, resulting in varying F-box gene numbers ranging from 66 to 81 among the eight species. Both tandem duplication and retrotransposition were found to have contributed to the increase of F-box gene number, whereas mutation in the F-box domain was the main mechanism responsible for reduction in the number of F-box genes, resulting in a balance of expansion and contraction in the F-box gene family. Thus, the Euarchontoglire F-box gene family evolved under a birth-and-death model. Signatures of positive selection were detected in substrate-recognizing domains of multiple F-box proteins, and adaptive changes played a role in evolution of the Euarchontoglire F-box gene family. In addition, single nucleotide polymorphism (SNP) distributions were found to be highly non-random among different regions of F-box genes in 1092 human individuals, with domain regions having a significantly lower number of non-synonymous SNPs. PMID:24727786
Optimal Design of Low-Density SNP Arrays for Genomic Prediction: Algorithm and Applications
Wu, Xiao-Lin; Xu, Jiaqi; Feng, Guofei; Wiggans, George R.; Taylor, Jeremy F.; He, Jun; Qian, Changsong; Qiu, Jiansheng; Simpson, Barry; Walker, Jeremy; Bauck, Stewart
2016-01-01
Low-density (LD) single nucleotide polymorphism (SNP) arrays provide a cost-effective solution for genomic prediction and selection, but algorithms and computational tools are needed for the optimal design of LD SNP chips. A multiple-objective, local optimization (MOLO) algorithm was developed for design of optimal LD SNP chips that can be imputed accurately to medium-density (MD) or high-density (HD) SNP genotypes for genomic prediction. The objective function facilitates maximization of non-gap map length and system information for the SNP chip, and the latter is computed either as locus-averaged (LASE) or haplotype-averaged Shannon entropy (HASE) and adjusted for uniformity of the SNP distribution. HASE performed better than LASE with ≤1,000 SNPs, but required considerably more computing time. Nevertheless, the differences diminished when >5,000 SNPs were selected. Optimization was accomplished conditionally on the presence of SNPs that were obligated to each chromosome. The frame location of SNPs on a chip can be either uniform (evenly spaced) or non-uniform. For the latter design, a tunable empirical Beta distribution was used to guide location distribution of frame SNPs such that both ends of each chromosome were enriched with SNPs. The SNP distribution on each chromosome was finalized through the objective function that was locally and empirically maximized. This MOLO algorithm was capable of selecting a set of approximately evenly-spaced and highly-informative SNPs, which in turn led to increased imputation accuracy compared with selection solely of evenly-spaced SNPs. Imputation accuracy increased with LD chip size, and imputation error rate was extremely low for chips with ≥3,000 SNPs. Assuming that genotyping or imputation error occurs at random, imputation error rate can be viewed as the upper limit for genomic prediction error. Our results show that about 25% of imputation error rate was propagated to genomic prediction in an Angus population. 
The utility of this MOLO algorithm was also demonstrated in a real application, in which a 6K SNP panel was optimized conditional on 5,260 obligatory SNP selected based on SNP-trait association in U.S. Holstein animals. With this MOLO algorithm, both imputation error rate and genomic prediction error rate were minimal. PMID:27583971
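The locus-averaged Shannon entropy mentioned in the abstract above can be illustrated with a minimal sketch. It assumes biallelic SNPs summarized by a single allele frequency and entropy measured in bits. The paper's exact LASE definition, its adjustment for uniformity of the SNP distribution and the MOLO optimization itself are not reproduced here, and the function names and panel values are hypothetical.

```python
import math

def locus_entropy(p):
    """Shannon entropy (bits) of a biallelic locus with allele frequency p."""
    h = 0.0
    for q in (p, 1.0 - p):
        if q > 0.0:  # a zero-frequency allele contributes nothing
            h -= q * math.log2(q)
    return h

def locus_averaged_entropy(freqs):
    """Mean per-locus entropy over a candidate SNP panel."""
    return sum(locus_entropy(p) for p in freqs) / len(freqs)

# Hypothetical allele frequencies for a three-SNP candidate panel.
panel = [0.5, 0.25, 0.1]
print(round(locus_averaged_entropy(panel), 3))
```

Under this simplified definition, SNPs with allele frequencies near 0.5 contribute the most information, which is one reason an entropy term would favor highly informative markers.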
Huang, Dandan; Yi, Xianfu; Zhang, Shijie; Zheng, Zhanye; Wang, Panwen; Xuan, Chenghao; Sham, Pak Chung; Wang, Junwen; Li, Mulin Jun
2018-05-16
Genome-wide association studies have identified thousands of susceptibility loci for many human complex traits, and yet for most of these associations the true causal variants remain unknown. Tissue/cell type-specific prediction and prioritization of non-coding regulatory variants will facilitate the identification of causal variants and underlying pathogenic mechanisms for particular complex diseases and traits. By leveraging recent large-scale functional genomics/epigenomics data, we developed an intuitive web server, GWAS4D (http://mulinlab.tmu.edu.cn/gwas4d or http://mulinlab.org/gwas4d), that systematically evaluates GWAS signals and identifies context-specific regulatory variants. The updated web server includes six major features: (i) updates the regulatory variant prioritization method with our new algorithm; (ii) incorporates 127 tissue/cell type-specific epigenome datasets; (iii) integrates motifs of 1480 transcriptional regulators from 13 public resources; (iv) uniformly processes Hi-C data and generates significant interactions at 5 kb resolution across 60 tissues/cell types; (v) adds comprehensive non-coding variant functional annotations; (vi) provides a highly interactive visualization function for SNP-target interaction. Using a GWAS fine-mapped set for 161 coronary artery disease risk loci, we demonstrate that GWAS4D is able to efficiently prioritize disease-causal regulatory variants.
Mankowska, M; Stachowiak, M; Graczyk, A; Ciazynska, P; Gogulski, M; Nizanski, W; Switonski, M
2016-04-01
Obesity is an emerging health problem in purebred dogs. Due to their crucial role in energy homeostasis control, genes encoding adipokines are considered candidate genes, and their variants may be associated with predisposition to obesity. Searching for polymorphism was carried out in three adipokine genes (TNF, RETN and IL6). The study was performed on 260 dogs, including lean (n = 109), overweight (n = 88) and obese (n = 63) dogs. The largest cohort was represented by Labrador Retrievers (n = 136). Altogether, 24 novel polymorphisms were identified: 12 in TNF (including one missense SNP), eight in RETN (including one missense SNP) and four in IL6. Distributions of five common SNPs (two in TNF, two in RETN and one in IL6) were further analyzed with regard to body condition score. Two SNPs in the non-coding parts of TNF (c.-40A>C and c.233+14G>A) were associated with obesity in Labrador dogs. The obtained results showed that the studied adipokine genes are highly polymorphic and two polymorphisms in the TNF gene may be considered as markers predisposing Labrador dogs to obesity. © 2015 Stichting International Foundation for Animal Genetics.
Functional non-synonymous variants of ABCG2 and gout risk.
Stiburkova, Blanka; Pavelcova, Katerina; Zavada, Jakub; Petru, Lenka; Simek, Pavel; Cepek, Pavel; Pavlikova, Marketa; Matsuo, Hirotaka; Merriman, Tony R; Pavelka, Karel
2017-11-01
Common dysfunctional variants of ATP binding cassette subfamily G member 2 (Junior blood group) (ABCG2), a high-capacity urate transporter gene, that result in decreased urate excretion are major causes of hyperuricemia and gout. In the present study, our objective was to determine the frequency and effect on gout of common and rare non-synonymous and other functional allelic variants in the ABCG2 gene. The main cohort recruited from the Czech Republic consisted of 145 gout patients; 115 normouricaemic controls were used for comparison. We amplified, directly sequenced and analysed 15 ABCG2 exons. The associations between genetic variants and clinical phenotype were analysed using the t-test, Fisher's exact test and a logistic and linear regression approach. Data from a New Zealand Polynesian sample set and the UK Biobank were included for the p.V12M analysis. In the ABCG2 gene, 18 intronic (one dysfunctional splicing) and 11 exonic variants were detected: 9 were non-synonymous (2 common, 7 rare including 1 novel), namely p.V12M, p.Q141K, p.R147W, p.T153M, p.F373C, p.T434M, p.S476P, p.D620N and p.K360del. The p.Q141K (rs2231142) variant had a significantly higher minor allele frequency (0.23) in the gout patients compared with the European-origin population (0.09) and was significantly more common among gout patients than among normouricaemic controls (odds ratio = 3.26, P < 0.0001). Patients with non-synonymous allelic variants had an earlier onset of gout (42 vs 48 years, P = 0.0143) and a greater likelihood of a familial history of gout (41% vs 27%, odds ratio = 1.96, P = 0.053). In a meta-analysis p.V12M exerted a protective effect from gout (P < 0.0001). Genetic variants of ABCG2, common and rare, increased the risk of gout. Non-synonymous allelic variants of ABCG2 had a significant effect on earlier onset of gout and the presence of a familial gout history. ABCG2 should thus be considered a common and significant risk factor for gout. © The Author 2017. 
Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved.
Lynch, Ryan C.; Darcy, John L.; Kane, Nolan C.; Nemergut, Diana R.; Schmidt, Steve K.
2014-01-01
Previous surveys of very dry Atacama Desert mineral soils have consistently revealed sparse communities of non-photosynthetic microbes. The functional nature of these microorganisms remains debatable given the harshness of the environment and low levels of biomass and diversity. The aim of this study was to gain an understanding of the phylogenetic community structure and metabolic potential of a low-diversity mineral soil metagenome that was collected from a high-elevation Atacama Desert volcano debris field. We pooled DNA extractions from over 15 g of volcanic material and, using whole genome shotgun sequencing, observed only 75–78 total 16S rRNA gene OTUs at the 3% level. The phylogenetic structure of this community is significantly underdispersed, with actinobacterial lineages making up 97.9–98.6% of the 16S rRNA genes, suggesting a high degree of environmental selection. Due to this low diversity and uneven community composition, we assembled and analyzed the metabolic pathways of the most abundant genome, a Pseudonocardia sp. (56–72% of total 16S genes). Our assembly and binning efforts yielded almost 4.9 Mb of Pseudonocardia sp. contigs, which account for an estimated 99.3% of its non-repetitive genomic content. This genome contains a limited array of carbohydrate catabolic pathways, but encodes CO2 fixation via the Calvin cycle. The genome also encodes complete pathways for the catabolism of various trace gases (H2, CO and several organic C1 compounds) and the assimilation of ammonia and nitrate. We compared genomic content among related Pseudonocardia spp. and estimated rates of non-synonymous and synonymous nucleic acid substitutions between protein coding homologs. Collectively, these comparative analyses suggest that the community structure and various functional genes have undergone strong selection in the nutrient poor desert mineral soils and high-elevation atmospheric conditions. PMID:25566214
Wang, Nuohan; Ma, Jianjiang; Pei, Wenfeng; Wu, Man; Li, Haijing; Li, Xingli; Yu, Shuxun; Zhang, Jinfa; Yu, Jiwen
2017-03-01
Lysophosphatidic acid acyltransferase (LPAAT), encoded by a multigene family, is a rate-limiting enzyme in the Kennedy pathway in higher plants. Cotton is the most important natural fiber crop and one of the most important oilseed crops. However, little is known about the genes coding for LPAATs involved in oil biosynthesis with regard to their genome organization, diversity, expression, natural genetic variation, and association with fiber development and oil content in cotton. In this study, a comprehensive genome-wide analysis in four Gossypium species with genome sequences, i.e., tetraploid G. hirsutum (AD1) and G. barbadense (AD2) and their possible ancestral diploids G. raimondii (D5) and G. arboreum (A2), identified 13, 10, 8, and 9 LPAAT genes, respectively, that were divided into four subfamilies. RNA-seq analyses of the LPAAT genes in the widely grown G. hirsutum suggest their differential expression at the transcriptional level in developing cottonseeds and fibers. Although 10 LPAAT genes were co-localised with quantitative trait loci (QTL) for cottonseed oil or protein content within a 25-cM region, only one single strand conformation polymorphic (SSCP) marker developed from a synonymous single nucleotide polymorphism (SNP) of the At-Gh13LPAAT5 gene was significantly correlated with cottonseed oil and protein contents in one of the three field tests. Moreover, yeasts transformed with the At-Gh13LPAAT5 gene carrying either of the two sequences for the SNP gave similar results, i.e., a 25-31% increase in palmitic acid and oleic acid, and a 16-29% increase in total triacylglycerol (TAG). The results of this study demonstrated that the natural variation in the LPAAT genes available for improving cottonseed oil content and fiber quality is limited; therefore, traditional cross breeding should not be expected to make much progress in improving cottonseed oil content or fiber quality through marker-assisted selection for the LPAAT genes.
However, enhancing the expression of one of the LPAAT genes such as At-Gh13LPAAT5 can significantly increase the production of total TAG and other fatty acids, providing an incentive for further studies into the use of LPAAT genes to increase cottonseed oil content through biotechnology.
Fontanesi, Luca; Bertolini, Francesca; Scotti, Emilio; Schiavo, Giuseppina; Colombo, Michela; Trevisi, Paolo; Ribani, Anisa; Buttazzoni, Luca; Russo, Vincenzo; Dall'Olio, Stefania
2015-01-01
The GPR120 gene (also known as FFAR4 or O3FAR1) encodes a functional omega-3 fatty acid receptor/sensor that mediates potent insulin sensitizing effects by repressing macrophage-induced tissue inflammation. Given its functional role, GPR120 could be considered a potential target gene in animal nutrigenetics. In this work we resequenced the porcine GPR120 gene by high throughput Ion Torrent semiconductor sequencing of amplified fragments obtained from 8 DNA pools derived, on the whole, from 153 pigs of different breeds/populations (two Italian Large White pools, Italian Duroc, Italian Landrace, Casertana, Pietrain, Meishan, and wild boars). Three single nucleotide polymorphisms (SNPs), two synonymous substitutions and one in the putative 3'-untranslated region (g.114765469C > T), were identified and their allele frequencies were estimated from sequencing read counts. The g.114765469C > T SNP was also genotyped by PCR-RFLP, confirming the estimated frequency in the Italian Large White pools. This SNP was then analyzed in two Italian Large White cohorts using a selective genotyping approach based on extreme and divergent pigs for back fat thickness (BFT) estimated breeding value (EBV) and average daily gain (ADG) EBV. Significant differences in allele and genotype frequency distributions were observed between the extreme ADG-EBV groups (P < 0.001), whereas this marker was not associated with BFT-EBV.
Wu, Chen; Wang, Zhaoming; Song, Xin; Feng, Xiao-Shan; Abnet, Christian C; He, Jie; Hu, Nan; Zuo, Xian-Bo; Tan, Wen; Zhan, Qimin; Hu, Zhibin; He, Zhonghu; Jia, Weihua; Zhou, Yifeng; Yu, Kai; Shu, Xiao-Ou; Yuan, Jian-Min; Zheng, Wei; Zhao, Xue-Ke; Gao, She-Gan; Yuan, Zhi-Qing; Zhou, Fu-You; Fan, Zong-Min; Cui, Ji-Li; Lin, Hong-Li; Han, Xue-Na; Li, Bei; Chen, Xi; Dawsey, Sanford M; Liao, Linda; Lee, Maxwell P; Ding, Ti; Qiao, You-Lin; Liu, Zhihua; Liu, Yu; Yu, Dianke; Chang, Jiang; Wei, Lixuan; Gao, Yu-Tang; Koh, Woon-Puay; Xiang, Yong-Bing; Tang, Ze-Zhong; Fan, Jin-Hu; Han, Jing-Jing; Zhou, Sheng-Li; Zhang, Peng; Zhang, Dong-Yun; Yuan, Yuan; Huang, Ying; Liu, Chunling; Zhai, Kan; Qiao, Yan; Jin, Guangfu; Guo, Chuanhai; Fu, Jianhua; Miao, Xiaoping; Lu, Changdong; Yang, Haijun; Wang, Chaoyu; Wheeler, William A; Gail, Mitchell; Yeager, Meredith; Yuenger, Jeff; Guo, Er-Tao; Li, Ai-Li; Zhang, Wei; Li, Xue-Min; Sun, Liang-Dan; Ma, Bao-Gen; Li, Yan; Tang, Sa; Peng, Xiu-Qing; Liu, Jing; Hutchinson, Amy; Jacobs, Kevin; Giffen, Carol; Burdette, Laurie; Fraumeni, Joseph F; Shen, Hongbing; Ke, Yang; Zeng, Yixin; Wu, Tangchun; Kraft, Peter; Chung, Charles C; Tucker, Margaret A; Hou, Zhi-Chao; Liu, Ya-Li; Hu, Yan-Long; Liu, Yu; Wang, Li; Yuan, Guo; Chen, Li-Sha; Liu, Xiao; Ma, Teng; Meng, Hui; Sun, Li; Li, Xin-Min; Li, Xiu-Min; Ku, Jian-Wei; Zhou, Ying-Fa; Yang, Liu-Qin; Wang, Zhou; Li, Yin; Qige, Qirenwang; Yang, Wen-Jun; Lei, Guang-Yan; Chen, Long-Qi; Li, En-Min; Yuan, Ling; Yue, Wen-Bin; Wang, Ran; Wang, Lu-Wen; Fan, Xue-Ping; Zhu, Fang-Heng; Zhao, Wei-Xing; Mao, Yi-Min; Zhang, Mei; Xing, Guo-Lan; Li, Ji-Lin; Han, Min; Ren, Jing-Li; Liu, Bin; Ren, Shu-Wei; Kong, Qing-Peng; Li, Feng; Sheyhidin, Ilyar; Wei, Wu; Zhang, Yan-Rui; Feng, Chang-Wei; Wang, Jin; Yang, Yu-Hua; Hao, Hong-Zhang; Bao, Qi-De; Liu, Bao-Chi; Wu, Ai-Qun; Xie, Dong; Yang, Wan-Cai; Wang, Liang; Zhao, Xiao-Hang; Chen, Shu-Qing; Hong, Jun-Yan; Zhang, Xue-Jun; Freedman, Neal D; Goldstein, Alisa M; Lin, Dongxin; 
Taylor, Philip R; Wang, Li-Dong; Chanock, Stephen J
2014-09-01
We conducted a joint (pooled) analysis of three genome-wide association studies (GWAS) of esophageal squamous cell carcinoma (ESCC) in individuals of Chinese ancestry (5,337 ESCC cases and 5,787 controls) with 9,654 ESCC cases and 10,058 controls for follow-up. In a logistic regression model adjusted for age, sex, study and two eigenvectors, two new loci achieved genome-wide significance, marked by rs7447927 at 5q31.2 (per-allele odds ratio (OR) = 0.85, 95% confidence interval (CI) = 0.82-0.88; P = 7.72 × 10^-20) and rs1642764 at 17p13.1 (per-allele OR = 0.88, 95% CI = 0.85-0.91; P = 3.10 × 10^-13). rs7447927 is a synonymous SNP in TMEM173, and rs1642764 is an intronic SNP in ATP1B2, near TP53. Furthermore, a locus in the HLA class II region at 6p21.32 (rs35597309) achieved genome-wide significance in the two populations at highest risk for ESCC (OR = 1.33, 95% CI = 1.22-1.46; P = 1.99 × 10^-10). Our joint analysis identifies new ESCC susceptibility loci overall as well as a new locus unique to the population in the Taihang Mountain region at high risk of ESCC.
Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil
2015-02-01
The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka
2014-01-01
In Czech populations of European descent, we studied the SLC2A9 and SLC22A12 genes, previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 except for patients suffering from renal hypouricemia type 2. The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and direct sequencing. Allelic variants were prepared and their urate uptake and subcellular localization were studied using the Xenopus oocyte expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskal-Wallis tests; the association study used the Fisher exact test and a linear regression approach. We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 and in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon-located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants did not show any influence on expression, subcellular localization and urate uptake of GLUT9.
Common genetic variants of surfactant protein-D (SP-D) are associated with type 2 diabetes.
Pueyo, Neus; Ortega, Francisco J; Mercader, Josep M; Moreno-Navarrete, José M; Sabater, Monica; Bonàs, Sílvia; Botas, Patricia; Delgado, Elías; Ricart, Wifredo; Martinez-Larrad, María T; Serrano-Ríos, Manuel; Torrents, David; Fernández-Real, José M
2013-01-01
Surfactant protein-D (SP-D) is a primordial component of the innate immune system intrinsically linked to metabolic pathways. We aimed to study the association of single nucleotide polymorphisms (SNPs) affecting SP-D with insulin resistance and type 2 diabetes (T2D). We evaluated a common genetic variant located in the SP-D coding region (rs721917, Met(31)Thr) in a sample of T2D patients and non-diabetic controls (n = 2,711). In a subset of subjects (n = 1,062), this SNP was analyzed in association with circulating SP-D concentrations, insulin resistance, and T2D. This SNP and others were also screened in the publicly available Genome Wide Association (GWA) database of the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC). We found a significant association of rs721917 with circulating SP-D, parameters of insulin resistance and T2D. Indeed, G carriers showed decreased circulating SP-D (p = 0.004), decreased fasting glucose (p = 0.0002) and glycated hemoglobin (p = 0.0005), and a 33% (p = 0.002) lower prevalence of T2D, estimated under a dominant model, especially among women. Interestingly, these differences remained significant after controlling for origin, age, gender, and circulating SP-D. Moreover, this SNP and others within the SP-D genomic region (e.g. rs10887344) were significantly associated with quantitative measures of glucose homeostasis, insulin sensitivity, and T2D, according to GWAS datasets from MAGIC. SP-D gene polymorphisms are associated with insulin resistance and T2D. These associations are independent of circulating SP-D concentrations.
Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar
2014-01-01
A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions. PMID:25354211
Woo, Patrick C Y; Lau, Susanna K P; Li, Tong; Jose, Shanty; Yip, Cyril C Y; Huang, Yi; Wong, Emily Y M; Fan, Rachel Y Y; Cai, Jian-Piao; Wernery, Ulrich; Yuen, Kwok-Yung
2015-07-01
The recent emergence of Middle East respiratory syndrome coronavirus from the Middle East and the discovery of the virus from dromedary camels have boosted interest in the search for novel viruses in dromedaries. Whilst picornaviruses are known to infect various animals, their existence in dromedaries was unknown. We describe the discovery of a novel picornavirus, dromedary camel enterovirus (DcEV), from dromedaries in Dubai. Among 215 dromedaries, DcEV was detected in faecal samples of four (1.9 %) dromedaries [one (0.5 %) adult dromedary and three (25 %) dromedary calves] by reverse transcription PCR. Analysis of two DcEV genomes showed that DcEV was clustered with other species of the genus Enterovirus and was most closely related to and possessed highest amino acid identities to the species Enterovirus E and Enterovirus F found in cattle. The G+C content of DcEV was 45 mol%, which differed from that of Enterovirus E and Enterovirus F (49-50 mol%) by 4-5 %. Similar to other members of the genus Enterovirus, the 5' UTR of DcEV possessed a putative type I internal ribosome entry site. The low ratios of the number of nonsynonymous substitutions per non-synonymous site to the number of synonymous substitutions per synonymous site (Ka/Ks) of various coding regions suggested that dromedaries are the natural reservoir in which DcEV has been stably evolving. These results suggest that DcEV is a novel species of the genus Enterovirus in the family Picornaviridae. Western blot analysis using recombinant DcEV VP1 polypeptide showed a high seroprevalence of 52 % among serum samples from 172 dromedaries for IgG, concurring with its much higher infection rates in dromedary calves than in adults. Further studies are important to understand the pathogenicity, epidemiology and genetic evolution of DcEV in this unique group of animals.
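The G+C comparison above (45 mol% for DcEV versus 49-50 mol% for Enterovirus E and F) rests on a simple base count. A minimal sketch of that calculation follows; the helper name `gc_percent` and the toy sequence are hypothetical and are not DcEV data.

```python
def gc_percent(seq):
    """Return the G+C content of a nucleotide sequence as a percentage.

    Only unambiguous bases (A, C, G, T, U) are counted, so gaps and
    ambiguity codes such as N do not distort the denominator.
    """
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)

# Toy sequence for illustration only (not a real viral genome).
toy_genome = "AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG"
print(f"G+C content: {gc_percent(toy_genome):.1f} mol%")
```

Applied to two full genomes, the difference between the two returned values is the kind of percentage-point gap quoted in the abstract.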
NASA Astrophysics Data System (ADS)
Ma, Ruiqin; He, Feng; Wen, Haishen; Li, Jifang; Shi, Bao; Shi, Dan; Liu, Miao; Mu, Weijie; Zhang, Yuanqing; Hu, Jian; Han, Weiguo; Zhang, Jianan; Wang, Qingqing; Yuan, Yuren; Liu, Qun
2012-03-01
As a fish-specific gene, the cytochrome P450c17-II (CYP17-II) gene plays a key role in the growth, development and reproduction of fish. In this study, the single-stranded conformational polymorphism (SSCP) technique was used to characterize polymorphisms within the coding region of the CYP17-II gene in a population of 75 male Japanese flounder (Paralichthys olivaceus). Three single nucleotide polymorphisms (SNPs) were identified in the CYP17-II gene of Japanese flounder. They were c.G594A (p.G188R), c.G939A and c.G1502A (p.G490D). SNP1 (c.G594A), located in exon 4 of the CYP17-II gene, was significantly associated with gonadosomatic index (GSI). Individuals with genotype GG of SNP1 had significantly lower GSI (P < 0.05) than those with genotype AA or AG. SNP2 (c.G939A) was located in the CpG island of the CYP17-II gene. The mutation changed the methylation of exon 6. Individuals with genotype AA of SNP2 had significantly lower serum testosterone (T) level and hepatosomatic index (HSI) compared to those with genotype GG. The results suggested that SNP2 could influence the reproductive endocrine function of male Japanese flounder. However, SNP3 (c.G1502A), located in exon 9, did not affect the four measured reproductive traits. This study showed that the CYP17-II gene could be a potentially useful candidate gene for research on the genetic breeding and physiology of Japanese flounder.
Sequence diversity and molecular evolutionary rates between buffalo and cattle.
Moaeen-ud-Din, M; Bilal, G
2015-02-01
Identification of genes important for production traits in buffalo is impaired by a paucity of genomic resources. One way to fill this gap is to exploit the data available for cattle. The cross-species application of comparative genomics tools is a potential means of investigating the buffalo genome. However, this depends on nucleotide sequence similarity. In this study, gene diversity between buffalo and cattle was determined using 86 gene orthologues. There was approximately 3% difference in all genes in terms of nucleotide diversity and 0.267 ± 0.134 in amino acids, indicating the possibility of successfully using cross-species strategies for genomic studies. There were significantly higher non-synonymous substitutions in both cattle and buffalo; however, the difference in terms of dN-dS was similar (4.414 versus 4.745) in buffalo and cattle, respectively. A higher rate of non-synonymous substitutions at a similar level in buffalo and cattle indicated a similar positive selection pressure. Results for the relative rate test were assessed with the chi-squared test. There was no significant difference in unique mutations between cattle and buffalo lineages at synonymous sites. However, there was a significant difference in unique mutations at non-synonymous sites, indicating an ongoing mutagenic process that generates substitutional mutations at approximately the same rate at silent sites. Moreover, despite common ancestry, our results indicate different divergence times among genes of cattle and buffalo. This is the first demonstration that variable rates of molecular evolution may be present within the family Bovidae. © 2014 Blackwell Verlag GmbH.
RNA-based ovarian cancer research from 'a gene to systems biomedicine' perspective.
Gov, Esra; Kori, Medi; Arga, Kazim Yalcin
2017-08-01
Ovarian cancer remains the leading cause of death from a gynecologic malignancy, and treatment of this disease is harder than any other type of female reproductive cancer. Improvements in the diagnosis and development of novel and effective treatment strategies for complex pathophysiologies, such as ovarian cancer, require a better understanding of disease emergence and mechanisms of progression through systems medicine approaches. RNA-level analyses generate new information that can help in understanding the mechanisms behind disease pathogenesis, to identify new biomarkers and therapeutic targets and in new drug discovery. Whole RNA sequencing and coding and non-coding RNA expression array datasets have shed light on the mechanisms underlying disease progression and have identified mRNAs, miRNAs, and lncRNAs involved in ovarian cancer progression. In addition, the results from these analyses indicate that various signalling pathways and biological processes are associated with ovarian cancer. Here, we present a comprehensive literature review on RNA-based ovarian cancer research and highlight the benefits of integrative approaches within the systems biomedicine concept for future ovarian cancer research. We invite the ovarian cancer and systems biomedicine research fields to join forces to achieve the interdisciplinary caliber and rigor required to find real-life solutions to common, devastating, and complex diseases such as ovarian cancer. 
CAF: cancer-associated fibroblasts; COG: Cluster of Orthologous Groups; DEA: disease enrichment analysis; EOC: epithelial ovarian carcinoma; ESCC: oesophageal squamous cell carcinoma; GSI: gamma secretase inhibitor; GO: Gene Ontology; GSEA: gene set enrichment analysis; HAS: Hungarian Academy of Sciences; lncRNAs: long non-coding RNAs; MAPK/ERK: mitogen-activated protein kinase/extracellular signal-regulated kinases; NGS: next-generation sequencing; ncRNAs: non-coding RNAs; OvC: ovarian cancer; PI3K/Akt/mTOR: phosphatidylinositol-3-kinase/protein kinase B/mammalian target of rapamycin; RT-PCR: real-time polymerase chain reaction; SNP: single nucleotide polymorphism; TF: transcription factor; TGF-β: transforming growth factor-β.
Making a chocolate chip: development and evaluation of a 6K SNP array for Theobroma cacao.
Livingstone, Donald; Royaert, Stefan; Stack, Conrad; Mockaitis, Keithanne; May, Greg; Farmer, Andrew; Saski, Christopher; Schnell, Ray; Kuhn, David; Motamayor, Juan Carlos
2015-08-01
Theobroma cacao, the key ingredient in chocolate production, is one of the world's most important tree fruit crops, with ∼4,000,000 metric tons produced across 50 countries. To move towards gene discovery and marker-assisted breeding in cacao, a single-nucleotide polymorphism (SNP) identification project was undertaken using RNAseq data from 16 diverse cacao cultivars. RNA sequences were aligned to the assembled transcriptome of the cultivar Matina 1-6, and 330,000 SNPs within coding regions were identified. From these SNPs, a subset of 6,000 high-quality SNPs was selected for inclusion on an Illumina Infinium SNP array: the Cacao6kSNP array. Using Cacao6kSNP array data from over 1,000 cacao samples, we demonstrate that our custom array produces a saturated genetic map and can be used to distinguish among even closely related genotypes. Our study enhances and expands the genetic resources available to the cacao research community, and provides the genome-scale set of tools that are critical for advancing breeding with molecular markers in an agricultural species with high genetic diversity. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
SNPchiMp: a database to disentangle the SNPchip jungle in bovine livestock.
Nicolazzi, Ezequiel Luis; Picciolini, Matteo; Strozzi, Francesco; Schnabel, Robert David; Lawley, Cindy; Pirani, Ali; Brew, Fiona; Stella, Alessandra
2014-02-11
Currently, six commercial whole-genome SNP chips are available for cattle genotyping, produced by two different genotyping platforms. Technical issues need to be addressed to combine data that originate from the different platforms, or from different versions of the same array generated by the manufacturer. For example: i) genome coordinates for SNPs may refer to different genome assemblies; ii) reference genome sequences are updated over time, changing the positions of, or even removing, sequences which contain SNPs; iii) not all commercial SNP IDs are searchable within public databases; iv) SNPs can be coded using different formats and referencing different strands (e.g. A/B or A/C/T/G alleles, referencing forward/reverse, top/bottom or plus/minus strand); v) due to new information being discovered, higher density chips do not necessarily include all the SNPs present in the lower density chips; and vi) SNP IDs may not be consistent across chips and platforms. Most researchers and breed associations manage SNP data in real-time and thus require tools to standardise data in a user-friendly manner. Here we present SNPchiMp, a MySQL database linked to an open access web-based interface. Features of this interface include, but are not limited to, the following functions: 1) referencing the SNP mapping information to the latest genome assembly, 2) extraction of information contained in dbSNP for SNPs present in all commercially available bovine chips, and 3) identification of SNPs in common between two or more bovine chips (e.g. for SNP imputation from lower to higher density). In addition, SNPchiMp can retrieve this information on subsets of SNPs, accessing such data either via physical position on a supported assembly, or by a list of SNP IDs, rs or ss identifiers. This tool combines many different sources of information that are otherwise time consuming to obtain and difficult to integrate. 
The SNPchiMp not only provides the information in a user-friendly format, but also enables researchers to perform a large number of operations with a few clicks of the mouse. This significantly reduces the time needed to execute the large number of operations required to manage SNP data.
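One of the harmonisation issues listed above, alleles reported on different strands, can be illustrated with a short sketch. This is not SNPchiMp's implementation; the function names `flip_strand`, `is_strand_ambiguous` and `harmonise` are hypothetical, and the sketch assumes biallelic SNPs coded as A/C/G/T.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flip_strand(alleles):
    """Return the alleles as they would read on the opposite strand."""
    return tuple(COMPLEMENT[a] for a in alleles)

def is_strand_ambiguous(alleles):
    """A/T and C/G SNPs look identical on both strands."""
    a1, a2 = alleles
    return COMPLEMENT[a1] == a2

def harmonise(alleles, reference_alleles):
    """Express `alleles` on the strand used by `reference_alleles`.

    Returns None when the strand cannot be resolved from the alleles alone
    (ambiguous A/T or C/G SNPs) or when the two allele sets are incompatible.
    """
    if is_strand_ambiguous(alleles):
        return None
    if set(alleles) == set(reference_alleles):
        return tuple(alleles)
    flipped = flip_strand(alleles)
    if set(flipped) == set(reference_alleles):
        return flipped
    return None

print(harmonise(("A", "G"), ("T", "C")))  # chip reports the opposite strand
print(harmonise(("A", "T"), ("A", "T")))  # ambiguous SNP, needs extra information
```

Ambiguous SNPs are returned as None here because resolving them needs information beyond the allele codes, such as flanking sequence or the manufacturer's strand annotation.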
Computational screening of disease-associated mutations in OCA2 gene.
Kamaraj, Balu; Purohit, Rituraj
2014-01-01
Oculocutaneous albinism type 2 (OCA2), caused by mutations of the OCA2 gene, is an autosomal recessive disorder characterized by reduced biosynthesis of melanin pigment in the skin, hair, and eyes. The OCA2 gene encodes instructions for making a protein called the P protein. This protein plays a crucial role in melanosome biogenesis, and controls the eumelanin content in melanocytes in part via the processing and trafficking of tyrosinase, which is the rate-limiting enzyme in melanin synthesis. In this study we analyzed the pathogenic effect of 95 non-synonymous single nucleotide polymorphisms reported in the OCA2 gene using computational methods. We identified the R305W mutation as the most deleterious and disease-associated using the SIFT, PolyPhen, PANTHER, PhD-SNP, Pmut, and MutPred tools. To understand the atomic arrangement in 3D space, the native and mutant (R305W) structures were modeled. Molecular dynamics simulation was conducted to observe the structural significance of the computationally prioritized disease-associated mutation (R305W). Root-mean-square deviation, root-mean-square fluctuation, radius of gyration, solvent-accessible surface area, hydrogen bond (NH bond), trace of covariance matrix, eigenvector projection analysis, and density analysis results showed a prominent loss of stability and a rise in mutant flexibility values in 3D space. This study presents a well-designed computational methodology to examine albinism-associated SNPs.
ITALICS: an algorithm for normalization and DNA copy number calling for Affymetrix SNP arrays.
Rigaill, Guillem; Hupé, Philippe; Almeida, Anna; La Rosa, Philippe; Meyniel, Jean-Philippe; Decraene, Charles; Barillot, Emmanuel
2008-03-15
Affymetrix SNP arrays can be used to determine the DNA copy number measurement of 11 000-500 000 SNPs along the genome. Their high density facilitates the precise localization of genomic alterations and makes them a powerful tool for studies of cancers and copy number polymorphism. Like other microarray technologies, they are influenced by non-relevant sources of variation, requiring correction. Moreover, the amplitude of variation induced by non-relevant effects is similar to or greater than the biologically relevant effect (i.e. true copy number), making it difficult to estimate non-relevant effects accurately without including the biologically relevant effect. We addressed this problem by developing ITALICS, a normalization method that estimates both biological and non-relevant effects in an alternate, iterative manner, accurately eliminating irrelevant effects. We compared our normalization method with other existing and available methods, and found that ITALICS outperformed these methods for several in-house datasets and one public dataset. These results were validated biologically by quantitative PCR. The R package ITALICS (ITerative and Alternative normaLIzation and Copy number calling for affymetrix Snp arrays) has been submitted to Bioconductor.
USDA-ARS's Scientific Manuscript database
The purpose of this application, under Article 23.9.3 of the Code, is to conserve the widely used specific name Metochus abbreviatus Scott, 1874, for a species of rhyparochromid bugs from East Asia. The name is threatened by the senior subjective synonym Metochus erosus (Walker, 1872), which has bee...
SNP discovery and genotyping using Genotyping-by-Sequencing in Pekin ducks.
Zhu, Feng; Cui, Qian-Qian; Hou, Zhuo-Cheng
2016-11-15
Genomic selection and genome-wide association studies need thousands to millions of SNPs. However, many non-model species do not have reference chips for detecting variation. Our goal was to develop and validate an inexpensive but effective method for detecting SNP variation. Genotyping by sequencing (GBS) can be a highly efficient strategy for genome-wide SNP detection, as an alternative to microarray chips. Here, we developed a GBS protocol for ducks and tested it to genotype 49 Pekin ducks. A total of 169,209 SNPs were identified from all animals, with a mean of 55,920 SNPs per individual. The average SNP density reached 1156 SNPs/Mb. In this study, the first application of GBS to ducks, we demonstrate the power and simplicity of this method. GBS can be used in genetic studies to provide an effective method for genome-wide SNP discovery.
Wong, Wing Chung; Kim, Dewey; Carter, Hannah; Diekhans, Mark; Ryan, Michael C; Karchin, Rachel
2011-08-01
Thousands of cancer exomes are currently being sequenced, yielding millions of non-synonymous single nucleotide variants (SNVs) of possible relevance to disease etiology. Here, we provide a software toolkit to prioritize SNVs based on their predicted contribution to tumorigenesis. It includes a database of precomputed, predictive features covering all positions in the annotated human exome and can be used either stand-alone or as part of a larger variant discovery pipeline. MySQL database, source code and binaries freely available for academic/government use at http://wiki.chasmsoftware.org, Source in Python and C++. Requires 32 or 64-bit Linux system (tested on Fedora Core 8,10,11 and Ubuntu 10), 2.5*≤ Python <3.0*, MySQL server >5.0, 60 GB available hard disk space (50 MB for software and data files, 40 GB for MySQL database dump when uncompressed), 2 GB of RAM.
Zhao, Jianjun; Zhang, Hailing; Bai, Xue; Martella, Vito; Hu, Bo; Sun, Yangang; Zhu, Chunsheng; Zhang, Lei; Liu, Hao; Xu, Shujuan; Shao, Xiqun; Wu, Wei; Yan, Xijun
2014-04-01
A total of 16 strains of canine distemper virus (CDV) were detected from vaccinated minks, foxes, and raccoon dogs in four provinces in North-Eastern China between the end of 2011 and 2013. Upon sequence analysis of the haemagglutinin gene and comparison with wild-type CDV from different species in the same geographical areas, two non-synonymous single nucleotide polymorphisms were identified in 10 CDV strains, which led to amino acid changes at positions 542 (isoleucine to asparagine) and 549 (tyrosine to histidine) of the haemagglutinin protein coding sequence. The change at residue 542 generated a potentially novel N-glycosylation site. Masking of antigenic epitopes by sugar moieties might represent a mechanism for evasion of virus neutralising antibodies and reduced protection by vaccination. Copyright © 2014 Elsevier Ltd. All rights reserved.
2012-01-01
Background Cucurbita pepo is a member of the Cucurbitaceae family, the second most important horticultural family in terms of economic importance after Solanaceae. The "summer squash" types, including Zucchini and Scallop, rank among the highest-valued vegetables worldwide. There are few genomic tools available for this species. The first Cucurbita transcriptome, along with a large collection of Single Nucleotide Polymorphisms (SNP), was recently generated using massive sequencing. A set of 384 SNP was selected to generate an Illumina GoldenGate assay in order to construct the first SNP-based genetic map of Cucurbita and map quantitative trait loci (QTL). Results We herein present the construction of the first SNP-based genetic map of Cucurbita pepo using a population derived from the cross of two varieties with contrasting phenotypes, representing the main cultivar groups of the species' two subspecies: Zucchini (subsp. pepo) × Scallop (subsp. ovifera). The mapping population was genotyped with 384 SNP, a set of selected EST-SNP identified in silico after massive sequencing of the transcriptomes of both parents, using the Illumina GoldenGate platform. The global success rate of the assay was higher than 85%. In total, 304 SNP were mapped, along with 11 SSR from a previous map, giving a map density of 5.56 cM/marker. This map was used to infer syntenic relationships between C. pepo and cucumber and to successfully map QTL that control plant, flowering and fruit traits that are of benefit to squash breeding. The QTL effects were validated in backcross populations. Conclusion Our results show that massive sequencing in different genotypes is an excellent tool for SNP discovery, and that the Illumina GoldenGate platform can be successfully applied to constructing genetic maps and performing QTL analysis in Cucurbita. 
This is the first SNP-based genetic map in the Cucurbita genus and is an invaluable new tool for biological research, especially considering that most of these markers are located in the coding regions of genes involved in different physiological processes. The platform will also be useful for future mapping and diversity studies, and will be essential in order to accelerate the process of breeding new and better-adapted squash varieties. PMID:22356647
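The marker counts and map density reported above imply an approximate total map length. The following is a small worked calculation, assuming that the density was obtained by dividing total map length by the number of mapped markers; the abstract does not state the total length or how the density was computed, so the result is only an implied estimate.

```python
# Figures taken from the abstract.
snp_markers = 304            # mapped SNP markers
ssr_markers = 11             # SSR markers carried over from a previous map
density_cm_per_marker = 5.56

# Assumption: density = total map length / number of mapped markers.
total_markers = snp_markers + ssr_markers
implied_length_cm = total_markers * density_cm_per_marker

print(f"Mapped markers: {total_markers}")
print(f"Implied map length: {implied_length_cm:.1f} cM")
```

If the density was instead computed per marker interval or per linkage group, the implied length would differ slightly.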
Castelli, Erick C; Mendes-Junior, Celso T; Sabbagh, Audrey; Porto, Iane O P; Garcia, André; Ramalho, Jaqueline; Lima, Thálitta H A; Massaro, Juliana D; Dias, Fabrício C; Collares, Cristhianna V A; Jamonneau, Vincent; Bucheton, Bruno; Camara, Mamadou; Donadi, Eduardo A
2015-12-01
HLA-E is a non-classical Human Leucocyte Antigen class I gene with immunomodulatory properties. Whereas HLA-E expression usually occurs at low levels, it is widely distributed amongst human tissues, has the ability to bind self and non-self antigens and to interact with NK cells and T lymphocytes, and is important for immunosurveillance and for fighting infections. HLA-E is usually the most conserved locus among all class I genes. However, most of the previous studies evaluating HLA-E variability sequenced only a few exons or genotyped known polymorphisms. Here we report a strategy to evaluate HLA-E variability by next-generation sequencing (NGS) that might be applied to other HLA loci, and present the HLA-E haplotype diversity considering the segment encoding the entire HLA-E mRNA (including the 5'UTR, introns and the 3'UTR) in two African population samples, Susu from Guinea-Conakry and Lobi from Burkina Faso. Our results indicate that (a) the HLA-E gene is indeed conserved, encoding mainly two different protein molecules; (b) Africans do present several unknown HLA-E alleles presenting synonymous mutations; (c) the HLA-E 3'UTR is quite polymorphic and (d) haplotypes in the HLA-E 3'UTR are in close association with HLA-E coding alleles. NGS has proved to be an important tool for data generation in future studies evaluating variability in non-classical MHC genes. Copyright © 2015 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
Zhu, Yuanqi; Hein, David W.
2007-01-01
Genetic variants of human N-acetyltransferase 1 (NAT1) are associated with cancer and birth defects. N- and O-acetyltransferase catalytic activities, Michaelis-Menten kinetic constants (Km & Vmax), and steady state expression levels of NAT1-specific mRNA and protein were determined for the reference NAT1*4 and variant human NAT1 haplotypes possessing single nucleotide polymorphisms (SNPs) in the open reading frame. Although none of the SNPs caused a significant effect on steady state levels of NAT1-specific mRNA, C97T(R33stop), C190T(R64W), C559T (R187stop) and A752T(D251V) each reduced NAT1 protein level and/or N- and O-acetyltransferase catalytic activities to levels below detection. G560A(R187Q) substantially reduced NAT1 protein level and catalytic activities and increased substrate Km. The G445A(V149I), G459A(synonymous) and T640G(S214A) haplotype present in NAT1*11 significantly (p<0.05) increased NAT1 protein level and catalytic activity. None of T21G(synonymous), T402C(synonymous), A613G(M205V), T777C(synonymous), G781A(E261K), or A787G(I263V) significantly affected Km, catalytic activity, mRNA or protein level. These results suggest heterogeneity among slow NAT1 acetylator phenotypes. PMID:17909564
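The variant labels above pair a nucleotide position in the open reading frame with the affected amino acid residue, and the two numbers are linked by simple codon arithmetic. The sketch below checks that link, assuming nucleotide numbering starts at 1 on the first base of the start codon; the helper names and the regular expression are illustrative, not part of the study.

```python
import re

def codon_number(orf_position):
    """1-based codon (residue) index for a 1-based ORF nucleotide position."""
    return (orf_position - 1) // 3 + 1

def position_in_codon(orf_position):
    """Which base of the codon (1, 2 or 3) the nucleotide occupies."""
    return (orf_position - 1) % 3 + 1

# Non-synonymous and nonsense variants as labeled in the abstract.
variants = [
    "C97T(R33stop)", "C190T(R64W)", "C559T(R187stop)", "A752T(D251V)",
    "G560A(R187Q)", "G445A(V149I)", "T640G(S214A)", "A613G(M205V)",
    "G781A(E261K)", "A787G(I263V)",
]
label = re.compile(r"[ACGT](\d+)[ACGT]\([A-Z](\d+)(?:[A-Z]|stop)\)")

for variant in variants:
    match = label.fullmatch(variant)
    nt_pos, residue = int(match.group(1)), int(match.group(2))
    consistent = codon_number(nt_pos) == residue
    print(f"{variant}: codon {codon_number(nt_pos)}, "
          f"base {position_in_codon(nt_pos)} of codon, consistent={consistent}")
```

For example, C559T and G560A fall on the first and second bases of the same codon (187), which is why one yields a stop codon and the other a glutamine substitution at the same residue.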
A genomic scale map of genetic diversity in Trypanosoma cruzi
2012-01-01
Background Trypanosoma cruzi, the causal agent of Chagas Disease, affects more than 16 million people in Latin America. The clinical outcome of the disease results from a complex interplay between environmental factors and the genetic background of both the human host and the parasite. However, knowledge of the genetic diversity of the parasite is currently limited to a number of highly studied loci. The availability of a number of genomes from different evolutionary lineages of T. cruzi provides an unprecedented opportunity to look at the genetic diversity of the parasite at a genomic scale. Results Using a bioinformatic strategy, we have clustered T. cruzi sequence data available in the public domain and obtained multiple sequence alignments in which one or two alleles from the reference CL-Brener were included. These data cover 4 major evolutionary lineages (DTUs): TcI, TcII, TcIII, and the hybrid TcVI. Using this set of alignments we have identified 288,957 high quality single nucleotide polymorphisms and 1,480 indels. In a reduced re-sequencing study we were able to validate ~ 97% of high-quality SNPs identified in 47 loci. Analysis of how these changes affect encoded protein products showed a 0.77 ratio of synonymous to non-synonymous changes in the T. cruzi genome. We observed 113 changes that introduce or remove a stop codon, some causing significant functional changes, and a number of tri-allelic and tetra-allelic SNPs that could be exploited in strain typing assays. Based on an analysis of the observed nucleotide diversity we show that the T. cruzi genome contains a core set of genes that are under apparent purifying selection. Interestingly, orthologs of known druggable targets show statistically significant lower nucleotide diversity values. Conclusions This study provides the first look at the genetic diversity of T. cruzi at a genomic scale. 
The analysis covers an estimated ~ 60% of the genetic diversity present in the population, providing an essential resource for future studies on the development of new drugs and diagnostics for Chagas Disease. These data are available through the TcSNP database (http://snps.tcruzi.org). PMID:23270511
Plasmodium vivax rhomboid-like protease 1 gene diversity in Thailand.
Mataradchakul, Touchchapol; Uthaipibull, Chairat; Nosten, Francois; Vega-Rodriguez, Joel; Jacobs-Lorena, Marcelo; Lek-Uthai, Usa
2017-10-01
Plasmodium vivax infection remains a major public health problem, especially along the Thailand border regions. We examined the genetic diversity of this parasite by analyzing single-nucleotide polymorphisms (SNPs) of the P. vivax rhomboid-like protease 1 gene (Pvrom1) in parasites collected from western (Tak province, Thai-Myanmar border) and eastern (Chanthaburi province, Thai-Cambodia border) regions. Data were collected by a cross-sectional survey, consisting of 47 and 45 P. vivax-infected filter paper-spotted blood samples from the western and eastern regions of Thailand, respectively, during September 2013 to May 2014. Extracted DNA was examined for presence of P. vivax using Plasmodium species-specific nested PCR. The Pvrom1 gene was PCR amplified and sequenced, and the SNP diversity was analyzed using F-STAT, DnaSP, MEGA and LIAN programs. Comparison of sequences of the 92 Pvrom1 831-base open reading frames with that of a reference sequence (GenBank acc. no. XM001615211) revealed 17 samples with a total of 8 polymorphic sites, consisting of singleton (exon 3, nt 645) and parsimony informative (exon 1, nt 22 and 39; exon 3, nt 336, 537 and 656; and exon 4, nt 719 and 748) sites, which resulted in six different deduced Pvrom1 variants. The non-synonymous to synonymous substitution ratio estimated by the DnaSP program was 1.65, indicating positive selection, but the Z-tests of selection showed no significant deviations from neutrality for Pvrom1 samples from the western region of Thailand. In addition, the McDonald-Kreitman (MK) test was not significant, and Fst values did not differ between the two regions or for the regions combined. Interestingly, Pvrom1 exon 2 was the most conserved sequence among the four exons. The relatively high degree of Pvrom1 polymorphism suggests that the protein is important for parasite survival in the face of changes in both insect vector and human populations. 
These polymorphisms could serve as a sensitive marker for studying plasmodial genetic diversity. The significance of Pvrom1 conserved exon 2 sequence remains to be investigated. Copyright © 2017 Mahidol University. Published by Elsevier Inc. All rights reserved.
Walthour, C. S.; Schaeffer, S. W.
1994-01-01
The transformer locus (tra) produces an RNA processing protein that alternatively splices the doublesex pre-mRNA in the sex determination hierarchy of Drosophila melanogaster. Comparisons of the tra coding region among Drosophila species have revealed an unusually high degree of divergence in synonymous and nonsynonymous sites. In this study, we tested the hypothesis that the tra gene will be polymorphic in synonymous and nonsynonymous sites within species by investigating nucleotide sequence variation in eleven tra alleles within D. melanogaster. Of the 1063 nucleotides examined, two synonymous sites were polymorphic and no amino acid variation was detected. Three statistical tests were used to detect departures from an equilibrium neutral model. Two tests failed to reject a neutral model of molecular evolution because of low statistical power associated with low levels of genetic variation (Tajima/Fu and Li). The Hudson, Kreitman, and Aguade test rejected a neutral model when the tra region was compared to the 5'-flanking region of alcohol dehydrogenase (Adh). The lack of variability in the tra gene is consistent with a recent selective sweep of a beneficial allele in or near the tra locus. PMID:8013913
R classes and methods for SNP array data.
Scharpf, Robert B; Ruczinski, Ingo
2010-01-01
The Bioconductor project is an "open source and open development software project for the analysis and comprehension of genomic data" (1), primarily based on the R programming language. Infrastructure packages, such as Biobase, are maintained by Bioconductor core developers and serve several key roles to the broader community of Bioconductor software developers and users. In particular, Biobase introduces an S4 class, the eSet, for high-dimensional assay data. Encapsulating the assay data as well as meta-data on the samples, features, and experiment in the eSet class definition ensures propagation of the relevant sample and feature meta-data throughout an analysis. Extending the eSet class promotes code reuse through inheritance as well as interoperability with other R packages and is less error-prone. Recently proposed class definitions for high-throughput SNP arrays extend the eSet class. This chapter highlights the advantages of adopting and extending Biobase class definitions through a working example of one implementation of classes for the analysis of high-throughput SNP arrays.
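The eSet idea described above (assay matrices stored together with sample and feature metadata so that subsetting keeps everything aligned, plus extension by inheritance) can be illustrated outside R. The following is a loose Python analogy under stated assumptions, not the Biobase API: the class names `AssaySet` and `SnpCallSet`, the `subset` method, and the assay element names `calls` and `confidence` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssaySet:
    """Container for one or more assay matrices (rows = features, columns = samples)."""
    assays: dict          # name -> list of rows, all with identical dimensions
    feature_data: list    # one metadata dict per feature (row)
    sample_data: list     # one metadata dict per sample (column)

    def __post_init__(self):
        # Validate once at construction so later analysis steps can trust alignment.
        for name, matrix in self.assays.items():
            if len(matrix) != len(self.feature_data):
                raise ValueError(f"{name}: row count does not match feature_data")
            if any(len(row) != len(self.sample_data) for row in matrix):
                raise ValueError(f"{name}: column count does not match sample_data")

    def subset(self, features=None, samples=None):
        """Subset assays and metadata together so they cannot drift apart."""
        f = list(range(len(self.feature_data)) if features is None else features)
        s = list(range(len(self.sample_data)) if samples is None else samples)
        return type(self)(
            assays={n: [[m[i][j] for j in s] for i in f] for n, m in self.assays.items()},
            feature_data=[self.feature_data[i] for i in f],
            sample_data=[self.sample_data[j] for j in s],
        )


@dataclass
class SnpCallSet(AssaySet):
    """Extension for SNP arrays: reuses all base behaviour, adds one requirement."""
    REQUIRED = ("calls", "confidence")  # hypothetical element names

    def __post_init__(self):
        super().__post_init__()
        missing = [name for name in self.REQUIRED if name not in self.assays]
        if missing:
            raise ValueError(f"missing assay elements: {missing}")


snps = SnpCallSet(
    assays={"calls": [[0, 1, 2], [2, 2, 1]], "confidence": [[0.9, 0.8, 0.99], [0.7, 0.95, 0.6]]},
    feature_data=[{"snp": "snp_a"}, {"snp": "snp_b"}],
    sample_data=[{"id": "s1"}, {"id": "s2"}, {"id": "s3"}],
)
small = snps.subset(features=[1], samples=[0, 2])
print(small.assays["calls"], small.feature_data, small.sample_data)
```

Because `subset` rebuilds the object through `type(self)`, the subclass inherits aligned subsetting without extra code, which is the kind of reuse the chapter attributes to extending eSet.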
Pharmacogenomic prediction of anthracycline-induced cardiotoxicity in children.
Visscher, Henk; Ross, Colin J D; Rassekh, S Rod; Barhdadi, Amina; Dubé, Marie-Pierre; Al-Saloos, Hesham; Sandor, George S; Caron, Huib N; van Dalen, Elvira C; Kremer, Leontien C; van der Pal, Helena J; Brown, Andrew M K; Rogers, Paul C; Phillips, Michael S; Rieder, Michael J; Carleton, Bruce C; Hayden, Michael R
2012-05-01
Anthracycline-induced cardiotoxicity (ACT) is a serious adverse drug reaction limiting anthracycline use and causing substantial morbidity and mortality. Our aim was to identify genetic variants associated with ACT in patients treated for childhood cancer. We carried out a study of 2,977 single-nucleotide polymorphisms (SNPs) in 220 key drug biotransformation genes in a discovery cohort of 156 anthracycline-treated children from British Columbia, with replication in a second cohort of 188 children from across Canada and further replication of the top SNP in a third cohort of 96 patients from Amsterdam, the Netherlands. We identified a highly significant association of a synonymous coding variant rs7853758 (L461L) within the SLC28A3 gene with ACT (odds ratio, 0.35; P = 1.8 × 10(-5) for all cohorts combined). Additional associations (P < .01) with risk and protective variants in other genes including SLC28A1 and several adenosine triphosphate-binding cassette transporters (ABCB1, ABCB4, and ABCC1) were present. We further explored combining multiple variants into a single-prediction model together with clinical risk factors and classification of patients into three risk groups. In the high-risk group, 75% of patients were accurately predicted to develop ACT, with 36% developing this within the first year alone, whereas in the low-risk group, 96% of patients were accurately predicted not to develop ACT. We have identified multiple genetic variants in SLC28A3 and other genes associated with ACT. Combined with clinical risk factors, genetic risk profiling might be used to identify high-risk patients who can then be provided with safer treatment options.
Somatic and Germline TP53 Alterations in Second Malignant Neoplasms from Pediatric Cancer Survivors.
Sherborne, Amy L; Lavergne, Vincent; Yu, Katharine; Lee, Leah; Davidson, Philip R; Mazor, Tali; Smirnoff, Ivan V; Horvai, Andrew E; Loh, Mignon; DuBois, Steven G; Goldsby, Robert E; Neglia, Joseph P; Hammond, Sue; Robison, Leslie L; Wustrack, Rosanna; Costello, Joseph F; Nakamura, Alice O; Shannon, Kevin M; Bhatia, Smita; Nakamura, Jean L
2017-04-01
Purpose: Second malignant neoplasms (SMNs) are severe late complications that occur in pediatric cancer survivors exposed to radiotherapy and other genotoxic treatments. To characterize the mutational landscape of treatment-induced sarcomas and to identify candidate SMN-predisposing variants, we analyzed germline and SMN samples from pediatric cancer survivors. Experimental Design: We performed whole-exome sequencing (WES) and RNA sequencing on radiation-induced sarcomas arising from two pediatric cancer survivors. To assess the frequency of germline TP53 variants in SMNs, Sanger sequencing was performed to analyze germline TP53 in 37 pediatric cancer survivors from the Childhood Cancer Survivor Study (CCSS) without any history of a familial cancer predisposition syndrome but known to have developed SMNs. Results: WES revealed TP53 mutations involving p53's DNA-binding domain in both index cases, one of which was also present in the germline. The germline and somatic TP53-mutant variants were enriched in the transcriptomes for both sarcomas. Analysis of TP53-coding exons in germline specimens from the CCSS survivor cohort identified a G215C variant encoding an R72P amino acid substitution in 6 patients and a synonymous SNP A639G in 4 others, resulting in 10 of 37 evaluable patients (27%) harboring a germline TP53 variant. Conclusions: Currently, germline TP53 is not routinely assessed in patients with pediatric cancer. These data support the concept that identifying germline TP53 variants at the time a primary cancer is diagnosed may identify patients at high risk for SMN development, who could benefit from modified therapeutic strategies and/or intensive posttreatment monitoring. Clin Cancer Res; 23(7); 1852-61. ©2016 AACR. ©2016 American Association for Cancer Research.
Transcriptome Analysis of Sarracenia, an Insectivorous Plant
Srivastava, Anuj; Rogers, Willie L.; Breton, Catherine M.; Cai, Liming; Malmberg, Russell L.
2011-01-01
Sarracenia species (pitcher plants) are carnivorous plants which obtain a portion of their nutrients from insects captured in the pitchers. To investigate these plants, we sequenced the transcriptome of two species, Sarracenia psittacina and Sarracenia purpurea, using Roche 454 pyrosequencing technology. We obtained 46 275 and 36 681 contigs by de novo assembly methods for S. psittacina and S. purpurea, respectively, and further identified 16 163 orthologous contigs between them. Estimation of synonymous substitution rates between orthologous and paralogous contigs indicates the events of genome duplication and speciation within the Sarracenia genus both occurred ∼2 million years ago. The ratios of synonymous and non-synonymous substitution rates indicated that 491 contigs have been under positive selection (Ka/Ks > 1). Significant proportions of these contigs were involved in functions related to binding activity. We also found that the greatest sequence similarity for both of these species was to Vitis vinifera, which is most consistent with a non-current classification of the order Ericales as an asterid. This study has provided new insights into pitcher plants and will contribute greatly to future research on this genus and its distinctive ecological adaptations. PMID:21676972
Transcriptome analysis of sarracenia, an insectivorous plant.
Srivastava, Anuj; Rogers, Willie L; Breton, Catherine M; Cai, Liming; Malmberg, Russell L
2011-08-01
Sarracenia species (pitcher plants) are carnivorous plants which obtain a portion of their nutrients from insects captured in the pitchers. To investigate these plants, we sequenced the transcriptome of two species, Sarracenia psittacina and Sarracenia purpurea, using Roche 454 pyrosequencing technology. We obtained 46 275 and 36 681 contigs by de novo assembly methods for S. psittacina and S. purpurea, respectively, and further identified 16 163 orthologous contigs between them. Estimation of synonymous substitution rates between orthologous and paralogous contigs indicates the events of genome duplication and speciation within the Sarracenia genus both occurred ∼2 million years ago. The ratios of synonymous and non-synonymous substitution rates indicated that 491 contigs have been under positive selection (K(a)/K(s) > 1). Significant proportions of these contigs were involved in functions related to binding activity. We also found that the greatest sequence similarity for both of these species was to Vitis vinifera, which is most consistent with a non-current classification of the order Ericales as an asterid. This study has provided new insights into pitcher plants and will contribute greatly to future research on this genus and its distinctive ecological adaptations.
Association between long non-coding RNA polymorphisms and cancer risk: a meta-analysis.
Huang, Xin; Zhang, Weiyue; Shao, Zengwu
2018-05-25
Several studies have suggested that long non-coding RNA (lncRNA) gene polymorphisms are associated with cancer risk. In the present study, we conducted a meta-analysis of studies on the association between lncRNA single-nucleotide polymorphisms (SNPs) and the overall risk of cancer. A total of 12 SNPs in five common lncRNA genes were finally included in the meta-analysis. In the lncRNA antisense noncoding RNA in the INK4 locus (ANRIL), the rs1333048 A/C, rs4977574 A/G, and rs10757278 A/G polymorphisms, but not rs1333045 C/T, were correlated with overall cancer risk. Our study also demonstrated that other SNPs were correlated with overall cancer risk, namely, metastasis-associated lung adenocarcinoma transcript 1 (MALAT1, rs619586 A/G), HOXA distal transcript antisense RNA (HOTTIP, rs1859168 A/C) and highly up-regulated in liver cancer (HULC, rs7763881 A/C). Moreover, four prostate cancer-associated non-coding RNA 1 (PRNCR1, rs16901946 G/A, rs13252298 G/A, rs1016343 T/C, and rs1456315 G/A) SNPs were associated with cancer risk. No association was found between the PRNCR1 (rs7007694 C/T) SNP and the risk of cancer. In conclusion, our results suggest that several studied lncRNA SNPs are associated with overall cancer risk. Therefore, they might be potential predictive biomarkers for the risk of cancer. More studies based on larger sample sizes and more lncRNA SNPs are warranted to confirm these findings. ©2018 The Author(s).
The evolution of small insertions and deletions in the coding genes of Drosophila melanogaster.
Chong, Zechen; Zhai, Weiwei; Li, Chunyan; Gao, Min; Gong, Qiang; Ruan, Jue; Li, Juan; Jiang, Lan; Lv, Xuemei; Hungate, Eric; Wu, Chung-I
2013-12-01
Studies of protein evolution have focused on amino acid substitutions with much less systematic analysis of insertions and deletions (indels) in protein coding genes. We hence surveyed 7,500 genes between Drosophila melanogaster and D. simulans, using D. yakuba as an outgroup for this purpose. The evolutionary rate of coding indels is indeed low, at only 3% of that of nonsynonymous substitutions. As coding indels follow a geometric distribution in size and tend to fall in low-complexity regions of proteins, it is unclear whether selection or mutation underlies this low rate. To resolve the issue, we collected genomic sequences from an isogenic African line of D. melanogaster (ZS30) at a high coverage of 70× and analyzed indel polymorphism between ZS30 and the reference genome. In comparing polymorphism and divergence, we found that the divergence to polymorphism ratio (i.e., fixation index) for smaller indels (size ≤ 10 bp) is very similar to that for synonymous changes, suggesting that most of the within-species polymorphism and between-species divergence for indels are selectively neutral. Interestingly, deletions of larger sizes (size ≥ 11 bp and ≤ 30 bp) have a much higher fixation index than synonymous mutations and 44.4% of fixed middle-sized deletions are estimated to be adaptive. To our surprise, this pattern is not found for insertions. Protein indel evolution appears to be in a dynamic flux of neutrally driven expansion (insertions) together with adaptive-driven contraction (deletions), and these observations provide important insights for understanding the fitness of new mutations as well as the evolutionary driving forces for genomic evolution in Drosophila species.
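The comparison of fixation indices described above can be made concrete with a short calculation. The sketch below assumes a McDonald-Kreitman-style estimator in which synonymous changes serve as the neutral reference; the abstract does not state which estimator the authors used, so this is an illustration only. All counts are hypothetical and were chosen so that the test class has a fixation index 1.8 times the neutral one.

```python
def fixation_index(divergence, polymorphism):
    """Ratio of between-species fixed differences to within-species polymorphisms."""
    return divergence / polymorphism


def adaptive_fraction(d_test, p_test, d_neutral, p_neutral):
    """Estimated fraction of fixed test-class changes in excess of the neutral
    expectation: 1 - FI_neutral / FI_test (assumed estimator, see lead-in)."""
    return 1.0 - fixation_index(d_neutral, p_neutral) / fixation_index(d_test, p_test)


# Hypothetical counts: deletions (test class) versus synonymous changes (neutral class).
alpha = adaptive_fraction(d_test=18, p_test=10, d_neutral=100, p_neutral=100)
print(f"{alpha:.3f}")
```

Under this estimator a class whose fixation index equals the neutral one gives a fraction of zero, which corresponds to the abstract's reading of the small indels as selectively neutral.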
Chono, Makiko; Matsunaka, Hitoshi; Seki, Masako; Fujita, Masaya; Kiribuchi-Otobe, Chikako; Oda, Shunsuke; Kojima, Hisayo; Nakamura, Shingo
2015-03-01
In the wheat (Triticum aestivum L.) cultivar 'Zenkoujikomugi', a single nucleotide polymorphism (SNP) in the promoter of MOTHER OF FT AND TFL1 on chromosome 3A (MFT-3A) causes an increase in the level of gene expression, resulting in strong grain dormancy. We used a DNA marker to detect the 'Zenkoujikomugi'-type (Zen-type) SNP and examined the genotype of MFT-3A in Japanese wheat varieties, and we found that 169 of 324 varieties carry the Zen-type SNP. In Japanese commercial varieties, the frequency of the Zen-type SNP was remarkably high in the southern part of Japan, but low in the northern part. To examine the relationship between MFT-3A genotype and grain dormancy, we performed a germination assay in three wheat-growing seasons. On average, the varieties carrying the Zen-type SNP showed stronger grain dormancy than the varieties carrying the non-Zen-type SNP. Among commercial cultivars, 'Iwainodaichi' (Kyushu), 'Junreikomugi' (Kinki-Chugoku-Shikoku), 'Kinuhime' (Kanto-Tokai), 'Nebarigoshi' (Tohoku-Hokuriku), and 'Kitamoe' (Hokkaido) showed the strongest grain dormancy in each geographical group, and all these varieties, except for 'Kitamoe', were found to carry the Zen-type SNP. In recent years, the number of varieties carrying the Zen-type SNP has increased in the Tohoku-Hokuriku region, but not in the Hokkaido region.
Shih, P Betty; Manzi, Susan; Shaw, Penny; Kenney, Margaret; Kao, Amy H; Bontempo, Franklin; Barmada, M Michael; Kammerer, Candace; Kamboh, M Ilyas
2008-11-01
The gene coding for C-reactive protein (CRP) is located on chromosome 1q23.2, which falls within a linkage region thought to harbor a systemic lupus erythematosus (SLE) susceptibility gene. Recently, 2 single-nucleotide polymorphisms (SNP) in the CRP gene (+838, +2043) have been shown to be associated with CRP concentrations and/or SLE risk in a British family-based cohort. Our study was done to confirm the reported association in an independent population-based case-control cohort, and also to investigate the influence of 3 additional CRP tagSNP (-861, -390, +90) on SLE risk and serum CRP concentrations. DNA from 337 Caucasian women who met the American College of Rheumatology criteria for definite (n = 324) or probable (n = 13) SLE and 448 Caucasian healthy female controls was genotyped for 5 CRP tagSNP (-861, -390, +90, +838, +2043). Genotyping was performed using restriction fragment length polymorphism-polymerase chain reaction, pyrosequencing, or TaqMan assays. Serum CRP levels were measured using ELISA. Association studies were performed using the chi-squared distribution, Z-test, Fisher's exact test, and analysis of variance. Haplotype analysis was performed using EH software and the haplo.stats package in R 2.1.2. While none of the SNP were found to be associated with SLE risk individually, there was an association with the 5 SNP haplotypes (p < 0.001). Three SNP (-861, -390, +90) were found to significantly influence serum CRP level in SLE cases, both independently and as haplotypes. Our data suggest that unique haplotype combinations in the CRP gene may modify the risk of developing SLE and influence circulating CRP levels.
Cappola, Thomas P; Matkovich, Scot J; Wang, Wei; van Booven, Derek; Li, Mingyao; Wang, Xuexia; Qu, Liming; Sweitzer, Nancy K; Fang, James C; Reilly, Muredach P; Hakonarson, Hakon; Nerbonne, Jeanne M; Dorn, Gerald W
2011-02-08
Common heart failure has a strong undefined heritable component. Two recent independent cardiovascular SNP array studies identified a common SNP at 1p36 in intron 2 of the HSPB7 gene as being associated with heart failure. HSPB7 resequencing identified other risk alleles but no functional gene variants. Here, we further show no effect of the HSPB7 SNP on cardiac HSPB7 mRNA levels or splicing, suggesting that the SNP marks the position of a functional variant in another gene. Accordingly, we used massively parallel platforms to resequence all coding exons of the adjacent CLCNKA gene, which encodes the K(a) renal chloride channel (ClC-K(a)). Of 51 exonic CLCNKA variants identified, one SNP (rs10927887, encoding Arg83Gly) was common, in linkage disequilibrium with the heart failure risk SNP in HSPB7, and associated with heart failure in two independent Caucasian referral populations (n = 2,606 and 1,168; combined P = 2.25 × 10(-6)). Individual genotyping of rs10927887 in the two study populations and a third independent heart failure cohort (combined n = 5,489) revealed an additive allele effect on heart failure risk that is independent of age, sex, and prior hypertension (odds ratio = 1.27 per allele copy; P = 8.3 × 10(-7)). Functional characterization of recombinant wild-type Arg83 and variant Gly83 ClC-K(a) chloride channel currents revealed ≈ 50% loss-of-function of the variant channel. These findings identify a common, functionally significant genetic risk factor for Caucasian heart failure. The variant CLCNKA risk allele, telegraphed by linked variants in the adjacent HSPB7 gene, uncovers a previously overlooked genetic mechanism affecting the cardio-renal axis.
SNP discovery in candidate adaptive genes using exon capture in a free-ranging alpine ungulate
Roffler, Gretchen H.; Amish, Stephen J.; Smith, Seth; Cosart, Ted F.; Kardos, Marty; Schwartz, Michael K.; Luikart, Gordon
2016-01-01
Identification of genes underlying genomic signatures of natural selection is key to understanding adaptation to local conditions. We used targeted resequencing to identify SNP markers in 5321 candidate adaptive genes associated with known immunological, metabolic and growth functions in ovids and other ungulates. We selectively targeted 8161 exons in protein-coding and nearby 5′ and 3′ untranslated regions of chosen candidate genes. Targeted sequences were taken from bighorn sheep (Ovis canadensis) exon capture data and directly from the domestic sheep genome (Ovis aries v. 3; oviAri3). The bighorn sheep sequences used in the Dall's sheep (Ovis dalli dalli) exon capture aligned to 2350 genes on the oviAri3 genome with an average of 2 exons each. We developed a microfluidic qPCR-based SNP chip to genotype 476 Dall's sheep from locations across their range and test for patterns of selection. Using multiple corroborating approaches (lositan and bayescan), we detected 28 SNP loci potentially under selection. We additionally identified candidate loci significantly associated with latitude, longitude, precipitation and temperature, suggesting local environmental adaptation. The three methods demonstrated consistent support for natural selection on nine genes with immune and disease-regulating functions (e.g. Ovar-DRA, APC, BATF2, MAGEB18), cell regulation signalling pathways (e.g. KRIT1, PI3K, ORRC3), and respiratory health (CYSLTR1). Characterizing adaptive allele distributions from novel genetic techniques will facilitate investigation of the influence of environmental variation on local adaptation of a northern alpine ungulate throughout its range. This research demonstrated the utility of exon capture for gene-targeted SNP discovery and subsequent SNP chip genotyping using low-quality samples in a nonmodel species.
Sherva, Richard; Rice, John P; Neuman, Rosalind J; Rochberg, Nanette; Saccone, Nancy L; Bierut, Laura J
2009-05-01
Alcohol dependence is a major cause of morbidity and mortality worldwide and has a strong familial component. Several linkage and association studies have identified chromosomal regions and/or genes that affect alcohol consumption, notably in genes involved in the 2-stage pathway of alcohol metabolism. Here, we use multiple regression models to test for associations and interactions between 2 alcohol-related phenotypes and SNPs in 17 genes involved in alcohol metabolism in a sample of 1,588 European American subjects. The strongest evidence for association after correcting for multiple testing was between rs1229984, a nonsynonymous coding SNP in ADH1B, and DSM-IV symptom count (p = 0.0003). This SNP was also associated with maximum number of drinks in 24 hours (p = 0.0004). Each minor allele at this SNP predicts 45% fewer DSM-IV symptoms and 18% fewer max drinks. Another SNP in a splice site in ALDH1A1 (rs8187974) showed evidence for association with both phenotypes as well (p = 0.02 and 0.004, respectively), but neither association was significant after accounting for multiple testing. Minor alleles at this SNP predict greater alcohol consumption. In addition, pairwise interactions were observed between SNPs in several genes (p = 0.00002). We replicated the large effect of rs1229984 on alcohol behavior, and although not common (MAF = 4%), this polymorphism may be highly relevant from a public health perspective in European Americans. Another SNP, rs8187974, may also affect alcohol behavior but requires replication. Also, interactions between polymorphisms in genes involved in alcohol metabolism are likely determinants of the parameters that ultimately affect alcohol consumption.
Genomic Comparisons Reveal Microevolutionary Differences in Mycobacterium abscessus Subspecies
Tan, Joon L.; Ng, Kee P.; Ong, Chia S.; Ngeow, Yun F.
2017-01-01
Mycobacterium abscessus, a rapid-growing non-tuberculous mycobacterium, has been the cause of sporadic and outbreak infections world-wide. The subspecies in M. abscessus complex (M. abscessus, M. massiliense, and M. bolletii) are associated with different biologic and pathogenic characteristics and are known to be among the most frequently isolated opportunistic pathogens from clinical material. To date, the evolutionary forces that could have contributed to these biological and clinical differences are still unclear. We compared genome data from 243 M. abscessus strains downloaded from the NCBI ftp Refseq database to understand how the microevolutionary processes of homologous recombination and positive selection influenced the diversification of the M. abscessus complex at the subspecies level. The three subspecies are clearly separated in the Minimum Spanning Tree. Their MUMi-based genomic distances support the separation of M. massiliense and M. bolletii into two subspecies. Maximum Likelihood analysis through dN/dS (the ratio of number of non-synonymous substitutions per non-synonymous site, to the number of synonymous substitutions per synonymous site) identified distinct genes in each subspecies that could have been affected by positive selection during evolution. The results of genome-wide alignment based on concatenated locally-collinear blocks suggest that (a) recombination has affected the M. abscessus complex more than mutation and positive selection; (b) recombination occurred more frequently in M. massiliense than in the other two subspecies; and (c) the recombined segments in the three subspecies have come from different intra-species and inter-species origins. 
The results lead to the identification of possible gene sets that could have been responsible for the subspecies-specific features and suggest independent evolution among the three subspecies, with recombination playing a more significant role than positive selection in the diversification among members in this complex. PMID:29109707
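The dN/dS ratio defined in the abstract above can be illustrated with a simple counting calculation. This is a sketch under stated assumptions, not the authors' Maximum Likelihood analysis: it takes substitution and site counts as given, applies the Jukes-Cantor correction for multiple hits, and uses hypothetical numbers and function names.

```python
import math


def jukes_cantor(p):
    """Correct an observed proportion of differing sites for multiple substitutions."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """dN/dS: non-synonymous substitutions per non-synonymous site divided by
    synonymous substitutions per synonymous site."""
    dn = jukes_cantor(nonsyn_subs / nonsyn_sites)
    ds = jukes_cantor(syn_subs / syn_sites)
    if ds == 0:
        raise ZeroDivisionError("no synonymous substitutions; ratio is undefined")
    return dn / ds


# Hypothetical gene: 20 non-synonymous changes over 700 sites, 10 synonymous over 300.
ratio = dn_ds(20, 700, 10, 300)
print(f"dN/dS = {ratio:.2f}")
```

Values above 1 are conventionally read as a signal of positive selection, values near 1 as neutrality, and values below 1 as purifying selection.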
Genomic Comparisons Reveal Microevolutionary Differences in Mycobacterium abscessus Subspecies.
Tan, Joon L; Ng, Kee P; Ong, Chia S; Ngeow, Yun F
2017-01-01
Mycobacterium abscessus, a rapid-growing non-tuberculous mycobacterium, has been the cause of sporadic and outbreak infections world-wide. The subspecies in M. abscessus complex (M. abscessus, M. massiliense, and M. bolletii) are associated with different biologic and pathogenic characteristics and are known to be among the most frequently isolated opportunistic pathogens from clinical material. To date, the evolutionary forces that could have contributed to these biological and clinical differences are still unclear. We compared genome data from 243 M. abscessus strains downloaded from the NCBI ftp Refseq database to understand how the microevolutionary processes of homologous recombination and positive selection influenced the diversification of the M. abscessus complex at the subspecies level. The three subspecies are clearly separated in the Minimum Spanning Tree. Their MUMi-based genomic distances support the separation of M. massiliense and M. bolletii into two subspecies. Maximum Likelihood analysis through dN/dS (the ratio of number of non-synonymous substitutions per non-synonymous site, to the number of synonymous substitutions per synonymous site) identified distinct genes in each subspecies that could have been affected by positive selection during evolution. The results of genome-wide alignment based on concatenated locally-collinear blocks suggest that (a) recombination has affected the M. abscessus complex more than mutation and positive selection; (b) recombination occurred more frequently in M. massiliense than in the other two subspecies; and (c) the recombined segments in the three subspecies have come from different intra-species and inter-species origins. 
The results lead to the identification of possible gene sets that could have been responsible for the subspecies-specific features and suggest independent evolution among the three subspecies, with recombination playing a more significant role than positive selection in the diversification among members in this complex.
Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C
2018-06-01
High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may both affect the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%), and there was no positional bias in mutation across the plasmid sequence. There was no discernable difference between the mutation frequencies of coding and non-coding DNA. 
The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.
Roman, Erika; Colombo, Giancarlo
2009-12-14
The present investigation continues previous behavioral profiling studies of selectively bred alcohol-drinking and alcohol non-drinking rats. In this study, alcohol-naïve adult Sardinian alcohol-preferring (sP) and non-preferring (sNP) rats were tested in the multivariate concentric square field (MCSF) test. The MCSF test has an ethoexperimental approach and measures general activity, exploration, risk assessment, risk taking, and shelter seeking in laboratory rodents. The multivariate design enables behavioral profiling in one and the same test situation. Age-matched male Wistar rats were included as a control group. Five weeks after the first MCSF trial, a repeated testing was done to explore differences in acquired experience. The results revealed distinct differences in exploratory strategies and behavioral profiles between sP and sNP rats. The sP rats were characterized by lower activity, lower exploratory drive, higher risk assessment, and lower risk taking behavior than in sNP rats. In the repeated trial, risk-taking behavior was almost abolished in sP rats. When comparing the performance of sP and sNP rats with that of Wistar rats, the principal component analysis revealed that the sP rats were the most divergent group. The vigilant behavior observed in sP rats with low exploratory drive and low risk-taking behavior is interpreted here as high innate anxiety-related behaviors and may be related to their propensity for high voluntary alcohol intake and preference. We suggest that the different lines of alcohol-preferring rats with different behavioral characteristics constitute valuable animal models that mimic the heterogeneity in human alcohol dependence.
Li, Ling; Li, Dan; Liu, Li; Li, Shijun; Feng, Yanping; Peng, Xiuli; Gong, Yanzhang
2015-01-01
Endothelin receptor B subtype 2 (EDNRB2) is a seven-transmembrane G-protein coupled receptor. In this study, we investigated the EDNRB2 gene as a candidate gene for the duck spot plumage pattern, based on studies of chicken and Japanese quail. The entire coding region was cloned by reverse transcription polymerase chain reaction (RT-PCR). Sequence analysis showed that duck EDNRB2 cDNA contained a 1311 bp open reading frame and encoded a putative protein of 436 amino acid residues. The transcript shared 89%-90% identity with its counterparts in other avian species. A phylogenetic tree based on amino acid sequences showed that duck EDNRB2 was evolutionarily conserved in the avian clade. The entire coding region of EDNRB2 was sequenced in 20 spot and 20 non-spot ducks, and 13 SNPs were identified. Two of them (c.940G>A and c.995G>A) were non-synonymous substitutions, and were genotyped in 647 ducks representing non-spot and spot phenotypes. The c.995G>A mutation, which results in the amino acid substitution Arg332His, was completely associated with the spot phenotype: all 152 spot ducks were carriers of the AA genotype and the other 495 individuals with the non-spot phenotype were carriers of the GA or GG genotype. Segregation in 17 GA×GG and 22 GA×GA testing combinations confirmed this association, since the segregation ratios and genotypes of the offspring were in agreement with the hypothesis. In order to investigate the underlying mechanism of the spot phenotype, the MITF gene was used as a cell type marker of melanocyte progenitor cells, while the TYR and TYRP1 genes were used as cell type markers of mature melanocytes. Transcripts of the MITF, TYR and TYRP1 genes with the expected size were identified in all pigmented skin tissues, while PCR products were not obtained from non-pigmented skin tissues. It was inferred that melanocytes are absent in non-pigmented skin tissues of spot ducks.
PBOOST: a GPU-based tool for parallel permutation tests in genome-wide association studies.
Yang, Guangyuan; Jiang, Wei; Yang, Qiang; Yu, Weichuan
2015-05-01
The importance of testing associations allowing for interactions has been demonstrated by Marchini et al. (2005). A fast method detecting associations allowing for interactions has been proposed by Wan et al. (2010a). The method is based on likelihood ratio test with the assumption that the statistic follows the χ² distribution. Many single nucleotide polymorphism (SNP) pairs with significant associations allowing for interactions have been detected using their method. However, the assumption of χ² test requires the expected values in each cell of the contingency table to be at least five. This assumption is violated in some identified SNP pairs. In this case, likelihood ratio test may not be applicable any more. Permutation test is an ideal approach to checking the P-values calculated in likelihood ratio test because of its non-parametric nature. The P-values of SNP pairs having significant associations with disease are always extremely small. Thus, we need a huge number of permutations to achieve correspondingly high resolution for the P-values. In order to investigate whether the P-values from likelihood ratio tests are reliable, a fast permutation tool to accomplish large number of permutations is desirable. We developed a permutation tool named PBOOST. It is based on GPU with highly reliable P-value estimation. By using simulation data, we found that the P-values from likelihood ratio tests will have relative error of >100% when 50% cells in the contingency table have expected count less than five or when there is zero expected count in any of the contingency table cells. In terms of speed, PBOOST completed 10⁷ permutations for a single SNP pair from the Wellcome Trust Case Control Consortium (WTCCC) genome data (Wellcome Trust Case Control Consortium, 2007) within 1 min on a single Nvidia Tesla M2090 device, while it took 60 min in a single CPU Intel Xeon E5-2650 to finish the same task.
More importantly, when simultaneously testing 256 SNP pairs for 10⁷ permutations, our tool took only 5 min, while the CPU program took 10 h. By permuting on a GPU cluster consisting of 40 nodes, we completed 10¹² permutations for all 280 SNP pairs reported with P-values smaller than 1.6 × 10⁻¹² in the WTCCC datasets in 1 week. The source code and sample data are available at http://bioinformatics.ust.hk/PBOOST.zip. gyang@ust.hk; eeyu@ust.hk Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
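The idea of checking a likelihood ratio P-value by permutation can be sketched on a CPU in a few lines. This is a minimal illustration only, not the PBOOST GPU implementation: the function names `g_statistic` and `permutation_p_value` are hypothetical, and each sample is assumed to carry one categorical genotype (for a SNP pair, the combined two-locus genotype could be passed as a tuple).

```python
import math
import random
from collections import Counter


def g_statistic(genotypes, labels):
    """Likelihood ratio (G) statistic of a genotype-by-status contingency table."""
    n = len(labels)
    cells = Counter(zip(genotypes, labels))   # observed cell counts
    rows = Counter(genotypes)                 # genotype margins
    cols = Counter(labels)                    # case/control margins
    g = 0.0
    for (genotype, label), observed in cells.items():
        expected = rows[genotype] * cols[label] / n
        g += 2.0 * observed * math.log(observed / expected)
    return g


def permutation_p_value(genotypes, labels, n_perm=10_000, seed=0):
    """Empirical P-value: share of label shufflings with a statistic at least as large."""
    rng = random.Random(seed)
    observed = g_statistic(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                 # break any genotype-status link
        if g_statistic(genotypes, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

Because the smallest P-value such an estimate can report is about 1/(n_perm + 1), resolving values near 10⁻¹² needs on the order of 10¹² permutations, which is the computational burden the abstract describes.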
A KCNJ6 gene polymorphism modulates theta oscillations during reward processing.
Kamarajan, Chella; Pandey, Ashwini K; Chorlian, David B; Manz, Niklas; Stimus, Arthur T; Edenberg, Howard J; Wetherill, Leah; Schuckit, Marc; Wang, Jen-Chyong; Kuperman, Samuel; Kramer, John; Tischfield, Jay A; Porjesz, Bernice
2017-05-01
Event related oscillations (EROs) are heritable measures of neurocognitive function that have served as useful phenotypes in genetic research. A recent family genome-wide association study (GWAS) by the Collaborative Study on the Genetics of Alcoholism (COGA) found that theta EROs during visual target detection were associated at genome-wide levels with several single nucleotide polymorphisms (SNPs), including a synonymous SNP, rs702859, in the KCNJ6 gene that encodes GIRK2, a G-protein inward rectifying potassium channel that regulates excitability of neuronal networks. The present study examined the effect of the KCNJ6 SNP (rs702859), previously associated with theta ERO to targets in a visual oddball task, on theta EROs during reward processing in a monetary gambling task. The participants were 1601 adolescent and young adult offspring within the age-range of 17-25 years (800 males and 801 females) from high-dense alcoholism families as well as control families of the COGA prospective study. Theta ERO power (3.5-7.5 Hz, 200-500 ms post-stimulus) was compared across genotype groups. ERO theta power at central and parietal regions increased as a function of the minor allele (A) dose in the genotype (AA>AG>GG) in both loss and gain conditions. These findings indicate that variations in the KCNJ6 SNP influence the magnitude of theta oscillations at posterior loci during the evaluation of loss and gain, reflecting a genetic influence on neuronal circuits involved in reward-processing. Increased theta power as a function of minor allele dose suggests more efficient cognitive processing in those carrying the minor allele of the KCNJ6 SNPs. Future studies are needed to determine the implications of these genetic effects on posterior theta EROs as possible "protective" factors, or as indices of delays in brain maturation (i.e., lack of frontalization). Copyright © 2016 Elsevier B.V. All rights reserved.
Pecavar, Verena; Blaschitz, Marion; Hufnagl, Peter; Zeinzinger, Josef; Fiedler, Anita; Allerberger, Franz; Maass, Matthias; Indra, Alexander
2012-06-01
Clostridium difficile, a Gram-positive, spore-forming, anaerobic bacterium, is the main causative agent of hospital-acquired diarrhoea worldwide. In addition to metronidazole and vancomycin, rifaximin, a rifamycin derivative, is a promising antibiotic for the treatment of recurring C. difficile infections (CDI). However, exposure of C. difficile to this antibiotic has led to the development of rifaximin-resistance due to point mutations in the β-subunit of the RNA polymerase (rpoB) gene. In the present study, 348 C. difficile strains with known PCR-ribotypes were investigated for respective single nucleotide polymorphisms (SNPs) within the proposed rpoB hot-spot region by using high-resolution melting (HRM) analysis. This method allows the detection of SNPs by comparing the altered melting behaviour of dsDNA with that of wild-type DNA. Discrimination between wild-type and mutant strains was enhanced by creating heteroduplexes by mixing sample DNA with wild-type DNA, leading to characteristic melting curve shapes from samples containing SNPs in the respective rpoB section. In the present study, we were able to identify 16 different rpoB sequence-types (ST) by sequencing analysis of a 325 bp fragment. The 16 PCR STs displayed a total of 24 different SNPs. Fifteen of these 24 SNPs were located within the proposed 151 bp SNP hot-spot region, resulting in 11 different HRM curve profiles (CP). Eleven SNPs (seven of which were within the proposed hot-spot region) led to amino acid substitutions associated with reduced susceptibility to rifaximin and 13 SNPs (eight of which were within the hot-spot region) were synonymous. This investigation clearly demonstrates that HRM analysis of the proposed SNP hot-spot region in the rpoB gene of C. difficile is a fast and cost-effective method for the identification of C. difficile samples with reduced susceptibility to rifaximin and even allows simultaneous SNP subtyping of the respective C. difficile isolates.
Núñez-Acuña, Gustavo; Aguilar-Espinoza, Andrea; Chávez-Mardones, Jacqueline; Gallardo-Escárate, Cristian
2012-10-01
Ubiquitin-conjugated E2 enzyme (UBE2) is one of the main components of the proteasome degradation cascade. Previous studies have shown increased expression levels in individuals challenged with pathogens such as viruses and bacteria. The aim of this study was to characterize the immune response of the UBE2 gene in the gastropod Concholepas concholepas through expression analysis and single nucleotide polymorphism (SNP) discovery. UBE2 was identified from a cDNA library by 454 pyrosequencing, while SNP identification and validation were performed using de novo assembly and high resolution melting analysis. Challenge trials with Vibrio anguillarum were carried out to evaluate the relative transcript abundance of the UBE2 gene from two to thirty-three hours post-treatment. The results showed a partial UBE2 sequence of 889 base pairs (bp) with a partial coding region of 291 bp. SNP variation (A/C) was observed at the 546th position. Individuals challenged by V. anguillarum showed an overexpression of the UBE2 gene, the expression being significantly higher in homozygous (AA) individuals than in homozygous (CC) or heterozygous (A/C) individuals. This study contributes useful information relating to the UBE2 gene and its association with innate immune response in marine invertebrates. Copyright © 2012 Elsevier Ltd. All rights reserved.
Identification of a Novel Idiopathic Epilepsy Locus in Belgian Shepherd Dogs
Seppälä, Eija H.; Koskinen, Lotta L. E.; Gulløv, Christina H.; Jokinen, Päivi; Karlskov-Mortensen, Peter; Bergamasco, Luciana; Baranowska Körberg, Izabella; Cizinauskas, Sigitas; Oberbauer, Anita M.; Berendt, Mette; Fredholm, Merete; Lohi, Hannes
2012-01-01
Epilepsy is the most common neurological disorder in dogs, with an incidence ranging from 0.5% to up to 20% in particular breeds. Canine epilepsy can be etiologically defined as idiopathic or symptomatic. Epileptic seizures may be classified as focal with or without secondary generalization, or as primary generalized. Nine genes have been identified for symptomatic (storage diseases) and one for idiopathic epilepsy in different breeds. However, the genetic background of common canine epilepsies remains unknown. We have studied the clinical and genetic background of epilepsy in Belgian Shepherds. We collected 159 cases and 148 controls and confirmed the presence of epilepsy through epilepsy questionnaires and clinical examinations. The MRI was normal while interictal EEG revealed abnormalities and variable foci in the clinically examined affected dogs. A genome-wide association study using Affymetrix 50K SNP arrays in 40 cases and 44 controls mapped the epilepsy locus on CFA37, which was replicated in an independent cohort (81 cases and 88 controls; combined p = 9.70×10⁻¹⁰, OR = 3.3). Fine mapping study defined a ∼1 Mb region including 12 genes of which none are known epilepsy genes or encode ion channels. Exonic sequencing was performed for two candidate genes, KLF7 and ADAM23. No variation was found in KLF7 but a highly-associated non-synonymous variant, G1203A (R387H) was present in the ADAM23 gene (p = 3.7×10⁻⁸, OR = 3.9 for homozygosity). Homozygosity for a two-SNP haplotype within the ADAM23 gene conferred the highest risk for epilepsy (p = 6.28×10⁻¹¹, OR = 7.4). ADAM23 interacts with known epilepsy proteins LGI1 and LGI2. However, our data suggests that the ADAM23 variant is a polymorphism and we have initiated a targeted re-sequencing study across the locus to identify the causative mutation.
It would establish the affected breed as a novel therapeutic model, help to develop a DNA test for breeding purposes and introduce a novel candidate gene for human idiopathic epilepsies. PMID:22457775
PGen: large-scale genomic variations analysis workflow and browser in SoyKB.
Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti
2016-10-06
With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. 
PGen workflow has been optimized for the most efficient analysis of soybean data using thorough testing and validation. This research serves as an example of best practices for development of genomics data analysis workflows by integrating remote HPC resources and efficient data management with ease of use for biological users. PGen workflow can also be easily customized for analysis of data in other species.
Islam, Md S; Zeng, Linghe; Thyssen, Gregory N; Delhom, Christopher D; Kim, Hee Jin; Li, Ping; Fang, David D
2016-06-01
Three QTL regions controlling three fiber quality traits were validated and further fine-mapped with 27 new single nucleotide polymorphism (SNP) markers. Transcriptome analysis suggests that receptor-like kinases found within the validated QTLs are potential candidate genes responsible for superior fiber strength in cotton line MD52ne. Fiber strength, length, maturity and fineness determine the market value of cotton fibers and the quality of spun yarn. Cotton fiber strength has been recognized as a critical quality attribute in the modern textile industry. Fine mapping along with quantitative trait loci (QTL) validation and candidate gene prediction can uncover the genetic and molecular basis of fiber quality traits. Four previously-identified QTLs (qFBS-c3, qSFI-c14, qUHML-c14 and qUHML-c24) related to fiber bundle strength, short fiber index and fiber length, respectively, were validated using an F3 population that originated from a cross of MD90ne × MD52ne. A group of 27 new SNP markers generated from mapping-by-sequencing (MBS) were placed in QTL regions to improve and validate earlier maps. Our refined QTL regions spanned 4.4, 1.8 and 3.7 Mb of physical distance in the Gossypium raimondii reference genome. We performed RNA sequencing (RNA-seq) of 15 and 20 days post-anthesis fiber cells from MD52ne and MD90ne and aligned reads to the G. raimondii genome. The QTL regions contained 21 significantly differentially expressed genes (DEGs) between the two near-isogenic parental lines. SNPs that result in non-synonymous substitutions to amino acid sequences of annotated genes were identified within these DEGs, and mapped. Taken together, transcriptome and amino acid mutation analysis indicate that receptor-like kinase pathway genes are likely candidates for superior fiber strength and length in MD52ne. 
MBS along with RNA-seq demonstrated a powerful strategy to elucidate candidate genes for the QTLs that control complex traits in a complex genome like tetraploid upland cotton.
Munns, Krysty D.; Zaheer, Rahat; Xu, Yong; Stanford, Kim; Laing, Chad R.; Gannon, Victor P. J.; Selinger, L. Brent; McAllister, Tim A.
2016-01-01
Cattle are the primary reservoir of the foodborne pathogen Escherichia coli O157:H7, with the concentration and frequency of E. coli O157:H7 shedding varying substantially among individual hosts. The term "super-shedder" has been applied to cattle that shed ≥10⁴ cfu E. coli O157:H7/g of feces. Super-shedders have been reported to be responsible for the majority of E. coli O157:H7 shed into the environment. The objective of this study was to determine if there are phenotypic and/or genotypic differences between E. coli O157:H7 isolates obtained from super-shedder compared to low-shedder cattle. From a total of 784 isolates, four were selected from low-shedder steers and six isolates from super-shedder steers (4.01–8.45 log cfu/g feces) for whole genome sequencing. Isolates were phage and clade typed, screened for substrate utilization, pH sensitivity, virulence gene profiles and Stx bacteriophage insertion (SBI) sites. A range of 89–2473 total single nucleotide polymorphisms (SNPs) were identified when sequenced strains were compared to E. coli O157:H7 strain Sakai. More non-synonymous SNP mutations were observed in low-shedder isolates. Pan-genomic and SNP comparisons did not identify genetic segregation between super-shedder and low-shedder isolates. All super-shedder isolates and 3 of 4 low-shedder isolates were typed as phage type 14a, SBI cluster 3 and SNP clade 2. Super-shedder isolates displayed increased utilization of galactitol, thymidine and 3-O-β-D-galactopyranosyl-D-arabinose when compared to low-shedder isolates, but no differences in SNPs were observed in genes encoding for proteins involved in the metabolism of these substrates. While genetic traits specific to super-shedder isolates were not identified in this study, differences in the level of gene expression or genes of unknown function may still contribute to some strains of E. coli O157:H7 reaching high densities within bovine feces. PMID:27018858
A de novo missense mutation of FGFR2 causes facial dysplasia syndrome in Holstein cattle.
Agerholm, Jørgen S; McEvoy, Fintan J; Heegaard, Steffen; Charlier, Carole; Jagannathan, Vidhya; Drögemüller, Cord
2017-08-02
Surveillance for bovine genetic diseases in Denmark identified a hitherto unreported congenital syndrome occurring among progeny of a Holstein sire used for artificial breeding. A genetic aetiology due to a dominant inheritance with incomplete penetrance or a mosaic germline mutation was suspected as all recorded cases were progeny of the same sire. Detailed investigations were performed to characterize the syndrome and to reveal its cause. Seven malformed calves were submitted for examination. All cases shared a common morphology with the most striking lesions being severe facial dysplasia and complete prolapse of the eyes. Consequently, the syndrome was named facial dysplasia syndrome (FDS). Furthermore, extensive brain malformations, including microencephaly, hydrocephalus, lobation of the cerebral hemispheres and compression of the brain were present. Subsequent data analysis of progeny of the sire revealed that around 0.5% of his offspring suffered from FDS. High density single nucleotide polymorphism (SNP) genotyping data of the seven cases and their parents were used to map the defect in the bovine genome. Significant genetic linkage was obtained for three regions, including chromosome 26 where whole genome sequencing of a case-parent trio revealed two de novo variants perfectly associated with the disease: an intronic SNP in the DMBT1 gene and a single non-synonymous variant in the FGFR2 gene. This FGFR2 missense variant (c.927G>T) affects a gene encoding a member of the fibroblast growth factor receptor family, where amino acid sequence is highly conserved between members and across species. It is predicted to change an evolutionarily conserved tryptophan into a cysteine residue (p.Trp309Cys). Both variant alleles were proven to result from de novo mutation events in the germline of the sire. FDS is a novel genetic disorder of Holstein cattle. Mutations in the human FGFR2 gene are associated with various dominant inherited craniofacial dysostosis syndromes.
Given the phenotypic similarities in FDS affected calves, the genetic mapping and absence of further high impact variants in the critical genome regions, it is highly likely that the missense mutation in the FGFR2 gene caused the FDS phenotype in a dominant mode of inheritance.
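The link between the coding-sequence position (c.927G>T) and the protein change (p.Trp309Cys) is simple codon arithmetic, sketched below. The helper names are hypothetical, and the reference codon TGG is an assumption: it is the only tryptophan codon in the standard genetic code, which is consistent with a G at the third codon position. The actual bovine FGFR2 sequence is not consulted here.

```python
# Only the codons needed for this illustration (standard genetic code).
CODON_TO_AA = {"TGG": "Trp", "TGT": "Cys", "TGC": "Cys"}


def codon_position(cdna_pos):
    """Map a 1-based coding-sequence position to (codon number, position in codon)."""
    index = cdna_pos - 1
    return index // 3 + 1, index % 3 + 1


def substitute(codon, pos_in_codon, alt_base):
    """Return the codon with the base at the 1-based position replaced."""
    bases = list(codon)
    bases[pos_in_codon - 1] = alt_base
    return "".join(bases)


codon_number, pos_in_codon = codon_position(927)    # third base of codon 309
ref_codon = "TGG"                                   # assumed reference codon (Trp)
alt_codon = substitute(ref_codon, pos_in_codon, "T")
protein_change = (
    f"p.{CODON_TO_AA[ref_codon]}{codon_number}{CODON_TO_AA[alt_codon]}"
)
print(protein_change)
```

Position 927 is the last base of codon 309 (927 = 3 × 309), so a G>T change there turns TGG into TGT, in agreement with the reported tryptophan-to-cysteine substitution.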
EvoSNP-DB: A database of genetic diversity in East Asian populations.
Kim, Young Uk; Kim, Young Jin; Lee, Jong-Young; Park, Kiejung
2013-08-01
Genome-wide association studies (GWAS) have become popular as an approach for the identification of large numbers of phenotype-associated variants. However, differences in genetic architecture and environmental factors mean that the effect of variants can vary across populations. Understanding population genetic diversity is valuable for the investigation of possible population specific and independent effects of variants. EvoSNP-DB aims to provide information regarding genetic diversity among East Asian populations, including Chinese, Japanese, and Korean. Non-redundant SNPs (1.6 million) were genotyped in 54 Korean trios (162 samples) and were compared with 4 million SNPs from HapMap phase II populations. EvoSNP-DB provides two user interfaces for data query and visualization, and integrates scores of genetic diversity (Fst and VarLD) at the level of SNPs, genes, and chromosome regions. EvoSNP-DB is a web-based application that allows users to navigate and visualize measurements of population genetic differences in an interactive manner, and is available online at [http://biomi.cdc.go.kr/EvoSNP/].
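To make the Fst score concrete, the following sketch computes a basic heterozygosity-based Fst for one biallelic SNP in two populations. This is an illustrative textbook form with equal population weights; the estimator actually used by EvoSNP-DB is not stated in the abstract, and the function name `fst_two_populations` is hypothetical.

```python
def fst_two_populations(p1, p2):
    """Fst = (H_T - H_S) / H_T for one biallelic SNP, equal population weights.

    p1, p2: frequency of the same allele in population 1 and population 2.
    """
    p_mean = (p1 + p2) / 2.0
    h_total = 2.0 * p_mean * (1.0 - p_mean)           # pooled expected heterozygosity
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:                                # SNP is monomorphic overall
        return 0.0
    return (h_total - h_within) / h_total
```

Identical allele frequencies give a value of 0 and fixed opposite alleles give 1; intermediate values indicate how strongly a SNP differs between, for example, two East Asian populations.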
Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Bellato, Cláudia M; Motilal, Lambert; Zhang, Dapeng
2014-01-15
Cacao (Theobroma cacao L.), the source of cocoa, is an economically important tropical crop. One problem with the premium cacao market is contamination with off-types adulterating raw premium material. Accurate determination of the genetic identity of single cacao beans is essential for ensuring cocoa authentication. Using nanofluidic single nucleotide polymorphism (SNP) genotyping with 48 SNP markers, we generated SNP fingerprints for small quantities of DNA extracted from the seed coat of single cacao beans. On the basis of the SNP profiles, we identified an assumed adulterant variety, which was unambiguously distinguished from the authentic beans by multilocus matching. Assignment tests based on both Bayesian clustering analysis and allele frequency clearly separated all 30 authentic samples from the non-authentic samples. Distance-based principal coordinate analysis further supported these results. The nanofluidic SNP protocol, together with forensic statistical tools, is sufficiently robust to establish authentication and to verify gourmet cacao varieties. This method shows significant potential for practical application.
Chung, Jonathan H.; Cai, Jinlu; Suskin, Barrie G.; Zhang, Zhengdong; Coleman, Karlene
2015-01-01
The 22q11.2 deletion syndrome (22q11DS) affects 1:4000 live births and presents with highly variable phenotype expressivity. In this study, we developed an analytical approach utilizing whole genome sequencing and integrative analysis to discover genetic modifiers. Our pipeline combined available tools in order to prioritize rare, predicted deleterious, coding and non-coding single nucleotide variants (SNVs) and insertion/deletions (INDELs) from whole genome sequencing (WGS). We sequenced two unrelated probands with 22q11DS, with contrasting clinical findings, and their unaffected parents. Proband P1 had cognitive impairment, psychotic episodes, anxiety, and tetralogy of Fallot (TOF); while proband P2 had juvenile rheumatoid arthritis but no other major clinical findings. In P1, we identified common variants in COMT and PRODH on 22q11.2 as well as rare potentially deleterious DNA variants in other behavioral/neurocognitive genes. We also identified a de novo SNV in ADNP2 (NM_014913.3:c.2243G>C), encoding a neuroprotective protein that may be involved in behavioral disorders. In P2, we identified a novel non-synonymous SNV in ZFPM2 (NM_012082.3:c.1576C>T), a known causative gene for TOF, which may act as a protective variant downstream of TBX1, haploinsufficiency of which is responsible for congenital heart disease in individuals with 22q11DS. PMID:25981510
IL-23 Receptor (IL-23R) Gene Protects Against Pediatric Crohn’s Disease
Dubinsky, Marla C.; Wang, Dai; Picornell, Yoana; Wrobel, Iwona; Katzir, Lirona; Quiros, Antonio; Dutridge, Debra; Wahbeh, Ghassan; Silber, Gary; Bahar, Ron; Mengesha, Emebet; Targan, Stephan R.; Taylor, Kent D.; Rotter, Jerome I.
2007-01-01
Background The IL-23 receptor (IL-23R) has been found to be associated with small bowel Crohn’s disease (CD) in a whole genome association study. Specifically, the rare allele of the R381Q single nucleotide polymorphism (SNP) conferred protection against CD. It is unknown whether IL-23R is associated with IBD in children. The aim was to examine the association of IL-23R with susceptibility to IBD in pediatric patients. Methods DNA was collected from 609 subjects (151 CD and 52 ulcerative colitis [UC] trios). Trios were genotyped for the R381Q SNP of the IL-23R gene and SNP8, SNP12, SNP13, of the CARD15 gene using Taqman. The transmission disequilibrium test (TDT) was used for association to disease using GENEHUNTER 2.0. Results The rare allele of R381Q SNP was present in 2.7% of CD and 2.9% UC probands. The CARD15 frequency was 31.5% (CD) and 18% (UC). The IL-23R allele was negatively associated with inflammatory bowel disease (IBD): the R381Q SNP was undertransmitted in children with IBD (8 transmitted [T] versus 27 untransmitted [UT]; P = 0.001). This association was significant for all CD patients (6 T versus 19 UT; P = 0.009), especially for non-Jewish CD patients (2 T versus 17 UT; P = 0.0006). TDT showed a borderline association for UC (2 T versus 8 UT; P = 0.06). As expected, CARD15 was associated with CD in children by the TDT (58 T versus 22 UT P = 0.00006), but not with UC. Conclusions The protective IL-23R R381Q variant was particularly associated with CD in non-Jewish children. Thus, the initial whole genome association study based on ileal CD in adults has been extended to the pediatric population and beyond small bowel CD. PMID:17309073
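The transmission counts above can be turned into a test statistic with the standard McNemar-type form of the TDT, sketched here with the Python standard library only. This is the textbook asymptotic version; whether GENEHUNTER 2.0 computes its P-values in exactly this way is an assumption, and the function name `tdt` is hypothetical.

```python
import math


def tdt(transmitted, untransmitted):
    """Asymptotic TDT: chi-square = (T - UT)^2 / (T + UT) with 1 degree of freedom."""
    chi2 = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    # Upper tail of a 1-df chi-square equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Transmission counts as reported in the abstract.
for name, t, ut in [
    ("IL-23R R381Q, all IBD", 8, 27),
    ("IL-23R R381Q, CD", 6, 19),
    ("IL-23R R381Q, UC", 2, 8),
    ("CARD15, CD", 58, 22),
]:
    chi2, p = tdt(t, ut)
    print(f"{name}: chi2 = {chi2:.2f}, P = {p:.2g}")
```

Under this approximation the counts give P-values of the same order as those reported, with fewer transmissions than non-transmissions indicating under-transmission of the allele to affected children.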
MRI Reveals Edema in Larynx (But Not in Brain) During Anaphylactic Hypotension in Anesthetized Rats
Toyota, Ichiro; Tanida, Mamoru; Wang, Mofei; Kurata, Yasutaka; Tonami, Hisao
2013-01-01
Purpose Anaphylactic shock is sometimes accompanied by local interstitial edema due to increased vascular permeability. We performed magnetic resonance imaging (MRI) to compare edema in the larynx and brain of anesthetized rats during anaphylactic hypotension versus vasodilator-induced hypotension. Methods Male Sprague Dawley rats were subjected to hypotension induced by the ovalbumin antigen (n=7) or a vasodilator sodium nitroprusside (SNP; n=7). Apparent diffusion coefficient (ADC) and T2-relaxation time (T2RT) were quantified on MRI performed repeatedly for up to 68 min after the injection of either agent. The presence of laryngeal edema was also examined by histological examination. Separately, the occurrence of brain edema was assessed by measuring brain water content using the wet/dry method in rats with anaphylaxis (n=5) or SNP (n=5) and the non-hypotensive control rats (n=5). Mast cells in hypothalamus were morphologically examined. Results Mean arterial blood pressure similarly decreased to 35 mmHg after an injection of the antigen or SNP. Hyperintensity on T2-weighted images (as reflected by elevated T2RT) was found in the larynx as early as 13 min after an injection of the antigen, but not SNP. A postmortem histological examination revealed epiglottic edema in the rats with anaphylaxis, but not SNP. In contrast, no significant changes in T2RT or ADC were detectable in the brains of any rats studied. In separate experiments, the quantified brain water content did not increase in either anaphylaxis or SNP rats, as compared with the non-hypotensive control rats. The numbers of mast cells with metachromatic granules in the hypothalamus were not different between rats with anaphylaxis and SNP, suggesting the absence of anaphylactic reaction in hypothalamus. Conclusion Edema was detected using the MRI technique in the larynx during rat anaphylaxis, but not in the brain. PMID:24179686
MRI reveals edema in larynx (but not in brain) during anaphylactic hypotension in anesthetized rats.
Toyota, Ichiro; Tanida, Mamoru; Shibamoto, Toshishige; Wang, Mofei; Kurata, Yasutaka; Tonami, Hisao
2013-11-01
Anaphylactic shock is sometimes accompanied by local interstitial edema due to increased vascular permeability. We performed magnetic resonance imaging (MRI) to compare edema in the larynx and brain of anesthetized rats during anaphylactic hypotension versus vasodilator-induced hypotension. Male Sprague Dawley rats were subjected to hypotension induced by the ovalbumin antigen (n=7) or a vasodilator sodium nitroprusside (SNP; n=7). Apparent diffusion coefficient (ADC) and T2-relaxation time (T2RT) were quantified on MRI performed repeatedly for up to 68 min after the injection of either agent. The presence of laryngeal edema was also examined by histological examination. Separately, the occurrence of brain edema was assessed by measuring brain water content using the wet/dry method in rats with anaphylaxis (n=5) or SNP (n=5) and the non-hypotensive control rats (n=5). Mast cells in hypothalamus were morphologically examined. Mean arterial blood pressure similarly decreased to 35 mmHg after an injection of the antigen or SNP. Hyperintensity on T2-weighted images (as reflected by elevated T2RT) was found in the larynx as early as 13 min after an injection of the antigen, but not SNP. A postmortem histological examination revealed epiglottic edema in the rats with anaphylaxis, but not SNP. In contrast, no significant changes in T2RT or ADC were detectable in the brains of any rats studied. In separate experiments, the quantified brain water content did not increase in either anaphylaxis or SNP rats, as compared with the non-hypotensive control rats. The numbers of mast cells with metachromatic granules in the hypothalamus were not different between rats with anaphylaxis and SNP, suggesting the absence of anaphylactic reaction in hypothalamus. Edema was detected using the MRI technique in the larynx during rat anaphylaxis, but not in the brain.
Whole-exome sequencing and high throughput genotyping identified KCNJ11 as the thirteenth MODY gene.
Bonnefond, Amélie; Philippe, Julien; Durand, Emmanuelle; Dechaume, Aurélie; Huyvaert, Marlène; Montagne, Louise; Marre, Michel; Balkau, Beverley; Fajardy, Isabelle; Vambergue, Anne; Vatin, Vincent; Delplanque, Jérôme; Le Guilcher, David; De Graeve, Franck; Lecoeur, Cécile; Sand, Olivier; Vaxillaire, Martine; Froguel, Philippe
2012-01-01
Maturity-onset diabetes of the young (MODY) is a clinically heterogeneous form of diabetes characterized by an autosomal-dominant mode of inheritance, an onset before the age of 25 years, and a primary defect in the pancreatic beta-cell function. Approximately 30% of MODY families remain genetically unexplained (MODY-X). Here, we aimed to use whole-exome sequencing (WES) in a four-generation MODY-X family to identify a new susceptibility gene for MODY. WES (Agilent-SureSelect capture/Illumina-GAIIx sequencing) was performed in three affected and one non-affected relatives in the MODY-X family. We then performed a high-throughput multiplex genotyping (Illumina-GoldenGate assay) of the putative causal mutations in the whole family and in 406 controls. A linkage analysis was also carried out. By focusing on variants of interest (i.e. gains of stop codon, frameshift, non-synonymous and splice-site variants not reported in dbSNP130) present in the three affected relatives and not present in the control, we found 69 mutations. However, as WES was not uniform between samples, a total of 324 mutations had to be assessed in the whole family and in controls. Only one mutation (p.Glu227Lys in KCNJ11) co-segregated with diabetes in the family (with a LOD-score of 3.68). No KCNJ11 mutation was found in 25 other MODY-X unrelated subjects. Beyond neonatal diabetes mellitus (NDM), KCNJ11 is also a MODY gene ('MODY13'), confirming the wide spectrum of diabetes related phenotypes due to mutations in NDM genes (i.e. KCNJ11, ABCC8 and INS). Therefore, the molecular diagnosis of MODY should include KCNJ11 as affected carriers can be ideally treated with oral sulfonylureas.
Bigi, María Mercedes; Lopez, Beatriz; Blanco, Federico Carlos; Sasiain, María Del Carmen; De la Barrera, Silvia; Marti, Marcelo A; Sosa, Ezequiel Jorge; Fernández Do Porto, Darío Augusto; Ritacco, Viviana; Bigi, Fabiana; Soria, Marcelo Abel
2017-03-01
Globally, about 4.5% of new tuberculosis (TB) cases are multi-drug-resistant (MDR), i.e. resistant to the two most powerful first-line anti-TB drugs. Indeed, 480,000 people developed MDR-TB in 2015 and 190,000 people died because of MDR-TB. The MDR Mycobacterium tuberculosis M family, which belongs to the Haarlem lineage, is highly prosperous in Argentina and capable of building up further drug resistance without impairing its ability to spread. In this study, we sequenced the whole genomes of a highly prosperous M-family strain (Mp) and its contemporary variant, strain 410, which produced only one recorded tuberculosis case in the last two decades. Previous reports have demonstrated that Mp induced dysfunctional CD8+ cytotoxic T cell activity, suggesting that this strain has the ability to evade the immune response against M. tuberculosis. Comparative analysis of Mp and 410 genomes revealed non-synonymous polymorphisms in eleven genes and five intergenic regions with polymorphisms between both strains. Some of these genes and promoter regions are involved in the metabolism of cell wall components and others in drug resistance; in addition, a SNP in Rv1861, a gene encoding a putative transglycosylase, produces a truncated protein in Mp. The mutation in Rv3787c, a putative S-adenosyl-l-methionine-dependent methyltransferase, is conserved in all of the other prosperous M strains here analysed and absent in non-prosperous M strains. Remarkably, three polymorphic promoter regions displayed differential transcriptional activity between Mp and 410. We speculate that the observed mutations/polymorphisms are associated with the reported higher capacity of Mp for modulating the host's immune response. Copyright © 2017 Elsevier Ltd. All rights reserved.
Identifying predictors of time-inhomogeneous viral evolutionary processes.
Bielejec, Filip; Baele, Guy; Rodrigo, Allen G; Suchard, Marc A; Lemey, Philippe
2016-07-01
Various factors determine the rate at which mutations are generated and fixed in viral genomes. Viral evolutionary rates may vary over the course of a single persistent infection and can reflect changes in replication rates and selective dynamics. Dedicated statistical inference approaches are required to understand how the complex interplay of these processes shapes the genetic diversity and divergence in viral populations. Although evolutionary models accommodating a high degree of complexity can now be formalized, adequately informing these models by potentially sparse data, and assessing the association of the resulting estimates with external predictors, remains a major challenge. In this article, we present a novel Bayesian evolutionary inference method, which integrates multiple potential predictors and tests their association with variation in the absolute rates of synonymous and non-synonymous substitutions along the evolutionary history. We consider clinical and virological measures as predictors, but also changes in population size trajectories that are simultaneously inferred using coalescent modelling. We demonstrate the potential of our method in an application to within-host HIV-1 sequence data sampled throughout the infection of multiple patients. While analyses of individual patient populations lack statistical power, we detect significant evidence for an abrupt drop in non-synonymous rates in late stage infection and a more gradual increase in synonymous rates over the course of infection in a joint analysis across all patients. The former is predicted by the immune relaxation hypothesis while the latter may be in line with increasing replicative fitness during the asymptomatic stage.
Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka
2014-01-01
Objective Using European descent Czech populations, we performed a study of SLC2A9 and SLC22A12 genes previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 except for patients suffering from renal hypouricemia type 2. Methods The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and sequenced directly. Allelic variants were prepared and their urate uptake and subcellular localization were studied by Xenopus oocytes expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskall-Wallis tests; the association study used the Fisher exact test and linear regression approach. Results We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 and in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Conclusion Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants did not show any influence on expression, subcellular localization and urate uptake of GLUT9. PMID:25268603
McClure, Matthew C.; McCarthy, John; Flynn, Paul; McClure, Jennifer C.; Dair, Emma; O'Connell, D. K.; Kearney, John F.
2018-01-01
A major use of genetic data is parentage verification and identification as inaccurate pedigrees negatively affect genetic gain. Since 2012 the international standard for single nucleotide polymorphism (SNP) verification in Bos taurus cattle has been the ISAG SNP panels. While these ISAG panels provide an increased level of parentage accuracy over microsatellite markers (MS), they can validate the wrong parent at ≤1% misconcordance rate levels, indicating that more SNP are needed if a more accurate pedigree is required. With rapidly increasing numbers of cattle being genotyped in Ireland that represent 61 B. taurus breeds from a wide range of farm types (beef/dairy, AI/pedigree/commercial, purebred/crossbred, and large to small herd size), the Irish Cattle Breeding Federation (ICBF) analyzed different SNP densities to determine that at a minimum ≥500 SNP are needed to consistently predict only one set of parents at a ≤1% misconcordance rate. For parentage validation and prediction ICBF uses 800 SNP (ICBF800) selected based on SNP clustering quality, ISAG200 inclusion, call rate (CR), and minor allele frequency (MAF) in the Irish cattle population. Large datasets require sample and SNP quality control (QC). Most publications only deal with SNP QC via CR, MAF, parent-progeny conflicts, and Hardy-Weinberg deviation, but not sample QC. We report here parentage, SNP QC, and genomic sample QC pipelines to deal with the unique challenges of >1 million genotypes from a national herd such as SNP genotype errors from mis-tagging of animals, lab errors, farm errors, and multiple other issues that can arise. We divide the pipeline into two parts: a Genotype QC and an Animal QC pipeline. The Genotype QC identifies samples with low call rate, missing or mixed genotype classes (no BB genotype or ABTG alleles present), and low genotype frequencies. 
The Animal QC handles situations where the genotype might not belong to the listed individual by identifying: >1 non-matching genotypes per animal, SNP duplicates, sex and breed prediction mismatches, parentage and progeny validation results, and other situations. The Animal QC pipeline makes use of the ICBF800 SNP set where appropriate to identify errors in a computationally efficient yet still highly accurate method. PMID:29599798
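The call rate, MAF and missing-genotype-class checks named in the abstract above can be illustrated with a minimal per-SNP sketch. The AA/AB/BB coding, the function name `snp_qc` and both thresholds are assumptions made for illustration; the abstract does not state the cut-offs ICBF applies.

```python
from collections import Counter

def snp_qc(genotypes, min_call_rate=0.90, min_maf=0.01):
    """Summarise one SNP; genotypes holds 'AA', 'AB', 'BB' or None (no call)."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    counts = Counter(called)
    n_alleles = 2 * len(called)
    # Frequency of the B allele among called genotypes
    freq_b = (2 * counts["BB"] + counts["AB"]) / n_alleles if n_alleles else 0.0
    maf = min(freq_b, 1 - freq_b)
    flags = []
    if call_rate < min_call_rate:  # hypothetical threshold
        flags.append("low_call_rate")
    if maf < min_maf:  # hypothetical threshold
        flags.append("low_maf")
    if called and counts["BB"] == 0:  # a missing genotype class
        flags.append("no_BB_genotype")
    return {"call_rate": call_rate, "maf": maf, "flags": flags}

print(snp_qc(["AA", "AB", "BB", None]))
```

The same counting, applied across the SNPs of one sample instead of across the samples of one SNP, would give a per-sample call rate.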
Dalman, Kerstin; Himmelstrand, Kajsa; Olson, Åke; Lind, Mårten; Brandström-Durling, Mikael; Stenlid, Jan
2013-01-01
The dense single nucleotide polymorphisms (SNP) panels needed for genome wide association (GWA) studies have hitherto been expensive to establish and use on non-model organisms. To overcome this, we used a next generation sequencing approach to both establish SNPs and to determine genotypes. We conducted a GWA study on a fungal species, analysing the virulence of Heterobasidion annosum s.s., a necrotrophic pathogen, on its hosts Picea abies and Pinus sylvestris. From a set of 33,018 single nucleotide polymorphisms (SNP) in 23 haploid isolates, twelve SNP markers distributed on seven contigs were associated with virulence (P<0.0001). Four of the contigs harbour known virulence genes from other fungal pathogens and the remaining three harbour novel candidate genes. Two contigs link closely to virulence regions recognized previously by QTL mapping in the congeneric hybrid H. irregulare × H. occidentale. Our study demonstrates the efficiency of GWA studies for dissecting important complex traits of small populations of non-model haploid organisms with small genomes. PMID:23341945
Kim, Daniel S.; Burt, Amber A.; Ranchalis, Jane E.; Jarvik, Ella R.; Rosenthal, Elisabeth A.; Hatsukami, Thomas S.; Furlong, Clement E.; Jarvik, Gail P.
2013-01-01
Cardiovascular disease (CVD) is the leading cause of death in developed countries. Plasma cholesterol level is a key risk factor in CVD pathogenesis. Genetic and dietary variation both influence plasma cholesterol; however, little is known about dietary interactions with genetic variants influencing the absorption and transport of dietary cholesterol. We sought to determine whether gut expressed variants predicting plasma cholesterol differentially affected the relationship between dietary and plasma cholesterol levels in 1,128 subjects (772/356 in the discovery/replication cohorts, respectively). Four single nucleotide polymorphisms (SNPs) within three genes (APOB, CETP, and NPC1L1) were significantly associated with plasma cholesterol in the discovery cohort. These were subsequently evaluated for gene-by-environment (GxE) interactions with dietary cholesterol for the prediction of plasma cholesterol, with significant findings tested for replication. Novel GxE interactions were identified and replicated for two variants: rs1042034, an APOB Ser4338Asn missense SNP and rs2072183 (in males only), a synonymous NPC1L1 SNP in linkage disequilibrium with SNPs 5′ of NPC1L1. This study identifies the presence of novel GxE and gender interactions implying that differential gut absorption is the basis for the variant associations with plasma cholesterol. These GxE interactions may account for part of the “missing heritability” not accounted for by genetic associations. PMID:23482652
Feline hypersomatotropism and acromegaly tumorigenesis: a potential role for the AIP gene.
Scudder, C J; Niessen, S J; Catchpole, B; Fowkes, R C; Church, D B; Forcada, Y
2017-04-01
Acromegaly in humans is usually sporadic, however up to 20% of familial isolated pituitary adenomas are caused by germline sequence variants of the aryl-hydrocarbon-receptor interacting protein (AIP) gene. Feline acromegaly has similarities to human acromegalic families with AIP mutations. The aim of this study was to sequence the feline AIP gene, identify sequence variants and compare the AIP gene sequence between feline acromegalic and control cats, and in acromegalic siblings. The feline AIP gene was amplified through PCR using whole blood genomic DNA from 10 acromegalic and 10 control cats, and 3 sibling pairs affected by acromegaly. PCR products were sequenced and compared with the published predicted feline AIP gene. A single nonsynonymous SNP was identified in exon 1 (AIP:c.9T > G) of two acromegalic cats and none of the control cats, as well as both members of one sibling pair. The region of this SNP is considered essential for the interaction of the AIP protein with its receptor. This sequence variant has not previously been reported in humans. Two additional synonymous sequence variants were identified (AIP:c.481C > T and AIP:c.826C > T). This is the first molecular study to investigate a potential genetic cause of feline acromegaly and identified a nonsynonymous AIP single nucleotide polymorphism in 20% of the acromegalic cat population evaluated, as well as in one of the sibling pairs evaluated. Copyright © 2016 Elsevier Inc. All rights reserved.
Peter, Valsa S
2013-01-15
Nitric oxide (NO), a short-lived freely diffusible radical gas that acts as an important biological signal, regulates an impressive spectrum of physiological functions in vertebrates including fishes. The action of NO, however, on thyroid hormone status and its role in the integration of acid-base, osmotic and metabolic balances during stress are not yet delineated in fish. Sodium nitroprusside (SNP), a NO donor, was employed in the present study to investigate the role of NO in the stressed air-breathing fish Anabas testudineus. Short-term SNP treatment (1 mM; 30 min) interacted negatively with thyroid axis, as evident in the fall of plasma thyroxine in both stressed and non-stressed fish. In contrast, the cortisol responsiveness to NO was negligible. SNP challenge produced systemic alkalosis, hypocapnia and hyperglycemia in non-stressed fish. Remarkable acid-base compensation was found in fish kept for 60 min net confinement where a rise in blood pH and HCO(3) content occurred with a reduction in PCO(2) content. SNP challenge in these fish, on the contrary, produced a rise in oxygen load together with hypocapnia but without an effect on HCO(3) content, indicating a modulator role of NO in respiratory gas transport during stress response. SNP treatment reduced Na(+), K(+) ATPase activity in the gill, intestine and liver of both stressed and non-stressed fish, and this suggests that stress state has little effect on the NO-driven osmotic competence of these organs. On the other hand, a modulatory effect of NO was found in the kidney which showed a differential response to SNP, emphasizing a key role of NO in kidney ion transport and its sensitivity to stressful condition. H(+)-ATPase activity, an index of H(+) secretion, downregulated in all the organs of both non-stressed and stressed fish except in the gill of non-stressed fish and this supports a role for NO in promoting alkalosis. 
The data indicate that NO (1) interacts antagonistically with T(4), (2) modifies respiratory gas transport and (3) integrates acid-base and osmotic actions during stress response in air-breathing fish. Collectively, this first evidence in fish indicates that NO can promote compensatory physiologic modification that can reduce the magnitude of stress-induced acid-base and osmotic disturbance, which suggests a role for NO in the ease and ease response of this fish. Copyright © 2012 Elsevier Inc. All rights reserved.
Chono, Makiko; Matsunaka, Hitoshi; Seki, Masako; Fujita, Masaya; Kiribuchi-Otobe, Chikako; Oda, Shunsuke; Kojima, Hisayo; Nakamura, Shingo
2015-01-01
In the wheat (Triticum aestivum L.) cultivar ‘Zenkoujikomugi’, a single nucleotide polymorphism (SNP) in the promoter of MOTHER OF FT AND TFL1 on chromosome 3A (MFT-3A) causes an increase in the level of gene expression, resulting in strong grain dormancy. We used a DNA marker to detect the ‘Zenkoujikomugi’-type (Zen-type) SNP and examined the genotype of MFT-3A in Japanese wheat varieties, and we found that 169 of 324 varieties carry the Zen-type SNP. In Japanese commercial varieties, the frequency of the Zen-type SNP was remarkably high in the southern part of Japan, but low in the northern part. To examine the relationship between MFT-3A genotype and grain dormancy, we performed a germination assay in three wheat-growing seasons. On average, the varieties carrying the Zen-type SNP showed stronger grain dormancy than the varieties carrying the non-Zen-type SNP. Among commercial cultivars, ‘Iwainodaichi’ (Kyushu), ‘Junreikomugi’ (Kinki-Chugoku-Shikoku), ‘Kinuhime’ (Kanto-Tokai), ‘Nebarigoshi’ (Tohoku-Hokuriku), and ‘Kitamoe’ (Hokkaido) showed the strongest grain dormancy in each geographical group, and all these varieties, except for ‘Kitamoe’, were found to carry the Zen-type SNP. In recent years, the number of varieties carrying the Zen-type SNP has increased in the Tohoku-Hokuriku region, but not in the Hokkaido region. PMID:25931984
Bertelsen, H P; Gregersen, V R; Poulsen, N; Nielsen, R O; Das, A; Madsen, L B; Buitenhuis, A J; Holm, L-E; Panitz, F; Larsen, L B; Bendixen, C
2016-04-01
Rennet-induced milk coagulation is an important trait for cheese production. Recent studies have reported an alarming frequency of cows producing poorly coagulating milk unsuitable for cheese production. Several genetic factors are known to affect milk coagulation, including variation in the major milk proteins; however, recent association studies indicate genetic effects from other genomic regions as well. The aim of this study was to detect genetic variation affecting milk coagulation properties, measured as curd-firming rate (CFR) and milk pH. This was achieved by examining allele frequency differences between pooled whole-genome sequences of phenotypically extreme samples (pool-seq). Curd-firming rate and raw milk pH were measured for 415 Danish Holstein cows, and each animal was sequenced at low coverage. Pools were created containing whole genome sequence reads from samples with "extreme" values (high or low) for both phenotypic traits. A total of 6,992,186 and 5,295,501 SNP were assessed in relation to CFR and milk pH, respectively. Allele frequency differences were calculated between pools and 32 significantly different SNP were detected, 1 for milk pH and 31 for CFR, of which 19 are located on chromosome 6. A total of 9 significant SNP, which were selected based on the possible function of proximal candidate genes, were genotyped in the entire sample set (n = 415) to test for an association. The most significant SNP was located proximal to , explaining 33% of the phenotypic variance. , coding for κ-casein, is the most studied in relation to milk coagulation due to its position on the surface of the casein micelles and the direct involvement in milk coagulation. Three additional SNP located on chromosome 6 showed significant associations explaining 7, 3.6, and 1.3% of the phenotypic variance of CFR. 
The significant SNP on chromosome 6 were shown to be in linkage disequilibrium with the SNP peaking proximal to ; however, after accounting for the genotype of the peak SNP within this QTL, significant effects (P-value < 0.1) could still be detected for 2 of the SNP accounting for 2 and 1% of the phenotypic variance. These 2 interesting SNP were located within introns or proximal to the candidate genes-solute carrier family 4 (sodium bicarbonate cotransporter), member 4 () and LIM and calponin homology domains 1 (), respectively-making them interesting targets for further analysis.
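The pool-seq comparison described above rests on the difference in allele frequency between the two extreme pools at each SNP. A minimal sketch follows, assuming alternate-allele read counts and read depths per pool are available. The two-proportion z-test is a stand-in chosen for illustration; the abstract does not say which test the authors used, and the function name `pool_freq_diff` is hypothetical.

```python
import math

def pool_freq_diff(alt_high, depth_high, alt_low, depth_low):
    """Allele frequency difference between two pools with a two-sided z-test."""
    f_high = alt_high / depth_high
    f_low = alt_low / depth_low
    # Pooled frequency under the null hypothesis of no difference
    pooled = (alt_high + alt_low) / (depth_high + depth_low)
    se = math.sqrt(pooled * (1 - pooled) * (1 / depth_high + 1 / depth_low))
    z = (f_high - f_low) / se if se > 0 else 0.0
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return f_high - f_low, z, p_value

# Illustrative read counts only, not data from the study
diff, z, p_value = pool_freq_diff(alt_high=60, depth_high=100, alt_low=30, depth_low=100)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p_value:.2e}")
```

With millions of SNP tested, any per-SNP p-value of this kind would still need a multiple-testing correction before being called significant.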
Cell-free DNA fragment-size distribution analysis for non-invasive prenatal CNV prediction.
Arbabi, Aryan; Rampášek, Ladislav; Brudno, Michael
2016-06-01
Non-invasive detection of aneuploidies in a fetal genome through analysis of cell-free DNA circulating in the maternal plasma is becoming a routine clinical test. Such tests, which rely on analyzing the read coverage or the allelic ratios at single-nucleotide polymorphism (SNP) loci, are not sensitive enough for smaller sub-chromosomal abnormalities due to sequencing biases and paucity of SNPs in a genome. We have developed an alternative framework for identifying sub-chromosomal copy number variations in a fetal genome. This framework relies on the size distribution of fragments in a sample, as fetal-origin fragments tend to be smaller than those of maternal origin. By analyzing the local distribution of the cell-free DNA fragment sizes in each region, our method allows for the identification of sub-megabase CNVs, even in the absence of SNP positions. To evaluate the accuracy of our method, we used a plasma sample with the fetal fraction of 13%, down-sampled it to samples with coverage of 10X-40X and simulated samples with CNVs based on it. Our method had a perfect accuracy (both specificity and sensitivity) for detecting 5 Mb CNVs, and after reducing the fetal fraction (to 11%, 9% and 7%), it could correctly identify 98.82-100% of the 5 Mb CNVs and had a true-negative rate of 95.29-99.76%. Our source code is available on GitHub at https://github.com/compbio-UofT/FSDA. Contact: brudno@cs.toronto.edu. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Saa, Pedro A.; Nielsen, Lars K.
2016-01-01
Motivation: Computation of steady-state flux solutions in large metabolic models is routinely performed using flux balance analysis based on a simple LP (Linear Programming) formulation. A minimal requirement for thermodynamic feasibility of the flux solution is the absence of internal loops, which are enforced using ‘loopless constraints’. The resulting loopless flux problem is a substantially harder MILP (Mixed Integer Linear Programming) problem, which is computationally expensive for large metabolic models. Results: We developed a pre-processing algorithm that significantly reduces the size of the original loopless problem into an easier and equivalent MILP problem. The pre-processing step employs a fast matrix sparsification algorithm—Fast-sparse null-space pursuit (SNP)—inspired by recent results on SNP. By finding a reduced feasible ‘loop-law’ matrix subject to known directionalities, Fast-SNP considerably improves the computational efficiency in several metabolic models running different loopless optimization problems. Furthermore, analysis of the topology encoded in the reduced loop matrix enabled identification of key directional constraints for the potential permanent elimination of infeasible loops in the underlying model. Overall, Fast-SNP is an effective and simple algorithm for efficient formulation of loop-law constraints, making loopless flux optimization feasible and numerically tractable at large scale. Availability and Implementation: Source code for MATLAB including examples is freely available for download at http://www.aibn.uq.edu.au/cssb-resources under Software. Optimization uses Gurobi, CPLEX or GLPK (the latter is included with the algorithm). Contact: lars.nielsen@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27559155
Soria, L A; Corva, P M; Branda Sica, A; Villarreal, E L; Melucci, L M; Mezzadra, C A; Papaleo Mazzucco, J; Fernández Macedo, G; Silvestro, C; Schor, A; Miquel, M C
2009-12-01
The PPARGC1A gene (peroxisome proliferator-activated receptor-gamma coactivator 1alpha gene) controls muscle fiber type and brown adipocyte differentiation; therefore, it is a candidate gene for beef quality traits (tenderness and fat content). Two SNPs (Single Nucleotide Polymorphisms) were identified within exon 8 by multiple alignment of DNA sequences obtained from 24 bulls: a transition G/A (SNP 1181) and a transversion A/T (SNP 1299). The SNP 1181 is a novel SNP, corresponding to a non-conservative substitution (AGT/AAT) that could be the cause of amino acid substitution ((364)Serine/(364)Asparagine). A Mismatch PCR method was designed to determine genotypes of 73 bulls and 268 steers for SNP 1181. Growth, slaughter and meat quality information were available for the group of steers. Allele A of SNP 1181 was not found in Angus. In 243 steers, no significant differences (P > 0.05) were found for either final live body weight, gain in backfat thickness in Spring, kidney fat weight, kidney fat percentage, Warner-Bratzler shear force at 7 days postmortem, intramuscular fat percentage or meat colour between genotype GG and AG. This SNP could be included in breed composition and population admixture analyses because there are marked differences in allelic frequencies between Bos taurus and Bos indicus breeds.
Cech, P
2001-01-01
Nearly eighty years after his death, Albert Adamkiewicz (1850-1921) has still been persisting in both the history of medicine owing to his work and in the medical terminology owing to eponymy: since his flourishing period toward the end of the XIXth century, the surname Adamkiewicz has entered the language of science as a proper-name constituent of anatomical, pathological, neurological, surgical as well as orthopaedic terms, combining with the appellatives stain, corpuscle or demilune, reaction or test, serum, syndrome as well as artery. Estimation of the actual vitality of particular eponymous terms compared with non-eponymous synonyms had to be the aim of the presented search in the scientific literature a century after. In contrast with the inert non-periodical (encyclopaedic) literature, periodicals have revealed all the eponymous terms fallen in oblivion except the 'Adamkiewicz artery' that has only recently been introduced in encyclopaedias although constantly preferred in periodicals of the period under investigation (appearing in 75% articles) over the most frequent non-eponymous synonym 'arteria radicularis magna / great(er) radicular artery' (scarcely 11% articles). Thanks to the 'artery' - joining furthermore several synonyms to appear nearly in 86% articles altogether - the surname Adamkiewicz persists in the living language of science; that is why its bearer ought to be remembered and mentioned even on the threshold of the XXIst century.
TMC-SNPdb: an Indian germline variant database derived from whole exome sequences.
Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit
2016-01-01
Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it is absent in the paired normal genome along with public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP DataBase (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed with a command line option or using an easy-to-use graphical user interface, with the ability to deplete additional Indian population specific SNPs over and above the dbSNP and 1000 Genomes databases. Using an institutionally generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% false positive somatic events post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following: Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html. © The Author(s) 2016. Published by Oxford University Press.
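The depletion step described in this abstract amounts to a set subtraction: candidate somatic calls that also appear in a germline resource are dropped. The Python sketch below illustrates that idea only; it is not the interface of the TMC-SNPdb subtraction tool, and the `Variant` type, the `deplete_germline` function and every variant listed are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Set


class Variant(NamedTuple):
    """Minimal variant key: chromosome, 1-based position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def deplete_germline(candidates: Iterable[Variant],
                     *germline_sets: Set[Variant]) -> List[Variant]:
    """Keep only candidate calls absent from every supplied germline set."""
    known = set().union(*germline_sets)
    return [v for v in candidates if v not in known]


# Hypothetical data, for illustration only.
calls = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 2000, "C", "T"),
    Variant("chr3", 3000, "G", "A"),
]
public_db = {Variant("chr1", 1000, "A", "G")}      # stands in for a public SNP database
population_db = {Variant("chr2", 2000, "C", "T")}  # stands in for a population-specific database

after_public = deplete_germline(calls, public_db)
after_both = deplete_germline(calls, public_db, population_db)
print(len(after_public), len(after_both))
```

Matching on the full (chromosome, position, ref, alt) key is a design choice in this sketch; a real pipeline would also need to normalise variant representation before comparing.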
Wildman, Derek E.; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I.; Goodman, Morris
2003-01-01
What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare ≈90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human–chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human–chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee). PMID:12766228
Wildman, Derek E; Uddin, Monica; Liu, Guozhen; Grossman, Lawrence I; Goodman, Morris
2003-06-10
What do functionally important DNA sites, those scrutinized and shaped by natural selection, tell us about the place of humans in evolution? Here we compare approximately 90 kb of coding DNA nucleotide sequence from 97 human genes to their sequenced chimpanzee counterparts and to available sequenced gorilla, orangutan, and Old World monkey counterparts, and, on a more limited basis, to mouse. The nonsynonymous changes (functionally important), like synonymous changes (functionally much less important), show chimpanzees and humans to be most closely related, sharing 99.4% identity at nonsynonymous sites and 98.4% at synonymous sites. On a time scale, the coding DNA divergencies separate the human-chimpanzee clade from the gorilla clade at between 6 and 7 million years ago and place the most recent common ancestor of humans and chimpanzees at between 5 and 6 million years ago. The evolutionary rate of coding DNA in the catarrhine clade (Old World monkey and ape, including human) is much slower than in the lineage to mouse. Among the genes examined, 30 show evidence of positive selection during descent of catarrhines. Nonsynonymous substitutions by themselves, in this subset of positively selected genes, group humans and chimpanzees closest to each other and have chimpanzees diverge about as much from the common human-chimpanzee ancestor as humans do. This functional DNA evidence supports two previously offered taxonomic proposals: family Hominidae should include all extant apes; and genus Homo should include three extant species and two subgenera, Homo (Homo) sapiens (humankind), Homo (Pan) troglodytes (common chimpanzee), and Homo (Pan) paniscus (bonobo chimpanzee).
Kouvelis, Vassili N; Ghikas, Dimitri V; Typas, Milton A
2004-10-01
The mitochondrial genome (mtDNA) of the entomopathogenic fungus Lecanicillium muscarium (synonym Verticillium lecanii), with a total size of 24,499 bp, has been analyzed. So far, it is the smallest known mitochondrial genome among Pezizomycotina, with an extremely compact gene organization and only one group-I intron in its large ribosomal RNA (rnl) gene. It contains the 14 typical genes coding for proteins related to oxidative phosphorylation, the two rRNA genes, one intronic ORF coding for a possible ribosomal protein (rps), and a set of 25 tRNA genes which recognize codons for all amino acids, except alanine and cysteine. All genes are transcribed from the same DNA strand. Gene order comparison with all available complete fungal mtDNAs (representatives of all four phyla are included) revealed some characteristic common features, like uninterrupted gene pairs, overlapping genes, and extremely variable intergenic regions, that can all be exploited for the study of fungal mitochondrial genomes. Moreover, a minimum common mtDNA gene order could be detected, in two units, for all known Sordariomycetes, namely nad1-nad4-atp8-atp6 and rns-cox3-rnl, which can be extended in Hypocreales to nad4L-nad5-cob-cox1-nad1-nad4-atp8-atp6 and rns-cox3-rnl nad2-nad3, respectively. Phylogenetic analysis of all fungal mtDNA essential protein-coding genes as one unit clearly demonstrated the superiority of small genome (mtDNA) over single gene comparisons.
Sims, Rebecca; van der Lee, Sven J.; Naj, Adam C.; Bellenguez, Céline; Badarinarayan, Nandini; Jakobsdottir, Johanna; Kunkle, Brian W.; Boland, Anne; Raybould, Rachel; Bis, Joshua C.; Martin, Eden R.; Grenier-Boley, Benjamin; Heilmann-Heimbach, Stefanie; Chouraki, Vincent; Kuzma, Amanda B.; Sleegers, Kristel; Vronskaya, Maria; Ruiz, Agustin; Graham, Robert R.; Olaso, Robert; Hoffmann, Per; Grove, Megan L.; Vardarajan, Badri N.; Hiltunen, Mikko; Nöthen, Markus M.; White, Charles C.; Hamilton-Nelson, Kara L.; Epelbaum, Jacques; Maier, Wolfgang; Choi, Seung-Hoan; Beecham, Gary W.; Dulary, Cécile; Herms, Stefan; Smith, Albert V.; Funk, Cory C.; Derbois, Céline; Forstner, Andreas J.; Ahmad, Shahzad; Li, Hongdong; Bacq, Delphine; Harold, Denise; Satizabal, Claudia L.; Valladares, Otto; Squassina, Alessio; Thomas, Rhodri; Brody, Jennifer A.; Qu, Liming; Sanchez-Juan, Pascual; Morgan, Taniesha; Wolters, Frank J.; Zhao, Yi; Garcia, Florentino Sanchez; Denning, Nicola; Fornage, Myriam; Malamon, John; Naranjo, Maria Candida Deniz; Majounie, Elisa; Mosley, Thomas H.; Dombroski, Beth; Wallon, David; Lupton, Michelle K; Dupuis, Josée; Whitehead, Patrice; Fratiglioni, Laura; Medway, Christopher; Jian, Xueqiu; Mukherjee, Shubhabrata; Keller, Lina; Brown, Kristelle; Lin, Honghuang; Cantwell, Laura B.; Panza, Francesco; McGuinness, Bernadette; Moreno-Grau, Sonia; Burgess, Jeremy D.; Solfrizzi, Vincenzo; Proitsi, Petra; Adams, Hieab H.; Allen, Mariet; Seripa, Davide; Pastor, Pau; Cupples, L. 
Adrienne; Price, Nathan D; Hannequin, Didier; Frank-García, Ana; Levy, Daniel; Chakrabarty, Paramita; Caffarra, Paolo; Giegling, Ina; Beiser, Alexa S.; Giedraitis, Vimantas; Hampel, Harald; Garcia, Melissa E.; Wang, Xue; Lannfelt, Lars; Mecocci, Patrizia; Eiriksdottir, Gudny; Crane, Paul K.; Pasquier, Florence; Boccardi, Virginia; Henández, Isabel; Barber, Robert C.; Scherer, Martin; Tarraga, Lluis; Adams, Perrie M.; Leber, Markus; Chen, Yuning; Albert, Marilyn S.; Riedel-Heller, Steffi; Emilsson, Valur; Beekly, Duane; Braae, Anne; Schmidt, Reinhold; Blacker, Deborah; Masullo, Carlo; Schmidt, Helena; Doody, Rachelle S.; Spalletta, Gianfranco; Longstreth, WT; Fairchild, Thomas J.; Bossù, Paola; Lopez, Oscar L.; Frosch, Matthew P.; Sacchinelli, Eleonora; Ghetti, Bernardino; Sánchez-Juan, Pascual; Yang, Qiong; Huebinger, Ryan M.; Jessen, Frank; Li, Shuo; Kamboh, M. Ilyas; Morris, John; Sotolongo-Grau, Oscar; Katz, Mindy J.; Corcoran, Chris; Himali, Jayanadra J.; Keene, C. Dirk; Tschanz, JoAnn; Fitzpatrick, Annette L.; Kukull, Walter A.; Norton, Maria; Aspelund, Thor; Larson, Eric B.; Munger, Ron; Rotter, Jerome I.; Lipton, Richard B.; Bullido, María J; Hofman, Albert; Montine, Thomas J.; Coto, Eliecer; Boerwinkle, Eric; Petersen, Ronald C.; Alvarez, Victoria; Rivadeneira, Fernando; Reiman, Eric M.; Gallo, Maura; O’Donnell, Christopher J.; Reisch, Joan S.; Bruni, Amalia Cecilia; Royall, Donald R.; Dichgans, Martin; Sano, Mary; Galimberti, Daniela; St George-Hyslop, Peter; Scarpini, Elio; Tsuang, Debby W.; Mancuso, Michelangelo; Bonuccelli, Ubaldo; Winslow, Ashley R.; Daniele, Antonio; Wu, Chuang-Kuo; Peters, Oliver; Nacmias, Benedetta; Riemenschneider, Matthias; Heun, Reinhard; Brayne, Carol; Rubinsztein, David C; Bras, Jose; Guerreiro, Rita; Hardy, John; Al-Chalabi, Ammar; Shaw, Christopher E; Collinge, John; Mann, David; Tsolaki, Magda; Clarimón, Jordi; Sussams, Rebecca; Lovestone, Simon; O’Donovan, Michael C; Owen, Michael J; Behrens, Timothy W.; Mead, Simon; Goate, 
Alison M.; Uitterlinden, Andre G.; Holmes, Clive; Cruchaga, Carlos; Ingelsson, Martin; Bennett, David A.; Powell, John; Golde, Todd E.; Graff, Caroline; De Jager, Philip L.; Morgan, Kevin; Ertekin-Taner, Nilufer; Combarros, Onofre; Psaty, Bruce M.; Passmore, Peter; Younkin, Steven G; Berr, Claudine; Gudnason, Vilmundur; Rujescu, Dan; Dickson, Dennis W.; Dartigues, Jean-Francois; DeStefano, Anita L.; Ortega-Cubero, Sara; Hakonarson, Hakon; Campion, Dominique; Boada, Merce; Kauwe, John “Keoni”; Farrer, Lindsay A.; Van Broeckhoven, Christine; Ikram, M. Arfan; Jones, Lesley; Haines, Johnathan; Tzourio, Christophe; Launer, Lenore J.; Escott-Price, Valentina; Mayeux, Richard; Deleuze, Jean-François; Amin, Najaf; Holmans, Peter A; Pericak-Vance, Margaret A.; Amouyel, Philippe; van Duijn, Cornelia M.; Ramirez, Alfredo; Wang, Li-San; Lambert, Jean-Charles; Seshadri, Sudha; Williams, Julie; Schellenberg, Gerard D.
2017-01-01
Introduction We identified rare coding variants associated with Alzheimer’s disease (AD) in a 3-stage case-control study of 85,133 subjects. In stage 1, 34,174 samples were genotyped using a whole-exome microarray. In stage 2, we tested associated variants (P < 1×10^-4) in 35,962 independent samples using de novo genotyping and imputed genotypes. In stage 3, an additional 14,997 samples were used to test the most significant stage 2 associations (P < 5×10^-8) using imputed genotypes. We observed 3 novel genome-wide significant (GWS) AD associated non-synonymous variants; a protective variant in PLCG2 (rs72824905/p.P522R, P = 5.38×10^-10, OR = 0.68, MAFcases = 0.0059, MAFcontrols = 0.0093), a risk variant in ABI3 (rs616338/p.S209F, P = 4.56×10^-10, OR = 1.43, MAFcases = 0.011, MAFcontrols = 0.008), and a novel GWS variant in TREM2 (rs143332484/p.R62H, P = 1.55×10^-14, OR = 1.67, MAFcases = 0.0143, MAFcontrols = 0.0089), a known AD susceptibility gene. These protein-coding changes are in genes highly expressed in microglia and highlight an immune-related protein-protein interaction network enriched for previously identified AD risk genes. These genetic findings provide additional evidence that the microglia-mediated innate immune response contributes directly to AD development. PMID:28714976
Firth, Andrew E; Atkins, John F
2009-01-01
Japanese encephalitis, West Nile, Usutu and Murray Valley encephalitis viruses form a tight subgroup within the larger Flavivirus genus. These viruses utilize a single-polyprotein expression strategy, resulting in ~10 mature proteins. Plotting the conservation at synonymous sites along the polyprotein coding sequence reveals strong conservation peaks at the very 5' end of the coding sequence, and also at the 5' end of the sequence encoding the NS2A protein. Such peaks are generally indicative of functionally important non-coding sequence elements. The second peak corresponds to a predicted stable pseudoknot structure whose biological importance is supported by compensatory mutations that preserve the structure. The pseudoknot is preceded by a conserved slippery heptanucleotide (Y CCU UUU), thus forming a classical stimulatory motif for -1 ribosomal frameshifting. We hypothesize, therefore, that the functional importance of the pseudoknot is to stimulate a portion of ribosomes to shift -1 nt into a short (45 codon), conserved, overlapping open reading frame, termed foo. Since cleavage at the NS1-NS2A boundary is known to require synthesis of NS2A in cis, the resulting transframe fusion protein is predicted to be NS1-NS2AN-term-FOO. We hypothesize that this may explain the origin of the previously identified NS1 'extension' protein in JEV-group flaviviruses, known as NS1'. PMID:19196463
A Genome-Wide Association Study of Circulating Galectin-3
van Veldhuisen, Dirk J.; Westra, Harm-Jan; Bakker, Stephan J. L.; Gansevoort, Ron T.; Muller Kobold, Anneke C.; van Gilst, Wiek H.; Franke, Lude
2012-01-01
Galectin-3 is a lectin involved in fibrosis, inflammation and proliferation. Increased circulating levels of galectin-3 have been associated with various diseases, including cancer, immunological disorders, and cardiovascular disease. To enhance our knowledge on galectin-3 biology we performed the first genome-wide association study (GWAS) using the Illumina HumanCytoSNP-12 array imputed with the HapMap 2 CEU panel on plasma galectin-3 levels in 3,776 subjects and follow-up genotyping in an additional 3,516 subjects. We identified 2 genome-wide significant loci associated with plasma galectin-3 levels. One locus harbours the LGALS3 gene (rs2274273; P = 2.35×10^−188) and the other locus the ABO gene (rs644234; P = 3.65×10^−47). The variance explained by the LGALS3 locus was 25.6% and by the ABO locus 3.8% and jointly they explained 29.2%. Rs2274273 lies in high linkage disequilibrium with two non-synonymous SNPs (rs4644; r^2 = 1.0, and rs4652; r^2 = 0.91) and wet lab follow-up genotyping revealed that both are strongly associated with galectin-3 levels (rs4644; P = 4.97×10^−465 and rs4652; P = 1.50×10^−421) and were also associated with LGALS3 gene-expression. The origins of our associations should be further validated by means of functional experiments. PMID:23056639
Fard-Esfahani, Pezhman; Fard-Esfahani, Armaghan; Fayaz, Shima; Ghanbarzadeh, Bahareh; Saidi, Parinaz; Mohabati, Reyhaneh; Bidoki, Seyed Kazem; Majdi, Mina
2011-01-01
Background: X-ray repair cross-complementing group 1 (XRCC1) gene is a DNA repair gene and its non-synonymous single nucleotide polymorphisms (SNPs) may influence DNA repair capacity, which has been considered a modifying risk factor for cancer development. Methods: A case-control study was conducted to investigate the impact of three frequently studied polymorphisms (Arg194Trp, Arg280His and Arg399Gln) on developing differentiated thyroid carcinoma (DTC). Results: Increased risks for DTC were shown in homozygous (odds ratio [OR]: 3.66, 95% confidence interval [CI]: 0.38-35.60) and in dominant trait (OR: 1.22, 95% CI: 1.64-2.32) of Arg194Trp genotype. Also, for Arg280His genotype, an increased risk for DTC was shown in dominant trait (OR: 1.42, 95% CI: 0.76-2.68), while a mild reduction in risk for DTC (OR: 0.77, 95% CI: 0.50-1.17) was estimated in dominant Gln genotype of Arg399Gln. Considering the combined effects of Arg194Trp and Arg280His genotypes on DTC, the calculated OR and 95% CI for being heterozygous for one of Arg194Trp or Arg280His genotypes were 1.57 and 0.90-2.74, respectively. Conclusion: Genotyping of codons 194, 280 and 399 in XRCC1 gene may be useful in risk assessment of DTC. PMID:21987112
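Odds ratios with 95% confidence intervals, as reported in this abstract, are commonly derived from a 2×2 table of carriers and non-carriers among cases and controls. The Python sketch below shows one standard way to do this (the Wald interval on the log odds ratio). It is an assumption that this resembles the authors' calculation, which the abstract does not describe; the function name and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(carrier_cases: int, noncarrier_cases: int,
                       carrier_controls: int, noncarrier_controls: int,
                       z: float = 1.96):
    """Return (odds ratio, lower bound, upper bound) using a Wald interval.

    The interval is built on the log scale:
    log(OR) +/- z * sqrt(1/a + 1/b + 1/c + 1/d).
    """
    a, b, c, d = carrier_cases, noncarrier_cases, carrier_controls, noncarrier_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f}-{hi:.2f}")
```

With this construction the point estimate always lies inside its own interval, which gives a quick consistency check when reading reported values. An interval that spans 1, as in this hypothetical example, is not significant at the 5% level.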
Bajaj, Deepak; Das, Shouvik; Badoni, Saurabh; Kumar, Vinod; Singh, Mohar; Bansal, Kailash C.; Tyagi, Akhilesh K.; Parida, Swarup K.
2015-01-01
We identified 82489 high-quality genome-wide SNPs from 93 wild and cultivated Cicer accessions through integrated reference genome- and de novo-based GBS assays. High intra- and inter-specific polymorphic potential (66–85%) and broader natural allelic diversity (6–64%) detected by genome-wide SNPs among accessions signify their efficacy for monitoring introgression and transferring target trait-regulating genomic (gene) regions/allelic variants from wild to cultivated Cicer gene pools for genetic improvement. The population-specific assignment of wild Cicer accessions pertaining to the primary gene pool is more influenced by geographical origin/phenotypic characteristics than by species/gene-pools of origination. The functional significance of allelic variants (non-synonymous and regulatory SNPs) scanned from transcription factors and stress-responsive genes in differentiating wild accessions (with potential known sources of yield-contributing and stress tolerance traits) from cultivated desi and kabuli accessions, fine-mapping/map-based cloning of QTLs and determination of LD patterns across wild and cultivated gene-pools are suitably elucidated. The correlation between phenotypic (agromorphological traits) and molecular diversity-based admixed domestication patterns within six structured populations of wild and cultivated accessions via genome-wide SNPs was apparent. This suggests utility of whole genome SNPs as a potential resource for identifying naturally selected trait-regulating genomic targets/functional allelic variants adaptive to diverse agroclimatic regions for genetic enhancement of cultivated gene-pools. PMID:26208313
Yimer, Solomon A; Namouchi, Amine; Zegeye, Ephrem Debebe; Holm-Hansen, Carol; Norheim, Gunnstein; Abebe, Markos; Aseffa, Abraham; Tønjum, Tone
2016-06-30
A deeply rooted phylogenetic lineage of Mycobacterium tuberculosis (M. tuberculosis) termed lineage 7 was discovered in Ethiopia. Whole genome sequencing of 30 lineage 7 strains from patients in Ethiopia was performed. Intra-lineage genome variation was defined and unique characteristics identified with a focus on genes involved in DNA repair, recombination and replication (3R genes). More than 800 mutations specific to M. tuberculosis lineage 7 strains were identified. The proportion of non-synonymous single nucleotide polymorphisms (nsSNPs) in 3R genes was higher after the recent expansion of M. tuberculosis lineage 7 strain started. The proportion of nsSNPs in genes involved in inorganic ion transport and metabolism was significantly higher before the expansion began. A total of 22346 bp deletions were observed. Lineage 7 strains also exhibited a high number of mutations in genes involved in carbohydrate transport and metabolism, transcription, energy production and conversion. We have identified unique genomic signatures of the lineage 7 strains. The high frequency of nsSNP in 3R genes after the phylogenetic expansion may have contributed to recent variability and adaptation. The abundance of mutations in genes involved in inorganic ion transport and metabolism before the expansion period may indicate an adaptive response of lineage 7 strains to enable survival, potentially under environmental stress exposure. As lineage 7 strains originally were phylogenetically deeply rooted, this may indicate fundamental adaptive genomic pathways affecting the fitness of M. tuberculosis as a species.
Ajayi, Oyeyemi O; Adefenwa, Mufliat A; Agaviezor, Brilliant O; Ikeobi, Christian O N; Wheto, Matthew; Okpeku, Moses; Amusan, Samuel A; Yakubu, Abdulmojeed; De Donato, Marcos; Peters, Sunday O; Imumorin, Ikhide G
2014-02-01
The tenascin-XB (TNXB) gene has antiadhesive effects, functions in matrix maturation in connective tissues, and localizes to the major histocompatibility complex class III region. We hypothesized that it may influence adaptive physiological response through an effect on blood vessel function. We identified a novel g.1324 A→G polymorphism at a TaqI recognition site in a 454 bp fragment of ovine TNXB and genotyped it in 150 Nigerian sheep using PCR-RFLP. The missense mutation changes glutamic acid (GAA) to glycine (GGA). Among SNP genotypes, significant differences (P < 0.05) were observed in body weight and fore cannon bone length. Interaction effects of breed, SNP genotype, and geographic location had a significant effect (P < 0.05) on chest girth. The SNP genotype was significantly (P < 0.05) associated with physiological traits of pulse rate and skin temperature. The observed effect of this novel polymorphism may be mediated through its role in connective tissue biology, requiring further association and functional studies.
Samuels, David C.; Boys, Richard J.; Henderson, Daniel A.; Chinnery, Patrick F.
2003-01-01
We applied a hidden Markov model segmentation method to the human mitochondrial genome to identify patterns in the sequence, to compare these patterns to the gene structure of mtDNA and to see whether these patterns reveal additional characteristics important for our understanding of genome evolution, structure and function. Our analysis identified three segmentation categories based upon the sequence transition probabilities. Category 2 segments corresponded to the tRNA and rRNA genes, with a greater strand-symmetry in these segments. Category 1 and 3 segments covered the protein-coding genes and almost all of the non-coding D-loop. Compared to category 1, the mtDNA segments assigned to category 3 had much lower guanine abundance. A comparison to two independent databases of mitochondrial mutations and polymorphisms showed that the high substitution rate of guanine in human mtDNA is largest in the category 3 segments. Analysis of synonymous mutations showed the same pattern. This suggests that this heterogeneity in the mutation rate is partly independent of respiratory chain function and is a direct property of the genome sequence itself. This has important implications for our understanding of mtDNA evolution and its use as a ‘molecular clock’ to determine the rate of population and species divergence. PMID:14530452
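The idea of segmenting a sequence with a hidden Markov model can be illustrated with a toy Viterbi decoder. The Python sketch below is a deliberate simplification and not the authors' method: it uses two hidden states instead of three, emits each base independently rather than modelling base-to-base transition probabilities, and all state names and probabilities are invented for the example.

```python
import math


def viterbi(seq, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a non-empty sequence.

    Probabilities are combined in log space to avoid numerical underflow.
    """
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][seq[0]]) for s in states}]
    backpointers = []
    for symbol in seq[1:]:
        current, pointers = {}, {}
        for s in states:
            # Best predecessor state for ending in state s at this position.
            best_prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            current[s] = (scores[-1][best_prev]
                          + math.log(trans_p[best_prev][s])
                          + math.log(emit_p[s][symbol]))
            pointers[s] = best_prev
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Invented parameters: one guanine-rich and one guanine-poor segment type.
states = ["high_G", "low_G"]
start_p = {"high_G": 0.5, "low_G": 0.5}
trans_p = {"high_G": {"high_G": 0.9, "low_G": 0.1},
           "low_G": {"high_G": 0.1, "low_G": 0.9}}
emit_p = {"high_G": {"G": 0.7, "A": 0.1, "C": 0.1, "T": 0.1},
          "low_G": {"G": 0.1, "A": 0.3, "C": 0.3, "T": 0.3}}

path = viterbi("GGGGGGGGGGAAAAAAAAAA", states, start_p, trans_p, emit_p)
print(path)
```

In this toy case the decoder places a single segment boundary between the run of G and the run of A, because one state switch is cheaper than emitting ten unlikely bases from the wrong state.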
Gómez, Fernando
2008-04-01
The validity of categorizing the diatoms and dinoflagellates reported in the literature as non-indigenous phytoplankton in the European Seas was investigated. Species that are synonymous are often included as separate species (Gessnerium mochimaensis=Alexandrium monilatum, Gymnodinium nagasakiense=Karenia mikimotoi, Pleurosigma simonsenii=P. planctonicum), while other species names are synonyms of cosmopolitan taxa (Prorocentrum redfieldii=P. triestinum, Pseliodinium vaubanii=Gyrodinium falcatum, Gonyaulax grindleyi=Protoceratium reticulatum, Asterionella japonica=Asterionellopsis glacialis). Epithets of an exotic etymology (i.e. japonica, sinensis, indica) imply that a cosmopolitan species may be non-indigenous, and several taxa are even considered as non-indigenous in their type locality (Alexandrium tamarense and A. pseudogoniaulax). The records of Alexandrium monilatum, A. leei and Corethron criophilum are doubtful. Cold or warm-water species expand their geographical ranges or increase their abundances to detectable levels during cooling (Coscinodiscus wailesii) or warming periods (Chaetoceros coarctatus, Proboscia indica, Pyrodinium bahamense). These are a few examples of marginal dispersal associated with climatic events instead of species introductions from remote areas. The number of non-indigenous phytoplankton species in European Seas has thus been excessively inflated.
He, Xiaoqing; Jin, Yi; Ye, Meixia; Chen, Nan; Zhu, Jing; Wang, Jingqi; Jiang, Libo; Wu, Rongling
2017-01-01
How a species responds to the biotic environment of its community, ultimately leading to its evolution, has been a topic of intense interest to ecological evolutionary biologists. Until recently, limited knowledge was available regarding the genotypic changes that underlie phenotypic changes. Our study implemented GWAS (Genome-Wide Association Studies) to illustrate the genetic architecture of ecological interactions that take place in microbial populations. By choosing 45 interspecific pairs of Escherichia coli and Staphylococcus aureus strains that were all genotyped throughout the entire genome, we employed Q-ROADTRIPS to analyze the association between single SNPs and microbial abundance measured at each time point for bacterial populations reared in monoculture and co-culture, respectively. We identified a large number of SNPs and indels across the genomes (35.69 G clean data of E. coli and 50.41 G of S. aureus). We reported 66 and 111 SNPs that were associated with interaction in E. coli and S. aureus, respectively. Twenty-three of the 66 polymorphic changes resulted in amino acid alterations. Twelve significant genes were found, such as murE, treA, argS, and relA, which were also identified in previous evolutionary studies. In S. aureus, the 111 SNPs detected in coding sequences could be divided into 35 non-synonymous and 76 synonymous SNPs. Our study illustrated the potential of genome-wide association methods for studying rapidly evolving traits in bacteria. Genetic association study methods will facilitate the identification of genetic elements likely to cause phenotypes of interest and provide targets for further laboratory investigation.
Crosley, E J; Elliot, M G; Christians, J K; Crespi, B J
2013-02-01
Recent evidence from chimpanzees and gorillas has raised doubts that preeclampsia is a uniquely human disease. The deep extravillous trophoblast (EVT) invasion and spiral artery remodeling that characterizes our placenta (and is abnormal in preeclampsia) is shared within great apes, setting Homininae apart from Hylobatidae and Old World Monkeys, which show much shallower trophoblast invasion and limited spiral artery remodeling. We hypothesize that the evolution of a more invasive placenta in the lineage ancestral to the great apes involved positive selection on genes crucial to EVT invasion and spiral artery remodeling. Furthermore, identification of placentally-expressed genes under selection in this lineage may identify novel genes involved in placental development. We tested for positive selection in approximately 18,000 genes using the ratio of non-synonymous to synonymous amino acid substitution for protein-coding DNA. DAVID Bioinformatics Resources identified biological processes enriched in positively selected genes, including processes related to EVT invasion and spiral artery remodeling. Analyses revealed 295 and 264 genes under significant positive selection on the branches ancestral to Hominidae (Human, Chimp, Gorilla, Orangutan) and Homininae (Human, Chimp, Gorilla), respectively. Gene ontology analysis of these gene sets demonstrated significant enrichments for several functional gene clusters relevant to preeclampsia risk, and sets of placentally-expressed genes that have been linked with preeclampsia and/or trophoblast invasion in other studies. Our study represents a novel approach to the identification of candidate genes and amino acid residues involved in placental pathologies by implicating them in the evolution of highly-invasive placenta. Copyright © 2012 Elsevier Ltd. All rights reserved.
Modeling coding-sequence evolution within the context of residue solvent accessibility.
Scherrer, Michael P; Meyer, Austin G; Wilke, Claus O
2012-09-12
Protein structure mediates site-specific patterns of sequence divergence. In particular, residues in the core of a protein (solvent-inaccessible residues) tend to be more evolutionarily conserved than residues on the surface (solvent-accessible residues). Here, we present a model of sequence evolution that explicitly accounts for the relative solvent accessibility of each residue in a protein. Our model is a variant of the Goldman-Yang 1994 (GY94) model in which all model parameters can be functions of the relative solvent accessibility (RSA) of a residue. We apply this model to a data set comprising nearly 600 yeast genes, and find that an evolutionary-rate ratio ω that varies linearly with RSA provides a better model fit than an RSA-independent ω or an ω that is estimated separately in individual RSA bins. We further show that the branch length t and the transition-transversion ratio κ also vary with RSA. The RSA-dependent GY94 model performs better than an RSA-dependent Muse-Gaut 1994 (MG94) model in which the synonymous and non-synonymous rates individually are linear functions of RSA. Finally, protein core size affects the slope of the linear relationship between ω and RSA, and gene expression level affects both the intercept and the slope. Structure-aware models of sequence evolution provide a significantly better fit than traditional models that neglect structure. The linear relationship between ω and RSA implies that genes are better characterized by their ω slope and intercept than by just their mean ω.
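The closing claim, that a gene is better described by the slope and intercept of ω than by its mean ω, can be made concrete with a small numerical example. The Python sketch below assumes only the linear form stated in the abstract, ω(RSA) = intercept + slope × RSA; the function names, parameter values and RSA values are hypothetical and are not estimates from the paper.

```python
def omega(rsa: float, intercept: float, slope: float) -> float:
    """Evolutionary-rate ratio at a site, modelled as a linear function of RSA."""
    return intercept + slope * rsa


def mean_omega(rsa_values, intercept: float, slope: float) -> float:
    """Average omega over the sites of a gene."""
    return sum(omega(r, intercept, slope) for r in rsa_values) / len(rsa_values)


# Hypothetical RSA values for three sites: buried, intermediate, exposed.
rsa_values = [0.0, 0.5, 1.0]

# Two hypothetical genes with different (intercept, slope) pairs.
gene_a = {"intercept": 0.05, "slope": 0.20}  # strong dependence on RSA
gene_b = {"intercept": 0.15, "slope": 0.00}  # no dependence on RSA

print(mean_omega(rsa_values, **gene_a), mean_omega(rsa_values, **gene_b))
print([omega(r, **gene_a) for r in rsa_values])
print([omega(r, **gene_b) for r in rsa_values])
```

Both hypothetical genes have the same mean ω, yet their buried and exposed sites evolve at different relative rates, which is exactly the information a single mean value discards.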
Chen, Zhengshuai; Li, Jingjie; Chen, Peng; Wang, Fengjiao; Zhang, Ning; Yang, Min; Jin, Tianbo; Chen, Chao
2016-09-01
1. Detection of CYP3A5 variant alleles, and knowledge about their allelic frequency in Uyghur ethnic groups, is important to establish the clinical relevance of screening for these polymorphisms to optimize pharmacotherapy. 2. We used DNA sequencing to investigate the promoter, exons and surrounding introns, and 3'-untranslated region of the CYP3A5 gene in 96 unrelated healthy Uyghur individuals. We also used SIFT and PolyPhen-2 to predict the protein function of the novel non-synonymous mutation in CYP3A5 coding regions. 3. We found 24 different CYP3A5 polymorphisms in the Uyghur population, three of which were novel: the synonymous mutation 43C > T in exon 1 and the two mutations 32120C > G and 32245T > C in the 3'-untranslated region. The allele frequencies of CYP3A5*1 and *3 were 64.58% and 35.42%, respectively, while no subjects with CYP3A5*6 were identified. Other identified genotypes included the heterozygous genotypes 1A/3A (59.38%) and 1A/3E (11.46%), which lead to decreased enzyme activity. In addition, haplotype "TTAGGT" was the most prevalent, with a frequency of 0.781. 4. Our data provide new information regarding CYP3A5 genetic polymorphisms in Uyghur individuals, which may help to improve individualization of drug therapy and offer a preliminary basis for more rational use of drugs.
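The reported allele frequencies can be related to the sample size with simple arithmetic: 96 diploid individuals carry 192 copies of an autosomal gene. The Python sketch below shows the calculation. The allele counts of 124 and 68 are an inference from the reported percentages and are not stated in the abstract, and the function name is hypothetical.

```python
def allele_frequency_percent(allele_count: int, n_individuals: int, ploidy: int = 2) -> float:
    """Allele frequency in percent for a given allele count in a sample."""
    total_alleles = n_individuals * ploidy
    if not 0 <= allele_count <= total_alleles:
        raise ValueError("allele_count must lie between 0 and the total number of alleles")
    return 100.0 * allele_count / total_alleles


n = 96  # sample size given in the abstract
# Inferred counts (assumption): 124 + 68 = 192 alleles in 96 individuals.
star1 = allele_frequency_percent(124, n)
star3 = allele_frequency_percent(68, n)
print(round(star1, 2), round(star3, 2))
```

Under this assumption the two values round to the 64.58% and 35.42% given for CYP3A5*1 and *3.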
Grossen, Christine; Keller, Lukas; Biebach, Iris; Croll, Daniel
2014-01-01
The major histocompatibility complex (MHC) is a crucial component of the vertebrate immune system and shows extremely high levels of genetic polymorphism. The extraordinary genetic variation is thought to be ancient polymorphisms maintained by balancing selection. However, introgression from related species was recently proposed as an additional mechanism. Here we provide evidence for introgression at the MHC in Alpine ibex (Capra ibex ibex). At a usually very polymorphic MHC exon involved in pathogen recognition (DRB exon 2), Alpine ibex carried only two alleles. We found that one of these DRB alleles is identical to a DRB allele of domestic goats (Capra aegagrus hircus). We sequenced 2489 bp of the coding and non-coding regions of the DRB gene and found that Alpine ibex homozygous for the goat-type DRB exon 2 allele showed nearly identical sequences (99.8%) to a breed of domestic goats. Using Sanger and RAD sequencing, microsatellite and SNP chip data, we show that the chromosomal region containing the goat-type DRB allele has a signature of recent introgression in Alpine ibex. A region of approximately 750 kb including the DRB locus showed high rates of heterozygosity in individuals carrying one copy of the goat-type DRB allele. These individuals shared SNP alleles both with domestic goats and other Alpine ibex. In a survey of four Alpine ibex populations, we found that the region surrounding the DRB allele shows strong linkage disequilibria, strong sequence clustering and low diversity among haplotypes carrying the goat-type allele. Introgression at the MHC is likely adaptive and introgression critically increased MHC DRB diversity in the genetically impoverished Alpine ibex. Our finding contradicts the long-standing view that genetic variability at the MHC is solely a consequence of ancient trans-species polymorphism. Introgression is likely an underappreciated source of genetic diversity at the MHC and other loci under balancing selection. PMID:24945814
Lidral, Andrew C.; Liu, Huan; Bullard, Steven A.; Bonde, Greg; Machida, Junichiro; Visel, Axel; Uribe, Lina M. Moreno; Li, Xiao; Amendt, Brad; Cornell, Robert A.
2015-01-01
Three common diseases, isolated cleft lip and cleft palate (CLP), hypothyroidism and thyroid cancer all map to the FOXE1 locus, but causative variants have yet to be identified. In patients with CLP, the frequency of coding mutations in FOXE1 fails to account for the risk attributable to this locus, suggesting that the common risk alleles reside in nearby regulatory elements. Using a combination of zebrafish and mouse transgenesis, we screened 15 conserved non-coding sequences for enhancer activity, identifying three that regulate expression in a tissue specific pattern consistent with endogenous foxe1 expression. These three, located −82.4, −67.7 and +22.6 kb from the FOXE1 start codon, are all active in the oral epithelium or branchial arches. The −67.7 and +22.6 kb elements are also active in the developing heart, and the −67.7 kb element uniquely directs expression in the developing thyroid. Within the −67.7 kb element is the SNP rs7850258 that is associated with all three diseases. Quantitative reporter assays in oral epithelial and thyroid cell lines show that the rs7850258 allele (G) associated with CLP and hypothyroidism has significantly greater enhancer activity than the allele associated with thyroid cancer (A). Moreover, consistent with predicted transcription factor binding differences, the −67.7 kb element containing rs7850258 allele G is significantly more responsive to both MYC and ARNT than allele A. By demonstrating that this common non-coding variant alters FOXE1 expression, we have identified at least in part the functional basis for the genetic risk of these seemingly disparate disorders. PMID:25652407
Promoter mutation is a common variant in GJC2-associated Pelizaeus-Merzbacher-like disease.
Meyer, E; Kurian, M A; Morgan, N V; McNeill, A; Pasha, S; Tee, L; Younis, R; Norman, A; van der Knaap, M S; Wassmer, E; Trembath, R C; Brueton, L; Maher, E R
2011-12-01
Pelizaeus-Merzbacher-like disease (PMLD) is a clinically and genetically heterogeneous neurological disorder of cerebral hypomyelination. It is clinically characterised by early onset (usually infantile) nystagmus, impaired motor development, ataxia, choreoathetoid movements, dysarthria and progressive limb spasticity. We undertook autozygosity mapping studies in a large consanguineous family of Pakistani origin in which affected children had progressive lower limb spasticity and features of cerebral hypomyelination on MR brain imaging. SNP microarray and microsatellite marker analysis demonstrated linkage to chromosome 1q42.13-1q42.2. Direct sequencing of the gap junction protein gamma-2 gene, GJC2, identified a promoter region mutation (c.-167A>G) in the non-coding exon 1. The c.-167A>G promoter mutation was identified in a further 4 individuals from two families (who were also of Pakistani origin) with clinical and radiological features of PMLD in whom previous routine diagnostic screening of GJC2 had been reported as negative. A common haplotype was identified at the GJC2 locus in the three mutation-positive families, consistent with a common origin for the mutation and likely founder effect. This promoter mutation has only recently been reported in GJC2-PMLD but it has been postulated to affect the binding of the transcription factor SOX10 and appears to be a prevalent mutation, accounting for ~29% of reported patients with GJC2-PMLD. We propose that diagnostic screening of GJC2 should include sequence analysis of the non-coding exon 1, as well as the coding regions to avoid misdiagnosis or diagnostic delay in suspected PMLD. Copyright © 2011 Elsevier Inc. All rights reserved.
Kirsten, Holger; Al-Hasani, Hoor; Holdt, Lesca; Gross, Arnd; Beutner, Frank; Krohn, Knut; Horn, Katrin; Ahnert, Peter; Burkhardt, Ralph; Reiche, Kristin; Hackermüller, Jörg; Löffler, Markus; Teupser, Daniel; Thiery, Joachim; Scholz, Markus
2015-01-01
Genetics of gene expression (eQTLs or expression QTLs) has proved an indispensable tool for understanding biological pathways and pathomechanisms of trait-associated SNPs. However, the power of most genome-wide eQTL studies is still limited. We performed a large eQTL study in peripheral blood mononuclear cells of 2112 individuals, increasing the power to detect trans-effects genome-wide. Going beyond univariate SNP-transcript associations, we analyse relations of eQTLs to biological pathways, polygenic effects of expression regulation, trans-clusters and enrichment of co-localized functional elements. We found eQTLs for about 85% of analysed genes, and 18% of genes were trans-regulated. Local eSNPs were enriched up to a distance of 5 Mb from the transcript, challenging typically implemented ranges of cis-regulation. Pathway enrichment within regulated genes of GWAS-related eSNPs supported functional relevance of identified eQTLs. We demonstrate that nearest genes of GWAS-SNPs might frequently be misleading functional candidates. We identified novel trans-clusters of potential functional relevance for GWAS-SNPs of several phenotypes including obesity-related traits, HDL-cholesterol levels and haematological phenotypes. We used chromatin immunoprecipitation data to demonstrate biological effects. Yet, we show for strongly heritable transcripts that little trans-chromosomal heritability is so far explained by all identified trans-eSNPs; however, our data suggest that most cis-heritability of these transcripts is explained. Dissection of co-localized functional elements indicated a prominent role of SNPs in loci of pseudogenes and non-coding RNAs for the regulation of coding genes. In summary, our study substantially increases the catalogue of human eQTLs and improves our understanding of the complex genetic regulation of gene expression, pathways and disease-related processes. PMID:26019233
Bolbase: a comprehensive genomics database for Brassica oleracea
2013-01-01
Background Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale and Brussels sprouts. This diversity, along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. Description We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Conclusions Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research.
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase. PMID:24079801
Munde, Elly O; Raballah, Evans; Okeyo, Winnie A; Ong'echa, John M; Perkins, Douglas J; Ouma, Collins
2017-04-20
Improved understanding of the molecular mechanisms involved in pediatric severe malarial anemia (SMA) pathogenesis is a crucial step in the design of novel therapeutics. Identification of host genetic susceptibility factors in immune regulatory genes offers an important tool for deciphering malaria pathogenesis. The IL-23/IL-17 immune pathway is important for both immunity and erythropoiesis via its effects through IL-23 receptors (IL-23R). However, the impact of IL-23R variants on SMA has not been fully elucidated. Since variation within the coding region of IL-23R may influence the pathogenesis of SMA, the association between IL-23R rs1884444 (G/T), rs7530511 (C/T), and SMA (Hb < 6.0 g/dL) was examined in children (n = 369, aged 6-36 months) with P. falciparum malaria in a holoendemic P. falciparum transmission area. Logistic regression analysis, controlling for confounding factors of anemia, revealed that individual genotypes of IL-23R rs1884444 (G/T) [GT; OR = 1.34, 95% CI = 0.78-2.31, P = 0.304 and TT; OR = 2.02, 95% CI = 0.53-7.74, P = 0.286] and IL-23R rs7530511 (C/T) [CT; OR = 2.6, 95% CI = 0.59-11.86, P = 0.202 and TT; OR = 1.66, 95% CI = 0.84-3.27, P = 0.142] were not associated with susceptibility to SMA. However, carriage of the IL-23R rs1884444T/rs7530511T (TT) haplotype, consisting of both mutant alleles, was associated with increased susceptibility to SMA (OR = 1.12, 95% CI = 1.07-4.19, P = 0.030). Results presented here demonstrate that a haplotype of non-synonymous IL-23R variants increases susceptibility to SMA in children of a holoendemic P. falciparum transmission area.
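The odds ratios above can be sanity-checked against their confidence intervals. A minimal Python sketch follows, assuming the intervals are Wald-type (symmetric on the log scale), which the abstract does not state. The helper names are illustrative.

```python
import math


def ci_excludes_one(lower: float, upper: float) -> bool:
    """True when a confidence interval for an odds ratio does not contain 1."""
    return lower > 1.0 or upper < 1.0


def log_midpoint(lower: float, upper: float) -> float:
    """Point estimate implied by an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


# (reported OR, CI lower, CI upper), copied from the abstract.
reported = {
    "rs1884444 GT": (1.34, 0.78, 2.31),
    "rs1884444 TT": (2.02, 0.53, 7.74),
    "rs7530511 CT": (2.6, 0.59, 11.86),
    "rs7530511 TT": (1.66, 0.84, 3.27),
    "T/T haplotype": (1.12, 1.07, 4.19),
}

for label, (odds_ratio, lower, upper) in reported.items():
    print(
        f"{label}: reported OR={odds_ratio}, "
        f"log-scale midpoint={log_midpoint(lower, upper):.2f}, "
        f"CI excludes 1: {ci_excludes_one(lower, upper)}"
    )
```

Only the haplotype interval excludes 1, in line with its P value below 0.05. Under the Wald assumption, the four genotype estimates sit close to the log-scale midpoints of their intervals, whereas the haplotype interval (1.07-4.19) has a midpoint near 2.12, so the reported 1.12 may be worth checking against the full text.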
Yang, Jie; Wang, Zhen Long; Zhao, Xin Quan; Wang, De Peng; Qi, De Lin; Xu, Bao Hong; Ren, Yong Hong; Tian, Hui Fang
2008-01-01
Background Environmental stress can accelerate the evolutionary rate of specific stress-response proteins and create new functions specialized for different environments, enhancing an organism's fitness in stressful environments. Pikas (order Lagomorpha), endemic, non-hibernating mammals in the modern Holarctic Region, live in cold regions at either high altitudes or high latitudes and have a maximum distribution of species diversification confined to the Qinghai-Tibet Plateau. Variation in energy metabolism is notable in pikas living in cold environments. Leptin, an adipocyte-derived hormone, plays important roles in energy homeostasis. Methodology/Principal Findings To examine the extent of leptin variations within the genus Ochotona, we cloned the entire coding sequence of pika leptin from 6 species in two regions (Qinghai-Tibet Plateau and Inner Mongolia steppe in China) and the leptin sequences of plateau pikas (O. curzoniae) from different altitudes on the Qinghai-Tibet Plateau. We carried out both DNA and amino acid sequence analyses in molecular evolution and compared modeled spatial structures. Our results show that positive selection (PS) acts on pika leptin, with nine PS sites located within the functionally significant segment 85-119 of leptin, and that one unique motif, the ATP synthase α and β subunit signature site, appears only in pika lineages. To reveal the environmental factors affecting sequence evolution of pika leptin, a relative rate test was performed in pikas from different altitudes. Stepwise multiple regression shows that temperature is significantly and negatively correlated with the rates of non-synonymous substitution (Ka) and amino acid substitution (Aa), whereas altitude does not significantly affect synonymous substitution (Ks), Ka and Aa.
Conclusions/Significance Our findings support the viewpoint that adaptive evolution may occur in pika leptin, which may play important roles in pikas' ecological adaptation to extreme environmental stress. We speculate that cold, and probably not hypoxia, may be the primary environmental factor for driving adaptive evolution of pika leptin. PMID:18213380
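The Ka and Ks rates mentioned above are commonly summarised as the ratio Ka/Ks. A minimal Python sketch of that summary follows. It shows only the bare ratio, not the site-level or likelihood-based tests a study such as this one would rely on. The input values and the `tolerance` band are hypothetical and chosen for illustration.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of non-synonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    return ka / ks


def classify_selection(ka: float, ks: float, tolerance: float = 0.05) -> str:
    """Label a gene by its Ka/Ks ratio; `tolerance` is an arbitrary illustrative band around 1."""
    omega = ka_ks_ratio(ka, ks)
    if omega > 1.0 + tolerance:
        return "positive"
    if omega < 1.0 - tolerance:
        return "purifying"
    return "neutral"


# Hypothetical rates, not values from the pika study.
print(classify_selection(ka=0.30, ks=0.10))
print(classify_selection(ka=0.02, ks=0.20))
```

A ratio above 1 is the usual heuristic for positive selection, but whether an excess is statistically meaningful depends on a formal test that this sketch does not perform.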
Genome sequence, comparative analysis and haplotype structure of the domestic dog.
Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S
2005-12-08
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
CASC15-S is a tumor suppressor lncRNA at the 6p22 neuroblastoma susceptibility locus
Russell, Mike R.; Penikis, Annalise; Oldridge, Derek A.; Alvarez-Dominguez, Juan R.; McDaniel, Lee; Diamond, Maura; Padovan, Olivia; Raman, Pichai; Li, Yimei; Wei, Jun S.; Zhang, Shile; Gnanchandran, Janahan; Seeger, Robert; Asgharzadeh, Shahab; Khan, Javed; Diskin, Sharon J.; Maris, John M.; Cole, Kristina A.
2015-01-01
Chromosome 6p22 was identified recently as a neuroblastoma susceptibility locus, but its mechanistic contributions to tumorigenesis are as yet undefined. Here we report that the most highly significant single nucleotide polymorphism (SNP) associations reside within CASC15, a long non-coding RNA that we define as a tumor suppressor at 6p22. Low-level expression of a short CASC15 isoform (CASC15-S) associated highly with advanced neuroblastoma and poor patient survival. In human neuroblastoma cells, attenuating CASC15-S increased cellular growth and migratory capacity. Gene expression analysis revealed downregulation of neuroblastoma-specific markers in cells with attenuated CASC15-S, with concomitant increases in cell adhesion and extracellular matrix transcripts. Altogether, our results point to CASC15-S as a mediator of neural growth and differentiation, which impacts neuroblastoma initiation and progression. PMID:26100672
In Silico Analysis of Single Nucleotide Polymorphism (SNPs) in Human β-Globin Gene
Alanazi, Mohammed; Abduljaleel, Zainularifeen; Khan, Wajahatullah; Warsy, Arjumand S.; Elrobh, Mohamed; Khan, Zahid; Amri, Abdullah Al; Bazzi, Mohammad D.
2011-01-01
Single amino acid substitutions in the globin chain are the most common forms of genetic variations that produce hemoglobinopathies, the most widespread inherited disorders worldwide. Several hemoglobinopathies result from homozygosity or compound heterozygosity for beta-globin (HBB) gene mutations, such as those producing sickle cell hemoglobin (HbS), HbC, HbD and HbE. Several of these mutations are deleterious and result in moderate to severe hemolytic anemia, with associated complications, requiring lifelong care and management. Even though many hemoglobinopathies result from single amino acid changes producing similar structural abnormalities, there are functional differences in the generated variants. Using in silico methods, we examined the genetic variations that can alter the expression and function of the HBB gene. Using the sequence homology-based Sorting Intolerant from Tolerant (SIFT) server, we screened the SNPs and found that 200 (80%) non-synonymous polymorphisms were predicted to be deleterious. The structure-based method via the PolyPhen server indicated that 135 (40%) non-synonymous polymorphisms may modify protein function and structure. The Pupa Suite software showed that the SNPs will have a phenotypic consequence on the structure and function of the altered protein. Structure analysis was performed on the key mutations that occur in the native protein coded by the HBB gene and that cause hemoglobinopathies, such as HbC (E→K), HbD (E→Q), HbE (E→K) and HbS (E→V). Atomic Non-Local Environment Assessment (ANOLEA), Yet Another Scientific Artificial Reality Application (YASARA), CHARMM-GUI webserver for macromolecular dynamics and mechanics, and Normal Mode Analysis, Deformation and Refinement (NOMAD-Ref) of Gromacs server were used to perform molecular dynamics simulations and energy minimization calculations on β-chain residues of the HBB gene before and after mutation.
Furthermore, in the native and altered protein models, amino acid residues were determined and secondary structures were observed for solvent accessibility to confirm the protein stability. The functional study in this investigation may be a good model for additional future studies. PMID:22028795
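The distinction between synonymous and non-synonymous variants that underlies the analysis above can be sketched in a few lines of Python. The codon table below is deliberately partial and covers only the codons used in the example. The codon changes for HbS, HbC and HbD are the commonly cited ones and are not given in the abstract, and the final synonymous change is a hypothetical illustration.

```python
# Partial codon table (standard genetic code), limited to the codons used below.
CODON_TO_AMINO_ACID = {
    "GAG": "E",  # glutamate
    "GAA": "E",  # glutamate
    "GTG": "V",  # valine
    "AAG": "K",  # lysine
    "CAA": "Q",  # glutamine
}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Return 'synonymous' or 'non-synonymous' for a single codon substitution."""
    ref_aa = CODON_TO_AMINO_ACID[ref_codon.upper()]
    alt_aa = CODON_TO_AMINO_ACID[alt_codon.upper()]
    return "synonymous" if ref_aa == alt_aa else "non-synonymous"


# (label, reference codon, variant codon)
examples = [
    ("HbS (E->V)", "GAG", "GTG"),
    ("HbC (E->K)", "GAG", "AAG"),
    ("HbD (E->Q)", "GAA", "CAA"),
    ("hypothetical silent change", "GAG", "GAA"),
]

for label, ref, alt in examples:
    print(f"{label}: {ref}>{alt} is {classify_codon_change(ref, alt)}")
```

Tools such as SIFT and PolyPhen go further and score how damaging a non-synonymous change is likely to be. This sketch only separates the two classes.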
Zhang, L; Wu, Q; Hu, Y; Wu, H; Wei, F
2015-01-01
Major histocompatibility complex (MHC) polymorphism is thought to be driven by antagonistic coevolution between pathogens and hosts, mediated through either overdominance or frequency-dependent selection. However, investigations under natural conditions are still rare for endangered mammals, which often exhibit depleted variation, and the mechanism of selection underlying the maintenance of this polymorphism remains under considerable debate. In this study, 87 wild giant pandas were used to investigate MHC variation associated with parasite load. With the knowledge of the MHC profile provided by the genomic data of the giant panda, seven DRB1, seven DQA1 and eight DQA2 alleles were identified at each single locus. Positive selection, evidenced by a significantly higher number of non-synonymous substitutions per non-synonymous codon site relative to synonymous substitutions per synonymous codon site, could only be detected at the DRB1 locus, which leads to the speculation that DRB1 may have a more important role in dealing with parasite infection in pandas. Coprological analyses revealed that 55.17% of individuals exhibited infection with 1–2 helminths and 95.3% of infected pandas carried Baylisascaris schroederi. Using a generalized linear model, we found that Aime-DRB1*10 was significantly associated with parasite infection, but no resistant alleles could be detected. MHC heterozygosity of the pandas was found to be uncorrelated with the infection status or the infection intensity. These results suggested that the possible selection mechanisms in extant wild pandas may be frequency dependent rather than being determined by overdominance selection. Our findings could guide the candidate selection for the ongoing reintroduction or translocation of pandas. PMID:25248466