Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter
2016-02-16
The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
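The "at least 76% sequence identity" criterion above can be illustrated with a minimal sketch. It assumes the two amino acid sequences have already been aligned (gaps written as "-") and that identity is counted as identical columns divided by total alignment columns. The abstract does not say which alignment program or identity definition is used, so the function names, the definition, and the toy sequences below are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns where both sequences carry the same residue.

    Assumption: identity = identical columns / total alignment columns.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if the residues agree and are not gaps.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 76.0) -> bool:
    """True if the aligned pair reaches the claimed identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example (hypothetical sequences, not SEQ ID NO: 2).
reference = "MKTLAY"
candidate = "MKT-AY"
print(round(percent_identity(reference, candidate), 1))  # 83.3
print(meets_threshold(reference, candidate))             # True
```

The same check applies to the other records in this list by changing the threshold (70%, 73%, 82%, 93%, 96%).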
Polypeptide having beta-glucosidase activity and uses thereof
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having swollenin activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius
2015-11-04
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having beta-glucosidase activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius
2015-09-01
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having cellobiohydrolase activity and uses thereof
Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter
2015-09-15
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having acetyl xylan esterase activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter
2015-10-20
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having carbohydrate degrading activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius
2015-08-18
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Carbohydrate degrading polypeptide and uses thereof
Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter
2015-10-20
The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian
2015-07-14
The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian
2014-10-07
The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
Fidantsef, Ana [Davis, CA]; Lamsa, Michael [Davis, CA]; Gorre-Clancy, Brian [Elk Grove, CA]
2009-12-29
The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund
2013-02-19
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund
2014-09-09
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund
2014-03-18
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund
2017-05-09
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]
2011-08-16
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]
2012-08-07
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]
2008-12-02
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variant Humicola grisea CBH1.1
Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]
2011-05-31
Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.
Variants of cellobiohydrolases
Bott, Richard R.; Foukaraki, Maria; Hommes, Ronaldus Wilhelmus; Kaper, Thijs; Kelemen, Bradley R.; Kralj, Slavko; Nikolaev, Igor; Sandgren, Mats; Van Lieshout, Johannes Franciscus Thomas; Van Stigt Thans, Sander
2018-04-10
Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.
CBH1 homologs and variant CBH1 cellulases
Goedegebuur, Frits [Rozenlaan, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Neefe, Paulien [Zoetermeer, NL]
2011-05-31
Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.
Buitrago, Lorena; Rendon, Augusto; Liang, Yupu; Simeoni, Ilenia; Negri, Ana; Filizola, Marta; Ouwehand, Willem H.; Coller, Barry S.; Alessi, Marie-Christine; Ballmaier, Matthias; Bariana, Tadbir; Bellissimo, Daniel; Bertoli, Marta; Bray, Paul; Bury, Loredana; Carrell, Robin; Cattaneo, Marco; Collins, Peter; French, Deborah; Favier, Remi; Freson, Kathleen; Furie, Bruce; Germeshausen, Manuela; Ghevaert, Cedric; Gomez, Keith; Goodeve, Anne; Gresele, Paolo; Guerrero, Jose; Hampshire, Dan J.; Hadinnapola, Charaka; Heemskerk, Johan; Henskens, Yvonne; Hill, Marian; Hogg, Nancy; Johnsen, Jill; Kahr, Walter; Kerr, Ron; Kunishima, Shinji; Laffan, Michael; Natwani, Amit; Neerman-Arbez, Marguerite; Nurden, Paquita; Nurden, Alan; Ormiston, Mark; Othman, Maha; Ouwehand, Willem; Perry, David; Vilk, Shoshana Ravel; Reitsma, Pieter; Rondina, Matthew; Simeoni, Ilenia; Smethurst, Peter; Stephens, Jonathan; Stevenson, William; Szkotak, Artur; Turro, Ernest; Van Geet, Christel; Vries, Minka; Ward, June; Waye, John; Westbury, Sarah; Whiteheart, Sidney; Wilcox, David; Zhang, Bi
2015-01-01
Next-generation sequencing is transforming our understanding of human genetic variation but assessing the functional impact of novel variants presents challenges. We analyzed missense variants in the integrin αIIbβ3 receptor subunit genes ITGA2B and ITGB3 identified by whole-exome or -genome sequencing in the ThromboGenomics project, comprising ∼32,000 alleles from 16,108 individuals. We analyzed the results in comparison with 111 missense variants in these genes previously reported as being associated with Glanzmann thrombasthenia (GT), 20 associated with alloimmune thrombocytopenia, and 5 associated with aniso/macrothrombocytopenia. We identified 114 novel missense variants in ITGA2B (affecting ∼11% of the amino acids) and 68 novel missense variants in ITGB3 (affecting ∼9% of the amino acids). Of the variants, 96% had minor allele frequencies (MAF) < 0.1%, indicating their rarity. Based on sequence conservation, MAF, and location on a complete model of αIIbβ3, we selected three novel variants that affect amino acids previously associated with GT for expression in HEK293 cells. αIIb P176H and β3 C547G severely reduced αIIbβ3 expression, whereas αIIb P943A partially reduced αIIbβ3 expression and had no effect on fibrinogen binding. We used receiver operating characteristic curves of combined annotation-dependent depletion, Polyphen 2-HDIV, and sorting intolerant from tolerant to estimate the percentage of novel variants likely to be deleterious. At optimal cut-off values, which had 69–98% sensitivity in detecting GT mutations, between 27% and 71% of the novel αIIb or β3 missense variants were predicted to be deleterious. Our data have implications for understanding the evolutionary pressure on αIIbβ3 and highlight the challenges in predicting the clinical significance of novel missense variants. PMID:25827233
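The idea of choosing an "optimal cut-off" on a receiver operating characteristic curve can be sketched as follows. The abstract does not state which optimality criterion was used; this sketch assumes Youden's J statistic (sensitivity + specificity - 1), which is one common choice and may differ from the authors' method. The function name and the toy scores are hypothetical and are not data from the study.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    scores: predictor outputs, higher meaning more likely deleterious.
    labels: 1 for known disease-associated variants, 0 for presumed benign.
    A variant is called deleterious when score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        sensitivity = tp / positives
        specificity = 1 - fp / negatives
        j = sensitivity + specificity - 1
        # Keep the first (lowest) threshold that reaches the highest J.
        if best is None or j > best[0]:
            best = (j, t, sensitivity, specificity)
    _, threshold, sensitivity, specificity = best
    return threshold, sensitivity, specificity


# Toy example with made-up scores.
toy_scores = [0.1, 0.2, 0.6, 0.7, 0.9]
toy_labels = [0, 0, 1, 0, 1]
print(youden_cutoff(toy_scores, toy_labels))
```

Once a cut-off is fixed on the known variants, the fraction of novel variants scoring at or above it gives the kind of "predicted deleterious" percentage the abstract reports.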
Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike
2006-10-01
Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Variants of glycoside hydrolases
Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung
2013-02-26
The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
Variants of glycoside hydrolases
Teter, Sarah [Davis, CA]; Ward, Connie [Hamilton, MT]; Cherry, Joel [Davis, CA]; Jones, Aubrey [Davis, CA]; Harris, Paul [Carnation, WA]; Yi, Jung [Sacramento, CA]
2011-04-26
The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
Variants of glycoside hydrolases
Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung
2017-07-11
The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.
NASA Astrophysics Data System (ADS)
Anderson, Lissa C.; Håkansson, Maria; Walse, Björn; Nilsson, Carol L.
2017-09-01
Structural technologies are an essential component in the design of precision therapeutics. Precision medicine entails the development of therapeutics directed toward a designated target protein, with the goal to deliver the right drug to the right patient at the right time. In the field of oncology, protein structural variants are often associated with oncogenic potential. In a previous proteogenomic screen of patient-derived glioblastoma (GBM) tumor materials, we identified a sequence variant of human mitochondrial branched-chain amino acid aminotransferase 2 as a putative factor of resistance of GBM to standard-of-care-treatments. The enzyme generates glutamate, which is neurotoxic. To elucidate structural coordinates that may confer altered substrate binding or activity of the variant BCAT2 T186R, a 45 kDa protein, we applied combined ETD and CID top-down mass spectrometry in a LC-FT-ICR MS at 21 T, and X-Ray crystallography in the study of both the variant and non-variant intact proteins. The combined ETD/CID fragmentation pattern allowed for not only extensive sequence coverage but also confident localization of the amino acid variant to its position in the sequence. The crystallographic experiments confirmed the hypothesis generated by in silico structural homology modeling, that the Lys59 side-chain of BCAT2 may repulse the Arg186 in the variant protein (PDB code: 5MPR), leading to destabilization of the protein dimer and altered enzyme kinetics. Taken together, the MS and novel 3D structural data give us reason to further pursue BCAT2 T186R as a precision drug target in GBM.
CBH1 homologs and variant CBH1 cellulase
Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien
2014-07-01
Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.
Altier, Daniel J.; Dahlbacka, Glen; Ellanskaya, legal representative, Natalia; Herrmann, Rafael; Hunter-Cevera, Jennie; McCutchen, Billy F.; Presnail, James K.; Rice, Janet A.; Schepers, Eric; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser; Ellanskaya, deceased, Irina
2007-12-11
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Altier, Daniel J.; Dahlbacka, Glen; Ellanskaya, Irina; Ellanskaya, legal representative, Natalia; Herrmann, Rafael; Hunter-Cevera, Jennie; McCutchen, Billy F.; Presnail, James K.; Rice, Janet A.; Schepers, Eric; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser
2010-08-10
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Altier, Daniel J [Waukee, IA]; Dahlbacka, Glen [Oakland, CA]; Ellanskaya, Irina [Kyiv, UA]; Ellanskaya, legal representative, Natalia; Herrmann, Rafael [Wilmington, DE]; Hunter-Cevera, Jennie [Ellicott City, MD]; McCutchen, Billy F [College Station, TX]; Presnail, James K [Avondale, PA]; Rice, Janet A [Wilmington, DE]; Schepers, Eric [Port Deposit, MD]; Simmons, Carl R [Des Moines, IA]; Torok, Tamas [Richmond, CA]; Yalpani, Nasser [Johnston, IA]
2011-04-12
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Altier, Daniel J [Granger, IA]; Dahlbacka, Glen [Oakland, CA]; Ellanskaya, Irina [Kyiv, UA]; Ellanskaya, legal representative, Natalia; Herrmann, Rafael [Wilmington, DE]; Hunter-Cevera, Jennie [Ellicott City, MD]; McCutchen, Billy F [College Station, TX]; Presnail, James K [Avondale, PA]; Rice, Janet A [Wilmington, DE]; Schepers, Eric [Port Deposit, MD]; Simmons, Carl R [Des Moines, IA]; Torok, Tamas [Richmond, CA]; Yalpani, Nasser [Johnston, IA]
2012-04-03
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include novel amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from microbial fermentation broths. Nucleic acid molecules comprising nucleotide sequences that encode the antipathogenic polypeptides of the invention are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention, or variant or fragment thereof, are also disclosed.
Artificial mismatch hybridization
Guo, Zhen; Smith, Lloyd M.
1998-01-01
An improved nucleic acid hybridization process is provided which employs a modified oligonucleotide and improves the ability to discriminate a control nucleic acid target from a variant nucleic acid target containing a sequence variation. The modified probe contains at least one artificial mismatch relative to the control nucleic acid target in addition to any mismatch(es) arising from the sequence variation. The invention has direct and advantageous application to numerous existing hybridization methods, including applications that employ, for example, the Polymerase Chain Reaction, allele-specific nucleic acid sequencing methods, and diagnostic hybridization methods.
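The mismatch bookkeeping described in this abstract can be sketched in Python. This is only an illustration under stated assumptions: the sequences are hypothetical, each probe is represented by the target-strand sequence it would pair with perfectly, and the function name `mismatches` is not taken from the source.

```python
def mismatches(probe_site: str, target: str) -> int:
    """Count positions at which the probe's intended site differs from the target."""
    if len(probe_site) != len(target):
        raise ValueError("probe site and target must have the same length")
    return sum(1 for p, t in zip(probe_site, target) if p != t)


# Hypothetical 10-nt targets: the variant differs from the control at position 8.
control_target = "ACGTTGCAGT"
variant_target = "ACGTTGCTGT"

# A conventional probe matches the control perfectly.
conventional_probe = control_target
# The modified probe carries one deliberate (artificial) mismatch at position 4.
artificial_probe = "ACGATGCAGT"

for name, probe in [("conventional", conventional_probe),
                    ("artificial", artificial_probe)]:
    print(name,
          "control:", mismatches(probe, control_target),
          "variant:", mismatches(probe, variant_target))
```

With the conventional probe the two targets carry zero and one mismatch; with the modified probe they carry one and two, which is the situation the abstract describes.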
Khan, Waqasuddin; Saripella, Ganapathi Varma-; Ludwig, Thomas; Cuppens, Tania; Thibord, Florian; Génin, Emmanuelle; Deleuze, Jean-Francois; Trégouët, David-Alexandre
2018-05-03
Predicted deleteriousness of coding variants is a frequently used criterion to filter out variants detected in next-generation sequencing projects and to select candidates impacting on the risk of human diseases. Most available dedicated tools implement a base-to-base annotation approach that could be biased in the presence of several variants in the same genetic codon. Here we propose the MACARON program that, from a standard VCF file, identifies, re-annotates and predicts the amino acid change resulting from multiple single nucleotide variants (SNVs) within the same genetic codon. Applied to the whole exome dataset of 573 individuals, MACARON identifies 114 situations where multiple SNVs within a genetic codon induce an amino acid change that differs from that predicted by standard single-SNV annotation tools. Such events are not uncommon and deserve to be studied in sequencing projects with inconclusive findings. MACARON is written in Python with code available on the GENMED website (www.genmed.fr). Contact: david-alexandre.tregouet@inserm.fr. Supplementary data are available at Bioinformatics online.
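The bias this abstract describes can be illustrated with a minimal Python sketch. It is not MACARON's code: the function `annotate_codon`, the toy codon, and the SNV positions are hypothetical, and the sketch assumes both SNVs lie on the same haplotype. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def annotate_codon(ref_codon, snvs):
    """Compare one-at-a-time annotation with joint annotation of SNVs in a codon.

    snvs maps a 0-based position within the codon to the alternate base.
    Returns (reference amino acid, {position: amino acid if annotated alone},
    amino acid when all SNVs are applied together).
    """
    single = {}
    for pos, alt in snvs.items():
        codon = ref_codon[:pos] + alt + ref_codon[pos + 1:]
        single[pos] = CODON_TABLE[codon]
    combined = list(ref_codon)
    for pos, alt in snvs.items():
        combined[pos] = alt
    joint = CODON_TABLE["".join(combined)]
    return CODON_TABLE[ref_codon], single, joint


# Hypothetical example: reference codon AAG with SNVs at codon positions 1 and 2.
ref_aa, single_aa, joint_aa = annotate_codon("AAG", {0: "G", 1: "C"})
print(ref_aa, single_aa, joint_aa)
```

In this toy case each SNV annotated alone predicts a different substitution than the one obtained when both are applied to the codon together.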
Lockridge, O
1990-01-01
People with genetic variants of cholinesterase respond abnormally to succinylcholine, experiencing substantial prolongation of muscle paralysis with apnea rather than the usual 2-6 min. The structure of usual cholinesterase has been determined including the complete amino acid and nucleotide sequence. This has allowed identification of altered amino acids and nucleotides. The variant most frequently found in patients who respond abnormally to succinylcholine is atypical cholinesterase, which occurs in homozygous form in 1 out of 3500 Caucasians. Atypical cholinesterase has a single substitution at nucleotide 209 which changes aspartic acid 70 to glycine. This suggests that Asp 70 is part of the anionic site, and that the absence of this negatively charged amino acid explains the reduced affinity of atypical cholinesterase for positively charged substrates and inhibitors. The clinical consequence of reduced affinity for succinylcholine is that none of the succinylcholine is hydrolyzed in blood and a large overdose reaches the nerve-muscle junction where it causes prolonged muscle paralysis. Silent cholinesterase has a frame shift mutation at glycine 117 which prematurely terminates protein synthesis and yields no active enzyme. The K variant, named in honor of W. Kalow, has threonine in place of alanine 539. The K variant is associated with 33% lower activity. All variants arise from a single locus as there is only one gene for human cholinesterase (EC 3.1.1.8). Comparison of amino acid sequences of esterases and proteases shows that cholinesterase belongs to a new family of serine esterases which is different from the serine proteases.
Våge, D I; Nieminen, M; Anderson, D G; Røed, K H
2014-10-01
The protein-coding region of melanocortin 1 receptor (MC1R) was sequenced to identify potential variation affecting coat color in reindeer (Rangifer tarandus). A T→C sequence variation at nucleotide position 218 (c.218T>C) causing an amino acid (aa) change from methionine to threonine at aa position 73 (p.Met73Thr) was identified. In addition, a T→G sequence variation was found at nucleotide position 839 (c.839T>G), causing phenylalanine to be exchanged by cysteine at aa position 280 (p.Phe280Cys). The two sequence variants (c.218C and c.839G) were found to be closely associated with a darker belly coat compared with animals not having any of these two variants. The aa acid change p.Met73Thr affects the same position as p.Met73Lys previously reported to give constitutive activation of MC1R in black sheep (Ovis aries), whereas p.Phe280Cys is identical to one of two variants previously reported to be associated with dark coat color in Arctic fox (Alopex lagopus), supporting that the two variants found in reindeer are functional. The complete absence of Thr73 and Cys280 among the 51 wild reindeer analyzed provides some evidence that these variants are more common in the domestic herds. © 2014 Stichting International Foundation for Animal Genetics.
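The relation between the coding-sequence positions and the amino acid positions quoted in this abstract can be checked with a short Python sketch. The helper name `codon_of` is hypothetical, and because the abstract does not say which phenylalanine codon is present, both are shown.

```python
def codon_of(cdna_pos: int):
    """Map a 1-based coding-sequence position to (codon number, position in codon 1-3)."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


# Small excerpt of the standard genetic code, limited to the codons needed here.
CODONS = {"ATG": "Met", "ACG": "Thr", "TTT": "Phe", "TTC": "Phe",
          "TGT": "Cys", "TGC": "Cys"}


def substitute(codon: str, offset: int, alt: str) -> str:
    """Return the codon with the base at the 1-based offset replaced by alt."""
    return codon[:offset - 1] + alt + codon[offset:]


print(codon_of(218))  # position of c.218T>C
print(codon_of(839))  # position of c.839T>G
print(CODONS[substitute("ATG", 2, "C")])   # Met codon with T>C at the middle base
print([CODONS[substitute(c, 2, "G")] for c in ("TTT", "TTC")])  # Phe codons with T>G
```

Both variants fall on the middle base of their codons (73 and 280), consistent with p.Met73Thr and p.Phe280Cys.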
Fisher, Kevin E.; Zhang, Linsheng; Wang, Jason; Smith, Geoffrey H.; Newman, Scott; Schneider, Thomas M.; Pillai, Rathi N.; Kudchadkar, Ragini R.; Owonikoko, Taofeek K.; Ramalingam, Suresh S.; Lawson, David H.; Delman, Keith A.; El-Rayes, Bassel F.; Wilson, Malania M.; Sullivan, H. Clifford; Morrison, Annie S.; Balci, Serdar; Adsay, N. Volkan; Gal, Anthony A.; Sica, Gabriel L.; Saxe, Debra F.; Mann, Karen P.; Hill, Charles E.; Khuri, Fadlo R.; Rossi, Michael R.
2017-01-01
We tested and clinically validated a targeted next-generation sequencing (NGS) mutation panel using 80 formalin-fixed, paraffin-embedded (FFPE) tumor samples. Forty non-small cell lung carcinoma (NSCLC), 30 melanoma, and 30 gastrointestinal (12 colonic, 10 gastric, and 8 pancreatic adenocarcinoma) FFPE samples were selected from laboratory archives. After appropriate specimen and nucleic acid quality control, 80 NGS libraries were prepared using the Illumina TruSight tumor (TST) kit and sequenced on the Illumina MiSeq. Sequence alignment, variant calling, and sequencing quality control were performed using vendor software and laboratory-developed analysis workflows. TST generated ≥500× coverage for 98.4% of the 13,952 targeted bases. Reproducible and accurate variant calling was achieved at ≥5% variant allele frequency with 8 to 12 multiplexed samples per MiSeq flow cell. TST detected 112 variants overall, and confirmed all known single-nucleotide variants (n = 27), deletions (n = 5), insertions (n = 3), and multinucleotide variants (n = 3). TST detected at least one variant in 85.0% (68/80), and two or more variants in 36.2% (29/80), of samples. TP53 was the most frequently mutated gene in NSCLC (13 variants; 13/32 samples), gastrointestinal malignancies (15 variants; 13/25 samples), and overall (30 variants; 28/80 samples). BRAF mutations were most common in melanoma (nine variants; 9/23 samples). Clinically relevant NGS data can be obtained from routine clinical FFPE solid tumor specimens using TST, benchtop instruments, and vendor-supplied bioinformatics pipelines. PMID:26801070
Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu
2016-10-01
Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xi, T; Jones, I M; Mohrenweiser, H W
2003-11-03
Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant From Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant". Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or Possibly Damaging". Another 9-15% of the variants were classed as "Potentially Intolerant or Damaging". The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign". Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.
Nance, D; Campbell, R A; Rowley, J W; Downie, J M; Jorde, L B; Kahr, W H; Mereby, S A; Tolley, N D; Zimmerman, G A; Weyrich, A S; Rondina, M T
2016-11-01
Essentials: Co-existent damaging variants are likely to cause more severe bleeding and may go undiagnosed. We determined pathogenic variants in a three-generational pedigree with excessive bleeding. Bleeding occurred with concurrent variants in prostaglandin synthase-1 (PTGS-1) and factor VIII. The PTGS-1 variant was associated with functional defects in the arachidonic acid pathway. Background: Inherited human variants that concurrently cause disorders of primary hemostasis and coagulation are uncommon. Nevertheless, rare cases of co-existent damaging variants are likely to cause more severe bleeding and may go undiagnosed. Objective: We prospectively sought to determine pathogenic variants in a three-generational pedigree with excessive bleeding. Patients/methods: Platelet number, size and light transmission aggregometry to multiple agonists were evaluated in pedigree members. Transmission electron microscopy determined platelet morphology and granule content. Thromboxane release studies and light transmission aggregometry in the presence or absence of prostaglandin G2 assessed specific functional defects in the arachidonic acid pathway. Whole exome sequencing (WES) and targeted nucleotide sequence analysis identified potentially deleterious variants. Results: Pedigree members with excessive bleeding had impaired platelet aggregation with arachidonic acid, epinephrine and low-dose ADP, as well as reduced platelet thromboxane B2 release. Impaired platelet aggregation in response to 2MesADP was rescued with prostaglandin G2, a prostaglandin intermediate downstream of prostaglandin synthase-1 (PTGS-1) that aids in the production of thromboxane. WES identified a non-synonymous variant in the signal peptide of PTGS-1 (rs3842787; c.50C>T; p.Pro17Leu) that completely co-segregated with disease phenotype. A variant in the F8 gene causing hemophilia A (rs28935203; c.5096A>T; p.Y1699F) was also identified. Individuals with both variants had more severe bleeding manifestations than characteristic of mild hemophilia A alone. Conclusion: We provide the first report of co-existing variants in both F8 and PTGS-1 genes in a three-generation pedigree. The PTGS-1 variant was associated with specific functional defects in the arachidonic acid pathway and more severe hemorrhage. © 2016 International Society on Thrombosis and Haemostasis.
Structural comparisons of two allelic variants of human placental alkaline phosphatase.
Millán, J L; Stigbrand, T; Jörnvall, H
1985-01-01
A simple immunosorbent purification scheme based on monoclonal antibodies has been devised for human placental alkaline phosphatase. The two most common allelic variants, S and F, have similar amino acid compositions with identical N-terminal amino acid sequences through the first 13 residues. Both variants have identical lectin binding properties towards concanavalin A, lentil-lectin, wheat germ agglutinin, phytohemagglutinin and soybean agglutinin, and identical carbohydrate contents as revealed by methylation analysis. CNBr fragments of the variants demonstrate identical high performance liquid chromatography patterns. The carbohydrate containing fragment is different from the 32P-labeled active site fragment and the N-terminal fragment.
Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim
2010-01-01
To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551
Cystinuria Associated with Different SLC7A9 Gene Variants in the Cat
Raj, Karthik; Osborne, Carl; Giger, Urs
2016-01-01
Cystinuria is a classical inborn error of metabolism characterized by a selective proximal renal tubular defect affecting cystine, ornithine, lysine, and arginine (COLA) reabsorption, which can lead to uroliths and urinary obstruction. In humans, dogs and mice, cystinuria is caused by variants in one of two genes, SLC3A1 and SLC7A9, which encode the rBAT and b0,+AT subunits of the b0,+ basic amino acid transporter system, respectively. In this study, exons and flanking regions of the SLC3A1 and SLC7A9 genes were sequenced from genomic DNA of cats (Felis catus) with COLAuria and cystine calculi. Relative to the Felis catus-6.2 reference genome sequence, DNA sequences from these affected cats revealed 3 unique homozygous SLC7A9 missense variants: one in exon 5 (p.Asp236Asn) from a non-purpose-bred medium-haired cat, one in exon 7 (p.Val294Glu) in a Maine Coon and a Sphinx cat, and one in exon 10 (p.Thr392Met) from a non-purpose-bred long-haired cat. A genotyping assay subsequently identified another cystinuric domestic medium-haired cat that was homozygous for the variant originally identified in the purebred cats. These missense variants result in deleterious amino acid substitutions of highly conserved residues in the b0,+AT protein. A limited population survey supported that the variants found were likely causative. The remaining 2 sequenced domestic short-haired cats had a heterozygous variant at a splice donor site in intron 10 and a homozygous single nucleotide variant at a branchpoint in intron 11 of SLC7A9, respectively. This study identifies the first SLC7A9 variants causing feline cystinuria and reveals that, as in humans and dogs, this disease is genetically heterogeneous in cats. PMID:27404572
Epitaxial Nucleation on Rationally Designed Peptide Functionalized Interface
2011-07-19
of 17 amino acid peptides. In this report, we focus on the findings from several variants of these sequences, including the role of charge...separation and histidine-gold coordination. We find that these 17 amino acid peptide sequences behave robustly, where periodicity appears to dominate the...26,27 Secondary structure propensity refers to the intrinsic inclination of individual amino acids to a given secondary structure, where side-group
Structural analysis of an HLA-B27 functional variant, B27d detected in American blacks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rojo, S.; Aparicio, P.; Hansen, J.A.
1987-11-15
The structure of a new functional variant, B27d, has been established by comparative peptide mapping and radiochemical sequencing. This analysis completes the structural characterization of the six known histocompatibility leukocyte antigen (HLA)-B27 subtypes. The only detected amino acid change between the main HLA-B27.1 subtype and B27d is that of Tyr59 to His59. Position 59 has not been previously found to vary among class I HLA or H-2 antigens. Such substitution accounts for the reported isoelectric focusing pattern of this variant. HLA-B27d is the only B27 variant found to differ from other subtypes by a single amino acid replacement. The nature of the change is compatible with its origin by a point mutation from HLA-B27.1. Because B27d was found only in American blacks and in no other ethnic groups, it is suggested that this variant originated as a result of a mutation of the B27.1 gene that occurred within the black population. Structural analysis of B27d was done by comparative mapping. Radiochemical sequencing was carried out with 14C-labeled and 3H-labeled amino acids.
Arias-Pulido, Hugo; Peyton, Cheri L; Torrez-Martínez, Norah; Anderson, D Nelson; Wheeler, Cosette M
2005-07-20
While HPV 16 variant lineages have been well characterized, the knowledge about HPV 18 variants is limited. In this study, HPV 18 nucleotide variations in the E2 hinge region were characterized by sequence analysis in 47 control and 51 tumor specimens. Fifty of these specimens were randomly selected for sequencing of an LCR-E6 segment and 20 samples representative of LCR-E6 and E2 sequence variants were examined across the L1 region. A total of 2770 nucleotides per HPV 18 variant genome were considered in this study. HPV 18 variant nucleotides were linked among all gene segments analyzed and grouped into three main branches: Asian-American (AA), European (E), and African (Af). These three branches were equally distributed among controls and cases and when stratified by Hispanic and non-Hispanic ethnicities. Among invasive cervical cancer cases, no significant differences in the three HPV variant branches were observed among ethnic groups or when stratified by histopathology (squamous vs. adenocarcinoma). The Af branch showed the greatest nucleotide variability when compared to the HPV 18 reference sequence and was more closely related to HPV 45 than either AA or E branches. Our data also characterize nucleotide and amino acid variations in the L1 capsid gene among HPV 18 variants, which may be relevant to vaccine strategies and subsequent studies of naturally occurring HPV 18 variants. Several novel HPV 18 nucleotide variations were identified in this study.
Clinical Applications of Molecular Genetic Discoveries
Marian, A.J.
2015-01-01
Genome-wide association studies (GWAS) of complex traits have mapped more than 15,000 common single nucleotide variants (SNVs). Likewise, applications of massively parallel nucleic acid sequencing technologies, often referred to as Next Generation Sequencing, to molecular genetic studies of complex traits have catalogued a large number of rare variants (population frequency of <0.01) in cases with complex traits. Moreover, high throughput nucleic acid sequencing, variant burden analysis, and linkage studies are illuminating the presence of a large number of SNVs in cases and families with single gene disorders. The plethora of genetic variants has exposed the formidable challenge of identifying the causal and pathogenic variants from the enormous number of innocuous common and rare variants that exist in the population as well as in an individual genome. The arduous task of identifying the causal and pathogenic variants is further compounded by the pleiotropic effects of the variants, complexity of cis and trans interactions in the genome, variability in phenotypic expression of the disease, as well as phenotypic plasticity, and the multifarious determinants of the phenotype. Population genetic studies offer the initial roadmaps and have the potential to elucidate novel pathways involved in the pathogenesis of the disease. However, the genome of an individual is unique, rendering unambiguous identification of the causal or pathogenic variant in a single individual exceedingly challenging. Yet, the focus of the practice of medicine is on the individual, as Sir William Osler elegantly expressed in his insightful quotation: “The good physician treats the disease; the great physician treats the patient who has the disease.” The daunting task facing physicians, patients, and researchers alike is to apply the modern genetic discoveries to care of the individual with or at risk of the disease. PMID:26548329
Wik, Lotta; Mikko, Sofia; Klingeborn, Mikael; Stéen, Margareta; Simonsson, Magnus; Linné, Tommy
2012-01-01
The prion protein (PrP) sequence of European moose, reindeer, roe deer and fallow deer in Scandinavia has high homology to the PrP sequence of North American cervids. Variants in the European moose PrP sequence were found at amino acid position 109 as K or Q. The 109Q variant is unique in the PrP sequence of vertebrates. During the 1980s a wasting syndrome in Swedish moose, Moose Wasting Syndrome (MWS), was described. SNP analysis demonstrated a difference in the observed genotype proportions of the heterozygous Q/K and homozygous Q/Q variants in the MWS animals compared with the healthy animals. In MWS moose the allele frequencies for 109K and 109Q were 0.73 and 0.27, respectively, and for healthy animals 0.69 and 0.31. Both alleles were seen as heterozygotes and homozygotes. In reindeer, PrP sequence variation was demonstrated at codon 176 as D or N and codon 225 as S or Y. The PrP sequences in roe deer and fallow deer were identical with published GenBank sequences. PMID:22441661
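For orientation, the allele frequencies reported in this abstract can be turned into expected genotype proportions under Hardy-Weinberg equilibrium. The sketch below is an illustration only: Hardy-Weinberg equilibrium is an assumption made here, not an analysis described in the abstract, and the function name is hypothetical.

```python
def hardy_weinberg(freq_k: float, freq_q: float) -> dict:
    """Expected genotype proportions for codon 109 alleles K and Q under HWE."""
    if abs(freq_k + freq_q - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return {
        "K/K": freq_k ** 2,
        "K/Q": 2 * freq_k * freq_q,
        "Q/Q": freq_q ** 2,
    }


# Allele frequencies as reported in the abstract.
mws = hardy_weinberg(0.73, 0.27)
healthy = hardy_weinberg(0.69, 0.31)

for label, expected in [("MWS", mws), ("healthy", healthy)]:
    print(label, {g: round(p, 4) for g, p in expected.items()})
```

Comparing such expected proportions with observed genotype counts is one conventional way to look for the kind of difference in heterozygous and homozygous proportions that the abstract mentions.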
Auer, Paul L.; Johnsen, Jill M.; Johnson, Andrew D.; Logsdon, Benjamin A.; Lange, Leslie A.; Nalls, Michael A.; Zhang, Guosheng; Franceschini, Nora; Fox, Keolu; Lange, Ethan M.; Rich, Stephen S.; O’Donnell, Christopher J.; Jackson, Rebecca D.; Wallace, Robert B.; Chen, Zhao; Graubert, Timothy A.; Wilson, James G.; Tang, Hua; Lettre, Guillaume; Reiner, Alex P.; Ganesh, Santhi K.; Li, Yun
2012-01-01
Researchers have successfully applied exome sequencing to discover causal variants in selected individuals with familial, highly penetrant disorders. We demonstrate the utility of exome sequencing followed by imputation for discovering low-frequency variants associated with complex quantitative traits. We performed exome sequencing in a reference panel of 761 African Americans and then imputed newly discovered variants into a larger sample of more than 13,000 African Americans for association testing with the blood cell traits hemoglobin, hematocrit, white blood count, and platelet count. First, we illustrate the feasibility of our approach by demonstrating genome-wide-significant associations for variants that are not covered by conventional genotyping arrays; for example, one such association is that between higher platelet count and an MPL c.117G>T (p.Lys39Asn) variant encoding a p.Lys39Asn amino acid substitution of the thrombopoietin receptor gene (p = 1.5 × 10⁻¹¹). Second, we identified an association between missense variants of LCT and higher white blood count (p = 4 × 10⁻¹³). Third, we identified low-frequency coding variants that might account for allelic heterogeneity at several known blood cell-associated loci: MPL c.754T>C (p.Tyr252His) was associated with higher platelet count; CD36 c.975T>G (p.Tyr325*) was associated with lower platelet count; and several missense variants at the α-globin gene locus were associated with lower hemoglobin. By identifying low-frequency missense variants associated with blood cell traits not previously reported by genome-wide association studies, we establish that exome sequencing followed by imputation is a powerful approach to dissecting complex, genetically heterogeneous traits in large population-based studies. PMID:23103231
Jang, Mi Ae; Lee, Chang Woo; Kim, Jin Kyung; Ki, Chang Seok
2015-11-01
Cornelia de Lange syndrome (CdLS) is a clinically and genetically heterogeneous congenital anomaly. Mutations in the NIPBL gene account for half of the affected individuals. We describe a family with CdLS carrying a novel pathogenic variant of the SMC1A gene identified by exome sequencing. The proband was a 3-yr-old boy presenting with a developmental delay. He had distinctive facial features without major structural anomalies and tested negative for NIPBL gene mutations. His younger sister, mother, and maternal grandmother presented with mild mental retardation. By exome sequencing of the proband, a novel SMC1A variant, c.3178G>A, was identified, which was expected to cause an amino acid substitution (p.Glu1060Lys) in the highly conserved coiled-coil domain of the SMC1A protein. Sanger sequencing confirmed that the three female relatives with mental retardation also carry this variant. Our results reveal that SMC1A gene defects are associated with milder phenotypes of CdLS. Furthermore, we showed that exome sequencing could be a useful tool to identify pathogenic variants in patients with CdLS.
Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka
2014-01-01
In Czech populations of European descent, we performed a study of the SLC2A9 and SLC22A12 genes, previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 except for patients suffering from renal hypouricemia type 2. The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and direct sequencing. Allelic variants were prepared and their urate uptake and subcellular localization were studied in a Xenopus oocyte expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskal-Wallis tests; the association study used the Fisher exact test and a linear regression approach. We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 and in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon-located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants did not show any influence on expression, subcellular localization and urate uptake of GLUT9.
Efficient analysis of mouse genome sequences reveal many nonsense variants
Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude
2016-01-01
Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
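The in silico procedure outlined in this abstract (substituting strain-specific bases into the reference coding sequence, translating, and comparing amino acid sequences) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the function names and the toy coding sequence are hypothetical, only SNVs are handled (no indels), and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, ignoring a trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_snvs(cds: str, snvs: dict) -> str:
    """Apply SNVs given as {1-based position: (ref base, alt base)} to a coding sequence."""
    seq = list(cds)
    for pos, (ref, alt) in snvs.items():
        if seq[pos - 1] != ref:
            raise ValueError(f"reference base mismatch at position {pos}")
        seq[pos - 1] = alt
    return "".join(seq)


def first_gained_stop(ref_cds: str, alt_cds: str):
    """Return the 1-based codon number of the first stop present only in the alternate."""
    pairs = zip(translate(ref_cds), translate(alt_cds))
    for codon_number, (ref_aa, alt_aa) in enumerate(pairs, start=1):
        if alt_aa == "*" and ref_aa != "*":
            return codon_number
    return None


# Hypothetical reference CDS and one strain-specific SNV (C>T at position 7).
reference_cds = "ATGTGGCAAGGATAA"
strain_cds = apply_snvs(reference_cds, {7: ("C", "T")})
print(translate(reference_cds), translate(strain_cds),
      first_gained_stop(reference_cds, strain_cds))
```

Checking the reference base before substitution guards against applying a variant to the wrong transcript or coordinate system, which is a common source of silent errors in this kind of comparison.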
2002-10-01
This document contains three papers focusing on the analysis of anti-p53 cellular immune responses of breast, head, neck, and oral cancer patients...variants were generated by amino acid exchanges at positions 6 (6T) and 7 (7W) of the peptide. The 7W variant peptide has potential for immunotherapy of nonresponsive oral cancer patients.
Hepatitis C Virus Antigenic Convergence
Campo, David S.; Dimitrova, Zoya; Yokosawa, Jonny; Hoang, Duc; Perez, Nestor O.; Ramachandran, Sumathi; Khudyakov, Yury
2012-01-01
Vaccine development against hepatitis C virus (HCV) is hindered by poor understanding of factors defining cross-immunoreactivity among heterogeneous epitopes. Using synthetic peptides and mouse immunization as a model, we conducted a quantitative analysis of cross-immunoreactivity among variants of the HCV hypervariable region 1 (HVR1). Analysis of 26,883 immunological reactions among pairs of peptides showed that the distribution of cross-immunoreactivity among HVR1 variants was skewed, with antibodies against a few variants reacting with all tested peptides. The HVR1 cross-immunoreactivity was accurately modeled based on amino acid sequence alone. The tested peptides were mapped in the HVR1 sequence space, which was visualized as a network of 11,319 sequences. The HVR1 variants with a greater network centrality showed a broader cross-immunoreactivity. The entire sequence space is explored by each HCV genotype and subtype. These findings indicate that HVR1 antigenic diversity is extensively convergent and effectively limited, suggesting significant implications for vaccine development. PMID:22355779
Drobni, Mirva; Hallberg, Kristina; Öhman, Ulla; Birve, Anna; Persson, Karina; Johansson, Ingegerd; Strömberg, Nicklas
2006-01-01
Background: Actinomyces naeslundii genospecies 1 and 2 express type-2 fimbriae (FimA subunit polymers) with variant Galβ binding specificities and Actinomyces odontolyticus a sialic acid specificity to colonize different oral surfaces. However, the fimbrial nature of the sialic acid binding property and sequence information about FimA proteins from multiple strains are lacking. Results: Here we have sequenced fimA genes from strains of A. naeslundii genospecies 1 (n = 4) and genospecies 2 (n = 4), both of which harboured variant Galβ-dependent hemagglutination (HA) types, and from A. odontolyticus PK984 with a sialic acid-dependent HA pattern. Three unique subtypes of FimA proteins with 63.8–66.4% sequence identity were present in strains of A. naeslundii genospecies 1 and 2 and A. odontolyticus. The generally high FimA sequence identity (>97.2%) within a genospecies revealed species-specific sequences or segments that coincided with binding specificity. All three FimA protein variants contained a signal peptide, pilin motif, E box, proline-rich segment and an LPXTG sorting motif among other conserved segments for secretion, assembly and sorting of fimbrial proteins. The highly conserved pilin, E box and LPXTG motifs are present in fimbriae proteins from other Gram-positive bacteria. Moreover, only strains of genospecies 1 were agglutinated with type-2 fimbriae antisera derived from A. naeslundii genospecies 1 strain 12104, emphasizing that the overall folding of FimA may generate different functionalities. Western blot analyses with FimA antisera revealed monomers and oligomers of FimA in whole cell protein extracts and a purified recombinant FimA preparation, indicating a sortase-independent oligomerization of FimA. Conclusion: The genus Actinomyces involves a diversity of unique FimA proteins with conserved pilin, E box and LPXTG motifs, depending on subspecies and associated binding specificity. In addition, a sortase-independent oligomerization of FimA subunit proteins in solution was indicated. PMID:16686953
Apparent founder effect during the early years of the San Francisco HIV type 1 epidemic (1978-1979).
Foley, B; Pan, H; Buchbinder, S; Delwart, E L
2000-10-10
HIV-1 envelope sequence variants were RT-PCR amplified from serum samples cryopreserved in San Francisco in 1978-1979. The HIV-1 subtype B env V3-V5 sequences from four homosexual men clustered phylogenetically, with a median nucleotide distance of 2.8%, reflecting a recent common origin. These early U.S. HIV-1 env variants mapped close to the phylogenetic root of the subtype B tree while env variants collected in the United States throughout the 1980s and 1990s showed, on average, increasing genetic diversity and divergence from the subtype B consensus sequence. These results indicate that the majority of HIV-1 currently circulating in the United States may be descended from an initial introduction and rapid spread during the mid- to late 1970s of subtype B viruses with limited variability (i.e., a founder effect). As expected from the starburst-shaped phylogeny of HIV-1 subtype B, contemporary U.S. strains were, on average, more closely related at the nucleic acid and amino acid levels to the earlier 1978-1979 env variants than to each other. The growing levels of HIV-1 genetic diversity, one of multiple obstacles in designing a protective vaccine, may therefore be mitigated by using epidemic founding variants as antigenic strains for protection against contemporary strains.
Location of a major antigenic site involved in Ross River virus neutralization.
Vrati, S; Fernon, C A; Dalgarno, L; Weir, R C
1988-02-01
The location of a major antigenic domain involved in the neutralization of an alphavirus, Ross River virus, has been defined in terms of its position in the amino acid sequence of the E2 glycoprotein. The domain encompasses three topographically close epitopes which were identified using three E2-specific neutralizing monoclonal antibodies in competitive binding assays. Nucleotide sequencing of the structural protein genes of monoclonal antibody-selected antigenic variants showed that for each variant there was a single nucleotide change in the E2 gene leading to a nonconservative amino acid substitution in E2. Changes were at positions 216, 234, and 246-251 in the amino acid sequence. The epitopes are in a region of E2 which, though not strongly conserved as to sequence among Ross River virus, Semliki Forest virus, and Sindbis virus, is conserved in its hydropathy profile among the three alphaviruses. The epitopes lie between two asparagine-linked glycosylation sites (residues 200 and 262) in E2. They are conserved as to position between the mouse virulent T48 strain and the mouse avirulent NB5092 strain.
Abraham, Paul E; Wang, Xiaojing; Ranjan, Priya; Nookaew, Intawat; Zhang, Bing; Tuskan, Gerald A; Hettich, Robert L
2015-12-04
Next-generation sequencing has transformed the ability to link genotypes to phenotypes and facilitates the dissection of genetic contribution to complex traits. However, it is challenging to link genetic variants with the perturbed functional effects on proteins encoded by such genes. Here we show how RNA sequencing can be exploited to construct genotype-specific protein sequence databases to assess natural variation in proteins, providing information about the molecular toolbox driving cellular processes. For this study, we used two natural genotypes selected from a recent genome-wide association study of Populus trichocarpa, an obligate outcrosser with tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs), as well as insertions and deletions. We profiled the frequency of 128 types of naturally occurring amino acid substitutions, including both expected (neutral) and unexpected (non-neutral) SAAPs, with a subset occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. By zeroing in on the molecular signatures of these important regions that might have previously been uncharacterized, we now provide a high-resolution molecular inventory that should improve accessibility and subsequent identification of natural protein variants in future genotype-to-phenotype studies.
Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin
2018-01-01
Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
Abdel-Sabour, Mohammed A; Al-Ebshahy, Emad M; Khaliel, Samy A; Abdel-Wanis, Nabil A; Yanai, Tokuma
2017-09-01
The present study aimed to determine the molecular characteristics of circulating infectious bronchitis virus (IBV) strains in vaccinated broiler flocks in the Giza and Fayoum governorates. Thirty-four isolates were collected, and egg propagation revealed their ability to induce typical IBV lesions after three to five successive passages. Three selected isolates were identified as IBV using a real-time reverse transcriptase-PCR assay targeting the nucleocapsid (N) gene and were further characterized by partial spike (S) gene sequence analysis. Phylogenetic analysis revealed their clustering into two variant groups. Group I consisted of one variant (VSVRI_F3), which had 99.1% nucleotide sequence identity to the Q1 reference strain. Group II consisted of variants VSVRI_G4 and VSVRI_G9, which showed 92.8%-94.3% nucleotide identity with the Egyptian variants Eg/12120S/2012, Eg/12197B/2012, and Eg/1265B/2012. Regarding the deduced amino acid sequence, the three variants had 77.1%-85.2% similarity with the vaccine strains currently used in Egypt. These findings highlight the importance of monitoring the prevalence of IBV variants in vaccinated broiler flocks as well as adopting an appropriate vaccination strategy.
A novel isoform of vertebrate ancient opsin in a smelt fish, Plecoglossus altivelis.
Minamoto, Toshifumi; Shimizu, Isamu
2002-01-11
Vertebrate ancient (VA) opsin, a nonvisual pigment in fishes, was reported to exist in two isoforms, i.e., short and long variants with an unusual predicted amino acid sequence length compared to vertebrate visual opsins. Here we cloned an isoform (Pal-VAM) of VA opsin showing the usual opsin length, in addition to the long-type isoform (Pal-VAL), from a smelt fish, Plecoglossus altivelis. Pal-VAM and Pal-VAL were composed of 346 and 387 amino acids, respectively. The deduced amino acid sequences of these variants were identical to each other within the first 342 residues, but they showed divergence in the carboxyl-terminal sequence. Pal-VAL corresponded to the long isoform found in zebrafish and carp, and Pal-VAM was identified as a new type of VA opsin variant. Southern blotting experiments indicated that the VA opsin gene of the smelt is present as a single copy, and RT-PCR analysis revealed that Pal-VAM and Pal-VAL mRNA were expressed in both the eyes and brain. In situ hybridization showed that Pal-VAM and Pal-VAL mRNA are expressed in amacrine cells in the retina. Pal-VAM is a new, probably functional, nonvisual photoreceptive molecule in fish. (c)2002 Elsevier Science.
Krause, William C.; Shafi, Ayesha A.; Nakka, Manjula; Weigel, Nancy L.
2014-01-01
Prostate cancer (PCa) is an androgen-dependent disease, and tumors that are resistant to androgen ablation therapy often remain androgen receptor (AR) dependent. Among the contributors to castration-resistant PCa are AR splice variants that lack the ligand-binding domain (LBD). Instead, they have small amounts of unique sequence derived from cryptic exons or from out of frame translation. The AR-V7 (or AR3) variant is constitutively active and is expressed under conditions consistent with CRPC. AR-V7 is reported to regulate a transcriptional program that is similar but not identical to that of AR. However, it is unknown whether these differences are due to the unique sequence in AR-V7, or simply to loss of the LBD. To examine transcriptional regulation by AR-V7, we have used lentiviruses encoding AR-V7 (amino acids 1-627 of AR with the 16 amino acids unique to the variant) to prepare a derivative of the androgen-dependent LNCaP cells with inducible expression of AR-V7. An additional cell line was generated with regulated expression of AR-NTD (amino acids 1-660 of AR); this mutant lacks the LBD but does not have the AR-V7 specific sequence. We find that AR and AR-V7 have distinct activities on target genes that are co-regulated by FOXA1. Transcripts regulated by AR-V7 were similarly regulated by AR-NTD, indicating that loss of the LBD is sufficient for the observed differences. Differential regulation of target genes correlates with preferential recruitment of AR or AR-V7 to specific cis-regulatory DNA sequences providing an explanation for some of the observed differences in target gene regulation. PMID:25008967
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grisewood, Matthew J.; Hernández-Lozada, Néstor J.; Thoden, James B.; ...
2017-04-20
Enzyme and metabolic engineering offer the potential to develop biocatalysts for converting natural resources to a wide range of chemicals. To broaden the scope of potential products beyond natural metabolites, methods of engineering enzymes to accept alternative substrates and/or perform novel chemistries must be developed. DNA synthesis can create large libraries of enzyme-coding sequences, but most biochemistries lack a simple assay to screen for promising enzyme variants. Our solution to this challenge is structure-guided mutagenesis, in which optimization algorithms select the best sequences from libraries based on specified criteria (i.e., binding selectivity). We demonstrate this approach by identifying medium-chain (C8–C12) acyl-ACP thioesterases through structure-guided mutagenesis. Medium-chain fatty acids, which are products of thioesterase-catalyzed hydrolysis, are limited in natural abundance, compared to long-chain fatty acids; the limited supply leads to high costs of C6–C10 oleochemicals such as fatty alcohols, amines, and esters. Here, we applied computational tools to tune substrate binding of the highly active ‘TesA thioesterase in Escherichia coli. We used the IPRO algorithm to design thioesterase variants with enhanced C12 or C8 specificity, while maintaining high activity. After four rounds of structure-guided mutagenesis, we identified 3 variants with enhanced production of dodecanoic acid (C12) and 27 variants with enhanced production of octanoic acid (C8). The top variants reached up to 49% C12 and 50% C8 while exceeding native levels of total free fatty acids. A comparably sized library created by random mutagenesis failed to identify promising mutants. The chain length-preference of ‘TesA and the best mutant were confirmed in vitro using acyl-CoA substrates. Molecular dynamics simulations, confirmed by resolved crystal structures, of ‘TesA variants suggest that hydrophobic forces govern ‘TesA substrate specificity. Finally, we expect the design rules that we uncovered and the thioesterase variants that we identified will be useful to metabolic engineering projects aimed at sustainable production of medium-chain-length oleochemicals.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gelb, Bruce D; Tartaglia, Marco; Pennacchio, Len
Diagnostic and therapeutic applications for Noonan Syndrome are described. The diagnostic and therapeutic applications are based on certain mutations in a RAS-specific guanine nucleotide exchange factor gene SOS1 or its expression product. The diagnostic and therapeutic applications are also based on certain mutations in a serine/threonine protein kinase gene RAF1 or its expression product thereof. Also described are nucleotide sequences, amino acid sequences, probes, and primers related to RAF1 or SOS1, and variants thereof, as well as host cells expressing such variants.
Villate, Olatz; Ibarluzea, Nekane; Fraile-Bethencourt, Eugenia; Valenzuela, Alberto; Velasco, Eladio A; Grozeva, Detelina; Raymond, F L; Botella, María P; Tejada, María-Isabel
2018-01-01
Mutations in CHD7 have been shown to be a major cause of CHARGE syndrome, which presents many symptoms and features common to other syndromes, making its diagnosis difficult. Next generation sequencing (NGS) of a panel of intellectual disability related genes was performed in an adult patient without molecular diagnosis. A splice donor variant in CHD7 (c.5665+1G>T) was identified. To study its potential pathogenicity, exons and flanking intronic sequences were amplified from patient DNA and cloned into the pSAD® splicing vector. HeLa cells were transfected with this construct and with a wild-type minigene, and functional analyses were performed. The construct with the c.5665+1G>T variant produced an aberrant transcript with an insert of 63 nucleotides of intron 28, creating a premature termination codon (TAG) 25 nucleotides downstream. This would lead to the insertion of 8 new amino acids and therefore a truncated 1896 amino acid protein. As a result of this, the patient was diagnosed with CHARGE syndrome. These results underline the usefulness of functional analyses for studying the pathogenicity of variants found by NGS and, therefore, for accurately diagnosing patients.
Astell, C R; Gardiner, E M; Tattersall, P
1986-02-01
The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acid changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct repeat in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
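The transition/transversion breakdown reported here (129 + 34 = 163 base changes) is the kind of tally that can be computed directly from two aligned sequences. The sketch below is a minimal illustration under stated assumptions: the function names and the toy sequences are hypothetical, and the input is assumed to contain only A, C, G, T and gap characters.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Return 'transition' or 'transversion' for a single-base change."""
    if ref == alt:
        raise ValueError("ref and alt are identical")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def count_changes(seq_a, seq_b):
    """Count (transitions, transversions) between two aligned sequences."""
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or "-" in (a, b):
            continue  # skip matches and alignment gaps
        if classify_substitution(a, b) == "transition":
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Hypothetical aligned fragments, not real MVM sequence.
ts, tv = count_changes("ACGTACGT", "GCGTATGA")
print(ts, tv, ts + tv)
```

A purine-to-purine or pyrimidine-to-pyrimidine change counts as a transition; any change across the two classes counts as a transversion.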
Benmansour, A; Brahimi, M; Tuffereau, C; Coulon, P; Lafay, F; Flamand, A
1992-03-01
The sequence of the glycoprotein gene of a street rabies virus was determined directly using fragments of a rabid dog brain after PCR amplification. Compared with that of the prototype strain CVS, this sequence displayed 10% divergence in overall amino acid composition. However only 6% divergence was noted in the ectodomain suggesting that structural constraints are exerted on this portion of the glycoprotein. A human strain isolated on cell culture from the saliva of a patient with clinical rabies had only five amino acid differences with the canine isolate, an indication of their close relatedness. These differences could have originated during transmission from dog to dog, or from dog to man, or during isolation on cell culture; they are nonetheless indicative of a genetic evolution of street rabies virus. This evolution was further evidenced by the selection of cell-adapted variants which displayed new amino acid substitutions in the glycoprotein. One of them concerned antigenic site III where arginine at position 333 was replaced by glutamine. As expected this substitution conferred resistance to a site IIIa monoclonal antibody (MAb), but surprisingly did not abolish neurovirulence for adult mice. However, a decrease in the neurovirulence of the cell-adapted variant in the presence of a site IIIa specific MAb was noted, suggesting that neurovirulence was due to a subpopulation neutralizable by the MAb. Simultaneous presence of both the parental and variant sequences was indeed evidenced in the brain of a mouse inoculated with the cell-adapted variant; during multiplication in the mouse brain, the frequency of the parental sequence rose from less than 10% to nearly 50%, indicating the selective advantage conferred by arginine 333 in nervous tissue. Altogether these results were suggestive of an intrinsic heterogeneity of street rabies virus. 
This heterogeneity was further demonstrated by the sequencing of molecular clones of the glycoprotein gene, which revealed that only one-third of the viral genomes present in the brain of a rabid dog had the consensus sequence. Two-thirds of the clones analyzed displayed from one to three amino acid substitutions. Such heterogeneous populations have been referred to as quasispecies, a concept which implies heterogeneous populations kept together in a dynamic equilibrium. This equilibrium could be rapidly displaced, giving the virus the capacity to adapt easily to new environmental conditions.
The Saccharomyces Genome Database Variant Viewer.
Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael
2016-01-04
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Valine/isoleucine variants drive selective pressure in the VP1 sequence of EV-A71 enteroviruses.
Duy, Nghia Ngu; Huong, Le Thi Thanh; Ravel, Patrice; Huong, Le Thi Song; Dwivedi, Ankit; Sessions, October Michael; Hou, Yan'An; Chua, Robert; Kister, Guilhem; Afelt, Aneta; Moulia, Catherine; Gubler, Duane J; Thiem, Vu Dinh; Thanh, Nguyen Thi Hien; Devaux, Christian; Duong, Tran Nhu; Hien, Nguyen Tran; Cornillot, Emmanuel; Gavotte, Laurent; Frutos, Roger
2017-05-08
In 2011-2012, Northern Vietnam experienced its first large scale hand foot and mouth disease (HFMD) epidemic. In 2011, a major HFMD epidemic was also reported in South Vietnam with fatal cases. This 2011-2012 outbreak was the first one to occur in North Vietnam providing grounds to study the etiology, origin and dynamic of the disease. We report here the analysis of the VP1 gene of strains isolated throughout North Vietnam during the 2011-2012 outbreak and before. The VP1 gene of 106 EV-A71 isolates from North Vietnam and 2 from Central Vietnam were sequenced. Sequence alignments were analyzed at the nucleic acid and protein level. Gene polymorphism was also analyzed. A Factorial Correspondence Analysis was performed to correlate amino acid mutations with clinical parameters. The sequences were distributed into four phylogenetic clusters. Three clusters corresponded to the subgenogroup C4 and the last one corresponded to the subgenogroup C5. Each cluster displayed different polymorphism characteristics. Proteins were highly conserved but three sites bearing only Isoleucine (I) or Valine (V) were characterized. The isoleucine/valine variability matched the clusters. Spatiotemporal analysis of the I/V variants showed that all variants which emerged in 2011 and then in 2012 were not the same but were all present in the region prior to the 2011-2012 outbreak. Some correlation was found between certain I/V variants and ethnicity and severity. The 2011-2012 outbreak was not caused by an exogenous strain coming from South Vietnam or elsewhere but by strains already present and circulating at low level in North Vietnam. However, what triggered the outbreak remains unclear. A selective pressure is applied on I/V variants which matches the genetic clusters. I/V variants were shown on other viruses to correlate with pathogenicity. This should be investigated in EV-A71. I/V variants are an easy and efficient way to survey and identify circulating EV-A71 strains.
Calderon, Marina Gallo; Mattion, Nora; Bucafusco, Danilo; Fogel, Fernando; Remorini, Patricia; La Torre, Jose
2009-08-01
PCR amplification with sequence-specific primers was used to detect canine parvovirus (CPV) DNA in 38 rectal swabs from Argentine domestic dogs with symptoms compatible with parvovirus disease. Twenty-seven out of 38 samples analyzed were CPV positive. The classical CPV2 strain was not detected in any of the samples, but nine samples were identified as CPV2a variant and 18 samples as CPV2b variant. Further sequence analysis revealed a mutation at amino acid 426 of the VP2 gene (Asp426Glu), characteristic of the CPV2c variant, in 14 out of 18 of the samples identified initially by PCR as CPV2b. The appearance of CPV2c variant in Argentina might be dated at least to the year 2003. Three different pathogenic CPV variants circulating currently in the Argentine domestic dog population were identified, with CPV2c being the only variant affecting vaccinated and unvaccinated dogs during the year 2008.
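The typing logic in this abstract, where the residue at VP2 position 426 distinguishes the variants, can be sketched as a simple lookup. The Glu assignment for CPV2c comes from the abstract; the Asn (CPV2a) and Asp (CPV2b) assignments are the commonly reported ones and are treated as an assumption here, as are the function name and the reconstructed sample list.

```python
from collections import Counter

# Residue at VP2 position 426 -> variant label.
# "E" -> CPV2c follows the abstract; "N" and "D" are assumed assignments.
VARIANT_BY_RESIDUE_426 = {"N": "CPV2a", "D": "CPV2b", "E": "CPV2c"}


def classify_cpv(residue_426):
    """Return the variant label for a one-letter residue at position 426."""
    return VARIANT_BY_RESIDUE_426.get(residue_426.upper(), "unclassified")


# Hypothetical residue list reproducing the counts in the abstract:
# 9 CPV2a, and 18 PCR-typed CPV2b of which 14 carried Asp426Glu.
residues = ["N"] * 9 + ["D"] * 4 + ["E"] * 14
tally = Counter(classify_cpv(r) for r in residues)
print(dict(tally))
```

The tally makes the reclassification explicit: of the 27 positive samples, sequencing leaves 4 as CPV2b and moves 14 to CPV2c.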
Novel Variant of Tickborne Encephalitis Virus, Russia
Ternovoi, Vladimir A.; Protopopova, Elena V.; Chausov, Eugene V.; Novikov, Dmitry V.; Leonova, Galina N.; Netesov, Sergey V.
2007-01-01
We isolated a novel strain of tickborne encephalitis virus (TBEV), Glubinnoe/2004, from a patient with a fatal case in Russia. We sequenced the strain, whose landmark features included 57 amino acid substitutions and 5 modified cleavage sites. Phylogenetically, Glubinnoe/2004 is a novel variant that belongs to the Eastern type of TBEV. PMID:18258012
Duplication polymorphisms in exon 4 of κ-casein gene in yak breeds/populations.
Pingcuo, S; Gao, J; Jiang, Z R; Jin, S Y; Fu, C Y; Liu, X; Huang, L; Zheng, Y C
2015-08-28
The objective of this study was to compare 12 bp-duplication polymorphisms in exon 4 of the κ-casein gene among 3 breeds/populations of yak (Bos grunniens). Genomic DNA was extracted from yak blood or muscle samples (N = 211) and a partial sequence of exon 4 of κ-casein gene was amplified by polymerase chain reaction. A polyacrylamide gel electrophoresis assay of the products (169 bp) revealed 2 variants. These variants differed in a 12-bp duplication of the nucleotide sequence corresponding to amino acids 147-150 (Glu-Ala-Ser-Pro) or 148-151 (Ala-Ser-Pro-Glu). The genotype frequency and gene frequency of the 2 κ-casein variants differed among the 3 yak breeds/populations. The long form of the κ-casein gene was the predominant allele, and the Jiulong yak showed the highest frequency of the short form variant of the κ-casein gene. In addition, 2 nucleotide differences resulting in amino acid substitutions were also identified in yaks. These results are significant for designing a breeding strategy to improve the genetic makeup of yak herds.
Direct Calculation of Protein Fitness Landscapes through Computational Protein Design
Au, Loretta; Green, David F.
2016-01-01
Naturally selected amino-acid sequences or experimentally derived ones are often the basis for understanding how protein three-dimensional conformation and function are determined by primary structure. Such sequences for a protein family comprise only a small fraction of all possible variants, however, representing the fitness landscape with limited scope. Explicitly sampling and characterizing alternative, unexplored protein sequences would directly identify fundamental reasons for sequence robustness (or variability), and we demonstrate that computational methods offer an efficient mechanism toward this end, on a large scale. The dead-end elimination and A∗ search algorithms were used here to find all low-energy single mutant variants, and corresponding structures of a G-protein heterotrimer, to measure changes in structural stability and binding interactions to define a protein fitness landscape. We established consistency between these algorithms with known biophysical and evolutionary trends for amino-acid substitutions, and could thus recapitulate known protein side-chain interactions and predict novel ones. PMID:26745411
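To give a sense of the search space involved, the set of all single-mutant sequences for a protein can be enumerated as below. This only illustrates the sequence enumeration, not the dead-end elimination or A* side-chain search used in the study; the function name and the toy peptide are assumptions made for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_mutants(seq):
    """Yield (label, mutant sequence) for every single-residue substitution."""
    for i, wild_type in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wild_type:
                # Label uses 1-based positions, e.g. "M1A".
                yield f"{wild_type}{i + 1}{aa}", seq[:i] + aa + seq[i + 1:]


peptide = "MKT"  # hypothetical toy sequence
mutants = dict(single_mutants(peptide))
print(len(mutants))  # 19 substitutions per position
```

The number of single mutants grows linearly with sequence length (19 per position), which is what makes exhaustive single-mutant scans tractable, whereas the space of multiple mutants grows combinatorially.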
Li, Yantao; Fu, Tuo; Liu, Tao; Guo, Huaizu; Guo, Qingcheng; Xu, Jin; Zhang, Dapeng; Qian, Weizhu; Dai, Jianxin; Li, Bohua; Guo, Yajun; Hou, Sheng; Wang, Hao
2016-07-01
Nivolumab is a therapeutic fully human IgG4 antibody to programmed death 1 (PD-1). In this study, a nivolumab biosimilar, which was produced in our laboratory, was analyzed and characterized. Sequence variants that contain undesired amino acid sequences may cause concern during biosimilar bioprocess development. We found that low levels of sequence variants were detected in the heavy chain of the nivolumab biosimilar by ultra performance liquid chromatography (UPLC) and tandem mass spectrometry. The variant was further identified with UPLC-MS/MS after IdeS or trypsin digestion, and was confirmed through addition of a synthetic mutant peptide. Subsequently, a mixed base signal of the normal and mutant sequences was detected through DNA sequencing. The relative levels of mutant A424V in the Fc region of the heavy chain were determined to be 12.25% and 13.54%, via base peak intensity (BPI) and UV chromatography of the tryptic peptide mapping, respectively. The A424V variant was also quantified by real-time PCR (RT-PCR) at the DNA and RNA level, which gave 19.2% and 16.8%, respectively. The relative content of the mutant was consistent at the DNA, RNA and protein level, indicating that the A424V mutation may have little influence at transcriptional or translational levels. These results demonstrate that orthogonal state-of-the-art techniques such as LC-UV-MS and RT-PCR should be implemented to characterize recombinant proteins and cell lines for development of biosimilars. Our study suggests that it is important to establish an integrated and effective analytical method to monitor and characterize sequence variants during antibody drug development, especially for antibody biosimilar products.
Hurba, Olha; Mancikova, Andrea; Krylov, Vladimir; Pavlikova, Marketa; Pavelka, Karel; Stibůrková, Blanka
2014-01-01
Objective: In Czech populations of European descent, we studied the SLC2A9 and SLC22A12 genes, previously identified as being associated with serum uric acid concentrations and gout. This is the first study of the impact of non-synonymous allelic variants on the function of GLUT9 outside of patients suffering from renal hypouricemia type 2. Methods: The cohort consisted of 250 individuals (150 controls, 54 nonspecific hyperuricemics and 46 primary gout and/or hyperuricemia subjects). We analyzed 13 exons of SLC2A9 (GLUT9 variant 1 and GLUT9 variant 2) and 10 exons of SLC22A12 by PCR amplification and direct sequencing. Allelic variants were prepared and their urate uptake and subcellular localization were studied in a Xenopus oocyte expression system. The functional studies were analyzed using the non-parametric Wilcoxon and Kruskal-Wallis tests; the association study used the Fisher exact test and a linear regression approach. Results: We identified a total of 52 sequence variants (12 unpublished). Eight non-synonymous allelic variants were found only in SLC2A9: rs6820230, rs2276961, rs144196049, rs112404957, rs73225891, rs16890979, rs3733591 and rs2280205. None of these variants showed any significant difference in the expression of GLUT9 or in urate transport. In the association study, eight variants showed a possible association with hyperuricemia. However, seven of these were in introns and the one exon-located variant, rs7932775, did not show a statistically significant association with serum uric acid concentration. Conclusion: Our results did not confirm any effect of SLC22A12 and SLC2A9 variants on serum uric acid concentration. Our complex approach, using association analysis together with functional and immunohistochemical characterization of non-synonymous allelic variants, did not show any influence on expression, subcellular localization or urate uptake of GLUT9. PMID:25268603
Fernández-Lainez, Cynthia; Aláez-Verson, Carmen; Ibarra-González, Isabel; Enríquez-Flores, Sergio; Carrillo-Sanchez, Karol; Flores-Lagunes, Leonardo; Guillén-López, Sara; Belmont-Martínez, Leticia; Vela-Amieva, Marcela
2018-04-16
Maple syrup urine disease (MSUD) is a metabolic disorder caused by mutations in three of the branched-chain α-keto acid dehydrogenase complex (BCKDC) genes. Classical MSUD symptoms can be observed immediately after birth and include ketoacidosis, irritability, lethargy, and coma, which can lead to death or irreversible neurodevelopmental delay in survivors. The molecular diagnosis of MSUD can be time-consuming and difficult to establish using conventional Sanger sequencing because the disease can be due to pathogenic variants in any of the BCKDC genes. Next-generation sequencing-based methodologies have revolutionized the molecular diagnosis of inborn errors of metabolism and offer a superior approach for genotyping these patients. Here, we report an MSUD case whose molecular diagnosis was performed by clinical exome sequencing (CES), and the possible structural pathogenic effect of a novel E1α subunit pathogenic variant was analyzed using in silico analysis of the α and β subunit crystallographic structure. Molecular analysis revealed a new homozygous nonsense variant of BCKDHA, c.1267C>T (p.Gln423Ter). The novel BCKDHA variant is considered pathogenic because it caused a premature stop codon that probably led to the loss of the last 22 amino acid residues of the E1α subunit C-terminal end. In silico analysis of this region showed that it is in contact with several residues of the E1β subunit, mainly through polar contacts, hydrogen bonds, and hydrophobic interactions. A CES strategy could benefit patients and families by offering precise and prompt diagnosis and better genetic counseling. Copyright © 2018 Elsevier B.V. All rights reserved.
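The variant name c.1267C>T (p.Gln423Ter) can be checked with simple codon arithmetic. The sketch below, with hypothetical helper names, maps a 1-based coding-sequence position to its codon and confirms that a C-to-T change at the first base of either glutamine codon yields a stop codon. It assumes the standard genetic code and does not use the actual BCKDHA reference sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code
GLN_CODONS = ("CAA", "CAG")           # the two glutamine codons


def codon_of(cds_position):
    """Return (codon number, base position within the codon) for a 1-based CDS position."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


codon_number, base_in_codon = codon_of(1267)

# Apply C>T at the first base of each possible Gln codon.
mutated_codons = ["T" + codon[1:] for codon in GLN_CODONS]
all_stop = all(codon in STOP_CODONS for codon in mutated_codons)
```

Position 1267 falls on the first base of codon 423, which agrees with the protein-level description given in the abstract.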
Analysis of selected genes associated with cardiomyopathy by next-generation sequencing.
Szabadosova, Viktoria; Boronova, Iveta; Ferenc, Peter; Tothova, Iveta; Bernasovska, Jarmila; Zigova, Michaela; Kmec, Jan; Bernasovsky, Ivan
2018-02-01
As the leading cause of congestive heart failure, cardiomyopathy represents a heterogeneous group of heart muscle disorders. Despite considerable progress in the genetic diagnosis of cardiomyopathy by detection of mutations in the most prevalent cardiomyopathy genes, the cause remains unsolved in many patients. High-throughput mutation screening in the disease genes for cardiomyopathy is now possible through target enrichment followed by next-generation sequencing. The aim of the study was to analyze a panel of genes associated with dilated or hypertrophic cardiomyopathy, based on previously published results, in order to identify subjects at risk. Next-generation sequencing on the Illumina HiSeq 2500 platform was used to detect sequence variants in 16 individuals diagnosed with dilated or hypertrophic cardiomyopathy. Detected variants were filtered and the functional impact of amino acid changes was predicted by computational programs. DNA samples of the 16 patients were analyzed by whole exome sequencing. We identified six nonsynonymous variants that were predicted to be pathogenic by all prediction programs used: rs3744998 (EPG5), rs11551768 (MGME1), rs148374985 (MURC), rs78461695 (PLEC), rs17158558 (RET) and rs2295190 (SYNE1). Two of the analyzed sequence variants had a minor allele frequency (MAF) <0.01: rs148374985 (MURC) and rs34580776 (MYBPC3). Our data support a potential role of the detected variants in the pathogenesis of dilated or hypertrophic cardiomyopathy; however, the possibility that these variants are not true disease-causing variants but susceptibility alleles that require additional mutations or injury to cause the clinical phenotype must be considered. © 2017 Wiley Periodicals, Inc.
Knowles, D P; Cheevers, W P; McGuire, T C; Brassfield, A L; Harwood, W G; Stem, T A
1991-11-01
To define the structure of the caprine arthritis-encephalitis virus (CAEV) env gene and characterize genetic changes which occur during antigenic variation, we sequenced the env genes of CAEV-63 and CAEV-Co, two antigenic variants of CAEV defined by serum neutralization. The deduced primary translation product of the CAEV env gene consists of a 60- to 80-amino-acid signal peptide followed by an amino-terminal surface protein (SU) and a carboxy-terminal transmembrane protein (TM) separated by an Arg-Lys-Lys-Arg cleavage site. The signal peptide cleavage site was verified by amino-terminal amino acid sequencing of native CAEV-63 SU. In addition, immunoprecipitation of [35S]methionine-labeled CAEV-63 proteins by sera from goats immunized with recombinant vaccinia virus expressing the CAEV-63 env gene confirmed that antibodies induced by env-encoded recombinant proteins react specifically with native virion SU and TM. The env genes of CAEV-63 and CAEV-Co encode 28 conserved cysteines and 25 conserved potential N-linked glycosylation sites. Nucleotide sequence variability results in 62 amino acid changes and one deletion within the SU and 34 amino acid changes within the TM. PMID:1656067
NASA Astrophysics Data System (ADS)
Ivanov, Mark V.; Lobas, Anna A.; Levitsky, Lev I.; Moshkovskii, Sergei A.; Gorshkov, Mikhail V.
2018-02-01
In a proteogenomic approach based on tandem mass spectrometry analysis of proteolytic peptide mixtures, customized exome or RNA-seq databases are employed for identifying protein sequence variants. However, the problem of variant peptide identification without personalized genomic data is important for a variety of applications. Following the recent proposal by Chick et al. (Nat. Biotechnol. 33, 743-749, 2015) on the feasibility of such variant peptide search, we evaluated two available approaches based on the previously suggested "open" search and the "brute-force" strategy. To improve the efficiency of these approaches, we propose an algorithm for exclusion of false variant identifications from the search results involving analysis of modifications mimicking single amino acid substitutions. Also, we propose a de novo based scoring scheme for assessment of identified point mutations. In the scheme, the search engine analyzes y-type fragment ions in MS/MS spectra to confirm the location of the mutation in the variant peptide sequence.
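To make the role of y-type fragment ions concrete, the sketch below computes singly charged y-ion m/z values for a peptide and reports which y ions shift when one residue is substituted. Only the y ions that contain the substituted residue change, which is why they can localize a point mutation. The peptide sequences and function names are hypothetical, and this is not the scoring scheme proposed by the authors; monoisotopic residue masses are standard values.

```python
# Monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def y_ions(peptide):
    """Singly charged y-ion m/z values, ordered y1, y2, ... from the C-terminus."""
    total = WATER + PROTON
    series = []
    for residue in reversed(peptide):
        total += MONO[residue]
        series.append(round(total, 4))
    return series


def shifted_y_ions(wild_type, variant, tolerance=1e-3):
    """1-based indices of y ions whose m/z differs between two equal-length peptides."""
    pairs = zip(y_ions(wild_type), y_ions(variant))
    return [k + 1 for k, (a, b) in enumerate(pairs) if abs(a - b) > tolerance]


# Hypothetical peptide with an Ile-to-Val substitution at the fifth residue.
shifted = shifted_y_ions("PEPTIDEK", "PEPTVDEK")
```

Here y1 to y3 are unchanged and y4 onward are shifted, placing the substitution at the fourth residue from the C-terminus.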
Identification of rare paired box 3 variant in strabismus by whole exome sequencing
Gong, Hui-Min; Wang, Jing; Xu, Jing; Zhou, Zhan-Yu; Li, Jing-Wen; Chen, Shu-Fang
2017-01-01
AIM: To identify potentially pathogenic gene variants that contribute to the etiology of strabismus. METHODS: A Chinese pedigree with strabismus was collected and the exomes of two affected individuals were sequenced using next-generation sequencing technology. The variants resulting from exome sequencing were filtered by bioinformatics methods, and the candidate mutation was verified as heterozygous in the affected proposita and her mother by Sanger sequencing. RESULTS: Whole exome sequencing and filtering identified a nonsynonymous c.434G>T substitution in paired box 3 (PAX3) in the two affected individuals, which was predicted to be deleterious by more than 4 bioinformatics programs. The altered amino acid residue is located in the conserved PAX domain of PAX3. This gene encodes a member of the PAX family of transcription factors, which play critical roles during fetal development. Mutations in PAX3 have been associated with Waardenburg syndrome with strabismus. CONCLUSION: Our results indicate that the c.434G>T mutation (p.R145L) in PAX3 may contribute to strabismus, expanding our understanding of the causally relevant genes for this disorder. PMID:28861346
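A G-to-T change swaps a purine for a pyrimidine, so under the standard definition it is classed as a transversion. A minimal sketch of that classification follows; the helper name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    if ref not in PURINES | PYRIMIDINES or alt not in PURINES | PYRIMIDINES:
        raise ValueError("expected DNA bases A, C, G or T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


pax3_change = classify_substitution("G", "T")
```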
Hoelsch, K; Lenggeler, I; Pfannes, W; Knabe, H; Klein, H-G; Woelpl, A
2005-05-01
A new human leukocyte antigen (HLA)-B allele was found during routine typing of samples for a German unrelated bone marrow donor registry, the "Aktion Knochenmarkspende Bayern". Initial interpretation of data from two independent low-resolution sequence-specific oligonucleotide typing tests suggested a B*51 variant. Further analysis by sequence-based typing identified the sequence as a new B*52 allele. This new allele, officially assigned as B*5206, differs from HLA-B*520102 by one nucleotide exchange in exon 2. The mutation is located at nucleotide position 274, at which a cytosine is substituted by a thymine, leading to an amino acid change at protein position 67 from serine (TCC) to phenylalanine (TTC).
MutaBind estimates and interprets the effects of sequence variants on protein-protein interactions.
Li, Minghui; Simonetti, Franco L; Goncearenco, Alexander; Panchenko, Anna R
2016-07-08
Proteins engage in highly selective interactions with their macromolecular partners. Sequence variants that alter protein binding affinity may cause significant perturbations or complete abolishment of function, potentially leading to diseases. There exists a persistent need to develop a mechanistic understanding of impacts of variants on proteins. To address this need we introduce a new computational method MutaBind to evaluate the effects of sequence variants and disease mutations on protein interactions and calculate the quantitative changes in binding affinity. The MutaBind method uses molecular mechanics force fields, statistical potentials and fast side-chain optimization algorithms. The MutaBind server maps mutations on a structural protein complex, calculates the associated changes in binding affinity, determines the deleterious effect of a mutation, estimates the confidence of this prediction and produces a mutant structural model for download. MutaBind can be applied to a large number of problems, including determination of potential driver mutations in cancer and other diseases, elucidation of the effects of sequence variants on protein fitness in evolution and protein design. MutaBind is available at http://www.ncbi.nlm.nih.gov/projects/mutabind/. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Ie, Susan I; Thedja, Meta D; Roni, Martono; Muljono, David H
2010-11-18
Selection of hepatitis B virus (HBV) by host immunity has been suggested to give rise to variants with amino acid substitutions at or around the 'a' determinant of the surface antigen (HBsAg), the main target of antibody neutralization and diagnostic assays. However, there have never been successful attempts to provide evidence for this hypothesis, partly because the 3D structure of HBsAg molecules has not been determined. Tertiary structure prediction of HBsAg solely from its primary amino acid sequence may reveal the molecular energetics of the mutated proteins. We carried out this preliminary study to analyze the predicted HBsAg conformational changes of HBV variants isolated from Indonesian blood donors that were undetectable by HBsAg assays, and their significance, compared to other previously reported variants that were associated with diagnostic failure. Three HBV variants (T123A, M133L and T143M) and a wild type sequence were analyzed together with the frequently emerging variants T123N, M133I, M133T, M133V, and T143L. Based on the Jameson-Wolf algorithm for calculating antigenic index, the first two amino acid substitutions resulted in slight changes in the antigenicity of the 'a' determinant, while all four of the comparative variants showed relatively more significant changes. For T143M, changes in antigenic index were more significant, both in coverage and magnitude, even when compared to variant T143L. These data were also partially supported by the tertiary structure prediction, in which T143M showed a larger shift in the HBsAg second loop structure compared to the others. Single amino acid substitutions within or near the 'a' determinant of HBsAg may alter the antigenic properties of variant HBsAg, as shown by both its antigenic index and predicted 3D conformation. Findings in this study emphasize the significance of variant T143M, the prevalent isolate with the highest degree of antigenicity change found in Indonesian blood donors.
This highlights the importance of evaluating the effects of protein structure alterations on the sensitivity of screening methods used in detection of ongoing HBV infection, as well as the role of vaccines and immunoglobulin therapy in contributing to the selection of HBV variants.
Bukin, Yu S; Dzhioev, Yu P; Tkachev, S E; Kozlova, I V; Paramonov, A I; Ruzek, D; Qu, Z; Zlobin, V I
2017-06-15
This work is dedicated to the study of the variability of the main antigenic envelope protein E among different strains of tick-borne encephalitis virus at the level of the physical and chemical properties of the amino acid residues. E protein variants were extracted from the NCBI database. Four properties of the amino acid residues in the polypeptide sequences were investigated: the average volume of the amino acid residue in the protein tertiary structure, the number of hydrogen bond donors of the amino acid residue, the charge of the amino acid side chain, and the dipole moment of the amino acid residue. These physico-chemical properties are involved in antigen-antibody interactions. As a result, 103 different variants of the antigenic determinants of the tick-borne encephalitis virus E protein were found that differ significantly in the physical and chemical properties of the amino acid residues in their structure. This means that some strains among the natural variants of tick-borne encephalitis virus can potentially escape the immune response induced by the standard vaccine. Copyright © 2017 Elsevier B.V. All rights reserved.
Méndez, E; Arias, C F; López, S
1996-01-01
The infection of target cells by most animal rotavirus strains requires the presence of sialic acids (SAs) on the cell surface. We recently isolated variants from simian rotavirus RRV whose infectivity is no longer dependent on SAs and showed that the mutant phenotype segregates with the gene coding for VP4, one of the two surface proteins of rotaviruses (the other one being VP7). The nucleotide sequence of the VP4 gene of four independently isolated variants showed three amino acid changes, at positions 37 (Leu to Pro), 187 (Lys to Arg), and 267 (Tyr to Cys), in all mutant VP4 proteins compared with RRV VP4. The characterization of revertant viruses from two independent mutants showed that the arginine residue at position 187 changed back to lysine, indicating that this amino acid is involved in the determination of the mutant phenotype. Surprisingly, sequence analysis of reassortant virus DS1XRRV, which depends on SAs to infect the cell, showed that its VP4 gene is identical to the VP4 gene of the variants. Since the only difference between DS1XRRV and the RRV variants is the parental origin of the VP7 gene (human rotavirus DS1 in the reassortant), these findings suggest that the receptor-binding specificity of rotaviruses, via VP4, may be influenced by the associated VP7 protein. PMID:8551583
Sadkowska-Todys, M
2000-01-01
The aims of these studies were the genetic characterization of street rabies virus strains isolated from different animal species in Poland and the determination of their phylogenetic relationships to reference laboratory strains of rabies viruses belonging to genotypes 1 and 5. The variability of rabies isolates and their phylogenetic relationships were studied by comparing the nucleotide sequence of a fragment of the virus genome. The Polish strains of genotype 1 belong to four phylogenetic groups (NE, CE, NEE, EE) corresponding to four variants: fox-raccoon dog (F-RD), European fox 1 (F1), European fox 2 (F2) and European fox 3 (F3). In Poland there are no rabies strains representing the dog-wolf variant or the variant typical of the arctic fox. The similarity of the nucleotide and amino acid sequences of street rabies strains belonging to genotype 1 and the laboratory strain CVS is very high: about 91% at the nucleotide level and 95% at the amino acid level. The CVS strain is only about 69% and 74% similar to genotype 5 bat strains (EBL 1) at the nucleotide and amino acid levels, respectively. The genetic divergence of rabies strains circulating in Poland calls for permanent epidemiological and virological surveillance. The genotype and variant of isolated strains should be determined (using PCR and RFLP methods).
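Similarity figures of this kind are typically derived from aligned sequences. The sketch below shows one simple convention, percent identity over alignment columns in which neither sequence has a gap. The sequences and function name are hypothetical, and other conventions for the denominator exist, so this is not necessarily how the values above were obtained.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments differing at one of ten positions.
identity = percent_identity("ATGGCTAAGC", "ATGGCAAAGC")
```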
In vivo and in vitro binding of fatty acids to genetic variants of human serum albumin.
Kragh-Hansen, U; Nielsen, H; Pedersen, A O
1995-01-01
The effect of genetic variation on the fatty-acid binding properties of human serum albumin was studied by two methods involving the use of sequenced albumin variants isolated from bisalbuminaemic persons. First, the amounts of total fatty acid and of several individual fatty acids bound to eighteen different variants and to their normal counterpart (Alb A) were determined by a gas-chromatographic micromethod. Pronounced effects on total fatty acid binding were found for the glycosylated variants Alb Redhill (modified in domain II) and Alb Casebrook (domain III), for which a 1.7- and 8.6-fold increase, respectively, was found. By contrast, Alb Malmö (glycosylated in domain I) carried the same amount of fatty acid as Alb A. The fatty acid loads on three chain-termination variants were normal. Finally, eight albumins with single amino-acid substitutions bound normal amounts of fatty acid, whereas one bound increased (1.7-fold) and three albumins bound diminished amounts (0.5-0.6-fold). Information on nineteen individual fatty acids was also obtained. It was possible, based on the type of changes in their relative amounts, to group the fatty acids as follows: (a) = C6:0 - C14:0, (b) = C15:0 - C18:0, (c) = C16:1 - C18:1, and (d) a group composed of essential and conditionally essential fatty acids. For nine variants, in most cases modified in domain III, large changes in one or more of these groups were observed. The changes were not related to any changes in total fatty acid load. Second, the binding of laurate, as a representative of the group (a) fatty acids, to delipidated albumin preparations was studied at pH 7.4 by a kinetic dialysis technique. The first stoichiometric association constant for binding to Alb Redhill (0.7-fold) and Alb Casebrook (0.6-fold) was diminished as compared with binding to the corresponding Alb A, whereas binding to one chain-termination variant and three single amino-acid substitution variants was unaffected by the mutation.
Detection of Emerging Vaccine-Related Polioviruses by Deep Sequencing.
Sahoo, Malaya K; Holubar, Marisa; Huang, ChunHong; Mohamed-Hadley, Alisha; Liu, Yuanyuan; Waggoner, Jesse J; Troy, Stephanie B; Garcia-Garcia, Lourdes; Ferreyra-Reyes, Leticia; Maldonado, Yvonne; Pinsky, Benjamin A
2017-07-01
Oral poliovirus vaccine can mutate to regain neurovirulence. To date, evaluation of these mutations has been performed primarily on culture-enriched isolates by using conventional Sanger sequencing. We therefore developed a culture-independent, deep-sequencing method targeting the 5' untranslated region (UTR) and P1 genomic region to characterize vaccine-related poliovirus variants. Error analysis of the deep-sequencing method demonstrated reliable detection of poliovirus mutations at levels of <1%, depending on read depth. Sequencing of viral nucleic acids from the stool of vaccinated, asymptomatic children and their close contacts collected during a prospective cohort study in Veracruz, Mexico, revealed no vaccine-derived polioviruses. This was expected given that the longest duration between sequenced sample collection and the end of the most recent national immunization week was 66 days. However, we identified many low-level variants (<5%) distributed across the 5' UTR and P1 genomic region in all three Sabin serotypes, as well as vaccine-related viruses with multiple canonical mutations associated with phenotypic reversion present at high levels (>90%). These results suggest that monitoring emerging vaccine-related poliovirus variants by deep sequencing may aid in the poliovirus endgame and efforts to ensure global polio eradication. Copyright © 2017 Sahoo et al.
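The dependence of low-frequency variant detection on read depth can be illustrated with a simple binomial model. The sketch below estimates the probability of observing at least a chosen number of variant-supporting reads at a site, given the read depth and the true variant frequency. It ignores sequencing error and amplification bias, and the depths and read threshold are hypothetical rather than the study's parameters.

```python
from math import comb


def detection_probability(depth, frequency, min_reads):
    """P(at least min_reads variant reads) for Binomial(depth, frequency)."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    p_fewer = sum(
        comb(depth, k) * frequency**k * (1.0 - frequency) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


# Hypothetical comparison: a 1% variant, requiring at least 5 supporting reads.
p_shallow = detection_probability(depth=100, frequency=0.01, min_reads=5)
p_deep = detection_probability(depth=2000, frequency=0.01, min_reads=5)
```

Under this model the same 1% variant is far more likely to clear the read threshold at the greater depth, which is the sense in which sensitivity depends on read depth.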
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya; ...
2015-10-20
The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.
Kangaroo IGF-II is structurally and functionally similar to the human [Ser29]-IGF-II variant.
Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z
1999-06-01
Kangaroo IGF-II has been purified from western grey kangaroo (Macropus fuliginosus) serum and characterised in a number of in vitro assays. In addition, the complete cDNA sequence of mature IGF-II has been obtained by reverse-transcription polymerase chain reaction. Comparison of the kangaroo IGF-II cDNA sequence with known IGF-II sequences from other species revealed that it is very similar to the human variant, [Ser29]-hIGF-II. Both the variant and kangaroo IGF-II contain an insert of nine nucleotides that encode the amino acids Leu-Pro-Gly at the junction of the B and C domains of the mature protein. The deduced kangaroo IGF-II protein sequence also contains three other amino acid changes that are not observed in human IGF-II. These amino acid differences share similarities with the changes described in many of the IGF-IIs reported for non-mammalian species. Characterisation of human IGF-II, kangaroo IGF-II, chicken IGF-II and [Ser29]-hIGF-II in a number of in vitro assays revealed that all four proteins are functionally very similar. No significant differences were observed in the ability of the IGF-IIs to bind to the bovine IGF-II/cation-independent mannose 6-phosphate receptor or to stimulate protein synthesis in rat L6 myoblasts. However, differences were observed in their abilities to bind to IGF-binding proteins (IGFBPs) present in human serum. Kangaroo, chicken and [Ser29]-hIGF-II had lower apparent affinities for human IGFBPs than did human IGF-II. Thus, it appears that the major circulating form of IGF-II in the kangaroo and a minor form of IGF-II found in human serum are structurally and functionally very similar. This suggests that the splice site that generates both the variant and major form of human IGF-II must have evolved after the divergence of marsupials from placental mammals.
Chowanadisai, Winyoo; Kelleher, Shannon L; Nemeth, Jennifer F; Yachetti, Stephen; Kuhlman, Charles F; Jackson, Joan G; Davis, Anne M; Lien, Eric L; Lönnerdal, Bo
2005-05-01
Variability in the protein composition of breast milk has been observed in many women and is believed to be due to natural variation of the human population. Single nucleotide polymorphisms (SNPs) are present throughout the entire human genome, but the impact of this variation on human milk composition and biological activity and infant nutrition and health is unclear. The goals of this study were to characterize a variant of human alpha-lactalbumin observed in milk from a Filipino population by determining the location of the polymorphism in the amino acid and genomic sequences of alpha-lactalbumin. Milk and blood samples were collected from 20 Filipino women, and milk samples were collected from an additional 450 women from nine different countries. alpha-Lactalbumin concentration was measured by high-performance liquid chromatography (HPLC), and milk samples containing the variant form of the protein were identified with both HPLC and mass spectrometry (MS). The molecular weight of the variant form was measured by MS, and the location of the polymorphism was narrowed down by protein reduction, alkylation and trypsin digestion. Genomic DNA was isolated from whole blood, and the polymorphism location and subject genotype were determined by amplifying the entire coding sequence of human alpha-lactalbumin by PCR, followed by DNA sequencing. A variant form of alpha-lactalbumin was observed in HPLC chromatograms, and the difference in molecular weight was determined by MS (wild type=14,070 Da, variant=14,056 Da). Protein reduction and digestion narrowed the polymorphism between the 33rd and 77th amino acid of the protein. The genetic polymorphism was identified as adenine to guanine, which translates to a substitution from isoleucine to valine at amino acid 46. The frequency of variation was higher in milk from China, Japan and Philippines, which suggests that this polymorphism is most prevalent in Asia. 
There are SNPs in the genome for human milk proteins and their implications for protein bioactivity and infant nutrition need to be considered.
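The reported 14 Da difference between the wild-type and variant proteins can be checked against the isoleucine-to-valine substitution with a short calculation. The following is a minimal sketch, not part of the study: the residue masses are standard average values, and the function name is hypothetical.

```python
# Standard average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {"I": 113.1594, "V": 99.1326}


def substitution_mass_shift(ref, alt):
    """Expected change in protein mass when residue `ref` is replaced by `alt`."""
    return AVG_RESIDUE_MASS[alt] - AVG_RESIDUE_MASS[ref]


# Masses quoted in the abstract (wild type and variant, in Da).
observed_shift = 14056 - 14070
expected_shift = substitution_mass_shift("I", "V")

print(f"observed: {observed_shift} Da, expected for I->V: {expected_shift:.2f} Da")
```

The expected shift is the mass of one CH2 group (about 14.03 Da), which agrees with the measured difference within the precision quoted in the abstract.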
Evolution of simeprevir-resistant variants over time by ultra-deep sequencing in HCV genotype 1b.
Akuta, Norio; Suzuki, Fumitaka; Sezaki, Hitomi; Suzuki, Yoshiyuki; Hosaka, Tetsuya; Kobayashi, Masahiro; Kobayashi, Mariko; Saitoh, Satoshi; Ikeda, Kenji; Kumada, Hiromitsu
2014-08-01
Using ultra-deep sequencing technology, the present study was designed to investigate the evolution of simeprevir-resistant variants (amino acid substitutions at positions aa80, aa155, aa156, and aa168 in the HCV NS3 region) over time. At Toranomon Hospital, 18 Japanese patients infected with HCV genotype 1b received triple therapy with simeprevir/PEG-IFN/ribavirin (DRAGON or CONCERT study). The sustained virological response rate was 67% and was significantly higher in patients with IL28B rs8099917 TT than in those with non-TT. Six patients who did not achieve sustained virological response were tested for resistant variants by ultra-deep sequencing at baseline, at the time of re-elevation of viral loads, and at 96 weeks after the completion of treatment. Twelve of 18 resistant variants detected at re-elevation of viral load were de novo resistant variants. Ten of 12 de novo resistant variants became undetectable over time, whereas five of seven resistant variants detected at baseline persisted over time. In one patient, variants of Q80R at baseline (0.3%) increased at 96 weeks after the cessation of treatment (10.2%), and de novo resistant variants of D168E (0.3%) also increased at 96 weeks after the cessation of treatment (9.7%). In conclusion, the present study indicates that the emergence of simeprevir-resistant variants after the start of treatment could not be predicted at baseline, and that the majority of de novo resistant variants become undetectable over time. Further large-scale prospective studies should be performed to investigate the clinical utility of detecting simeprevir-resistant variants. © 2014 Wiley Periodicals, Inc.
A gene variation of 14-3-3 zeta isoform in rat hippocampus.
Murakami, K; Situ, S Y; Eshete, F
1996-11-14
A variant form of 14-3-3 zeta was isolated from the rat hippocampal cDNA library. The cloned cDNA is 1687 bp in length and contains an entire ORF (nt = 63-797) encoding 245 amino acids that is characteristic of the 14-3-3 zeta subtype. By comparing with reported sequences of 14-3-3 zeta, we found three nucleotide substitutions within the coding sequence in our clone: a C<-->T transition at nt = 325 and G<-->C transversions at nt = 387 and 388. Both codon changes are missense mutations, leading to ACG (Thr) to ATG (Met) and CGT (Arg) to GCT (Ala) conversions at residues 88 and 109, respectively. Our results show that at least three different genetic variants of 14-3-3 zeta are present in the rat, resulting in protein variations. Such mutation in the amino acid sequence is an important indication of the diverse functions of this protein and may also contribute to the recent contradictory observations regarding the role of the 14-3-3 zeta subtype.
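The residue numbers quoted above follow from the nucleotide positions and the ORF start. A minimal sketch of that mapping is given here; it assumes 1-based cDNA numbering with the first codon beginning at nt 63, as stated in the abstract, and the function name is hypothetical.

```python
ORF_START = 63  # first nucleotide of the ORF (1-based), taken from the abstract
ORF_END = 797   # last nucleotide of the ORF (1-based), taken from the abstract


def codon_position(nt, orf_start=ORF_START):
    """Map a 1-based cDNA position to (codon number, position within codon), both 1-based."""
    offset = nt - orf_start
    return offset // 3 + 1, offset % 3 + 1


# The ORF spans 735 nt, i.e. 245 codons.
n_codons = (ORF_END - ORF_START + 1) // 3

for nt in (325, 387, 388):
    codon, pos = codon_position(nt)
    print(f"nt {nt}: codon {codon}, position {pos}")
```

Position 325 falls on the second base of codon 88 (ACG to ATG), and positions 387 and 388 fall on the first and second bases of codon 109 (CGT to GCT), consistent with the residues reported.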
Literak, Ivan; Manga, Ivan; Wojczulanis-Jakubas, Katarzyna; Chroma, Magdalena; Jamborova, Ivana; Dobiasova, Hana; Sedlakova, Miroslava Htoutou; Cizek, Alois
2014-07-16
We searched for Escherichia coli and Enterobacter cloacae isolates resistant to cephalosporins and fluoroquinolones, and for Salmonella isolates, in wild birds in Arctic Svalbard, Norway. Cloacal swabs of little auks (Alle alle, n=215) and samples of faeces of glaucous gulls (Larus hyperboreus, n=15) were examined. Inducible production of AmpC enzyme was detected in the E. cloacae isolate KW218. Sequence analysis of the 1146 bp PCR product of the ampC gene from this isolate revealed 99% sequence homology with the blaACT-14 and blaACT-5 AmpC beta-lactamase genes. Four and six, respectively, of the identified single nucleotide polymorphisms generated amino acid substitutions in the amino acid chain. Because the ampC sequence polymorphism in the investigated E. cloacae strain was unique, we report a novel variant of the ampC beta-lactamase gene, blaACT-23. Copyright © 2014 Elsevier B.V. All rights reserved.
Chen, M J; Chu, C C; Shyr, M H; Lin, C L; Lin, P Y; Yang, K L
2010-02-01
HLA-B*5214, a novel rare variant allele of HLA-B*52, was found in a Taiwanese volunteer bone marrow donor by a sequence-based typing method. The sequence of B*5214 is identical to that of B*520101 in exon 2 but differs from B*520101 in exon 3 at nucleotide positions 419 A-->T and 435 A-->G. These two nucleotide changes resulted in an amino acid substitution at residue 116, Y-->F (TAC-->TTC), and a silent change at residue 121, K-->K (AAA-->AAG).
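The distinction between the substitution at residue 116 and the silent change at residue 121 can be illustrated by translating the codons before and after each change. This is a minimal sketch: the table holds only the four codons named in the abstract (a subset of the standard genetic code), and the function name is hypothetical.

```python
# Subset of the standard genetic code covering only the codons in the abstract.
CODON_TO_AA = {"TAC": "Y", "TTC": "F", "AAA": "K", "AAG": "K"}


def classify_codon_change(ref_codon, alt_codon):
    """Return (reference residue, variant residue, 'missense' or 'synonymous')."""
    ref_aa = CODON_TO_AA[ref_codon]
    alt_aa = CODON_TO_AA[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "missense"
    return ref_aa, alt_aa, kind


for residue, ref, alt in [(116, "TAC", "TTC"), (121, "AAA", "AAG")]:
    ref_aa, alt_aa, kind = classify_codon_change(ref, alt)
    print(f"residue {residue}: {ref}->{alt} = {ref_aa}->{alt_aa} ({kind})")
```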
Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F
2008-07-22
Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types ( approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination.
Denef, Vincent J; Goltsman, Daniela S. Aliaga; Thelen, Michael P; Banfield, Jillian F
2008-01-01
Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth ∼20×). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (∼94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination. PMID:18651792
Zink, Lisa-Maria; Delbarre, Erwan; Eberl, H. Christian; Keilhauer, Eva C.; Bönisch, Clemens; Pünzeler, Sebastian; Bartkuhn, Marek; Collas, Philippe; Mann, Matthias
2017-01-01
Histone chaperones prevent promiscuous histone interactions before chromatin assembly. They guarantee faithful deposition of canonical histones and functionally specialized histone variants into chromatin in a spatially and temporally restricted manner. Here, we identify the binding partners of the primate-specific and H3.3-related histone variant H3.Y using several quantitative mass spectrometry approaches, and biochemical and cell biological assays. We find the HIRA, but not the DAXX/ATRX, complex to recognize H3.Y, explaining its presence in transcriptionally active euchromatic regions. Accordingly, H3.Y nucleosomes are enriched in the transcription-promoting FACT complex and depleted of repressive post-translational histone modifications. H3.Y mutational gain-of-function screens reveal an unexpected combinatorial amino acid sequence requirement for histone H3.3 interaction with DAXX but not HIRA, and for H3.3 recruitment to PML nuclear bodies. We demonstrate the importance and necessity of specific H3.3 core and C-terminal amino acids in discriminating between distinct chaperone complexes. Further, chromatin immunoprecipitation sequencing experiments reveal that in contrast to euchromatic HIRA-dependent deposition sites, human DAXX/ATRX-dependent regions of histone H3 variant incorporation are enriched in heterochromatic H3K9me3 and simple repeat sequences. These data demonstrate that H3.Y's unique amino acids allow a functional distinction between HIRA and DAXX binding and its consequent deposition into open chromatin. PMID:28334823
Insecticidal components from field pea extracts: sequences of some variants of pea albumin 1b.
Taylor, Wesley G; Sutherland, Daniel H; Olson, Douglas J H; Ross, Andrew R S; Fields, Paul G
2004-12-15
Methanol soluble insecticidal peptides with masses of 3752, 3757, and 3805 Da, isolated from crude extracts (C8 extracts) derived from the protein-enriched flour of commercial field peas [Pisum sativum (L.)], were purified by reversed phase chromatography and, after reduction and alkylation, were sequenced by matrix-assisted laser desorption/ionization (MALDI) time-of-flight mass spectrometry with the aid of various peptidases. These major peptides were variants of pea albumin 1b (PA1b) with methionine sulfoxide rather than methionine at position 12. Peptide 3752 showed additional variations at positions 29 (valine for isoleucine) and 34 (histidine for asparagine). A minor, 37 amino acid peptide with a molecular mass of 3788 Da was also sequenced and differed from a known PA1b variant at positions 1, 25, and 31. Sequence variants of PA1b with their molecular masses were compiled, and variants that matched the accurate masses of the experimental peptides were used to narrow the search. MALDI postsource decay experiments on pronase fragments helped to confirm the sequences. Whole and dehulled field peas gave insecticidal C8 extracts in the laboratory that were enriched in peptides with masses of 3736, 3741, and 3789 Da, as determined by high-performance liquid chromatography (HPLC) and electrospray ionization mass spectrometry. It was therefore concluded that oxidation of the methionine residues to methionine sulfoxide occurred primarily during the processing of dehulled peas in a mill.
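The conclusion about methionine oxidation rests on a consistent mass shift between the peptides from laboratory extracts and those from the processed flour. The sketch below pairs the two sets of masses by rank, which is an assumption made here for illustration, and compares each difference with the mass of one oxygen atom.

```python
OXYGEN_MASS = 15.9994  # average atomic mass of oxygen in Da

# Peptide masses (Da) quoted in the abstract.
lab_extract_masses = [3736, 3741, 3789]  # whole and dehulled peas, laboratory extracts
flour_masses = [3752, 3757, 3805]        # protein-enriched flour of commercial peas

# Assumption: peptides correspond one-to-one in order of increasing mass.
shifts = [flour - lab for lab, flour in zip(lab_extract_masses, flour_masses)]

for lab, flour, shift in zip(lab_extract_masses, flour_masses, shifts):
    print(f"{lab} -> {flour}: +{shift} Da")
```

Each pair differs by 16 Da, the mass of a single added oxygen, which is what oxidation of one methionine to methionine sulfoxide would produce.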
Anderson, Tavis K; Laegreid, William W; Cerutti, Francesco; Osorio, Fernando A; Nelson, Eric A; Christopher-Hennings, Jane; Goldberg, Tony L
2012-06-15
The extraordinary genetic and antigenic variability of RNA viruses is arguably the greatest challenge to the development of broadly effective vaccines. No single viral variant can induce sufficiently broad immunity, and incorporating all known naturally circulating variants into one multivalent vaccine is not feasible. Furthermore, no objective strategies currently exist to select actual viral variants that should be included or excluded in polyvalent vaccines. To address this problem, we demonstrate a method based on graph theory that quantifies the relative importance of viral variants. We demonstrate our method through application to the envelope glycoprotein gene of a particularly diverse RNA virus of pigs: porcine reproductive and respiratory syndrome virus (PRRSV). Using distance matrices derived from sequence nucleotide difference, amino acid difference and evolutionary distance, we constructed viral networks and used common network statistics to assign each sequence an objective ranking of relative 'importance'. To validate our approach, we use an independent published algorithm to score our top-ranked wild-type variants for coverage of putative T-cell epitopes across the 9383 sequences in our dataset. Top-ranked viruses achieve significantly higher coverage than low-ranked viruses, and top-ranked viruses achieve nearly equal coverage as a synthetic mosaic protein constructed in silico from the same set of 9383 sequences. Our approach relies on the network structure of PRRSV but applies to any diverse RNA virus because it identifies subsets of viral variants that are most important to overall viral diversity. We suggest that this method, through the objective quantification of variant importance, provides criteria for choosing viral variants for further characterization, diagnostics, surveillance and ultimately polyvalent vaccine development.
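The idea of ranking variants by their place in a distance-based network can be sketched in a few lines. This is an illustration only and not the authors' method: it uses Hamming distance on aligned sequences and ranks by total distance to all other sequences (a closeness-style score), and the toy sequences and function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of differing positions between two aligned sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def rank_by_closeness(seqs):
    """Rank sequence names from most to least central (smallest total distance first)."""
    total = {name: 0 for name in seqs}
    for a, b in combinations(seqs, 2):
        d = hamming(seqs[a], seqs[b])
        total[a] += d
        total[b] += d
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(seqs, key=lambda name: (total[name], name))


# Hypothetical toy alignment; real input would be aligned glycoprotein gene sequences.
toy = {"v1": "ACGTACGT", "v2": "ACGTACGA", "v3": "ACGAACGA", "v4": "TCGAACGA"}
print(rank_by_closeness(toy))
```

In this toy set the two intermediate sequences rank highest because they are closest, on average, to everything else, which is the intuition behind choosing central variants to cover overall diversity.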
The genetic evolution of canine parvovirus - A new perspective.
Zhou, Pei; Zeng, Weijie; Zhang, Xin; Li, Shoujun
2017-01-01
To trace the evolutionary process of CPV-2, all of the VP2 gene sequences of CPV-2 and FPV (from 1978 to 2015) from GenBank were analyzed in this study, and several new ideas regarding CPV-2 evolution are presented. First, VP2 amino acid positions 555 and 375 were ruled out as universal mutation sites in CPV-2a, and amino acid position 101 of FPV was found to be either I or T rather than only I as in the existing rule. Second, the confusing current nomenclature of CPV-2 variants was replaced with an optional nomenclature intended to serve future CPV-2 research. Third, a survey of the global distribution of variants showed that CPV-2a is the predominant variant in Asia and CPV-2c is the predominant variant in Europe and Latin America. Fourth, a series of CPV-2-like strains were identified and deduced to have evolved from modified live vaccine strains. Finally, strains carrying one of three single VP2 mutations (F267Y, Y324I, and T440A) were identified as a concern. These three new VP2 mutation strains may be responsible for vaccine failure, and the strains with VP2 440A may become a novel CPV sub-variant. In conclusion, a summary of all VP2 sequences provides a new perspective on CPV-2 evolution, and the corresponding biological studies still need to be performed.
Xu, Gaolian; You, Qimin; Pickerill, Sam; Zhong, Huayan; Wang, Hongying; Shi, Jian; Luo, Ying; You, Paul; Kong, Huimin; Lu, Fengmin; Hu, Lin
2010-07-01
Chronic hepatitis B virus (CHBV) infection causes cirrhosis and hepatocellular carcinoma. Lamivudine (LAM) has been successfully used to treat CHBV infections but prolonged use leads to the emergence of drug-resistant variants. This is primarily linked to a mutation in the tyrosine-methionine-aspartate-aspartate (YMDD) motif of the HBV polymerase gene at position 204. Rapid diagnosis of drug-resistant HBV is necessary for a prompt treatment response. Common diagnostic methods such as sequencing and restriction fragment length polymorphism (RFLP) analysis lack sensitivity and require significant processing. The aim of this study was to demonstrate the usefulness of a novel diagnostic method that combines polymerase chain reaction (PCR), ligase detection reaction (LDR) and a nucleic acid detection strip (NADS) in detecting site-specific mutations related to HBV LAM resistance. We compared this method (PLNA) to direct sequencing and RFLP analysis in 50 clinical samples from HBV infected patients. There was 90% concordance between all three results. PLNA detected more samples containing mutant variants than both sequencing and RFLP analysis and was more sensitive in detecting mixed variant populations. Plasmid standards indicated that the sensitivity of PLNA is at or below 3,000 copies per ml and that it can detect a minor variant at 5% of the total viral population. This warrants its further development and suggests that the PLNA method could be a useful tool in detecting LAM resistance. (c) 2010 Wiley-Liss, Inc.
GenProBiS: web server for mapping of sequence variants to protein binding sites.
Konc, Janez; Skrlj, Blaz; Erzen, Nika; Kunej, Tanja; Janezic, Dusanka
2017-07-03
Discovery of potentially deleterious sequence variants is important and has wide implications for research and generation of new hypotheses in human and veterinary medicine, and drug discovery. The GenProBiS web server maps sequence variants to protein structures from the Protein Data Bank (PDB), and further to protein-protein, protein-nucleic acid, protein-compound, and protein-metal ion binding sites. The concept of a protein-compound binding site is understood in the broadest sense, which includes glycosylation and other post-translational modification sites. Binding sites were defined by local structural comparisons of whole protein structures using the Protein Binding Sites (ProBiS) algorithm and transposition of ligands from the similar binding sites found to the query protein using the ProBiS-ligands approach with new improvements introduced in GenProBiS. Binding site surfaces were generated as three-dimensional grids encompassing the space occupied by predicted ligands. The server allows intuitive visual exploration of comprehensively mapped variants, such as human somatic mis-sense mutations related to cancer and non-synonymous single nucleotide polymorphisms from 21 species, within the predicted binding sites regions for about 80 000 PDB protein structures using fast WebGL graphics. The GenProBiS web server is open and free to all users at http://genprobis.insilab.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Graff, J; Normann, A; Feinstone, S M; Flehmig, B
1994-01-01
In order to study cell tropism and attenuation of hepatitis A virus (HAV), the genome of HAV wild-type GBM and two cell culture-adapted variants, GBM/FRhK and GBM/HFS, were cloned and sequenced after amplification by reverse transcriptase-PCR. During virus cultivation, the HAV variant GBM/FRhK had a strict host range for FRhK-4 cells, in contrast to GBM/HFS, which can be grown in HFS and FRhK-4 cells. The HAV variant GBM/HFS was shown to be attenuated when inoculated into chimpanzees (B. Flehmig, R. F. Mauler, G. Noll, E. Weinmann, and J. P. Gregerson, p. 87-90, in A. Zuckerman, ed., Viral Hepatitis and Liver Disease, 1988). On the basis of this biological background, the comparison of the nucleotide sequences of these three HAV GBM variants should elucidate differences which may be of importance for cell tropism and attenuation. The comparison of the genome between the GBM wild type and HAV wild types HM175 (J. I. Cohen, J. R. Ticehurst, R. H. Purcell, A. Buckler-White, and B. M. Baroudy, J. Virol. 61:50-59, 1987) and HAV-LA (R. Najarian, O. Caput, W. Gee, S. J. Potter, A. Renard, J. Merryweather, G. Van Nest, and D. Dina, Proc. Natl. Acad. Sci. USA 82:2627-2631, 1985) showed a 92 to 96.3% identity, whereas the identity was 99.3 to 99.6% between the GBM variants. Nucleotide differences between the wild-type and the cell culture-adapted variants, which were identical in both cell culture-adapted GBM variants, were localized in the 5' noncoding region; in 2B, 3B, and 3D; and in the 3' noncoding region. Our result concerning the 2B/2C region confirms a mutation at position 3889 (C-->T, alanine to valine), which had been shown to be of importance for cell culture adaptation (S. U. Emerson, C. McRill, B. Rosenblum, S. M. Feinstone, and R. H. Purcell, J. Virol. 65:4882-4886, 1991; S. U. Emerson, Y. K. Huang, C. McRill, M. Lewis, and R. H. Purcell, J. Virol. 66:650-654, 1992), whereas other mutations differ from published HAV sequence data and may be cell specific. 
Further comparison of the two cell culture-adapted GBM variants showed cell-specific mutations resulting in deletions of six amino acids in the VP1 region and three amino acids in the 3A region of the GBM variant GBM/FRhK. PMID:8254770
Isolated nucleic acids encoding antipathogenic polypeptides and uses thereof
Altier, Daniel J.; Crane, Virginia C.; Ellanskaya, Irina; Ellanskaya, Natalia; Gilliam, Jacob T.; Hunter-Cevera, Jennie; Presnail, James K.; Schepers, Eric J.; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser
2010-04-20
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include amino acid sequences, and variants and fragments thereof, for antipathogenic polypeptides that were isolated from fungal fermentation broths. Nucleic acids that encode the antipathogenic polypeptides are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention are also disclosed.
Ebrahim, Hatim Y; Baker, Robert J; Mehta, Atul B; Hughes, Derralynn A
2012-03-01
The functional significance of missense mutations in genes encoding acid glycosidases of lysosomal storage disorders (LSDs) is not always clear. Here we describe a method of investigating functional properties of variant enzymes in vitro using a human embryonic kidney epithelial cell line. Site-directed mutagenesis was performed on the parental plasmids containing cDNA encoding alpha-galactosidase A (α-Gal A) and acid maltase (α-Glu) to prepare plasmids encoding relevant point mutations. Mutant plasmids were transfected into HEK 293 T cells, and transient over-expression of variant enzymes was measured after 3 days. We have illustrated the method by examining enzymatic activities of four unknown α-Gal A variants and one α-Glu variant identified in our patients with Anderson-Fabry disease and Pompe disease, respectively. Comparison with control variants known to be either pathogenic or non-pathogenic, together with over-expression of wild-type enzyme, allowed determination of the pathogenicity of the mutation. One novel leader-sequence variant of α-Gal A (p.A15T) was shown not to significantly reduce enzyme activity, whereas three other novel α-Gal A variants (p.D93Y, p.L372P and p.T410I) were shown to be pathogenic as they resulted in significant reduction of enzyme activity. A novel α-Glu variant (p.L72R) was shown to be pathogenic as it significantly reduced enzyme activity. Certain acid glycosidase variants that have been described in association with late-onset LSDs and which are known to have variable residual plasma and leukocyte enzyme activity in patients appear to show intermediate to low enzyme activity (p.N215S and p.Q279E α-Gal A, respectively) in the over-expression system.
Exon 11 skipping of SCN10A coding for voltage-gated sodium channels in dorsal root ganglia
Schirmeyer, Jana; Szafranski, Karol; Leipold, Enrico; Mawrin, Christian; Platzer, Matthias; Heinemann, Stefan H
2014-01-01
The voltage-gated sodium channel NaV1.8 (encoded by SCN10A) is predominantly expressed in dorsal root ganglia (DRG) and plays a critical role in pain perception. We analyzed SCN10A transcripts isolated from human DRGs using deep sequencing and found a novel splice variant lacking exon 11, which codes for 98 amino acids of the domain I/II linker. Quantitative PCR analysis revealed an abundance of this variant of up to 5–10% in humans, while no such variants were detected in mouse or rat. Since no obvious functional differences between channels with and without the exon-11 sequence were detected, it is suggested that SCN10A exon 11 skipping in humans is a tolerated event. PMID:24763188
Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J
2002-02-22
The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
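Percent identity figures such as those quoted above depend on how aligned columns are counted. The following minimal sketch uses one common convention, assumed here and not stated in the abstract: identical residues divided by the number of columns in which neither sequence has a gap. The function name and example strings are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity of two pre-aligned sequences, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have a residue.
    columns = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


print(percent_identity("ACDE", "ACDQ"))    # three of four columns match
print(percent_identity("AC-DE", "ACQDQ"))  # the gapped column is skipped
```

Other conventions divide by the full alignment length or by the shorter sequence length, so identity values from different tools are only comparable when the denominator is the same.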
Lescat, Mathilde; Hoede, Claire; Clermont, Olivier; Garry, Louis; Darlu, Pierre; Tuffery, Pierre; Denamur, Erick; Picard, Bertrand
2009-12-29
Previous studies have established a correlation between electrophoretic polymorphism of esterase B, and virulence and phylogeny of Escherichia coli. Strains belonging to the phylogenetic group B2 are more frequently implicated in extraintestinal infections and include esterase B2 variants, whereas phylogenetic groups A, B1 and D contain less virulent strains and include esterase B1 variants. We investigated esterase B as a marker of phylogeny and/or virulence, in a thorough analysis of the esterase B-encoding gene. We identified the gene encoding esterase B as the acetyl-esterase gene (aes) using gene disruption. The analysis of aes nucleotide sequences in a panel of 78 reference strains, including the E. coli reference (ECOR) strains, demonstrated that the gene is under purifying selection. The phylogenetic tree reconstructed from aes sequences showed a strong correlation with the species phylogenetic history, based on multi-locus sequence typing using six housekeeping genes. The unambiguous distinction between variants B1 and B2 by electrophoresis was consistent with Aes amino-acid sequence analysis and protein modelling, which showed that substituted amino acids in the two esterase B variants occurred mostly at different sites on the protein surface. Studies in an experimental mouse model of septicaemia using mutant strains did not reveal a direct link between aes and extraintestinal virulence. Moreover, we did not find any genes in the chromosomal region of aes to be associated with virulence. Our findings suggest that aes does not play a direct role in the virulence of E. coli extraintestinal infection. However, this gene acts as a powerful marker of phylogeny, illustrating the extensive divergence of B2 phylogenetic group strains from the rest of the species.
Rare mtDNA variants in Leber hereditary optic neuropathy families with recurrence of myoclonus.
La Morgia, C; Achilli, A; Iommarini, L; Barboni, P; Pala, M; Olivieri, A; Zanna, C; Vidoni, S; Tonon, C; Lodi, R; Vetrugno, R; Mostacci, B; Liguori, R; Carroccia, R; Montagna, P; Rugolo, M; Torroni, A; Carelli, V
2008-03-04
To investigate the mechanisms underlying myoclonus in Leber hereditary optic neuropathy (LHON). Five patients and one unaffected carrier from two Italian families bearing the homoplasmic 11778/ND4 and 3460/ND1 mutations underwent a uniform investigation including neurophysiologic studies, muscle biopsy, serum lactic acid after exercise, and muscle ((31)P) and cerebral ((1)H) magnetic resonance spectroscopy (MRS). Biochemical investigations on fibroblasts and complete mitochondrial DNA (mtDNA) sequences of both families were also performed. All six individuals had myoclonus. In spite of a normal EEG background and the absence of giant SEPs and C reflex, EEG-EMG back-averaging showed a preceding jerk-locked EEG potential, consistent with a cortical generator of the myoclonus. Specific comorbidities in the 11778/ND4 family included muscular cramps and psychiatric disorders, whereas features common to both families were migraine and cardiologic abnormalities. Signs of mitochondrial proliferation were seen in muscle biopsies and lactic acid elevation was observed in four of six patients. (31)P-MRS was abnormal in five of six patients and (1)H-MRS showed ventricular accumulation of lactic acid in three of six patients. Fibroblast ATP depletion was evident at 48 hours incubation with galactose in LHON/myoclonus patients. Sequence analysis revealed haplogroup T2 (11778/ND4 family) and U4a (3460/ND1 family) mtDNAs. A functional role for the non-synonymous 4136A>G/ND1, 9139G>A/ATPase6, and 15773G>A/cyt b variants was supported by amino acid conservation analysis. Myoclonus and other comorbidities characterized our Leber hereditary optic neuropathy (LHON) families. Functional investigations disclosed a bioenergetic impairment in all individuals. Our sequence analysis suggests that the LHON plus phenotype in our cases may relate to the synergic role of mtDNA variants.
SLC6A1 Mutation and Ketogenic Diet in Epilepsy With Myoclonic-Atonic Seizures.
Palmer, Samantha; Towne, Meghan C; Pearl, Phillip L; Pelletier, Renee C; Genetti, Casie A; Shi, Jiahai; Beggs, Alan H; Agrawal, Pankaj B; Brownstein, Catherine A
2016-11-01
Epilepsy with myoclonic-atonic seizures, also known as myoclonic-astatic epilepsy or Doose syndrome, has been recently linked to variants in the SLC6A1 gene. Epilepsy with myoclonic-atonic seizures is often refractory to antiepileptic drugs, and the ketogenic diet is known for treating medically intractable seizures, although the mechanism of action is largely unknown. We report a novel SLC6A1 variant in a patient with epilepsy with myoclonic-atonic seizures, analyze its effects, and suggest a mechanism of action for the ketogenic diet. We describe a ten-year-old girl with epilepsy with myoclonic-atonic seizures and a de novo SLC6A1 mutation who responded well to the ketogenic diet. She carried a c.491G>A mutation predicted to cause p.Cys164Tyr amino acid change, which was identified using whole exome sequencing and confirmed by Sanger sequencing. High-resolution structural modeling was used to analyze the likely effects of the mutation. The SLC6A1 gene encodes a transporter that removes gamma-aminobutyric acid from the synaptic cleft. Mutations in SLC6A1 are known to disrupt the gamma-aminobutyric acid transporter protein 1, affecting gamma-aminobutyric acid levels and causing seizures. The p.Cys164Tyr variant found in our study has not been previously reported, expanding on the variants linked to epilepsy with myoclonic-atonic seizures. A 10-year-old girl with a novel SLC6A1 mutation and epilepsy with myoclonic-atonic seizures had an excellent clinical response to the ketogenic diet. An effect of the diet on gamma-aminobutyric acid reuptake mediated by gamma-aminobutyric acid transporter protein 1 is suggested. A personalized approach to epilepsy with myoclonic-atonic seizures patients carrying SLC6A1 mutation and a relationship between epilepsy with myoclonic-atonic seizures due to SLC6A1 mutations, GABAergic drugs, and the ketogenic diet warrants further exploration. Copyright © 2016 Elsevier Inc. All rights reserved.
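The relationship between the reported nucleotide change (c.491G>A) and the predicted amino acid change (p.Cys164Tyr) can be checked with simple codon arithmetic. The sketch below is illustrative only: the abstract does not state which cysteine codon the reference sequence uses, so both standard Cys codons are tried, and the function names are hypothetical.

```python
# Only the standard genetic code entries needed for this check.
CODON_TABLE = {"TGT": "Cys", "TGC": "Cys", "TAT": "Tyr", "TAC": "Tyr"}

def locate_in_codon(cdna_pos):
    """Map a 1-based coding position to (codon number, 0-based offset in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3

def substitute(codon, offset, ref, alt):
    """Apply a single-base substitution, checking the reference base first."""
    if codon[offset] != ref:
        raise ValueError(f"expected {ref} at offset {offset} of {codon}")
    return codon[:offset] + alt + codon[offset + 1:]

codon_number, offset = locate_in_codon(491)
print(codon_number, offset)

# Assumption: the reference codon is one of the two Cys codons (not given in the abstract).
for ref_codon in ("TGT", "TGC"):
    mutant = substitute(ref_codon, offset, "G", "A")
    print(ref_codon, CODON_TABLE[ref_codon], "->", mutant, CODON_TABLE[mutant])
```

Position 491 falls on the second base of codon 164, and a G>A change at that base converts either cysteine codon into a tyrosine codon, which agrees with the predicted p.Cys164Tyr.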
Detection of the Canine Parvovirus 2c Subtype in Australian Dogs.
Woolford, Lucy; Crocker, Paul; Bobrowski, Hannah; Baker, Trevor; Hemmatzadeh, Farhid
2017-06-01
Canine parvovirus (CPV-2) is an important cause of hemorrhagic enteritis in dogs. In Australia the disease has been associated with CPV-2a and CPV-2b variants. A third variant that emerged more recently overseas, CPV-2c, has not been detected in surveys of the Australian dog population. In this study, we report three cases of canine parvoviral enteritis associated with CPV-2c infection. Case 1 occurred in an 8-week-old puppy that died following acute hemorrhagic enteritis. Cases 2 and 3 were an 11-month-old female entire Saint Bernard and a 9-month-old male entire Siberian husky, respectively, both of which had completed vaccination schedules and presented with vomiting or mild diarrhea only. Full genomic sequencing of parvoviral DNA from cases 1, 2, and 3 revealed greater than 99% homology to known CPV-2c variants, and predicted protein sequences from the VP2 region of viral DNA from all three cases identified glutamic acid at amino acid residue 426, characteristic of the CPV-2c variant. Veterinary professionals should be aware that CPV-2c is now present in Australia, detected in a puppy and vaccinated young adult dogs in this study. Further characterization of CPV-2c-associated disease and its prevalence in Australian dogs requires additional research.
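Typing by VP2 residue 426 can be sketched as a simple lookup: asparagine is associated with CPV-2a, aspartic acid with CPV-2b, and glutamic acid with CPV-2c. This is a simplified illustration, not the authors' pipeline; the function name is hypothetical, the toy sequence is a placeholder rather than real VP2 data, and in practice other residues are also needed to separate CPV-2a from the original CPV-2.

```python
# Residue at VP2 position 426 (one-letter code) mapped to the variant it typifies.
RESIDUE_426_TO_VARIANT = {"N": "CPV-2a", "D": "CPV-2b", "E": "CPV-2c"}

def classify_cpv2(vp2_protein):
    """Classify a VP2 protein sequence by the residue at position 426 (1-based)."""
    if len(vp2_protein) < 426:
        raise ValueError("sequence too short to contain residue 426")
    residue = vp2_protein[425]  # 0-based index of residue 426
    return RESIDUE_426_TO_VARIANT.get(residue, "unclassified")

# Placeholder sequence: alanines everywhere except a glutamic acid at position 426.
toy_vp2 = "A" * 425 + "E" + "A" * 158
print(classify_cpv2(toy_vp2))
```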
Ultrasensitive Genotypic Detection of Antiviral Resistance in Hepatitis B Virus Clinical Isolates
Fang, Jie; Wichroski, Michael J.; Levine, Steven M.; Baldick, Carl J.; Mazzucco, Charles E.; Walsh, Ann W.; Kienzle, Bernadette K.; Rose, Ronald E.; Pokornowski, Kevin A.; Colonno, Richard J.; Tenney, Daniel J.
2009-01-01
Amino acid substitutions that confer reduced susceptibility to antivirals arise spontaneously through error-prone viral polymerases and are selected as a result of antiviral therapy. Resistance substitutions first emerge in a fraction of the circulating virus population, below the limit of detection by nucleotide sequencing of either the population or limited sets of cloned isolates. These variants can expand under drug pressure to dominate the circulating virus population. To enhance detection of these viruses in clinical samples, we established a highly sensitive quantitative, real-time allele-specific PCR assay for hepatitis B virus (HBV) DNA. Sensitivity was accomplished using a high-fidelity DNA polymerase and oligonucleotide primers containing locked nucleic acid bases. Quantitative measurement of resistant and wild-type variants was accomplished using sequence-matched standards. Detection methodology that was not reliant on hybridization probes, and assay modifications, minimized the effect of patient-specific sequence polymorphisms. The method was validated using samples from patients chronically infected with HBV through parallel sequencing of large numbers of cloned isolates. Viruses with resistance to lamivudine and other l-nucleoside analogs and entecavir, involving 17 different nucleotide substitutions, were reliably detected at levels at or below 0.1% of the total population. The method worked across HBV genotypes. Longitudinal analysis of patient samples showed earlier emergence of resistance on therapy than was seen with sequencing methodologies, including some cases of resistance that existed prior to treatment. In summary, we established and validated an ultrasensitive method for measuring resistant HBV variants in clinical specimens, which enabled earlier, quantitative measurement of resistance to therapy. PMID:19433559
PNPLA3 variant I148M is associated with altered hepatic lipid composition in humans.
Peter, Andreas; Kovarova, Marketa; Nadalin, Silvio; Cermak, Tomas; Königsrainer, Alfred; Machicao, Fausto; Stefan, Norbert; Häring, Hans-Ulrich; Schleicher, Erwin
2014-10-01
The common sequence variant I148M of the patatin-like phospholipase domain-containing protein 3 gene (PNPLA3) is associated with increased hepatic triacylglycerol (TAG) content, but not with insulin resistance, in humans. The PNPLA3 (I148M) variant was previously reported to alter the specificity of the encoded enzyme and subsequently affect lipid composition. We analysed the fatty acid composition of five lipid fractions from liver tissue samples from 52 individuals, including 19 carriers of the minor PNPLA3 (I148M) variant. PNPLA3 (I148M) was associated with a strong increase (1.75-fold) in liver TAGs, but with no change in other lipid fractions. PNPLA3 (I148M) minor allele carriers had an increased n-3 polyunsaturated fatty acid (PUFA) α-linolenic acid content and reductions in several n-6 PUFAs in the liver TAG fraction. Furthermore, there was a strong inverse correlation between n-6 PUFA and TAG content independent of PNPLA3 genotype. In a multivariate model including liver fat content, PNPLA3 genotype and fatty acid composition, two significant differences could be exclusively attributed to the PNPLA3 (I148M) minor allele: reduced stearic acid and increased α-linolenic acid content in the hepatic TAG fraction. These changes therefore suggest a mechanism to explain the PNPLA3 (I148M)-dependent increase in liver fat content without causing insulin resistance. Stearic acid can induce insulin resistance, whereas α-linolenic acid may protect against it.
Tang, Danming; Lam, Cynthia; Louie, Salina; Hoi, Kam Hon; Shaw, David; Yim, Mandy; Snedecor, Brad; Misaghi, Shahram
2018-01-01
In the process of generating stable monoclonal antibody (mAb) producing cell lines, reagents such as methotrexate (MTX) or methionine sulfoximine (MSX) are often used. However, using such selection reagent(s) increases the possibility of having higher occurrence of sequence variants in the expressed antibody molecules due to the effects of MTX or MSX on de novo nucleotide synthesis. Since MSX inhibits glutamine synthetase (GS) and results in both amino acid and nucleoside starvation, it is questioned whether supplementing nucleosides into the media could lower sequence variant levels without affecting titer. The results show that the supplementation of nucleosides to the media during MSX selection decreased genomic DNA mutagenesis rates in the selected cells, probably by reducing nucleotide mis-incorporation into the DNA. Furthermore, addition of nucleosides enhances clone recovery post selection and does not affect antibody expression. It is further observed that nucleoside supplements lowered DNA mutagenesis rates only at the initial stage of the clone selection and did not have any effect on DNA mutagenesis rates after stable cell lines were established. Therefore, the data suggest that addition of nucleosides during early stages of MSX selection can lower sequence variant levels without affecting titer or clone stability in antibody expression. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
VarMod: modelling the functional effects of non-synonymous variants.
Pappalardo, Morena; Wass, Mark N
2014-07-01
Unravelling the genotype-phenotype relationship in humans remains a challenging task in genomics studies. Recent advances in sequencing technologies mean there are now thousands of sequenced human genomes, revealing millions of single nucleotide variants (SNVs). For non-synonymous SNVs present in proteins the difficulties of the problem lie in first identifying those nsSNVs that result in a functional change in the protein among the many non-functional variants and in turn linking this functional change to phenotype. Here we present VarMod (Variant Modeller) a method that utilises both protein sequence and structural features to predict nsSNVs that alter protein function. VarMod develops recent observations that functional nsSNVs are enriched at protein-protein interfaces and protein-ligand binding sites and uses these characteristics to make predictions. In benchmarking on a set of nearly 3000 nsSNVs VarMod performance is comparable to an existing state of the art method. The VarMod web server provides extensive resources to investigate the sequence and structural features associated with the predictions including visualisation of protein models and complexes via an interactive JSmol molecular viewer. VarMod is available for use at http://www.wasslab.org/varmod. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sex determination: balancing selection in the honey bee.
Charlesworth, Deborah
2004-07-27
Sequences of alleles of the honey bee's primary sex-determining gene have extremely high diversity, with many amino acid variants, suggesting that different alleles of this gene have been maintained in populations for very long evolutionary times.
Continuously tunable nucleic acid hybridization probes.
Wu, Lucia R; Wang, Juexiao Sherry; Fang, John Z; Evans, Emily R; Pinto, Alessandro; Pekker, Irena; Boykin, Richard; Ngouenet, Celine; Webster, Philippa J; Beechem, Joseph; Zhang, David Yu
2015-12-01
In silico-designed nucleic acid probes and primers often do not achieve favorable specificity and sensitivity tradeoffs on the first try, and iterative empirical sequence-based optimization is needed, particularly in multiplexed assays. We present a novel, on-the-fly method of tuning probe affinity and selectivity by adjusting the stoichiometry of auxiliary species, which allows for independent and decoupled adjustment of the hybridization yield for different probes in multiplexed assays. Using this method, we achieved near-continuous tuning of probe effective free energy. To demonstrate our approach, we enforced uniform capture efficiency of 31 DNA molecules (GC content, 0-100%), maximized the signal difference for 11 pairs of single-nucleotide variants and performed tunable hybrid capture of mRNA from total RNA. Using the Nanostring nCounter platform, we applied stoichiometric tuning to simultaneously adjust yields for a 24-plex assay, and we show multiplexed quantitation of RNA sequences and variants from formalin-fixed, paraffin-embedded samples.
Limited Variation in BK Virus T-Cell Epitopes Revealed by Next-Generation Sequencing
Sahoo, Malaya K.; Tan, Susanna K.; Chen, Sharon F.; Kapusinszky, Beatrix; Concepcion, Katherine R.; Kjelson, Lynn; Mallempati, Kalyan; Farina, Heidi M.; Fernández-Viña, Marcelo; Tyan, Dolly; Grimm, Paul C.; Anderson, Matthew W.; Concepcion, Waldo
2015-01-01
BK virus (BKV) infection causing end-organ disease remains a formidable challenge to the hematopoietic cell transplant (HCT) and kidney transplant fields. As BKV-specific treatments are limited, immunologic-based therapies may be a promising and novel therapeutic option for transplant recipients with persistent BKV infection. Here, we describe a whole-genome, deep-sequencing methodology and bioinformatics pipeline that identify BKV variants across the genome and at BKV-specific HLA-A2-, HLA-B0702-, and HLA-B08-restricted CD8 T-cell epitopes. BKV whole genomes were amplified using long-range PCR with four inverse primer sets, and fragmentation libraries were sequenced on the Ion Torrent Personal Genome Machine (PGM). An error model and variant-calling algorithm were developed to accurately identify rare variants. A total of 65 samples from 18 pediatric HCT and kidney recipients with quantifiable BKV DNAemia underwent whole-genome sequencing. Limited genetic variation was observed. The median number of amino acid variants identified per sample was 8 (range, 2 to 37; interquartile range, 10), with the majority of variants (77%) detected at a frequency of <5%. When normalized for length, there was no statistical difference in the median number of variants across all genes. Similarly, the predominant virus population within samples harbored T-cell epitopes similar to the reference BKV strain that was matched for the BKV genotype. Despite the conservation of epitopes, low-level variants in T-cell epitopes were detected in 77.7% (14/18) of patients. Understanding epitope variation across the whole genome provides insight into the virus-immune interface and may help guide the development of protocols for novel immunologic-based therapies. PMID:26202116
la Torre, David De; Mafla, Eulalia; Puga, Byron; Erazo, Linda; Astolfi-Ferreira, Claudete; Ferreira, Antonio Piantino
2018-04-01
The objective of this study was to determine the presence of the variants of canine parvovirus (CPV)-2 in the city of Quito, Ecuador, due to the high domestic and street-type canine population, and to identify possible mutations at a genetic level that could be causing structural changes in the virus with a consequent influence on the immune response of the hosts. Thirty-five stool samples from different puppies with characteristic signs of the disease and positive for CPV through immunochromatography kits were collected from different veterinary clinics of the city. Polymerase chain reaction and DNA sequencing were used to determine the mutations in residue 426 of the VP2 gene, which determines the variants of CPV-2; in addition, four samples were chosen for complete sequencing of the VP2 gene to identify all possible mutations in the circulating strains in this region of the country. The results revealed the presence of the three variants of CPV-2 with a prevalence of 57.1% (20/35) for CPV-2a, 8.5% (3/35) for CPV-2b, and 34.3% (12/35) for CPV-2c. In addition, complete sequencing of the VP2 gene showed amino acid substitutions in residues 87, 101, 139, 219, 297, 300, 305, 322, 324, 375, 386, 426, 440, and 514 of the three Ecuadorian variants when compared with the original CPV-2 sequence. This study describes the detection of CPV variants in the city of Quito, Ecuador. Variants of CPV-2 (2a, 2b, and 2c) have been reported in South America, and there are cases in Ecuador where CPV-2 is affecting even vaccinated puppies.
Cicek, Aysegul Copur; Duzgun, Azer Ozad; Saral, Aysegul; Sandalli, Cemal
2014-10-01
Proteus mirabilis (P. mirabilis) is one of Gram-negative pathogens encountered in clinical specimens. A clinical isolate (TRP41) of P. mirabilis was isolated from a Turkish patient in Turkey. The isolate was identified using the API 32GN system and 16S rRNA gene sequencing and it was found resistant to ampicillin/sulbactam, piperacillin, tetracycline, and trimethoprim/sulfamethoxazole. This isolate was harboring a Class 1 integron gene cassette and its DNA sequence analysis revealed a novel blaOXA variant exhibiting one amino acid substitution (Asn266Ile) from blaOXA-1 . This new variant of OXA was located on Class 1 integron together with aadA1 gene encoding aminoglycoside-modifying enzymes. According to sequence records, the new variant was named as blaOXA-320 . Cassette array and size of integron were found as blaOXA-320 -aadA1 and 2086 bp, respectively. The blaOXA-320 gene is not transferable according to conjugation experiment. In this study, we report the first identification of blaOXA-320 -aadA1 gene cassette, a novel variant of Class D β-lactamase, in P. mirabilis from Turkey. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Pan, Wei; Song, Im-Sook; Shin, Ho-Jung; Kim, Min-Hye; Choi, Yeong-Lim; Lim, Su-Jeong; Kim, Woo-Young; Lee, Sang-Seop; Shin, Jae-Gook
2011-06-01
Genetic variants of Na(+)-taurocholate co-transporting polypeptide (NTCP; SLC10A1) and ileal apical sodium-dependent bile acid transporter (ASBT; SLC10A2), which greatly contribute to bile acid homeostasis, were extensively explored in the Korean population and functional variants of NTCP were compared among Asian populations. From direct DNA sequencing, six SNPs were identified in the SLC10A1 gene and 14 SNPs in the SLC10A2 gene. Three of seven coding variants were non-synonymous SNPs: two variants from SLC10A1 (A64T, S267F) and one from SLC10A2 (A171S). No linkage was analysed in the SLC10A1 gene because of low frequencies of genetic variants, and the SLC10A2 gene was composed of two separated linkage disequilibrium blocks, contrary to the white population. The stably transfected NTCP-A64T variant showed significantly decreased uptake of taurocholate and rosuvastatin compared with wild-type NTCP. Decreased taurocholate uptake and increased rosuvastatin uptake were shown in the NTCP-S267F variant. The allele frequencies of these functional variants were 1.0% and 3.1%, respectively, in a Korean population. However, NTCP-A64T was not found in Chinese and Vietnamese subjects. The frequency distribution of NTCP-S267F in Koreans was significantly lower than those in Chinese and Vietnamese populations. Our data suggest that NTCP-A64T and -S267F variants cause substrate-dependent functional change in vitro, and show ethnic difference in their allelic frequencies among Asian populations, although the clinical relevance of these variants remains to be evaluated.
Sata, F; Sapone, A; Elizondo, G; Stocker, P; Miller, V P; Zheng, W; Raunio, H; Crespi, C L; Gonzalez, F J
2000-01-01
To determine the existence of mutant and variant CYP3A4 alleles in three racial groups and to assess functions of the variant alleles by complementary deoxyribonucleic acid (cDNA) expression. A bacterial artificial chromosome that contains the complete CYP3A4 gene was isolated and the exons and surrounding introns were directly sequenced to develop primers to polymerase chain reaction (PCR) amplify and sequence the gene from lymphocyte DNA. DNA samples from Chinese, black, and white subjects were screened. Mutating the affected amino acid in the wild-type cDNA and expressing the variant enzyme with use of the baculovirus system was used to functionally evaluate the variant allele having a missense mutation. To investigate the existence of mutant and variant CYP3A4 alleles in humans, all 13 exons and the 5'-flanking region of the human CYP3A4 gene in three racial groups were sequenced and four alleles were identified. An A-->G point mutation in the 5'-flanking region of the human CYP3A4 gene, designated CYP3A4*1B, was found in the three different racial groups. The frequency of this allele in a white population was 4.2%, whereas it was 66.7% in black subjects. The CYP3A4*1B allele was not found in Chinese subjects. A second variant allele, designated CYP3A4*2, having a Ser222Pro change, was found at a frequency of 2.7% in the white population and was absent in the black subjects and Chinese subjects analyzed. Baculovirus-directed cDNA expression revealed that the CYP3A4*2 P450 had a lower intrinsic clearance for the CYP3A4 substrate nifedipine compared with the wild-type enzyme but was not significantly different from the wild-type enzyme for testosterone 6beta-hydroxylation. Another rare allele, designated CYP3A4*3, was found in a single Chinese subject who had a Met445Thr change in the conserved heme-binding region of the P450.
These are the first examples of potential function polymorphisms resulting from missense mutations in the CYP3A4 gene. The CYP3A4*2 allele was found to encode a P450 with substrate-dependent altered kinetics compared with the wild-type P450.
Identification and expression analysis of cDNA encoding insulin-like growth factor 2 in horses
KIKUCHI, Kohta; SASAKI, Keisuke; AKIZAWA, Hiroki; TSUKAHARA, Hayato; BAI, Hanako; TAKAHASHI, Masashi; NAMBO, Yasuo; HATA, Hiroshi; KAWAHARA, Manabu
2017-01-01
Insulin-like growth factor 2 (IGF2) is responsible for a broad range of physiological processes during fetal development and adulthood, but genomic analyses of IGF2 containing the 5ʹ- and 3ʹ-untranslated regions (UTRs) in equines have been limited. In this study, we characterized the IGF2 mRNA containing the UTRs, and determined its expression pattern in the fetal tissues of horses. The complete equine IGF2 mRNA sequence harboring another exon approximately 2.8 kb upstream from the canonical transcription start site was identified as a new transcript variant. As this upstream exon did not contain the start codon, the amino acid sequence was identical to the canonical variant. Analysis of the deduced amino acid sequence revealed that the protein possessed two major domains, IlGF and IGF2_C, and analysis of IGF2 sequence polymorphism in fetal tissues of Hokkaido native horse and Thoroughbreds revealed a single nucleotide polymorphism (T to C transition) at position 398 in Thoroughbreds, which caused an amino acid substitution at position 133 in the IGF2 sequence. Furthermore, the expression pattern of the IGF2 mRNA in the fetal tissues of horses was determined for the first time, and was found to be consistent with those of other species. Taken together, these results suggested that the transcriptional and translational products of the IGF2 gene have conserved functions in the fetal development of mammals, including horses. PMID:29151450
Koo, Eung Seo; Kim, Man Su; Choi, Yong Seon; Park, Kwon-Sam; Jeong, Yong Seok
2017-01-01
Human norovirus (HNoV), a positive-sense RNA virus, is the main causative agent of acute viral gastroenteritis. Multiple pandemic variants of the genogroup II genotype 4 (GII.4) of NoV have attracted great attention from researchers worldwide. However, novel variants of GII.17 have been overtaking those pandemic variants in some areas of East Asia. To investigate the environmental occurrence of GII in South Korea, we collected water samples from coastal streams and a neighboring waste water treatment plant in North Jeolla province (in March, July, and December of 2015). Based on capsid gene region C analysis, four different genotypes (GII.4, GII.13, GII.17, and GII.21) were detected, with much higher prevalence of GII.17 than of GII.4. Additional sequence analyses of the ORF1-ORF2 junction and ORF2 from the water samples revealed that the GII.17 sequences in this study were closely related to the novel strains of GII.P17-GII.17, the main causative variants of the 2014–2015 HNoV outbreak in China and Japan. In addition, the GII.P21-GII.21 variants were identified in this study and they had new amino acid sequence variations in the blockade epitopes of the P2 domain. From these results, we present two important findings: 1) the novel GII.P17-GII.17 variants appeared to be predominant in the study area, and 2) new GII.21 variants have emerged in South Korea. PMID:28199388
Shen, Bo; Damude, Howard G.; Everard, John D.; Booth, John R.
2016-01-01
Kinetically improved diacylglycerol acyltransferase (DGAT) variants were created to favorably alter carbon partitioning in soybean (Glycine max) seeds. Initially, variants of a type 1 DGAT from a high-oil, high-oleic acid plant seed, Corylus americana, were screened for high oil content in Saccharomyces cerevisiae. Nearly all DGAT variants examined from high-oil strains had increased affinity for oleoyl-CoA, with S0.5 values decreased as much as 4.7-fold compared with the wild-type value of 0.94 µM. Improved soybean DGAT variants were then designed to include amino acid substitutions observed in promising C. americana DGAT variants. The expression of soybean and C. americana DGAT variants in soybean somatic embryos resulted in oil contents as high as 10% and 12%, respectively, compared with only 5% and 7.6% oil achieved by overexpressing the corresponding wild-type DGATs. The affinity for oleoyl-CoA correlated strongly with oil content. The soybean DGAT variant that gave the greatest oil increase contained 14 amino acid substitutions out of a total of 504 (97% sequence identity with native). Seed-preferred expression of this soybean DGAT1 variant increased oil content of soybean seeds by an average of 3% (16% relative increase) in highly replicated, single-location field trials. The DGAT transgenes significantly reduced the soluble carbohydrate content of mature seeds and increased the seed protein content of some events. This study demonstrated that engineering of the native DGAT enzyme is an effective strategy to improve the oil content and value of soybeans. PMID:27208257
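The two percentages quoted for the best soybean DGAT variant can be reproduced with short arithmetic. The sketch below assumes that sequence identity is computed as unchanged positions over total length with no gaps, and that the 3% oil gain is in absolute percentage points; the implied baseline oil content is an inference from those two figures, not a value stated in the abstract. Function names are hypothetical.

```python
def percent_identity(length, substitutions):
    """Percent identity for an ungapped alignment with a given number of substitutions."""
    return 100.0 * (length - substitutions) / length

def implied_baseline(absolute_gain, relative_gain_percent):
    """Baseline value implied by an absolute gain and the matching relative gain."""
    return absolute_gain / (relative_gain_percent / 100.0)

# 14 substitutions in a 504-residue protein.
print(round(percent_identity(504, 14), 1))

# A 3-point gain described as a 16% relative increase (assumed interpretation).
print(round(implied_baseline(3.0, 16.0), 2))
```

Under these assumptions the identity works out to about 97.2%, consistent with the 97% reported, and the wild-type seed oil content would be roughly 18.75%.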
Cell density signal protein suitable for treatment of connective tissue injuries and defects
Schwarz, Richard I.
2002-08-13
Identification, isolation and partial sequencing of a cell density protein produced by fibroblastic cells. The cell density signal protein comprising a 14 amino acid peptide or a fragment, variant, mutant or analog thereof, the deduced cDNA sequence from the 14 amino acid peptide, a recombinant protein, protein and peptide-specific antibodies, and the use of the peptide and peptide-specific antibodies as therapeutic agents for regulation of cell differentiation and proliferation. A method for treatment and repair of connective tissue and tendon injuries, collagen deficiency, and connective tissue defects.
Auer, Paul L; Nalls, Mike; Meschia, James F; Worrall, Bradford B; Longstreth, W T; Seshadri, Sudha; Kooperberg, Charles; Burger, Kathleen M; Carlson, Christopher S; Carty, Cara L; Chen, Wei-Min; Cupples, L Adrienne; DeStefano, Anita L; Fornage, Myriam; Hardy, John; Hsu, Li; Jackson, Rebecca D; Jarvik, Gail P; Kim, Daniel S; Lakshminarayan, Kamakshi; Lange, Leslie A; Manichaikul, Ani; Quinlan, Aaron R; Singleton, Andrew B; Thornton, Timothy A; Nickerson, Deborah A; Peters, Ulrike; Rich, Stephen S
2015-07-01
Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk. To investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome. The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013. Discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis). 
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10(-8)) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10(-7)) with a fatty acid metabolism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke). Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E
2015-02-18
Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
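The general idea behind a k-mer strategy can be illustrated with a toy example: a read that spans a structural breakpoint contains k-mers that do not occur in the reference. The sketch below is not BreaKmer's implementation; the function names, the sequences, and the choice of k = 4 are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def novel_kmers(read, reference, k):
    """k-mers present in the read but absent from the reference."""
    return kmers(read, k) - kmers(reference, k)

# Toy data: the read matches the reference except for a deleted 'GGG' block.
reference = "AAACCCGGGTTT"
read = "AAACCCTTT"

# The k-mers that cross the deletion junction are the ones missing from the reference.
print(sorted(novel_kmers(read, reference, 4)))
```

In this toy case the three novel 4-mers all overlap the junction where 'CCC' meets 'TTT', which is the kind of signal an assembly step could then use to reconstruct the breakpoint.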
Genotype–phenotype correlations in individuals with pathogenic RERE variants
Jordan, Valerie K.; Fregeau, Brieana; Ge, Xiaoyan; Giordano, Jessica; Wapner, Ronald J.; Balci, Tugce B.; Carter, Melissa T.; Bernat, John A.; Moccia, Amanda N.; Srivastava, Anshika; Martin, Donna M.; Bielas, Stephanie L.; Pappas, John; Svoboda, Melissa D.; Rio, Marlène; Boddaert, Nathalie; Cantagrel, Vincent; Lewis, Andrea M.; Scaglia, Fernando; Kohler, Jennefer N.; Bernstein, Jonathan A.; Dries, Annika M.; Rosenfeld, Jill A.; DeFilippo, Colette; Thorson, Willa; Yang, Yaping; Sherr, Elliott H.; Bi, Weimin; Scott, Daryl A.
2018-01-01
Heterozygous variants in the arginine-glutamic acid dipeptide repeats gene (RERE) have been shown to cause neurodevelopmental disorder with or without anomalies of the brain, eye, or heart (NEDBEH). Here, we report nine individuals with NEDBEH who carry partial deletions or deleterious sequence variants in RERE. These variants were found to be de novo in all cases in which parental samples were available. An analysis of data from individuals with NEDBEH suggests that point mutations affecting the Atrophin-1 domain of RERE are associated with an increased risk of structural eye defects, congenital heart defects, renal anomalies, and sensorineural hearing loss when compared with loss-of-function variants that are likely to lead to haploinsufficiency. A high percentage of RERE pathogenic variants affect a histidine-rich region in the Atrophin-1 domain. We have also identified a recurrent two-amino-acid duplication in this region that is associated with the development of a CHARGE syndrome-like phenotype. We conclude that mutations affecting RERE result in a spectrum of clinical phenotypes. Genotype–phenotype correlations exist and can be used to guide medical decision making. Consideration should also be given to screening for RERE variants in individuals who fulfill diagnostic criteria for CHARGE syndrome but do not carry pathogenic variants in CHD7. PMID:29330883
Genotype-phenotype correlations in individuals with pathogenic RERE variants.
Jordan, Valerie K; Fregeau, Brieana; Ge, Xiaoyan; Giordano, Jessica; Wapner, Ronald J; Balci, Tugce B; Carter, Melissa T; Bernat, John A; Moccia, Amanda N; Srivastava, Anshika; Martin, Donna M; Bielas, Stephanie L; Pappas, John; Svoboda, Melissa D; Rio, Marlène; Boddaert, Nathalie; Cantagrel, Vincent; Lewis, Andrea M; Scaglia, Fernando; Kohler, Jennefer N; Bernstein, Jonathan A; Dries, Annika M; Rosenfeld, Jill A; DeFilippo, Colette; Thorson, Willa; Yang, Yaping; Sherr, Elliott H; Bi, Weimin; Scott, Daryl A
2018-05-01
Heterozygous variants in the arginine-glutamic acid dipeptide repeats gene (RERE) have been shown to cause neurodevelopmental disorder with or without anomalies of the brain, eye, or heart (NEDBEH). Here, we report nine individuals with NEDBEH who carry partial deletions or deleterious sequence variants in RERE. These variants were found to be de novo in all cases in which parental samples were available. An analysis of data from individuals with NEDBEH suggests that point mutations affecting the Atrophin-1 domain of RERE are associated with an increased risk of structural eye defects, congenital heart defects, renal anomalies, and sensorineural hearing loss when compared with loss-of-function variants that are likely to lead to haploinsufficiency. A high percentage of RERE pathogenic variants affect a histidine-rich region in the Atrophin-1 domain. We have also identified a recurrent two-amino-acid duplication in this region that is associated with the development of a CHARGE syndrome-like phenotype. We conclude that mutations affecting RERE result in a spectrum of clinical phenotypes. Genotype-phenotype correlations exist and can be used to guide medical decision making. Consideration should also be given to screening for RERE variants in individuals who fulfill diagnostic criteria for CHARGE syndrome but do not carry pathogenic variants in CHD7. © 2018 Wiley Periodicals, Inc.
On the conservative nature of intragenic recombination
Drummond, D. Allan; Silberg, Jonathan J.; Meyer, Michelle M.; Wilke, Claus O.; Arnold, Frances H.
2005-01-01
Intragenic recombination rapidly creates protein sequence diversity compared with random mutation, but little is known about the relative effects of recombination and mutation on protein function. Here, we compare recombination of the distantly related β-lactamases PSE-4 and TEM-1 to mutation of PSE-4. We show that, among β-lactamase variants containing the same number of amino acid substitutions, variants created by recombination retain function with a significantly higher probability than those generated by random mutagenesis. We present a simple model that accurately captures the differing effects of mutation and recombination in real and simulated proteins with only four parameters: (i) the amino acid sequence distance between parents, (ii) the number of substitutions, (iii) the average probability that random substitutions will preserve function, and (iv) the average probability that substitutions generated by recombination will preserve function. Our results expose a fundamental functional enrichment in regions of protein sequence space accessible by recombination and provide a framework for evaluating whether the relative rates of mutation and recombination observed in nature reflect the underlying imbalance in their effects on protein function. PMID:15809422
On the conservative nature of intragenic recombination.
Drummond, D Allan; Silberg, Jonathan J; Meyer, Michelle M; Wilke, Claus O; Arnold, Frances H
2005-04-12
Intragenic recombination rapidly creates protein sequence diversity compared with random mutation, but little is known about the relative effects of recombination and mutation on protein function. Here, we compare recombination of the distantly related beta-lactamases PSE-4 and TEM-1 to mutation of PSE-4. We show that, among beta-lactamase variants containing the same number of amino acid substitutions, variants created by recombination retain function with a significantly higher probability than those generated by random mutagenesis. We present a simple model that accurately captures the differing effects of mutation and recombination in real and simulated proteins with only four parameters: (i) the amino acid sequence distance between parents, (ii) the number of substitutions, (iii) the average probability that random substitutions will preserve function, and (iv) the average probability that substitutions generated by recombination will preserve function. Our results expose a fundamental functional enrichment in regions of protein sequence space accessible by recombination and provide a framework for evaluating whether the relative rates of mutation and recombination observed in nature reflect the underlying imbalance in their effects on protein function.
Zhang, Dong-Yan; Feng, Yan; Zhong, Shu-Ling; Lu, Yi-Yu; Zhuang, Fang-Cheng; Xu, Chang-Ping
2012-03-01
To compare the complete genome sequences of mumps epidemic strains isolated in Zhejiang province with that of the mumps vaccine strain S79. A total of 4 mumps epidemic strains isolated in Zhejiang province between 2005 and 2010, named ZJ05-1, ZJ06-3, ZJ08-1 and ZJ10-1, were selected for the study. The complete genome sequences were amplified using RT-PCR. The genetic differences between vaccine strain S79 and strains of other genotypes were compared, the genetic distance was calculated and the evolution was analyzed. The largest difference between the 4 epidemic strains and the vaccine strain S79 was found in the membrane-associated protein gene, with a mean of 42.5 ± 3.0 nucleotide differences (average variant ratio 13.6%) and 12.8 ± 1.5 amino acid differences (average variant ratio 22.4%). The smallest difference between the 4 epidemic strains and the vaccine strain was found in the stromatin genes, with a mean of 73.8 ± 2.5 nucleotide differences (average variant ratio 5.9%) and 3.0 ± 0.8 amino acid differences (average variant ratio 0.8%). The dn/ds value of the stromatin genes of the 4 epidemic strains was the highest, at 0.6526, but there was no evidence of positive selection pressure (dn/ds < 1, χ² = 0.87, P > 0.05). Mutations were found at known antigenic epitopes: the 8th amino acid of the membrane-associated protein and the 336th and 356th amino acids of the hemagglutinin/neuraminidase protein. Compared with the vaccine strain, ZJ05-1, ZJ06-3, ZJ08-1 and ZJ10-1 had 1, 1, 2 and 2 additional glycosylation sites, respectively. The complete amino acid sequences of all strains showed 17 characteristic sites in the genotype-F mumps strains.
Across the complete genome, the genetic distance between epidemic strains and vaccine strains in Zhejiang province (0.071) was significantly larger than the genetic distance between strains in Yunnan province (0.013) (t = 4.14, P < 0.05). Except for the nucleocapsid protein gene, all genes had similar phylogenetic trees. There were significant genetic differences between the mumps epidemic strains and the mumps vaccine strain in Zhejiang province.
Nucleic acids encoding antifungal polypeptides and uses thereof
Altier, Daniel J.; Ellanskaya, I. A.; Gilliam, Jacob T.; Hunter-Cevera, Jennie; Presnail, James K; Schepers, Eric; Simmons, Carl R.; Torok, Tamas; Yalpani, Nasser
2010-11-02
Compositions and methods for protecting a plant from a pathogen, particularly a fungal pathogen, are provided. Compositions include an amino acid sequence, and variants and fragments thereof, for an antipathogenic polypeptide that was isolated from a fungal fermentation broth. Nucleic acid molecules that encode the antipathogenic polypeptides of the invention, and antipathogenic domains thereof, are also provided. A method for inducing pathogen resistance in a plant using the nucleotide sequences disclosed herein is further provided. The method comprises introducing into a plant an expression cassette comprising a promoter operably linked to a nucleotide sequence that encodes an antipathogenic polypeptide of the invention. Compositions comprising an antipathogenic polypeptide or a transformed microorganism comprising a nucleic acid of the invention in combination with a carrier and methods of using these compositions to protect a plant from a pathogen are further provided. Transformed plants, plant cells, seeds, and microorganisms comprising a nucleotide sequence that encodes an antipathogenic polypeptide of the invention are also disclosed.
Premraj, Avinash; Nautiyal, Binita; Aleyas, Abi G; Rasool, Thaha Jamal
2015-10-01
Interleukin-26 (IL-26) is a member of the IL-10 family of cytokines. Though conserved across vertebrates, the IL-26 gene is functionally inactivated in a few mammals like rat, mouse and horse. We report here the identification, isolation and cloning of the cDNA of IL-26 from the dromedary camel. The camel cDNA contains a 516 bp open reading frame encoding a 171 amino acid precursor protein, including a 21 amino acid signal peptide. Sequence analysis revealed high similarity with other mammalian IL-26 homologs and the conservation of IL-10 cytokine family domain structure including key amino acid residues. We also report the identification and cloning of four novel transcript variants produced by alternative splicing at the Exon 3-Exon 4 regions of the gene. Three of the alternative splice variants had premature termination codons and are predicted to code for truncated proteins. The transcript variant 4 (Tv4), which has an insertion of an extra 120 bp in the ORF, was predicted to encode a full-length protein product with 40 extra amino acid residues. The mRNA transcripts of all the variants were identified in lymph node, whereas fewer variants were observed in other tissues like blood, liver and kidney. The expression of Tv2 and Tv3 was found to be upregulated in mitogen-induced camel peripheral blood mononuclear cells. IL-26-Tv2 expression was also induced in camel fibroblast cells infected with camelpox virus in vitro. The identification of the transcript variants of IL-26 from the dromedary camel is the first report of alternative splicing for IL-26 in a species in which the gene has not been inactivated. Copyright © 2015 Elsevier Ltd. All rights reserved.
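The lengths quoted in the camel IL-26 abstract are mutually consistent, which the short check below illustrates. It is a sketch under one stated assumption: that the 516 bp open reading frame is counted including its stop codon. The function name is hypothetical, and the mature-protein length is derived here, not reported in the abstract.

```python
def orf_protein_length(orf_bp, has_stop_codon=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumes the ORF is in frame; if has_stop_codon is True, the final
    codon is a stop codon and contributes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if has_stop_codon else codons


precursor_aa = orf_protein_length(516)  # 172 codons, one of them the stop codon
mature_aa = precursor_aa - 21           # after removing the 21-residue signal peptide (derived value)
tv4_extra_aa = 120 // 3                 # in-frame 120 bp insertion in Tv4
```

Because 120 is a multiple of 3, the Tv4 insertion keeps the reading frame, which fits the prediction of a full-length product with 40 extra residues.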
Racsa, Lori D; Luu, Hung S; Park, Jason Y; Mitui, Midori; Timmons, Charles F
2014-06-01
Hemoglobin (Hb) Austin was defined in 1977, using amino acid sequencing of samples from 3 unrelated Mexican-Americans, as a substitution of serine for arginine at position 40 of the β-globin chain (Arg40Ser). Its electrophoretic migration on both cellulose acetate (pH 8.4) and citrate agar (pH 6.2) was reported between Hb F and Hb A, and this description persists in reference literature. Objective: To review the clinical features and redefine the diagnostic characteristics of Hb Austin. Eight samples from 6 unrelated individuals and 2 siblings, all with Hispanic surnames, were submitted for abnormal Hb identification between June 2010 and September 2011. High-performance liquid chromatography, isoelectric focusing (IEF), citrate agar electrophoresis, and bidirectional DNA sequencing of the entire β-globin gene were performed. DNA sequencing confirmed all 8 individuals to be heterozygous for Hb Austin (Arg40Ser). Retention time on high-performance liquid chromatography and migration on citrate agar electrophoresis were consistent with that identification. Migration on IEF, however, was not between Hb F and Hb A, as predicted from the report of cellulose acetate electrophoresis. By IEF, Hb Austin migrated anodal to ("faster than") Hb A. Hemoglobin Austin (Arg40Ser) appears on IEF as a "fast," anodally migrating, Hb variant, just as would be expected from its amino acid substitution. The cited historic report is, at best, not applicable to IEF and is probably erroneous. Our observation of 8 cases in 16 months suggests that this variant may be relatively common in some Hispanic populations, making its recognition important. Furthermore, gene sequencing is proving itself a powerful and reliable tool for definitive identification of Hb variants.
Graziano, Claudio; Wischmeijer, Anita; Pippucci, Tommaso; Fusco, Carlo; Diquigiovanni, Chiara; Nõukas, Margit; Sauk, Martin; Kurg, Ants; Rivieri, Francesca; Blau, Nenad; Hoffmann, Georg F; Chaubey, Alka; Schwartz, Charles E; Romeo, Giovanni; Bonora, Elena; Garavelli, Livia; Seri, Marco
2015-04-01
The causative variant in a consanguineous family in which the three patients (two siblings and a cousin) presented with intellectual disability, Marfanoid habitus, craniofacial dysmorphisms, chronic diarrhea and progressive kyphoscoliosis, has been identified through whole exome sequencing (WES) analysis. WES study identified a homozygous DDC variant in the patients, c.1123C>T, resulting in p.Arg375Cys missense substitution. Mutations in DDC cause a recessive metabolic disorder (aromatic amino acid decarboxylase, AADC, deficiency, OMIM #608643) characterized by hypotonia, oculogyric crises, excessive sweating, temperature instability, dystonia, severe neurologic dysfunction in infancy, and specific abnormalities of neurotransmitters and their metabolites in the cerebrospinal fluid (CSF). In our family, analysis of neurotransmitters and their metabolites in patient's CSF shows a pattern compatible with AADC deficiency, although the clinical signs are different from the classic form. Our work expands the phenotypic spectrum associated with DDC variants, which therefore can cause an additional novel syndrome without typical movement abnormalities. Copyright © 2015 Elsevier B.V. All rights reserved.
X-Linked Glomerulopathy Due to COL4A5 Founder Variant.
Barua, Moumita; John, Rohan; Stella, Lorenzo; Li, Weili; Roslin, Nicole M; Sharif, Bedra; Hack, Saidah; Lajoie-Starkell, Ginette; Schwaderer, Andrew L; Becknell, Brian; Wuttke, Matthias; Köttgen, Anna; Cattran, Daniel; Paterson, Andrew D; Pei, York
2018-03-01
Alport syndrome is a rare hereditary disorder caused by rare variants in 1 of 3 genes encoding for type IV collagen. Rare variants in COL4A5 on chromosome Xq22 cause X-linked Alport syndrome, which accounts for ∼80% of the cases. Alport syndrome has a variable clinical presentation, including progressive kidney failure, hearing loss, and ocular defects. Exome sequencing performed in 2 affected related males with an undefined X-linked glomerulopathy characterized by global and segmental glomerulosclerosis, mesangial hypercellularity, and vague basement membrane immune complex deposition revealed a COL4A5 sequence variant, a substitution of a thymine by a guanine at nucleotide 665 (c.T665G; rs281874761) of the coding DNA predicted to lead to a cysteine to phenylalanine substitution at amino acid 222, which was not seen in databases cataloguing natural human genetic variation, including dbSNP138, 1000 Genomes Project release version 01-11-2004, Exome Sequencing Project 21-06-2014, or ExAC 01-11-2014. Review of the literature identified 2 additional families with the same COL4A5 variant leading to similar atypical histopathologic features, suggesting a unique pathologic mechanism initiated by this specific rare variant. Homology modeling suggests that the substitution alters the structural and dynamic properties of the type IV collagen trimer. Genetic analysis comparing members of the 3 families indicated a distant relationship with a shared haplotype, implying a founder effect. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
Roesler, Keith; Shen, Bo; Bermudez, Ericka; Li, Changjiang; Hunt, Joanne; Damude, Howard G; Ripp, Kevin G; Everard, John D; Booth, John R; Castaneda, Leandro; Feng, Lizhi; Meyer, Knut
2016-06-01
Kinetically improved diacylglycerol acyltransferase (DGAT) variants were created to favorably alter carbon partitioning in soybean (Glycine max) seeds. Initially, variants of a type 1 DGAT from a high-oil, high-oleic acid plant seed, Corylus americana, were screened for high oil content in Saccharomyces cerevisiae. Nearly all DGAT variants examined from high-oil strains had increased affinity for oleoyl-CoA, with S0.5 values decreased as much as 4.7-fold compared with the wild-type value of 0.94 µM. Improved soybean DGAT variants were then designed to include amino acid substitutions observed in promising C. americana DGAT variants. The expression of soybean and C. americana DGAT variants in soybean somatic embryos resulted in oil contents as high as 10% and 12%, respectively, compared with only 5% and 7.6% oil achieved by overexpressing the corresponding wild-type DGATs. The affinity for oleoyl-CoA correlated strongly with oil content. The soybean DGAT variant that gave the greatest oil increase contained 14 amino acid substitutions out of a total of 504 (97% sequence identity with native). Seed-preferred expression of this soybean DGAT1 variant increased oil content of soybean seeds by an average of 3% (16% relative increase) in highly replicated, single-location field trials. The DGAT transgenes significantly reduced the soluble carbohydrate content of mature seeds and increased the seed protein content of some events. This study demonstrated that engineering of the native DGAT enzyme is an effective strategy to improve the oil content and value of soybeans. © 2016 American Society of Plant Biologists. All Rights Reserved.
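The "97% sequence identity" quoted for the soybean DGAT variant follows directly from 14 substitutions in 504 residues, as the small calculation below shows. It is a sketch that assumes identity is computed over the full length with substitutions only (no insertions or deletions); the function name is hypothetical. The baseline oil content on the last line is an inference from the quoted absolute and relative increases, not a figure given in the abstract.

```python
def percent_identity(length, substitutions):
    """Percent identity of two equal-length sequences differing by substitutions only."""
    return 100.0 * (length - substitutions) / length


identity = percent_identity(504, 14)  # 490 identical residues out of 504

# An absolute gain of 3 percentage points described as a 16% relative increase
# implies a baseline oil content of roughly 3 / 0.16 percent (inferred, not reported).
implied_baseline_oil = 3.0 / 0.16
```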
Rozman, Vita; Kunej, Tanja
2018-05-10
Harnessing the genomics big data requires innovation in how we extract and interpret biologically relevant variants. Currently, there is no established catalog of prioritized missense variants associated with deleterious protein function phenotypes. We report in this study, to the best of our knowledge, the first genome-wide prioritization of sequence variants with the most deleterious effect on protein function (potentially deleterious variants [pDelVars]) in nine vertebrate species: human, cattle, horse, sheep, pig, dog, rat, mouse, and zebrafish. The analysis was conducted using the Ensembl/BioMart tool. Genes comprising pDelVars in the highest number of examined species were identified using a Python script. Multiple genomic alignments of the selected genes were built to identify interspecies orthologous potentially deleterious variants, which we defined as the "ortho-pDelVars." Genome-wide prioritization revealed that in humans, 0.12% of the known variants are predicted to be deleterious. In seven out of nine examined vertebrate species, the genes encoding the multiple PDZ domain crumbs cell polarity complex component (MPDZ) and the transforming acidic coiled-coil containing protein 2 (TACC2) comprise pDelVars. Five interspecies ortho-pDelVars were identified in three genes. These findings offer new ways to harness genomics big data by facilitating the identification of functional polymorphisms in humans and animal models and thus provide a future basis for optimization of protocols for whole genome prioritization of pDelVars and screening of orthologous sequence variants. The approach presented here can inform various postgenomic applications such as personalized medicine and multiomics study of health interventions (iatromics).
PVRL1 Variants Contribute to Non-Syndromic Cleft Lip and Palate in Multiple Populations
Avila, Joseph R.; Jezewski, Peter A.; Vieira, Alexandre R.; Orioli, Iêda M.; Castilla, Eduardo E.; Christensen, Kaare; Daack-Hirsch, Sandra; Romitti, Paul A.; Murray, Jeffrey C.
2007-01-01
Poliovirus Receptor Like-1 (PVRL1) is a member of the immunoglobulin super family that acts in the initiation and maintenance of epithelial adherens junctions and is mutated in the cleft lip and palate/ectodermal dysplasia 1 syndrome (CLPED1, OMIM #225000). In addition, a common non-sense mutation in PVRL1 was discovered more often among non-syndromic sporadic clefting cases in Northern Venezuela in a previous case-control study. The present work sought to ascertain the role of PVRL1 in the sporadic forms of orofacial clefting in multiple populations. Multiple rare and common variants from all three splice isoforms were initially ascertained by sequencing 92 Iowan and 86 Filipino cases and CEPH controls. Using a family-based analysis to examine these variants, the common glycine allele of the G361V coding variant was significantly overtransmitted among all orofacial clefting phenotypes (P = 0.005). This represented G361V genotyping from over 800 Iowan, Danish, and Filipino families. Among four rare amino acid changes found within the V1 and C1 domains, S112T and T131A were found adjacent to critical amino acid positions within the V1 variable domain, regions previously shown to mediate cell-to-cell and cell-to-virus adhesion. The T131A variant was not found in over 1,300 non-affected control samples although the alanine is found in other species. The serine of the S112T variant position is conserved across all known PVRL1 sequences. Together these data suggest that both rare and common mutations within PVRL1 make a minor contribution to disrupting the initiation and regulation of cell-to-cell adhesion and downstream morphogenesis of the embryonic face. PMID:17089422
1990-08-15
the same sequences and chiralities of the amino acids as reported earlier in other microcystins. All contain two variant amino acids in the L... Williams, D. H.; Santikarn, S.; Smith, R. J.; Hammond, S. J., Chem. Soc. Perkin Tran., 1984, 2311. 2. Marfey, P. Carlsberg Res. Commun., 1984, 49, 591
Edrees, Burhan M; Athar, Mohammad; Abduljaleel, Zainularifeen; Al-Allaf, Faisal A; Taher, Mohiuddin M; Khan, Wajahatullah; Bouazzaoui, Abdellatif; Al-Harbi, Naffaa; Safar, Ramzia; Al-Edressi, Howaida; Alansary, Khawala; Anazi, Abulkareem; Altayeb, Naji; Ahmed, Muawia A
2016-12-01
Targeted, customized sequencing of genes implicated in the autosomal recessive polycystic kidney disease (ARPKD) phenotype was performed to identify candidate variants using Ion Torrent PGM next-generation sequencing. The results identified four potentially pathogenic variants in the PKHD1 gene [c.4870C>T, p.(Arg1624Trp), c.5725C>T, p.(Arg1909Trp), c.1736C>T, p.(Thr579Met) and c.10628T>G, p.(Leu3543Trp)] among 12 out of 18 samples. One variant, c.4870C>T, p.(Arg1624Trp), was common among eight patients. Some patient samples also showed a few variants in the autosomal dominant polycystic kidney disease (ADPKD) disease-causing genes PKD1 and PKD2, such as c.12433G>A, p.(Val4145Ile) and c.1445T>G, p.(Phe482Cys), respectively. All causative variants were validated by capillary sequencing, which confirmed the presence of a novel homozygous variant c.10628T>G, p.(Leu3543Trp) in a male proband. We have recently published the results of these studies (Edrees et al., 2016). Here we report for the first time the effect of the common mutation p.(Arg1624Trp), found in eight samples, on the structure and function of the PKHD1 protein, using molecular dynamics simulations. The computational approaches provide a tool to predict the phenotypic effect of a variant on the structure and function of the altered protein. The native and mutant modeled proteins carrying the common mutation p.(Arg1624Trp) were also analyzed for solvent accessibility, secondary structure and stabilizing residues to compare the stability of the wild-type and mutant forms. Furthermore, comparative genomics and evolutionary analyses of variants observed in the PKHD1, PKD1, and PKD2 genes were also performed in some mammalian species including human to understand the complexity of genomes among closely related mammalian species.
Taken together, the results revealed that the evolutionary comparative analyses and characterization of PKHD1, PKD1, and PKD2 genes among various related and unrelated mammalian species will provide important insights into their evolutionary process and understanding for further disease characterization and management.
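The coding-DNA and protein descriptions of the variants listed in the polycystic kidney disease abstract can be cross-checked with simple arithmetic: each coding position should fall in the codon named in the protein change. The sketch below assumes standard coding-DNA numbering, in which position 1 is the first base of the start codon; the function name and the dictionary layout are hypothetical and chosen for illustration.

```python
def codon_of(cds_position):
    """Map a 1-based coding-DNA position to (codon number, position within codon)."""
    codon = (cds_position - 1) // 3 + 1
    offset = (cds_position - 1) % 3 + 1
    return codon, offset


# Coding position -> residue number, both taken from the abstract.
reported = {
    4870: 1624,   # PKHD1 c.4870C>T, p.(Arg1624Trp)
    5725: 1909,   # PKHD1 c.5725C>T, p.(Arg1909Trp)
    1736: 579,    # PKHD1 c.1736C>T, p.(Thr579Met)
    10628: 3543,  # PKHD1 c.10628T>G, p.(Leu3543Trp)
    12433: 4145,  # PKD1 c.12433G>A, p.(Val4145Ile)
    1445: 482,    # PKD2 c.1445T>G, p.(Phe482Cys)
}
computed = {pos: codon_of(pos) for pos in reported}
all_consistent = all(computed[pos][0] == residue for pos, residue in reported.items())
```

The within-codon position is also informative. For example, c.4870 is the first base of codon 1624, and a C to T change at the first base of an arginine codon such as CGG gives TGG (tryptophan), in line with the reported p.(Arg1624Trp).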
A splice variant in the ACSL5 gene relates migraine with fatty acid activation in mitochondria
Matesanz, Fuencisla; Fedetz, María; Barrionuevo, Cristina; Karaky, Mohamad; Catalá-Rabasa, Antonio; Potenciano, Victor; Bello-Morales, Raquel; López-Guerrero, Jose-Antonio; Alcina, Antonio
2016-01-01
Genome-wide association studies (GWAS) in migraine are providing the molecular basis of this heterogeneous disease, but the understanding of its aetiology is still incomplete. Although some biomarkers have currently been accepted for migraine, a large number of studies to identify new ones is still needed. The migraine-associated variant rs12355831:A>G (P = 2 × 10^-6), described in a GWAS of the International Headache Genetic Consortium, is localized in a non-coding sequence with unknown function. We sought to identify the causal variant and the genetic mechanism involved in the migraine risk. To this end, we integrated data of RNA sequences from the Genetic European Variation in Health and Disease (GEUVADIS) and genotypes from 1000 GENOMES of 344 lymphoblastoid cell lines (LCLs), to determine the expression quantitative trait loci (eQTLs) in the region. We found that the migraine-associated variant belongs to a linkage disequilibrium block associated with the expression of an acyl-coenzyme A synthetase 5 (ACSL5) transcript lacking exon 20 (ACSL5-Δ20). We showed by exon-skipping assay a direct causality of rs2256368-G in the exon 20 skipping of approximately 20 to 40% of ACSL5 RNA molecules. In conclusion, we identified the functional variant (rs2256368:A>G) affecting ACSL5 exon 20 skipping, as a causal factor linked to the migraine-associated rs12355831:A>G, suggesting that the activation of long-chain fatty acids by the spliced ACSL5-Δ20 molecules, a mitochondrially located enzyme, is involved in migraine pathology. PMID:27189022
Skoczinski, Pia; Volkenborn, Kristina; Fulton, Alexander; Bhadauriya, Anuseema; Nutschel, Christina; Gohlke, Holger; Knapp, Andreas; Jaeger, Karl-Erich
2017-09-25
Bacillus subtilis produces and secretes proteins in amounts of up to 20 g/l under optimal conditions. However, protein production can be challenging if transcription and cotranslational secretion are negatively affected, or the target protein is degraded by extracellular proteases. This study aims at elucidating the influence of a target protein on its own production by a systematic mutational analysis of the homologous B. subtilis model protein lipase A (LipA). We have covered the full natural diversity of single amino acid substitutions at 155 positions of LipA by site saturation mutagenesis excluding only highly conserved residues and qualitatively and quantitatively screened about 30,000 clones for extracellular LipA production. Identified variants with beneficial effects on production were sequenced and analyzed regarding B. subtilis growth behavior, extracellular lipase activity and amount as well as changes in lipase transcript levels. In total, 26 LipA variants were identified showing an up to twofold increase in either amount or activity of extracellular lipase. These variants harbor single amino acid or codon substitutions that did not substantially affect B. subtilis growth. Subsequent exemplary combination of beneficial single amino acid substitutions revealed an additive effect solely at the level of extracellular lipase amount; however, lipase amount and activity could not be increased simultaneously. Single amino acid and codon substitutions can affect LipA secretion and production by B. subtilis. Several codon-related effects were observed that either enhance lipA transcription or promote a more efficient folding of LipA. Single amino acid substitutions could improve LipA production by increasing its secretion or stability in the culture supernatant. Our findings indicate that optimization of the expression system is not sufficient for efficient protein production in B. subtilis. 
The sequence of the target protein should also be considered as an optimization target for successful protein production. Our results further suggest that variants with improved properties might be identified much faster and easier if mutagenesis is prioritized towards elements that contribute to enzymatic activity or structural integrity.
Yao, Yongxiu; Mingay, Louise J.; McCauley, John W.; Barclay, Wendy S.
2001-01-01
Reverse genetics was used to analyze the host range of two avian influenza viruses which differ in their ability to replicate in mouse and human cells in culture. Engineered viruses carrying sequences encoding amino acids 362 to 581 of PB2 from a host range variant productively infect mouse and human cells. PMID:11333926
Li, Jonathan Z; Chapman, Brad; Charlebois, Patrick; Hofmann, Oliver; Weiner, Brian; Porter, Alyssa J; Samuel, Reshmi; Vardhanabhuti, Saran; Zheng, Lu; Eron, Joseph; Taiwo, Babafemi; Zody, Michael C; Henn, Matthew R; Kuritzkes, Daniel R; Hide, Winston; Wilson, Cara C; Berzins, Baiba I; Acosta, Edward P; Bastow, Barbara; Kim, Peter S; Read, Sarah W; Janik, Jennifer; Meres, Debra S; Lederman, Michael M; Mong-Kryspin, Lori; Shaw, Karl E; Zimmerman, Louis G; Leavitt, Randi; De La Rosa, Guy; Jennings, Amy
2014-01-01
The impact of raltegravir-resistant HIV-1 minority variants (MVs) on raltegravir treatment failure is unknown. Illumina sequencing offers greater throughput than 454, but sequence analysis tools for viral sequencing are needed. We evaluated Illumina and 454 for the detection of HIV-1 raltegravir-resistant MVs. A5262 was a single-arm study of raltegravir and darunavir/ritonavir in treatment-naïve patients. Pre-treatment plasma was obtained from 5 participants with raltegravir resistance at the time of virologic failure. A control library was created by pooling integrase clones at predefined proportions. Multiplexed sequencing was performed with Illumina and 454 platforms at comparable costs. Illumina sequence analysis was performed with the novel snp-assess tool and 454 sequencing was analyzed with V-Phaser. Illumina sequencing resulted in significantly higher sequence coverage and a 0.095% limit of detection. Illumina accurately detected all MVs in the control library at ≥0.5% and 7/10 MVs expected at 0.1%. 454 sequencing failed to detect any MVs at 0.1% with 5 false positive calls. For MVs detected in the patient samples by both 454 and Illumina, the correlation in the detected variant frequencies was high (R2 = 0.92, P<0.001). Illumina sequencing detected 2.4-fold greater nucleotide MVs and 2.9-fold greater amino acid MVs compared to 454. The only raltegravir-resistant MV detected was an E138K mutation in one participant by Illumina sequencing, but not by 454. In participants of A5262 with raltegravir resistance at virologic failure, baseline raltegravir-resistant MVs were rarely detected. At comparable costs to 454 sequencing, Illumina demonstrated greater depth of coverage, increased sensitivity for detecting HIV MVs, and fewer false positive variant calls.
Investigating intermolecular forces associated with thrombus initiation using optical tweezers
NASA Astrophysics Data System (ADS)
Arya, Maneesh; Lopez, Jose A.; Romo, Gabriel M.; Dong, Jing-Fei; McIntire, Larry V.; Moake, Joel L.; Anvari, Bahman
2002-05-01
Thrombus formation occurs when a platelet membrane receptor, glycoprotein (GP) Ib-IX-V complex, binds to its ligand, von Willebrand factor (vWf), in the subendothelium or plasma. To determine which GP Ib-IX-V amino acid sequences are critical for bond formation, we have used optical tweezers to measure forces involved in the binding of vWf to GP Ib-IX-V variants. Inasmuch as GP Ib(alpha) subunit is the primary component in human GP Ib-IX-V complex that binds to vWf, and that canine GP Ib(alpha), on the other hand, does not bind to human vWf, we progressively replaced human GP Ib(alpha) amino acid sequences with canine GP Ib(alpha) sequences to determine the sequences essential for vWf/GP Ib(alpha) binding. After measuring the adhesive forces between optically trapped, vWf-coated beads and GP Ib(alpha) variants expressed on mammalian cells, we determined that leucine-rich repeat 2 of GP Ib(alpha) was necessary for vWf/GP Ib-IX-V bond formation. We also found that deletion of the N-terminal flanking sequence and leucine-rich repeat 1 reduced adhesion strength to vWf but did not abolish binding. While divalent cations are known to influence binding of vWf, addition of 1 mM CaCl2 had no effect on measured vWf/GP Ib(alpha) bond strengths.
Breitfeld, Jana; Martens, Susanne; Klammt, Jürgen; Schlicke, Marina; Pfäffle, Roland; Krause, Kerstin; Weidle, Kerstin; Schleinitz, Dorit; Stumvoll, Michael; Führer, Dagmar; Kovacs, Peter; Tönjes, Anke
2013-12-01
The complex process of development of the pituitary gland is regulated by a number of signalling molecules and transcription factors. Mutations in these factors have been identified in rare cases of congenital hypopituitarism but for most subjects with combined pituitary hormone deficiency (CPHD) genetic causes are unknown. Bone morphogenetic proteins (BMPs) affect induction and growth of the pituitary primordium and thus represent plausible candidates for mutational screening of patients with CPHD. We sequenced BMP2, 4 and 7 in 19 subjects with CPHD. For validation purposes, novel genetic variants were genotyped in 1046 healthy subjects. Additionally, potential functional relevance for most promising variants has been assessed by phylogenetic analyses and prediction of effects on protein structure. Sequencing revealed two novel variants and confirmed 30 previously known polymorphisms and mutations in BMP2, 4 and 7. Although phylogenetic analyses indicated that these variants map within strongly conserved gene regions, there was no direct support for their impact on protein structure when applying predictive bioinformatics tools. A mutation in the BMP4 coding region resulting in an amino acid exchange (p.Arg300Pro) appeared most interesting among the identified variants. Further functional analyses are required to ultimately map the relevance of these novel variants in CPHD.
2013-01-01
Background The complex process of development of the pituitary gland is regulated by a number of signalling molecules and transcription factors. Mutations in these factors have been identified in rare cases of congenital hypopituitarism but for most subjects with combined pituitary hormone deficiency (CPHD) genetic causes are unknown. Bone morphogenetic proteins (BMPs) affect induction and growth of the pituitary primordium and thus represent plausible candidates for mutational screening of patients with CPHD. Methods We sequenced BMP2, 4 and 7 in 19 subjects with CPHD. For validation purposes, novel genetic variants were genotyped in 1046 healthy subjects. Additionally, potential functional relevance for most promising variants has been assessed by phylogenetic analyses and prediction of effects on protein structure. Results Sequencing revealed two novel variants and confirmed 30 previously known polymorphisms and mutations in BMP2, 4 and 7. Although phylogenetic analyses indicated that these variants map within strongly conserved gene regions, there was no direct support for their impact on protein structure when applying predictive bioinformatics tools. Conclusions A mutation in the BMP4 coding region resulting in an amino acid exchange (p.Arg300Pro) appeared most interesting among the identified variants. Further functional analyses are required to ultimately map the relevance of these novel variants in CPHD. PMID:24289245
Site-specific photoconjugation of antibodies using chemically synthesized IgG-binding domains.
Perols, Anna; Karlström, Amelie Eriksson
2014-03-19
Site-specific labeling of antibodies can be performed using the immunoglobulin-binding Z domain, derived from staphylococcal protein A (SpA), which has a well-characterized binding site in the Fc region of antibodies. By introducing a photoactivable probe in the Z domain, a covalent bond can be formed between the Z domain and the antibody by irradiation with UV light. The aim of this study was to improve the conjugation yield for labeling of different subclasses of IgG having different sequence composition, using a photoactivated Z domain variant. Four different variants of the Z domain (Z5BPA, Z5BBA, Z32BPA, and Z32BBA) were synthesized to investigate the influence of the position of the photoactivable probe and the presence of a flexible linker between the probe and the protein. For two of the variants, the photoreactive benzophenone group was introduced as part of an amino acid side chain by incorporation of the unnatural amino acid benzoylphenylalanine (BPA) during peptide synthesis. For the other two variants, the photoreactive benzophenone group was attached via a flexible linker by coupling of benzoylbenzoic acid (BBA) to the ε-amino group of a selectively deprotected lysine residue. Photoconjugation experiments using human IgG1, mouse IgG1, and mouse IgG2A demonstrated efficient conjugation for all antibodies. It was shown that differences in linker length had a large impact on the conjugation efficiency for labeling of mouse IgG1, whereas the positioning of the photoactivable probe in the sequence of the protein had a larger effect for mouse IgG2A. Conjugation to human IgG1 was only to a minor extent affected by position or linker length. For each subclass of antibody, the best variant tested using a standard conjugation protocol resulted in conjugation efficiencies of 41-66%, which corresponds to on average approximately one Z domain attached to each antibody. 
As a combination of the two best performing variants, Z5BBA and Z32BPA, a Z domain variant with two photoactivable probes (Z5BBA32BPA) was also synthesized with the aim of targeting a wider panel of antibody subclasses and species. This new reagent could efficiently couple to all antibody subclasses that were targeted by the single benzophenone-labeled Z domain variants, with conjugation efficiencies of 26-41%.
Continuously Tunable Nucleic Acid Hybridization Probes
Wu, Lucia R.; Wang, J. Sherry; Fang, John Z.; Reiser, Emily; Pinto, Alessandro; Pekker, Irena; Boykin, Richard; Ngouenet, Celine; Webster, Philippa J.; Beechem, Joseph; Zhang, David Yu
2015-01-01
In silico designed nucleic acid probes and primers often fail to achieve favorable specificity and sensitivity tradeoffs on the first try, and iterative empirical sequence-based optimization is needed, particularly in multiplexed assays. Here, we present a novel, on-the-fly method of tuning probe affinity and selectivity via the stoichiometry of auxiliary species, allowing independent and decoupled adjustment of hybridization yield for different probes in multiplexed assays. Using this method, we achieve near-continuous tuning of probe effective free energy (0.03 kcal·mol−1 granularity). As applications, we enforced uniform capture efficiency of 31 DNA molecules (GC content 0% – 100%), maximized signal difference for 11 pairs of single nucleotide variants, and performed tunable hybrid-capture of mRNA from total RNA. Using the Nanostring nCounter platform, we applied stoichiometric tuning to simultaneously adjust yields for a 24-plex assay, and we show multiplexed quantitation of RNA sequences and variants from formalin-fixed, paraffin-embedded samples (FFPE). PMID:26480474
Higgins, Chelsea D; Malashkevich, Vladimir N; Almo, Steven C; Lai, Jonathan R
2014-09-01
The coiled-coil is one of the most common protein structural motifs. Amino acid sequences of regions that participate in coiled-coils contain a heptad repeat in which every third then fourth residue is occupied by a hydrophobic residue. Here we examine the consequences of a "stutter," a deviation from the idealized heptad repeat that is found in the central coiled-coil of influenza hemagglutinin HA2. Characterization of a peptide containing the native stutter-containing HA2 sequence, as well as several variants in which the stutter was engineered out to restore an idealized heptad repeat pattern, revealed that the stutter is important for allowing coiled-coil formation in the WT HA2 at both neutral and low pH (7.1 and 4.5). By contrast, all variants that contained idealized heptad repeats exhibited marked pH-dependent coiled-coil formation with structures forming much more stably at low pH. A crystal structure of one variant containing an idealized heptad repeat, and comparison to the WT HA2 structure, suggest that the stutter distorts the optimal interhelical core packing arrangement, resulting in unwinding of the coiled-coil superhelix. Interactions between acidic side chains, in particular E69 and E74 (present in all peptides studied), are suggested to play a role in mediating these pH-dependent conformational effects. This conclusion is partially supported by studies on HA2 variant peptides in which these positions were altered to aspartic acid. These results provide new insight into the structural role of the heptad repeat stutter in HA2. © 2014 Wiley Periodicals, Inc.
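The heptad pattern described in the abstract above (hydrophobic residues at every third, then fourth position, conventionally labelled positions a and d of the repeat abcdefg) can be illustrated with a small sketch. The function names, the hydrophobic residue set and the example peptide are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of an idealized heptad repeat (positions a-g).
# Assumption: the residue set below is one common, simplified
# choice of "hydrophobic" amino acids, not the authors' definition.
HYDROPHOBIC = set("AILMFVWY")
HEPTAD = "abcdefg"


def heptad_register(length, start="a"):
    """Label each sequence position with its heptad letter."""
    offset = HEPTAD.index(start)
    return [HEPTAD[(offset + i) % 7] for i in range(length)]


def core_positions(length, start="a"):
    """0-based indices that fall on heptad positions a and d."""
    labels = heptad_register(length, start)
    return [i for i, label in enumerate(labels) if label in "ad"]


def core_hydrophobic_fraction(seq, start="a"):
    """Fraction of a/d positions occupied by a hydrophobic residue."""
    idx = core_positions(len(seq), start)
    if not idx:
        return 0.0
    return sum(seq[i] in HYDROPHOBIC for i in idx) / len(idx)


if __name__ == "__main__":
    # Hypothetical 14-residue peptide, made up for illustration only.
    peptide = "LKKLEEKLEELKKE"
    print(core_positions(len(peptide)))
    print(core_hydrophobic_fraction(peptide))
```

The spacing between successive core indices alternates 3, 4, 3, which is the "every third then fourth residue" rule. A stutter would show up as a break in this alternation; detecting one is outside the scope of this sketch.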
Zheng, Ling; Shockey, Jay; Guo, Feng; Shi, Lingmin; Li, Xinguo; Shan, Lei; Wan, Shubo; Peng, Zhenying
2017-12-01
Triacylglycerols (TAGs) are the most important energy storage form in oilseed crops. Diacylglycerol acyltransferase (DGAT) catalyzes the rate-limiting step of the Kennedy pathway of TAG biosynthesis. To date, little is known about the regulation of DGAT activity in peanut (Arachis hypogaea), an agronomically important oilseed crop that is cultivated in many parts of the world. In this study, seven distinct forms of type 1 DGAT (AhDGAT1.1-AhDGAT1.7) were identified, cloned, and characterized. Comparisons of the nucleotide sequences and gene structures revealed many different splicing variants of AhDGAT1, some of which displayed different organ-specific expression patterns. A representative gene (AhDGAT1.1) was transformed into wild-type tobacco and was shown to increase seed fatty acid (FA) content by 14.7%-20.9%. All seven AhDGAT1s were expressed in TAG-deficient Saccharomyces cerevisiae strain H1246; the five longest AhDGAT1 variants generated high levels of acyltransferase activity and complemented the free fatty acid lethality phenotype in this strain. The alternative splicing that gives rise to AhDGAT1.2 and AhDGAT1.4 creates predicted protein C-terminal truncations. The proteins encoded by these two variants were not active and did not complement the fatty acid sensitivity in H1246. These results were verified by visualization of intracellular lipid droplets using Nile Red staining. Collectively, the results presented here represent the first comprehensive analysis of the peanut DGAT1 gene family, which, unlike in other published plant DGAT1 sequences, shows widespread alternative splicing that may affect the expression patterns and enzyme activities of some members of the gene family. Copyright © 2017. Published by Elsevier GmbH.
Lunova, Mariia; Guldiken, Nurdan; Lienau, Tim C.; Stickel, Felix; Omary, M. Bishr
2012-01-01
Background Keratins 8 and 18 (K8/K18) are intermediate filament proteins that protect the liver from various forms of injury. Exonic K8/K18 variants associate with adverse outcome in acute liver failure and with liver fibrosis progression in patients with chronic hepatitis C infection or primary biliary cirrhosis. Given the association of K8/K18 variants with end-stage liver disease and progression in several chronic liver disorders, we studied the importance of keratin variants in patients with hemochromatosis. Methods The entire K8/K18 exonic regions were analyzed in 162 hemochromatosis patients carrying homozygous C282Y HFE (hemochromatosis gene) mutations. 234 liver-healthy subjects were used as controls. Exonic regions were PCR-amplified and analyzed using denaturing high-performance liquid chromatography and DNA sequencing. Previously-generated transgenic mice overexpressing K8 G62C were studied for their susceptibility to iron overload. Susceptibility to iron toxicity of primary hepatocytes that express K8 wild-type and G62C was also assessed. Results We identified amino-acid-altering keratin heterozygous variants in 10 of 162 hemochromatosis patients (6.2%) and non-coding heterozygous variants in 6 additional patients (3.7%). Two novel K8 variants (Q169E/R275W) were found. K8 R341H was the most common amino-acid altering variant (4 patients), and exclusively associated with an intronic KRT8 IVS7+10delC deletion. Intronic, but not amino-acid-altering variants associated with the development of liver fibrosis. In mice, or ex vivo, the K8 G62C variant did not affect iron-accumulation in response to iron-rich diet or the extent of iron-induced hepatocellular injury. Conclusion In patients with hemochromatosis, intronic but not exonic K8/K18 variants associate with liver fibrosis development. PMID:22412904
Cloning and purification of alpha-neurotoxins from king cobra (Ophiophagus hannah).
He, Ying-Ying; Lee, Wei-Hui; Zhang, Yun
2004-09-01
Thirteen complete and three partial cDNA sequences were cloned from the constructed king cobra (Ophiophagus hannah) venom gland cDNA library. Phylogenetic analysis of nucleotide sequences of king cobra with those from other snake venoms revealed that obtained cDNAs are highly homologous to snake venom alpha-neurotoxins. Alignment of deduced mature peptide sequences of the obtained clones with those of other reported alpha-neurotoxins from the king cobra venom indicates that our obtained 16 clones belong to long-chain neurotoxins (seven), short-chain neurotoxins (seven), weak toxin (one) and variant (one), respectively. Up to now, two out of 16 newly cloned king cobra alpha-neurotoxins have identical amino acid sequences with CM-11 and Oh-6A/6B, which have been characterized from the same venom. Furthermore, five long-chain alpha-neurotoxins and two short-chain alpha-neurotoxins were purified from crude venom and their N-terminal amino acid sequences were determined. The cDNAs encoding the putative precursors of the purified native peptide were also determined based on the N-terminal amino acid sequencing. The purified alpha-neurotoxins showed different lethal activities on mice.
2014-01-01
Background Fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code respectively for the enzymes delta-5 and delta-6 desaturases which are rate limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and -6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in FADS1 and 2 genes with plasma and tissue concentrations of omega-3 and -6 FAs. The aim of this study was to evaluate the extent of sequence variations within these two genes in Canadian Holstein cows as well as the association between sequence variants and health promoting FAs in milk. Results Thirty-three SNPs were detected within the studied regions of genes including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3′UTR SNP (FADS2-23, rs109772589), and another 3′UTR SNP with an effect on a microRNA binding site within FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three out of seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomogamma linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and one intronic SNP, FADS1-01 (rs136261927) and C20:3n6. Conclusion Our study has demonstrated positive associations between three SNPs within FADS1 and FADS2 genes (a SNP within the 3′UTR, a synonymous SNP and an intronic SNP), with three milk PUFAs of Canadian Holstein cows thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis.
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health. PMID:24533445
Ibeagha-Awemu, Eveline M; Akwanji, Kingsley A; Beaudoin, Frédéric; Zhao, Xin
2014-02-17
Fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code respectively for the enzymes delta-5 and delta-6 desaturases which are rate limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and -6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in FADS1 and 2 genes with plasma and tissue concentrations of omega-3 and -6 FAs. The aim of this study was to evaluate the extent of sequence variations within these two genes in Canadian Holstein cows as well as the association between sequence variants and health promoting FAs in milk. Thirty-three SNPs were detected within the studied regions of genes including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3'UTR SNP (FADS2-23, rs109772589), and another 3'UTR SNP with an effect on a microRNA binding site within FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three out of seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomogamma linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and one intronic SNP, FADS1-01 (rs136261927) and C20:3n6. Our study has demonstrated positive associations between three SNPs within FADS1 and FADS2 genes (a SNP within the 3'UTR, a synonymous SNP and an intronic SNP), with three milk PUFAs of Canadian Holstein cows thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis.
These SNPs may serve as potential genetic markers in breeding programs to increase milk FAs that are of benefit to human health.
la Torre, David De; Mafla, Eulalia; Puga, Byron; Erazo, Linda; Astolfi-Ferreira, Claudete; Ferreira, Antonio Piantino
2018-01-01
Aim The objective of this study was to determine the presence of the variants of canine parvovirus (CPV)-2 in the city of Quito, Ecuador, due to the high domestic and street-type canine population, and to identify possible mutations at a genetic level that could be causing structural changes in the virus with a consequent influence on the immune response of the hosts. Materials and Methods Thirty-five stool samples from different puppies with characteristic signs of the disease and positive for CPV through immunochromatography kits were collected from different veterinary clinics of the city. Polymerase chain reaction and DNA sequencing were used to determine the mutations in residue 426 of the VP2 gene, which determines the variants of CPV-2; in addition, four samples were chosen for complete sequencing of the VP2 gene to identify all possible mutations in the circulating strains in this region of the country. Results The results revealed the presence of the three variants of CPV-2 with a prevalence of 57.1% (20/35) for CPV-2a, 8.5% (3/35) for CPV-2b, and 34.3% (12/35) for CPV-2c. In addition, complete sequencing of the VP2 gene showed amino acid substitutions in residues 87, 101, 139, 219, 297, 300, 305, 322, 324, 375, 386, 426, 440, and 514 of the three Ecuadorian variants when compared with the original CPV-2 sequence. Conclusion This study describes the detection of CPV variants in the city of Quito, Ecuador. Variants of CPV-2 (2a, 2b, and 2c) have been reported in South America, and there are cases in Ecuador where CPV-2 is affecting even vaccinated puppies. PMID:29805214
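The typing step in the abstract above (variant assignment from VP2 residue 426, followed by a prevalence tally) can be sketched as follows. The residue-to-variant mapping reflects the widely used convention (Asn for 2a, Asp for 2b, Glu for 2c); the function names and input format are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter

# Conventional VP2 residue 426 assignments. Note that Asn (N) is also
# present in the original CPV-2, so separating 2a from CPV-2 needs
# further residues; treating N as CPV-2a is a simplifying assumption.
RESIDUE_426_TO_VARIANT = {"N": "CPV-2a", "D": "CPV-2b", "E": "CPV-2c"}


def classify_by_residue_426(residue):
    """Map a one-letter amino acid at VP2 position 426 to a variant."""
    return RESIDUE_426_TO_VARIANT.get(residue.upper(), "unclassified")


def prevalence(residues):
    """Return {variant: (count, percent)} for a list of 426 residues."""
    calls = [classify_by_residue_426(r) for r in residues]
    total = len(calls)
    if total == 0:
        return {}
    counts = Counter(calls)
    return {v: (n, round(100 * n / total, 1)) for v, n in counts.items()}


if __name__ == "__main__":
    # Counts taken from the abstract: 20 CPV-2a, 3 CPV-2b, 12 CPV-2c.
    samples = ["N"] * 20 + ["D"] * 3 + ["E"] * 12
    for variant, (n, pct) in sorted(prevalence(samples).items()):
        print(f"{variant}: {n}/35 = {pct}%")
```

With the counts given in the abstract, rounding to one decimal yields 57.1%, 8.6% and 34.3%; the 8.5% printed for CPV-2b appears to be a truncated rather than rounded value.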
Mutation Update for GNE Gene Variants Associated with GNE Myopathy
Celeste, Frank V.; Vilboux, Thierry; Ciccone, Carla; de Dios, John Karl; Malicdan, May Christine V.; Leoyklang, Petcharat; McKew, John C.; Gahl, William A.; Carrillo-Carrasco, Nuria; Huizing, Marjan
2014-01-01
The GNE gene encodes the rate-limiting, bifunctional enzyme of sialic acid biosynthesis, UDP-N-acetylglucosamine 2-epimerase/N-acetylmannosamine kinase (GNE). Biallelic GNE mutations underlie GNE myopathy, an adult-onset progressive myopathy. GNE myopathy-associated GNE mutations are predominantly missense, resulting in reduced, but not absent, GNE enzyme activities. The exact pathomechanism of GNE myopathy remains unknown, but likely involves aberrant (muscle) sialylation. Here we summarize 154 reported and novel GNE variants associated with GNE myopathy, including 122 missense, 11 nonsense, 14 insertion/deletions and 7 intronic variants. All variants were deposited in the online GNE variation database (http://www.dmd.nl/nmdb2/home.php?select_db=GNE). We report the predicted effects on protein function of all variants as well as the predicted effects on epimerase and/or kinase enzymatic activities of selected variants. By analyzing exome sequence databases, we identified three frequently occurring, unreported GNE missense variants/polymorphisms, important for future sequence interpretations. Based on allele frequencies, we estimate the world-wide prevalence of GNE myopathy to be ~ 4–21/1,000,000. This previously unrecognized high prevalence confirms suspicions that many patients may escape diagnosis. Awareness among physicians for GNE myopathy is essential for the identification of new patients, which is required for better understanding of the disorder’s pathomechanism and for the success of ongoing treatment trials. PMID:24796702
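The abstract above does not state how prevalence was derived from allele frequencies. One textbook approach for a disorder caused by biallelic mutations, assuming Hardy-Weinberg equilibrium, is to square the combined frequency q of all pathogenic alleles. The sketch below shows only that arithmetic; the function names and the example frequencies are illustrative assumptions, not values from the paper.

```python
import math


def recessive_prevalence(q):
    """Expected fraction of biallelic individuals under Hardy-Weinberg.

    q is the combined frequency of all pathogenic alleles
    (an assumption of this sketch: alleles are pooled into one class).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return q * q


def allele_frequency_for_prevalence(prevalence):
    """Inverse calculation: combined allele frequency implying a prevalence."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return math.sqrt(prevalence)


if __name__ == "__main__":
    # Hypothetical combined allele frequency, for illustration only.
    q = 0.002
    print(f"q = {q}: {recessive_prevalence(q) * 1e6:.1f} per 1,000,000")
    # Allele frequencies that would correspond to the quoted range.
    for per_million in (4, 21):
        q_needed = allele_frequency_for_prevalence(per_million / 1e6)
        print(f"{per_million}/1,000,000 implies q of about {q_needed:.4f}")
```

Under these assumptions, the quoted range of roughly 4 to 21 per million corresponds to a combined pathogenic allele frequency of about 0.2% to 0.46%.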
Zhou, Sirui; Xiong, Lan; Xie, Pingxing; Ambalavanan, Amirthagowri; Bourassa, Cynthia V.; Dionne-Laporte, Alexandre; Spiegelman, Dan; Turcotte Gauthier, Maude; Henrion, Edouard; Diallo, Ousmane; Dion, Patrick A.; Rouleau, Guy A.
2015-01-01
Background Nunavik Inuit (northern Quebec, Canada) reside along the arctic coastline where for generations their daily energy intake has mainly been derived from animal fat. Given this particular diet it has been hypothesized that natural selection would lead to population specific allele frequency differences and unique variants in genes related to fatty acid metabolism. A group of genes, namely CPT1A, CPT1B, CPT1C, CPT2, CRAT and CROT, encode for three carnitine acyltransferases that are important for the oxidation of fatty acids, a critical step in their metabolism. Methods Exome sequencing and SNP array genotyping were used to examine the genetic variations in the six genes encoding for the carnitine acyltransferases in 113 Nunavik Inuit individuals. Results Altogether ten missense variants were found in genes CPT1A, CPT1B, CPT1C, CPT2 and CRAT, including three novel variants and one Inuit specific variant CPT1A p.P479L (rs80356779). The latter has the highest frequency (0.955) compared to other Inuit populations. We found that by comparison to Asians or Europeans, the Nunavik Inuit have an increased mutation burden in CPT1A, CPT2 and CRAT; there is also a high level of population differentiation based on carnitine acyltransferase gene variations between Nunavik Inuit and Asians. Conclusion The increased number and frequency of deleterious variants in these fatty acid metabolism genes in Nunavik Inuit may be the result of genetic adaptation to their diet and/or the extremely cold climate. In addition, the identification of these variants may help to understand some of the specific health risks of Nunavik Inuit. PMID:26010953
Zhou, Sirui; Xiong, Lan; Xie, Pingxing; Ambalavanan, Amirthagowri; Bourassa, Cynthia V; Dionne-Laporte, Alexandre; Spiegelman, Dan; Turcotte Gauthier, Maude; Henrion, Edouard; Diallo, Ousmane; Dion, Patrick A; Rouleau, Guy A
2015-01-01
Nunavik Inuit (northern Quebec, Canada) reside along the arctic coastline where for generations their daily energy intake has mainly been derived from animal fat. Given this particular diet it has been hypothesized that natural selection would lead to population specific allele frequency differences and unique variants in genes related to fatty acid metabolism. A group of genes, namely CPT1A, CPT1B, CPT1C, CPT2, CRAT and CROT, encode for three carnitine acyltransferases that are important for the oxidation of fatty acids, a critical step in their metabolism. Exome sequencing and SNP array genotyping were used to examine the genetic variations in the six genes encoding for the carnitine acyltransferases in 113 Nunavik Inuit individuals. Altogether ten missense variants were found in genes CPT1A, CPT1B, CPT1C, CPT2 and CRAT, including three novel variants and one Inuit specific variant CPT1A p.P479L (rs80356779). The latter has the highest frequency (0.955) compared to other Inuit populations. We found that by comparison to Asians or Europeans, the Nunavik Inuit have an increased mutation burden in CPT1A, CPT2 and CRAT; there is also a high level of population differentiation based on carnitine acyltransferase gene variations between Nunavik Inuit and Asians. The increased number and frequency of deleterious variants in these fatty acid metabolism genes in Nunavik Inuit may be the result of genetic adaptation to their diet and/or the extremely cold climate. In addition, the identification of these variants may help to understand some of the specific health risks of Nunavik Inuit.
Li, Zhongshan; Liu, Zhenwei; Jiang, Yi; Chen, Denghui; Ran, Xia; Sun, Zhong Sheng; Wu, Jinyu
2017-01-01
Exome sequencing has been widely used to identify the genetic variants underlying human genetic disorders for clinical diagnoses, but the identification of pathogenic sequence variants among the huge amounts of benign ones is complicated and challenging. Here, we describe a new Web server named mirVAFC for pathogenic sequence variants prioritizations from clinical exome sequencing (CES) variant data of single individual or family. The mirVAFC is able to comprehensively annotate sequence variants, filter out most irrelevant variants using custom criteria, classify variants into different categories as for estimated pathogenicity, and lastly provide pathogenic variants prioritizations based on classifications and mutation effects. Case studies using different types of datasets for different diseases from publication and our in-house data have revealed that mirVAFC can efficiently identify the right pathogenic candidates as in original work in each case. Overall, the Web server mirVAFC is specifically developed for pathogenic sequence variant identifications from family-based CES variants using classification-based prioritizations. The mirVAFC Web server is freely accessible at https://www.wzgenomics.cn/mirVAFC/. © 2016 WILEY PERIODICALS, INC.
Thorell, Kaisa; Hosseini, Shaghayegh; Palacios Gonzales, Reyna Victoria Palacios; ...
2016-02-29
Helicobacter pylori (H. pylori) is one of the most common bacterial infections in humans and this infection can lead to gastric ulcers and gastric cancer. H. pylori is one of the most genetically variable human pathogens and the ability of the bacterium to bind to the host epithelium as well as the presence of different virulence factors and genetic variants within these genes have been associated with disease severity. Nicaragua has particularly high gastric cancer incidence and we therefore studied Nicaraguan clinical H. pylori isolates for factors that could contribute to cancer risk. The complete genomes of fifty-two Nicaraguan H. pylori isolates were sequenced and assembled de novo, and phylogenetic and virulence factor analyses were performed. The Nicaraguan isolates showed phylogenetic relationship with West African isolates in whole-genome sequence comparisons and with Western and urban South and Central American isolates using MLSA (Multi-locus sequence analysis). A majority (77%) of the isolates carried the cancer-associated virulence gene cagA and also the s1/i1/m1 vacuolating cytotoxin, vacA allele combination, which is linked to increased severity of disease. Specifically, we also found that Nicaraguan isolates have a blood group-binding adhesin (BabA) variant highly similar to previously reported BabA sequences from Latin America, including from isolates belonging to other phylogenetic groups. These BabA sequences were found to be under positive selection at several amino acid positions that differed from the global collection of isolates. In conclusion, the discovery of a Latin American BabA variant, independent of overall phylogenetic background, suggests hitherto unknown host or environmental factors within the Latin American population giving H. pylori isolates carrying this adhesin variant a selective advantage, which could affect pathogenesis and risk for sequelae through specific adherence properties.
Yu, Zhijun; Sun, Weiyang; Zhang, Xinghai; Cheng, Kaihui; Zhao, Chuqi; Xia, Xianzhu; Gao, Yuwei
2017-08-01
Although H1N2 avian influenza virus (AIV) only infects birds, documented cases of swine infection with H1N2 influenza viruses suggest this subtype AIV may pose a potential threat to mammals. Here, we generated mouse-adapted variants of a H1N2 AIV to identify adaptive changes that increased virulence in mammals. MLD50 of the variants were reduced >1000-fold compared to the parental virus. Variants displayed enhanced replication in vitro and in vivo, and replicated in extrapulmonary organs. These data show that enhanced replication capacity and expanded tissue tropism may increase the virulence of H1N2 AIV in mice. Sequence analysis revealed multiple amino acid substitutions in the PB2 (L134H, I647L, and D701N), HA (G228S), and M1 (D231N) proteins. These results indicate that H1N2 AIV can rapidly acquire adaptive amino acid substitutions in mammalian hosts, and these amino acid substitutions collaboratively enhance the ability of H1N2 AIV to replicate and cause severe disease in mammals. Copyright © 2017 Elsevier B.V. All rights reserved.
Ovine Reference Materials and Assays for Prion Genetic Testing
USDA-ARS's Scientific Manuscript database
Background: Genetic predisposition to scrapie in sheep is associated with variation in the peptide sequence of the ovine prion protein encoded by Prnp. Codon variants implicated in scrapie susceptibility or disease progression include those at amino acid positions 112, 136, 141, 154, and 171. Nin...
Carroll, Thomas M.; Setlow, Peter
2005-01-01
Germination protease (GPR) initiates the degradation of small, acid-soluble spore proteins (SASP) during germination of spores of Bacillus and Clostridium species. The GPR amino acid sequence is not homologous to members of the major protease families, and previous work has not identified residues involved in GPR catalysis. The current work has focused on identifying catalytically essential amino acids by mutagenesis of Bacillus megaterium gpr. A residue was selected for alteration if it (i) was conserved among spore-forming bacteria, (ii) was a potential nucleophile, and (iii) had not been ruled out as inessential for catalysis. GPR variants were overexpressed in Escherichia coli, and the active form (P41) was assayed for activity against SASP and the zymogen form (P46) was assayed for the ability to autoprocess to P41. Variants inactive against SASP and unable to autoprocess were analyzed by circular dichroism spectroscopy and multiangle laser light scattering to determine whether the variant's inactivity was due to loss of secondary or quaternary structure, respectively. Variation of D127 and D193, but no other residues, resulted in inactive P46 and P41, while variants of each form were well structured and tetrameric, suggesting that D127 and D193 are essential for activity and autoprocessing. Mapping these two aspartate residues and a highly conserved lysine onto the B. megaterium P46 crystal structure revealed a striking similarity to the catalytic residues and propeptide lysine of aspartic acid proteases. These data indicate that GPR is an atypical aspartic acid protease. PMID:16199582
A novel histone variant localized in nucleoli of higher plant cells.
Tanaka, I; Akahori, Y; Gomi, K; Suzuki, T; Ueda, K
1999-07-01
Immunofluorescence staining with antisera raised against p35, a basic nuclear protein that accumulates in the pollen nuclei of Lilium longiflorum, specifically stained the nucleoli in interphase nuclei of somatic tissues, including root and leaf, and in pachytene nuclei during meiotic division, whereas antisera raised against histone H1 uniformly stained the entire chromatin domain with the exception of the nucleoli in these nuclei. Further, p35-specific antisera stained the nucleoli in root and leaf nuclei of the monocotyledonous plants Tulipa gesneriana, Allium cepa and Triticum aestivum and of the dicotyledonous plants Vicia faba and Nicotiana tabacum. Thus, these novel antisera stained the nucleoli in cells of all higher plants examined, although the staining patterns within nucleoli were somewhat different among plant species and tissues. The full-length cDNA of p35 was cloned on the basis of the partial amino acid sequence. The deduced amino acid composition and amino acid sequence of p35 indicate that this nucleolar protein is a novel variant of histone H1. Further, p35 was strongly bound to ribosomal DNA in vitro. The results of immunoblotting of histones extracted from each tissue of the various plant species with the nucleolus-specific antibodies also suggested the conservation of similar epitope(s) in both mono- and dicotyledonous plants. From these results, it is suggested that similar variants of histone H1 are specifically distributed in the nucleoli of all plant species and help to organize the nucleolar chromatin.
Postnatal Expression of V2 Vasopressin Receptor Splice Variants in the Rat Cerebellum
Vargas, Karina J.; Sarmiento, José M.; Ehrenfeld, Pamela; Añazco, Carolina C.; Villanueva, Carolina I.; Carmona, Pamela L.; Brenet, Marianne; Navarro, Javier; Müller-Esterl, Werner; Figueroa, Carlos D.; González, Carlos B.
2010-01-01
The V2 vasopressin receptor gene contains an alternative splice site in exon-3, which leads to the generation of two splice variants (V2a and V2b) first identified in the kidney. The open reading frame of the alternatively spliced V2b transcript encodes a truncated receptor, showing the same amino acid sequence as the canonical V2a receptor up to the 6th transmembrane segment, but displaying a distinct sequence to the corresponding 7th transmembrane segment and C-terminal domain relative to the V2a receptor. Here, we demonstrate the postnatal expression of V2a and V2b variants in the rat cerebellum. Most importantly, we showed by in situ hybridization and immunocytochemistry that both V2 splice variants were preferentially expressed in Purkinje cells, from early to late postnatal development. In addition, both variants were transiently expressed in the neuroblastic external granule cells and Bergmann fibers. These results indicate that the cellular distributions of both splice variants are developmentally regulated, and suggest that the transient expression of the V2 receptor is involved in the mechanisms of cerebellar cytodifferentiation by AVP. Finally, transfected CHO-K1 cells expressing similar amounts of both V2 splice variants, as that found in the cerebellum, showed a significant reduction in the surface expression of V2a receptors, suggesting that the differential expression of the V2 splice variants regulates vasopressin signaling in the cerebellum. PMID:19281786
Dyson, Gregory; Levin, Nancy K.; Chaudhry, Sophia; Rosati, Rita; Kalpage, Hasini; Simon, Michael S.; Tainsky, Michael A.
2017-01-01
While up to 25% of ovarian cancer (OVCA) cases are thought to be due to inherited factors, the majority of genetic risk remains unexplained. To address this gap, we sought to identify previously undescribed OVCA risk variants through whole exome sequencing (WES) and candidate gene analysis of 48 women with ovarian cancer who were selected for high risk of genetic inheritance, yet were negative for any known pathogenic variants in either BRCA1 or BRCA2. In silico SNP analysis was employed to identify suspect variants, followed by validation using Sanger DNA sequencing. We identified five pathogenic variants in our sample, four of which are in two genes featured on current multi-gene panels (RAD51D, ATM). In addition, we found a pathogenic FANCM variant (R1931*) which has been recently implicated in familial breast cancer risk. Numerous rare and predicted to be damaging variants of unknown significance were detected in genes on current commercial testing panels, most prominently in ATM (n = 6) and PALB2 (n = 5). The BRCA2 variant p.K3326*, resulting in a 93 amino acid truncation, was overrepresented in our sample (odds ratio = 4.95, p = 0.01) and coexisted in the germline of these women with other deleterious variants, suggesting a possible role as a modifier of genetic penetrance. Furthermore, we detected loss of function variants in non-panel genes involved in OVCA relevant pathways (DNA repair and cell cycle control), including CHEK1, TP53I3, REC8, HMMR, RAD52, RAD1, POLK, POLQ, and MCM4. In summary, our study implicates novel risk loci as well as highlights the clinical utility for retesting BRCA1/2 negative OVCA patients by genomic sequencing and analysis of genes in relevant pathways. PMID:28591191
Yu, Zhijun; Cheng, Kaihui; Sun, Weiyang; Zhang, Xinghai; Xia, Xianzhu; Gao, Yuwei
2018-01-15
A novel H5N8 highly pathogenic avian influenza virus (HPAIV) caused poultry outbreaks in the Republic of Korea in 2014. The novel H5N8 HPAIV has spread to Asia, Europe, and North America and caused great public concern from then on. Here, we generated mouse-adapted variants of a wild waterfowl-origin H5N8 HPAIV to identify adaptive mutants that confer enhanced pathogenicity in mammals. The mouse lethal doses (MLD50) of the mouse-adapted variants were reduced 31623-fold compared to the wild-type (WT) virus. Mouse-adapted variants displayed enhanced replication in vitro and in vivo, and expanded tissue tropism in mice. Sequence analysis revealed four amino acid substitutions in the PB2 (E627K), PA (F35S), HA (R227H), and NA (I462V) proteins. These data suggest that multiple amino acid substitutions collaboratively increase the virulence of a wild bird-origin reassortant H5N8 HPAIV and cause severe disease in mice. Copyright © 2017 Elsevier B.V. All rights reserved.
Mollah, A K M M; Stennis, Rhonda L; Mossing, Michael C
2003-05-01
The thermodynamic stabilities of three monomeric variants of the bacteriophage lambda Cro repressor that differ only in the sequence of two amino acids at the apex of an engineered beta-hairpin have been determined. The sequences of the turns are EVK-XX-EVK, where the two central residues are DG, GG, and GT, respectively. Standard-state unfolding free energies, determined from circular dichroism measurements as a function of urea concentration, range from 2.4 to 2.7 kcal/mole, while those determined from guanidine hydrochloride range from 2.8 to 3.3 kcal/mole for the three proteins. Thermal denaturation yields van't Hoff unfolding enthalpies of 36 to 40 kcal/mole at midpoint temperatures in the range of 53 to 58 degrees C. Extrapolation of the thermal denaturation free energies with heat capacities of 400 to 600 cal/mole deg gives good agreement with the parameters determined in denaturant titrations. As predicted from statistical surveys of amino acid replacements in beta-hairpins, energetic barriers to transformation from a type I' turn (DG) to a type II' turn (GT) can be quite small.
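The extrapolation described in this abstract can be illustrated with the Gibbs-Helmholtz relation. What follows is a minimal sketch, not the authors' analysis: it assumes two-state unfolding with a temperature-independent heat capacity change, and it plugs in mid-range values from the abstract (38 kcal/mole, 55.5 degrees C, 0.5 kcal/mole deg). The function name and the 25 degrees C reference temperature are illustrative assumptions.

```python
import math

def unfolding_free_energy(t_celsius, dh_m, tm_celsius, dcp):
    """Gibbs-Helmholtz estimate of the unfolding free energy (kcal/mole).

    dG(T) = dH_m * (1 - T/Tm) + dCp * ((T - Tm) - T * ln(T/Tm))
    dh_m: van't Hoff enthalpy at the midpoint (kcal/mole)
    dcp:  heat capacity change, assumed constant (kcal/mole/K)
    """
    t = t_celsius + 273.15    # convert to kelvin
    tm = tm_celsius + 273.15
    return dh_m * (1.0 - t / tm) + dcp * ((t - tm) - t * math.log(t / tm))

# Mid-range values from the abstract; 25 C is an assumed reference temperature.
dg_25 = unfolding_free_energy(25.0, dh_m=38.0, tm_celsius=55.5, dcp=0.5)
print(f"Estimated unfolding free energy at 25 C: {dg_25:.2f} kcal/mole")
```

Under these assumptions the estimate comes out at about 2.8 kcal/mole, inside the 2.4 to 3.3 kcal/mole span reported from the urea and guanidine hydrochloride titrations.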
Single Molecule Spectroscopy of Amino Acids and Peptides by Recognition Tunneling
Zhao, Yanan; Ashcroft, Brian; Zhang, Peiming; Liu, Hao; Sen, Suman; Song, Weisi; Im, JongOne; Gyarfas, Brett; Manna, Saikat; Biswas, Sovan; Borges, Chad; Lindsay, Stuart
2014-01-01
The human proteome has millions of protein variants due to alternative RNA splicing and post-translational modifications, and variants that are related to diseases are frequently present in minute concentrations. For DNA and RNA, low concentrations can be amplified using the polymerase chain reaction, but there is no such reaction for proteins. Therefore, the development of single molecule protein sequencing is a critical step in the search for protein biomarkers. Here we show that single amino acids can be identified by trapping the molecules between two electrodes that are coated with a layer of recognition molecules and measuring the electron tunneling current across the junction. A given molecule can bind in more than one way in the junction, and we therefore use a machine-learning algorithm to distinguish between the sets of electronic ‘fingerprints’ associated with each binding motif. With this recognition tunneling technique, we are able to identify D, L enantiomers, a methylated amino acid, isobaric isomers, and short peptides. The results suggest that direct electronic sequencing of single proteins could be possible by sequentially measuring the products of processive exopeptidase digestion, or by using a molecular motor to pull proteins through a tunnel junction integrated with a nanopore. PMID:24705512
Janarthanan, Sundaram; Sakthivelkumar, Shanmugavel; Veeramani, Velayutham; Radhika, Dixit; Muthukrishanan, Subbaratnam
2012-12-15
The anti-metabolic or insecticidal gene, arcelin (Arl), was isolated, cloned and sequenced using sequence-specific degenerate primers from the seeds of Lablab purpureus collected from the Western Ghats, Tamil Nadu, India. The L. purpureus arcelin nucleotide sequence was homologous to Arl-3 and Arl-4 alleles from Phaseolus spp. The protein it encodes has 70% amino acid identity with the amino acid sequences of Arl-3I, Arl-3III, Arl-4 precursor, Arl-4 and Arl-4I. Feeding assays with artificial seeds containing arcelin partially purified from L. purpureus seeds confirmed complete retardation of development of the stored product pest Callosobruchus maculatus at 0.2% w/w arcelin. Copyright © 2012 Elsevier Ltd. All rights reserved.
Fais, Antonella; Sollaino, Maria Carla; Barella, Susanna; Perseu, Lucia; Era, Benedetta; Corda, Marcella
2012-01-01
During a screening program for the identification of β-thalassemia (β-thal) carriers in Sardinia, Italy, we identified two subjects with increased hemoglobin (Hb) levels and an abnormal Hb variant. The same variant was detected in a family member. DNA sequencing revealed a TGT > TGG mutation at codon 93 of the β-globin gene. Structural analysis demonstrated that the cysteine residue at position 93 of the β chain was substituted by tryptophan. Since this amino acid substitution had not yet been reported, we designated this variant Hb Santa Giusta Sardegna for the place of birth of the subjects. This amino acid substitution occurs at the tyrosine pocket of the β chain as well as at the α1β2/α2β1 contact of the quaternary structure of the molecule. The presence of this Hb in the hemolysate causes an increased oxygen affinity, a slightly reduced Bohr effect and a reduced heme-heme interaction (n(50), Hill's constant) in comparison with those of Hb A.
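The codon change reported in this abstract can be checked against the standard genetic code. The snippet below is a small illustration only: the lookup table is deliberately partial (just the TG-prefixed codons), and the helper name is hypothetical.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {"TGT": "Cys", "TGC": "Cys", "TGG": "Trp", "TGA": "Stop"}

def substitute(codon, position, new_base):
    """Return the codon with the base at 1-based `position` replaced."""
    bases = list(codon)
    bases[position - 1] = new_base
    return "".join(bases)

wild_type = "TGT"                        # codon 93 as given in the abstract
mutant = substitute(wild_type, 3, "G")   # third-base T > G change
print(wild_type, CODON_TABLE[wild_type], "->", mutant, CODON_TABLE[mutant])
```

The third-base change turns a cysteine codon into the single tryptophan codon, which matches the substitution described by structural analysis.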
An inversion of 25 base pairs causes feline GM2 gangliosidosis variant.
Martin, Douglas R; Krum, Barbara K; Varadarajan, G S; Hathcock, Terri L; Smith, Bruce F; Baker, Henry J
2004-05-01
In G(M2) gangliosidosis variant 0, a defect in the beta-subunit of lysosomal beta-N-acetylhexosaminidase (EC 3.2.1.52) causes abnormal accumulation of G(M2) ganglioside and severe neurodegeneration. Distinct feline models of G(M2) gangliosidosis variant 0 have been described in both domestic shorthair and Korat cats. In this study, we determined that the causative mutation of G(M2) gangliosidosis in the domestic shorthair cat is a 25-base-pair inversion at the extreme 3' end of the beta-subunit (HEXB) coding sequence, which introduces three amino acid substitutions at the carboxyl terminus of the protein and a translational stop that is eight amino acids premature. Cats homozygous for the 25-base-pair inversion express levels of beta-subunit mRNA approximately 190% of normal and protein levels only 10-20% of normal. Because the 25-base-pair inversion is similar to mutations in the terminal exon of human HEXB, the domestic shorthair cat should serve as an appropriate model to study the molecular pathogenesis of human G(M2) gangliosidosis variant 0 (Sandhoff disease).
USDA-ARS's Scientific Manuscript database
Diacylglycerol acyltransferase (DGAT) catalyzes the final, rate-limiting step in triacylglycerol (TAG) biosynthesis via the acyl-CoA-dependent acylation of diacylglycerol. In this study, type-2 DGAT2 genes were cloned from eleven peanut cultivars. Sequence analysis revealed at least eight peanut D...
Cheun-Arom, Thaniwan; Temeeyasen, Gun; Tripipat, Thitima; Kaewprommal, Pavita; Piriyapongsa, Jittima; Sukrong, Suchada; Chongcharoen, Wanchai; Tantituvanont, Angkana; Nilubol, Dachrit
2016-10-01
Porcine epidemic diarrhea virus (PEDV) has continued to cause sporadic outbreaks in Thailand since 2007 and a pandemic variant containing an insertion and deletion in the spike gene was responsible for outbreaks. In 2014, there were further outbreaks of the disease occurring within four months of each other. In this study, the full-length genome sequences of two genetically distinct PEDV isolates from the outbreaks were characterized. The two PEDV isolates, CBR1/2014 and EAS1/2014, were 28,039 and 28,033 nucleotides in length and showed 96.2% and 93.6% similarities at nucleotide and amino acid levels respectively. In total, we have observed 1048 nucleotide substitutions throughout the genome. Compared to EAS1/2014, CBR1/2014 has 2 insertions of 4 ((56)GENQ(59)) and 1 ((140)N) amino acids at positions 56-59 and 140, and 2 deletions of 2 ((160)DG(161)) and 1 ((1199)Y) amino acids at positions 160-161 and 1199. The phylogenetic analysis based on the full-length genome of the CBR1/2014 isolate has grouped the virus with the pandemic variants. In contrast, the EAS1/2014 isolate was grouped with CV777, LZC and SM98, a classical variant. Our findings demonstrated the emergence of EAS1/2014, a classical variant which is novel to Thailand and genetically distinct from the currently circulating endemic variants. This study warrants further investigations into molecular epidemiology and genetic evolution of the PEDV in Thailand. Copyright © 2016 Elsevier B.V. All rights reserved.
Paenibacillus polymyxa PKB1 produces variants of polymyxin B-type antibiotics.
Shaheen, Mohamed; Li, Jingru; Ross, Avena C; Vederas, John C; Jensen, Susan E
2011-12-23
Polymyxins are cationic lipopeptide antibiotics active against many species of Gram-negative bacteria. We sequenced the gene cluster for polymyxin biosynthesis from Paenibacillus polymyxa PKB1. The 40.8 kb gene cluster comprises three nonribosomal peptide synthetase-encoding genes and two ABC transporter-like genes. Disruption of a peptide synthetase gene abolished all antibiotic production, whereas deletion of one or both transporter genes only reduced antibiotic production. Computational analysis of the peptide synthetase modules suggested that the enzyme system produces variant forms of polymyxin B (1 and 2), with D-2,4-diaminobutyrate instead of L-2,4-diaminobutyrate in amino acid position 3. Two antibacterial metabolites were resolved by HPLC and identified by high-resolution mass spectrometry and MS/MS sequencing as the expected variants 3 and 4 of polymyxin B(1) (1) and B(2) (2). Stereochemical analysis confirmed the presence of both D-2,4-diaminobutyrate and L-2,4-diaminobutyrate residues. Copyright © 2011 Elsevier Ltd. All rights reserved.
Fast single-pass alignment and variant calling using sequencing data
USDA-ARS's Scientific Manuscript database
Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. New program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts...
Cloning and characterization of two novel DNases from Streptococcus pyogenes.
Hasegawa, Tadao; Torii, Keizo; Hashikawa, Shinnosuke; Iinuma, Yoshitsugu; Ohta, Michio
2002-06-01
The proteins in the culture supernatant (exoproteins) from Streptococcus pyogenes serotype M1 were separated by two-dimensional gel electrophoresis, and their N-terminal amino acid sequences were determined. The amino acid sequences were compared to sequences in the S. pyogenes genome database. The coding sequence showed similarity to sequences of two genes, mf2-v (mf2 variant) and mf3, which had sequence similarity to genes encoding mitogenic factor (MF); MF has DNase activity. The recombinant genes were expressed in Escherichia coli and the proteins were synthesized. Mf2-v and Mf3 had DNase activity. The activity of Mf2-v was localized to the C-terminal half of the protein. The mf3 gene was shown to be present in most clinically isolated strains of S. pyogenes tested, and the mf2 gene was detected in 20% of the isolates. The products of the mf2 and mf3 genes in clinically isolated S. pyogenes strains were thus shown to be DNases.
Droplet digital PCR technology promises new applications and research areas.
Manoj, P
2016-01-01
Digital Polymerase Chain Reaction (dPCR) is used to quantify nucleic acids, and its applications are in the detection and precise quantification of low-level pathogens, rare genetic sequences, copy number variants and rare mutations, and in relative gene expression. Here the PCR is performed in a large number of reaction chambers or partitions and the reaction is carried out in each partition individually. This separation allows a more reliable collection and sensitive measurement of nucleic acid. Results are calculated by counting the partitions containing amplified target sequence (positive droplets) and the partitions in which there is no amplification (negative droplets). The mean number of target sequences per partition is calculated using Poisson statistics. The Poisson correction compensates for the presence of more than one copy of the target gene in any droplet. The method provides information with accuracy and precision, is highly reproducible, and is less susceptible to inhibitors than qPCR. It has been demonstrated in studying variations in gene sequences, such as copy number variants and point mutations, distinguishing differences between expression of nearly identical alleles, and assessment of clinically relevant genetic variations, and it is routinely used for clonal amplification of samples for NGS methods. dPCR enables more reliable predictors of tumor status and patient prognosis by absolute quantitation using reference normalizations. Rare mitochondrial DNA deletions associated with a range of diseases and disorders as well as aging can be accurately detected with droplet digital PCR.
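The Poisson step in this abstract can be made concrete. If target molecules are distributed at random across partitions, the fraction of negative partitions is exp(-lambda), so the mean copies per partition is lambda = -ln(negative / total). The sketch below assumes exactly that model; the function names and the droplet counts are hypothetical and are not taken from the article.

```python
import math

def mean_copies_per_partition(positive, total):
    """Poisson estimate: lambda = -ln(fraction of negative partitions)."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total (at least one negative partition)")
    negative = total - positive
    return -math.log(negative / total)

def total_copies(positive, total):
    """Estimated number of target copies across all partitions."""
    return mean_copies_per_partition(positive, total) * total

# Hypothetical run: 20,000 droplets, of which 5,000 are positive.
lam = mean_copies_per_partition(5000, 20000)
copies = total_copies(5000, 20000)
print(f"lambda = {lam:.4f} copies per droplet, about {copies:.0f} copies in total")
```

The corrected total is larger than the raw count of positive droplets because some positive droplets hold more than one copy, which is the effect the correction compensates for.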
Welsch, Christoph; Shimakami, Tetsuro; Hartmann, Christoph; Yang, Yan; Domingues, Francisco S.; Lengauer, Thomas; Zeuzem, Stefan; Lemon, Stanley M.
2011-01-01
Background & Aims: It is a challenge to develop direct-acting antiviral agents (DAAs) that target the NS3/4A protease of hepatitis C virus (HCV) because resistant variants develop. Ketoamide compounds, designed to mimic the natural protease substrate, have been developed as inhibitors. However, clinical trials have revealed rapid selection of resistant mutants, most of which are considered to be pre-existing variants. Methods: We identified residues near the ketoamide-binding site in X-ray structures of the genotype 1a protease, co-crystallized with boceprevir or a telaprevir-like ligand, and then identified variants at these positions in 219 genotype 1 sequences from a public database. We used side-chain modeling to assess the potential effects of these variants on the interaction between ketoamide and the protease, and compared these results with the phenotypic effects on ketoamide resistance, RNA replication capacity, and infectious virus yields in a cell culture model of infection. Results: Thirteen natural binding-site variants with potential for ketoamide resistance were identified at 10 residues in the protease, near the ketoamide binding site. Rotamer analysis of amino acid side-chain conformations indicated that 2 variants (R155K and D168G) could affect binding of telaprevir more than boceprevir. Measurements of antiviral susceptibility in cell culture studies were consistent with this observation. Four variants (Q41H, I132V, R155K, and D168G) caused low-to-moderate levels of ketoamide resistance; 3 of these were highly fit (Q41H, I132V, and R155K). Conclusions: Using a comprehensive sequence and structure-based analysis, we showed how natural variation in the HCV protease NS3/4A sequences might affect susceptibility to first-generation DAAs. These findings increase our understanding of the molecular basis of ketoamide resistance among naturally existing viral variants. PMID:22155364
Balasubramanian, M; Lord, H; Levesque, S; Guturu, H; Thuriot, F; Sillon, G; Wenger, A M; Sureka, D L; Lester, T; Johnson, D S; Bowen, J; Calhoun, A R; Viskochil, D H; Bejerano, G; Bernstein, J A; Chitayat, D
2017-03-01
In 1993, Chitayat et al. reported a newborn with hyperphalangism, facial anomalies, and bronchomalacia. We identified three additional families with similar findings. Features include bilateral accessory phalanx resulting in shortened index fingers; hallux valgus; distinctive face; respiratory compromise. To identify the genetic aetiology of Chitayat syndrome and identify a unifying cause for this specific form of hyperphalangism. Through ongoing collaboration, we had collected patients with strikingly similar phenotype. Trio-based exome sequencing was first performed in Patient 2 through the Deciphering Developmental Disorders study. Proband-only exome sequencing had previously been independently performed in Patient 4. Following identification of a candidate gene variant in Patient 2, the same variant was subsequently confirmed from exome data in Patient 4. Sanger sequencing was used to validate this variant in Patients 1, 3; confirm paternal inheritance in Patient 5. A recurrent, novel variant NM_006494.2:c.266A>G p.(Tyr89Cys) in ERF was identified in five affected individuals: de novo (patient 1, 2 and 3) and inherited from an affected father (patient 4 and 5). p.Tyr89Cys is an aromatic polar neutral to polar neutral amino acid substitution, at a highly conserved position and lies within the functionally important ETS-domain of the protein. The recurrent ERF c.266A>G p.(Tyr89Cys) variant causes Chitayat syndrome. ERF variants have previously been associated with complex craniosynostosis. In contrast, none of the patients with the c.266A>G p.(Tyr89Cys) variant have craniosynostosis. We report the molecular aetiology of Chitayat syndrome and discuss potential mechanisms for this distinctive phenotype associated with the p.Tyr89Cys substitution in ERF. Published by the BMJ Publishing Group Limited.
Qi, Wenbao; Jia, Weixin; Liu, Di; Li, Jing; Bi, Yuhai; Xie, Shumin; Li, Bo; Hu, Tao; Du, Yingying; Xing, Li; Zhang, Jiahao; Zhang, Fuchun; Wei, Xiaoman; Eden, John-Sebastian; Li, Huanan; Tian, Huaiyu; Li, Wei; Su, Guanming; Lao, Guangjie; Xu, Chenggang; Xu, Bing; Liu, Wenjun; Zhang, Guihong; Ren, Tao; Holmes, Edward C; Cui, Jie; Shi, Weifeng; Gao, George F; Liao, Ming
2018-01-15
Since its emergence in 2013, the H7N9 low-pathogenic avian influenza virus (LPAIV) has been circulating in domestic poultry in China, causing five waves of human infections. A novel H7N9 highly pathogenic avian influenza virus (HPAIV) variant possessing multiple basic amino acids at the cleavage site of the hemagglutinin (HA) protein was first reported in two cases of human infection in January 2017. More seriously, those novel H7N9 HPAIV variants have been transmitted and caused outbreaks on poultry farms in eight provinces in China. Herein, we demonstrate the presence of three different amino acid motifs at the cleavage sites of these HPAIV variants which were isolated from chickens and humans and likely evolved from the preexisting LPAIVs. Animal experiments showed that these novel H7N9 HPAIV variants are both highly pathogenic in chickens and lethal to mice. Notably, human-origin viruses were more pathogenic in mice than avian viruses, and the mutations in the PB2 gene associated with adaptation to mammals (E627K, A588V, and D701N) were identified by next-generation sequencing (NGS) and Sanger sequencing of the isolates from infected mice. No polymorphisms in the key amino acid substitutions of PB2 and HA in isolates from infected chicken lungs were detected by NGS. In sum, these results highlight the high degree of pathogenicity and the valid transmissibility of this new H7N9 variant in chickens and the quick adaptation of this new H7N9 variant to mammals, so the risk should be evaluated and more attention should be paid to this variant. IMPORTANCE Due to the recent increased numbers of zoonotic infections in poultry and persistent human infections in China, influenza A(H7N9) virus has remained a public health threat. Most of the influenza A(H7N9) viruses reported previously have been of low pathogenicity. Now, these novel H7N9 HPAIV variants have caused human infections in three provinces and outbreaks on poultry farms in eight provinces in China. 
We analyzed the molecular features and compared the relative characteristics of one H7N9 LPAIV and two H7N9 HPAIVs isolated from chickens and two human-origin H7N9 HPAIVs in chicken and mouse models. We found that all HPAIVs both are highly pathogenic and have valid transmissibility in chickens. Strikingly, the human-origin viruses were more highly pathogenic than the avian-origin viruses in mice, and dynamic mutations were confirmed by NGS and Sanger sequencing. Our findings offer important insight into the origin, adaptation, pathogenicity, and transmissibility of these viruses to both poultry and mammals. Copyright © 2018 American Society for Microbiology.
Seo, Wonhyo; Servat, Alexandre; Cliquet, Florence; Akinbowale, Jenkins; Prehaud, Christophe; Lafon, Monique; Sabeta, Claude
Rabies is a fatal zoonotic disease and infections generally lead to a fatal encephalomyelitis in both humans and animals. In South Africa, domestic (dogs) and the wildlife (yellow mongoose) host species maintain the canid and mongoose rabies variants respectively. In this study, pathogenicity differences of South African canid and mongoose rabies viruses were investigated in a murine model, by assessing the progression of clinical signs and survivorship. Comparison of glycoprotein gene sequences revealed amino acid differences that may underpin the observed pathogenicity differences. Cumulatively, our results suggest that the canid rabies virus may be more neurovirulent in mice than the mongoose rabies variant. Copyright © 2017 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
A novel PTCH1 mutation underlies non-syndromic cleft lip and/or palate in a Han Chinese family.
Zhao, Huaxiang; Zhong, Wenjie; Leng, Chuntao; Zhang, Jieni; Zhang, Mengqi; Huang, Wenbin; Zhang, Yunfan; Li, Weiran; Jia, Peizeng; Lin, Jiuxiang; Maimaitili, Gulibaha; Chen, Feng
2018-06-16
Cleft lip and/or palate (CL/P) is the most common craniofacial congenital disease, and it has a complex aetiology. This study aimed to identify the causative gene mutation of a Han Chinese family with CL/P. Whole exome sequencing was conducted on the proband and her mother, who exhibited the same phenotype. A Mendelian dominant inheritance model, allele frequency, mutation regions, functional prediction and literature review were used to screen and filter the variants. The candidate was validated by Sanger sequencing. Conservation analysis and homology modelling were conducted. A heterozygous missense mutation c.1175C>T in the PTCH1 gene predicting p.Ala392Val was identified. This variant has not been reported and was predicted to be deleterious. Sanger sequencing verified the variant and the dominant inheritance model in the family. The missense alteration affects an amino acid that is evolutionarily conserved in the first extracellular loop of the PTCH1 protein. The local structure of the mutant protein was significantly altered according to homology modelling. Our findings suggest that c.1175C>T in PTCH1 (NM_000264) may be the causative mutation of this pedigree. Our results add to the evidence that PTCH1 variants play a role in the pathogenesis of orofacial clefts. This article is protected by copyright. All rights reserved.
Somatic and Germline TP53 Alterations in Second Malignant Neoplasms from Pediatric Cancer Survivors.
Sherborne, Amy L; Lavergne, Vincent; Yu, Katharine; Lee, Leah; Davidson, Philip R; Mazor, Tali; Smirnoff, Ivan V; Horvai, Andrew E; Loh, Mignon; DuBois, Steven G; Goldsby, Robert E; Neglia, Joseph P; Hammond, Sue; Robison, Leslie L; Wustrack, Rosanna; Costello, Joseph F; Nakamura, Alice O; Shannon, Kevin M; Bhatia, Smita; Nakamura, Jean L
2017-04-01
Purpose: Second malignant neoplasms (SMNs) are severe late complications that occur in pediatric cancer survivors exposed to radiotherapy and other genotoxic treatments. To characterize the mutational landscape of treatment-induced sarcomas and to identify candidate SMN-predisposing variants, we analyzed germline and SMN samples from pediatric cancer survivors. Experimental Design: We performed whole-exome sequencing (WES) and RNA sequencing on radiation-induced sarcomas arising from two pediatric cancer survivors. To assess the frequency of germline TP53 variants in SMNs, Sanger sequencing was performed to analyze germline TP53 in 37 pediatric cancer survivors from the Childhood Cancer Survivor Study (CCSS) without any history of a familial cancer predisposition syndrome but known to have developed SMNs. Results: WES revealed TP53 mutations involving p53's DNA-binding domain in both index cases, one of which was also present in the germline. The germline and somatic TP53-mutant variants were enriched in the transcriptomes for both sarcomas. Analysis of TP53-coding exons in germline specimens from the CCSS survivor cohort identified a G215C variant encoding an R72P amino acid substitution in 6 patients and a synonymous SNP A639G in 4 others, resulting in 10 of 37 evaluable patients (27%) harboring a germline TP53 variant. Conclusions: Currently, germline TP53 is not routinely assessed in patients with pediatric cancer. These data support the concept that identifying germline TP53 variants at the time a primary cancer is diagnosed may identify patients at high risk for SMN development, who could benefit from modified therapeutic strategies and/or intensive posttreatment monitoring. Clin Cancer Res; 23(7); 1852-61. ©2016 AACR.
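The relationship between the nucleotide labels and the amino acid positions in this abstract follows from simple codon arithmetic. The sketch below assumes the variant labels give 1-based coding-sequence positions counted from the first base of the start codon; that numbering convention is an assumption, as the abstract does not state it, and the function name is hypothetical.

```python
def codon_position(cds_pos):
    """Map a 1-based coding-sequence position to
    (codon number, 1-based position within that codon)."""
    if cds_pos < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1

# Variant labels from the abstract, read as coding-sequence positions (assumed).
for label, pos in [("G215C", 215), ("A639G", 639)]:
    codon, base = codon_position(pos)
    print(f"{label}: codon {codon}, base {base} of the codon")
```

Position 215 falls on the second base of codon 72, in line with the reported R72P substitution, and position 639 falls on the third base of codon 213, where a change is often silent, in line with the SNP being described as synonymous.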
Somatic and germline TP53 alterations in second malignant neoplasms from pediatric cancer survivors
Sherborne, Amy L.; Lavergne, Vincent; Yu, Katharine; Lee, Leah; Davidson, Philip R.; Mazor, Tali; Smirnoff, Ivan; Horvai, Andrew; Loh, Mignon; DuBois, Steven G.; Goldsby, Robert E.; Neglia, Joseph; Hammond, Sue; Robison, Leslie L.; Wustrack, Rosanna; Costello, Joseph; Nakamura, Alice O.; Shannon, Kevin; Bhatia, Smita; Nakamura, Jean L.
2016-01-01
Purpose Second malignant neoplasms (SMNs) are severe late complications that occur in pediatric cancer survivors exposed to radiotherapy and other genotoxic treatments. To characterize the mutational landscape of treatment-induced sarcomas and to identify candidate SMN-predisposing variants we analyzed germline and SMN samples from pediatric cancer survivors. Experimental Design We performed whole exome sequencing (WES) and RNA sequencing on radiation-induced sarcomas arising from two pediatric cancer survivors. To assess the frequency of germline TP53 variants in SMNs, Sanger sequencing was performed to analyze germline TP53 in thirty-seven pediatric cancer survivors from the Childhood Cancer Survivor Study (CCSS) without history of a familial cancer predisposition syndrome but known to have developed SMNs. Results WES revealed TP53 mutations involving p53’s DNA binding domain in both index cases, one of which was also present in the germline. The germline and somatic TP53 mutant variants were enriched in the transcriptomes for both sarcomas. Analysis of TP53 coding exons in germline specimens from the CCSS survivor cohort identified a G215C variant encoding an R72P amino acid substitution in six patients and a synonymous single nucleotide polymorphism A639G in four others, resulting in ten out of 37 evaluable patients (27%) harboring a germline TP53 variant. Conclusions Currently, germline TP53 is not routinely assessed in pediatric cancer patients. These data support the concept that identifying germline TP53 variants at the time a primary cancer is diagnosed may identify patients at high risk for SMN development, who could benefit from modified therapeutic strategies and/or intensive post-treatment monitoring. PMID:27683180
Liu, Lu; Feng, Yu; McNally, Alan; Zong, Zhiyong
2018-06-14
New Delhi MBL (NDM) is a type of carbapenemase; 20 variants of NDM have been identified to date. We have found a new variant of NDM, NDM-21, and describe it here. A carbapenem-resistant Escherichia coli was subjected to WGS using an Illumina X10 sequencer to identify the antimicrobial resistance genes and its ST. The gene encoding the new variant of NDM was cloned into E. coli DH5α, with blaNDM-5 being cloned as the control. Transformants were tested for susceptibility to carbapenems. Mating was performed to obtain the plasmid carrying the new blaNDM gene and the complete plasmid sequence was obtained using long-read MinION sequencing. The E. coli isolate belonged to ST617 and phylogenetic group A. It had a gene encoding NDM-21, a new NDM variant. NDM-21 differs from NDM-5 by a Gly-to-Ser amino acid substitution at position 69 (G69S). NDM-21 retains the same activity against carbapenems as NDM-5. blaNDM-21 is carried by a 46.1 kb IncX3 plasmid, which is self-transmissible, and is located in a complex genetic context as blaNDM-5. The isolate also carried blaCTX-M-55, which encodes an ESBL conferring resistance to aztreonam (which completed its resistance to all clinically available β-lactams), and rmtB, which mediates high-level resistance to aminoglycosides, on an IncFII plasmid. A new NDM variant has been identified and blaNDM-21 has evolved from blaNDM-5 on an IncX3 plasmid.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thorell, Kaisa; Hosseini, Shaghayegh; Palacios Gonzales, Reyna Victoria Palacios
Helicobacter pylori (H. pylori) is one of the most common bacterial infections in humans, and this infection can lead to gastric ulcers and gastric cancer. H. pylori is one of the most genetically variable human pathogens, and the ability of the bacterium to bind to the host epithelium, as well as the presence of different virulence factors and genetic variants within these genes, have been associated with disease severity. Nicaragua has particularly high gastric cancer incidence and we therefore studied Nicaraguan clinical H. pylori isolates for factors that could contribute to cancer risk. The complete genomes of fifty-two Nicaraguan H. pylori isolates were sequenced and assembled de novo, and phylogenetic and virulence factor analyses were performed. The Nicaraguan isolates showed a phylogenetic relationship with West African isolates in whole-genome sequence comparisons and with Western and urban South and Central American isolates using MLSA (multi-locus sequence analysis). A majority (77%) of the isolates carried the cancer-associated virulence gene cagA and also the s1/i1/m1 vacuolating cytotoxin (vacA) allele combination, which is linked to increased severity of disease. Specifically, we also found that Nicaraguan isolates have a blood group-binding adhesin (BabA) variant highly similar to previously reported BabA sequences from Latin America, including from isolates belonging to other phylogenetic groups. These BabA sequences were found to be under positive selection at several amino acid positions that differed from the global collection of isolates. In conclusion, the discovery of a Latin American BabA variant, independent of overall phylogenetic background, suggests hitherto unknown host or environmental factors within the Latin American population giving H. pylori isolates carrying this adhesin variant a selective advantage, which could affect pathogenesis and risk for sequelae through specific adherence properties.
A novel variant in the SLC12A1 gene in two families with antenatal Bartter syndrome.
Breinbjerg, Anders; Siggaard Rittig, Charlotte; Gregersen, Niels; Rittig, Søren; Hvarregaard Christensen, Jane
2017-01-01
Bartter syndrome is an autosomal-recessive inherited disease in which patients present with hypokalaemia and metabolic alkalosis. We present two apparently unrelated cases with antenatal Bartter syndrome type I, due to a novel variant in the SLC12A1 gene encoding the bumetanide-sensitive sodium-(potassium)-chloride cotransporter 2 in the thick ascending limb of the loop of Henle. Blood samples were received from the two cases and 19 of their relatives, and deoxyribonucleic acid was extracted. The coding regions of the SLC12A1 gene were amplified using polymerase chain reaction, followed by bidirectional direct deoxyribonucleic acid sequencing. Each affected child in the two families was homozygous for a novel inherited variant in the SLC12A1 gene, c.1614T>A. The variant predicts a change from a tyrosine codon to a stop codon (p.Tyr538Ter). The two cases presented antenatally and at six months of age, respectively. The two cases were homozygous for the same variant in the SLC12A1 gene, but presented clinically at different ages. This could possibly be explained by the presence of other gene variants or environmental factors modifying the phenotypes. The phenotypes of the patients were similar to other patients with antenatal Bartter syndrome. ©2016 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
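The relationship between the coding-DNA position and the protein-level description in the abstract above can be checked with simple arithmetic. The following is a minimal sketch, not the authors' analysis: the function names are hypothetical, and it assumes only that coding positions are 1-based and that codons are read in consecutive triplets from position 1. It shows that position 1614 is the third base of codon 538, and that a T>A change at the third base turns the tyrosine codon TAT into the stop codon TAA.

```python
# Illustrative sketch only; function names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}
TYR_CODONS = {"TAT", "TAC"}


def codon_of(cds_pos: int) -> tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, base within codon 1..3)."""
    if cds_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def tyr_to_stop_candidates(offset: int, ref: str, alt: str) -> list[tuple[str, str]]:
    """List tyrosine codons that become a stop codon when `ref` at `offset` changes to `alt`."""
    hits = []
    for codon in sorted(TYR_CODONS):
        if codon[offset - 1] != ref:
            continue  # this codon does not carry the reference base at that position
        mutated = codon[: offset - 1] + alt + codon[offset:]
        if mutated in STOP_CODONS:
            hits.append((codon, mutated))
    return hits


if __name__ == "__main__":
    codon, offset = codon_of(1614)
    print(codon, offset)
    print(tyr_to_stop_candidates(offset, "T", "A"))
```

The sketch only confirms that c.1614T>A and p.Tyr538Ter are mutually consistent; it does not use the actual SLC12A1 reference sequence.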
Wen, Bo; Xu, Shaohang; Sheynkman, Gloria M; Feng, Qiang; Lin, Liang; Wang, Quanhui; Xu, Xun; Wang, Jun; Liu, Siqi
2014-11-01
Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package, for detection of the variant peptides based on tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of HTML-based report with visualized interface. sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Scheps, Karen G; De Paula, Silvia M; Bitsman, Alicia R; Freigeiro, Daniel H; Basack, F Nora; Pennesi, Sandra P; Varela, Viviana
2013-01-01
We describe a novel frameshift mutation on the HBA1 gene (c.187delG), causative of α-thalassemia (α-thal) in a Black Cuban family with multiple sequence variants in the HBA genes and the Hb S [β6(A3)Glu→Val, GAG>GTG; HBB: c.20A>T] mutation. The deletion of the first base of codon 62 resulted in a frameshift at amino acid 62 with a putative premature termination codon (PTC) at amino acid 66 on the same exon (p.W62fsX66), which most likely triggers nonsense mediated decay of the resulting mRNA. This study also presents the first report of the α212 patchwork allele in Latin America and the description of two new sequence variants in the HBA2 region (c.-614G>A in the promoter region and c.95+39 C>T on the first intron).
Ishii, Atsushi; Watkins, Joseph C; Chen, Debbie; Hirose, Shinichi; Hammer, Michael F
2017-02-01
Two major classes of SCN1A variants are associated with Dravet syndrome (DS): those that result in haploinsufficiency (truncating) and those that result in an amino acid substitution (missense). The aim of this retrospective study was to describe the first large cohort of Japanese patients with SCN1A mutation-positive DS (n = 285), and investigate the relationship between variant (type and position) and clinical expression and response to treatment. We sequenced all exons and intron-exon boundaries of SCN1A in our cohort, investigated differences in the distribution of truncating and missense variants, tested for associations between variant type and phenotype, and compared these patterns with those of cohorts with milder epilepsy and healthy individuals. Unlike truncation variants, missense variants are found at higher density in the S4 voltage sensor and pore loops and at lower density in the domain I-II and II-III linkers and the first three segments of domain II. Relative to healthy individuals, there is an increased frequency of truncating (but not missense) variants in the noncoding C-terminus. The rate of cognitive decline is more rapid for patients with truncation variants regardless of age at seizure onset, whereas age at onset is a predictor of the rate of cognitive decline for patients with missense variants. We found significant differences in the distribution of truncating and missense variants across the SCN1A sequence among healthy individuals, patients with DS, and those with milder forms of SCN1A-variant positive epilepsy. Testing for associations with phenotype revealed that variant type can be predictive of rate of cognitive decline. Analysis of descriptive medication data suggests that in addition to conventional drug therapy in DS, bromide, clonazepam and topiramate may reduce seizure frequency. Wiley Periodicals, Inc. © 2016 International League Against Epilepsy.
Schaafsma, Gerard C P; Vihinen, Mauno
2017-07-01
Genes and proteins are known to have differences in their sensitivity to alterations. Despite numerous sequencing studies, proportions of harmful and harmless substitutions are not known for proteins and groups of proteins. To address this question, we predicted the outcome for all possible single amino acid substitutions (AASs) in nine representative protein groups by using the PON-P2 method. The effects on 996 proteins were studied and vast differences were noticed. Proteins in the cancer group harbor the largest proportion of harmful variants (42.1%), whereas the non-disease group of proteins not known to have a disease association and not involved in the housekeeping functions had the lowest number of harmful variants (4.2%). Differences in the proportions of the harmful and benign variants are wide within each group, but they still show clear differences between the groups. Frequently appearing protein domains show a wide spectrum of variant frequencies, whereas no major protein structural class-specific differences were noticed. AAS types in the original and variant residues showed distinctive patterns, which are shared by all the protein groups. The observations are relevant for understanding genetic bases of diseases, variation interpretation, and for the development of methods for that purpose. © 2017 Wiley Periodicals, Inc.
Helicobacter pylori Heat Shock Protein A: Serologic Responses and Genetic Diversity
Ng, Enders K. W.; Thompson, Stuart A.; Pérez-Pérez, Guillermo I.; Kansau, Imad; van der Ende, Arie; Labigne, Agnès; Sung, Joseph J. Y.; Chung, S. C. Sydney; Blaser, Martin J.
1999-01-01
Helicobacter pylori synthesizes an unusual GroES homolog, heat shock protein A (HspA). The present study was aimed at an assessment of the serological response to HspA in a group of Chinese patients with defined gastroduodenal pathologies and determination of whether diversity is present in the nucleotide sequences encoding HspA in isolates from these patients. Serum samples collected from 154 patients who had an upper gastrointestinal pathology and the presence of H. pylori defined by biopsy were tested for an immunoglobulin G (IgG) serologic response to H. pylori HspA by an enzyme-linked immunosorbent assay. HspA-encoding nucleotide sequences in H. pylori isolates from 14 patients (7 seropositive and 7 seronegative for HspA) were analyzed by PCR and direct sequencing of the PCR products. The sequencing results were compared to those of 48 isolates from other parts of the world. Of the 154 known H. pylori-positive patients, 54 (35.1%) were seropositive for HspA. The A domain (GroES homology) of HspA was highly conserved in the 14 isolates tested. Although the B domain (metal-binding site unique to H. pylori) resembled that in the known major variant, particular amino acid substitutions allowed definition of an HspA variant associated with isolates from East Asia. There were no associations between patient characteristics and HspA seropositivity or amino acid sequences. We confirmed in this study that the clinical outcomes of H. pylori infection are not related to HspA antigenicity or to sequence variation. However, B-domain sequence variation may be a marker for the study of the genetic diversity of H. pylori strains of different geographic origins. PMID:10225839
Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.
Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing
2015-08-05
To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit makes up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validation rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of the SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validation rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform.
However, for the sequencing of platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, specifically for the InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, specifically, InDels in Ion Proton exome sequencing.
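The two figures reported in the abstract above (the rate of shared variant loci and the concordance of co-detected loci) can be illustrated with a small set-based comparison. This is a hedged sketch rather than the authors' pipeline: the function name, the dictionary layout, and in particular the choice of the union of loci as the denominator for the shared fraction are assumptions, since the abstract does not state how the rate was defined.

```python
# Illustrative sketch; names, data layout and denominators are assumptions.
def compare_calls(calls_a: dict, calls_b: dict) -> dict:
    """Compare two call sets given as {(chrom, pos): genotype} dictionaries."""
    loci_a, loci_b = set(calls_a), set(calls_b)
    shared = loci_a & loci_b   # loci called on both platforms
    union = loci_a | loci_b    # loci called on at least one platform
    same = sum(1 for locus in shared if calls_a[locus] == calls_b[locus])
    return {
        "shared_fraction": len(shared) / len(union) if union else 0.0,
        "genotype_concordance": same / len(shared) if shared else 0.0,
        "only_a": len(loci_a - loci_b),
        "only_b": len(loci_b - loci_a),
    }


if __name__ == "__main__":
    # Toy data, not taken from the study.
    platform_a = {("chr1", 100): "A/G", ("chr1", 200): "C/T", ("chr2", 50): "G/G"}
    platform_b = {("chr1", 100): "A/G", ("chr1", 200): "C/C", ("chr3", 10): "T/T"}
    print(compare_calls(platform_a, platform_b))
```

Platform-specific loci (`only_a`, `only_b`) are the ones the abstract reports as validating at lower rates, so in practice they would be the natural candidates for orthogonal confirmation.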
Lepori, Vincent; Mühlhause, Franziska; Sewell, Adrian C; Jagannathan, Vidhya; Janzen, Nils; Rosati, Marco; Alves de Sousa, Filipe Miguel Maximiano; Tschopp, Aurélie; Schüpbach, Gertraud; Matiasek, Kaspar; Tipold, Andrea; Leeb, Tosso; Kornberg, Marion
2018-05-04
Several enzymes are involved in fatty acid oxidation, which is a key process in mitochondrial energy production. Inherited defects affecting any step of fatty acid oxidation can result in clinical disease. We present here an extended family of German Hunting Terriers with 10 dogs affected by clinical signs of exercise induced weakness, muscle pain, and suspected rhabdomyolysis. The combination of clinical signs, muscle histopathology and acylcarnitine analysis with an elevated tetradecenoylcarnitine (C14:1) peak suggested a possible diagnosis of acyl-CoA dehydrogenase very long chain deficiency (ACADVLD). Whole genome sequence analysis of one affected dog and 191 controls revealed a nonsense variant in the ACADVL gene encoding acyl-CoA dehydrogenase very long chain, c.1728C>A or p.(Tyr576*). The variant showed perfect association with the phenotype in the 10 affected and more than 500 control dogs of various breeds. Pathogenic variants in the ACADVL gene have been reported in humans with similar myopathic phenotypes. We therefore considered the detected variant to be the most likely candidate causative variant for the observed exercise induced myopathy. To our knowledge, this is the first description of this disease in dogs, which we propose to name exercise induced metabolic myopathy (EIMM), and the identification of the first canine pathogenic ACADVL variant. Our findings provide a large animal model for a known human disease and will enable genetic testing to avoid the unintentional breeding of affected offspring. Copyright © 2018 Lepori et al.
Low, Karen J; Ansari, Morad; Abou Jamra, Rami; Clarke, Angus; El Chehadeh, Salima; FitzPatrick, David R; Greenslade, Mark; Henderson, Alex; Hurst, Jane; Keller, Kory; Kuentz, Paul; Prescott, Trine; Roessler, Franziska; Selmer, Kaja K; Schneider, Michael C; Stewart, Fiona; Tatton-Brown, Katrina; Thevenon, Julien; Vigeland, Magnus D; Vogt, Julie; Willems, Marjolaine; Zonana, Jonathan; DDD Study; Smithson, Sarah F
2017-01-01
PUF60 encodes a nucleic acid-binding protein, a component of multimeric complexes regulating RNA splicing and transcription. In 2013, patients with microdeletions of chromosome 8q24.3 including PUF60 were found to have developmental delay, microcephaly, craniofacial, renal and cardiac defects. Very similar phenotypes have been described in six patients with variants in PUF60, suggesting that it underlies the syndrome. We report 12 additional patients with PUF60 variants who were ascertained using exome sequencing: six through the Deciphering Developmental Disorders Study and six through similar projects. Detailed phenotypic analysis of all patients was undertaken. All 12 patients had de novo heterozygous PUF60 variants on exome analysis, each confirmed by Sanger sequencing: four frameshift variants resulting in premature stop codons, three missense variants that clustered within the RNA recognition motif of PUF60 and five essential splice-site (ESS) variants. Analysis of cDNA from a fibroblast cell line derived from one of the patients with an ESS variant revealed aberrant splicing. The consistent feature was developmental delay and most patients had short stature. The phenotypic variability was striking; however, we observed similarities including spinal segmentation anomalies, congenital heart disease, ocular colobomata, hand anomalies and (in two patients) unilateral renal agenesis/horseshoe kidney. Characteristic facial features included micrognathia, a thin upper lip and long philtrum, narrow almond-shaped palpebral fissures, synophrys, flared eyebrows and facial hypertrichosis. Heterozygous loss-of-function variants in PUF60 cause a phenotype comprising growth/developmental delay and craniofacial, cardiac, renal, ocular and spinal anomalies, adding to disorders of human development resulting from aberrant RNA processing/spliceosomal function. PMID:28327570
Novel oxytocin receptor variants in laboring women requiring high doses of oxytocin.
Reinl, Erin L; Goodwin, Zane A; Raghuraman, Nandini; Lee, Grace Y; Jo, Erin Y; Gezahegn, Beakal M; Pillai, Meghan K; Cahill, Alison G; de Guzman Strong, Cristina; England, Sarah K
2017-08-01
Although oxytocin is commonly used to augment or induce labor, it is difficult to predict its effectiveness because oxytocin dose requirements vary significantly among women. One possibility is that women requiring high or low doses of oxytocin have variations in the oxytocin receptor gene. This study aimed to identify oxytocin receptor gene variants in laboring women with low and high oxytocin dosage requirements. Term, nulliparous women requiring oxytocin doses of ≤4 mU/min (low-dose-requiring, n = 83) or ≥20 mU/min (high-dose-requiring, n = 104) for labor augmentation or induction provided consent to a postpartum blood draw as a source of genomic DNA. Targeted-amplicon sequencing (coverage >30×) with MiSeq (Illumina) was performed to discover variants in the coding exons of the oxytocin receptor gene. Baseline relevant clinical history, outcomes, demographics, and oxytocin receptor gene sequence variants and their allele frequencies were compared between low-dose-requiring and high-dose-requiring women. The Sorting Intolerant From Tolerant (SIFT) algorithm was used to predict the effect of variants on oxytocin receptor function. The Fisher exact or χ² tests were used for categorical variables, and Student t tests or Wilcoxon rank sum tests were used for continuous variables. A P value < .05 was considered statistically significant. The high-dose-requiring women had greater rates of obesity and diabetes and were more likely to have undergone labor induction and required prostaglandins. High-dose-requiring women were more likely to undergo cesarean delivery for first-stage arrest and less likely to undergo cesarean delivery for nonreassuring fetal status. Targeted sequencing of the oxytocin receptor gene in the total cohort (n = 187) revealed 30 distinct coding variants: 17 nonsynonymous, 11 synonymous, and 2 small structural variants. One novel variant (A243T) was found in both the low- and high-dose-requiring groups.
Three novel variants (Y106H, A240_A249del, and P197delfs*206) resulting in an amino acid substitution, loss of 9 amino acids, and a frameshift stop mutation, respectively, were identified only in low-dose-requiring women. Nine nonsynonymous variants were unique to the high-dose-requiring group. These included 3 known variants (R151C, G221S, and W228C) and 6 novel variants (M133V, R150L, H173R, A248V, G253R, and I266V). Of these, R150L, R151C, and H173R were predicted by the Sorting Intolerant From Tolerant (SIFT) algorithm to damage oxytocin receptor function. There was no statistically significant association between the numbers of synonymous and nonsynonymous substitutions in the patient groups. Obesity, diabetes, and labor induction were associated with the requirement for high doses of oxytocin. We did not identify significant differences in the prevalence of oxytocin receptor variants between low-dose-requiring and high-dose-requiring women, but novel oxytocin receptor variants were enriched in the high-dose-requiring women. We also found 3 oxytocin receptor variants (2 novel, 1 known) that were predicted to damage oxytocin receptor function and would likely increase an individual's risk for requiring a high oxytocin dose. Further investigation of oxytocin receptor variants and their effects on protein function will inform precision medicine in pregnant women. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Carss, Keren J; Arno, Gavin; Erwood, Marie; Stephens, Jonathan; Sanchis-Juan, Alba; Hull, Sarah; Megy, Karyn; Grozeva, Detelina; Dewhurst, Eleanor; Malka, Samantha; Plagnol, Vincent; Penkett, Christopher; Stirrups, Kathleen; Rizzo, Roberta; Wright, Genevieve; Josifova, Dragana; Bitner-Glindzicz, Maria; Scott, Richard H; Clement, Emma; Allen, Louise; Armstrong, Ruth; Brady, Angela F; Carmichael, Jenny; Chitre, Manali; Henderson, Robert H H; Hurst, Jane; MacLaren, Robert E; Murphy, Elaine; Paterson, Joan; Rosser, Elisabeth; Thompson, Dorothy A; Wakeling, Emma; Ouwehand, Willem H; Michaelides, Michel; Moore, Anthony T; Webster, Andrew R; Raymond, F Lucy
2017-01-05
Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in CHM in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease. Copyright © 2017. Published by Elsevier Inc.
Characterization of Novel Missense Variants of SERPINA1 Gene Causing Alpha-1 Antitrypsin Deficiency.
Matamala, Nerea; Lara, Beatriz; Gomez-Mariano, Gema; Martínez, Selene; Retana, Diana; Fernandez, Taiomara; Silvestre, Ramona Angeles; Belmonte, Irene; Rodriguez-Frias, Francisco; Vilar, Marçal; Sáez, Raquel; Iturbe, Igor; Castillo, Silvia; Molina-Molina, María; Texido, Anna; Tirado-Conde, Gema; Lopez-Campos, Jose Luis; Posada, Manuel; Blanco, Ignacio; Janciauskiene, Sabina; Martinez-Delgado, Beatriz
2018-06-01
The SERPINA1 gene is highly polymorphic, with more than 100 variants described in databases. SERPINA1 encodes the alpha-1 antitrypsin (AAT) protein, and severe deficiency of AAT is a major contributor to pulmonary emphysema and liver diseases. In Spanish patients with AAT deficiency, we identified seven new variants of the SERPINA1 gene involving amino acid substitutions in different exons: PiSDonosti (S+Ser14Phe), PiTijarafe (Ile50Asn), PiSevilla (Ala58Asp), PiCadiz (Glu151Lys), PiTarragona (Phe227Cys), PiPuerto Real (Thr249Ala), and PiValencia (Lys328Glu). We examined the characteristics of these variants and the putative association with the disease. Mutant proteins were overexpressed in HEK293T cells, and AAT expression, polymerization, degradation, and secretion, as well as antielastase activity, were analyzed by periodic acid-Schiff staining, Western blotting, pulse-chase, and elastase inhibition assays. When overexpressed, S+S14F, I50N, A58D, F227C, and T249A variants formed intracellular polymers and did not secrete AAT protein. Both the E151K and K328E variants secreted AAT protein and did not form polymers, although K328E showed intracellular retention and reduced antielastase activity. We conclude that deficient variants may be more frequent than previously thought and that their discovery is possible only by the complete sequencing of the gene and subsequent functional characterization. Better knowledge of SERPINA1 variants would improve diagnosis and management of individuals with AAT deficiency.
Zhou, Jie; Kherani, Femida; Bardakjian, Tanya M.; Katowitz, James; Hughes, Nkecha; Schimmenti, Lisa A.; Schneider, Adele
2008-01-01
Purpose Mutations in the SOX2 and CHX10 genes have been reported in patients with anophthalmia and/or microphthalmia. In this study, we evaluated 34 anophthalmic/microphthalmic patient DNA samples (two sets of siblings included) for mutations and sequence variants in SOX2 and CHX10. Methods Conformational sensitive gel electrophoresis (CSGE) was used for the initial SOX2 and CHX10 screening of 34 affected individuals (two sets of siblings), five unaffected family members, and 80 healthy controls. Patient samples containing heteroduplexes were selected for sequence analysis. Base pair changes in SOX2 and CHX10 were confirmed by sequencing bidirectionally in patient samples. Results Two novel heterozygous mutations and two sequence variants (one known) in SOX2 were identified in this cohort. Mutation c.310 G>T (p. Glu104X), found in one patient, was in the region encoding the high mobility group (HMG) DNA-binding domain and resulted in a change from glutamic acid to a stop codon. The second mutation, noted in two affected siblings, was a single nucleotide deletion c.549delC (p. Pro184ArgfsX19) in the region encoding the activation domain, resulting in a frameshift and premature termination of the coding sequence. The shortened protein products may result in the loss of function. In addition, a novel nucleotide substitution c.*557G>A was identified in the 3′-untranslated region in one patient. The relationship between the nucleotide change and the protein function is indeterminate. A known single nucleotide polymorphism (c. *469 C>A, SNP rs11915160) was also detected in 2 of the 34 patients. Screening of CHX10 identified two synonymous sequence variants, c.471 C>T (p.Ser157Ser, rs35435463) and c.579 G>A (p. Gln193Gln, novel SNP), and one non-synonymous sequence variant, c.871 G>A (p. Asp291Asn, novel SNP). The non-synonymous polymorphism was also present in healthy controls, suggesting non-causality. 
Conclusions These results support the role of SOX2 in ocular development. Loss of SOX2 function results in severe eye malformation. CHX10 was not implicated with microphthalmia/anophthalmia in our patient cohort. PMID:18385794
Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E
2016-06-20
Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
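The idea of using molecular barcodes to build one consensus read per original DNA molecule, as described in the abstract above, can be sketched as a per-position majority vote within each barcode family. This is an illustrative simplification and not the authors' implementation: the function name, the `min_family_size` and `min_agreement` parameters and their default values are all assumptions, and reads are assumed to be pre-aligned and of equal length.

```python
# Illustrative sketch; name, parameters and thresholds are assumptions.
from collections import Counter, defaultdict


def consensus_by_barcode(reads, min_family_size=3, min_agreement=0.9):
    """Collapse (barcode, sequence) pairs into one consensus sequence per barcode.

    Families smaller than `min_family_size`, or with reads of unequal length,
    are skipped. A position is reported as 'N' when the most common base is
    supported by less than `min_agreement` of the family's reads.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    consensus = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if any(len(s) != len(seqs[0]) for s in seqs):
            continue
        bases = []
        for column in zip(*seqs):  # iterate position by position
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[barcode] = "".join(bases)
    return consensus


if __name__ == "__main__":
    toy_reads = [("AAA", "ACGT"), ("AAA", "ACGT"), ("AAA", "ACTT"), ("CCC", "ACGT")]
    print(consensus_by_barcode(toy_reads, min_agreement=0.6))
```

Because a sequencing error typically appears in only some reads of a family while a true variant appears in all of them, this kind of voting is what suppresses background noise enough to call low-frequency alleles.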
Griaud, François; Winter, Andrej; Denefeld, Blandine; Lang, Manuel; Hensinger, Héloïse; Straube, Frank; Sackewitz, Mirko; Berg, Matthias
Patent expiration of first-generation biologics and the high cost of innovative biologics are two drivers for the development of biosimilar products. There are, however, technical challenges to the production of exact copies of such large molecules. In this study, we performed a head-to-head comparison between the originator anti-VEGF-A Fab product LUCENTIS® (ranibizumab) and an intended copy product using an integrated analytical approach. While no differences could be observed using size-exclusion chromatography, capillary electrophoresis-sodium dodecyl sulfate and potency assays, different acidic peaks were identified with cation-exchange chromatography and capillary zone electrophoresis. Further investigation of the intact Fab, subunits and primary sequence with mass spectrometry demonstrated the presence of a modified light chain variant in the intended copy product batches. This variant was characterized by a mass increase of 27.01 Da compared to the originator sequence and its abundance was estimated in the range of 6-9% of the intended copy product light chain. MS/MS spectra interrogation confirmed that this modification relates to a serine to asparagine sequence variant found in the intended copy product light chain. We demonstrated that the integration of high-resolution and sensitive orthogonal technologies was beneficial to assess the similarity of an originator and an intended copy product.
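The 27.01 Da mass increase reported above is consistent with a serine to asparagine substitution, which can be verified from the elemental compositions of the two residues. The sketch below is a worked check under stated assumptions: it uses monoisotopic atomic masses and the standard residue formulas (Ser C3H5NO2, Asn C4H6N2O2), and the function names are hypothetical.

```python
# Worked check; monoisotopic masses in Da, function names are hypothetical.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of amino acid residues (amino acid minus one water).
RESIDUE_FORMULA = {
    "Ser": {"C": 3, "H": 5, "N": 1, "O": 2},
    "Asn": {"C": 4, "H": 6, "N": 2, "O": 2},
}


def residue_mass(name: str) -> float:
    """Monoisotopic mass of an amino acid residue."""
    return sum(MONOISOTOPIC[el] * n for el, n in RESIDUE_FORMULA[name].items())


def substitution_shift(ref: str, alt: str) -> float:
    """Mass change when residue `ref` is replaced by residue `alt`."""
    return residue_mass(alt) - residue_mass(ref)


if __name__ == "__main__":
    print(f"Ser -> Asn: {substitution_shift('Ser', 'Asn'):+.4f} Da")
```

The difference corresponds to a net gain of one carbon, one hydrogen and one nitrogen atom. A mass shift alone does not identify the substitution uniquely, which is why the abstract relies on MS/MS spectra for confirmation.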
Lugo-Martinez, Jose; Pejaver, Vikas; Pagel, Kymberleigh A.; Mort, Matthew; Cooper, David N.; Mooney, Sean D.; Radivojac, Predrag
2016-01-01
Elucidating the precise molecular events altered by disease-causing genetic variants represents a major challenge in translational bioinformatics. To this end, many studies have investigated the structural and functional impact of amino acid substitutions. Most of these studies were however limited in scope to either individual molecular functions or were concerned with functional effects (e.g. deleterious vs. neutral) without specifically considering possible molecular alterations. The recent growth of structural, molecular and genetic data presents an opportunity for more comprehensive studies to consider the structural environment of a residue of interest, to hypothesize specific molecular effects of sequence variants and to statistically associate these effects with genetic disease. In this study, we analyzed data sets of disease-causing and putatively neutral human variants mapped to protein 3D structures as part of a systematic study of the loss and gain of various types of functional attribute potentially underlying pathogenic molecular alterations. We first propose a formal model to assess probabilistically function-impacting variants. We then develop an array of structure-based functional residue predictors, evaluate their performance, and use them to quantify the impact of disease-causing amino acid substitutions on catalytic activity, metal binding, macromolecular binding, ligand binding, allosteric regulation and post-translational modifications. We show that our methodology generates actionable biological hypotheses for up to 41% of disease-causing genetic variants mapped to protein structures suggesting that it can be reliably used to guide experimental validation. Our results suggest that a significant fraction of disease-causing human variants mapping to protein structures are function-altering both in the presence and absence of stability disruption. PMID:27564311
Lugo-Martinez, Jose; Pejaver, Vikas; Pagel, Kymberleigh A; Jain, Shantanu; Mort, Matthew; Cooper, David N; Mooney, Sean D; Radivojac, Predrag
2016-08-01
Elucidating the precise molecular events altered by disease-causing genetic variants represents a major challenge in translational bioinformatics. To this end, many studies have investigated the structural and functional impact of amino acid substitutions. Most of these studies were however limited in scope to either individual molecular functions or were concerned with functional effects (e.g. deleterious vs. neutral) without specifically considering possible molecular alterations. The recent growth of structural, molecular and genetic data presents an opportunity for more comprehensive studies to consider the structural environment of a residue of interest, to hypothesize specific molecular effects of sequence variants and to statistically associate these effects with genetic disease. In this study, we analyzed data sets of disease-causing and putatively neutral human variants mapped to protein 3D structures as part of a systematic study of the loss and gain of various types of functional attribute potentially underlying pathogenic molecular alterations. We first propose a formal model to assess probabilistically function-impacting variants. We then develop an array of structure-based functional residue predictors, evaluate their performance, and use them to quantify the impact of disease-causing amino acid substitutions on catalytic activity, metal binding, macromolecular binding, ligand binding, allosteric regulation and post-translational modifications. We show that our methodology generates actionable biological hypotheses for up to 41% of disease-causing genetic variants mapped to protein structures suggesting that it can be reliably used to guide experimental validation. Our results suggest that a significant fraction of disease-causing human variants mapping to protein structures are function-altering both in the presence and absence of stability disruption.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McGrath, B.C.; Dunn, J.J.; France, L.L.
1995-12-31
Lyme borreliosis, caused by the spirochete Borrelia burgdorferi, is the most common vector-borne disease in North America and Western Europe. Because they elicit the major delayed immune response in humans, the major outer surface lipoproteins OspA and OspB are of much interest, and a better understanding of them is needed. These proteins have been shown to exhibit three distinct phylogenetic genotypes based on their DNA sequences. This paper describes the cloning of genomic DNA for each variant and its amplification by PCR. DNA sequence data were used for computer-driven phylogenetic analysis and to deduce amino acid sequences. Overproduction of variant OspAs was carried out in E. coli using a T7-based expression system. Circular dichroism and fluorescence studies were carried out on the recombinant B31 OspA, yielding evidence supporting a B31 protein containing 11% alpha-helix, 34% antiparallel beta-sheet, and 12% parallel beta-sheet.
[Application of the polymerase chain reaction (PCR) in the diagnosis of Hb S-beta(+)-thalassemia].
Harano, K; Harano, T; Kushida, Y; Ueda, S
1991-08-01
Isoelectric focusing of the hemolysate prepared from a two-year-old American black boy with microcytic hypochromia showed the presence of a high percentage (63.3%) of an Hb S-like variant, while the levels of Hb A, Hb F and Hb A2 were 20.0%, 12.7%, and 4.0%, respectively. The ratio of the non-alpha-chain to the alpha-chain of the biosynthesized globin chains was 0.49. The variant was identified as Hb S by amino acid analysis of the abnormal peptide (beta T-1) and digestion of DNA amplified by the polymerase chain reaction with enzyme Eco 81 I. This was further confirmed by DNA sequencing. DNA sequencing of a beta-gene without the beta s-mutation revealed a nucleotide change of T to C in the polyadenylation signal sequence AATAAA 3' to the beta-gene, resulting in beta(+)-thalassemia. These results are consistent with the existence of a beta s-gene and a beta(+)-thalassemia gene in trans.
Lee, Jin Goo; Gu, Se Hun; Baek, Luck Ju; Shin, Ok Sarah; Park, Kwang Sook; Kim, Heung-Chul; Klein, Terry A.; Yanagihara, Richard; Song, Jin-Won
2014-01-01
The genome of Muju virus (MUJV), identified originally in the royal vole (Myodes regulus) in Korea, was fully sequenced to ascertain its genetic and phylogenetic relationship with Puumala virus (PUUV), harbored by the bank vole (My. glareolus), and a PUUV-like virus, named Hokkaido virus (HOKV), in the grey red-backed vole (My. rufocanus) in Japan. Whole genome sequence analysis of the 6544-nucleotide large (L), 3652-nucleotide medium (M) and 1831-nucleotide small (S) segments of MUJV, as well as the amino acid sequences of their gene products, indicated that MUJV strains from different capture sites might represent genetic variants of PUUV, the prototype arvicolid rodent-borne hantavirus in Europe. Distinct geographic-specific clustering of MUJV was found in different provinces in Korea, and phylogenetic analyses revealed that MUJV and HOKV share a common ancestry with PUUV. A better understanding of the taxonomic classification and pathogenic potential of MUJV must await its isolation in cell culture. PMID:24736214
Valliere-Douglass, John F; Kodama, Paul; Mujacic, Mirna; Brady, Lowell J; Wang, Wes; Wallace, Alison; Yan, Boxu; Reddy, Pranhitha; Treuheit, Michael J; Balland, Alain
2009-11-20
We report that N-linked oligosaccharide structures can be present on an asparagine residue not adhering to the consensus site motif NX(S/T), where X is not proline, described in the literature. We have observed oligosaccharides on a non-consensus asparaginyl residue in the C(H)1 constant domain of IgG1 and IgG2 antibodies. The initial findings were obtained from characterization of charge variant populations evident in a recombinant human antibody of the IgG2 subclass. HPLC-MS results indicated that cation-exchange chromatography acidic variant populations were enriched in antibody with a second glycosylation site, in addition to the well documented canonical glycosylation site located in the C(H)2 domain. Subsequent tryptic and chymotryptic peptide map data indicated that the second glycosylation site was associated with the amino acid sequence TVSWN(162)SGAL in the C(H)1 domain of the antibody. This highly atypical modification is present at levels of 0.5-2.0% on most of the recombinant antibodies that have been tested and has also been observed in IgG1 antibodies derived from human donors. Site-directed mutagenesis of the C(H)1 domain sequence in a recombinant-human IgG1 antibody resulted in an increase in non-consensus glycosylation to 3.15%, a greater than 4-fold increase over the level observed in the wild type, by changing the -1 and +1 amino acids relative to the asparagine residue at position 162. We believe that further understanding of the phenomenon of non-consensus glycosylation can be used to gain fundamental insights into the fidelity of the cellular glycosylation machinery.
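The consensus motif NX(S/T), with X not proline, is easy to express as a pattern. The Python sketch below scans a sequence for consensus sequons and shows that the TVSWN(162)SGAL peptide from the abstract contains none, which is why the site is described as non-consensus. The function name and the extra test sequences are illustrative assumptions.

```python
import re

# N-X-(S/T) with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def consensus_sequon_positions(seq):
    """Return 0-based indices of asparagines that start an N-X-(S/T) sequon."""
    return [match.start() for match in SEQUON.finditer(seq)]

ch1_peptide = "TVSWNSGAL"  # peptide reported in the abstract (Asn at index 4)
positions = consensus_sequon_positions(ch1_peptide)
print(positions)
```

In this peptide the asparagine is followed by Ser-Gly, so the residue two positions downstream is glycine rather than serine or threonine and the pattern does not match.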
Cuypers, Lize; Li, Guangdi; Libin, Pieter; Piampongsant, Supinya; Vandamme, Anne-Mieke; Theys, Kristof
2015-09-16
Treatment with pan-genotypic direct-acting antivirals, targeting different viral proteins, is the best option for clearing hepatitis C virus (HCV) infection in chronically infected patients. However, the diversity of the HCV genome is a major obstacle for the development of antiviral drugs, vaccines, and genotyping assays. In this large-scale analysis, genome-wide diversity and selective pressure were mapped, focusing on positions important for treatment, drug resistance, and resistance testing. A dataset of 1415 full-genome sequences, including genotypes 1-6 from the Los Alamos database, was analyzed. In 44% of all full-genome positions, the consensus amino acid was different for at least one genotype. Focusing on positions sharing the same consensus amino acid in all genotypes revealed that only 15% were defined as pan-genotypic highly conserved (≥99% amino acid identity) and an additional 24% as pan-genotypic conserved (≥95%). Despite its large genetic diversity, across all genotypes, codon positions were rarely identified to be positively selected (0.23%-0.46%) and predominantly found to be under negative selective pressure, suggesting mainly neutral evolution. For NS3, NS5A, and NS5B, respectively, 40% (6/15), 33% (3/9), and 14% (2/14) of the resistance-related positions harbored as consensus the amino acid variant related to resistance, potentially impeding treatment. For example, the NS3 variant 80K, conferring resistance to simeprevir used for treatment of HCV1 infected patients, was present in 39.3% of the HCV1a strains and 0.25% of HCV1b strains. Both NS5A variants 28M and 30S, known to be associated with resistance to the pan-genotypic drug daclatasvir, were found in a significant proportion of HCV4 strains (10.7%). NS5B variant 556G, known to confer resistance to non-nucleoside inhibitor dasabuvir, was observed in 8.4% of the HCV1b strains.
Given the large HCV genetic diversity, sequencing efforts for resistance testing purposes may need to be genotype-specific or geographically tailored.
Wala, Jeremiah; Zhang, Cheng-Zhong; Meyerson, Matthew; Beroukhim, Rameen
2016-07-01
We developed VariantBam, a C++ read filtering and profiling tool for use with BAM, CRAM and SAM sequencing files. VariantBam provides a flexible framework for extracting sequencing reads or read-pairs that satisfy combinations of rules, defined by any number of genomic intervals or variant sites. We have implemented filters based on alignment data, sequence motifs, regional coverage and base quality. For example, VariantBam achieved a median size reduction ratio of 3.1:1 when applied to 10 lung cancer whole genome BAMs by removing large tags and selecting for only high-quality variant-supporting reads and reads matching a large dictionary of sequence motifs. Thus VariantBam enables efficient storage of sequencing data while preserving the most relevant information for downstream analysis. VariantBam and full documentation are available at github.com/jwalabroad/VariantBam. Contact: rameen@broadinstitute.org. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Cao, Jingyuan; Zhou, Wenting; Yi, Yao; Jia, Zhiyuan; Bi, Shengli
2013-01-01
Hepatitis A virus (HAV) is the most common cause of infectious hepatitis throughout the world, spread largely by the fecal-oral route. To characterize the genetic diversity of the virus circulating in China, where HAV is endemic, we selected the outbreak cases with identical sequences in the VP1-2A junction region and compiled a panel of 42 isolates. The VP3-VP1-2A regions of the HAV capsid-coding genes were further sequenced and analyzed. The quasispecies distribution was evaluated by cloning the VP3 and VP1-2A genes in three clinical samples. Phylogenetic analysis demonstrated that the same genotyping results could be obtained whether using the complete VP3, VP1, or partial VP1-2A genes for analysis in this study, although some differences did exist. Most isolates clustered in sub-genotype IA, and fewer in sub-genotype IB. No amino acid mutations were found at the published neutralizing epitope sites; however, several unique amino acid substitutions in the VP3 or VP1 region were identified, with two amino acid variants located close to the immunodominant site. Quasispecies analysis showed the mutation frequencies were in the range of 7.22 × 10⁻⁴ to 2.33 × 10⁻³ substitutions per nucleotide for VP3, VP1, or VP1-2A. When compared with the consensus sequences, mutated nucleotide sites represented the minority of all the analyzed sequence sites. HAV replicated as a complex distribution of closely genetically related variants, referred to as quasispecies, and was under negative selection. The results indicate that diverse HAV strains and quasispecies inside the viral populations are present in China, with unique amino acid substitutions detected close to the immunodominant site, and that the possibility of antigenic escape mutants cannot be ruled out and needs to be further analyzed. PMID:24069343
BlackOPs: increasing confidence in variant detection through mappability filtering.
Cabanski, Christopher R; Wilkerson, Matthew D; Soloway, Matthew; Parker, Joel S; Liu, Jinze; Prins, Jan F; Marron, J S; Perou, Charles M; Hayes, D Neil
2013-10-01
Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
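The filtering step described in this abstract amounts to a set lookup on position and allele. The Python sketch below shows that idea under stated assumptions: the dictionary keys, the tuple layout of the blacklist and the example calls are all hypothetical and do not reflect BlackOPs' actual file formats.

```python
def filter_blacklisted(calls, blacklist):
    """Split variant calls into (kept, removed) using a mismapping blacklist.

    calls: list of dicts with 'chrom', 'pos' and 'alt' keys (hypothetical format).
    blacklist: set of (chrom, pos, alt) tuples produced for one specific
    aligner and read length (hypothetical format).
    """
    kept, removed = [], []
    for call in calls:
        key = (call["chrom"], call["pos"], call["alt"])
        # Both position and allele must match for a call to be removed.
        (removed if key in blacklist else kept).append(call)
    return kept, removed

# Invented example data, for illustration only.
blacklist = {("chr1", 1000, "T"), ("chr7", 5500, "G")}
calls = [
    {"chrom": "chr1", "pos": 1000, "alt": "T"},  # matches a blacklist entry
    {"chrom": "chr1", "pos": 1000, "alt": "C"},  # same position, other allele
    {"chrom": "chr2", "pos": 42, "alt": "A"},
]
kept, removed = filter_blacklisted(calls, blacklist)
print(len(kept), len(removed))
```

Because the abstract reports that blacklists are specific to the alignment algorithm and read length, a lookup like this would only be meaningful with a blacklist generated for the same experimental setup as the calls.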
LenVarDB: database of length-variant protein domains.
Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan
2014-01-01
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Hepatitis delta genotypes in chronic delta infection in the northeast of Spain (Catalonia).
Cotrina, M; Buti, M; Jardi, R; Quer, J; Rodriguez, F; Pascual, C; Esteban, R; Guardia, J
1998-06-01
Based on genetic analysis of variants obtained around the world, three genotypes of the hepatitis delta virus have been defined. Hepatitis delta virus variants have been associated with different disease patterns and geographic distributions. To determine the prevalence of hepatitis delta virus genotypes in the northeast of Spain (Catalonia) and the correlation with transmission routes and clinical disease, we studied the nucleotide divergence of the consensus sequence of HDV RNA obtained from 33 patients with chronic delta hepatitis (24 were intravenous drug users and nine had no risk factors), and four patients with acute self-limited delta infection. Serum HDV RNA was amplified by the polymerase chain reaction technique and a fragment of 350 nucleotides (nt 910 to 1259) was directly sequenced. Genetic analysis of the nucleotide consensus sequence obtained showed a high degree of conservation among sequences (mean identity of 93%). Comparison of these sequences with those derived from different geographic areas and pertaining to genotypes I, II and III, showed a mean sequence identity of 92% with genotype I, 73% with genotype II and 61% with genotype III. At the amino acid level (aa 115 to 214), the mean identity was 87% with genotype I, 63% with genotype II and 56% with genotype III. Conserved regions included the RNA editing domain, the carboxyl terminal 19 amino acids of the hepatitis delta antigen and the polyadenylation signal of the viral mRNA. Hepatitis delta virus isolates in the northeast of Spain are exclusively genotype I, independently of the transmission route and the type of infection. No hepatitis delta virus subgenotypes were found, suggesting that the origin of hepatitis delta virus infection in our geographical area is homogeneous.
Tsai, C P; Pan, C H; Liu, M Y; Lin, Y L; Chen, C M; Huang, T S; Cheng, I C; Jong, M H; Yang, P C
2000-06-01
Sequence diversity of the complete VP1 gene, directly amplified from 49 clinical specimens during an explosive foot-and-mouth disease (FMD) outbreak in Taiwan, was assessed. Type O Taiwan FMD viruses are genetically highly homogeneous, as seen by the minute divergence of 0.2-0.9% revealed in 20 variants. The O/HCP-0314/TW/97 and O/TCP-022/TW/97 viral variants dominated FMD outbreaks and were prevalent in most affected pig-raising areas. Comparison of deduced amino acid sequences around the main neutralizable antigenic sites on the VP1 polypeptide showed no significant antigenic variation. However, the O/CHP-158/TW/97 variant had an alternative critical residue at position 43 in antigenic site 3, which may be due to selective pressure in the field. Two vaccine production strains (O1/Manisa/Turkey/69 and O1/Campos/Brazil/71) probably provide partial heterologous protection of swine against O Taiwan viruses. The type O Taiwan variants clustered in sublineage A1 of four main lineages in the phylogenetic tree. The O/Hong Kong/9/94 and O/1685/Moscow/Russia/95 viruses in sublineage A2 are closely related to the O Taiwan variants. The causative agent for the 1997 epidemic presumably originated from a single common source of type O FMD viruses prevalent in neighboring areas.
Localized structural frustration for evaluating the impact of sequence variants.
Kumar, Sushant; Clarke, Declan; Gerstein, Mark
2016-12-01
Population-scale sequencing is increasingly uncovering large numbers of rare single-nucleotide variants (SNVs) in coding regions of the genome. The rarity of these variants makes it challenging to evaluate their deleteriousness with conventional phenotype-genotype associations. Protein structures provide a way of addressing this challenge. Previous efforts have focused on globally quantifying the impact of SNVs on protein stability. However, local perturbations may severely impact protein functionality without strongly disrupting global stability (e.g. in relation to catalysis or allostery). Here, we describe a workflow in which localized frustration, quantifying unfavorable local interactions, is employed as a metric to investigate such effects. Using this workflow on the Protein Databank, we find that frustration produces many immediately intuitive results: for instance, disease-related SNVs create stronger changes in localized frustration than non-disease related variants, and rare SNVs tend to disrupt local interactions to a larger extent than common variants. Less obviously, we observe that somatic SNVs associated with oncogenes and tumor suppressor genes (TSGs) induce very different changes in frustration. In particular, those associated with TSGs change the frustration more in the core than the surface (by introducing loss-of-function events), whereas those associated with oncogenes manifest the opposite pattern, creating gain-of-function events. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization of Canine parvovirus 2 variants circulating in Greece.
Ntafis, Vasileios; Xylouri, Eftychia; Kalli, Iris; Desario, Costantina; Mari, Viviana; Decaro, Nicola; Buonavoglia, Canio
2010-09-01
The aim of the present study was to characterize Canine parvovirus 2 (CPV-2) variants currently circulating in Greece. Between March 2008 and March 2009, 167 fecal samples were collected from diarrheic dogs from different regions of Greece. Canine parvovirus 2 was detected by standard polymerase chain reaction, whereas minor groove binder probe assays were used to distinguish genetic variants and discriminate between vaccine and field strains. Of 84 CPV-2-positive samples, 81 CPV-2a, 1 CPV-2b, and 2 CPV-2c were detected. Vaccine strains were not detected in any sample. Sequence analysis of the VP2 gene of the 2 CPV-2c viruses revealed up to 100% amino acid identity with the CPV-2c strains previously detected in Europe. The results indicated that, unlike other European countries, CPV-2a remains the most common variant in Greece, and that the CPV-2c variant found in Europe is also present in Greece.
Ranganath, Prajnya; Matta, Divya; Bhavani, Gandham SriLakshmi; Wangnekar, Savita; Jain, Jamal Mohammed Nurul; Verma, Ishwar C; Kabra, Madhulika; Puri, Ratna Dua; Danda, Sumita; Gupta, Neerja; Girisha, Katta M; Sankar, Vaikom H; Patil, Siddaramappa J; Ramadevi, Akella Radha; Bhat, Meenakshi; Gowrishankar, Kalpana; Mandal, Kausik; Aggarwal, Shagun; Tamhankar, Parag Mohan; Tilak, Preetha; Phadke, Shubha R; Dalal, Ashwin
2016-10-01
Acid sphingomyelinase (ASM)-deficient Niemann-Pick disease is an autosomal recessive lysosomal storage disorder caused by biallelic mutations in the SMPD1 gene. To date, around 185 mutations have been reported in patients with ASM-deficient NPD world-wide, but the mutation spectrum of this disease in India has not yet been reported. The aim of this study was to ascertain the mutation profile in Indian patients with ASM-deficient NPD. We sequenced SMPD1 in 60 unrelated families affected with ASM-deficient NPD. A total of 45 distinct pathogenic sequence variants were found, of which 14 were known and 31 were novel. The variants included 30 missense, 4 nonsense, and 9 frameshift (7 single base deletions and 2 single base insertions) mutations, 1 indel, and 1 intronic duplication. The pathogenicity of the novel mutations was inferred with the help of the mutation prediction software MutationTaster, SIFT, Polyphen-2, PROVEAN, and HANSA. The effects of the identified sequence variants on the protein structure were studied using the structure modeled with the help of the SWISS-MODEL workspace program. The p.(Arg542*) (c.1624C>T) mutation was the most commonly identified mutation, found in 22% (26 out of 120) of the alleles tested, but haplotype analysis for this mutation did not identify a founder effect for the Indian population. To the best of our knowledge, this is the largest study on mutation analysis of patients with ASM-deficient Niemann-Pick disease reported in literature and also the first study on the SMPD1 gene mutation spectrum in India. © 2016 Wiley Periodicals, Inc.
Selecting sequence variants to improve genomic predictions for dairy cattle
USDA-ARS's Scientific Manuscript database
Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...
Proteogenomic Investigation of Strain Variation in Clinical Mycobacterium tuberculosis Isolates.
Heunis, Tiaan; Dippenaar, Anzaan; Warren, Robin M; van Helden, Paul D; van der Merwe, Ruben G; Gey van Pittius, Nicolaas C; Pain, Arnab; Sampson, Samantha L; Tabb, David L
2017-10-06
Mycobacterium tuberculosis consists of a large number of different strains that display unique virulence characteristics. Whole-genome sequencing has revealed substantial genetic diversity among clinical M. tuberculosis isolates, and elucidating the phenotypic variation encoded by this genetic diversity will be of the utmost importance to fully understand M. tuberculosis biology and pathogenicity. In this study, we integrated whole-genome sequencing and mass spectrometry (GeLC-MS/MS) to reveal strain-specific characteristics in the proteomes of two clinical M. tuberculosis Latin American-Mediterranean isolates. Using this approach, we identified 59 peptides containing single amino acid variants, which covered ∼9% of all coding nonsynonymous single nucleotide variants detected by whole-genome sequencing. Furthermore, we identified 29 distinct peptides that mapped to a hypothetical protein not present in the M. tuberculosis H37Rv reference proteome. Here, we provide evidence for the expression of this protein in the clinical M. tuberculosis SAWC3651 isolate. The strain-specific databases enabled confirmation of genomic differences (i.e., large genomic regions of difference and nonsynonymous single nucleotide variants) in these two clinical M. tuberculosis isolates and allowed strain differentiation at the proteome level. Our results contribute to the growing field of clinical microbial proteogenomics and can improve our understanding of phenotypic variation in clinical M. tuberculosis isolates.
Wang, Nan; Zhang, Yeting; Gedvilaite, Erika; Loh, Jui Wan; Lin, Timothy; Liu, Xiuping; Liu, Chang-Gong; Kumar, Dibyendu; Donnelly, Robert; Raymond, Kimiyo; Schuchman, Edward H; Sleat, David E; Lobel, Peter; Xing, Jinchuan
2017-11-01
Lysosomes are membrane-bound, acidic eukaryotic cellular organelles that play important roles in the degradation of macromolecules. Mutations that cause the loss of lysosomal protein function can lead to a group of disorders categorized as the lysosomal storage diseases (LSDs). Suspicion of LSD is frequently based on clinical and pathologic findings, but in some cases, the underlying genetic and biochemical defects remain unknown. Here, we performed whole-exome sequencing (WES) on 14 suspected LSD cases to evaluate the feasibility of using WES for identifying causal mutations. By examining 2,157 candidate genes potentially associated with lysosomal function, we identified eight variants in five genes as candidate disease-causing variants in four individuals. These included both known and novel mutations. Variants were corroborated by targeted sequencing and, when possible, functional assays. In addition, we identified nonsense mutations in two individuals in genes that are not known to have lysosomal function. However, mutations in these genes could have resulted in phenotypes that were diagnosed as LSDs. This study demonstrates that WES can be used to identify causal mutations in suspected LSD cases. We also demonstrate cases where a confounding clinical phenotype may potentially reflect more than one lysosomal protein defect. © 2017 Wiley Periodicals, Inc.
Genetic Diversity in Oxytocin Ligands and Receptors in New World Monkeys
Ren, Dongren; Lu, Guoqing; Moriyama, Hideaki; Mustoe, Aaryn C.; Harrison, Emily B.; French, Jeffrey A.
2015-01-01
Oxytocin (OXT) is an important neurohypophyseal hormone that influences a wide spectrum of reproductive and social processes. Eutherian mammals possess a highly conserved sequence of OXT (Cys-Tyr-Ile-Gln-Asn-Cys-Pro-Leu-Gly). However, in this study, we sequenced the coding region for OXT in 22 species covering all New World monkey (NWM) genera and clades, and characterized five OXT variants, including consensus mammalian Leu8-OXT, major variant Pro8-OXT, and three previously unreported variants: Ala8-OXT, Thr8-OXT, and Phe2-OXT. Pro8-OXT shows clear structural and physicochemical differences from Leu8-OXT. We report multiple predicted amino acid substitutions in the G protein-coupled OXT receptor (OXTR), especially in the N-terminus, which is crucial for OXT recognition and binding. Genera with the same Pro8-OXT tend to cluster together on a phylogenetic tree based on OXTR sequence, and we demonstrate significant coevolution between OXT and OXTR. NWM species are characterized by a high incidence of social monogamy, and we document an association between OXTR phylogeny and social monogamy. Our results demonstrate remarkable genetic diversity in the NWM OXT/OXTR system, which can provide a foundation for molecular, pharmacological, and behavioral studies of the role of OXT signaling in regulating complex social phenotypes. PMID:25938568
Computational design of chimeric protein libraries for directed evolution.
Silberg, Jonathan J; Nguyen, Peter Q; Stevenson, Taylor
2010-01-01
The best approach for creating libraries of functional proteins with large numbers of nondisruptive amino acid substitutions is protein recombination, in which structurally related polypeptides are swapped among homologous proteins. Unfortunately, as more distantly related proteins are recombined, the fraction of variants having a disrupted structure increases. One way to enrich the fraction of folded and potentially interesting chimeras in these libraries is to use computational algorithms to anticipate which structural elements can be swapped without disturbing the integrity of a protein's structure. Herein, we describe how the algorithm Schema uses the sequences and structures of the parent proteins recombined to predict the structural disruption of chimeras, and we outline how dynamic programming can be used to find libraries with a range of amino acid substitution levels that are enriched in variants with low Schema disruption.
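A disruption score of the kind this abstract attributes to Schema can be sketched in a few lines of Python. The sketch assumes, as we understand the Schema definition, that a residue-residue contact counts as disrupted when the pair of amino acids found in the chimera does not occur at the same two positions in any parent. That reading, the toy aligned sequences and the contact list are assumptions for illustration, not material from the chapter.

```python
def schema_disruption(chimera, parents, contacts):
    """Count contacts whose residue pair is absent from every parent.

    chimera, parents: aligned sequences of equal length.
    contacts: (i, j) index pairs for residues in contact in the parent
    structure (hypothetical input, e.g. derived from a distance cutoff).
    """
    broken = 0
    for i, j in contacts:
        parental_pairs = {(parent[i], parent[j]) for parent in parents}
        if (chimera[i], chimera[j]) not in parental_pairs:
            broken += 1
    return broken

# Toy example: two aligned parents and a single crossover after position 2.
parents = ["ACDEFG", "AKDLFN"]
contacts = [(1, 3), (0, 5)]
chimera = parents[0][:3] + parents[1][3:]  # "ACDLFN"
score = schema_disruption(chimera, parents, contacts)
print(score)
```

Here the contact (1, 3) pairs C with L, a combination seen in neither parent, while the contact (0, 5) pairs A with N exactly as in the second parent, so only one contact is counted. A library design step would then search crossover locations for chimeras that keep this count low.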
Dunemann, Frank; Schrader, Otto; Budahn, Holger; Houben, Andreas
2014-01-01
In eukaryotes, centromeres are the assembly sites for the kinetochore, a multi-protein complex to which spindle microtubules are attached at mitosis and meiosis, thereby ensuring segregation of chromosomes during cell division. They are specified by incorporation of CENH3, a centromere specific histone H3 variant which replaces canonical histone H3 in the nucleosomes of functional centromeres. To lay a first foundation of a putative alternative haploidization strategy based on centromere-mediated genome elimination in cultivated carrots, in the presented research we aimed at the identification and cloning of functional CENH3 genes in Daucus carota and three distantly related wild species of genus Daucus varying in basic chromosome numbers. Based on mining the carrot transcriptome followed by a subsequent PCR-based cloning, homologous coding sequences for CENH3s of the four Daucus species were identified. The ORFs of the CENH3 variants were very similar, and an amino acid sequence length of 146 aa was found in three out of the four species. Comparison of Daucus CENH3 amino acid sequences with those of other plant CENH3s as well as their phylogenetic arrangement among other dicot CENH3s suggest that the identified genes are authentic CENH3 homologs. To verify the location of the CENH3 protein in the kinetochore regions of the Daucus chromosomes, a polyclonal antibody based on a peptide corresponding to the N-terminus of DcCENH3 was developed and used for anti-CENH3 immunostaining of mitotic root cells. The chromosomal location of CENH3 proteins in the centromere regions of the chromosomes could be confirmed. For genetic localization of the CENH3 gene in the carrot genome, a previously constructed linkage map for carrot was used for mapping a CENH3-specific simple sequence repeat (SSR) marker, and the CENH3 locus was mapped on the carrot chromosome 9. PMID:24887084
Mutations Affecting Expression of the rosy Locus in Drosophila melanogaster
Lee, Chong Sung; Curtis, Daniel; McCarron, Margaret; Love, Carol; Gray, Mark; Bender, Welcome; Chovnick, Arthur
1987-01-01
The rosy locus in Drosophila melanogaster codes for the enzyme xanthine dehydrogenase (XDH). Previous studies defined a "control element" near the 5' end of the gene, where variant sites affected the amount of rosy mRNA and protein produced. We have determined the DNA sequence of this region from both genomic and cDNA clones, and from the ry+10 underproducer strain. This variant strain had many sequence differences, so that the site of the regulatory change could not be fixed. A mutagenesis was also undertaken to isolate new regulatory mutations. We induced 376 new mutations with 1-ethyl-1-nitrosourea (ENU) and screened them to isolate those that reduced the amount of XDH protein produced, but did not change the properties of the enzyme. Genetic mapping was used to find mutations located near the 5' end of the gene. DNA from each of seven mutants was cloned and sequenced through the 5' region. Mutant base changes were identified in all seven; they appear to affect splicing and translation of the rosy mRNA. In a related study (T. P. Keith et al. 1987), the genomic and cDNA sequences are extended through the 3' end of the gene; the combined sequences define the processing pattern of the rosy transcript and predict the amino acid sequence of XDH. PMID:3036645
A Common Mutation in DEFB126 Causes Impaired Sperm Function and Subfertility
Tollner, Theodore L.; Venners, Scott A.; Hollox, Edward J.; Yudin, Ashley I.; Liu, Xue; Tang, Genfu; Xing, Houxun; Kays, Robert J.; Lau, Tsang; Overstreet, James W.; Xu, Xiping; Bevins, Charles L.; Cherr, Gary N.
2013-01-01
A glycosylated polypeptide, β-defensin 126 (DEFB126), derived from the epididymis and adsorbed onto the sperm surface, has been implicated in immunoprotection and efficient movement of sperm in mucosal fluids of the female reproductive tract. Here, we report a sequence variant in DEFB126 that has a 2-nucleotide deletion in the open reading frame, which generates a non-stop mRNA. The allele frequency of this variant sequence is high in both a European (0.47) and a Chinese (0.45) population cohort. Binding of the Agaricus bisporus lectin to the sperm surface glycocalyx was significantly lower in men with the homozygous variant (del/del) genotype than in those with either a del/wt or wt/wt genotype, suggesting an altered sperm glycocalyx with fewer O-linked oligosaccharides in del/del men. Moreover, sperm from the del/del donors exhibited an 84% reduction in the rate of penetration of a hyaluronic acid (HA) gel, a surrogate for cervical mucus, compared to the other genotypes. This reduction in sperm performance in HA gels was not a result of decreased progressive motility (average curvilinear velocity) or morphological deficits. However, DEFB126 genotype and lectin binding were highly correlated with performance in the penetration assays. In a prospective cohort study of newly married couples who were trying to conceive by natural means, couples were less likely to become pregnant and took longer to achieve a live birth if the male partner was homozygous for the variant sequence. This common sequence variation in DEFB126, and its apparent cause of impaired reproductive function, provides an opportunity to better understand, clinically evaluate, and possibly treat human infertility. PMID:21775668
Whole exome sequencing of rare variants in EIF4G1 and VPS35 in Parkinson disease
Nuytemans, Karen; Bademci, Guney; Inchausti, Vanessa; Dressen, Amy; Kinnamon, Daniel D.; Mehta, Arpit; Wang, Liyong; Züchner, Stephan; Beecham, Gary W.; Martin, Eden R.; Scott, William K.
2013-01-01
Objective: Recently, vacuolar protein sorting 35 (VPS35) and eukaryotic translation initiation factor 4 gamma 1 (EIF4G1) have been identified as 2 causal Parkinson disease (PD) genes. We used whole exome sequencing for rapid, parallel analysis of variations in these 2 genes. Methods: We performed whole exome sequencing in 213 patients with PD and 272 control individuals. Those rare variants (RVs) with <5% frequency in the exome variant server database and our own control data were considered for analysis. We performed joint gene-based tests for association using RVASSOC and SKAT (Sequence Kernel Association Test) as well as single-variant test statistics. Results: We identified 3 novel VPS35 variations that changed the coded amino acid (nonsynonymous) in 3 cases. Two variations were in multiplex families and neither segregated with PD. In EIF4G1, we identified 11 (9 nonsynonymous and 2 small indels) RVs including the reported pathogenic mutation p.R1205H, which segregated in all affected members of a large family, but also in 1 unaffected 86-year-old family member. Two additional RVs were found in isolated patients only. Whereas initial association studies suggested an association (p = 0.04) with all RVs in EIF4G1, subsequent testing in a second dataset for the driving variant (p.F1461) suggested no association between RVs in the gene and PD. Conclusions: We confirm that the specific EIF4G1 variation p.R1205H seems to be a strong PD risk factor, but is nonpenetrant in at least one 86-year-old. A few other select RVs in both genes could not be ruled out as causal. However, there was no evidence for an overall contribution of genetic variability in VPS35 or EIF4G1 to PD development in our dataset. PMID:23408866
Zhu, You-Cai; Zhou, Yue-Fen; Wang, Wen-Xian; Xu, Chun-Wei; Zhuang, Wu; Du, Kai-Qi; Chen, Gang
2018-05-01
ROS1 rearrangement is a validated therapeutic driver gene in non-small cell lung cancer (NSCLC) and represents a small subset (1-2%) of NSCLC. A total of 17 different fusion partner genes of ROS1 in NSCLC have been reported. The multi-targeted MET/ALK/ROS1 tyrosine kinase inhibitor (TKI) crizotinib has demonstrated remarkable efficacy in ROS1-rearranged NSCLC. Consequently, ROS1 detection assays include fluorescence in situ hybridization, immunohistochemistry, and real-time PCR. Next-generation sequencing (NGS) assay covers a range of fusion genes and approaches to discover novel receptor-kinase rearrangements in lung cancer. A 63-year-old male smoker with stage IV NSCLC (TxNxM1) was detected with a novel ROS1 fusion. Histological examination of the tumor showed lung adenocarcinoma. NGS analysis of the hydrothorax cellblocks revealed a novel CEP72-ROS1 rearrangement. This novel CEP72-ROS1 fusion variant is generated by the fusion of exons 1-11 of CEP72 on chromosome 5p15 to exons 23-43 of ROS1 on chromosome 6q22. The predicted CEP72-ROS1 protein product contains 1202 amino acids comprising the N-terminal amino acids 594-647 of CEP72 and C-terminal amino acid 1-1148 of ROS1. CEP72-ROS1 is a novel ROS1 fusion variant in NSCLC discovered by NGS and could be included in ROS1 detection assay, such as reverse transcription PCR. Pleural effusion samples show good diagnostic performance in clinical practice. © 2018 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Zhu, You‐cai; Zhou, Yue‐fen; Zhuang, Wu; Du, Kai‐qi; Chen, Gang
2018-01-01
ROS1 rearrangement is a validated therapeutic driver gene in non‐small cell lung cancer (NSCLC) and represents a small subset (1–2%) of NSCLC. A total of 17 different fusion partner genes of ROS1 in NSCLC have been reported. The multi‐targeted MET/ALK/ROS1 tyrosine kinase inhibitor (TKI) crizotinib has demonstrated remarkable efficacy in ROS1‐rearranged NSCLC. Consequently, ROS1 detection assays include fluorescence in situ hybridization, immunohistochemistry, and real‐time PCR. Next‐generation sequencing (NGS) assay covers a range of fusion genes and approaches to discover novel receptor‐kinase rearrangements in lung cancer. A 63‐year‐old male smoker with stage IV NSCLC (TxNxM1) was detected with a novel ROS1 fusion. Histological examination of the tumor showed lung adenocarcinoma. NGS analysis of the hydrothorax cellblocks revealed a novel CEP72‐ROS1 rearrangement. This novel CEP72‐ROS1 fusion variant is generated by the fusion of exons 1–11 of CEP72 on chromosome 5p15 to exons 23–43 of ROS1 on chromosome 6q22. The predicted CEP72‐ROS1 protein product contains 1202 amino acids comprising the N‐terminal amino acids 594–647 of CEP72 and C‐terminal amino acid 1‐1148 of ROS1. CEP72‐ROS1 is a novel ROS1 fusion variant in NSCLC discovered by NGS and could be included in ROS1 detection assay, such as reverse transcription PCR. Pleural effusion samples show good diagnostic performance in clinical practice. PMID:29517860
Dodds, Peter N.; Lawrence, Gregory J.; Catanzariti, Ann-Maree; Teh, Trazel; Wang, Ching-I. A.; Ayliffe, Michael A.; Kobe, Bostjan; Ellis, Jeffrey G.
2006-01-01
Plant resistance proteins (R proteins) recognize corresponding pathogen avirulence (Avr) proteins either indirectly through detection of changes in their host protein targets or through direct R–Avr protein interaction. Although indirect recognition imposes selection against Avr effector function, pathogen effector molecules recognized through direct interaction may overcome resistance through sequence diversification rather than loss of function. Here we show that the flax rust fungus AvrL567 genes, whose products are recognized by the L5, L6, and L7 R proteins of flax, are highly diverse, with 12 sequence variants identified from six rust strains. Seven AvrL567 variants derived from Avr alleles induce necrotic responses when expressed in flax plants containing corresponding resistance genes (R genes), whereas five variants from avr alleles do not. Differences in recognition specificity between AvrL567 variants and evidence for diversifying selection acting on these genes suggest they have been involved in a gene-specific arms race with the corresponding flax R genes. Yeast two-hybrid assays indicate that recognition is based on direct R–Avr protein interaction and recapitulate the interaction specificity observed in planta. Biochemical analysis of Escherichia coli-produced AvrL567 proteins shows that variants that escape recognition nevertheless maintain a conserved structure and stability, suggesting that the amino acid sequence differences directly affect the R–Avr protein interaction. We suggest that direct recognition associated with high genetic diversity at corresponding R and Avr gene loci represents an alternative outcome of plant–pathogen coevolution to indirect recognition associated with simple balanced polymorphisms for functional and nonfunctional R and Avr genes. PMID:16731621
Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.
Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V
2012-02-17
The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
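As a quick consistency check on the figures above, average depth of coverage is total sequenced bases divided by genome length, so the reported yield and depth imply a genome length. The sketch below only rearranges that relation; the implied length is a derived value, not a number stated in the abstract, and the variable names are illustrative.

```python
# Values reported in the abstract.
total_sequence_gb = 59.6   # Gb of DNA sequence generated
average_coverage = 24.7    # reported average fold coverage

# coverage = total bases / genome length  =>  genome length = total bases / coverage
implied_genome_gb = round(total_sequence_gb / average_coverage, 2)
```

The result, about 2.41 Gb, is the genome length implied by the two reported numbers.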
A rare variant in MYH6 is associated with high risk of sick sinus syndrome
Holm, Hilma; Gudbjartsson, Daniel F; Sulem, Patrick; Masson, Gisli; Helgadottir, Hafdis Th; Zanon, Carlo; Magnusson, Olafur Th; Helgason, Agnar; Saemundsdottir, Jona; Gylfason, Arnaldur; Stefansdottir, Hrafnhildur; Gretarsdottir, Solveig; Matthiasson, Stefan E; Thorgeirsson, Guðmundur; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Stefansson, Hreinn; Werge, Thomas; Rafnar, Thorunn; Kiemeney, Lambertus A; Parvez, Babar; Muhammad, Raafia; Roden, Dan M; Darbar, Dawood; Thorleifsson, Gudmar; Walters, G Bragi; Kong, Augustine; Thorsteinsdottir, Unnur; Arnar, David O; Stefansson, Kari
2011-01-01
Through complementary application of SNP genotyping, whole-genome sequencing and imputation in 38,384 Icelanders, we have discovered a previously unidentified sick sinus syndrome susceptibility gene, MYH6, encoding the alpha heavy chain subunit of cardiac myosin. A missense variant in this gene, c.2161C>T, results in the conceptual amino acid substitution p.Arg721Trp, has an allelic frequency of 0.38% in Icelanders and associates with sick sinus syndrome with an odds ratio = 12.53 and P = 1.5 × 10^-29. We show that the lifetime risk of being diagnosed with sick sinus syndrome is around 6% for non-carriers of c.2161C>T but is approximately 50% for carriers of the c.2161C>T variant. PMID:21378987
Zheng, Ling; Shockey, Jay; Bian, Fei; Chen, Gao; Shan, Lei; Li, Xinguo; Wan, Shubo; Peng, Zhenying
2017-01-01
Diacylglycerol acyltransferase (DGAT) catalyzes the final step in triacylglycerol (TAG) biosynthesis via the acyl-CoA-dependent acylation of diacylglycerol. This reaction is a major control point in the Kennedy pathway for biosynthesis of TAG, which is the most important form of stored metabolic energy in most oil-producing plants. In this study, Arachis hypogaea type 2 DGAT (AhDGAT2) genes were cloned from the peanut cultivar ‘Luhua 14.’ Sequence analysis of 11 different peanut cultivars revealed a gene family of 8 peanut DGAT2 genes (designated AhDGAT2a-h). Sequence alignments revealed 21 nucleotide differences between the eight ORFs, but only six differences result in changes to the predicted amino acid (AA) sequences. A representative full-length cDNA clone (AhDGAT2a) was characterized in detail. The biochemical effects of altering the AhDGAT2a sequence to include single variable AA residues were tested by mutagenesis and functional complementation assays in transgenic yeast systems. All six mutant variants retained enzyme activity and produced lipid droplets in vivo. The N6D and A26P mutants also displayed increased enzyme activity and/or total cellular fatty acid (FA) content. N6D mutant mainly increased the content of palmitoleic acid, and A26P mutant mainly increased the content of palmitic acid. The A26P mutant grew well both in the presence of oleic and C18:2, but the other mutants grew better in the presence of C18:2. AhDGAT2 is expressed in all peanut organs analyzed, with high transcript levels in leaves and flowers. These levels are comparable to that found in immature seeds, where DGAT2 expression is most abundant in other plants. Over-expression of AhDGAT2a in tobacco substantially increased the FA content of transformed tobacco seeds. Expression of AhDGAT2a also altered transcription levels of endogenous tobacco lipid metabolic genes in transgenic tobacco, apparently creating a larger carbon ‘sink’ that supports increased FA levels. PMID:29085382
Nath, Rahul; Mant, Christine A; Kell, Barbara; Cason, John; Bible, Jon M
2006-01-01
Background: Human papillomavirus type 16 (HPV-16) E5 protein co-operates with epidermal growth factor to stimulate mitogenesis of murine fibroblasts. Currently, little is known about which viral amino acids are involved in this process. Using sequence variants of HPV-16 E5 we have investigated their effects upon E5 transcription, cell-cycling and cell-growth of murine fibroblasts. Results: We demonstrate that: (i) introduction of Thr64 into the reference E5 sequence of HPV-16 abrogates mitogenic activity: both were poorly transcribed in NIH-3T3 cells; (ii) substitution of Leu44Val65 or Thr37Leu44Val65 into the HPV-16 E5 reference backbone resulted in high transcription in NIH-3T3 cells, enhanced cell-cycle progression and high cell-growth; and (iii) inclusion of Tyr8 into the Leu44Val65 backbone inhibited E5 induced cell-growth and repression of p21 expression, despite high transcription levels. Conclusion: The effects of HPV-16 E5 variants upon mitosis help to explain why Leu44Val65 HPV-16 E5 variants are most prevalent in 'wild' pathogenic viral populations in the UK. PMID:16899131
Littlejohn, Mathew D.; Tiplady, Kathryn; Lopdell, Thomas; Law, Tania A.; Scott, Andrew; Harland, Chad; Sherlock, Ric; Henty, Kristen; Obolonkin, Vlad; Lehnert, Klaus; MacGibbon, Alistair; Spelman, Richard J.; Davis, Stephen R.; Snell, Russell G.
2014-01-01
Milk is composed of a complex mixture of lipids, proteins, carbohydrates and various vitamins and minerals as a source of nutrition for young mammals. The composition of milk varies between individuals, with lipid composition in particular being highly heritable. Recent reports have highlighted a region of bovine chromosome 27 harbouring variants affecting milk fat percentage and fatty acid content. We aimed to further investigate this locus in two independent cattle populations, consisting of a Holstein-Friesian x Jersey crossbreed pedigree of 711 F2 cows, and a collection of 32,530 mixed ancestry Bos taurus cows. Bayesian genome-wide association mapping using markers imputed from the Illumina BovineHD chip revealed a large quantitative trait locus (QTL) for milk fat percentage on chromosome 27, present in both populations. We also investigated a range of other milk composition phenotypes, and report additional associations at this locus for fat yield, protein percentage and yield, lactose percentage and yield, milk volume, and the proportions of numerous milk fatty acids. We then used mammary RNA sequence data from 212 lactating cows to assess the transcript abundance of genes located in the milk fat percentage QTL interval. This analysis revealed a strong eQTL for AGPAT6, demonstrating that high milk fat percentage genotype is also additively associated with increased expression of the AGPAT6 gene. Finally, we used whole genome sequence data from six F1 sires to target a panel of novel AGPAT6 locus variants for genotyping in the F2 crossbreed population. Association analysis of 58 of these variants revealed highly significant association for polymorphisms mapping to the 5′UTR exons and intron 1 of AGPAT6. Taken together, these data suggest that variants affecting the expression of AGPAT6 are causally involved in differential milk fat synthesis, with pleiotropic consequences for a diverse range of other milk components. PMID:24465687
Brady, Graham F; Kwan, Raymond; Ulintz, Peter J; Nguyen, Phirum; Bassirian, Shirin; Basrur, Venkatesha; Nesvizhskii, Alexey I; Loomba, Rohit; Omary, M Bishr
2018-05-01
Nonalcoholic fatty liver disease (NAFLD) is becoming the major chronic liver disease in many countries. Its pathogenesis is multifactorial, but twin and familial studies indicate significant heritability, which is not fully explained by currently known genetic susceptibility loci. Notably, mutations in genes encoding nuclear lamina proteins, including lamins, cause lipodystrophy syndromes that include NAFLD. We hypothesized that variants in lamina-associated proteins predispose to NAFLD and used a candidate gene-sequencing approach to test for variants in 10 nuclear lamina-related genes in a cohort of 37 twin and sibling pairs: 21 individuals with and 53 without NAFLD. Twelve heterozygous sequence variants were identified in four lamina-related genes (ZMPSTE24, TMPO, SREBF1, SREBF2). The majority of NAFLD patients (>90%) had at least one variant compared to <40% of controls (P < 0.0001). When only insertions/deletions and changes in conserved residues were considered, the difference between the groups was similarly striking (>80% versus <25%; P < 0.0001). Presence of a lamina variant segregated with NAFLD independently of the PNPLA3 I148M polymorphism. Several variants were found in TMPO, which encodes the lamina-associated polypeptide-2 (LAP2) that has not been associated with liver disease. One of these, a frameshift insertion that generates truncated LAP2, abrogated lamin-LAP2 binding, caused LAP2 mislocalization, altered endogenous lamin distribution, increased lipid droplet accumulation after oleic acid treatment in transfected cells, and led to cytoplasmic association with the ubiquitin-binding protein p62/SQSTM1. Several variants in nuclear lamina-related genes were identified in a cohort of twins and siblings with NAFLD; one such variant, which results in a truncated LAP2 protein and a dramatic phenotype in cell culture, represents an association of TMPO/LAP2 variants with NAFLD and underscores the potential importance of the nuclear lamina in NAFLD. 
(Hepatology 2018;67:1710-1725). © 2017 by the American Association for the Study of Liver Diseases.
Evaluation of exome variants using the Ion Proton Platform to sequence error-prone regions.
Seo, Heewon; Park, Yoomi; Min, Byung Joo; Seo, Myung Eui; Kim, Ju Han
2017-01-01
The Ion Proton sequencer from Thermo Fisher accurately determines sequence variants from target regions with a rapid turnaround time at a low cost. However, misleading variant-calling errors can occur. We performed a systematic evaluation and manual curation of read-level alignments for the 675 ultrarare variants that were reported by the Ion Proton sequencer from 27 whole-exome sequencing datasets but are not present in either the 1000 Genomes Project or the Exome Aggregation Consortium. We classified positive variant calls into 393 highly likely false positives, 126 likely false positives, and 156 likely true positives, which comprised 58.2%, 18.7%, and 23.1% of the variants, respectively. We identified four distinct error patterns of variant calling that may be bioinformatically corrected when using different strategies: simplicity region, SNV cluster, peripheral sequence read, and base inversion. Local de novo assembly successfully corrected 201 (38.7%) of the 519 highly likely or likely false positives. We also demonstrate that the two sequencing kits from Thermo Fisher (the Ion PI Sequencing 200 kit V3 and the Ion PI Hi-Q kit) exhibit different error profiles across different error types. A refined calling algorithm with better polymerase may improve the performance of the Ion Proton sequencing platform.
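The percentages quoted above follow directly from the reported counts. The short sketch below recomputes them as a worked check; the dictionary keys and variable names are illustrative, and only the counts come from the abstract.

```python
# Counts reported in the abstract for the curated ultrarare variant calls.
counts = {
    "highly likely false positive": 393,
    "likely false positive": 126,
    "likely true positive": 156,
}

total = sum(counts.values())

# Share of each class among all curated calls, in percent (one decimal place).
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Fraction of false positives (both classes) corrected by local de novo assembly.
false_positives = counts["highly likely false positive"] + counts["likely false positive"]
corrected_pct = round(100 * 201 / false_positives, 1)
```

The recomputed values agree with the abstract: 675 calls in total, 519 false positives across both classes, and the stated class shares and correction rate.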
A variant Tc4 transposable element in the nematode C. elegans could encode a novel protein.
Li, W; Shaw, J E
1993-01-01
A variant C. elegans Tc4 transposable element, Tc4-rh1030, has been sequenced and is 3483 bp long. The Tc4 element that had been analyzed previously is 1605 bp long, consists of two 774-bp nearly perfect inverted terminal repeats connected by a 57-bp loop, and lacks significant open reading frames. In Tc4-rh1030, by comparison, a 2343-bp novel sequence is present in place of a 477-bp segment in one of the inverted repeats. The novel sequence of Tc4-rh1030 is present about five times per haploid genome and is invariably associated with Tc4 elements; we have used the designation Tc4v to denote this variant subfamily of Tc4 elements. Sequence analysis of three cDNA clones suggests that a Tc4v element contains at least five exons that could encode a novel basic protein of 537 amino acid residues. On northern blots, a 1.6-kb Tc4v-specific transcript was detected in the mutator strain TR679 but not in the wild-type strain N2; Tc4 elements are known to transpose in TR679 but appear to be quiescent in N2. We have analyzed transcripts produced by an unc-33 gene that has the Tc4-rh1030 insertional mutation in its transcribed region; all or almost all of the Tc4v sequence is frequently spliced out of the mutant unc-33 transcripts, sometimes by means of non-consensus splice acceptor sites. PMID:8382791
Oluwayelu, D O; Todd, D; Olaleye, O D
2008-12-01
This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequence of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.
Radonić, Aleksandar; Kocak Tufan, Zeliha; Domingo, Cristina
2017-01-01
Background: We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. Methodology: An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. Principal findings: The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1–10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Conclusions: Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring. PMID:29155823
Brinkmann, Annika; Ergünay, Koray; Radonić, Aleksandar; Kocak Tufan, Zeliha; Domingo, Cristina; Nitsche, Andreas
2017-11-01
We describe the development and evaluation of a novel method for targeted amplification and Next Generation Sequencing (NGS)-based identification of viral hemorrhagic fever (VHF) agents and assess the feasibility of this approach in diagnostics. An ultrahigh-multiplex panel was designed with primers to amplify all known variants of VHF-associated viruses and relevant controls. The performance of the panel was evaluated via serially quantified nucleic acids from Yellow fever virus, Rift Valley fever virus, Crimean-Congo hemorrhagic fever (CCHF) virus, Ebola virus, Junin virus and Chikungunya virus in a semiconductor-based sequencing platform. A comparison of direct NGS and targeted amplification-NGS was performed. The panel was further tested via a real-time nanopore sequencing-based platform, using clinical specimens from CCHF patients. The multiplex primer panel comprises two pools of 285 and 256 primer pairs for the identification of 46 virus species causing hemorrhagic fevers, encompassing 6,130 genetic variants of the strains involved. In silico validation revealed that the panel detected over 97% of all known genetic variants of the targeted virus species. High levels of specificity and sensitivity were observed for the tested virus strains. Targeted amplification ensured viral read detection in specimens with the lowest virus concentration (1-10 genome equivalents) and enabled significant increases in specific reads over background for all viruses investigated. In clinical specimens, the panel enabled detection of the causative agent and its characterization within 10 minutes of sequencing, with sample-to-result time of less than 3.5 hours. Virus enrichment via targeted amplification followed by NGS is an applicable strategy for the diagnosis of VHFs which can be adapted for high-throughput or nanopore sequencing platforms and employed for surveillance or outbreak monitoring.
Polymorphic human somatostatin gene is located on chromosome 3.
Naylor, S L; Sakaguchi, A Y; Shen, L P; Bell, G I; Rutter, W J; Shows, T B
1983-01-01
Somatostatin is a 14-amino-acid neuropeptide and hormone that inhibits the secretion of several peptide hormones. The human gene for somatostatin (SST) has been cloned, and the sequence has been determined. This clone was used as a probe in chromosome mapping studies to detect the human somatostatin sequence in human-rodent hybrids. Southern blot analysis of 41 hybrids, including some containing translocations of human chromosomes, placed SST in the q21→qter region of chromosome 3. Human DNAs from unrelated individuals were screened for restriction fragment polymorphisms detectable by the somatostatin gene probe. Two polymorphisms were found: (i) an EcoRI variant located at the 3' end of the gene, found in Caucasian, U.S. Black, and Asian populations with a frequency of approximately 0.10 and (ii) a BamHI variant in the intron, which occurs in Caucasians at a frequency of 0.13. PMID:6133281
McAllister, Jane; Casino, Carmela; Davidson, Fiona; Power, Joan; Lawlor, Emer; Yap, Peng Lee; Simmonds, Peter; Smith, Donald B.
1998-01-01
The long-term evolution of the hepatitis C virus hypervariable region (HVR) and flanking regions of the E1 and E2 envelope proteins has been studied in a cohort of women infected from a common source of anti-D immunoglobulin. Whereas virus sequences in the infectious source were relatively homogeneous, distinct HVR variants were observed in each anti-D recipient, indicating that this region can evolve in multiple directions from the same point. Where HVR variants with dissimilar sequences were present in a single individual, the frequency of synonymous substitution in the flanking regions suggested that the lineages diverged more than a decade previously. Even where a single major HVR variant was present in an infected individual, this lineage was usually several years old. Multiple lineages can therefore coexist during long periods of chronic infection without replacement. The characteristics of amino acid substitution in the HVR were not consistent with the random accumulation of mutations and imply that amino acid replacement in the HVR was strongly constrained. Another variable region of E2 centered on codon 60 shows similar constraints, while HVR2 was relatively unconstrained. Several of these features are difficult to explain if a neutralizing immune response against the HVR is the only selective force operating on E2. The impact of PCR artifacts such as nucleotide misincorporation and the shuffling of dissimilar templates is discussed. PMID:9573256
Lessons for livestock genomics from genome and transcriptome sequencing in cattle and other mammals.
Taylor, Jeremy F; Whitacre, Lynsey K; Hoff, Jesse L; Tizioto, Polyana C; Kim, JaeWoo; Decker, Jared E; Schnabel, Robert D
2016-08-17
Decreasing sequencing costs and development of new protocols for characterizing global methylation, gene expression patterns and regulatory regions have stimulated the generation of large livestock datasets. Here, we discuss experiences in the analysis of whole-genome and transcriptome sequence data. We analyzed whole-genome sequence (WGS) data from 132 individuals from five canid species (Canis familiaris, C. latrans, C. dingo, C. aureus and C. lupus) and 61 breeds, three bison (Bison bison), 64 water buffalo (Bubalus bubalis) and 297 bovines from 17 breeds. By individual, data vary in extent of reference genome depth of coverage from 4.9X to 64.0X. We have also analyzed RNA-seq data for 580 samples representing 159 Bos taurus and Rattus norvegicus animals and 98 tissues. By aligning reads to a reference assembly and calling variants, we assessed effects of average depth of coverage on the actual coverage and on the number of called variants. We examined the identity of unmapped reads by assembling them and querying produced contigs against the non-redundant nucleic acids database. By imputing high-density single nucleotide polymorphism data on 4010 US registered Angus animals to WGS using Run4 of the 1000 Bull Genomes Project and assessing the accuracy of imputation, we identified misassembled reference sequence regions. We estimate that a 24X depth of coverage is required to achieve 99.5 % coverage of the reference assembly and identify 95 % of the variants within an individual's genome. Genomes sequenced to low average coverage (e.g., <10X) may fail to cover 10 % of the reference genome and identify <75 % of variants. About 10 % of genomic DNA or transcriptome sequence reads fail to align to the reference assembly. These reads include loci missing from the reference assembly and misassembled genes and interesting symbionts, commensal and pathogenic organisms. 
Assembly errors and a lack of annotation of functional elements significantly limit the utility of the current draft livestock reference assemblies. The Functional Annotation of Animal Genomes initiative seeks to annotate functional elements, while a 70X Pac-Bio assembly for cow is underway and may result in a significantly improved reference assembly.
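The distinction drawn above between average depth of coverage and the fraction of the reference actually covered can be sketched as follows. This is a minimal illustration on made-up per-base depths, not the authors' pipeline; the function name, the `min_depth` parameter and the toy data are all assumptions.

```python
from typing import Sequence, Tuple


def coverage_summary(depths: Sequence[int], min_depth: int = 1) -> Tuple[float, float]:
    """Return (mean depth, breadth of coverage) for per-base read depths.

    Breadth is the fraction of reference positions with depth >= min_depth.
    """
    if not depths:
        raise ValueError("depths must be non-empty")
    mean_depth = sum(depths) / len(depths)
    covered = sum(1 for d in depths if d >= min_depth)
    return mean_depth, covered / len(depths)


# Hypothetical depths at ten reference positions (not data from the study).
toy_depths = [0, 3, 5, 0, 12, 7, 9, 4, 0, 10]
mean_depth, breadth = coverage_summary(toy_depths)
print(f"mean depth = {mean_depth:.1f}X, breadth = {breadth:.0%}")
```

The toy example has a mean depth of 5X yet leaves 30 % of positions uncovered, which mirrors the point that a nominal average depth can overstate how much of the reference is usable for variant calling.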
Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min
2015-06-01
The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies. © 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.
Lubin, Ira M; Aziz, Nazneen; Babb, Lawrence J; Ballinger, Dennis; Bisht, Himani; Church, Deanna M; Cordes, Shaun; Eilbeck, Karen; Hyland, Fiona; Kalman, Lisa; Landrum, Melissa; Lockhart, Edward R; Maglott, Donna; Marth, Gabor; Pfeifer, John D; Rehm, Heidi L; Roy, Somak; Tezak, Zivana; Truty, Rebecca; Ullman-Cullere, Mollie; Voelkerding, Karl V; Worthey, Elizabeth A; Zaranek, Alexander W; Zook, Justin M
2017-05-01
A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Tkachenko, Evgeniy A; Witkowski, Peter T; Radosa, Lukas; Dzagurova, Tamara K; Okulova, Nataliya M; Yunicheva, Yulia V; Vasilenko, Ludmila; Morozov, Vyacheslav G; Malkin, Gennadiy A; Krüger, Detlev H; Klempa, Boris
2015-01-01
Although at least 30 novel hantaviruses have been recently discovered in novel hosts such as shrews, moles and even bats, hantaviruses (family Bunyaviridae, genus Hantavirus) are primarily known as rodent-borne human pathogens. Here we report on identification of a novel hantavirus variant associated with a rodent host, Major's pine vole (Microtus majori). Altogether 36 hantavirus PCR-positive Major's pine voles were identified in the Krasnodar region of southern European Russia within the years 2008-2011. Initial partial L-segment sequence analysis revealed novel hantavirus sequences. Moreover, we found a single common vole (Microtus arvalis) infected with Tula virus (TULV). Complete S- and M-segment coding sequences were determined from 11 Major's pine voles originating from 8 trapping sites and subjected to phylogenetic analyses. The data obtained show that Major's pine vole is a newly recognized hantavirus reservoir host. The newfound virus, provisionally called Adler hantavirus (ADLV), is closely related to TULV. Based on amino acid differences to TULV (5.6-8.2% for nucleocapsid protein, 9.4-9.5% for glycoprotein precursor) we propose to consider ADLV as a genotype of TULV. Occurrence of ADLV and TULV in the same region suggests that ADLV is not only a geographical variant of TULV but a host-specific genotype. High intra-cluster nucleotide sequence variability (up to 18%) and geographic clustering indicate long-term presence of the virus in this region. Copyright © 2014. Published by Elsevier B.V.
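Percent amino acid differences of the kind quoted above are commonly computed as an uncorrected pairwise distance over an alignment. The sketch below shows that calculation under stated assumptions: the sequences are already aligned to equal length, gapped columns are skipped, and the peptide strings are invented for illustration. The abstract does not say which distance measure the authors used.

```python
def percent_difference(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Uncorrected pairwise difference (%) between two aligned sequences.

    Columns where either sequence has a gap are excluded from the comparison.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * differing / compared


# Hypothetical aligned peptide fragments (not real ADLV or TULV sequence).
print(percent_difference("MSQLKEIQEE", "MSQLREIQDE"))
```

Skipping gapped columns is one of several conventions; counting a gap as a difference would give larger values, so the convention should be stated whenever such percentages are compared across studies.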
Sequence data and association statistics from 12,940 type 2 diabetes cases and controls.
Flannick, Jason; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M; Agarwala, Vineeta; Gaulton, Kyle J; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Dennis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; 
Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana Cn; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; 
Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Altshuler, David; Burtt, Noël P; Florez, Jose C; Boehnke, Michael; McCarthy, Mark I
2017-12-19
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.
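The abstract defines low-frequency variants by a minor allele frequency (MAF) of 0.1-5%. A minimal sketch of that binning follows; the "rare" and "common" labels, the inclusive boundaries, and the allele counts are assumptions added for illustration, and only the 0.1-5% band and the 2,657 whole-genome sample size come from the text.

```python
def minor_allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """MAF from an alternate-allele count at a biallelic site."""
    if total_alleles <= 0 or not 0 <= alt_allele_count <= total_alleles:
        raise ValueError("counts out of range")
    freq = alt_allele_count / total_alleles
    return min(freq, 1.0 - freq)


def frequency_class(maf: float, low: float = 0.001, high: float = 0.05) -> str:
    """Bin a MAF; the 0.1%-5% band matches the abstract, other labels are assumed."""
    if maf < low:
        return "rare"
    if maf <= high:
        return "low-frequency"
    return "common"


# 2,657 diploid individuals carry 5,314 alleles at an autosomal site.
total = 2 * 2657
for alt_count in (2, 53, 1000):  # hypothetical alternate-allele counts
    maf = minor_allele_frequency(alt_count, total)
    print(alt_count, f"{maf:.5f}", frequency_class(maf))
```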
Sequence data and association statistics from 12,940 type 2 diabetes cases and controls
Jason, Flannick; Fuchsberger, Christian; Mahajan, Anubha; Teslovich, Tanya M.; Agarwala, Vineeta; Gaulton, Kyle J.; Caulkins, Lizz; Koesterer, Ryan; Ma, Clement; Moutsianas, Loukas; McCarthy, Davis J.; Rivas, Manuel A.; Perry, John R. B.; Sim, Xueling; Blackwell, Thomas W.; Robertson, Neil R.; Rayner, N William; Cingolani, Pablo; Locke, Adam E.; Tajes, Juan Fernandez; Highland, Heather M.; Dupuis, Josee; Chines, Peter S.; Lindgren, Cecilia M.; Hartl, Christopher; Jackson, Anne U.; Chen, Han; Huyghe, Jeroen R.; van de Bunt, Martijn; Pearson, Richard D.; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M.; Gamazon, Eric R.; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A.; Below, Jennifer E.; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L.; Pasko, Dorota; Parker, Stephen C. J.; Varga, Tibor V.; Green, Todd; Beer, Nicola L.; Day-Williams, Aaron G.; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J.; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P.; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F.; Han, Bok-Ghee; Jenkinson, Christopher P.; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C. Y.; Palmer, Nicholette D.; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E.; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D.; Neale, Benjamin M.; Purcell, Shaun; Butterworth, Adam S.; Howson, Joanna M. M.; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K. L.; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H. 
T.; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E.; Rybin, Dennis; Farook, Vidya S.; Fowler, Sharon P.; Freedman, Barry I.; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J.; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; Loh, Marie; Musani, Solomon K.; Puppala, Sobha; Scott, William R.; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A.; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C.; Mangino, Massimo; Bonnycastle, Lori L.; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L.; Herder, Christian; Groves, Christopher J.; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A.; Doney, Alex S. F.; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J.; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeri; Hollensted, Mette; Jørgensen, Marit E.; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H.; Stirrups, Kathleen; Wood, Andrew R.; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O.; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P.; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B.; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N. 
A.; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M.; Syvänen, Ann-Christine; Bergman, Richard N.; Bharadwaj, Dwaipayan; Bottinger, Erwin P.; Cho, Yoon Shin; Chandak, Giriraj R.; Chan, Juliana CN; Chia, Kee Seng; Daly, Mark J.; Ebrahim, Shah B.; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A.; Lehman, Donna M.; Jia, Weiping; Ma, Ronald C. W.; Pollin, Toni I.; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J. F.; Small, Kerrin S.; Ried, Janina S.; DeFronzo, Ralph A.; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J.; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W.; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R.; Gloyn, Anna L.; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D.; Hattersley, Andrew T.; Bowden, Donald W.; Collins, Francis S.; Atzmon, Gil; Chambers, John C.; Spector, Timothy D.; Laakso, Markku; Strom, Tim M.; Bell, Graeme I.; Blangero, John; Duggirala, Ravindranath; Tai, E. Shyong; McVean, Gilean; Hanis, Craig L.; Wilson, James G.; Seielstad, Mark; Frayling, Timothy M.; Meigs, James B.; Cox, Nancy J.; Sladek, Rob; Lander, Eric S.; Gabriel, Stacey; Mohlke, Karen L.; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Scott, Laura J.; Morris, Andrew P.; Kang, Hyun Min; Altshuler, David; Burtt, Noël P.; Florez, Jose C.; Boehnke, Michael; McCarthy, Mark I.
2017-01-01
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1–5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82 K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44 K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D. PMID:29257133
Kolpakova, E; Frengen, E; Stokke, T; Olsnes, S
2000-01-01
Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that might be involved in the intracellular function of aFGF. Here we present a comparative analysis of the deduced amino acid sequences of human, murine and Drosophila FIBP analogues and demonstrate that FIBP is an evolutionarily conserved protein. The human gene spans more than 5 kb, comprising ten exons and nine introns, and maps to chromosome 11q13.1. Two slightly different splice variants found in different tissues were isolated and characterized. Sequence analysis of the region surrounding the translation start revealed a CpG island, a classical feature of widely expressed genes. Functional studies of the promoter region with a luciferase reporter system suggested a strong transcriptional activity residing within 600 bp of the 5' flanking region. PMID:11104667
Fibrinogen Lincoln: a new truncated alpha chain variant with delayed clotting.
Ridgway, H J; Brennan, S O; Gibbons, S; George, P M
1996-04-01
A patient referred for preoperative investigation of prolonged bleeding and easy bruising was found to have increased thrombin and reptilase times; however, the thrombin-catalysed release of fibrinopeptides A and B was normal. Analysis of five other family members, spanning three generations, indicated that three had a similar defect and suggested autosomal dominant inheritance. Non-reducing SDS-PAGE of purified fibrinogen from affected individuals showed that the 340 kD form of their fibrinogen ran as a doublet. SSCP (single-stranded conformational polymorphism) analysis of exon 5 of the A alpha gene, which encodes the C-terminal half of the chain, confirmed the presence of a mutation. Cycle sequencing of PCR-amplified DNA revealed a 13 base pair deletion (nt 4758-4770), resulting in a frame-shift at Ala 475, which translates as four new amino acids before terminating at a new stop codon (-476His-Cys-Leu-Ala-Stop). The presence of a circulating truncated A alpha chain was confirmed when SDS-PAGE gels were probed with an alpha chain-specific antiserum, which showed that the variant A alpha chain comigrated with gamma chains. The truncation results in a variant A alpha chain with a deletion of 131 amino acids (480-610), and four new amino acids at the C-terminus.
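The coordinate arithmetic in this abstract can be checked with a short sketch: an inclusive nucleotide range gives the deletion length, and a deletion whose length is not a multiple of three shifts the reading frame. The helper names are hypothetical; the coordinates are those quoted in the abstract.

```python
def inclusive_length(first: int, last: int) -> int:
    """Number of positions in an inclusive coordinate range."""
    if last < first:
        raise ValueError("last must not precede first")
    return last - first + 1


def causes_frameshift(deleted_nt: int) -> bool:
    """A deletion shifts the reading frame unless it removes whole codons."""
    return deleted_nt % 3 != 0


deleted = inclusive_length(4758, 4770)   # nucleotides removed
lost_residues = inclusive_length(480, 610)  # residues missing from the chain
print(deleted, causes_frameshift(deleted), lost_residues)
```

Both figures agree with the abstract: the range nt 4758-4770 spans 13 bases, which is not divisible by three, and residues 480-610 account for the 131 missing amino acids.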
Stockbauer, K E; Magoun, L; Liu, M; Burns, E H; Gubba, S; Renish, S; Pan, X; Bodary, S C; Baker, E; Coburn, J; Leong, J M; Musser, J M
1999-01-05
The human pathogenic bacterium group A Streptococcus produces an extracellular cysteine protease [streptococcal pyrogenic exotoxin B (SpeB)] that is a critical virulence factor for invasive disease episodes. Sequence analysis of the speB gene from 200 group A Streptococcus isolates collected worldwide identified three main mature SpeB (mSpeB) variants. One of these variants (mSpeB2) contains an Arg-Gly-Asp (RGD) sequence, a tripeptide motif that is commonly recognized by integrin receptors. mSpeB2 is made by all isolates of the unusually virulent serotype M1 and several other geographically widespread clones that frequently cause invasive infections. Only the mSpeB2 variant bound to transfected cells expressing integrin alphavbeta3 (also known as the vitronectin receptor) or alphaIIbbeta3 (platelet glycoprotein IIb-IIIa), and binding was blocked by a mAb that recognizes the streptococcal protease RGD motif region. In addition, mSpeB2 bound purified platelet integrin alphaIIbbeta3. Defined beta3 mutants that are altered for fibrinogen binding were defective for SpeB binding. Synthetic peptides with the mSpeB2 RGD motif, but not the RSD sequence present in other mSpeB variants, blocked binding of mSpeB2 to transfected cells expressing alphavbeta3 and caused detachment of cultured human umbilical vein endothelial cells. The results (i) identify a Gram-positive virulence factor that directly binds integrins, (ii) identify naturally occurring variants of a documented Gram-positive virulence factor with biomedically relevant differences in their interactions with host cells, and (iii) add to the theme that subtle natural variation in microbial virulence factor structure alters the character of host-pathogen interactions.
Heaton, Michael P.; Smith, Timothy P.L.; Carnahan, Jacky K.; Basnayake, Veronica; Qiu, Jiansheng; Simpson, Barry; Kalbfleisch, Theodore S.
2016-01-01
The availability of whole genome sequence (WGS) data has made it possible to discover protein variants in silico. However, existing bovine WGS databases do not show data in a form conducive to protein variant analysis, and tend to underrepresent the breadth of genetic diversity in global beef cattle. Thus, our first aim was to use 96 beef sires, sharing minimal pedigree relationships, to create a searchable and publicly viewable set of mapped genomes relevant for 19 popular breeds of U.S. cattle. Our second aim was to identify protein variants encoded by the bovine endothelial PAS domain-containing protein 1 gene (EPAS1), a gene associated with pulmonary hypertension in Angus cattle. The identity and quality of genomic sequences were verified by comparing WGS genotypes to those derived from other methods. The average read depth, genotype scoring rate, and genotype accuracy exceeded 14, 99%, and 99%, respectively. The 96 genomes were used to discover four amino acid variants encoded by EPAS1 (E270Q, P362L, A671G, and L701F) and confirm two variants previously associated with disease (A606T and G610S). The six EPAS1 missense mutations were verified with matrix-assisted laser desorption/ionization time-of-flight mass spectrometry assays, and their frequencies were estimated in a separate collection of 1154 U.S. cattle representing 46 breeds. A rooted phylogenetic tree of eight polypeptide sequences provided a framework for evaluating the likely order of mutations and potential impact of EPAS1 alleles on the adaptive response to chronic hypoxia in U.S. cattle. This public, whole genome resource facilitates in silico identification of protein variants in diverse types of U.S. beef cattle, and provides a means of translating WGS data into a practical biological and evolutionary context for generating and testing hypotheses. PMID:27746904
Functional Consequences of a Novel Variant of PCSK1
Pickett, Lindsay A.; Yourshaw, Michael; Albornoz, Valeria; Chen, Zijun; Solorzano-Vargas, R. Sergio; Nelson, Stanley F.; Martín, Martín G.; Lindberg, Iris
2013-01-01
Background: Common single nucleotide polymorphisms (SNPs) in proprotein convertase subtilisin/kexin type 1 with modest effects on PC1/3 in vitro have been associated with obesity in five genome-wide association studies and with diabetes in one genome-wide association study. We here present a novel SNP and compare its biosynthesis, secretion and catalytic activity to wild-type enzyme and to SNPs that have been linked to obesity. Methodology/Principal Findings: A novel PC1/3 variant introducing an Arg to Gln amino acid substitution at residue 80 (within the secondary cleavage site of the prodomain) (rs1799904) was studied. This novel variant was selected for analysis from the 1000 Genomes sequencing project based on its predicted deleterious effect on enzyme function and its comparatively high allele frequency. The actual existence of the R80Q (rs1799904) variant was verified by Sanger sequencing. The effects of this novel variant on the biosynthesis, secretion, and catalytic activity were determined; the previously-described obesity risk SNPs N221D (rs6232), Q665E/S690T (rs6234/rs6235), and the Q665E and S690T SNPs (analyzed separately) were included for comparative purposes. The novel R80Q (rs1799904) variant described in this study resulted in significantly detrimental effects on both the maturation and in vitro catalytic activity of PC1/3. Conclusion/Significance: Our findings that this novel R80Q (rs1799904) variant both exhibits adverse effects on PC1/3 activity and is prevalent in the population suggest that further biochemical and genetic analysis to assess its contribution to the risk of metabolic disease within the general population is warranted. PMID:23383060
Functional consequences of a novel variant of PCSK1.
Pickett, Lindsay A; Yourshaw, Michael; Albornoz, Valeria; Chen, Zijun; Solorzano-Vargas, R Sergio; Nelson, Stanley F; Martín, Martín G; Lindberg, Iris
2013-01-01
Common single nucleotide polymorphisms (SNPs) in proprotein convertase subtilisin/kexin type 1 with modest effects on PC1/3 in vitro have been associated with obesity in five genome-wide association studies and with diabetes in one genome-wide association study. We here present a novel SNP and compare its biosynthesis, secretion and catalytic activity to wild-type enzyme and to SNPs that have been linked to obesity. A novel PC1/3 variant introducing an Arg to Gln amino acid substitution at residue 80 (within the secondary cleavage site of the prodomain) (rs1799904) was studied. This novel variant was selected for analysis from the 1000 Genomes sequencing project based on its predicted deleterious effect on enzyme function and its comparatively high allele frequency. The actual existence of the R80Q (rs1799904) variant was verified by Sanger sequencing. The effects of this novel variant on the biosynthesis, secretion, and catalytic activity were determined; the previously-described obesity risk SNPs N221D (rs6232), Q665E/S690T (rs6234/rs6235), and the Q665E and S690T SNPs (analyzed separately) were included for comparative purposes. The novel R80Q (rs1799904) variant described in this study resulted in significantly detrimental effects on both the maturation and in vitro catalytic activity of PC1/3. Our findings that this novel R80Q (rs1799904) variant both exhibits adverse effects on PC1/3 activity and is prevalent in the population suggest that further biochemical and genetic analysis to assess its contribution to the risk of metabolic disease within the general population is warranted.
Jo, Jihoon; Park, Jongsun; Lee, Hyun-Gwan; Kern, Elizabeth M A; Cheon, Seongmin; Jin, Soyeong; Park, Joong-Ki; Cho, Sung-Jin; Park, Chungoo
2016-08-01
The sea cucumber Apostichopus japonicus Selenka 1867 represents an important resource in biomedical research, traditional medicine, and the seafood industry. Much of the commercial value of A. japonicus is determined by dorsal/ventral color variation (red, green, and black), yet the taxonomic relationships between these color variants are not clearly understood. We performed the first comparative analysis of de novo assembled transcriptome data from three color variants of A. japonicus. Using the Illumina platform, we sequenced 177,596,774 clean reads representing a total of 18.2 Gbp of sea cucumber transcriptome. A comparison of over 0.3 million transcript scaffolds against the Uniprot/Swiss-Prot database yielded 8513, 8602, and 8588 positive matches for green, red, and black body color transcriptomes, respectively. Using the Panther gene classification system, we assessed an extensive and diverse set of expressed genes in the three color variants and found that (1) among the three color variants of A. japonicus, genes associated with RNA binding protein, oxidoreductase, nucleic acid binding, transferase, and KRAB box transcription factor were most commonly expressed; and (2) the main protein functional classes are differently regulated in all three color variants (extracellular matrix protein and phosphatase for green color, transporter and potassium channel for red color, and G-protein modulator and enzyme modulator for black color). This work will assist in the discovery and annotation of novel genes that play significant morphological and physiological roles in color variants of A. japonicus, and these sequence data will provide a useful set of resources for the rapidly growing sea cucumber aquaculture industry.
A statistical method for the detection of variants from next-generation resequencing of DNA pools.
Bansal, Vikas
2010-06-15
Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
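The second ingredient named in the abstract above, the probability of seeing several non-reference base calls from sequencing error alone, can be illustrated with a simple binomial tail. This is only a minimal sketch under stated assumptions: the abstract does not give CRISP's actual error model, so a single fixed per-base error rate is assumed here, and the function name and parameters are hypothetical rather than part of the CRISP software.

```python
from math import comb


def prob_at_least_k_errors(n_reads: int, k_alt: int, error_rate: float) -> float:
    """Return P(X >= k_alt) for X ~ Binomial(n_reads, error_rate).

    Assumption (not from the abstract): every read has the same,
    independent probability `error_rate` of showing a non-reference
    base at the site purely through sequencing error.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if k_alt <= 0:
        return 1.0
    # Sum the probabilities of observing fewer than k_alt erroneous calls,
    # then take the complement to obtain the upper tail.
    below = sum(
        comb(n_reads, i) * error_rate**i * (1.0 - error_rate) ** (n_reads - i)
        for i in range(k_alt)
    )
    return max(0.0, 1.0 - below)


# Hypothetical pool: 100 reads, 1% assumed error rate.
p_one_or_more = prob_at_least_k_errors(100, 1, 0.01)
p_five_or_more = prob_at_least_k_errors(100, 5, 0.01)
```

A small tail probability means the observed non-reference calls are unlikely to be errors alone, which is the kind of evidence that would favor calling a variant in that pool.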
Cashman, N R
1997-01-01
By biological and medical criteria, prions are infectious agents; however, many of their properties differ profoundly from those of conventional microbes. Prions are "encoded" by alterations in protein conformation rather than in nucleic acid or amino acid sequence. New epidemic prion diseases (bovine spongiform encephalopathy and new variant Creutzfeldt-Jakob disease) have recently emerged under the active surveillance of the modern world. The risk of contracting prion disease from blood products or other biologicals is now a focus of worldwide concern. Much has been discovered about prions and prion diseases, but much remains to be done. PMID:9371069
Nyaku, Seloame T; Sripathi, Venkateswara R; Kantety, Ramesh V; Gu, Yong Q; Lawrence, Kathy; Sharma, Govind C
2013-01-01
The 18S rRNA gene is fundamental to cellular and organismal protein synthesis, and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More frequently, it has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis, labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only the RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes on the simultaneous co-existence of these two variants for this gene.
Nyaku, Seloame T.; Sripathi, Venkateswara R.; Kantety, Ramesh V.; Gu, Yong Q.; Lawrence, Kathy; Sharma, Govind C.
2013-01-01
The 18S rRNA gene is fundamental to cellular and organismal protein synthesis, and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More frequently, it has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis, labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only the RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes on the simultaneous co-existence of these two variants for this gene. PMID:23593343
Whole exome sequencing for familial bicuspid aortic valve identifies putative variants.
Martin, Lisa J; Pilipenko, Valentina; Kaufman, Kenneth M; Cripe, Linda; Kottyan, Leah C; Keddache, Mehdi; Dexheimer, Phillip; Weirauch, Matthew T; Benson, D Woodrow
2014-10-01
Bicuspid aortic valve (BAV) is the most common congenital cardiovascular malformation. Although highly heritable, few causal variants have been identified. The purpose of this study was to identify genetic variants underlying BAV by whole exome sequencing a multiplex BAV kindred. Whole exome sequencing was performed on 17 individuals from a single family (BAV, 3; other cardiovascular malformation, 3). Postvariant calling error control metrics were established after examining the relationship between Mendelian inheritance error rate and coverage, quality score, and call rate. To determine the most effective approach to identifying susceptibility variants from among 54 674 variants passing error control metrics, we evaluated 3 variant selection strategies frequently used in whole exome sequencing studies plus extended family linkage. No putative rare, high-effect variants were identified that were present in all affected individuals and absent from all unaffected individuals. Eight high-effect variants were identified by ≥2 of the commonly used selection strategies; however, these were either common in the general population (>10%) or present in the majority of the unaffected family members. In contrast, using extended family linkage, 3 synonymous variants were identified; all 3 variants were identified by at least one other strategy. These results suggest that traditional whole exome sequencing approaches, which assume causal variants alter coding sense, may be insufficient for BAV and other complex traits. Identification of disease-associated variants is facilitated by the use of segregation within families.
The organization and expression of the mdm2 gene.
de Oca Luna, R M; Tabor, A D; Eberspaecher, H; Hulboy, D L; Worth, L L; Colman, M S; Finlay, C A; Lozano, G
1996-05-01
The mdm2 gene encodes a zinc finger protein that negatively regulates p53 function by binding and masking the p53 transcriptional activation domain. Two different promoters control expression of mdm2, one of which is also transactivated by p53. We cloned and characterized the mdm2 gene from a murine 129 library. It contained at least 12 exons and spanned approximately 25 kb of DNA. Sequencing of the mdm2 gene revealed three nucleotide differences that resulted in amino acid substitutions in the previously published mdm2 sequence. Sequencing of normal BalbC/J DNA and the original cosmid clone isolated from the 3T3DM cell line revealed that they are identical, suggesting that the published sequence is in error at these three positions. In addition, we analyzed the expression pattern of mdm2 and found ubiquitous low-level expression throughout embryo development and in adult tissues. Analysis of mRNA from numerous tissues for several mdm2 spliced variants that had been identified in the transformed 3T3DM cell line revealed that these variants could not be detected in the developing embryo or in adult tissues.
Identification of candidate genes for familial early-onset essential tremor.
Liu, Xinmin; Hernandez, Nora; Kisselev, Sergey; Floratos, Aris; Sawle, Ashley; Ionita-Laza, Iuliana; Ottman, Ruth; Louis, Elan D; Clark, Lorraine N
2016-07-01
Essential tremor (ET) is one of the most common causes of tremor in humans. Despite its high heritability and prevalence, few susceptibility genes for ET have been identified. To identify ET genes, whole-exome sequencing was performed in 37 early-onset ET families with an autosomal-dominant inheritance pattern. We identified candidate genes for follow-up functional studies in five ET families. In two independent families, we identified variants predicted to affect function in the nitric oxide (NO) synthase 3 gene (NOS3) that cosegregated with disease. NOS3 is highly expressed in the central nervous system (including cerebellum), neurons and endothelial cells, and is one of three enzymes that convert l-arginine to the neurotransmitter NO. In one family, a heterozygous variant, c.46G>A (p.(Gly16Ser)), in NOS3, was identified in three affected ET cases and was absent in an unaffected family member; and in a second family, a heterozygous variant, c.164C>T (p.(Pro55Leu)), was identified in three affected ET cases (dizygotic twins and their mother). Both variants result in amino-acid substitutions of highly conserved amino-acid residues that are predicted to be deleterious and damaging by in silico analysis. In three independent families, variants predicted to affect function were also identified in other genes, including KCNS2 (KV9.2), HAPLN4 (BRAL2) and USP46. These genes are highly expressed in the cerebellum and Purkinje cells, and influence function of the gamma-amino butyric acid (GABA)-ergic system. This is in concordance with recent evidence that the pathophysiological process in ET involves cerebellar dysfunction and possibly cerebellar degeneration with a reduction in Purkinje cells, and a decrease in GABA-ergic tone.
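The pairing of coding-DNA and protein coordinates in the variant names above (c.46G>A with p.(Gly16Ser), c.164C>T with p.(Pro55Leu)) follows from simple arithmetic: codons are three nucleotides long and coding positions count from the A of the start codon. The sketch below illustrates that relationship only; the helper names are hypothetical, and it assumes a plain coding-sequence position with no intronic offsets or UTR coordinates.

```python
def codon_number(cdna_pos: int) -> int:
    """Map a 1-based coding-sequence position to its 1-based codon number."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cdna_pos + 2) // 3


def position_in_codon(cdna_pos: int) -> int:
    """Return 1, 2 or 3: which base of the codon the position falls on."""
    if cdna_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (cdna_pos - 1) % 3 + 1


# The two NOS3 variants quoted in the abstract.
gly16ser = (codon_number(46), position_in_codon(46))
pro55leu = (codon_number(164), position_in_codon(164))
```

Under these assumptions, position 46 falls on the first base of codon 16 and position 164 on the second base of codon 55, which agrees with the residue numbers reported for the two substitutions.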
Guidelines for investigating causality of sequence variants in human disease
MacArthur, D. G.; Manolio, T. A.; Dimmock, D. P.; Rehm, H. L.; Shendure, J.; Abecasis, G. R.; Adams, D. R.; Altman, R. B.; Antonarakis, S. E.; Ashley, E. A.; Barrett, J. C.; Biesecker, L. G.; Conrad, D. F.; Cooper, G. M.; Cox, N. J.; Daly, M. J.; Gerstein, M. B.; Goldstein, D. B.; Hirschhorn, J. N.; Leal, S. M.; Pennacchio, L. A.; Stamatoyannopoulos, J. A.; Sunyaev, S. R.; Valle, D.; Voight, B. F.; Winckler, W.; Gunter, C.
2014-01-01
The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development. PMID:24759409
Guidelines for investigating causality of sequence variants in human disease.
MacArthur, D G; Manolio, T A; Dimmock, D P; Rehm, H L; Shendure, J; Abecasis, G R; Adams, D R; Altman, R B; Antonarakis, S E; Ashley, E A; Barrett, J C; Biesecker, L G; Conrad, D F; Cooper, G M; Cox, N J; Daly, M J; Gerstein, M B; Goldstein, D B; Hirschhorn, J N; Leal, S M; Pennacchio, L A; Stamatoyannopoulos, J A; Sunyaev, S R; Valle, D; Voight, B F; Winckler, W; Gunter, C
2014-04-24
The discovery of rare genetic variants is accelerating, and clear guidelines for distinguishing disease-causing sequence variants from the many potentially functional variants present in any human genome are urgently needed. Without rigorous standards we risk an acceleration of false-positive reports of causality, which would impede the translation of genomic research findings into the clinical diagnostic setting and hinder biological understanding of disease. Here we discuss the key challenges of assessing sequence variants in human disease, integrating both gene-level and variant-level support for causality. We propose guidelines for summarizing confidence in variant pathogenicity and highlight several areas that require further resource development.
Reliable Detection of Herpes Simplex Virus Sequence Variation by High-Throughput Resequencing.
Morse, Alison M; Calabro, Kaitlyn R; Fear, Justin M; Bloom, David C; McIntyre, Lauren M
2017-08-16
High-throughput sequencing (HTS) has resulted in data for a number of herpes simplex virus (HSV) laboratory strains and clinical isolates. The knowledge of these sequences has been critical for investigating viral pathogenicity. However, the assembly of complete herpesviral genomes, including HSV, is complicated by the large repeat regions and arrays of smaller reiterated sequences that are commonly found in these genomes. In addition, the inherent genetic variation in populations of isolates for viruses and other microorganisms presents an additional challenge to many existing HTS sequence assembly pipelines. Here, we evaluate two approaches for the identification of genetic variants in HSV1 strains using Illumina short read sequencing data. The first, a reference-based approach, identifies variants from reads aligned to a reference sequence, and the second, a de novo assembly approach, identifies variants from reads aligned to de novo assembled consensus sequences. Of critical importance for both approaches is the reduction in the number of low complexity regions through the construction of a non-redundant reference genome. We compared variants identified by the two methods. Our results indicate that approximately 85% of variants are identified regardless of the approach. The reference-based approach to variant discovery captures an additional 15%, representing variants divergent from the HSV1 reference, possibly due to viral passage. Reference-based approaches are significantly less labor-intensive and identify variants across the genome, whereas de novo assembly-based approaches are limited to regions where contigs have been successfully assembled. In addition, regions of poor quality assembly can lead to false variant identification in de novo consensus sequences. For viruses with a well-assembled reference genome, a reference-based approach is recommended.
General approach to reversing ketol-acid reductoisomerase cofactor dependence from NADPH to NADH
Brinkmann-Chen, Sabine; Flock, Tilman; Cahn, Jackson K. B.; ...
2013-06-17
To date, efforts to switch the cofactor specificity of oxidoreductases from nicotinamide adenine dinucleotide phosphate (NADPH) to nicotinamide adenine dinucleotide (NADH) have been made on a case-by-case basis with varying degrees of success. Here we present a straightforward recipe for altering the cofactor specificity of a class of NADPH-dependent oxidoreductases, the ketol-acid reductoisomerases (KARIs). Combining previous results for an engineered NADH-dependent variant of Escherichia coli KARI with available KARI crystal structures and a comprehensive KARI-sequence alignment, we identified key cofactor specificity determinants and used this information to construct five KARIs with reversed cofactor preference. Additional directed evolution generated two enzymes having NADH-dependent catalytic efficiencies that are greater than the wild-type enzymes with NADPH. As a result, high-resolution structures of a wild-type/variant pair reveal the molecular basis of the cofactor switch.
Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A
2005-01-01
Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. 
Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041
Nedeljkovic, Ivana; Terzikhan, Natalie; Vonk, Judith M; van der Plaat, Diana A; Lahousse, Lies; van Diemen, Cleo C; Hobbs, Brian D; Qiao, Dandi; Cho, Michael H; Brusselle, Guy G; Postma, Dirkje S; Boezen, H M; van Duijn, Cornelia M; Amin, Najaf
2018-01-01
Chronic obstructive pulmonary disease (COPD) is a complex and heritable disease, associated with multiple genetic variants. Specific familial types of COPD may be explained by rare variants, which have not been widely studied. We aimed to discover rare genetic variants underlying COPD through a genome-wide linkage scan. Affected-only analysis was performed using the 6K Illumina Linkage IV Panel in 142 cases clustered in 27 families from a genetic isolate, the Erasmus Rucphen Family (ERF) study. Potential causal variants were identified by searching for shared rare variants in the exome-sequence data of the affected members of the families contributing most to the linkage peak. The identified rare variants were then tested for association with COPD in a large meta-analysis of several cohorts. Significant evidence for linkage was observed on chromosomes 15q14-15q25 [logarithm of the odds (LOD) score = 5.52], 11p15.4-11q14.1 (LOD = 3.71) and 5q14.3-5q33.2 (LOD = 3.49). In the chromosome 15 peak, which harbors the known COPD locus for nicotinic receptors, and in the chromosome 5 peak we could not identify shared variants. In the chromosome 11 locus, we identified four rare (minor allele frequency (MAF) <0.02), predicted pathogenic, missense variants. These were shared among the affected family members. The identified variants localize to genes including neuroblast differentiation-associated protein (AHNAK), previously associated with blood biomarkers in COPD, phospholipase C Beta 3 (PLCB3), shown to increase airway hyper-responsiveness, solute carrier family 22-A11 (SLC22A11), involved in amino acid metabolism and ion transport, and metallothionein-like protein 5 (MTL5), involved in nicotinate and nicotinamide metabolism. Association of SLC22A11 and MTL5 variants was confirmed in the meta-analysis of 9,888 cases and 27,060 controls. In conclusion, we have identified novel rare variants in plausible genes related to COPD. 
Further studies utilizing large sample whole-genome sequencing should further confirm the associations at chromosome 11 and investigate the chromosome 15 and 5 linked regions.
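The LOD scores quoted above are base-10 logarithms of a likelihood ratio (linkage versus no linkage), so each score converts directly to odds in favor of linkage. The snippet below is a small sketch of that conversion only; the function names are hypothetical, and it does not reproduce the study's multipoint linkage analysis, which the abstract does not describe in detail.

```python
import math


def lod_to_odds(lod: float) -> float:
    """Convert a LOD score to the likelihood ratio (odds) it represents."""
    return 10.0 ** lod


def odds_to_lod(odds: float) -> float:
    """Convert a likelihood ratio back to a LOD score."""
    if odds <= 0:
        raise ValueError("odds must be positive")
    return math.log10(odds)


# Linkage peaks reported in the abstract, keyed by chromosomal region.
reported_lods = {
    "15q14-15q25": 5.52,
    "11p15.4-11q14.1": 3.71,
    "5q14.3-5q33.2": 3.49,
}
odds_by_region = {region: lod_to_odds(lod) for region, lod in reported_lods.items()}
```

By this conversion, a LOD of 3 corresponds to odds of 1000 to 1, a threshold conventionally used for declaring linkage, and all three reported peaks exceed it.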
Variant calling in low-coverage whole genome sequencing of a Native American population sample.
Bizon, Chris; Spiegel, Michael; Chasse, Scott A; Gizer, Ian R; Li, Yun; Malc, Ewa P; Mieczkowski, Piotr A; Sailsbery, Josh K; Wang, Xiaoshu; Ehlers, Cindy L; Wilhelmsen, Kirk C
2014-01-30
The reduction in the cost of sequencing a human genome has led to the use of genotype sampling strategies in order to impute and infer the presence of sequence variants that can then be tested for associations with traits of interest. Low-coverage Whole Genome Sequencing (WGS) is a sampling strategy that overcomes some of the deficiencies seen in fixed content SNP array studies. Linkage-disequilibrium (LD) aware variant callers, such as the program Thunder, may provide a calling rate and accuracy that make a low-coverage sequencing strategy viable. We examined the performance of an LD-aware variant calling strategy in a population of 708 low-coverage whole genome sequences from a community sample of Native Americans. We assessed variant calling through a comparison of the sequencing results to genotypes measured in 641 of the same subjects using a fixed content first generation exome array. The comparison was made using two variant calling routines: the GATK Unified Genotyper program and the LD-aware variant caller Thunder. Thunder was found to improve concordance in a coverage-dependent fashion, while correctly calling nearly all of the common variants as well as a high percentage of the rare variants present in the sample. Low-coverage WGS is a strategy that appears to collect genetic information intermediate in scope between fixed content genotyping arrays and deep-coverage WGS. Our data suggest that low-coverage WGS is a viable strategy with a greater chance of discovering novel variants and associations than fixed content arrays for large sample association analyses.
Komech, Ekaterina A; Pogorelyy, Mikhail V; Egorov, Evgeniy S; Britanova, Olga V; Rebrikov, Denis V; Bochkova, Anna G; Shmidt, Evgeniya I; Shostak, Nadejda A; Shugay, Mikhail; Lukyanov, Sergey; Mamedov, Ilgar Z; Lebedev, Yuriy B; Chudakov, Dmitriy M; Zvyagin, Ivan V
2018-02-22
The risk of ankylosing spondylitis (AS) is associated with genomic variants related to antigen presentation and specific cytokine signalling pathways, suggesting the involvement of cellular immunity in disease initiation/progression. The aim of the present study was to explore the repertoire of TCR sequences in healthy donors and AS patients to uncover AS-linked TCR variants. Using quantitative molecular-barcoded 5'-RACE, we performed deep TCR β repertoire profiling of peripheral blood (PB) and synovial fluid (SF) samples for 25 AS patients and 108 healthy donors. AS-linked TCR variants were identified using a new computational approach that relies on a probabilistic model of the VDJ rearrangement process. Using the donor-agnostic probabilistic model, we reveal a TCR β motif characteristic for PB of AS patients, represented by eight highly homologous amino acid sequence variants. Some of these variants were previously reported in SF and PB of patients with reactive arthritis (ReA) and in PB of AS patients. We demonstrate that the identified AS-linked clones have a CD8+ phenotype, are present at relatively low frequencies in PB, and are significantly enriched in matched SF samples of AS patients. Our results suggest the involvement of a particular antigen-specific subset of CD8+ T cells in AS pathogenesis, confirming and expanding earlier findings. The high similarity of the clonotypes with the ones found in ReA implies common mechanisms for the initiation of the diseases.
van den Akker, Jeroen; Mishne, Gilad; Zimmer, Anjali D; Zhou, Alicia Y
2018-04-17
Next generation sequencing (NGS) has become a common technology for clinical genetic tests. The quality of NGS calls varies widely and is influenced by features like reference sequence characteristics, read depth, and mapping accuracy. With recent advances in NGS technology and software tools, the majority of variants called using NGS alone are in fact accurate and reliable. However, a small subset of difficult-to-call variants that still require orthogonal confirmation exists. For this reason, many clinical laboratories confirm NGS results using orthogonal technologies such as Sanger sequencing. Here, we report the development of a deterministic machine-learning-based model to differentiate between these two types of variant calls: those that do not require confirmation using an orthogonal technology (high confidence), and those that require additional quality testing (low confidence). This approach allows reliable NGS-based calling in a clinical setting by identifying the few important variant calls that require orthogonal confirmation. We developed and tested the model using a set of 7179 variants identified by a targeted NGS panel and re-tested by Sanger sequencing. The model incorporated several signals of sequence characteristics and call quality to determine if a variant was identified at high or low confidence. The model was tuned to eliminate false positives, defined as variants that were called by NGS but not confirmed by Sanger sequencing. The model achieved very high accuracy: 99.4% (95% confidence interval: +/- 0.03%). It categorized 92.2% (6622/7179) of the variants as high confidence, and 100% of these were confirmed to be present by Sanger sequencing. Among the variants that were categorized as low confidence, defined as NGS calls of low quality that are likely to be artifacts, 92.1% (513/557) were found to be not present by Sanger sequencing. 
This work shows that NGS data contains sufficient characteristics for a machine-learning-based model to differentiate low from high confidence variants. Additionally, it reveals the importance of incorporating site-specific features as well as variant call features in such a model.
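The percentages reported above can be re-derived from the counts given in the abstract. The sketch below does only that arithmetic; the function and key names are hypothetical, and treating overall agreement as (confirmed high-confidence calls + absent low-confidence calls) / total is an assumption about how the 99.4% accuracy was defined, not something the abstract states.

```python
def summarize_calls(total: int, high_conf: int, high_confirmed: int, low_absent: int) -> dict:
    """Summarize agreement between model confidence classes and Sanger results."""
    low_conf = total - high_conf  # calls flagged for orthogonal confirmation
    return {
        "low_conf": low_conf,
        "pct_high_conf": 100 * high_conf / total,
        "pct_high_confirmed": 100 * high_confirmed / high_conf,
        "pct_low_absent": 100 * low_absent / low_conf,
        # Assumed definition of overall agreement (see lead-in).
        "pct_agreement": 100 * (high_confirmed + low_absent) / total,
    }


# Counts taken from the abstract: 7179 variants, 6622 high confidence
# (all confirmed by Sanger), 513 low-confidence calls not present by Sanger.
summary = summarize_calls(total=7179, high_conf=6622, high_confirmed=6622, low_absent=513)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Under that assumed definition, the counts reproduce the quoted 92.2%, 92.1% and 99.4% figures after rounding to one decimal place.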
Characterization of a novel variant of Mycobacterium chimaera.
van Ingen, J; Hoefsloot, W; Buijtels, P C A M; Tortoli, E; Supply, P; Dekhuijzen, P N R; Boeree, M J; van Soolingen, D
2012-09-01
In this study, nonchromogenic mycobacteria were isolated from pulmonary samples of three patients in the Netherlands. All isolates had identical, unique 16S rRNA gene and 16S-23S ITS sequences, which were closely related to those of Mycobacterium chimaera and Mycobacterium marseillense. The biochemical features of the isolates differed slightly from those of M. chimaera, suggesting that the isolates may represent a possible separate species within the Mycobacterium avium complex (MAC). However, the cell-wall mycolic acid pattern, analysed by HPLC, and the partial sequences of the hsp65 and rpoB genes were identical to those of M. chimaera. We concluded that the isolates represent a novel variant of M. chimaera. The results of this analysis have led us to question the currently used methods of species definition for members of the genus Mycobacterium, which are based largely on 16S rRNA or rpoB gene sequencing. Definitions based on a single genetic target are likely to be insufficient. Genetic divergence, especially in the MAC, yields strains that cannot be confidently assigned to a specific species based on the analysis of a single genetic target.
Al-Muhaizea, Mohammad A; AlMutairi, Faten; Almass, Rawan; AlHarthi, Safinaz; Aldosary, Mazhor S; Alsagob, Maysoon; AlOdaib, Ali; Colak, Dilek; Kaya, Namik
2018-06-01
The objective of this study was the identification of likely genes and mutations associated with an autosomal recessive (AR) rare spinocerebellar ataxia (SCA) phenotype in two patients with infantile onset, from a consanguineous family. Using genome-wide SNP screening, autozygosity mapping, targeted Sanger sequencing and nextgen sequencing, family segregation analysis, and comprehensive neuropanel, we discovered a novel mutation in SPTBN2. Next, we utilized multiple sequence alignment of amino acids from various species as well as crystal structures provided by protein data bank (PDB# 1WYQ and 1WJM) to model the mutation site and its effect on β-III-spectrin. Finally, we used various bioinformatic classifiers to determine pathogenicity of the missense variant. A comprehensive clinical and diagnostic workup including radiological exams were performed on the patients as part of routine patient care. The homozygous missense variant (c.1572C>T; p.R414C) detected in exon 2 was fully segregated in the family and absent in a large ethnic cohort as well as publicly available data sets. Our comprehensive targeted sequencing approaches did not reveal any other likely candidate variants or mutations in both patients. The two male siblings presented with delayed motor milestones and cognitive and learning disability. Brain MRI revealed isolated cerebellar atrophy more marked in midline inferior vermis at ages of 3 and 6.5 years. Sequence alignments of the amino acids for β-III-spectrin indicated that the arginine at 414 is highly conserved among various species and located towards the end of first spectrin repeat domain. Inclusive bioinformatic analysis predicted that the variant is to be damaging and disease causing. In addition to the novel mutation, a brief literature review of the previously reported mutations as well as clinical comparison of the cases were also presented. Our study reviews the previously reported SPTBN2 mutations and cases. 
Moreover, the novel mutation, p.R414C, adds to the literature for the infantile-onset form of autosomal recessive ataxia associated with SPTBN2. Previously, few SPTBN2 recessive mutations have been reported in humans. Animal models, especially the β-III -/- mouse model, provided insights into early coordination and gait deficit suggestive of loss-of-function. It is expected to see more recessive SPTBN2 mutations appearing in the literature during the upcoming years.
Wozniak, D J; Hsu, L Y; Galloway, D R
1988-01-01
Exotoxin A (ETA) is recognized as the most toxic product associated with the opportunistic pathogen Pseudomonas aeruginosa. Identification of the amino acids in the polypeptide sequence that are required for toxin activity is critical for vaccine development. By defining the nucleotide sequence of the structural gene of a mutant that encodes an enzymatically inactive ETA (CRM 66), we identified an essential amino acid (His-426), which is involved in the ADP-ribosyltransferase activity associated with functional ETA. A monoclonal antibody that inhibits ETA enzymatic activity in vitro fails to react with ETA variants that have a His-426→Tyr substitution. Several mono-ADP-ribosylating toxins, including diphtheria and pertussis toxins, carry within their primary amino acid sequences a histidine residue that is conserved in spacing and in location with respect to other critical residues. Analysis of the three-dimensional structure of ETA revealed that His-426 is not associated with the proposed NAD+ binding site. These findings should be useful for the design and construction of toxin vaccines. PMID:3143111
Treccani, Laura; Mann, Karlheinz; Heinemann, Fabian; Fritz, Monika
2006-01-01
We have isolated a new protein from the nacreous layer of the shell of the sea snail Haliotis laevigata (abalone). Amino acid sequence analysis showed the protein to consist of 134 amino acids and to contain three sequence repeats of ∼40 amino acids which were very similar to the well-known whey acidic protein domains of other proteins. The new protein was therefore named perlwapin. In addition to the major sequence, we identified several minor variants. Atomic force microscopy was used to explore the interaction of perlwapin with calcite crystals. Monomolecular layers of calcite crystals dissolve very slowly in deionized water and recrystallize in supersaturated calcium carbonate solution. When perlwapin was dissolved in the supersaturated calcium carbonate solution, growth of the crystal was inhibited immediately. Perlwapin molecules bound tightly to distinct step edges, preventing the crystal layers from growing. Using lower concentrations of perlwapin in a saturated calcium carbonate solution, we could distinguish native, active perlwapin molecules from denaturated ones. These observations showed that perlwapin can act as a growth inhibitor for calcium carbonate crystals in saturated calcium carbonate solution. The function of perlwapin in nacre growth may be to inhibit the growth of certain crystallographic planes in the mineral phase of the polymer/mineral composite nacre. PMID:16861275
Exome sequencing reveals novel genetic loci influencing obesity-related traits in Hispanic children
USDA-ARS's Scientific Manuscript database
To perform whole exome sequencing in 928 Hispanic children and identify variants and genes associated with childhood obesity. Single-nucleotide variants (SNVs) were identified from Illumina whole exome sequencing data using integrated read mapping, variant calling, and an annotation pipeline (Mercury...
Litim, Nadhir; Labrie, Yvan; Desjardins, Sylvie; Ouellette, Geneviève; Plourde, Karine; Belleau, Pascal; Durocher, Francine
2013-02-01
The majority of genes associated with breast cancer susceptibility, including BRCA1 and BRCA2 genes, are involved in DNA repair mechanisms. Moreover, among the genes recently associated with an increased susceptibility to breast cancer, four are Fanconi Anemia (FA) genes: FANCD1/BRCA2, FANCJ/BACH1/BRIP1, FANCN/PALB2 and FANCO/RAD51C. FANCA is implicated in DNA repair and has been shown to interact directly with BRCA1. It has been proposed that the formation of FANCA/G (dependent upon the phosphorylation of FANCA) and FANCB/L sub-complexes, together with FANCM, represents the initial step for DNA repair activation and subsequent formation of other sub-complexes leading to ubiquitination of FANCD2 and FANCI. As only approximately 25% of inherited breast cancers are attributable to BRCA1/2 mutations, FANCA therefore becomes an attractive candidate for breast cancer susceptibility. We thus analyzed the FANCA gene in 97 high-risk French Canadian non-BRCA1/2 breast cancer individuals by direct sequencing as well as in 95 healthy control individuals from the same population. Among a total of 85 sequence variants found in either or both series, 28 are coding variants and 19 of them are missense variations leading to amino acid change. Three of the amino acid changes, namely Thr561Met, Cys625Ser and particularly Ser1088Phe, which has been previously reported to be associated with FA, are predicted to be damaging by the SIFT and PolyPhen software. cDNA amplification revealed significant expression of 4 alternative splicing events (insertion of an intronic portion of intron 10, and the skipping of exons 11, 30 and 31). In silico analyses of relevant genomic variants have been performed in order to identify potential variations involved in the expression of these spliced transcripts. Sequence variants in FANCA could therefore be potential spoilers of the Fanconi-BRCA pathway and as a result, they could in turn have an impact in non-BRCA1/2 breast cancer families.
Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Shakibaie, Mohammad Reza; Azizi, Omid; Shahcheraghi, Fereshteh
2017-07-01
Metallo-β-lactamases (MBLs) such as IMPs are broad-spectrum β-lactamases that inactivate virtually all β-lactam antibiotics including carbapenems. In this study, we investigated the hydrolytic activity, phylogenetic relationship, three dimensional (3D) structure including zinc binding motif of a new IMP variant (IMP-55) identified in a clinical strain of Acinetobacter baumannii (AB). AB strain 56 was isolated from an adult ICU of a teaching hospital in Kerman, Iran. It exhibited MIC 32μg/ml to imipenem and showed MBL activity. Hydrolytic property of the MBL enzyme was measured phenotypically. Presence of bla IMP gene encoded by class 1 integrons was detected by PCR-sequencing. Phylogenetic tree of IMP protein was constructed using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and 3D model including zinc binding motif was predicted by bioinformatics softwares. Analysis of IMP sequence led to the identification of a novel IMP-type designated as IMP-55 (GenBank: KU299753.1; UniprotKB: A0A0S2MTX2). Impact in term of hydrolytic activity compared to the closest variants suggested efficient imipenem hydrolysis by this enzyme. Evolutionary distance matrix assessment indicated that IMP-55 protein is not closely related to other A. baumannii IMPs, however, shared 98% homology with Escherichia coli IMP-30 (UniprotKB: A0A0C5PJR0) and Pseudomonas aeruginosa IMP-1 (UniprotKB: Q19KT1). It consisted of five α-helices, ten β-sheets and six loops. A monovalent zinc ion attached to core of enzyme via His95, His97, His157 and Cys176. Multiple amino acid sequence alignments and mutational trajectory with reported IMPs showed 4 amino acid substitutions at positions 12(Phe→Ile), 31(Asp→Glu), 172(Leu→Phe) and 185(Asn→Lys). We suggest that the pleiotropic effect of mutations due to frequent administration of imipenem is responsible for emergence of new IMP variant in our hospitals. Copyright © 2017 Elsevier B.V. All rights reserved.
Thonberg, Håkan; Chiang, Huei-Hsin; Lilius, Lena; Forsell, Charlotte; Lindström, Anna-Karin; Johansson, Charlotte; Björkström, Jenny; Thordardottir, Steinunn; Sleegers, Kristel; Van Broeckhoven, Christine; Rönnbäck, Annica; Graff, Caroline
2017-06-09
Alzheimer disease (AD) is a progressive neurodegenerative disorder and the most common form of dementia. The majority of AD cases are sporadic, while up to 5% are families with an early onset AD (EOAD). Mutations in one of the three genes: amyloid beta precursor protein (APP), presenilin 1 (PSEN1) or presenilin 2 (PSEN2) can be disease causing. However, most EOAD families do not carry mutations in any of these three genes, and candidate genes, such as the sortilin-related receptor 1 (SORL1), have been suggested to be potentially causative. To identify AD causative variants, we performed whole-exome sequencing on five individuals from a family with EOAD and a missense variant, p.Arg1303Cys (c.3907C > T) was identified in SORL1 which segregated with disease and was further characterized with immunohistochemistry on two post mortem autopsy cases from the same family. In a targeted re-sequencing effort on independent index patients from 35 EOAD-families, a second SORL1 variant, c.3050-2A > G, was found which segregated with the disease in 3 affected and was absent in one unaffected family member. The c.3050-2A > G variant is located two nucleotides upstream of exon 22 and was shown to cause exon 22 skipping, resulting in a deletion of amino acids Gly1017- Glu1074 of SORL1. Furthermore, a third SORL1 variant, c.5195G > C, recently identified in a Swedish case control cohort included in the European Early-Onset Dementia (EU EOD) consortium study, was detected in two affected siblings in a third family with familial EOAD. The finding of three SORL1-variants that segregate with disease in three separate families with EOAD supports the involvement of SORL1 in AD pathology. The cause of these rare monogenic forms of EOAD has proven difficult to find and the use of exome and genome sequencing may be a successful route to target them.
Hall, L; Laird, J E; Craig, R K
1984-01-01
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
Norton, Nadine; Li, Duanxiang; Rampersaud, Evadnie; Morales, Ana; Martin, Eden R; Zuchner, Stephan; Guo, Shengru; Gonzalez, Michael; Hedges, Dale J; Robertson, Peggy D; Krumm, Niklas; Nickerson, Deborah A; Hershberger, Ray E
2013-04-01
BACKGROUND- Familial dilated cardiomyopathy (DCM) is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic DCM cases. METHODS AND RESULTS- We used an unbiased genome-wide approach using both linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families to identify genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak, multipoint logarithm of odds, 1.59. We identified 6 TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (logarithm of odds, 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable with 5 other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ≈5400 cases from the Exome Sequencing Project was ≈23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity logarithm of odds score of 1.74. CONCLUSIONS- These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.
Byers, Helen; Wallis, Yvonne; van Veen, Elke M; Lalloo, Fiona; Reay, Kim; Smith, Philip; Wallace, Andrew J; Bowers, Naomi; Newman, William G; Evans, D Gareth
2016-11-01
The sensitivity of testing BRCA1 and BRCA2 remains unresolved as the frequency of deep intronic splicing variants has not been defined in high-risk familial breast/ovarian cancer families. This variant category is reported at significant frequency in other tumour predisposition genes, including NF1 and MSH2. We carried out comprehensive whole gene RNA analysis on 45 high-risk breast/ovary and male breast cancer families with no identified pathogenic variant on exonic sequencing and copy number analysis of BRCA1/2. In addition, we undertook variant screening of a 10-gene high/moderate risk breast/ovarian cancer panel by next-generation sequencing. DNA testing identified the causative variant in 50/56 (89%) breast/ovarian/male breast cancer families with Manchester scores of ≥50 with two variants being confirmed to affect splicing on RNA analysis. RNA sequencing of BRCA1/BRCA2 on 45 individuals from high-risk families identified no deep intronic variants and did not suggest loss of RNA expression as a cause of lost sensitivity. Panel testing in 42 samples identified a known RAD51D variant, a high-risk ATM variant in another breast ovary family and a truncating CHEK2 mutation. Current exonic sequencing and copy number analysis variant detection methods of BRCA1/2 have high sensitivity in high-risk breast/ovarian cancer families. Sequence analysis of RNA does not identify any variants undetected by current analysis of BRCA1/2. However, RNA analysis clarified the pathogenicity of variants of unknown significance detected by current methods. The low diagnostic uplift achieved through sequence analysis of the other known breast/ovarian cancer susceptibility genes indicates that further high-risk genes remain to be identified.
ACTG: novel peptide mapping onto gene models.
Choi, Seunghyuk; Kim, Hyunwoo; Paek, Eunok
2017-04-15
In many proteogenomic applications, mapping peptide sequences onto genome sequences can be very useful, because it allows us to understand origins of the gene products. Existing software tools either take the genomic position of a peptide start site as an input or assume that the peptide sequence exactly matches the coding sequence of a given gene model. In case of novel peptides resulting from genomic variations, especially structural variations such as alternative splicing, these existing tools cannot be directly applied unless users supply information about the variant, either its genomic position or its transcription model. Mapping potentially novel peptides to genome sequences, while allowing certain genomic variations, requires introducing novel gene models when aligning peptide sequences to gene structures. We have developed a new tool called ACTG (Amino aCids To Genome), which maps peptides to genome, assuming all possible single exon skipping, junction variation allowing three edit distances from the original splice sites, exon extension and frame shift. In addition, it can also consider SNVs (single nucleotide variations) during mapping phase if a user provides the VCF (variant call format) file as an input. Available at http://prix.hanyang.ac.kr/ACTG/search.jsp . eunokpaek@hanyang.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
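The exon-skipping part of the mapping idea described above can be illustrated with a small Python sketch. This is not the ACTG implementation: the function names, the toy exon sequences, and the brute-force substring search are all assumptions made for illustration, and junction variation, exon extension, and VCF-supplied SNVs are left out.

```python
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string starting at the given frame offset (0, 1 or 2)."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def candidate_transcripts(exons):
    """Yield the canonical transcript plus every single internal exon skip."""
    yield "canonical", "".join(exons)
    for k in range(1, len(exons) - 1):
        yield f"skip_exon_{k + 1}", "".join(exons[:k] + exons[k + 1:])


def map_peptide(peptide, exons):
    """Return (transcript name, frame) pairs whose translation contains the peptide."""
    hits = []
    for name, transcript in candidate_transcripts(exons):
        for frame in range(3):  # trying all three frames tolerates frame shifts
            if peptide in translate(transcript, frame):
                hits.append((name, frame))
    return hits


# Hypothetical three-exon gene model (forward strand, exon sequences only).
toy_exons = ["ATGGCT", "GGAGGA", "AAATTT"]
print(map_peptide("AKF", toy_exons))
```

In this toy model the peptide AKF is only found when the second exon is skipped, which is the kind of novel gene model the abstract refers to.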
Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi
2015-01-01
Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and their relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequencing typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from "Tua Nao" of Thailand traces a different evolutionary process from other strains.
Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji
2010-07-01
We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
Guaranteed Discrete Energy Optimization on Large Protein Design Problems.
Simoncini, David; Allouche, David; de Givry, Simon; Delmas, Céline; Barbe, Sophie; Schiex, Thomas
2015-12-08
In Computational Protein Design (CPD), assuming a rigid backbone and amino-acid rotamer library, the problem of finding a sequence with an optimal conformation is NP-hard. In this paper, using Dunbrack's rotamer library and Talaris2014 decomposable energy function, we use an exact deterministic method combining branch and bound, arc consistency, and tree-decomposition to provably identify the global minimum energy sequence-conformation on full-redesign problems, defining search spaces of size up to 10^234. This is achieved on a single core of a standard computing server, requiring a maximum of 66 GB RAM. A variant of the algorithm is able to exhaustively enumerate all sequence-conformations within an energy threshold of the optimum. These proven optimal solutions are then used to evaluate the frequencies and amplitudes, in energy and sequence, at which an existing CPD-dedicated simulated annealing implementation may miss the optimum on these full redesign problems. The probability of finding an optimum drops close to 0 very quickly. In the worst case, despite 1,000 repeats, the annealing algorithm remained more than 1 Rosetta unit away from the optimum, leading to design sequences that could differ from the optimal sequence by more than 30% of their amino acids.
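To make the idea of exact search over a decomposable energy concrete, here is a minimal Python sketch of plain depth-first branch and bound on a toy pairwise energy. It is only an illustration under stated assumptions: the data layout (`unary`, `pair`), the naive lower bound, and the toy numbers are hypothetical, and the arc consistency and tree-decomposition components named in the abstract are not implemented.

```python
def total_energy(assign, unary, pair):
    """Energy of a (possibly partial) assignment covering positions 0..len(assign)-1."""
    n = len(assign)
    energy = sum(unary[i][assign[i]] for i in range(n))
    energy += sum(pair[(i, j)][assign[i]][assign[j]]
                  for i in range(n) for j in range(i + 1, n))
    return energy


def lower_bound(partial, unary, pair):
    """Optimistic bound: exact energy of the assigned part plus best-case terms."""
    n, k = len(unary), len(partial)
    bound = total_energy(partial, unary, pair)
    for j in range(k, n):
        # Best choice at position j against the positions already assigned.
        bound += min(unary[j][b] + sum(pair[(i, j)][partial[i]][b] for i in range(k))
                     for b in range(len(unary[j])))
        # Best-case interaction between two unassigned positions.
        for l in range(j + 1, n):
            bound += min(min(row) for row in pair[(j, l)])
    return bound


def branch_and_bound(unary, pair):
    """Return (minimum energy, assignment) by depth-first branch and bound."""
    best = [float("inf"), None]

    def visit(partial):
        if lower_bound(partial, unary, pair) >= best[0]:
            return  # this subtree cannot improve on the incumbent
        if len(partial) == len(unary):
            best[0], best[1] = total_energy(partial, unary, pair), tuple(partial)
            return
        for choice in range(len(unary[len(partial)])):
            visit(partial + [choice])

    visit([])
    return best[0], best[1]


# Hypothetical toy problem: 3 positions, 2 rotamer choices each, integer energies.
toy_unary = [[0, 2], [1, 0], [3, 1]]
toy_pair = {
    (0, 1): [[0, 4], [2, 0]],
    (0, 2): [[1, 0], [0, 3]],
    (1, 2): [[2, 0], [0, 5]],
}
print(branch_and_bound(toy_unary, toy_pair))
```

Because the bound never overestimates the energy of any completion, pruning cannot discard the optimum; on a problem this small the result can be checked against exhaustive enumeration.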
Global variation in CYP2C8–CYP2C9 functional haplotypes
Speed, William C; Kang, Soonmo Peter; Tuck, David P; Harris, Lyndsay N; Kidd, Kenneth K
2009-01-01
We have studied the global frequency distributions of 10 single nucleotide polymorphisms (SNPs) across 132 kb of CYP2C8 and CYP2C9 in ∼2500 individuals representing 45 populations. Five of the SNPs were in noncoding sequences; the other five involved the more common missense variants (four in CYP2C8, one in CYP2C9) that change amino acids in the gene products. One haplotype containing two CYP2C8 coding variants and one CYP2C9 coding variant reaches an average frequency of 10% in Europe; a set of haplotypes with a different CYP2C8 coding variant reaches 17% in Africa. In both cases these haplotypes are found in other regions of the world at <1%. This considerable geographic variation in haplotype frequencies impacts the interpretation of CYP2C8/CYP2C9 association studies, and has pharmacogenomic implications for drug interactions. PMID:19381162
SIBIS: a Bayesian model for inconsistent protein sequence estimation.
Khenoussi, Walyd; Vanhoutrève, Renaud; Poch, Olivier; Thompson, Julie D
2014-09-01
The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. We have developed a new method, called SIBIS, to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments. We evaluated the performance of SIBIS on a reference set of protein sequences with experimentally validated errors and showed that the sensitivity is significantly higher than previous methods, with only a small loss of specificity. We also assessed a large set of human sequences from the UniProt database and found evidence of inconsistency in 48% of the previously uncharacterized sequences. We conclude that the integration of quality control methods like SIBIS in automatic analysis pipelines will be critical for the robust inference of structural, functional and phylogenetic information from these sequences. Source code, implemented in C on a linux system, and the datasets of protein sequences are freely available for download at http://www.lbgi.fr/∼julie/SIBIS. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
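The Bayesian step can be pictured with a much simpler stand-in than the method described above. The Python sketch below uses a single symmetric Dirichlet prior (not the Dirichlet mixtures the abstract mentions) to estimate per-column amino acid probabilities from an alignment and flags query residues that look improbable. The function names, the pseudocount `alpha`, and the `threshold` are assumptions for illustration and are not taken from SIBIS.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_posterior(column, alpha=0.5):
    """Posterior mean probability of each amino acid in one alignment column.

    With a symmetric Dirichlet(alpha) prior and observed counts n_a, the
    posterior mean is (n_a + alpha) / (N + 20 * alpha).
    """
    counts = {a: 0 for a in AMINO_ACIDS}
    for residue in column:
        if residue in counts:  # gaps and unknown symbols are ignored
            counts[residue] += 1
    total = sum(counts.values()) + alpha * len(AMINO_ACIDS)
    return {a: (counts[a] + alpha) / total for a in AMINO_ACIDS}


def flag_unlikely(query, alignment, threshold=0.05, alpha=0.5):
    """Return query positions whose residue has posterior probability below threshold."""
    flagged = []
    for pos, residue in enumerate(query):
        if residue not in AMINO_ACIDS:
            continue
        column = [seq[pos] for seq in alignment]
        if column_posterior(column, alpha)[residue] < threshold:
            flagged.append(pos)
    return flagged


# Hypothetical toy alignment; the query is assumed to be aligned to it already.
toy_alignment = ["MKV", "MKV", "MKV", "MRV"]
print(flag_unlikely("MKW", toy_alignment))
```

A mixture of Dirichlet components would replace the single pseudocount with weights learned from known protein families, which is why this should be read as a simplification rather than a reproduction of the published model.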
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research
Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.
2016-01-01
Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149
Jacobson, D R; Gorevic, P D; Buxbaum, J N
1990-01-01
Senile systemic amyloidosis (SSA) is a late-onset disease characterized by deposition of amyloid fibrils containing transthyretin (TTR). Amino acid sequencing of protein isolated from the amyloid fibrils of a patient with SSA identified TTR containing a position-122 isoleucine-for-valine substitution. This change led to the prediction of a genomic G-to-A transition, destroying an MaeIII restriction site. We confirmed the presence of the variant DNA fragment both by Southern blotting and by visualization of MaeIII digests of DNA amplified around codon 122, by using the polymerase chain reaction. The patient's DNA was entirely resistant to MaeIII cleavage; therefore, only the mutant sequence was present. None of the DNA samples from 24 controls or six other SSA patients contained the variant. Quantitative Southern blotting demonstrated that the patient's DNA contained two copies of the TTR gene per genome; the mutation was therefore homozygous rather than hemizygous. In the present case, the homozygous mutation TTR (122 Val→Ile) is associated with SSA, a finding which is consistent with autosomal recessive inheritance of this condition. PMID:2349941
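The reasoning that a G-to-A transition removes a restriction site can be checked with a few lines of Python. The sketch assumes the commonly cited MaeIII recognition sequence GTNAC (N = any base); the short flanking sequences are hypothetical and chosen only so that a valine codon (GTC) overlaps the site, they are not the actual TTR amplicon.

```python
import re

# Assumed MaeIII recognition sequence: G T N A C, with N any base.
# The lookahead lets overlapping sites be reported as well.
MAEIII_PATTERN = re.compile(r"(?=GT[ACGT]AC)")


def maeiii_sites(dna):
    """Return 0-based start positions of every GTNAC match on the given strand."""
    return [m.start() for m in MAEIII_PATTERN.finditer(dna.upper())]


# Hypothetical fragments: the valine codon GTC becomes the isoleucine codon ATC.
wild_type = "CCAGTCACTT"
mutant = "CCAATCACTT"

print(maeiii_sites(wild_type))  # site present, fragment would be cut
print(maeiii_sites(mutant))     # site lost, fragment resists digestion
```

An amplified fragment that loses its only site stays uncut after digestion, which matches the complete resistance to MaeIII cleavage reported for the homozygous patient.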
2013-01-01
Background: Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in resolving genetic variants that are genuine from the millions of artefactual signals. Results: FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first cousin pairs.
The principles described herein were applied in our recent publication identifying XRCC2 as a new breast cancer risk gene and have been made publicly available as a suite of software tools. Conclusions: FAVR is a platform-agnostic suite of methods that significantly enhances the analysis of large volumes of sequencing data for the study of rare genetic variants and their influence on phenotypes. PMID:23441864
Whole-genome sequence-based analysis of thyroid function.
Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J; Traglia, Michela; Brown, Suzanne J; Mullin, Benjamin H; Shihab, Hashem A; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R; Beilby, John P; Charoen, Pimphen; Danecek, Petr; Dudbridge, Frank; Forgetta, Vincenzo; Greenwood, Celia; Grundberg, Elin; Johnson, Andrew D; Hui, Jennie; Lim, Ee M; McCarthy, Shane; Muddyman, Dawn; Panicker, Vijay; Perry, John R B; Bell, Jordana T; Yuan, Wei; Relton, Caroline; Gaunt, Tom; Schlessinger, David; Abecasis, Goncalo; Cucca, Francesco; Surdulescu, Gabriela L; Woltersdorf, Wolfram; Zeggini, Eleftheria; Zheng, Hou-Feng; Toniolo, Daniela; Dayan, Colin M; Naitza, Silvia; Walsh, John P; Spector, Tim; Davey Smith, George; Durbin, Richard; Richards, J Brent; Sanna, Serena; Soranzo, Nicole; Timpson, Nicholas J; Wilson, Scott G
2015-03-06
Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 10^-9) and a new independent variant in PDE8B (MAF=10.4%, P=5.94 × 10^-14). For FT4, we report a low-frequency variant near B4GALT6/SLC25A52 (MAF=3.2%, P=1.27 × 10^-9) tagging a rare TTR variant (MAF=0.4%, P=2.14 × 10^-11). All common variants explain ≥20% of the variance in TSH and FT4. Analysis of rare variants (MAF<1%) using sequence kernel association testing reveals a novel association with FT4 in NRG1. Our results demonstrate that increased coverage in whole-genome sequence association studies identifies novel variants associated with thyroid function.
Mitochondrial targeting sequence variants of the CHCHD2 gene are a risk for Lewy body disorders
Ogaki, Kotaro; Koga, Shunsuke; Heckman, Michael G.; Fiesel, Fabienne C.; Ando, Maya; Labbé, Catherine; Lorenzo-Betancor, Oswaldo; Moussaud-Lamodière, Elisabeth L.; Soto-Ortolaza, Alexandra I.; Walton, Ronald L.; Strongosky, Audrey J.; Uitti, Ryan J.; McCarthy, Allan; Lynch, Timothy; Siuda, Joanna; Opala, Grzegorz; Rudzinska, Monika; Krygowska-Wajs, Anna; Barcikowska, Maria; Czyzewski, Krzysztof; Puschmann, Andreas; Nishioka, Kenya; Funayama, Manabu; Hattori, Nobutaka; Parisi, Joseph E.; Petersen, Ronald C.; Graff-Radford, Neill R.; Boeve, Bradley F.; Springer, Wolfdieter; Wszolek, Zbigniew K.; Dickson, Dennis W.
2015-01-01
Objective: To assess the role of CHCHD2 variants in patients with Parkinson disease (PD) and Lewy body disease (LBD) in Caucasian populations. Methods: All exons of the CHCHD2 gene were sequenced in a US Caucasian patient-control series (878 PD, 610 LBD, and 717 controls). Subsequently, exons 1 and 2 were sequenced in an Irish series (355 PD and 365 controls) and a Polish series (394 PD and 350 controls). Immunohistochemistry and immunofluorescence studies were performed on pathologic LBD cases with rare CHCHD2 variants. Results: We identified 9 rare exonic variants of unknown significance. These variants were more frequent in the combined group of PD and LBD patients compared to controls (0.6% vs 0.1%, p = 0.013). In addition, the presence of any rare variant was more common in patients with LBD (2.5% vs 1.0%, p = 0.050) compared to controls. Eight of these 9 variants were located within the gene's mitochondrial targeting sequence. Conclusions: Although the role of variants of the CHCHD2 gene in PD and LBD remains to be further elucidated, the rare variants in the mitochondrial targeting sequence may be a risk factor for Lewy body disorders, which may link CHCHD2 to other genetic forms of parkinsonism with mitochondrial dysfunction. PMID:26561290
Jin, Sheng Chih; Benitez, Bruno A; Deming, Yuetiva; Cruchaga, Carlos
2016-01-01
Analyses of genome-wide association studies (GWAS) for complex disorders usually identify common variants with a relatively small effect size that only explain a small proportion of phenotypic heritability. Several studies have suggested that a significant fraction of heritability may be explained by low-frequency (minor allele frequency (MAF) of 1-5%) and rare variants that are not contained in the commercial GWAS genotyping arrays (Schork et al., Curr Opin Genet Dev 19:212, 2009). Rare variants can also have relatively large effects on risk for developing human diseases or disease phenotype (Cruchaga et al., PLoS One 7:e31039, 2012). However, it is necessary to perform next-generation sequencing (NGS) studies in a large population (>4,000 samples) to detect a significant rare-variant association. Several NGS methods, such as custom capture sequencing and amplicon-based sequencing, are designed to screen a small proportion of the genome, but most of these methods are limited in the number of samples that can be multiplexed (i.e. most sequencing kits only provide 96 distinct indices). Additionally, the sequencing library preparation for 4,000 samples remains expensive, and thus conducting NGS studies with the aforementioned methods is not feasible for most research laboratories. The need for low-cost, large-scale rare-variant detection makes pooled-DNA sequencing an efficient and cost-effective technique to identify rare variants in target regions by sequencing hundreds to thousands of samples. Our recent work has demonstrated that pooled-DNA sequencing can accurately detect rare variants in targeted regions in multiple DNA samples with high sensitivity and specificity (Jin et al., Alzheimers Res Ther 4:34, 2012).
In these studies we used a well-established pooled-DNA sequencing approach and a computational package, SPLINTER (short indel prediction by large deviation inference and nonlinear true frequency estimation by recursion) (Vallania et al., Genome Res 20:1711, 2010), for accurate identification of rare variants in large DNA pools. Given an average sequencing coverage of 30× per haploid genome, SPLINTER can detect rare variants and short indels up to 4 base pairs (bp) with high sensitivity and specificity (up to 1 haploid allele in a pool as large as 500 individuals). Step-by-step instructions on how to conduct pooled-DNA sequencing experiments and data analyses are described in this chapter.
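The figures quoted in this abstract (30× coverage per haploid genome, one variant allele in a pool of up to 500 individuals) can be turned into a back-of-the-envelope read budget. The sketch below is only that arithmetic, assuming diploid samples pooled in equal amounts; it is not the SPLINTER algorithm, and the function name and parameters are hypothetical.

```python
def pooled_read_budget(individuals, coverage_per_haploid, variant_alleles=1, ploidy=2):
    """Rough expectations for a pooled-DNA sequencing run (illustrative only).

    Assumes every individual contributes equally to the pool.
    """
    haploid_genomes = individuals * ploidy                  # alleles present in the pool
    total_depth = haploid_genomes * coverage_per_haploid    # reads covering each site
    allele_fraction = variant_alleles / haploid_genomes     # expected variant read fraction
    # total_depth * allele_fraction simplifies to the product below,
    # which avoids floating-point rounding.
    expected_variant_reads = variant_alleles * coverage_per_haploid
    return total_depth, allele_fraction, expected_variant_reads


depth, fraction, reads = pooled_read_budget(individuals=500, coverage_per_haploid=30)
print(f"total depth per site: {depth}x")
print(f"expected variant allele fraction: {fraction}")
print(f"expected variant-supporting reads: {reads}")
```

Under these assumptions a singleton in a 500-person pool sits at an allele fraction of 1 in 1,000, so telling it apart from sequencing error is the hard part, which is the problem a dedicated caller is meant to address.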
Sequence variants in ESR1 and OXTR are associated with Mayer-Rokitansky-Küster-Hauser syndrome.
Brucker, Sara Yvonne; Frank, Liliane; Eisenbeis, Simone; Henes, Melanie; Wallwiener, Diethelm; Riess, Olaf; van Eijck, Barbara; Schöller, Dorit; Bonin, Michael; Rall, Kristin Katharina
2017-11-01
Mayer-Rokitansky-Küster-Hauser syndrome (MRKHS) is characterized by congenital absence of the uterus and the upper two-thirds of the vagina in otherwise phenotypically normal females. It is found isolated or associated with renal, skeletal and other malformations. Despite ongoing research, the etiology is mainly unknown. For a long time, the hypothesis of deficient hormone receptors as the cause for MRKHS has existed, supported by previous findings of our group. The aim of the present study was to identify unknown genetic causes for MRKHS and to compare them with data banks including a review of the literature. DNA sequence analysis of the oxytocin receptor (OXTR) and estrogen receptor-1 gene (ESR1) was performed in a group of 93 clinically well-defined patients with uterovaginal aplasia (68 with the isolated form and 25 with associated malformations). In total, we detected three OXTR variants in 18 MRKHS patients, one of which leads to a missense mutation, and six ESR1 variants in 21 MRKHS patients, two of which cause amino acid changes and are therefore potentially disease-causing. The identified variants on DNA level might impair receptor function through different molecular mechanisms. Mutations of ESR1 and OXTR are associated with MRKHS. Thus, we consider these genes potential candidates associated with the manifestation of MRKHS. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology, Acta Obstetricia et Gynecologica Scandinavica.
Jóri, Balazs; Kamps, Rick; Xanthoulea, Sofia; Delvoux, Bert; Blok, Marinus J; Van de Vijver, Koen K; de Koning, Bart; Oei, Felicia Trups; Tops, Carli M; Speel, Ernst Jm; Kruitwagen, Roy F; Gomez-Garcia, Encarna B; Romano, Andrea
2015-12-01
The risk to develop colorectal and endometrial cancers among subjects testing positive for a pathogenic Lynch syndrome mutation varies, making the risk prediction difficult. Genetic risk modifiers alter the risk conferred by inherited Lynch syndrome mutations, and their identification can improve genetic counseling. We aimed at identifying rare genetic modifiers of the risk of Lynch syndrome endometrial cancer. A family based approach was used to assess the presence of genetic risk modifiers among 35 Lynch syndrome mutation carriers having either a poor clinical phenotype (early age of endometrial cancer diagnosis or multiple cancers) or a neutral clinical phenotype. Putative genetic risk modifiers were identified by Next Generation Sequencing among a panel of 154 genes involved in endometrial physiology and carcinogenesis. A simple pipeline, based on an allele frequency lower than 0.001 and on predicted non-conservative amino-acid substitutions returned 54 variants that were considered putative risk modifiers. The presence of two or more risk modifying variants in women carrying a pathogenic Lynch syndrome mutation was associated with a poor clinical phenotype. A gene-panel is proposed that comprehends genes that can carry variants with putative modifying effects on the risk of Lynch syndrome endometrial cancer. Validation in further studies is warranted before considering the possible use of this tool in genetic counseling.
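The "simple pipeline" in this abstract has two stated criteria: an allele frequency lower than 0.001 and a predicted non-conservative amino-acid substitution. A minimal sketch of such a filter follows. The record layout, the gene labels and the boolean `non_conservative` flag are hypothetical stand-ins, since the abstract does not say how the prediction was made or how the data were stored.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str                 # hypothetical label, not a gene from the study panel
    allele_frequency: float   # population allele frequency
    non_conservative: bool    # assumed output of some substitution-effect predictor


def putative_modifiers(variants, max_af=0.001):
    """Keep variants rarer than max_af that are predicted non-conservative."""
    return [v for v in variants if v.allele_frequency < max_af and v.non_conservative]


def has_multiple_modifiers(variants, min_count=2, max_af=0.001):
    """True if a carrier holds at least min_count putative modifiers."""
    return len(putative_modifiers(variants, max_af)) >= min_count


# Made-up carrier used only to exercise the filter.
carrier = [
    Variant("GENE_A", 0.0004, True),
    Variant("GENE_B", 0.0200, True),    # too common
    Variant("GENE_C", 0.0001, False),   # conservative change
    Variant("GENE_D", 0.0009, True),
]
print([v.gene for v in putative_modifiers(carrier)])
print(has_multiple_modifiers(carrier))
```

The second helper mirrors the abstract's observation about carriers with two or more risk-modifying variants; the threshold of two is taken from the text, everything else is illustrative.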
Ito, Y; Ikeuchi, A; Imamura, C
2013-01-01
We aimed at constructing thermostable cellulase variants of cellobiohydrolase II, derived from the mesophilic fungus Phanerochaete chrysosporium, by using an advanced evolutionary molecular engineering method. By aligning the amino acid sequences of the catalytic domains of five thermophilic fungal CBH2 and PcCBH2 proteins, we identified 45 positions where the PcCBH2 genes differ from the consensus sequence of two to five thermophilic fungal CBH2s. PcCBH2 variants with the consensus mutations were obtained by a cell-free translation system that was chosen for easy evaluation of thermostability. From the small library of consensus mutations, advantageous mutations for improving thermostability were found to occur with much higher frequency relative to a random library. To further improve thermostability, advantageous mutations were accumulated within the wild-type gene. Finally, we obtained the most thermostable variant Mall4, which contained all 15 advantageous mutations found in this study. This variant had the same specific cellulase activity as the wild type and retained sufficient activity at 50°C for >72 h, whereas wild-type PcCBH2 retained much less activity under the same conditions. The history of the accumulation process indicated that evolution of PcCBH2 toward improved thermostability was ideally and rapidly accomplished through the evolutionary process employed in this study.
High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic.
Sealfon, Rachel; Gire, Stephen; Ellis, Crystal; Calderwood, Stephen; Qadri, Firdausi; Hensley, Lisa; Kellis, Manolis; Ryan, Edward T; LaRocque, Regina C; Harris, Jason B; Sabeti, Pardis C
2012-09-11
Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced. Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis between the newly sequenced and thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations identified that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways. Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.
Popova, Blagovesta; Schubert, Steffen; Bulla, Ingo; Buchwald, Daniela; Kramer, Wilfried
2015-01-01
A major challenge in gene library generation is to guarantee a large functional size and diversity that significantly increases the chances of selecting different functional protein variants. The use of trinucleotides mixtures for controlled randomization results in superior library diversity and offers the ability to specify the type and distribution of the amino acids at each position. Here we describe the generation of a high diversity gene library using tHisF of the hyperthermophile Thermotoga maritima as a scaffold. Combining various rational criteria with contingency, we targeted 26 selected codons of the thisF gene sequence for randomization at a controlled level. We have developed a novel method of creating full-length gene libraries by combinatorial assembly of smaller sub-libraries. Full-length libraries of high diversity can easily be assembled on demand from smaller and much less diverse sub-libraries, which circumvent the notoriously troublesome long-term archivation and repeated proliferation of high diversity ensembles of phages or plasmids. We developed a generally applicable software tool for sequence analysis of mutated gene sequences that provides efficient assistance for analysis of library diversity. Finally, practical utility of the library was demonstrated in principle by assessment of the conformational stability of library members and isolating protein variants with HisF activity from it. Our approach integrates a number of features of nucleic acids synthetic chemistry, biochemistry and molecular genetics to a coherent, flexible and robust method of combinatorial gene synthesis. PMID:26355961
Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob
2016-01-01
Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637
High local genetic diversity of canine parvovirus from Ecuador.
Aldaz, Jaime; García-Díaz, Juan; Calleros, Lucía; Sosa, Katia; Iraola, Gregorio; Marandino, Ana; Hernández, Martín; Panzera, Yanina; Pérez, Ruben
2013-09-27
Canine parvovirus (CPV) comprises three antigenic variants (2a, 2b, and 2c) that are distributed globally with different frequencies and levels of genetic variability. CPVs from central Ecuador were herein analyzed to characterize the strains and to provide new insights into local viral diversity, evolution, and pathogenicity. Variant prevalence was analyzed by PCR and partial sequencing for 53 CPV-positive samples collected during 2011 and 2012. The full-length VP2 gene was sequenced in 24 selected strains and a maximum-likelihood phylogenetic tree was constructed using both Ecuadorian and worldwide strains. Ecuadorian CPVs have a remarkable genetic diversity that includes the circulation of all three variants and the existence of different evolutionary groups or lineages. CPV-2c was the most prevalent variant (54.7%), confirming the spread of this variant in America. Ecuadorian CPV-2c strains clustered in two lineages, which represent the first evidence of polyphyletic CPV-2c circulating in South America. CPV-2a strains constituted 41.5% of the samples and clustered in a single lineage. The two detected CPV-2b strains (3.8%) were clearly polyphyletic and appeared related to Ecuadorian CPV-2a or foreign CPV-2b strains. Besides the substitution at residue 426 that is used to identify the variants, two amino acid changes occurred in Ecuadorian strains: Val139Iso and Thr440Ser. Ser(440) occurred in a biologically relevant domain of VP2 and is here described for the first time in CPV. The associations of Ecuadorian CPV-2c and CPV-2a with clinical symptoms indicate that dull mentation, hemorrhagic gastroenteritis and hypothermia occurred more frequently in infection with CPV-2c than with CPV-2a. Copyright © 2013 Elsevier B.V. All rights reserved.
Divergent Ah Receptor Ligand Selectivity during Hominin Evolution
Hubbard, Troy D.; Murray, Iain A.; Bisson, William H.; Sullivan, Alexis P.; Sebastian, Aswathy; Perry, George H.; Jablonski, Nina G.; Perdew, Gary H.
2016-01-01
We have identified a fixed nonsynonymous sequence difference between humans (Val381; derived variant) and Neandertals (Ala381; ancestral variant) in the ligand-binding domain of the aryl hydrocarbon receptor (AHR) gene. In an exome sequence analysis of four Neandertal and Denisovan individuals compared with nine modern humans, there are only 90 total nucleotide sites genome-wide for which archaic hominins are fixed for the ancestral nonsynonymous variant and the modern humans are fixed for the derived variant. Of those sites, only 27, including Val381 in the AHR, also have no reported variability in the human dbSNP database, further suggesting that this highly conserved functional variant is a rare event. Functional analysis of the amino acid variant Ala381 within the AHR carried by Neandertals and nonhuman primates indicate enhanced polycyclic aromatic hydrocarbon (PAH) binding, DNA binding capacity, and AHR mediated transcriptional activity compared with the human AHR. Also relative to human AHR, the Neandertal AHR exhibited 150–1000 times greater sensitivity to induction of Cyp1a1 and Cyp1b1 expression by PAHs (e.g., benzo(a)pyrene). The resulting CYP1A1/CYP1B1 enzymes are responsible for PAH first pass metabolism, which can result in the generation of toxic intermediates and perhaps AHR-associated toxicities. In contrast, the human AHR retains the ancestral sensitivity observed in primates to nontoxic endogenous AHR ligands (e.g., indole, indoxyl sulfate). Our findings reveal that a functionally significant change in the AHR occurred uniquely in humans, relative to other primates, that would attenuate the response to many environmental pollutants, including chemicals present in smoke from fire use during cooking. PMID:27486223
Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.
2015-01-01
This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
2014-01-01
Background The raw goat milk microbiota is considered a good source of novel bacteriocinogenic lactic acid bacteria (LAB) strains that can be exploited as an alternative for use as biopreservatives in foods. The constant demand for such alternative tools justifies studies that investigate the antimicrobial potential of such strains. Results The obtained data identified a predominance of Lactococcus and Enterococcus strains in raw goat milk microbiota with antimicrobial activity against Listeria monocytogenes ATCC 7644. Enzymatic assays confirmed the bacteriocinogenic nature of the antimicrobial substances produced by the isolated strains, and PCR reactions detected a variety of bacteriocin-related genes in their genomes. Rep-PCR identified broad genetic variability among the Enterococcus isolates, and close relations between the Lactococcus strains. The sequencing of PCR products from nis-positive Lactococcus allowed the identification of a predicted nisin variant not previously described and possessing a wide inhibitory spectrum. Conclusions Raw goat milk was confirmed as a good source of novel bacteriocinogenic LAB strains, having identified Lactococcus isolates possessing variations in their genomes that suggest the production of a nisin variant not yet described and with potential for use as biopreservatives in food due to its broad spectrum of action. PMID:24521354
Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai
2016-10-21
An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subject to population genetic analysis (F_ST) and neutrality tests (Tajima's D, Fu and Li's D* and Fu and Li's F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
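Haplotype diversity, one of the summary statistics reported above, is commonly estimated with Nei's formula, Hd = n/(n-1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sampled sequences. The sketch below assumes that estimator, which the abstract does not name. The per-haplotype counts are invented (eight haplotypes summing to 59 samples) and are not meant to reproduce the reported 0.828.

```python
def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from per-haplotype sample counts.

    Uses the algebraically equivalent integer form
    (n^2 - sum(c_i^2)) / (n * (n - 1)) to limit rounding error.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    sum_sq = sum(c * c for c in counts)
    return (n * n - sum_sq) / (n * (n - 1))


# Hypothetical counts: eight haplotypes observed in 59 samples.
hypothetical_counts = [20, 12, 9, 7, 5, 3, 2, 1]
print(round(haplotype_diversity(hypothetical_counts), 3))
```

The statistic is 0 when every sample carries the same haplotype and 1 when every sample carries a different one, so values near 0.8 indicate several haplotypes at appreciable frequency.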
Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L
2016-12-01
Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. 
Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.
Song, Dandan; Li, Ning; Liao, Lejian
2015-01-01
By generating enormous amounts of data at lower cost and in less time, whole-exome sequencing technologies provide dramatic opportunities for identifying disease genes implicated in Mendelian disorders. Since thousands of genomic variants can be sequenced in each exome, it is challenging to filter pathogenic variants in protein coding regions and reduce the number of missing true variants. Therefore, an automatic and efficient pipeline for finding disease variants in Mendelian disorders is designed by exploiting a combination of variant-filtering steps to analyze family-based exome sequencing data. Recent studies on the Freeman-Sheldon disease are revisited and show that the proposed method outperforms other existing candidate gene identification methods.
Xia, Pengpeng; Quan, Guomei; Yang, Yi; Zhao, Jing; Wang, Yiting; Zhou, Mingxu; Hardwidge, Philip R; Zhu, Jianzhong; Liu, Siguo; Zhu, Guoqiang
2018-02-26
The binding of F4+ enterotoxigenic Escherichia coli (ETEC) to its specific receptor on porcine intestinal epithelial cells is the initial step in F4+ ETEC infection. Porcine aminopeptidase N (APN) is a newly discovered receptor for F4 fimbriae that binds directly to FaeG adhesin, which is the major subunit of the F4 fimbriae variants F4ab, F4ac, and F4ad. We used overlapping peptide assays to map the APN-FaeG binding sites, which facilitated the identification of the APN-binding amino acids that are located in the same region of the FaeG variants, thereby limiting the major binding regions of APN to 13 peptides. To determine the core sequence motif, a panel of FaeG peptides with point mutations and FaeG mutants was constructed. Pull-down and binding reactivity assays using piglet intestines determined that the amino acids G159 of F4ab, N209 and L212 of F4ac, and A200 of F4ad were the critical residues for APN binding of FaeG. We further show, using ELISA and confocal microscopy assays, that amino acids 553-568 and 652-670 of APN comprise the linear epitope for FaeG binding in all three F4 fimbriae variants.
Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants.
Kim, Kyung; Seong, Moon-Woo; Chung, Won-Hyong; Park, Sung Sup; Leem, Sangseob; Park, Won; Kim, Jihyun; Lee, KiYoung; Park, Rae Woong; Kim, Namshin
2015-06-01
Sequencing depth, which is directly related to the cost and time required for the generation, processing, and maintenance of next-generation sequencing data, is an important factor in the practical utilization of such data in clinical fields. Unfortunately, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. Toward this, we sequenced ten germ-line blood samples from breast cancer patients on the Illumina platform GAII(x) at a high depth of ~200×. We observed that most function-related diverse variants in the human exonic regions could be detected at a sequencing depth of 120×. Furthermore, investigation using a diagnostic gene set showed that the number of clinical variants identified using exome sequencing reached a plateau at an average sequencing depth of about 120×. Moreover, the phenomena were consistent across the breast cancer samples.
Genome Wide Analysis of Fatty Acid Desaturation and Its Response to Temperature
Menard, Guillaume N.; Moreno, Jose Martin; Bryant, Fiona M.; Munoz-Azcarate, Olaya; Hassani-Pak, Keywan; Kurup, Smita
2017-01-01
Plants modify the polyunsaturated fatty acid content of their membrane and storage lipids in order to adapt to changes in temperature. In developing seeds, this response is largely controlled by the activities of the microsomal ω-6 and ω-3 fatty acid desaturases, FAD2 and FAD3. Although temperature regulation of desaturation has been studied at the molecular and biochemical levels, the genetic control of this trait is poorly understood. Here, we have characterized the response of Arabidopsis (Arabidopsis thaliana) seed lipids to variation in ambient temperature and found that heat inhibits both ω-6 and ω-3 desaturation in phosphatidylcholine, leading to a proportional change in triacylglycerol composition. Analysis of the 19 parental accessions of the multiparent advanced generation intercross (MAGIC) population showed that significant natural variation exists in the temperature responsiveness of ω-6 desaturation. A combination of quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS) using the MAGIC population suggests that ω-6 desaturation is largely controlled by cis-acting sequence variants in the FAD2 5′ untranslated region intron that determine the expression level of the gene. However, the temperature responsiveness of ω-6 desaturation is controlled by a separate QTL on chromosome 2. The identity of this locus is unknown, but genome-wide association studies identified potentially causal sequence variants within ∼40 genes in an ∼450-kb region of the QTL. PMID:28108698
Rydzanicz, Małgorzata; Stradomska, Teresa Joanna; Jurkiewicz, Elżbieta; Jamroz, Ewa; Gasperowicz, Piotr; Kostrzewa, Grażyna; Płoski, Rafał; Tylki-Szymańska, Anna
2017-11-01
Zellweger syndrome (ZS) is a consequence of a peroxisome biogenesis disorder (PBD) caused by the presence of a pathogenic mutation in one of the 13 genes from the PEX family. ZS is a severe multisystem condition characterized by neonatal appearance of symptoms and a shorter life. Here, we report a case of ZS with a mild phenotype, due to a novel PEX6 gene mutation. The patient presented subtle craniofacial dysmorphic features and slightly slower psychomotor development. At the age of 2 years, he was diagnosed with adrenal insufficiency, hypoacusis, and general deterioration. Magnetic resonance imaging showed a symmetrical hyperintense signal in the frontal and parietal white matter. Biochemical tests showed elevated liver transaminases, elevated serum very long chain fatty acids, and phytanic acid. After the death of the child at the age of 6 years, molecular diagnostics were continued in order to provide genetic counseling for his parents. Next generation sequencing (NGS) analysis with the TruSight One™ Sequencing Panel revealed a novel homozygous PEX6 p.Ala94Pro mutation. In silico prediction of variant severity suggested its possible benign effect. To conclude, in the milder phenotypes, adrenal insufficiency, hypoacusis, and leukodystrophy together seem to be pathognomonic for ZS.
Shin, Saeam; Kim, Yoonjung; Chul Oh, Seoung; Yu, Nae; Lee, Seung-Tae; Rak Choi, Jong; Lee, Kyung-A
2017-05-23
In this study, we validated the analytical performance of BRCA1/2 sequencing using Ion Torrent's new bench-top sequencer with amplicon panel with optimized bioinformatics pipelines. Using 43 samples that were previously validated by Illumina's MiSeq platform and/or by Sanger sequencing/multiplex ligation-dependent probe amplification, we amplified the target with the Oncomine™ BRCA Research Assay and sequenced on Ion Torrent S5 XL (Thermo Fisher Scientific, Waltham, MA, USA). We compared two bioinformatics pipelines for optimal processing of S5 XL sequence data: the Torrent Suite with a plug-in Torrent Variant Caller (Thermo Fisher Scientific), and commercial NextGENe software (Softgenetics, State College, PA, USA). All expected 681 single nucleotide variants, 15 small indels, and three copy number variants were correctly called, except one common variant adjacent to a rare variant on the primer-binding site. The sensitivity, specificity, false positive rate, and accuracy for detection of single nucleotide variant and small indels of S5 XL sequencing were 99.85%, 100%, 0%, and 99.99% for the Torrent Variant Caller and 99.85%, 99.99%, 0.14%, and 99.99% for NextGENe, respectively. The reproducibility of variant calling was 100%, and the precision of variant frequency also showed good performance with coefficients of variation between 0.32 and 5.29%. We obtained highly accurate data through uniform and sufficient coverage depth over all target regions and through optimization of the bioinformatics pipeline. We confirmed that our platform is accurate and practical for diagnostic BRCA1/2 testing in a clinical laboratory.
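The performance figures quoted in the abstract above follow from standard confusion-matrix definitions. The sketch below shows how such metrics, and a coefficient of variation for repeated variant-frequency measurements, could be computed. The class name, function names, and the counts in the usage note are hypothetical and are not taken from the study.

```python
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of calls compared against a reference method (hypothetical container)."""
    tp: int  # variants present in the reference and called
    fp: int  # variants called but absent from the reference
    tn: int  # reference-negative positions with no call
    fn: int  # variants present in the reference but missed


def performance(c: ConfusionCounts) -> dict:
    """Return the four metrics named in the abstract as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "false_positive_rate": c.fp / (c.fp + c.tn),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
    }


def coefficient_of_variation(values) -> float:
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)
```

For example, `performance(ConfusionCounts(tp=90, fp=20, tn=880, fn=10))` gives a sensitivity of 0.9 for these made-up counts. Specificity and false positive rate always sum to 1 under these definitions.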
Han, Jia; Liu, Ying; Rao, Fangwen; Nievergelt, Caroline M.; O’Connor, Daniel T.; Wang, Xingyu; Liu, Lisheng; Bu, Dingfang; Liang, Yu; Wang, Fang; Zhang, Luxia; Zhang, Hong; Chen, Yuqing; Wang, Haiyan
2013-01-01
Uromodulin (UMOD) genetic variants cause familial juvenile hyperuricemic nephropathy, characterized by hyperuricemia, decreased renal excretion of UMOD and uric acid; such findings suggest a role for UMOD in the regulation of plasma uric acid. We screened common variants across the UMOD locus in two populations, one from a community-based Chinese population, the other from California twins and siblings. Transcriptional activity of promoter variants was estimated in luciferase reporter plasmids transfected into HEK293 cells and mlMCD3 cells. By variance components in twin pairs, uric acid concentration and excretion were heritable traits. In the primary population from Beijing, we identified that carriers of haplotype GCC displayed higher plasma uric acid, and 3 UMOD promoter variants associated with plasma uric acid. UMOD promoter variants displayed reciprocal effects on urine uric acid excretion and plasma uric acid concentration, suggesting a primary effect on renal tubular handling of urate. These UMOD genetic marker-on-trait associations for uric acid were replicated in an independent American population sample. Site-directed mutagenesis at trait-associated UMOD promoter variants altered promoter activity in transfected luciferase reporter plasmids. These results suggest that UMOD promoter variants seem to initiate a cascade of transcriptional and biochemical changes influencing UMOD secretion, eventuating in elevation of plasma uric acid. PMID:23344472
Qualtieri, Antonio; Le, Pera Maria; Pedace, Vera; Magariello, Angela; Brancati, Carlo
2002-02-01
We have identified a new neutral hemoglobin variant in a pregnant Italian woman, that resulted from a GTG-->CTG replacement at codon 126 of the beta chain, corresponding to a Val-->Leu amino acid change at position beta126(H4). Thermal and isopropanol stability tests were normal and there were no abnormal clinical features. Routine electrophoretic and ion exchange chromatographic methods for hemoglobin separation failed to show this variant, but reversed phase high performance liquid chromatography revealed an abnormal peak eluting near the normal beta chain. No abnormal tryptic peptide was revealed on the high performance liquid chromatographic elution pattern of the total globin digest. The mutation was determined at the DNA level by amplification of the three beta exons by polymerase chain reaction and direct sequencing of one exon that showed an abnormal migration on single strand conformational polymorphism analysis.
de Bruin, Christiaan; Mericq, Verónica; Andrew, Shayne F.; van Duyvenvoorde, Hermine A.; Verkaik, Nicole S.; Losekoot, Monique; Porollo, Aleksey; Garcia, Hernán; Kuang, Yi; Hanson, Dan; Clayton, Peter; van Gent, Dik C.; Wit, Jan M.; Hwa, Vivian
2015-01-01
Context: Severe short stature can be caused by defects in numerous biological processes including defects in IGF-1 signaling, centromere function, cell cycle control, and DNA damage repair. Many syndromic causes of short stature are associated with medical comorbidities including hypogonadism and microcephaly. Objective: To identify an underlying genetic etiology in two siblings with severe short stature and gonadal failure. Design: Clinical phenotyping, genetic analysis, complemented by in vitro functional studies of the candidate gene. Setting: An academic pediatric endocrinology clinic. Patients or Other Participants: Two adult siblings (male patient [P1] and female patient 2 [P2]) presented with a history of severe postnatal growth failure (adult heights: P1, −6.8 SD score; P2, −4 SD score), microcephaly, primary gonadal failure, and early-onset metabolic syndrome in late adolescence. In addition, P2 developed a malignant gastrointestinal stromal tumor at age 28. Intervention(s): Single nucleotide polymorphism microarray and exome sequencing. Results: Combined microarray analysis and whole exome sequencing of the two affected siblings and one unaffected sister identified a homozygous variant in XRCC4 as the probable candidate variant. Sanger sequencing and mRNA studies revealed a splice variant resulting in an in-frame deletion of 23 amino acids. Primary fibroblasts (P1) showed a DNA damage repair defect. Conclusions: In this study we have identified a novel pathogenic variant in XRCC4, a gene that plays a critical role in non-homologous end-joining DNA repair. This finding expands the spectrum of DNA damage repair syndromes to include XRCC4 deficiency causing severe postnatal growth failure, microcephaly, gonadal failure, metabolic syndrome, and possibly tumor predisposition. PMID:25742519
Thorleifsson, Gudmar; Ahluwalia, Tarunveer S.; Steinthorsdottir, Valgerdur; Bjarnason, Helgi; Gudbjartsson, Daniel F.; Magnusson, Olafur T.; Sparsø, Thomas; Albrechtsen, Anders; Kong, Augustine; Masson, Gisli; Tian, Geng; Cao, Hongzhi; Nie, Chao; Kristiansen, Karsten; Husemoen, Lise Lotte; Thuesen, Betina; Li, Yingrui; Nielsen, Rasmus; Linneberg, Allan; Olafsson, Isleifur; Eyjolfsson, Gudmundur I.; Jørgensen, Torben; Wang, Jun; Hansen, Torben; Thorsteinsdottir, Unnur; Stefánsson, Kari; Pedersen, Oluf
2013-01-01
Genome-wide association studies have mainly relied on common HapMap sequence variations. Recently, sequencing approaches have allowed analysis of low frequency and rare variants in conjunction with common variants, thereby improving the search for functional variants and thus the understanding of the underlying biology of human traits and diseases. Here, we used a large Icelandic whole genome sequence dataset combined with Danish exome sequence data to gain insight into the genetic architecture of serum levels of vitamin B12 (B12) and folate. Up to 22.9 million sequence variants were analyzed in combined samples of 45,576 and 37,341 individuals with serum B12 and folate measurements, respectively. We found six novel loci associating with serum B12 (CD320, TCN2, ABCD4, MMAA, MMACHC) or folate levels (FOLR3) and confirmed seven loci for these traits (TCN1, FUT6, FUT2, CUBN, CLYBL, MUT, MTHFR). Conditional analyses established that four loci contain additional independent signals. Interestingly, 13 of the 18 identified variants were coding and 11 of the 13 target genes have known functions related to B12 and folate pathways. Contrary to epidemiological studies we did not find consistent association of the variants with cardiovascular diseases, cancers or Alzheimer's disease although some variants demonstrated pleiotropic effects. Although to some degree impeded by low statistical power for some of these conditions, these data suggest that sequence variants that contribute to the population diversity in serum B12 or folate levels do not modify the risk of developing these conditions. Yet, the study demonstrates the value of combining whole genome and exome sequencing approaches to ascertain the genetic and molecular architectures underlying quantitative trait associations. PMID:23754956
Tang, Haiming; Thomas, Paul D
2016-07-15
PANTHER-PSEP is a new software tool for predicting non-synonymous genetic variants that may play a causal role in human disease. Several previous variant pathogenicity prediction methods have been proposed that quantify evolutionary conservation among homologous proteins from different organisms. PANTHER-PSEP employs a related but distinct metric based on 'evolutionary preservation': homologous proteins are used to reconstruct the likely sequences of ancestral proteins at nodes in a phylogenetic tree, and the history of each amino acid can be traced back in time from its current state to estimate how long that state has been preserved in its ancestors. Here, we describe the PSEP tool, and assess its performance on standard benchmarks for distinguishing disease-associated from neutral variation in humans. On these benchmarks, PSEP outperforms not only previous tools that utilize evolutionary conservation, but also several highly used tools that include multiple other sources of information as well. For predicting pathogenic human variants, the trace back of course starts with a human 'reference' protein sequence, but the PSEP tool can also be applied to predicting deleterious or pathogenic variants in reference proteins from any of the ∼100 other species in the PANTHER database. PANTHER-PSEP is freely available on the web at http://pantherdb.org/tools/csnpScoreForm.jsp Users can also download the command-line based tool at ftp://ftp.pantherdb.org/cSNP_analysis/PSEP/ CONTACT: pdthomas@usc.edu Supplementary data are available at Bioinformatics online.
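The 'evolutionary preservation' idea described above can be illustrated with a toy calculation. This is a minimal sketch, not the PANTHER-PSEP implementation. It assumes the ancestral states along one lineage have already been reconstructed, and that each ancestor comes with the length of the branch leading to it. The function name and the input layout are hypothetical.

```python
def preservation_time(current_state, ancestors):
    """Sum branch lengths back in time while the ancestral state matches.

    ancestors: list of (reconstructed_state, branch_length) pairs ordered
    from the most recent ancestor back toward the root (assumed layout).
    The trace stops at the first ancestor whose state differs.
    """
    total = 0.0
    for state, branch_length in ancestors:
        if state != current_state:
            break
        total += branch_length
    return total
```

With `preservation_time("A", [("A", 10.0), ("A", 25.0), ("G", 40.0), ("A", 5.0)])` the trace stops at the ancestor carrying G, so only the first two branches are counted. In this picture a longer preservation time would suggest that a substitution at the position is less likely to be tolerated.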
NOTCH3 variants and risk of ischemic stroke.
Ross, Owen A; Soto-Ortolaza, Alexandra I; Heckman, Michael G; Verbeeck, Christophe; Serie, Daniel J; Rayaprolu, Sruti; Rich, Stephen S; Nalls, Michael A; Singleton, Andrew; Guerreiro, Rita; Kinsella, Emma; Wszolek, Zbigniew K; Brott, Thomas G; Brown, Robert D; Worrall, Bradford B; Meschia, James F
2013-01-01
Mutations within the NOTCH3 gene cause cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL). CADASIL mutations appear to be restricted to the first twenty-four exons, resulting in the gain or loss of a cysteine amino acid. The role of other exonic NOTCH3 variation not involving cysteine residues and mutations in exons 25-33 in ischemic stroke remains unresolved. All 33 exons of NOTCH3 were sequenced in 269 Caucasian probands from the Siblings With Ischemic Stroke Study (SWISS), a 70-center North American affected sibling pair study and 95 healthy Caucasian control subjects. Variants identified by sequencing in the SWISS probands were then tested for association with ischemic stroke using US Caucasian controls collected at the Mayo Clinic (n=654), and further assessed in a Caucasian (n=802) and African American (n=298) patient-control series collected through the Ischemic Stroke Genetics Study (ISGS). Sequencing of the 269 SWISS probands identified one (0.4%) with small vessel type stroke carrying a known CADASIL mutation (p.R558C; Exon 11). Of the 19 common NOTCH3 variants identified, the only variant significantly associated with ischemic stroke after multiple testing adjustment was p.R1560P (rs78501403; Exon 25) in the combined SWISS and ISGS Caucasian series (Odds Ratio [OR] 0.50, P=0.0022) where presence of the minor allele was protective against ischemic stroke. Although only significant prior to adjustment for multiple testing, p.T101T (rs3815188; Exon 3) was associated with an increased risk of small-vessel stroke (OR: 1.56, P=0.008) and p.P380P (rs61749020; Exon 7) was associated with decreased risk of large-vessel stroke (OR: 0.35, P=0.047) in Caucasians. No significant associations were observed in the small African American series. 
Cysteine-affecting NOTCH3 mutations are rare in patients with typical ischemic stroke; however, our observation that common NOTCH3 variants may be associated with risk of ischemic stroke warrants further study.
Factors influencing success of clinical genome sequencing across a broad spectrum of disorders
Lise, Stefano; Broxholme, John; Cazier, Jean-Baptiste; Rimmer, Andy; Kanapin, Alexander; Lunter, Gerton; Fiddy, Simon; Allan, Chris; Aricescu, A. Radu; Attar, Moustafa; Babbs, Christian; Becq, Jennifer; Beeson, David; Bento, Celeste; Bignell, Patricia; Blair, Edward; Buckle, Veronica J; Bull, Katherine; Cais, Ondrej; Cario, Holger; Chapel, Helen; Copley, Richard R; Cornall, Richard; Craft, Jude; Dahan, Karin; Davenport, Emma E; Dendrou, Calliope; Devuyst, Olivier; Fenwick, Aimée L; Flint, Jonathan; Fugger, Lars; Gilbert, Rodney D; Goriely, Anne; Green, Angie; Greger, Ingo H.; Grocock, Russell; Gruszczyk, Anja V; Hastings, Robert; Hatton, Edouard; Higgs, Doug; Hill, Adrian; Holmes, Chris; Howard, Malcolm; Hughes, Linda; Humburg, Peter; Johnson, David; Karpe, Fredrik; Kingsbury, Zoya; Kini, Usha; Knight, Julian C; Krohn, Jonathan; Lamble, Sarah; Langman, Craig; Lonie, Lorne; Luck, Joshua; McCarthy, Davis; McGowan, Simon J; McMullin, Mary Frances; Miller, Kerry A; Murray, Lisa; Németh, Andrea H; Nesbit, M Andrew; Nutt, David; Ormondroyd, Elizabeth; Oturai, Annette Bang; Pagnamenta, Alistair; Patel, Smita Y; Percy, Melanie; Petousi, Nayia; Piazza, Paolo; Piret, Sian E; Polanco-Echeverry, Guadalupe; Popitsch, Niko; Powrie, Fiona; Pugh, Chris; Quek, Lynn; Robbins, Peter A; Robson, Kathryn; Russo, Alexandra; Sahgal, Natasha; van Schouwenburg, Pauline A; Schuh, Anna; Silverman, Earl; Simmons, Alison; Sørensen, Per Soelberg; Sweeney, Elizabeth; Taylor, John; Thakker, Rajesh V; Tomlinson, Ian; Trebes, Amy; Twigg, Stephen RF; Uhlig, Holm H; Vyas, Paresh; Vyse, Tim; Wall, Steven A; Watkins, Hugh; Whyte, Michael P; Witty, Lorna; Wright, Ben; Yau, Chris; Buck, David; Humphray, Sean; Ratcliffe, Peter J; Bell, John I; Wilkie, Andrew OM; Bentley, David; Donnelly, Peter; McVean, Gilean
2015-01-01
To assess factors influencing the success of whole genome sequencing for mainstream clinical diagnosis, we sequenced 217 individuals from 156 independent cases across a broad spectrum of disorders in whom prior screening had identified no pathogenic variants. We quantified the number of candidate variants identified using different strategies for variant calling, filtering, annotation and prioritisation. We found that jointly calling variants across samples, filtering against both local and external databases, deploying multiple annotation tools and using familial transmission above biological plausibility contributed to accuracy. Overall, we identified disease causing variants in 21% of cases, rising to 34% (23/68) for Mendelian disorders and 57% (8/14) in trios. We also discovered 32 potentially clinically actionable variants in 18 genes unrelated to the referral disorder, though only four were ultimately considered reportable. Our results demonstrate the value of genome sequencing for routine clinical diagnosis, but also highlight many outstanding challenges. PMID:25985138
Whole genome sequences of a male and female supercentenarian, ages greater than 114 years.
Sebastiani, Paola; Riva, Alberto; Montano, Monty; Pham, Phillip; Torkamani, Ali; Scherba, Eugene; Benson, Gary; Milton, Jacqueline N; Baldwin, Clinton T; Andersen, Stacy; Schork, Nicholas J; Steinberg, Martin H; Perls, Thomas T
2011-01-01
Supercentenarians (age 110+ years old) generally delay or escape age-related diseases and disability well beyond the age of 100 and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of male and female supercentenarians, both age >114 years old. We show that: (1) the sequence variant spectrum of these two individuals' DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to-date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging.
Whole Genome Sequences of a Male and Female Supercentenarian, Ages Greater than 114 Years
Sebastiani, Paola; Riva, Alberto; Montano, Monty; Pham, Phillip; Torkamani, Ali; Scherba, Eugene; Benson, Gary; Milton, Jacqueline N.; Baldwin, Clinton T.; Andersen, Stacy; Schork, Nicholas J.; Steinberg, Martin H.; Perls, Thomas T.
2012-01-01
Supercentenarians (age 110+ years old) generally delay or escape age-related diseases and disability well beyond the age of 100 and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of male and female supercentenarians, both age >114 years old. We show that: (1) the sequence variant spectrum of these two individuals’ DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to-date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging. PMID:22303384
Exome Sequence Analysis of 14 Families With High Myopia.
Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L
2017-04-01
To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.
Yan, Song; Li, Yun
2014-02-15
Despite its great capability to detect rare variant associations, next-generation sequencing is still prohibitively expensive when applied to large samples. In case-control studies, it is thus appealing to sequence only a subset of cases to discover variants and genotype the identified variants in controls and the remaining cases under the reasonable assumption that causal variants are usually enriched among cases. However, this approach leads to inflated type-I error if analyzed naively for rare variant association. Several methods have been proposed in recent literature to control type-I error at the cost of either excluding some sequenced cases or correcting the genotypes of discovered rare variants. All of these approaches therefore suffer from some loss of information and are underpowered. We propose a novel method (BETASEQ), which corrects inflation of type-I error by supplementing pseudo-variants while keeping the original sequence and genotype data intact. Extensive simulations and real data analysis demonstrate that, in most practical situations, BETASEQ leads to higher testing power than existing approaches with guaranteed (controlled or conservative) type-I error. BETASEQ and associated R files, including documentation and examples, are available at http://www.unc.edu/~yunmli/betaseq
Using whole-exome sequencing to identify variants inherited from mosaic parents
Rios, Jonathan J; Delgado, Mauricio R
2015-01-01
Whole-exome sequencing (WES) has allowed the discovery of genes and variants causing rare human disease. This is often achieved by comparing nonsynonymous variants between unrelated patients, and particularly for sporadic or recessive disease, often identifies a single or few candidate genes for further consideration. However, despite the potential for this approach to elucidate the genetic cause of rare human disease, a majority of patients fail to realize a genetic diagnosis using standard exome analysis methods. Although genetic heterogeneity contributes to the difficulty of exome sequence analysis between patients, it remains plausible that rare human disease is not caused by de novo or recessive variants. Multiple human disorders have been described for which the variant was inherited from a phenotypically normal mosaic parent. Here we highlight the potential for exome sequencing to identify a reasonable number of candidate genes when dominant disease variants are inherited from a mosaic parent. We show the power of WES to identify a limited number of candidate genes using this disease model and how sequence coverage affects identification of mosaic variants by WES. We propose this analysis as an alternative to discover genetic causes of rare human disorders for which typical WES approaches fail to identify likely pathogenic variants. PMID:24986828
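One way to reason about how coverage affects detection of a mosaic variant is a simple binomial read-sampling model. The sketch below assumes each read independently carries the variant allele with probability equal to the mosaic allele fraction, and that a variant is "detected" once a minimum number of variant reads is seen. Both the model and the threshold parameter are illustrative assumptions, not the analysis used in the study.

```python
from math import comb


def detection_probability(depth: int, allele_fraction: float, min_alt_reads: int) -> float:
    """Probability of observing at least min_alt_reads variant reads.

    Assumes the variant read count follows Binomial(depth, allele_fraction).
    """
    p_fewer = sum(
        comb(depth, k) * allele_fraction**k * (1.0 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer
```

Under this model, raising the depth at a fixed allele fraction raises the detection probability. For instance, `detection_probability(100, 0.1, 3)` is larger than `detection_probability(30, 0.1, 3)`.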
Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.
2014-01-01
Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
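The position-specific comparison against the canonical TTAGGG repeat can be sketched as follows. This is an illustrative toy, assuming reads have already been extracted and are in frame with the repeat unit; it is not the pipeline used in the study, and the function names are hypothetical.

```python
from collections import Counter

CANONICAL = "TTAGGG"


def variant_positions(hexamer: str) -> tuple:
    """Return the 1-based positions at which a hexamer differs from TTAGGG."""
    return tuple(
        i + 1 for i, (base, ref) in enumerate(zip(hexamer, CANONICAL)) if base != ref
    )


def tally_repeats(read: str) -> Counter:
    """Split an in-frame read into hexamers and count each mismatch pattern.

    The empty tuple () is the key for canonical repeats. Trailing bases that
    do not fill a complete hexamer are ignored.
    """
    usable = len(read) - len(read) % 6
    counts = Counter()
    for start in range(0, usable, 6):
        counts[variant_positions(read[start:start + 6])] += 1
    return counts
```

For the read `"TTAGGGTTAGGGTTGGGGGTAGGG"` the tally contains two canonical repeats, one repeat differing at position 3, and one differing at position 1.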
González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-05-21
To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene (HBX) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
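Per-position information content in a nucleotide alignment is commonly computed as the maximum entropy (2 bits for four bases) minus the observed Shannon entropy of the column. The sketch below uses that common definition; the abstract does not state the exact formula or any haplotype weighting, so this should be read as an assumption. It also assumes gap-free sequences of equal length.

```python
from collections import Counter
from math import log2


def information_content(column) -> float:
    """Information content in bits: log2(4) minus the column's Shannon entropy."""
    counts = Counter(column)
    n = len(column)
    entropy = -sum((c / n) * log2(c / n) for c in counts.values())
    return log2(4) - entropy


def alignment_information(alignment) -> list:
    """Information content for every column of equal-length aligned sequences."""
    return [information_content(col) for col in zip(*alignment)]
```

A fully conserved column scores 2 bits and a column with all four bases at equal frequency scores 0 bits, so runs of positions close to 2 bits would mark candidate hyper-conserved regions.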
Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun
2017-01-03
Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
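The role of molecular barcodes in suppressing amplification and sequencing errors can be shown with a deliberately simple majority-vote consensus. This is not smCounter's Bayesian model; it is a swapped-in toy technique for illustration only. The function names and the `min_reads` and `min_agreement` thresholds are hypothetical.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(observations, min_reads=2, min_agreement=0.8):
    """Collapse reads sharing a barcode into one consensus base per molecule.

    observations: iterable of (barcode, base) pairs at a single position.
    Barcodes with too few reads, or without a sufficiently dominant base,
    are dropped rather than guessed.
    """
    by_barcode = defaultdict(list)
    for barcode, base in observations:
        by_barcode[barcode].append(base)

    consensus = {}
    for barcode, bases in by_barcode.items():
        if len(bases) < min_reads:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def allele_fraction(consensus, alt_base):
    """Fraction of consensus molecules that carry alt_base."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt_base) / len(consensus)
```

Counting molecules rather than raw reads means that an error introduced during amplification of one molecule contributes at most one discordant barcode, which is the intuition behind barcode-aware calling.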
Lim, Eileen C P; Brett, Maggie; Lai, Angeline H M; Lee, Siew-Peng; Tan, Ee-Shien; Jamuar, Saumya S; Ng, Ivy S L; Tan, Ene-Choo
2015-12-14
Next-generation sequencing (NGS) has revolutionized genetic research and offers enormous potential for clinical application. Sequencing the exome has the advantage of casting the net wide for all known coding regions while targeted gene panel sequencing provides enhanced sequencing depths and can be designed to avoid incidental findings in adult-onset conditions. A HaloPlex panel consisting of 180 genes within commonly altered chromosomal regions is available for use on both the Ion Personal Genome Machine (PGM) and MiSeq platforms to screen for causative mutations in these genes. We used this Haloplex ICCG panel for targeted sequencing of 15 patients with clinical presentations indicative of an abnormality in one of the 180 genes. Sequencing runs were done using the Ion 318 Chips on the Ion Torrent PGM. Variants were filtered for known polymorphisms and analysis was done to identify possible disease-causing variants before validation by Sanger sequencing. When possible, segregation of variants with phenotype in family members was performed to ascertain the pathogenicity of the variant. More than 97% of the target bases were covered at >20×. There was an average of 9.6 novel variants per patient. Pathogenic mutations were identified in five genes for six patients, with two novel variants. There were another five likely pathogenic variants, some of which were unreported novel variants. In a cohort of 15 patients, we were able to identify a likely genetic etiology in six patients (40%). Another five patients had candidate variants for which further evaluation and segregation analysis are ongoing. Our results indicate that the HaloPlex ICCG panel is useful as a rapid, high-throughput and cost-effective screening tool for 170 of the 180 genes. There is low coverage for some regions in several genes which might have to be supplemented by Sanger sequencing. 
However, comparing the cost, ease of analysis, and shorter turnaround time, it is a good alternative to exome sequencing for patients whose features are suggestive of a genetic etiology involving one of the genes in the panel.
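The coverage statistic quoted in this abstract (share of target bases covered at >20×) can be illustrated with a minimal sketch. The function name, the threshold default, and the per-base depths below are assumptions made for illustration; they are not data or code from the study.

```python
def fraction_covered(depths, threshold=20):
    """Return the fraction of target bases whose read depth exceeds `threshold`."""
    if not depths:
        raise ValueError("no target bases supplied")
    covered = sum(1 for depth in depths if depth > threshold)
    return covered / len(depths)


# Hypothetical per-base depths for a tiny target region (illustrative only).
example_depths = [35, 48, 12, 60, 22, 21, 20, 90, 150, 5]

print(f"{fraction_covered(example_depths):.0%} of target bases covered at >20x")
```

In practice the depths would come from a per-base depth report restricted to the panel's target intervals; a strict ">" comparison is used here to match the ">20×" wording.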
López-Revilla, Rubén; Pineda, Marco A; Ortiz-Valdez, Julio; Sánchez-Garza, Mireya; Riego, Lina
2009-01-01
Background: In San Luis Potosí City, cervical infection by human papillomavirus type 16 (HPV16) associated with dysplastic lesions is more prevalent in younger women. In this work, HPV16 subtypes and variants associated with low-grade intraepithelial lesions (LSIL), high-grade intraepithelial lesions (HSIL) and invasive cervical cancer (ICC) of 38 women residing in San Luis Potosí City were identified by comparing their E6 open reading frame sequences. Results: Three European (E) variants (E-P, n = 27; E-T350G, n = 7; E-C188G, n = 2) and one AA-a variant (n = 2) were identified among the 38 HPV16 sequences analyzed. E-P variant sequences contained 23 single nucleotide changes, two of which (A334G, A404T) had not been described before and allowed the phylogenetic separation from the other variants. E-P A334G sequences were the most prevalent (22 cases, 57.9%), followed by the E-P Ref prototype (8 cases, 21.1%) and E-P A404T (1 case, 2.6%) sequences. The HSIL + ICC fraction was 0.21 for the E-P A334G variants and 0.00 for the E-P Ref variants. Conclusion: We conclude that, in the women included in this study, the HPV16 E subtype is 19 times more frequent than the AA subtype; that the circulating E variants are E-P (71.1%) > E-T350G (18.4%) > E-C188G (5.3%); and that 71.0% of the E-P sequences carry the A334G single nucleotide change and appear to correspond to an HPV16 variant characteristic of San Luis Potosí City that is more oncogenic than the E-P Ref prototype. PMID:19216802
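The variant percentages in this abstract follow from the reported counts among the 38 sequences. A short sketch of that arithmetic is shown here; the variable names are illustrative, and only the counts stated in the abstract are used.

```python
# Variant counts as reported in the abstract (38 HPV16 sequences in total).
variant_counts = {"E-P": 27, "E-T350G": 7, "E-C188G": 2, "AA-a": 2}

total_sequences = sum(variant_counts.values())

# Percentage of all sequences carrying each variant, rounded to one decimal.
variant_percent = {
    name: round(100 * count / total_sequences, 1)
    for name, count in variant_counts.items()
}

for name, pct in variant_percent.items():
    print(f"{name}: {pct}%")
```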
A method for multi-codon scanning mutagenesis of proteins based on asymmetric transposons.
Liu, Jia; Cropp, T Ashton
2012-02-01
Random mutagenesis followed by selection or screening is a commonly used strategy to improve protein function. Despite many available methods for random mutagenesis, nearly all generate mutations at the nucleotide level. An ideal mutagenesis method would allow for the generation of 'codon mutations' to change protein sequence with defined or mixed amino acids of choice. Herein we report a method that allows for mutations of one, two or three consecutive codons. Key to this method is the development of a Mu transposon variant with asymmetric terminal sequences. As a demonstration of the method, we performed multi-codon scanning on the gene encoding superfolder GFP (sfGFP). Characterization of 50 randomly chosen clones from each library showed that more than 40% of the mutants in these three libraries contained seamless, in-frame mutations with low site preference. By screening only 500 colonies from each library, we successfully identified several spectra-shift mutations, including a S205D variant that was found to bear a single excitation peak in the UV region.
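To make the idea of scanning one, two or three consecutive codons concrete, the following is an in-silico sketch only. It enumerates in-frame replacements computationally and does not model the transposon chemistry described in the abstract; the function name, the default replacement codon, and the toy coding sequence are assumptions.

```python
def codon_scan(cds, k, replacement="GCT"):
    """Yield (codon_index, mutant) for every window of k consecutive codons
    in `cds` replaced in-frame by k copies of `replacement`."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(replacement) != 3:
        raise ValueError("replacement must be a single codon")
    n_codons = len(cds) // 3
    for i in range(n_codons - k + 1):
        start, end = 3 * i, 3 * (i + k)
        yield i, cds[:start] + replacement * k + cds[end:]


# Toy 4-codon sequence (hypothetical, not sfGFP).
toy_cds = "ATGAAACCCGGG"
for width in (1, 2, 3):
    mutants = list(codon_scan(toy_cds, width))
    print(width, len(mutants), mutants[0][1])
```

Because the replacement has the same length as the window it replaces, every mutant stays in frame and keeps the original sequence length.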
Stowe, Robert C; Sun, Qin; Elsea, Sarah H; Scaglia, Fernando
2018-05-01
Lipoic acid is an essential cofactor for the mitochondrial 2-ketoacid dehydrogenase complexes and the glycine cleavage system. Lipoyltransferase 1 catalyzes the covalent attachment of lipoate to these enzyme systems. Pathogenic variants in LIPT1 gene have recently been described in four patients from three families, commonly presenting with severe lactic acidosis resulting in neonatal death and/or poor neurocognitive outcomes. We report a 2-month-old male with severe lactic acidosis, refractory status epilepticus, and brain imaging suggestive of Leigh disease. Exome sequencing implicated compound heterozygous LIPT1 pathogenic variants. We describe the fifth case of LIPT1 deficiency, whose phenotype progressed to that of an early infantile epileptic encephalopathy, which is novel compared to previously described patients whom we will review. Due to the significant biochemical and phenotypic overlap that LIPT1 deficiency and mitochondrial energy cofactor disorders have with pyruvate dehydrogenase deficiency and/or nonketotic hyperglycinemia, they are and have been presumptively under-diagnosed without exome sequencing. © 2018 Wiley Periodicals, Inc.
Kono, H; Saven, J G
2001-02-23
Combinatorial experiments provide new ways to probe the determinants of protein folding and to identify novel folding amino acid sequences. These types of experiments, however, are complicated both by enormous conformational complexity and by large numbers of possible sequences. Therefore, a quantitative computational theory would be helpful in designing and interpreting these types of experiments. Here, we present and apply a statistically based, computational approach for identifying the properties of sequences compatible with a given main-chain structure. Protein side-chain conformations are included in an atom-based fashion. Calculations are performed for a variety of similar backbone structures to identify sequence properties that are robust with respect to minor changes in main-chain structure. Rather than specific sequences, the method yields the likelihood of each of the amino acids at preselected positions in a given protein structure. The theory may be used to quantify the characteristics of sequence space for a chosen structure without explicitly tabulating sequences. To account for hydrophobic effects, we introduce an environmental energy that is consistent with other simple hydrophobicity scales and show that it is effective for side-chain modeling. We apply the method to calculate the identity probabilities of selected positions of the immunoglobulin light chain-binding domain of protein L, for which many variant folding sequences are available. The calculations compare favorably with the experimentally observed identity probabilities.
USDA-ARS's Scientific Manuscript database
Scope: Tissue concentrations of omega-3 fatty acids may reduce cardiovascular disease risk, and genetic variants are associated with circulating fatty acids concentrations. Whether dietary fatty acids interact with genetic variants to modify circulating omega-3 fatty acids is unclear. We evaluated i...
Investigation of the role of TCF4 rare sequence variants in schizophrenia.
Basmanav, F Buket; Forstner, Andreas J; Fier, Heide; Herms, Stefan; Meier, Sandra; Degenhardt, Franziska; Hoffmann, Per; Barth, Sandra; Fricker, Nadine; Strohmaier, Jana; Witt, Stephanie H; Ludwig, Michael; Schmael, Christine; Moebus, Susanne; Maier, Wolfgang; Mössner, Rainald; Rujescu, Dan; Rietschel, Marcella; Lange, Christoph; Nöthen, Markus M; Cichon, Sven
2015-07-01
Transcription factor 4 (TCF4) is one of the most robust of all reported schizophrenia risk loci and is supported by several genetic and functional lines of evidence. While numerous studies have implicated common genetic variation at TCF4 in schizophrenia risk, the role of rare, small-sized variants at this locus, such as single nucleotide variants and short indels that are below the resolution of chip-based arrays, requires further exploration. The aim of the present study was to investigate the association between rare TCF4 sequence variants and schizophrenia. Exon-targeted resequencing was performed in 190 German schizophrenia patients. Six rare variants at the coding exons and flanking sequences of the TCF4 gene were identified, including two missense variants and one splice site variant. These six variants were then pooled with nine additional rare variants identified in 379 European participants of the 1000 Genomes Project, and all 15 variants were genotyped in an independent German sample (n = 1,808 patients; n = 2,261 controls). These data were then analyzed using six statistical methods developed for the association analysis of rare variants. No significant association (P < 0.05) was found. However, the results from our association and power analyses suggest that further research into the possible involvement of rare TCF4 sequence variants in schizophrenia risk is warranted, using larger cohorts with higher statistical power to identify rare variant associations. © 2015 Wiley Periodicals, Inc.
Wang, Jingwen; Skoog, Tiina; Einarsdottir, Elisabet; Kaartokallio, Tea; Laivuori, Hannele; Grauers, Anna; Gerdhem, Paul; Hytönen, Marjo; Lohi, Hannes; Kere, Juha; Jiao, Hong
2016-01-01
High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies. PMID:27633116
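The comparison described in this abstract, between minor allele frequencies (MAFs) estimated from pooled reads and MAFs from individual genotyping, can be sketched as follows. This is a minimal illustration under simple assumptions (MAF taken as the alternate-read fraction, folded to the minor allele, with a plain Pearson correlation); the function names and numbers are hypothetical and do not reproduce the study's pipeline.

```python
import math


def pooled_maf(alt_reads, total_reads):
    """Estimate the minor allele frequency at a site from pooled read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    freq = alt_reads / total_reads
    return min(freq, 1 - freq)  # fold so the result is the minor allele


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical sites: (alt reads, total reads) from a pool, and array-based MAFs.
pooled_counts = [(5, 100), (12, 240), (3, 150), (40, 400)]
array_mafs = [0.04, 0.05, 0.03, 0.11]

pooled_mafs = [pooled_maf(alt, total) for alt, total in pooled_counts]
print(f"r = {pearson_r(pooled_mafs, array_mafs):.2f}")
```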
Discovery of a Mammalian Splice Variant of Myostatin That Stimulates Myogenesis
Jeanplong, Ferenc; Falconer, Shelley J.; Oldham, Jenny M.; Thomas, Mark; Gray, Tarra S.; Hennebry, Alex; Matthews, Kenneth G.; Kemp, Frederick C.; Patel, Ketan; Berry, Carole; Nicholas, Gina; McMahon, Christopher D.
2013-01-01
Myostatin plays a fundamental role in regulating the size of skeletal muscles. To date, only a single myostatin gene and no splice variants have been identified in mammals. Here we describe the splicing of a cryptic intron that removes the coding sequence for the receptor binding moiety of sheep myostatin. The deduced polypeptide sequence of the myostatin splice variant (MSV) contains a 256 amino acid N-terminal domain, which is common to myostatin, and a unique C-terminus of 65 amino acids. Western immunoblotting demonstrated that MSV mRNA is translated into protein, which is present in skeletal muscles. To determine the biological role of MSV, we developed an MSV over-expressing C2C12 myoblast line and showed that it proliferated faster than that of the control line in association with an increased abundance of the CDK2/Cyclin E complex in the nucleus. Recombinant protein made for the novel C-terminus of MSV also stimulated myoblast proliferation and bound to myostatin with high affinity as determined by surface plasmon resonance assay. Therefore, we postulated that MSV functions as a binding protein and antagonist of myostatin. Consistent with our postulate, myostatin protein was co-immunoprecipitated from skeletal muscle extracts with an MSV-specific antibody. MSV over-expression in C2C12 myoblasts blocked myostatin-induced Smad2/3-dependent signaling, thereby confirming that MSV antagonizes the canonical myostatin pathway. Furthermore, MSV over-expression increased the abundance of MyoD, Myogenin and MRF4 proteins (P<0.05), which indicates that MSV stimulates myogenesis through the induction of myogenic regulatory factors. To help elucidate a possible role in vivo, we observed that MSV protein was more abundant during early post-natal muscle development, while myostatin remained unchanged, which suggests that MSV may promote the growth of skeletal muscles. 
We conclude that MSV represents a unique example of intra-genic regulation in which a splice variant directly antagonizes the biological activity of the canonical gene product. PMID:24312578
Kim, Yoonhee; Suktitipat, Bhoom; Yanek, Lisa R.; Faraday, Nauder; Wilson, Alexander F.; Becker, Diane M.; Becker, Lewis C.; Mathias, Rasika A.
2013-01-01
Platelet aggregation is heritable, and genome-wide association studies have detected strong associations with a common intronic variant of the platelet endothelial aggregation receptor 1 (PEAR1) gene both in African American and European American individuals. In this study, we used a sequencing approach to identify additional exonic variants in PEAR1 that may also determine variability in platelet aggregation in the GeneSTAR Study. A 0.3 Mb targeted region on chromosome 1q23.1 including the entire PEAR1 gene was Sanger sequenced in 104 subjects (45% male, 49% African American, age = 52±13) selected on the basis of hyper- and hypo-aggregation across three different agonists (collagen, epinephrine, and adenosine diphosphate). Single-variant and multi-variant burden tests for association were performed. Of the 235 variants identified through sequencing, 61 were novel, and three of these were missense variants. More rare variants (MAF<5%) were noted in African Americans compared to European Americans (108 vs. 45). The common intronic GWAS-identified variant (rs12041331) demonstrated the most significant association signal in African Americans (p = 4.020×10⁻⁴); no association was seen for additional exonic variants in this group. In contrast, multi-variant burden tests indicated that exonic variants play a more significant role in European Americans (p = 0.0099 for the collective coding variants compared to p = 0.0565 for intronic variant rs12041331). Imputation of the individual exonic variants in the rest of the GeneSTAR European American cohort (N = 1,965) supports the results noted in the sequenced discovery sample: p = 3.56×10⁻⁴, 2.27×10⁻⁷, 5.20×10⁻⁵ for coding synonymous variant rs56260937 and collagen, epinephrine and adenosine diphosphate induced platelet aggregation, respectively.
Sequencing approaches confirm that a common intronic variant has the strongest association with platelet aggregation in African Americans, and show that exonic variants play an additional role in platelet aggregation in European Americans. PMID:23704978
Uptake, Results, and Outcomes of Germline Multiple-Gene Sequencing After Diagnosis of Breast Cancer.
Kurian, Allison W; Ward, Kevin C; Hamilton, Ann S; Deapen, Dennis M; Abrahamse, Paul; Bondarenko, Irina; Li, Yun; Hawley, Sarah T; Morrow, Monica; Jagsi, Reshma; Katz, Steven J
2018-05-10
Low-cost sequencing of multiple genes is increasingly available for cancer risk assessment. Little is known about uptake or outcomes of multiple-gene sequencing after breast cancer diagnosis in community practice. The objective was to examine the effect of multiple-gene sequencing on the experience and treatment outcomes for patients with breast cancer. For this population-based retrospective cohort study, patients with breast cancer diagnosed from January 2013 to December 2015 and accrued from SEER registries across Georgia and in Los Angeles, California, were surveyed (n = 5080, response rate = 70%). Responses were merged with SEER data and results of clinical genetic tests, either BRCA1 and BRCA2 (BRCA1/2) sequencing only or including additional other genes (multiple-gene sequencing), provided by 4 laboratories. The main outcomes were type of testing (multiple-gene sequencing vs BRCA1/2-only sequencing), test results (negative, variant of unknown significance, or pathogenic variant), patient experiences with testing (timing of testing, who discussed results), treatment (strength of patient consideration of, and surgeon recommendation for, prophylactic mastectomy), and prophylactic mastectomy receipt. We defined a patient subgroup with higher pretest risk of carrying a pathogenic variant according to practice guidelines. Among 5026 patients (mean [SD] age, 59.9 [10.7]), 1316 (26.2%) were linked to genetic results from any laboratory. Multiple-gene sequencing increasingly replaced BRCA1/2-only testing over time: in 2013, the rate of multiple-gene sequencing was 25.6% and BRCA1/2-only testing, 74.4%; in 2015, the rate of multiple-gene sequencing was 66.5% and BRCA1/2-only testing, 33.5%. Multiple-gene sequencing was more often ordered by genetic counselors (multiple-gene sequencing, 25.5% and BRCA1/2-only testing, 15.3%) and delayed until after surgery (multiple-gene sequencing, 32.5% and BRCA1/2-only testing, 19.9%).
Multiple-gene sequencing substantially increased rate of detection of any pathogenic variant (multiple-gene sequencing: higher-risk patients, 12%; average-risk patients, 4.2% and BRCA1/2-only testing: higher-risk patients, 7.8%; average-risk patients, 2.2%) and variants of uncertain significance, especially in minorities (multiple-gene sequencing: white patients, 23.7%; black patients, 44.5%; and Asian patients, 50.9% and BRCA1/2-only testing: white patients, 2.2%; black patients, 5.6%; and Asian patients, 0%). Multiple-gene sequencing was not associated with an increase in the rate of prophylactic mastectomy use, which was highest with pathogenic variants in BRCA1/2 (BRCA1/2, 79.0%; other pathogenic variant, 37.6%; variant of uncertain significance, 30.2%; negative, 35.3%). Multiple-gene sequencing rapidly replaced BRCA1/2-only testing for patients with breast cancer in the community and enabled 2-fold higher detection of clinically relevant pathogenic variants without an associated increase in prophylactic mastectomy. However, important targets for improvement in the clinical utility of multiple-gene sequencing include postsurgical delay and racial/ethnic disparity in variants of uncertain significance.
Han, Jun Hyun; Lee, Yong Seong; Kim, Hae Jong; Lee, Shin Young; Myung, Soon Chul
2015-01-01
In this study, we evaluated genetic variants of the androgen metabolism genes CYP17A1, CYP3A4, and CYP3A43 to determine whether they play a role in the development of prostate cancer (PCa) in Korean men. The study population included 240 pathologically diagnosed cases of PCa and 223 age-matched controls. Among the 789 single-nucleotide polymorphism (SNP) database variants detected, 129 were reported in two Asian groups (Han Chinese and Japanese) in the HapMap database. Only 21 polymorphisms of CYP17A1, CYP3A4, and CYP3A43 were selected based on linkage disequilibrium in Asians (r2 = 1), locations (SNPs in exons were preferred), and amino acid changes and were assessed. In addition, we performed haplotype analysis for the 21 SNPs in CYP17A1, CYP3A4, and CYP3A43 genes. To determine the association between genotype and haplotype distributions of patients and controls, logistic analyses were carried out, controlling for age. Twelve sequence variants and five major haplotypes were identified in CYP17A1. Five sequence variants and two major haplotypes were identified in CYP3A4. Four sequence variants and four major haplotypes were observed in CYP3A43. CYP17A1 haplotype-2 (Ht-2) (odds ratio [OR], 1.51; 95% confidence interval [CI], 1.04–2.18) was associated with PCa susceptibility. CYP3A4 Ht-2 (OR: 1.87; 95% CI: 1.02–3.43) was associated with PCa metastatic potential according to tumor stage. rs17115149 (OR: 1.96; 95% CI: 1.04–3.68) and CYP17A1 Ht-4 (OR: 2.01; 95% CI: 1.07–4.11) showed a significant association with histologic aggressiveness according to Gleason score. Genetic variants of CYP17A1 and CYP3A4 may play a role in the development of PCa in Korean men. PMID:25337833
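The odds ratios and 95% confidence intervals reported in this abstract come from age-adjusted logistic analyses. As a simpler illustration of how such a figure is formed, here is an unadjusted sketch for a 2×2 table using the standard log-odds (Woolf) interval. The function name and the counts are hypothetical, and the result is not comparable to the study's adjusted estimates.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    All four cell counts must be positive; z=1.96 gives a ~95% interval.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts among cases and controls (illustrative only).
or_value, ci_low, ci_high = odds_ratio_ci(60, 180, 40, 183)
print(f"OR {or_value:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An interval that excludes 1 corresponds to the kind of association the abstract describes as significant.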
Systematic comparison of variant calling pipelines using gold standard personal exome variants
Hwang, Sohyun; Kim, Eiru; Lee, Insuk; Marcotte, Edward M.
2015-01-01
The success of clinical genomics using next generation sequencing (NGS) requires the accurate and consistent identification of personal genome variants. Assorted variant calling methods have been developed, but they show low concordance between their calls. Hence, a systematic comparison of the variant callers could give important guidance to NGS-based clinical genomics. Recently, a set of high-confidence variant calls for one individual (NA12878) has been published by the Genome in a Bottle (GIAB) consortium, enabling performance benchmarking of different variant calling pipelines. Based on the gold standard reference variant calls from GIAB, we compared the performance of thirteen variant calling pipelines, testing combinations of three read aligners (BWA-MEM, Bowtie2, and Novoalign) and four variant callers: Genome Analysis Tool Kit HaplotypeCaller (GATK-HC), Samtools mpileup, Freebayes, and Ion Proton Variant Caller (TVC). The pipelines were run on twelve data sets for the NA12878 genome sequenced on different platforms, including Illumina2000, Illumina2500, and Ion Proton, with various exome capture systems and exome coverage. We observed different biases toward specific types of SNP genotyping errors by the different variant callers. The results of our study provide useful guidelines for reliable variant identification from deep sequencing of personal genomes. PMID:26639839
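Benchmarking a pipeline against a gold standard call set, as this abstract describes, reduces to comparing two sets of variants. The sketch below assumes each variant is keyed by (chromosome, position, ref, alt) and that both call sets are already normalized to the same representation; the function name and the toy variants are illustrative, not taken from the study or from GIAB.

```python
def benchmark_calls(calls, truth):
    """Compare a call set with a truth set; return precision, recall and F1."""
    calls, truth = set(calls), set(truth)
    true_pos = len(calls & truth)    # called and present in the truth set
    false_pos = len(calls - truth)   # called but absent from the truth set
    false_neg = len(truth - calls)   # in the truth set but not called
    precision = true_pos / (true_pos + false_pos) if calls else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy variants keyed as (chrom, pos, ref, alt); hypothetical values.
truth_set = [("1", 100, "A", "G"), ("1", 250, "C", "T"),
             ("2", 75, "G", "A"), ("2", 900, "T", "C")]
pipeline_calls = [("1", 100, "A", "G"), ("1", 250, "C", "T"),
                  ("2", 75, "G", "A"), ("3", 10, "A", "T")]

print(benchmark_calls(pipeline_calls, truth_set))
```

Real comparisons also need to restrict both sets to the high-confidence regions and handle differing representations of the same indel, which this sketch leaves out.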
Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments.
Daily, Jeff
2016-02-10
Sequence alignment algorithms are a key component of many bioinformatics applications. Though various fast Smith-Waterman local sequence alignment implementations have been developed for x86 CPUs, most are embedded into larger database search tools. In addition, fast implementations of Needleman-Wunsch global sequence alignment and its semi-global variants are not as widespread. This article presents the first software library for local, global, and semi-global pairwise intra-sequence alignments and improves the performance of previous intra-sequence implementations. A faster intra-sequence local pairwise alignment implementation is described and benchmarked, including new global and semi-global variants. Using a 375-residue query sequence, a speed of 136 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon E5-2670 24-core processor system, the highest reported for an implementation based on Farrar's 'striped' approach. Rognes's SWIPE optimal database search application is still generally the fastest available, at 1.2 to at best 2.4 times faster than Parasail for sequences shorter than 500 amino acids. However, Parasail was faster for longer sequences. For global alignments, Parasail's prefix scan implementation is generally the fastest, faster even than Farrar's 'striped' approach; however, the opal library is faster for single-threaded applications. The software library is designed for 64-bit Linux, OS X, or Windows on processors with SSE2, SSE4.1, or AVX2. Source code is available from https://github.com/jeffdaily/parasail under the Battelle BSD-style license. Applications that require optimal alignment scores could benefit from the improved performance. For the first time, SIMD global, semi-global, and local alignments are available in a stand-alone C library.
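For readers unfamiliar with the quantities in this abstract, the following scalar sketch shows what a Smith-Waterman local alignment score is and how a GCUPS figure is derived from the number of dynamic-programming cells. It is plain Python with a linear gap penalty, chosen for brevity; it is not Parasail's API, it is not vectorized, and the scoring parameters and function names are assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len, database_len, seconds):
    """Billions of DP cell updates per second for one query against a database."""
    return query_len * database_len / seconds / 1e9


print(smith_waterman_score("ACGT", "TTACGTTT"))
print(gcups(375, 1_000_000_000, 2.0))
```

Each (i, j) pair is one cell update, so aligning a 375-residue query against a database of N residues costs 375 × N updates; dividing by the elapsed time gives the throughput measure the abstract reports.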
Rasoolizadeh, Asieh; Goulet, Marie-Claire; Sainsbury, Frank; Cloutier, Conrad; Michaud, Dominique
2016-04-01
A causal link has been reported between positively selected amino acids in plant cystatins and the inhibitory range of these proteins against insect digestive cysteine (Cys) proteases. Here we assessed the impact of single substitutions to closely related amino acids on the contribution of positive selection to cystatin diversification. Cystatin sequence alignments, while confirming hypervariability, indicated a preference for related amino acids at positively selected sites. For example, the non-polar residues leucine (Leu), isoleucine (Ile) and valine (Val) were shown to predominate at positively selected site 2 in the N-terminal region, unlike selected sites 6 and 10, where polar residues are preferred. The model cystatin SlCYS8 and single variants with Leu, Ile or Val at position 2 were compared with regard to their ability to bind digestive proteases of the coleopteran pest Leptinotarsa decemlineata and to induce compensatory responses in this insect. A functional proteomics procedure to capture target Cys proteases in midgut extracts allowed confirmation of distinct binding profiles for the cystatin variants. A shotgun proteomics procedure to monitor whole Cys protease complements revealed protease family specific compensatory responses in the insect, dependent on the variant ingested. Our data confirm the contribution of closely related amino acids to the functional diversity of positively selected plant cystatins in a broader structure/function context imposing physicochemical constraints to primary structure alterations. They also underline the complexity of protease/inhibitor interactions in plant-insect systems, and the challenges still to be met in order to harness the full potential of ectopically expressed protease inhibitors in crop protection. © 2016 Federation of European Biochemical Societies.
Novel Naja atra cardiotoxin 1 (CTX-1) derived antimicrobial peptides with broad spectrum activity
Santospirito, Davide; Polverini, Eugenia; Flisi, Sara; Cavirani, Sandro; Taddei, Simone
2018-01-01
Naja atra subsp. atra cardiotoxin 1 (CTX-1), produced by Chinese cobra snakes, belonging to Elapidae family, is included in the three-finger toxin family and exerts high cytotoxicity and antimicrobial activity too. Using as template mainly the tip and the subsequent β-strand of the first “finger” of this toxin, different sequences of 20 amino acids linear peptides have been designed in order to avoid toxic effects but to maintain or even strengthen the partial antimicrobial activity already seen for the complete toxin. As a result, the sequence NCP-0 (Naja Cardiotoxin Peptide-0) was designed as ancestor and subsequently 4 other variant sequences of NCP-0 were developed. These synthesized variant sequences have shown microbicidal activity towards a panel of reference and field strains of Gram-positive and Gram-negative bacteria. The sequence named NCP-3, and its variants NCP-3a and NCP-3b, have shown the best antimicrobial activity, together with low cytotoxicity against eukaryotic cells and low hemolytic activity. Bactericidal activity has been demonstrated by minimum bactericidal concentration (MBC) assay at values below 10 μg/ml for most of the tested bacterial strains. This potent antimicrobial activity was confirmed even for unicellular fungi Candida albicans, Candida glabrata and Malassezia pachydermatis (MBC 50–6.3 μg/ml), and against the fast-growing mycobacteria Mycobacterium smegmatis and Mycobacterium fortuitum. Moreover, NCP-3 has shown virucidal activity on Bovine Herpesvirus 1 (BoHV1) belonging to Herpesviridae family. The bactericidal activity is maintained even in a high salt concentration medium (125 and 250 mM NaCl) and phosphate buffer with 20% Mueller Hinton (MH) medium against E. coli, methicillin resistant Staphylococcus aureus (MRSA) and Pseudomonas aeruginosa reference strains. 
Considering these in vitro obtained data, the search for active sequences within proteins presenting an intrinsic microbicidal activity could provide a new way for discovering a large number of novel and promising antimicrobial peptides families. PMID:29364903
Ceballos, Ana; Andreani, Guadalupe; Ripamonti, Chiara; Dilernia, Dario; Mendez, Ramiro; Rabinovich, Roberto D; Cárdenas, Patricia Coll; Zala, Carlos; Cahn, Pedro; Scarlatti, Gabriella; Martínez Peralta, Liliana
2008-11-01
Mother-to-child transmission (MTCT) of human immunodeficiency virus type 1 (HIV-1) as described for women with an established infection is, in most cases, associated with the transmission of few maternal variants. This study analysed virus variability in four cases of maternal primary infection occurring during pregnancy and/or breastfeeding. Estimated time of seroconversion was at 4 months of pregnancy for one woman (early seroconversion) and during the last months of pregnancy and/or breastfeeding for the remaining three (late seroconversion). The C2V3 envelope region was analysed in samples of mother-child pairs by molecular cloning and sequencing. Comparisons of nucleotide and amino acid sequences as well as phylogenetic analysis were performed. The results showed low variability in the virus population of both mother and child. Maximum-likelihood analysis showed that, in the early pregnancy seroconversion case, a minor viral variant with further evolution in the child was transmitted, which could indicate a selection event in MTCT or a stochastic event, whereas in the late seroconversion cases, the mother's and child's sequences were intermingled, which is compatible with the transmission of multiple viral variants from the mother's major population. These results could be explained by the less pronounced selective pressure exerted by the immune system in the early stages of the mother's infection, which could play a role in MTCT of HIV-1.
Greipel, Leonie; Fischer, Sebastian; Klockgether, Jens; Dorda, Marie; Mielke, Samira; Wiehlmann, Lutz; Cramer, Nina; Tümmler, Burkhard
2016-11-01
The chronic airway infections with Pseudomonas aeruginosa in people with cystic fibrosis (CF) are treated with aerosolized antibiotics, oral fluoroquinolones, and/or intravenous combination therapy with aminoglycosides and β-lactam antibiotics. An international strain collection of 361 P. aeruginosa isolates from 258 CF patients seen at 30 CF clinics was examined for mutations in 17 antimicrobial susceptibility and resistance loci that had been identified as hot spots of mutation by genome sequencing of serial isolates from a single CF clinic. Combinatorial amplicon sequencing of pooled PCR products identified 1,112 sequence variants that were not present in the genomes of representative strains of the 20 most common clones of the global P. aeruginosa population. A high frequency of singular coding variants was seen in spuE, mexA, gyrA, rpoB, fusA1, mexZ, mexY, oprD, ampD, parR, parS, and envZ (amgS), reflecting the pressure upon P. aeruginosa in lungs of CF patients to generate novel protein variants. The proportion of nonneutral amino acid exchanges was high. Of the 17 loci, mexA, mexZ, and pagL were most frequently affected by independent stop mutations. Private and de novo mutations seem to play a pivotal role in the response of P. aeruginosa populations to the antimicrobial load and the individual CF host. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Preconception Carrier Screening by Genome Sequencing: Results from the Clinical Laboratory.
Punj, Sumit; Akkari, Yassmine; Huang, Jennifer; Yang, Fei; Creason, Allison; Pak, Christine; Potter, Amiee; Dorschner, Michael O; Nickerson, Deborah A; Robertson, Peggy D; Jarvik, Gail P; Amendola, Laura M; Schleit, Jennifer; Simpson, Dana Kostiner; Rope, Alan F; Reiss, Jacob; Kauffman, Tia; Gilmore, Marian J; Himes, Patricia; Wilfond, Benjamin; Goddard, Katrina A B; Richards, C Sue
2018-06-07
Advances in sequencing technologies permit the analysis of a larger selection of genes for preconception carrier screening. The study was designed as a sequential carrier screen using genome sequencing to analyze 728 gene-disorder pairs for carrier and medically actionable conditions in 131 women and their partners (n = 71) who were planning a pregnancy. We report here on the clinical laboratory results from this expanded carrier screening program. Variants were filtered and classified using the latest American College of Medical Genetics and Genomics (ACMG) guideline; only pathogenic and likely pathogenic variants were confirmed by orthogonal methods before being reported. Novel missense variants were classified as variants of uncertain significance. We reported 304 variants in 202 participants. Twelve carrier couples (12/71 couples tested) were identified for common conditions; eight were carriers for hereditary hemochromatosis. Although both known and novel variants were reported, 48% of all reported variants were missense. For novel splice-site variants, RNA-splicing assays were performed to aid in classification. We reported ten copy-number variants and five variants in non-coding regions. One novel variant was reported in F8, associated with hemophilia A; prenatal testing showed that the male fetus harbored this variant and the neonate suffered a life-threatening hemorrhage which was anticipated and appropriately managed. Moreover, 3% of participants had variants that were medically actionable. Compared with targeted mutation screening, genome sequencing improves the sensitivity of detecting clinically significant variants. While certain novel variant interpretation remains challenging, the ACMG guidelines are useful to classify variants in a healthy population. Copyright © 2018 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Kamada, Mayumi; Hase, Sumitaka; Fujii, Kazushi; Miyake, Masato; Sato, Kengo; Kimura, Keitarou; Sakakibara, Yasubumi
2015-01-01
Bacillus subtilis is the main component in the fermentation of soybeans. To investigate the genetics of the soybean-fermenting B. subtilis strains and their relationship with the productivity of extracellular poly-γ-glutamic acid (γPGA), we sequenced the whole genome of eight B. subtilis strains isolated from non-salted fermented soybean foods in Southeast Asia. Assembled nucleotide sequences were compared with those of a natto (fermented soybean food) starter strain B. subtilis BEST195 and the laboratory standard strain B. subtilis 168 that is incapable of γPGA production. Detected variants were investigated in terms of insertion sequences, biotin synthesis, production of subtilisin NAT, and regulatory genes for γPGA synthesis, which were related to the fermentation process. Comparing genome sequences, we found that the strains that produce γPGA have a deletion in a protein that constitutes the flagellar basal body, and this deletion was not found in the non-producing strains. We further identified diversity in variants of the bio operon, which is responsible for the biotin auxotrophism of the natto starter strains. Phylogenetic analysis using multilocus sequence typing revealed that the B. subtilis strains isolated from the non-salted fermented soybeans were not clustered together, while the natto-fermenting strains were tightly clustered; this analysis also suggested that the strain isolated from “Tua Nao” of Thailand traces a different evolutionary process from other strains. PMID:26505996
Zhang, Jimmy F; James, Francis; Shukla, Anju; Girisha, Katta M; Paciorkowski, Alex R
2017-06-27
We built India Allele Finder, an online searchable database and command line tool, that gives researchers access to variant frequencies of Indian Telugu individuals, using publicly available fastq data from the 1000 Genomes Project. Access to appropriate population-based genomic variant annotation can accelerate the interpretation of genomic sequencing data. In particular, exome analysis of individuals of Indian descent will identify population variants not reflected in European exomes, complicating genomic analysis for such individuals. India Allele Finder offers improved ease-of-use to investigators seeking to identify and annotate sequencing data from Indian populations. We describe the use of India Allele Finder to identify common population variants in a disease quartet whole exome dataset, reducing the number of candidate single nucleotide variants from 84 to 7. India Allele Finder is freely available to investigators to annotate genomic sequencing data from Indian populations. Use of India Allele Finder allows efficient identification of population variants in genomic sequencing data, and is an example of a population-specific annotation tool that simplifies analysis and encourages international collaboration in genomics research.
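The filtering step described above, in which candidate variants are discarded when they turn out to be common in a matched population, can be sketched as follows. This is a minimal illustration only: the function name, the tuple layout of a variant, the frequency table and the 1% cutoff are all assumptions made for the example, and none of it reflects the actual interface of India Allele Finder.

```python
from typing import Dict, List, Tuple

# A variant is identified here by (chromosome, position, reference allele, alternate allele).
# This layout is a hypothetical choice for the sketch.
Variant = Tuple[str, int, str, str]


def filter_common_variants(
    candidates: List[Variant],
    population_freqs: Dict[Variant, float],
    max_freq: float = 0.01,  # assumed cutoff, not taken from the abstract
) -> List[Variant]:
    """Keep candidates whose population allele frequency is below max_freq.

    Variants missing from the frequency table are treated as frequency 0.0,
    so they are retained as potentially rare.
    """
    return [v for v in candidates if population_freqs.get(v, 0.0) < max_freq]


# Toy data: one common variant, one unseen variant, one rare variant.
candidates = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
]
population_freqs = {
    ("chr1", 1000, "A", "G"): 0.25,
    ("chr3", 3000, "G", "A"): 0.002,
}

rare = filter_common_variants(candidates, population_freqs)
```

With the toy table, the variant on chr1 is dropped as a common population variant, while the unseen and the rare variants remain as candidates.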
Antimicrobial Peptides Produced by Selective Pressure Incorporation of Non-canonical Amino Acids.
Nickling, Jessica H; Baumann, Tobias; Schmitt, Franz-Josef; Bartholomae, Maike; Kuipers, Oscar P; Friedrich, Thomas; Budisa, Nediljko
2018-05-04
Nature has a variety of possibilities to create new protein functions by modifying the sequence of the individual amino acid building blocks. However, all variations are based on the 20 canonical amino acids (cAAs). As a way to introduce additional physicochemical properties into polypeptides, the incorporation of non-canonical amino acids (ncAAs) is increasingly used in protein engineering. Due to their relatively short length, the modification of ribosomally synthesized and post-translationally modified peptides by ncAAs is particularly attractive. New functionalities and chemical handles can be generated by specific modifications of individual residues. The selective pressure incorporation (SPI) method utilizes auxotrophic host strains that are deprived of an essential amino acid in chemically defined growth media. Several structurally and chemically similar amino acid analogs can then be activated by the corresponding aminoacyl-tRNA synthetase and provide residue-specific cAA(s) → ncAA(s) substitutions in the target peptide or protein sequence. Although, in the context of the SPI method, ncAAs are also incorporated into the host proteome during the phase of recombinant gene expression, the majority of the cell's resources are assigned to the expression of the target gene. This enables efficient residue-specific incorporation of ncAAs often accompanied with high amounts of modified target. The presented work describes the in vivo incorporation of six proline analogs into the antimicrobial peptide nisin, a lantibiotic naturally produced by Lactococcus lactis. Antimicrobial properties of nisin can be changed and further expanded during its fermentation and expression in auxotrophic Escherichia coli strains in defined growth media. Thereby, the effects of residue-specific replacement of cAAs with ncAAs can deliver changes in antimicrobial activity and specificity. 
Antimicrobial activity assays and fluorescence microscopy are used to test the new nisin variants for growth inhibition of a Gram-positive Lactococcus lactis indicator strain. Mass spectrometry is used to confirm ncAA incorporation in bioactive nisin variants.
López-Bueno, Alberto; Rubio, Mari-Paz; Bryant, Nathan; McKenna, Robert; Agbandje-McKenna, Mavis; Almendral, José M.
2006-01-01
The role of receptor recognition in the emergence of virulent viruses was investigated in the infection of severe combined immunodeficient (SCID) mice by the apathogenic prototype strain of the parvovirus minute virus of mice (MVMp). Genetic analysis of isolated MVMp viral clones (n = 48) emerging in mice, including lethal variants, showed only one of three single changes (V325M, I362S, or K368R) in the common sequence of the two capsid proteins. As was found for the parental isolates, the constructed recombinant viruses harboring the I362S or the K368R single substitutions in the capsid sequence, or mutations at both sites, showed a large-plaque phenotype and lower avidity than the wild type for cells in the cytotoxic interaction with two permissive fibroblast cell lines in vitro and caused a lethal disease in SCID mice when inoculated by the natural oronasal route. Significantly, the productive adsorption of MVMp variants carrying any of the three mutations selected through parallel evolution in mice showed higher sensitivity to the treatment of cells by neuraminidase than that of the wild type, indicating a lower affinity of the viral particle for the sialic acid component of the receptor. Consistent with this, the X-ray crystal structure of the MVMp capsids soaked with sialic acid (N-acetyl neuraminic acid) showed the sugar allocated in the depression at the twofold axis of symmetry (termed the dimple), immediately adjacent to residues I362 and K368, which are located on the wall of the dimple, and approximately 22 Å away from V325 in a threefold-related monomer. This is the first reported crystal structure identifying an infectious receptor attachment site on a parvovirus capsid. We conclude that the affinity of the interactions of sialic-acid-containing receptors with residues at or surrounding the dimple can evolutionarily regulate parvovirus pathogenicity and adaptation to new hosts. PMID:16415031
Methods for engineering polypeptide variants via somatic hypermutation and polypeptide made thereby
Tsien, Roger Y; Wang, Lei
2015-01-13
Methods using somatic hypermutation (SHM) for producing polypeptide and nucleic acid variants, and nucleic acids encoding such polypeptide variants are disclosed. Such variants may have desired properties. Also disclosed are novel polypeptides, such as improved fluorescent proteins, produced by the novel methods, and nucleic acids, vectors, and host cells comprising such vectors.
Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.
Tsangaras, Kyriakos; Siracusa, Matthew C; Nikolaidis, Nikolas; Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D
2014-01-01
The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin.
Hybridization Capture Reveals Evolution and Conservation across the Entire Koala Retrovirus Genome
Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M.; Roca, Alfred L.; Greenwood, Alex D.
2014-01-01
The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physiochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured; with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422
Kim, Hyon Suk; Chen, Xinyue; Xu, Min; Yan, Cunling; Liu, Yali; Deng, Haohui; Hoang, Bui Huu; Thuy, Pham Thi Thu; Wang, Terry; Yan, Yiwen; Zeng, Zhen; Gencay, Mikael; Westergaard, Gaston; Pabinger, Stephan; Kriegner, Albert; Nauck, Markus; Seffner, Anja; Gohl, Peter; Hübner, Kirsten; Kaminski, Wolfgang E
2018-06-01
To avoid false negative results, hepatitis B surface antigen (HBsAg) assays need to detect samples with mutations in the immunodominant 'a' determinant region, which vary by ethnographic region. We evaluated the prevalence and type of HBsAg mutations in a hepatitis B virus (HBV)-infected East- and Southeast Asian population, and the diagnostic performance of the Elecsys® HBsAg II Qualitative assay. We analyzed 898 samples from patients with HBV infection from four sites (China [Beijing and Guangzhou], Korea and Vietnam). HBsAg mutations were detected and sequenced using highly sensitive ultra-deep sequencing and compared between the first (amino acids 124-137) and second (amino acids 139-147) loops of the 'a' determinant region using the Elecsys® HBsAg II Qualitative assay. Overall, 237 distinct amino acid mutations in the major hydrophilic region were identified; mutations were present in 660 of 898 HBV-infected patient samples (73.5%). Within the pool of 237 distinct mutations, the majority of the amino acid mutations were found in HBV genotype C (64.8%). We identified 25 previously unknown distinct mutations, mostly prevalent in genotype C-infected Korean patients (n = 18) followed by Chinese (n = 12) patients. All 898 samples were correctly identified by the Elecsys® HBsAg II Qualitative assay. We observed 237 distinct (including 25 novel) mutations, demonstrating the complexity of HBsAg variants in HBV-infected East- and Southeast Asian patients. The Elecsys® HBsAg II Qualitative assay can reliably detect HBV-positive samples and is suitable for routine diagnostic use in East and Southeast Asia. Copyright © 2018 Roche Diagnostics International Ltd. Published by Elsevier B.V. All rights reserved.
Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T
2013-07-01
Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
Regularized rare variant enrichment analysis for case-control exome sequencing data.
Larson, Nicholas B; Schaid, Daniel J
2014-02-01
Rare variants have recently garnered an immense amount of attention in genetic association analysis. However, unlike methods traditionally used for single marker analysis in GWAS, rare variant analysis often requires some method of aggregation, since single marker approaches are poorly powered for typical sequencing study sample sizes. Advancements in sequencing technologies have rendered next-generation sequencing platforms a realistic alternative to traditional genotyping arrays. Exome sequencing in particular not only provides base-level resolution of genetic coding regions, but also a natural paradigm for aggregation via genes and exons. Here, we propose the use of penalized regression in combination with variant aggregation measures to identify rare variant enrichment in exome sequencing data. In contrast to marginal gene-level testing, we simultaneously evaluate the effects of rare variants in multiple genes, focusing on gene-based least absolute shrinkage and selection operator (LASSO) and exon-based sparse group LASSO models. By using gene membership as a grouping variable, the sparse group LASSO can be used as a gene-centric analysis of rare variants while also providing a penalized approach toward identifying specific regions of interest. We apply extensive simulations to evaluate the performance of these approaches with respect to specificity and sensitivity, comparing these results to multiple competing marginal testing methods. Finally, we discuss our findings and outline future research. © 2013 WILEY PERIODICALS, INC.
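The two ingredients named above, aggregation of rare variants by gene and an L1 (LASSO) penalty, can be illustrated with a small sketch. It is not the authors' implementation: the burden score used here (a per-subject sum of minor-allele counts over rare variants in a gene) is only one common aggregation measure, and the function names, the MAF cutoff and the toy data are assumptions. The `soft_threshold` function is the standard shrinkage operator that LASSO coordinate updates apply to a coefficient; the sparse group LASSO itself is not implemented.

```python
from collections import defaultdict
from typing import Dict, List


def gene_burden(
    genotypes: Dict[str, List[int]],   # variant id -> minor-allele counts (0/1/2) per subject
    variant_to_gene: Dict[str, str],   # variant id -> gene symbol
    mafs: Dict[str, float],            # variant id -> minor allele frequency
    maf_cutoff: float = 0.01,          # assumed definition of "rare"
) -> Dict[str, List[int]]:
    """Sum minor-allele counts of rare variants within each gene, per subject."""
    if not genotypes:
        return {}
    n_subjects = len(next(iter(genotypes.values())))
    burden: Dict[str, List[int]] = defaultdict(lambda: [0] * n_subjects)
    for variant_id, counts in genotypes.items():
        if mafs[variant_id] >= maf_cutoff:
            continue  # common variants do not contribute to the rare-variant burden
        gene = variant_to_gene[variant_id]
        for i, count in enumerate(counts):
            burden[gene][i] += count
    return dict(burden)


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Toy data for three subjects and two hypothetical genes.
genotypes = {"v1": [0, 1, 0], "v2": [1, 1, 0], "v3": [2, 0, 0], "v4": [0, 0, 1]}
variant_to_gene = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B", "v4": "GENE_B"}
mafs = {"v1": 0.005, "v2": 0.002, "v3": 0.2, "v4": 0.004}

burden = gene_burden(genotypes, variant_to_gene, mafs)
```

In a penalized regression, the per-gene burden vectors would serve as predictors of case-control status, and the shrinkage step is what sets the coefficients of uninformative genes to exactly zero.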
Inner retinal dystrophy in a patient with biallelic sequence variants in BRAT1.
Oatts, Julius T; Duncan, Jacque L; Hoyt, Creig S; Slavotinek, Anne M; Moore, Anthony T
2017-12-01
Mutations in the BRCA1-associated protein required for the ataxia telangiectasia mutated (ATM) activation-1 (BRAT1) gene cause lethal neonatal rigidity and multifocal seizure syndrome characterized by rigidity and intractable seizures and a milder phenotype with intellectual disability, seizures, nonprogressive cerebellar ataxia or dyspraxia, and cerebellar atrophy. To date, nystagmus, cortical visual impairment, impairment of central vision, optic nerve hypoplasia, and optic atrophy have been described in this condition. This article describes the retinal findings in a patient with biallelic deleterious sequence variants in BRAT1. Case report of a child with biallelic sequence variants in the BRAT1 gene. This patient had developmental delay, microcephaly, nystagmus, and esotropia, and full-field electroretinography (ERG) revealed an inner retinal dystrophy. She was found on exome sequencing to have compound heterozygous sequence variants in the BRAT1 gene: one maternally inherited frameshift variant (c.294dupA, predicting p.Leu99Thrfs*92), which has previously been reported, and one paternally inherited novel missense variant (c.803G>A, p.Arg268His), which is likely to affect protein function. Biallelic sequence variants in BRAT1 have been reported to cause a variety of ocular and systemic manifestations, but to our knowledge, this is the first report of inner retinal dysfunction manifest as selective loss of full-field ERG scotopic and photopic b-wave amplitudes.
Cady, Janet; Allred, Peggy; Bali, Taha; Pestronk, Alan; Goate, Alison; Miller, Timothy M; Mitra, Robi D; Ravits, John; Harms, Matthew B; Baloh, Robert H
2015-01-01
To define the genetic landscape of amyotrophic lateral sclerosis (ALS) and assess the contribution of possible oligogenic inheritance, we aimed to comprehensively sequence 17 known ALS genes in 391 ALS patients from the United States. Targeted pooled-sample sequencing was used to identify variants in 17 ALS genes. Fragment size analysis was used to define ATXN2 and C9ORF72 expansion sizes. Genotype-phenotype correlations were made with individual variants and total burden of variants. Rare variant associations for risk of ALS were investigated at both the single variant and gene level. A total of 64.3% of familial and 27.8% of sporadic subjects carried potentially pathogenic novel or rare coding variants identified by sequencing or an expanded repeat in C9ORF72 or ATXN2; 3.8% of subjects had variants in >1 ALS gene, and these individuals had disease onset 10 years earlier (p = 0.0046) than subjects with variants in a single gene. The number of potentially pathogenic coding variants did not influence disease duration or site of onset. Rare and potentially pathogenic variants in known ALS genes are present in >25% of apparently sporadic and 64% of familial patients, significantly higher than previous reports using less comprehensive sequencing approaches. A significant number of subjects carried variants in >1 gene, which influenced the age of symptom onset and supports oligogenic inheritance as relevant to disease pathogenesis. © 2014 American Neurological Association.
Imani, Saber; Cheng, Jingliang; Shasaltaneh, Marzieh Dehghan; Wei, Chunli; Yang, Lisha; Fu, Shangyi; Zou, Hui; Khan, Md. Asaduzzaman; Zhang, Xianqin; Chen, Hanchun; Zhang, Dianzheng; Duan, Chengxia; Lv, Hongbin; Li, Yumei; Chen, Rui; Fu, Junjiang
2018-01-01
Stargardt disease-4 (STGD4) is an autosomal dominant complex, genetically heterogeneous macular degeneration/dystrophy (MD) disorder. In this paper, we used targeted next generation sequencing and multiple molecular dynamics analyses to identify and characterize a disease-causing genetic variant in four generations of a Chinese family with STGD4-like MD. We found a novel heterozygous missense mutation, c.734T>C (p.L245P) in the PROM1 gene. Structurally, this mutation most likely impairs PROM1 protein stability, flexibility, and amino acid interaction network after changing the amino acid residue Leucine into Proline in the basic helix-loop-helix leucine zipper domain. Molecular dynamic simulation and principal component analysis provide compelling evidence that this PROM1 mutation contributes to disease causativeness or susceptibility variants in patients with STGD4-like MD. Thus, this finding defines new approaches in genetic characterization, accurate diagnosis, and prevention of STGD4-like MD. PMID:29416601
Plasticity of laccase generated by homeologous recombination in yeast.
Cusano, Angela M; Mekmouche, Yasmina; Meglecz, Emese; Tron, Thierry
2009-10-01
Laccase-encoding sequences sharing 65-71% identity were shuffled in vivo by homeologous recombination. Yeast efficiently repaired linearized plasmids containing clac1, clac2 or clac5 Trametes sp. C30 cDNAs using a clac3 PCR fragment. From transformants secreting active variants, three chimeric laccases (LAC131, LAC232 and LAC535), each resulting from double crossovers, were purified, and their apparent kinetic parameters were determined using 2,2'-azino-bis(3-ethylbenzthiazoline-6-sulphonic acid) and syringaldazine (SGZ) as substrates. At acidic pH, the apparent kinetic parameters of the chimeras were not distinguishable from each other or from those obtained for the LAC3 enzyme used as reference. On the other hand, the pH tolerance of the variants was visibly extended towards alkaline pH values. Compared to the parental LAC3, a 31-fold increase in apparent k(cat) was observed for LAC131 at pH 8. This factor is one of the highest ever observed for laccase in a single mutagenesis step.
Naidu, Hariprasad; Subramanian, B Mohana; Chinchkar, Shankar Ramchandra; Sriraman, Rajan; Rana, Samir Kumar; Srinivasan, V A
2012-05-01
The antigenic types of canine parvovirus (CPV) are defined based on differences in the amino acids of the major capsid protein VP2. Type specificity is conferred by a limited number of amino acid changes and in particular by few nucleotide substitutions. PCR based methods are not particularly suitable for typing circulating variants which differ in a few specific nucleotide substitutions. Assays for determining SNPs can detect efficiently nucleotide substitutions and can thus be adapted to identify CPV types. In the present study, CPV typing was performed by single nucleotide extension using the mini-sequencing technique. A mini-sequencing signature was established for all the four CPV types (CPV2, 2a, 2b and 2c) and feline panleukopenia virus. The CPV typing using the mini-sequencing reaction was performed for 13 CPV field isolates and the two vaccine strains available in our repository. All the isolates had been typed earlier by full-length sequencing of the VP2 gene. The typing results obtained from mini-sequencing matched completely with that of sequencing. Typing could be achieved with less than 100 copies of standard plasmid DNA constructs or ≤10¹ FAID₅₀ of virus by mini-sequencing technique. The technique was also efficient for detecting multiple types in mixed infections. Copyright © 2012 Elsevier B.V. All rights reserved.
Hwang, Sang Mee; Lee, Ki Chan; Lee, Min Seob; Park, Kyoung Un
2018-01-01
Transition to next generation sequencing (NGS) for BRCA1/BRCA2 analysis in clinical laboratories is ongoing, but different platforms and/or data analysis pipelines give different results, resulting in difficulties in implementation. We have evaluated the Ion Personal Genome Machine (PGM) Platforms (Ion PGM, Ion PGM Dx, Thermo Fisher Scientific) for the analysis of BRCA1/BRCA2. The results of Ion PGM with OTG-snpcaller, a pipeline based on Torrent mapping alignment program and Genome Analysis Toolkit, from 75 clinical samples and 14 reference DNA samples were compared with Sanger sequencing for BRCA1/BRCA2. Ten clinical samples and 14 reference DNA samples were additionally sequenced by Ion PGM Dx with Torrent Suite. Fifty types of variants including 18 pathogenic variants or variants of unknown significance were identified from 75 clinical samples and known variants of the reference samples were confirmed by Sanger sequencing and/or NGS. One false-negative result was present for Ion PGM/OTG-snpcaller: an indel variant misidentified as a single nucleotide variant. However, eight discordant results were present for Ion PGM Dx/Torrent Suite with both false-positive and -negative results. A 40-bp deletion, a 4-bp deletion and a 1-bp deletion variant were not called and a false-positive deletion was identified. Four other variants were misidentified as another variant. Ion PGM/OTG-snpcaller showed acceptable performance with good concordance with Sanger sequencing. However, Ion PGM Dx/Torrent Suite showed many discrepant results not suitable for use in a clinical laboratory, requiring further optimization of the data analysis for calling variants.
Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H
2011-04-01
κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The obtained cDNA of 660 bp contained an ORF sufficient to encode the entire amino acid sequence of κ-casein precursor protein consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in genetic engineering research on yak κ-casein.
VarBin, a novel method for classifying true and false positive variants in NGS data
2013-01-01
Background: Variant discovery for rare genetic diseases using Illumina genome or exome sequencing involves screening of up to millions of variants to find only the one or few causative variant(s). Sequencing or alignment errors create "false positive" variants, which are often retained in the variant screening process. Methods to remove false positive variants often retain many false positive variants. This report presents VarBin, a method to prioritize variants based on a false positive variant likelihood prediction. Methods: VarBin uses the Genome Analysis Toolkit variant calling software to calculate the variant-to-wild type genotype likelihood ratio at each variant change and position divided by read depth. The resulting Phred-scaled, likelihood-ratio by depth (PLRD) was used to segregate variants into 4 Bins with Bin 1 variants most likely true and Bin 4 most likely false positive. PLRD values were calculated for a proband of interest and 41 additional Illumina HiSeq, exome and whole genome samples (proband's family or unrelated samples). At variant sites without apparent sequencing or alignment error, wild type/non-variant calls cluster near -3 PLRD and variant calls typically cluster above 10 PLRD. Sites with systematic variant calling problems (evident by variant quality scores and biases as well as displayed on the IGV viewer) tend to have higher and more variable wild type/non-variant PLRD values. Depending on the separation of a proband's variant PLRD value from the cluster of wild type/non-variant PLRD values for background samples at the same variant change and position, the VarBin method's classification is assigned to each proband variant (Bin 1 to Bin 4). Results: To assess VarBin performance, Sanger sequencing was performed on 98 variants in the proband and background samples. True variants were confirmed in 97% of Bin 1 variants, 30% of Bin 2, and 0% of Bin 3/Bin 4.
Conclusions: These data indicate that VarBin correctly classifies the majority of true variants as Bin 1 and Bin 3/4 contained only false positive variants. The "uncertain" Bin 2 contained both true and false positive variants. Future work will further differentiate the variants in Bin 2. PMID:24266885
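The PLRD quantity described in the VarBin abstract can be sketched under one plausible reading: taking Phred-scaled genotype likelihoods (the PL values a variant caller such as GATK emits, where lower means more likely), the Phred-scaled variant-to-wild-type likelihood ratio is the homozygous-reference PL minus the best variant-genotype PL, and PLRD divides that by read depth. This formula, the function names and the toy numbers are assumptions for illustration. The abstract does not give the exact computation or the cutoffs that separate Bin 1 to Bin 4, so no bin assignment is attempted here.

```python
from typing import List


def plrd(pl_hom_ref: float, pl_het: float, pl_hom_alt: float, depth: int) -> float:
    """Phred-scaled variant-to-wild-type likelihood ratio divided by read depth.

    Assumed reading of the abstract: PL = -10 * log10(likelihood), so
    10 * log10(L_variant / L_wild_type) = PL_hom_ref - PL_variant.
    Positive values favour a variant genotype, negative values favour wild type.
    """
    if depth <= 0:
        raise ValueError("read depth must be positive")
    best_variant_pl = min(pl_het, pl_hom_alt)
    return (pl_hom_ref - best_variant_pl) / depth


def separation_from_background(proband_plrd: float, background_plrds: List[float]) -> float:
    """Gap between the proband value and the highest background wild-type value."""
    return proband_plrd - max(background_plrds)


# Toy example: a heterozygous call in the proband and two wild-type background samples.
proband = plrd(pl_hom_ref=300, pl_het=0, pl_hom_alt=400, depth=25)
background = [
    plrd(pl_hom_ref=0, pl_het=60, pl_hom_alt=600, depth=20),
    plrd(pl_hom_ref=0, pl_het=90, pl_hom_alt=900, depth=30),
]
gap = separation_from_background(proband, background)
```

With these invented numbers the proband value is well above the background wild-type values, the pattern the abstract associates with sites free of systematic calling problems. A small or negative gap would correspond to the sites it describes as having higher and more variable wild-type PLRD values.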
Unlocking hidden genomic sequence
Keith, Jonathan M.; Cochran, Duncan A. E.; Lala, Gita H.; Adams, Peter; Bryant, Darryn; Mitchelson, Keith R.
2004-01-01
Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs. PMID:14973330
The functional spectrum of low-frequency coding variation.
Marth, Gabor T; Yu, Fuli; Indap, Amit R; Garimella, Kiran; Gravel, Simon; Leong, Wen Fung; Tyler-Smith, Chris; Bainbridge, Matthew; Blackwell, Tom; Zheng-Bradley, Xiangqun; Chen, Yuan; Challis, Danny; Clarke, Laura; Ball, Edward V; Cibulskis, Kristian; Cooper, David N; Fulton, Bob; Hartl, Chris; Koboldt, Dan; Muzny, Donna; Smith, Richard; Sougnez, Carrie; Stewart, Chip; Ward, Alistair; Yu, Jin; Xue, Yali; Altshuler, David; Bustamante, Carlos D; Clark, Andrew G; Daly, Mark; DePristo, Mark; Flicek, Paul; Gabriel, Stacey; Mardis, Elaine; Palotie, Aarno; Gibbs, Richard
2011-09-14
Rare coding variants constitute an important class of human genetic variation, but are underrepresented in current databases that are based on small population samples. Recent studies show that variants altering amino acid sequence and protein function are enriched at low variant allele frequency, 2 to 5%, but because of insufficient sample size it is not clear if the same trend holds for rare variants below 1% allele frequency. The 1000 Genomes Exon Pilot Project has collected deep-coverage exon-capture data in roughly 1,000 human genes, for nearly 700 samples. Although medical whole-exome projects are currently afoot, this is still the deepest reported sampling of a large number of human genes with next-generation technologies. According to the goals of the 1000 Genomes Project, we created effective informatics pipelines to process and analyze the data, and discovered 12,758 exonic SNPs, 70% of them novel, and 74% below 1% allele frequency in the seven population samples we examined. Our analysis confirms that coding variants below 1% allele frequency show increased population-specificity and are enriched for functional variants. This study represents a large step toward detecting and interpreting low frequency coding variation, clearly lays out technical steps for effective analysis of DNA capture data, and articulates functional and population properties of this important class of genetic variation.
Steinberg, Karyn Meltz; Ramachandran, Dhanya; Patel, Viren C; Shetty, Amol C; Cutler, David J; Zwick, Michael E
2012-09-28
Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions from these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects.
2012-01-01
Background Autism spectrum disorder (ASD) is highly heritable, but the genetic risk factors for it remain largely unknown. Although structural variants with large effect sizes may explain up to 15% of ASD, genome-wide association studies have failed to uncover common single nucleotide variants with large effects on phenotype. The focus within ASD genetics is now shifting to the examination of rare sequence variants of modest effect, which is most often achieved via exome selection and sequencing. This strategy has indeed identified some rare candidate variants; however, the approach does not capture the full spectrum of genetic variation that might contribute to the phenotype. Methods We surveyed two loci with known rare variants that contribute to ASD, the X-linked neuroligin genes, by performing massively parallel Illumina sequencing of the coding and noncoding regions from these genes in males from families with multiplex autism. We annotated all variant sites and functionally tested a subset to identify other rare mutations contributing to ASD susceptibility. Results We found seven rare variants at evolutionarily conserved sites in our study population. Functional analyses of the three 3' UTR variants did not show statistically significant effects on the expression of NLGN3 and NLGN4X. In addition, we identified two NLGN3 intronic variants located within conserved transcription factor binding sites that could potentially affect gene regulation. Conclusions These data demonstrate the power of massively parallel, targeted sequencing studies of affected individuals for identifying rare, potentially disease-contributing variation. However, they also point out the challenges and limitations of current methods of direct functional testing of rare variants and the difficulties of identifying alleles with modest effects. PMID:23020841
Zhao, Mi; He, Maoxian; Huang, Xiande; Wang, Qi; Shi, Yu
2016-02-01
The granulin/epithelin precursor (GEP) encodes a glycoprotein precursor which exhibits pleiotropic tissue growth factor activity with multiple functions. Here, GEP was isolated and its role in the shell biomineralization process of the pearl oyster Pinctada fucata was investigated. Three forms of GEP mRNA were isolated from the pearl oyster (designated PfGEP-1, PfGEP-2 and PfGEP-3). Genomic DNA flanking the splicing region of the PfGEP variants was sequenced and it was found that PfGEP-2 splices out Exon 4, whereas PfGEP-3 splices out Exon 3, compared to PfGEP-1. PfGEP-1 (1505 amino acids) consists of 18 granulin domains, whereas PfGEP-2 (1459 amino acids) and PfGEP-3 (1471 amino acids) each consist of 17.5 granulin domains. Analyses of PfGEP-1 and PfGEP-3 mRNA showed differential patterns in the tissues and developmental stages. Western blotting results showed that the three splice variants can be translated into proteins in HEK293T cells. A knockdown experiment using PfGEP dsRNA showed decreased PfGEP-1/PfGEP-3 and PfMSX mRNA, and scanning electron microscopy showed irregular crystallization of the nacreous layer. In luciferase assays, co-transfection of PfGEP-1 could activate as well as repress luciferase expression of the reporter plasmid driven by the PfMSX promoter, whereas PfGEP-3 stimulated the expression, elucidating the molecular mechanisms involved in the correlation between PfGEP and PfMSX. These results suggested that GEP variants might function differently during the biomineralization process, which provides new knowledge on the mechanism regulating nacre formation.
García, Verónica; Salinas, Francisco; Aguilera, Omayra; Liti, Gianni; Martínez, Claudio
2014-01-01
Different populations within a species represent a rich reservoir of allelic variants, corresponding to an evolutionary signature of withstood environmental constraints. Saccharomyces cerevisiae strains are widely utilised in the fermentation of different kinds of alcoholic beverages, such as wine and sake, each of them derived from must with distinct nutrient composition. Importantly, adequate nitrogen levels in the medium are essential for the fermentation process; however, a comprehensive understanding of the genetic variants determining variation in nitrogen consumption is lacking. Here, we assessed the genetic factors underlying variation in nitrogen consumption in a segregating population derived from a cross between two main fermenter yeasts, a Wine/European and a Sake isolate. By linkage analysis we identified 18 main effect QTLs for ammonium and amino acid sources. Interestingly, the majority of QTLs were involved in more than a single trait, grouped based on amino acid structure and indicating high levels of pleiotropy across nitrogen sources, in agreement with the observed patterns of phenotypic co-variation. Accordingly, we performed reciprocal hemizygosity analysis, validating an effect for three genes, GLT1, ASI1 and AGP1. Furthermore, we detected a widespread pleiotropic effect on these genes, with AGP1 affecting seven amino acids and nine in the case of GLT1 and ASI1. Based on sequence and comparative analysis, candidate causative mutations within these genes were also predicted. Altogether, the identification of these variants demonstrates how Sake and Wine/European genetic backgrounds differentially consume nitrogen sources, in part explaining independently evolved preferences for nitrogen assimilation and representing a niche of genetic diversity for the implementation of practical approaches towards more efficient strains for nitrogen metabolism. PMID:24466135
Chronological analysis of canine parvovirus type 2 isolates in Japan.
Ohshima, Takahisa; Hisaka, Mitsuaki; Kawakami, Kazuo; Kishi, Masahiko; Tohya, Yukinobu; Mochizuki, Masami
2008-08-01
Fifty-five canine parvovirus type 2 (CPV) samples, 12 fecal specimens and 43 cell culture isolates, were examined for the genetic characteristics of their VP2 gene. They were collected from diseased dogs in various districts of Japan over the 27 years from 1980 to 2006. A fragment of the VP2 gene was analyzed by restriction fragment length polymorphism assay and DNA sequencing. The original antigenic type 2 of CPV (CPV-2) has not been found in the samples since 1984, and two antigenic variants, CPV-2a and CPV-2b, replaced CPV-2 as predominant types for about 5 years from 1982. A new genetic variant of prototype CPV-2a with a non-synonymous substitution at VP2 amino acid residue 297 from Ser to Ala was first detected in 1987. New CPV-2b with the same amino acid substitution at position 297 as new CPV-2a was also detected in samples collected in 1997. Since then new CPV-2b has been the predominant CPV in the field in Japan. Several additional amino acid substitutions were detected in the VP2 gene of some recent CPV strains. Neither CPV-2c(a), CPV-2c(b), nor "Glu-426" of the antigenic variants previously found outside the country was detected in any samples tested. Reactivity of new CPV-2a and 2b variants against antibodies produced by the current vaccine products was determined by a cross hemagglutination-inhibition test. The recent field CPV isolates reacted more efficiently to the antibodies produced in dogs vaccinated with the new CPV-2b vaccine strain than to those produced with the conventional CPV-2 vaccine strain.
Bull, Marta; Learn, Gerald; Genowati, Indira; McKernan, Jennifer; Hitti, Jane; Lockhart, David; Tapia, Kenneth; Holte, Sarah; Dragavon, Joan; Coombs, Robert; Mullins, James; Frenkel, Lisa
2009-09-22
Compartmentalization of HIV-1 between the genital tract and blood was noted in half of 57 women included in 12 studies primarily using cell-free virus. To further understand differences between genital tract and blood viruses of women with chronic HIV-1 infection, cell-free and cell-associated virus populations were sequenced from these tissues, reasoning that integrated viral DNA includes variants archived from earlier in infection, and provides a greater array of genotypes for comparisons. Multiple sequences from single-genome-amplification of HIV-1 RNA and DNA from the genital tract and blood of each woman were compared in a cross-sectional study. Maximum likelihood phylogenies were evaluated for evidence of compartmentalization using four statistical tests. Genital tract and blood HIV-1 appeared compartmentalized in 7/13 women by ≥2 statistical analyses. These subjects' phylograms were characterized by low diversity genital-specific viral clades interspersed between clades containing both genital and blood sequences. Many of the genital-specific clades contained monotypic HIV-1 sequences. In 2/7 women, HIV-1 populations were significantly compartmentalized across all four statistical tests; both had low diversity genital tract-only clades. Collapsing monotypic variants into a single sequence diminished the prevalence and extent of compartmentalization. Viral sequences did not demonstrate tissue-specific signature amino acid residues, differential immune selection, or co-receptor usage. In women with chronic HIV-1 infection, multiple identical sequences suggest proliferation of HIV-1-infected cells, and low diversity tissue-specific phylogenetic clades are consistent with bursts of viral replication. These monotypic and tissue-specific viruses provide statistical support for compartmentalization of HIV-1 between the female genital tract and blood.
However, the intermingling of these clades with clades comprised of both genital and blood sequences and the absence of tissue-specific genetic features suggests compartmentalization between blood and genital tract may be due to viral replication and proliferation of infected cells, and questions whether HIV-1 in the female genital tract is distinct from blood.
Reuter, Miriam S.; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K.C.; Trost, Brett; Paton, Tara A.; Pereira, Sergio L.; Herbrick, Jo-Anne; Wintle, Richard F.; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R.; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W.L.; Wang, Zhuozhi; Patel, Rohan V.; Pellecchia, Giovanna; Wei, John; Strug, Lisa J.; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M.; Bassett, Anne S.; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D.; Stavropoulos, Dimitri J.; Bowdin, Sarah; Hildebrandt, Matthew R.; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M. Stephen; Monfared, Nasim; Hosseini, S. Mohsen; Joseph-George, Ann M.; Keeley, Fred W.; Cook, Ryan A.; Fiume, Marc; Lee, Hin C.; Marshall, Christian R.; Davies, Jill; Hazell, Allison; Buchanan, Janet A.; Szego, Michael J.; Scherer, Stephen W.
2018-01-01
BACKGROUND: The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. METHODS: Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. RESULTS: Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants — associated with cancer, cardiac or neurodegenerative phenotypes — remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. INTERPRETATION: Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. PMID:29431110
Zeil, Catharina; Widmann, Michael; Fademrecht, Silvia; Vogel, Constantin; Pleiss, Jürgen
2016-05-01
The Lactamase Engineering Database (www.LacED.uni-stuttgart.de) was developed to facilitate the classification and analysis of TEM β-lactamases. The current version contains 474 TEM variants. Two hundred fifty-nine variants form a large scale-free network of highly connected point mutants. The network was divided into three subnetworks which were enriched by single phenotypes: one network with predominantly 2be and two networks with 2br phenotypes. Fifteen positions were found to be highly variable, contributing to the majority of the observed variants. Since it is expected that a considerable fraction of the theoretical sequence space is functional, the currently sequenced 474 variants represent only the tip of the iceberg of functional TEM β-lactamase variants which form a huge natural reservoir of highly interconnected variants. Almost 50% of the variants are part of a quartet. Thus, two single mutations that result in functional enzymes can be combined into a functional protein. Most of these quartets consist of the same phenotype, or the mutations are additive with respect to the phenotype. By predicting quartets from triplets, 3,916 unknown variants were constructed. Eighty-seven variants complement multiple quartets and therefore have a high probability of being functional. The construction of a TEM β-lactamase network and subsequent analyses by clustering and quartet prediction are valuable tools to gain new insights into the viable sequence space of TEM β-lactamases and to predict their phenotype. The highly connected sequence space of TEM β-lactamases is ideally suited to network analysis and demonstrates the strengths of network analysis over tree reconstruction methods. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
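The quartet idea in this abstract (a parent variant, two single mutants of it, and the double mutant that combines them) can be sketched as follows. This is a minimal illustration under stated assumptions, not the LacED implementation: each variant is modelled as a set of opaque mutation labels relative to a common reference, only the missing double-mutant corner is predicted, and the function name and input format are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import FrozenSet, Iterable


def predict_double_mutants(known: Iterable[Iterable[str]]) -> Counter:
    """Predict unknown variants that would complete a quartet.

    A quartet is {V, V+a, V+b, V+a+b}. For every known V with two known
    single-step descendants V+a and V+b, the combination V+a+b is proposed if
    it is not already known. The returned Counter maps each predicted variant
    to the number of triplets it completes.
    """
    variants = {frozenset(v) for v in known}
    completions: Counter = Counter()
    for base in variants:
        # Mutations that lead from `base` to another known variant in one step.
        singles = sorted(
            next(iter(v - base))
            for v in variants
            if len(v) == len(base) + 1 and base < v
        )
        for a, b in combinations(singles, 2):
            candidate: FrozenSet[str] = base | {a, b}
            if candidate not in variants:
                completions[candidate] += 1
    return completions
```

A predicted variant with a count above one completes several quartets at once, which corresponds to the abstract's point that variants complementing multiple quartets are the most likely to be functional. Real use would also need to reject pairs of mutations at the same sequence position, which this sketch does not check.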
Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W
2018-02-05
The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set ( n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.
A Next-Generation Sequencing Primer—How Does It Work and What Can It Do?
Alekseyev, Yuriy O.; Fazeli, Roghayeh; Yang, Shi; Basran, Raveen; Miller, Nancy S.
2018-01-01
Next-generation sequencing refers to a high-throughput technology that determines the nucleic acid sequences and identifies variants in a sample. The technology has been introduced into clinical laboratory testing and produces test results for precision medicine. Since next-generation sequencing is relatively new, graduate students, medical students, pathology residents, and other physicians may benefit from a primer to provide a foundation about basic next-generation sequencing methods and applications, as well as specific examples where it has had diagnostic and prognostic utility. Next-generation sequencing technology grew out of advances in multiple fields to produce a sophisticated laboratory test with tremendous potential. Next-generation sequencing may be used in the clinical setting to look for specific genetic alterations in patients with cancer, diagnose inherited conditions such as cystic fibrosis, and detect and profile microbial organisms. This primer will review DNA sequencing technology, the commercialization of next-generation sequencing, and clinical uses of next-generation sequencing. Specific applications where next-generation sequencing has demonstrated utility in oncology are provided. PMID:29761157
Mutations in the S gene region of hepatitis B virus genotype D in Turkish patients.
Ozaslan, Mehmet; Ozaslan, Ersan; Barsgan, Arzu; Koruk, Mehmet
2007-12-01
The S gene region of the hepatitis B virus (HBV) is responsible for the expression of surface antigens and includes the 'a'-determinant region. Thus, mutation(s) in this region would afford HBV variants a distinct survival advantage, permitting the mutant virus to escape from the immune system. The aim of this study was to search for mutations of the S gene region in different patient groups infected with genotype D variants of HBV, and to analyse the biological significance of these mutations. Moreover, we investigated S gene mutation inductance among family members. Forty HBV-DNA-positive patients were identified among 132 hepatitis B surface antigen (HBsAg) carriers by the first stage of seminested PCR. Genotypes and subtypes were established by sequencing of the amplified S gene regions. Variants were compared with original sequences of these serotypes, and mutations were identified. All variants were designated as genotype D and subtype ayw3. Ten kinds of point mutations were identified within the S region. The highest rates of mutation were found in chronic hepatitis patients and their family members. The amino acid mutations 125 (M -> T) and 127 (T -> P) were found on the first loop of the 'a'-determinant. The other consequence was mutation inductance in a family member. We found some mutations in the S gene region known to be stable and observed that some of these mutations affected S gene expression.
Deletion mapping of the Aequorea victoria green fluorescent protein.
Dopf, J; Horiagon, T M
1996-01-01
Aequorea victoria green fluorescent protein (GFP) is a promising fluorescent marker which is active in a diverse array of prokaryotic and eukaryotic organisms. A key feature underlying the versatility of GFP is its capacity to undergo heterocyclic chromophore formation by cyclization of a tripeptide present in its primary sequence and thereby acquiring fluorescent activity in a variety of intracellular environments. In order to define further the primary structure requirements for chromophore formation and fluorescence in GFP, a series of N- and C-terminal GFP deletion variant expression vectors were created using the polymerase chain reaction. Scanning spectrofluorometric analyses of crude soluble protein extracts derived from eleven GFP expression constructs revealed that amino acid (aa) residues 2-232, of a total of 238 aa in the native protein, were required for the characteristic emission and absorption spectra of native GFP. Heterocyclic chromophore formation was assayed by comparing the absorption spectrum of GFP deletion variants over the 300-500-nm range to the absorption spectra of full-length GFP and GFP deletion variants missing the chromophore substrate domain from the primary sequence. GFP deletion variants lacking fluorescent activity showed no evidence of heterocyclic ring structure formation when the soluble extracts of their bacterial expression hosts were studied at pH 7.9. These observations suggest that the primary structure requirements for the fluorescent activity of GFP are relatively extensive and are compatible with the view that much of the primary structure serves an autocatalytic function.
Bandarian, Fatemeh; Daneshpour, Maryam Sadat; Hedayati, Mehdi; Naseri, Mohsen; Azizi, Fereidoun
2016-01-01
Apolipoprotein A2 (APOA2) is the second major apolipoprotein of the high-density lipoprotein cholesterol (HDL-C). The study aim was to identify APOA2 gene variation in individuals within two extreme tails of HDL-C levels and its relationship with HDL-C level. This cross-sectional survey was conducted on participants from the Tehran Lipid and Glucose Study (TLGS) at the Research Institute for Endocrine Sciences, Tehran, Iran from April 2012 to February 2013. In total, 79 individuals with extreme low HDL-C levels (≤5th percentile for age and gender) and 63 individuals with extreme high HDL-C levels (≥95th percentile for age and gender) were selected. Variants were identified using DNA amplification and direct sequencing. Screening of all exons and the core promoter region of the APOA2 gene identified nine single nucleotide substitutions and one microsatellite; five of which were known and four were new variants. Of these nine variants, two were common tag single nucleotide polymorphisms (SNPs) and seven were rare SNPs. Both exonic substitutions were missense mutations and caused an amino acid change. There was a significant association between the new missense mutation (variant Chr.1:16119226, Ala98Pro) and HDL-C level. Neither of the two common tag SNPs, rs6413453 and rs5082, contributes to the HDL-C trait in the Iranian population, but a new missense mutation in APOA2 in our population has a significant association with HDL-C.
Shared epitopes of glycoprotein A and protein 4.1 defined by antibody NaM10-3C10.
Rasamoelisolo, M; Czerwinski, M; Willem, C; Blanchard, D
1998-06-01
We have produced the murine monoclonal antibody (MAb) NaM70-3C10 (IgM) from splenocytes of mice immunized with human red blood cells (RBCs). The MAb agglutinated untreated as well as trypsin, chymotrypsin, neuraminidase, or ficin-treated RBCs from controls. In contrast, control RBCs treated with papain or bromelain were not agglutinated. On immunoblots, the MAb bound to glycophorin A (GPA) and to an 80 kDa protein identified as protein 4.1. Analysis by agglutination of variant RBCs carrying hybrid glycophorins made of the N-terminus (amino acids 1-58) of GPA and of the C-terminus (amino acids 27-72) of glycophorin B (GPB), together with a competition-inhibition test using purified GPA and a synthetic peptide corresponding to the amino acid sequence 48-58 of GPA, demonstrated that the epitope is located within residues 48-58 of GPA. Epitope analysis with immobilized peptides showed that the MAb recognizes the sequence 53Pro-Pro-Glu-Glu-Glu58 of GPA. A homologous sequence is also present within amino acids 395 to 405 of protein 4.1. Finally, the MAb bound to a 16 kDa chymotryptic peptide of protein 4.1, which carries the above amino acid sequence. In conclusion, it may be assumed that NaM70-3C10 specifically recognizes a common epitope on the extracellular domain of GPA and on the intracellular protein 4.1; this specificity explains the persistence of the 80 kDa band on blots when RBCs are treated with papain.
An update on the genetic architecture of hyperuricemia and gout.
Merriman, Tony R
2015-04-10
Genome-wide association studies that scan the genome for common genetic variants associated with phenotype have greatly advanced medical knowledge. Hyperuricemia is no exception, with 28 loci identified. However, genetic control of pathways determining gout in the presence of hyperuricemia is still poorly understood. Two important pathways determining hyperuricemia have been confirmed (renal and gut excretion of uric acid with glycolysis now firmly implicated). Major urate loci are SLC2A9 and ABCG2. Recent studies show that SLC2A9 is involved in renal and gut excretion of uric acid and is implicated in antioxidant defense. Although etiological variants at SLC2A9 are yet to be identified, it is clear that considerable genetic complexity exists at the SLC2A9 locus, with multiple statistically independent genetic variants and local epistatic interactions. The positions of implicated genetic variants within or near chromatin regions involved in transcriptional control suggest that this mechanism (rather than structural changes in SLC2A9) is important in regulating the activity of SLC2A9. ABCG2 is involved primarily in extra-renal uric acid under-excretion with the etiological variant influencing expression. At the other 26 loci, probable causal genes can be identified at three (PDZK1, SLC22A11, and INHBB) with strong candidates at a further 10 loci. Confirmation of the causal gene will require a combination of re-sequencing, trans-ancestral mapping, and correlation of genetic association data with expression data. As expected, the urate loci associate with gout, although inconsistent effect sizes for gout require investigation. Finally, there has been no genome-wide association study using clinically ascertained cases to investigate the causes of gout in the presence of hyperuricemia. In such a study, use of asymptomatic hyperuricemic controls would be expected to increase the ability to detect genetic associations with gout.
Applegate, Tanya L; Gaudieri, Silvana; Plauzolles, Anne; Chopra, Abha; Grebely, Jason; Lucas, Michaela; Hellard, Margaret; Luciani, Fabio; Dore, Gregory J; Matthews, Gail V
2015-01-01
Direct-acting antivirals (DAAs) are predicted to transform hepatitis C therapy, yet little is known about the prevalence of naturally occurring resistance mutations in recently acquired HCV. This study aimed to determine the prevalence and frequency of drug resistance mutations in the viral quasispecies among HIV-positive and -negative individuals with recent HCV. The NS3 protease, NS5A and NS5B polymerase genes were amplified from 50 genotype 1a participants of the Australian Trial in Acute Hepatitis C. Amino acid variations at sites known to be associated with possible drug resistance were analysed by ultra-deep pyrosequencing. A total of 12% of individuals harboured dominant resistance mutations, while 36% demonstrated non-dominant resistant variants below that detectable by bulk sequencing (that is, <20%) but above a threshold of 1%. Resistance variants (<1%) were observed at most sites associated with DAA resistance from all classes, with the exception of sofosbuvir. Dominant resistant mutations were uncommonly observed in the setting of recent HCV. However, low-level mutations to all DAA classes were observed by deep sequencing at the majority of sites and in most individuals. The significance of these variants and impact on future treatment options remains to be determined. Clinicaltrials.gov NCT00192569.
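The frequency bands used in this abstract can be written down as a small classifier. This is a sketch only: the 20% and 1% cut-offs come from the abstract, while the category names, the function name, and the treatment of values exactly on a boundary are assumptions made here.

```python
def classify_variant_frequency(freq: float) -> str:
    """Assign a within-host variant frequency (0 to 1) to a reporting band.

    Bands follow the abstract: variants at 20% or more are treated as
    detectable by bulk sequencing ("dominant"), those from 1% up to 20% as
    "minority", and the rest as "below 1%". Boundary handling is an assumption.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("frequency must be between 0 and 1")
    if freq >= 0.20:
        return "dominant"
    if freq >= 0.01:
        return "minority"
    return "below 1%"
```

For example, a resistance-associated substitution present in 5% of reads at a site would be reported as a minority variant that bulk sequencing would be expected to miss.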
Contactin 4 as an Autism Susceptibility Locus
Cottrell, Catherine E.; Bir, Natalie; Varga, Elizabeth; Alvarez, Carlos E.; Bouyain, Samuel; Zernzach, Randall; LambThrush, Devon; Evans, Johnna; Trimarchi, Michael; Butter, Eric M.; Cunningham, David; Gastier-Foster, Julie M.; McBride, Kim; Herman, Gail E.
2011-01-01
Structural and sequence variation have been described in several members of the contactin (CNTN) and contactin associated protein (CNTNAP) gene families in association with neurodevelopmental disorders, including autism. Using array comparative genome hybridization (CGH), we identified a maternally inherited ~535 kb deletion at 3p26.3 encompassing the 5′ end of the contactin 4 gene (CNTN4) in a patient with autism. Based on this finding and previous reports implicating genomic rearrangements of CNTN4 in autism spectrum disorders (ASDs) and 3p− microdeletion syndrome, we undertook sequencing of the coding regions of the gene in a local ASD cohort in comparison with a set of controls. Unique missense variants were identified in 4/75 unrelated individuals with an ASD, as well as in 1/107 controls. All of the amino acid substitutions were nonsynonymous, occurred at evolutionarily conserved positions, and were, thus, felt likely to be deleterious. However, these data did not reach statistical significance, nor did the variants segregate with disease within all of the ASD families. Finally, there was no detectable difference in binding of two of the variants to the interacting protein PTPRG in vitro. Thus, additional, larger studies will be necessary to determine whether CNTN4 functions as an autism susceptibility locus in combination with other genetic and/or environmental factors. PMID:21308999
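The carrier counts quoted in this abstract (4 of 75 cases, 1 of 107 controls) can be checked with a two-sided Fisher's exact test. The sketch below is only an illustration: the abstract does not say which test the authors used, and the function name `fisher_exact_two_sided` is a hypothetical helper written here with the standard library.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that x of the col1 carriers fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Counts from the abstract: 4 carriers / 71 non-carriers among cases,
# 1 carrier / 106 non-carriers among controls.
p_value = fisher_exact_two_sided(4, 71, 1, 106)
print(f"two-sided p = {p_value:.3f}")
```

Under these assumptions the p-value is well above 0.05, which agrees with the abstract's statement that the difference did not reach statistical significance.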
Costantini, S; Malerba, G; Contreas, G; Corradi, M; Marin Vargas, S P; Giorgetti, A; Maffeis, C
2015-05-01
Heterozygous loss-of-function mutations in the glucokinase (GCK) gene cause maturity-onset diabetes of the young (MODY) subtype GCK (GCK-MODY/MODY2). GCK sequencing revealed 16 distinct mutations (13 missense, 1 nonsense, 1 splice site, and 1 frameshift-deletion) co-segregating with hyperglycaemia in 23 GCK-MODY families. Four missense substitutions (c.718A>G/p.Asn240Asp, c.757G>T/p.Val253Phe, c.872A>C/p.Lys291Thr, and c.1151C>T/p.Ala384Val) were novel and a founder effect for the nonsense mutation (c.76C>T/p.Gln26*) was supposed. We tested whether an accurate bioinformatics approach could strengthen family-genetic evidence for missense variant pathogenicity in routine diagnostics, where wet-lab functional assays are generally unviable. In silico analyses of the novel missense variants, including orthologous sequence conservation, amino acid substitution (AAS)-pathogenicity predictors, structural modeling and splicing predictors, suggested that the AASs and/or the underlying nucleotide changes are likely to be pathogenic. This study shows how a careful bioinformatics analysis could provide effective suggestions to help molecular-genetic diagnosis in absence of wet-lab validations. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Gimm, O; Gössling, A; Marsh, D J; Dahia, P L M; Mulligan, L M; Deimling, A von; Eng, C
1999-01-01
Glial cell line-derived neurotrophic factor (GDNF) plays a key role in the control of vertebrate neuron survival and differentiation in both the central and peripheral nervous systems. GDNF preferentially binds to GFRα-1 which then interacts with the receptor tyrosine kinase RET. We investigated a panel of 36 independent cases of mainly advanced sporadic brain tumours for the presence of mutations in GDNF and GFRα-1. No mutations were found in the coding region of GDNF. We identified six previously described GFRα-1 polymorphisms, two of which lead to an amino acid change. In 15 of 36 brain tumours, all polymorphic variants appeared to be homozygous. Of these 15 tumours, one also had a rare, apparently homozygous, sequence variant at codon 361. Because of the rarity of the combination of homozygous sequence variants, analysis for hemizygous deletion was pursued in the 15 samples and loss of heterozygosity was found in 11 tumours. Our data suggest that intragenic point mutations of GDNF or GFRα-1 are not a common aetiologic event in brain tumours. However, either deletion of GFRα-1 and/or nearby genes may contribute to the pathogenesis of these tumours. © 1999 Cancer Research Campaign PMID:10408842
Barber, Lisa M; McGrath, Helen E N; Meyer, Stefan; Will, Andrew M; Birch, Jillian M; Eden, Osborn B; Taylor, G Malcolm
2003-04-01
The extent to which genetic susceptibility contributes to the causation of childhood acute myeloid leukaemia (AML) is not known. The inherited bone marrow failure disorder Fanconi anaemia (FA) carries a substantially increased risk of AML, raising the possibility that constitutional variation in the FA (FANC) genes is involved in the aetiology of childhood AML. We have screened genomic DNA extracted from remission blood samples of 97 children with sporadic AML and 91 children with sporadic acute lymphoblastic leukaemia (ALL), together with 104 cord blood DNA samples from newborn children, for variations in the Fanconi anaemia group C (FANCC) gene. We found no evidence of known FANCC pathogenic mutations in children with AML, ALL or in the cord blood samples. However, we detected 12 different FANCC sequence variants, of which five were novel to this study. Among six FANCC variants leading to amino-acid substitutions, one (S26F) was present at a fourfold greater frequency in children with AML than in the cord blood samples (odds ratio: 4.09, P = 0.047; 95% confidence interval 1.08-15.54). Our results thus do not exclude the possibility that this polymorphic variant contributes to the risk of a small proportion of childhood AML.
DHAD variants and methods of screening
Kelly, Kristen J.; Ye, Rick W.
2017-02-28
Methods of screening for dihydroxy-acid dehydratase (DHAD) variants that display increased DHAD activity are disclosed, along with DHAD variants identified by these methods. Such enzymes can result in increased production of compounds from DHAD-requiring biosynthetic pathways. Also disclosed are isolated nucleic acids encoding the DHAD variants, recombinant host cells comprising the isolated nucleic acid molecules, and methods of producing butanol.
Zimmer, Christoph T; Garrood, William T; Singh, Kumar Saurabh; Randall, Emma; Lueke, Bettina; Gutbrod, Oliver; Matthiesen, Svend; Kohler, Maxie; Nauen, Ralf; Davies, T G Emyr; Bass, Chris
2018-01-22
Gene duplication is a major source of genetic variation that has been shown to underpin the evolution of a wide range of adaptive traits [1, 2]. For example, duplication or amplification of genes encoding detoxification enzymes has been shown to play an important role in the evolution of insecticide resistance [3-5]. In this context, gene duplication performs an adaptive function as a result of its effects on gene dosage and not as a source of functional novelty [3, 6-8]. Here, we show that duplication and neofunctionalization of a cytochrome P450, CYP6ER1, led to the evolution of insecticide resistance in the brown planthopper. Considerable genetic variation was observed in the coding sequence of CYP6ER1 in populations of brown planthopper collected from across Asia, but just two sequence variants are highly overexpressed in resistant strains and metabolize imidacloprid. Both variants are characterized by profound amino-acid alterations in substrate recognition sites, and the introduction of these mutations into a susceptible P450 sequence is sufficient to confer resistance. CYP6ER1 is duplicated in resistant strains with individuals carrying paralogs with and without the gain-of-function mutations. Despite numerical parity in the genome, the susceptible and mutant copies exhibit marked asymmetry in their expression with the resistant paralogs overexpressed. In the primary resistance-conferring CYP6ER1 variant, this results from an extended region of novel sequence upstream of the gene that provides enhanced expression. Our findings illustrate the versatility of gene duplication in providing opportunities for functional and regulatory innovation during the evolution of an adaptive trait. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Implementation and utilization of genetic testing in personalized medicine
Abul-Husn, Noura S; Owusu Obeng, Aniwaa; Sanderson, Saskia C; Gottesman, Omri; Scott, Stuart A
2014-01-01
Clinical genetic testing began over 30 years ago with the availability of mutation detection for sickle cell disease diagnosis. Since then, the field has dramatically transformed to include gene sequencing, high-throughput targeted genotyping, prenatal mutation detection, preimplantation genetic diagnosis, population-based carrier screening, and now genome-wide analyses using microarrays and next-generation sequencing. Despite these significant advances in molecular technologies and testing capabilities, clinical genetics laboratories historically have been centered on mutation detection for Mendelian disorders. However, the ongoing identification of deoxyribonucleic acid (DNA) sequence variants associated with common diseases prompted the availability of testing for personal disease risk estimation, and created commercial opportunities for direct-to-consumer genetic testing companies that assay these variants. This germline genetic risk, in conjunction with other clinical, family, and demographic variables, are the key components of the personalized medicine paradigm, which aims to apply personal genomic and other relevant data into a patient’s clinical assessment to more precisely guide medical management. However, genetic testing for disease risk estimation is an ongoing topic of debate, largely due to inconsistencies in the results, concerns over clinical validity and utility, and the variable mode of delivery when returning genetic results to patients in the absence of traditional counseling. A related class of genetic testing with analogous issues of clinical utility and acceptance is pharmacogenetic testing, which interrogates sequence variants implicated in interindividual drug response variability. 
Although clinical pharmacogenetic testing has not previously been widely adopted, advances in rapid turnaround time genetic testing technology and the recent implementation of preemptive genotyping programs at selected medical centers suggest that personalized medicine through pharmacogenetics is now a reality. This review aims to summarize the current state of implementing genetic testing for personalized medicine, with an emphasis on clinical pharmacogenetic testing. PMID:25206309
Wu, Lucia R.; Chen, Sherry X.; Wu, Yalei; Patel, Abhijit A.; Zhang, David Yu
2018-01-01
Rare DNA-sequence variants hold important clinical and biological information, but existing detection techniques are expensive, complex, allele-specific, or don’t allow for significant multiplexing. Here, we report a temperature-robust polymerase-chain-reaction method, which we term blocker displacement amplification (BDA), that selectively amplifies all sequence variants, including single-nucleotide variants (SNVs), within a roughly 20-nucleotide window by 1,000-fold over wild-type sequences. This allows for easy detection and quantitation of hundreds of potential variants originally at ≤0.1% in allele frequency. BDA is compatible with inexpensive thermocycler instrumentation and employs a rationally designed competitive hybridization reaction to achieve comparable enrichment performance across annealing temperatures ranging from 56 °C to 64 °C. To show the sequence generality of BDA, we demonstrate enrichment of 156 SNVs and the reliable detection of single-digit copies. We also show that the BDA detection of rare driver mutations in cell-free DNA samples extracted from the blood plasma of lung-cancer patients is highly consistent with deep sequencing using molecular lineage tags, with a receiver operator characteristic accuracy of 95%. PMID:29805844
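To make the enrichment figure concrete, the following sketch works through the arithmetic for a variant that starts at 0.1% allele frequency and is amplified 1,000-fold relative to wild type. It assumes a deliberately simple model (variant copies scale by the fold factor, wild-type copies stay fixed); the function name `enriched_vaf` is hypothetical and this is not the authors' quantitation method.

```python
def enriched_vaf(initial_vaf, fold_enrichment):
    """Variant allele fraction after selective amplification.

    Simplifying assumption: variant templates are amplified
    `fold_enrichment` times more than wild-type templates.
    """
    variant = initial_vaf * fold_enrichment
    wild_type = 1.0 - initial_vaf
    return variant / (variant + wild_type)

# A variant starting at 0.1% with 1,000-fold enrichment ends up near 50%,
# which is why it becomes easy to detect by standard readouts.
print(f"{enriched_vaf(0.001, 1000):.4f}")
```

Under this model, even a 0.01% variant would rise to roughly 9% after the same enrichment, which illustrates why the starting frequency still matters.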
Suzuki, Masaharu; Ketterling, Matthew G; McCarty, Donald R
2005-09-01
We have developed a simple quantitative computational approach for objective analysis of cis-regulatory sequences in promoters of coregulated genes. The program, designated MotifFinder, identifies oligo sequences that are overrepresented in promoters of coregulated genes. We used this approach to analyze promoter sequences of Viviparous1 (VP1)/abscisic acid (ABA)-regulated genes and cold-regulated genes, respectively, of Arabidopsis (Arabidopsis thaliana). We detected significantly enriched sequences in up-regulated genes but not in down-regulated genes. This result suggests that gene activation but not repression is mediated by specific and common sequence elements in promoters. The enriched motifs include several known cis-regulatory sequences as well as previously unidentified motifs. With respect to known cis-elements, we dissected the flanking nucleotides of the core sequences of Sph element, ABA response elements (ABREs), and the C repeat/dehydration-responsive element. This analysis identified the motif variants that may correlate with qualitative and quantitative differences in gene expression. While both VP1 and cold responses are mediated in part by ABA signaling via ABREs, these responses correlate with unique ABRE variants distinguished by nucleotides flanking the ACGT core. ABRE and Sph motifs are tightly associated uniquely in the coregulated set of genes showing a strict dependence on VP1 and ABA signaling. Finally, analysis of distribution of the enriched sequences revealed a striking concentration of enriched motifs in a proximal 200-base region of VP1/ABA and cold-regulated promoters. Overall, each class of coregulated genes possesses a discrete set of the enriched motifs with unique distributions in their promoters that may account for the specificity of gene regulation.
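The general idea of finding oligos that are overrepresented in promoters of coregulated genes can be sketched as below. This is a generic k-mer enrichment test with a hypergeometric tail probability, not the MotifFinder program itself; the function names and the toy promoter sequences are invented for illustration.

```python
from math import comb

def kmers_present(seq, k):
    """Return the set of distinct k-mers occurring in one sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def hypergeom_tail(hits_fg, n_fg, hits_all, n_all):
    """P(X >= hits_fg) when n_fg promoters are drawn from n_all,
    of which hits_all contain the k-mer."""
    upper = min(n_fg, hits_all)
    total = comb(n_all, n_fg)
    return sum(comb(hits_all, x) * comb(n_all - hits_all, n_fg - x)
               for x in range(hits_fg, upper + 1)) / total

def enriched_kmers(foreground, background, k):
    """Rank k-mers seen in the foreground by over-representation,
    smallest p-value first."""
    fg_sets = [kmers_present(s, k) for s in foreground]
    all_sets = fg_sets + [kmers_present(s, k) for s in background]
    results = []
    for kmer in set().union(*fg_sets):
        hits_fg = sum(kmer in s for s in fg_sets)
        hits_all = sum(kmer in s for s in all_sets)
        p = hypergeom_tail(hits_fg, len(fg_sets), hits_all, len(all_sets))
        results.append((p, kmer, hits_fg, hits_all))
    return sorted(results)

# Toy data (made up): the coregulated promoters share an ACGT-core element.
coregulated = ["TTACGTGGCA", "GGACGTGTCC", "CAACGTGGAT"]
others = ["TTTTAAAACC", "GGGCCCAAAT", "CATCATCATG", "TGCATGCAAA"]
for p, kmer, hits_fg, hits_all in enriched_kmers(coregulated, others, 5)[:3]:
    print(kmer, hits_fg, hits_all, round(p, 4))
```

Counting each k-mer once per promoter, rather than counting every occurrence, keeps a single repeat-rich promoter from dominating the ranking. A real analysis would also need multiple-testing correction and both DNA strands.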
Linkage disequilibrium among commonly genotyped SNP and variants detected from bull sequence
USDA-ARS's Scientific Manuscript database
Genomic prediction utilizing causal variants could increase selection accuracy above that achieved with SNP genotyped by commercial assays. A number of variants detected from sequencing influential sires are likely to be causal, but noticeable improvements in prediction accuracy using imputed sequen...
Ionita-Laza, Iuliana; Ottman, Ruth
2011-11-01
The recent progress in sequencing technologies makes possible large-scale medical sequencing efforts to assess the importance of rare variants in complex diseases. The results of such efforts depend heavily on the use of efficient study designs and analytical methods. We introduce here a unified framework for association testing of rare variants in family-based designs or designs based on unselected affected individuals. This framework allows us to quantify the enrichment in rare disease variants in families containing multiple affected individuals and to investigate the optimal design of studies aiming to identify rare disease variants in complex traits. We show that for many complex diseases with small values for the overall sibling recurrence risk ratio, such as Alzheimer's disease and most cancers, sequencing affected individuals with a positive family history of the disease can be extremely advantageous for identifying rare disease variants. In contrast, for complex diseases with large values of the sibling recurrence risk ratio, sequencing unselected affected individuals may be preferable.
Matsuo, Kumihiro; Tanahashi, Yusuke; Mukai, Tokuo; Suzuki, Shigeru; Tajima, Toshihiro; Azuma, Hiroshi; Fujieda, Kenji
2016-07-01
Dual oxidase 2 (DUOX2) mutations are a cause of dyshormonogenesis (DH) and have been identified in patients with permanent congenital hypothyroidism (PH) and with transient hypothyroidism (TH). We aimed to elucidate the prevalence and phenotypical variations of DUOX2 mutations. Forty-eight Japanese DH patients were enrolled and analysed for sequence variants of DUOX2, DUOXA2, and TPO using polymerase chain reaction-amplified direct sequencing. Fourteen sequence variants of DUOX2, including 10 novel variants, were identified in 11 patients. DUOX2 variants were more prevalent (11/48, 22.9%) than TPO (3/48, 6.3%) (p=0.020). The prevalence of DUOX2 variants in TH was slightly, but not significantly, higher than in PH. Furthermore, one patient had digenic heterozygous sequence variants of both DUOX2 and TPO. Our results suggest that DUOX2 mutations might be the most common cause of both PH and TH, and that phenotypes of these mutations might be milder than those of other causes.
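The prevalence comparison (11/48 versus 3/48) can be reproduced approximately with an uncorrected Pearson chi-square test on a 2x2 table. The abstract does not name the test used, so this is an assumption; treating the two gene counts as independent groups is also a simplification, since both were measured in the same 48 patients. The function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom,
    using P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, erfc(sqrt(chi2 / 2))

# DUOX2: 11 with variants, 37 without; TPO: 3 with variants, 45 without.
chi2, p = chi_square_2x2(11, 37, 3, 45)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Under these assumptions the p-value comes out close to the reported 0.020; an exact or paired test would give a somewhat different value.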
Zolodz, Melissa D; Herberg, John T; Narepekha, Halyna E; Raleigh, Emily; Farber, Matthew R; Dufield, Robert L; Boyle, Denis M
2010-01-08
Obtaining sufficient amounts of pure glycoprotein variants to characterize their structures is an important goal in both functional biology and the biotechnology industry. We have developed preparative HIC conditions that resolve glycoform variants on the basis of overall carbohydrate content for a recombinant transferrin-exendin-4 fusion protein. The fusion protein was expressed from the yeast Saccharomyces cerevisiae from high density fermentation and is post-translationally modified with mannose sugars through O-glycosidic linkages. Overall hydrophobic behavior appeared to be dominated by the N-terminal 39 amino acids from the exendin-4 and linker peptide sequences as compared to the less hydrophobic behavior of human transferrin alone. In addition, using LC techniques that measure total glycans released from the pure protein combined with new high resolution technologies using mass spectrometry, we have determined the locations and chain lengths of mannose residues on specific peptides derived from tryptic maps of the transferrin-exendin-4 protein. Though the protein is large (80,488 Da) and contains 78 possible serine and threonine residues as potential sites for sugar addition, mannosylation was observed on only two tryptic peptides located within the first 55 amino acids of the N-terminus. These glycopeptides were highly heterogeneous and contained between 1 and 10 mannose residues scattered among the various serine and threonine sites which were identified by electron transfer dissociation mass spectrometry. Glycan sequences from 1 to 6 linear mannose residues were detected, but mannose chain lengths of 3 or 4 were more common and formed 80% of the total oligosaccharides. This work introduces new technological capabilities for the purification and characterization of glycosylated variants of therapeutic recombinant proteins. Copyright 2009 Elsevier B.V. All rights reserved.
GETPrime: a gene- or transcript-specific primer database for quantitative real-time PCR
Gubelmann, Carine; Gattiker, Alexandre; Massouras, Andreas; Hens, Korneel; David, Fabrice; Decouttere, Frederik; Rougemont, Jacques; Deplancke, Bart
2011-01-01
The vast majority of genes in humans and other organisms undergo alternative splicing, yet the biological function of splice variants is still very poorly understood in large part because of the lack of simple tools that can map the expression profiles and patterns of these variants with high sensitivity. High-throughput quantitative real-time polymerase chain reaction (qPCR) is an ideal technique to accurately quantify nucleic acid sequences including splice variants. However, currently available primer design programs do not distinguish between splice variants and also differ substantially in overall quality, functionality or throughput mode. Here, we present GETPrime, a primer database supported by a novel platform that uniquely combines and automates several features critical for optimal qPCR primer design. These include the consideration of all gene splice variants to enable either gene-specific (covering the majority of splice variants) or transcript-specific (covering one splice variant) expression profiling, primer specificity validation, automated best primer pair selection according to strict criteria and graphical visualization of the latter primer pairs within their genomic context. GETPrime primers have been extensively validated experimentally, demonstrating high transcript specificity in complex samples. Thus, the free-access, user-friendly GETPrime database allows fast primer retrieval and visualization for genes or groups of genes of most common model organisms, and is available at http://updepla1srv1.epfl.ch/getprime/. Database URL: http://deplanckelab.epfl.ch. PMID:21917859
Iqbal, Zafar; Püttmann, Lucia; Musante, Luciana; Razzaq, Attia; Zahoor, Muhammad Yasir; Hu, Hao; Wienker, Thomas F; Garshasbi, Masoud; Fattahi, Zohreh; Gilissen, Christian; Vissers, Lisenka ELM; de Brouwer, Arjan PM; Veltman, Joris A; Pfundt, Rolph; Najmabadi, Hossein; Ropers, Hans-Hilger; Riazuddin, Sheikh; Kahrizi, Kimia; van Bokhoven, Hans
2016-01-01
AIMP1/p43 is a multifunctional non-catalytic component of the multisynthetase complex. The complex consists of nine catalytic and three non-catalytic proteins, which catalyze the ligation of amino acids to their cognate tRNA isoacceptors for use in protein translation. To date, two allelic variants in the AIMP1 gene have been reported as the underlying cause of autosomal recessive primary neurodegenerative disorder. Here, we present two consanguineous families from Pakistan and Iran, presenting with moderate to severe intellectual disability, global developmental delay, and speech impairment without neurodegeneration. By the combination of homozygosity mapping and next generation sequencing, we identified two homozygous missense variants, p.(Gly299Arg) and p.(Val176Gly), in the gene AIMP1 that co-segregated with the phenotype in the respective families. Molecular modeling of the variants revealed deleterious effects on the protein structure that are predicted to result in reduced AIMP1 function. Our findings indicate that the clinical spectrum for AIMP1 defects is broader than witnessed so far. PMID:26173967
Yoast, Sienna; Adams, Robin M.; Mainzer, Stanley E.; Moon, Keith; Palombella, Anthony L.; Schmidt, Brian F.
1994-01-01
A method is described for generating and screening variants of the β-galactosidase from Lactobacillus delbrueckii subsp. bulgaricus sensitive to several environmental stresses, with potential application in the food industry. Chemical mutagenesis with hydroxylamine or methoxylamine was performed on the β-galactosidase gene carried on an Escherichia coli expression vector. Mutants sensitive to cold, heat, low pH, low magnesium concentration, and the presence of urea were isolated by screening for reduced color development on β-galactosidase indicator plates. The mutations responsible for three variant β-galactosidases were localized, and the base substitutions were determined by DNA sequencing. The amino acid alterations associated with one low-pH-sensitive (pHs) and two urea-sensitive (Us) variants correspond to P584L (pHs1), G400S/R479Q (Us26), and G167E/E168K/E363K/V492M (Us17), respectively. Mutant pHs1 is also heat, cold, low magnesium, and urea sensitive; Us26 is also cold sensitive; and Us17 is also low-pH sensitive. PMID:16349230
Deep whole-genome sequencing of 90 Han Chinese genomes.
Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen
2017-09-01
Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. 
Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects. © The Authors 2017. Published by Oxford University Press.
Analysis of CHRNA7 rare variants in autism spectrum disorder susceptibility.
Bacchelli, Elena; Battaglia, Agatino; Cameli, Cinzia; Lomartire, Silvia; Tancredi, Raffaella; Thomson, Susanne; Sutcliffe, James S; Maestrini, Elena
2015-04-01
Chromosome 15q13.3 recurrent microdeletions are causally associated with a wide range of phenotypes, including autism spectrum disorder (ASD), seizures, intellectual disability, and other psychiatric conditions. Whether the reciprocal microduplication is pathogenic is less certain. CHRNA7, encoding for the alpha7 subunit of the neuronal nicotinic acetylcholine receptor, is considered the likely culprit gene in mediating neurological phenotypes in 15q13.3 deletion cases. To assess if CHRNA7 rare variants confer risk to ASD, we performed copy number variant analysis and Sanger sequencing of the CHRNA7 coding sequence in a sample of 135 ASD cases. Sequence variation in this gene remains largely unexplored, given the existence of a fusion gene, CHRFAM7A, which includes a nearly identical partial duplication of CHRNA7. Hence, attempts to sequence coding exons must distinguish between CHRNA7 and CHRFAM7A, making next-generation sequencing approaches unreliable for this purpose. A CHRNA7 microduplication was detected in a patient with autism and moderate cognitive impairment; while no rare damaging variants were identified in the coding region, we detected rare variants in the promoter region, previously described to functionally reduce transcription. This study represents the first sequence variant analysis of CHRNA7 in a sample of idiopathic autism. © 2015 Wiley Periodicals, Inc.
Entry kinetics and mouse virulence of Ross River virus mutants altered in neutralization epitopes.
Vrati, S; Kerr, P J; Weir, R C; Dalgarno, L
1996-03-01
Previously we identified the locations of three neutralization epitopes (a, b1 and b2) of Ross River virus (RRV) by sequencing a number of variants resistant to monoclonal antibody neutralization which were found to have single amino acid substitutions in the E2 protein (S. Vrati, C.A. Fernon, L. Dalgarno, and R.C. Weir, Virology 162:346-353, 1988). We have now studied the biological properties of these variants in BHK cells and their virulence in mice. While variants altered in epitopes a and/or b1 showed no difference, variants altered in epitope b2, including a triple variant altered in epitopes a, b1, and b2, showed rapid penetration but retarded kinetics of growth and RNA and protein synthesis in BHK cells compared with RRV T48, the parent virus. Variants altered in epitopes a and/or b1 showed no change in mouse virulence. However, two of the six epitope b2 variants examined had attenuated mouse virulence. They had a four- to fivefold-higher 50% lethal dose (LD50), although no change in the average survival time of infected mice was observed. These variants grew to titers in mouse tissues similar to those of RRV T48. The LD50 of the triple variant was unchanged, but infected mice had an increased average survival time. This variant produced lower levels of viremia in infected mice. On the basis of these findings we propose that both the receptor binding site and neutralization epitopes of RRV are nearby or in the same domain of the E2 protein.
Johnson, Ryan C; Hu, Heidi Q; Merrell, D Scott; Maroney, Michael J
2015-04-01
Helicobacter pylori requires urease activity in order to survive in the acid environment of the human stomach. Urease is regulated in part by nickelation, a process that requires the HypA protein, which is a putative nickel metallochaperone that is generally associated with hydrogenase maturation. However, in H. pylori, HypA plays a dual role. In addition to an N-terminal nickel binding site, HypA proteins also contain a structural zinc site that is coordinated by two rigorously conserved CXXC sequences, which in H. pylori are flanked by His residues. These structural Zn sites are known to be dynamic, converting from Zn(Cys)4 centers at pH 7.2 to Zn(Cys)2(His)2 centers at pH 6.3 in the presence of Ni(ii) ions. In this study, mutant strains of H. pylori that express zinc site variants of the HypA protein are used to show that the structural changes in the zinc site are important for the acid viability of the bacterium, and that a reduction in acid viability in these variants can be traced in large measure to deficient urease activity. This in turn leads to a model that connects the Zn(Cys)4 coordination to urease maturation.
Novel nonsense mutation in the katA gene of a catalase-negative Staphylococcus aureus strain.
Lagos, Jaime; Alarcón, Pedro; Benadof, Dona; Ulloa, Soledad; Fasce, Rodrigo; Tognarelli, Javier; Aguayo, Carolina; Araya, Pamela; Parra, Bárbara; Olivares, Berta; Hormazábal, Juan Carlos; Fernández, Jorge
2016-01-01
We report the first description of a rare catalase-negative strain of Staphylococcus aureus in Chile. This new variant was isolated from blood and synovial tissue samples of a pediatric patient. Sequencing analysis revealed that this catalase-negative strain is related to ST10 strain, which has earlier been described in relation to S. aureus carriers. Interestingly, sequence analysis of the catalase gene katA revealed presence of a novel nonsense mutation that causes premature translational truncation of the C-terminus of the enzyme leading to a loss of 222 amino acids. Our study suggests that loss of catalase activity in this rare catalase-negative Chilean strain is due to this novel nonsense mutation in the katA gene, which truncates the enzyme to just 283 amino acids. Copyright © 2015 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
Molecular mechanisms for protein-encoded inheritance
Wiltzius, Jed J. W.; Landau, Meytal; Nelson, Rebecca; Sawaya, Michael R.; Apostol, Marcin I.; Goldschmidt, Lukasz; Soriaga, Angela B.; Cascio, Duilio; Rajashankar, Kanagalaghatta; Eisenberg, David
2013-01-01
Strains are phenotypic variants, encoded by nucleic acid sequences in chromosomal inheritance and by protein “conformations” in prion inheritance and transmission. But how is a protein “conformation” stable enough to endure transmission between cells or organisms? Here new polymorphic crystal structures of segments of prion and other amyloid proteins offer structural mechanisms for prion strains. In packing polymorphism, prion strains are encoded by alternative packings (polymorphs) of β-sheets formed by the same segment of a protein; in a second mechanism, segmental polymorphism, prion strains are encoded by distinct β-sheets built from different segments of a protein. Both forms of polymorphism can produce enduring “conformations,” capable of encoding strains. These molecular mechanisms for transfer of information into prion strains share features with the familiar mechanism for transfer of information by nucleic acid inheritance, including sequence specificity and recognition by non-covalent bonds. PMID:19684598
Stokowy, Tomasz; Garbulowski, Mateusz; Fiskerstrand, Torunn; Holdhus, Rita; Labun, Kornel; Sztromwasser, Pawel; Gilissen, Christian; Hoischen, Alexander; Houge, Gunnar; Petersen, Kjell; Jonassen, Inge; Steen, Vidar M
2016-10-01
The search for causative genetic variants in rare diseases of presumed monogenic inheritance has been boosted by the implementation of whole exome (WES) and whole genome (WGS) sequencing. In many cases, WGS seems to be superior to WES, but the analysis and visualization of the vast amounts of data are demanding. To meet this challenge, we have developed a new tool, RareVariantVis, for analysis of genome sequence data (including non-coding regions) for both germ line and somatic variants. It visualizes variants along their respective chromosomes, providing information about exact chromosomal position, zygosity and frequency, with point-and-click information regarding dbSNP IDs, gene association and variant inheritance. Rare variants as well as de novo variants can be flagged in different colors. We show the performance of the RareVariantVis tool in the Genome in a Bottle WGS data set. Availability: https://www.bioconductor.org/packages/3.3/bioc/html/RareVariantVis.html Contact: tomasz.stokowy@k2.uib.no Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Lim, Hassol; Park, Young-Mi; Lee, Jong-Keuk; Taek Lim, Hyun
2016-10-01
Objective: To present an efficient and successful application of a single-exome sequencing study in a family clinically diagnosed with X-linked retinitis pigmentosa. Design: Exome sequencing study based on clinical examination data. Participants: An 8-year-old proband and his family. Methods: The proband and his family members underwent comprehensive ophthalmologic examinations. Exome sequencing was undertaken in the proband using the Agilent SureSelect Human All Exon Kit and the Illumina HiSeq 2000 platform. Bioinformatic analysis used the Illumina pipeline with Burrows-Wheeler Aligner-Genome Analysis Toolkit (BWA-GATK), followed by ANNOVAR to perform variant functional annotation. All variants passing filter criteria were validated by Sanger sequencing to confirm familial segregation. Results: Analysis of exome sequence data identified a novel frameshift mutation in the RP2 gene resulting in a premature stop codon (c.665delC, p.Pro222fsTer237). Sanger sequencing revealed that this mutation co-segregated with the disease phenotype in the child's family. Conclusions: We identified a novel causative mutation in RP2 from analysis of a single proband's exome sequence data. This study highlights the effectiveness of whole-exome sequencing in the genetic diagnosis of X-linked retinitis pigmentosa, compared with conventional sequencing methods. Even with a single exome, exome sequencing can pinpoint pathogenic variant(s) for X-linked retinitis pigmentosa when properly applied with the aid of an adequate variant filtering strategy. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, YanHua; Li, AiHua; Yang, Z.Q.
Cell death-inducing DNA fragmentation factor-α-like effector b (CIDEb) is a member of the CIDE family of apoptosis-inducing factors. CIDEa and CIDEc have been reported to be lipid droplet (LD)-associated proteins that promote atypical LD fusion in adipocytes and are responsible for liver steatosis under fasting and obese conditions, whereas CIDEb promotes lipid storage under normal diet conditions [1] and promotes the formation of triacylglyceride-enriched VLDL particles in hepatocytes [2]. Here, we report the gene cloning, chromosome mapping, tissue distribution, genetic expression analysis, and identification of a novel splicing variant of the porcine CIDEb gene. Sequence analysis shows that the open reading frame of the normal porcine CIDEb isoform covers 660 bp and encodes a 219-amino acid polypeptide, whereas its alternative splicing variant encodes a 142-amino acid polypeptide truncated at the fourth exon and comprising the CIDE-N domain and part of the CIDE-C domain. The deduced amino acid sequence of normal porcine CIDEb shows 85.8% similarity to the human protein and 80.0% to the mouse protein. The CIDEb genomic sequence spans approximately 6 kb and comprises five exons and four introns. Radiation hybrid mapping demonstrated that porcine CIDEb is located at chromosome 7q21, at a distance of 57 cR from the most significantly linked marker, S0334, in a region that is syntenic with the corresponding region in the human genome. Tissue expression analysis indicated that normal CIDEb mRNA is ubiquitously expressed in many porcine tissues. It was highly expressed in white adipose tissue and was observed at relatively high levels in the liver, lung, small intestine, lymphatic tissue and brain. The normal version of CIDEb was the predominant form in all tested tissues, whereas the splicing variant was expressed at low levels in all examined tissues except the lymphatic tissue. Furthermore, genetic expression analysis indicated that CIDEb mRNA levels were significantly higher in the white adipose tissue of lean pigs than in their obese counterparts, in contrast to porcine CIDEa and CIDEc [3]. We therefore speculate that CIDEb may play a contrary role to the other CIDEs. The basic molecular information we provide here will be useful for further investigations of the physiological function of the gene, which will be helpful in better understanding the role of the CIDE family in lipid metabolism in pig models.
Circular permutant GFP insertion folding reporters
Waldo, Geoffrey S [Santa Fe, NM]; Cabantous, Stephanie [Los Alamos, NM]
2008-06-24
Provided are methods of assaying and improving protein folding using circular permutants of fluorescent proteins, including circular permutants of GFP variants and combinations thereof. The invention further provides various nucleic acid molecules and vectors incorporating such nucleic acid molecules, comprising polynucleotides encoding fluorescent protein circular permutants derived from superfolder GFP, which polynucleotides include an internal cloning site into which a heterologous polynucleotide may be inserted in-frame with the circular permutant coding sequence, and which when expressed are capable of reporting on the degree to which a polypeptide encoded by such an inserted heterologous polynucleotide is correctly folded by correlation with the degree of fluorescence exhibited.
Circular permutant GFP insertion folding reporters
Waldo, Geoffrey S; Cabantous, Stephanie
2013-02-12
Provided are methods of assaying and improving protein folding using circular permutants of fluorescent proteins, including circular permutants of GFP variants and combinations thereof. The invention further provides various nucleic acid molecules and vectors incorporating such nucleic acid molecules, comprising polynucleotides encoding fluorescent protein circular permutants derived from superfolder GFP, which polynucleotides include an internal cloning site into which a heterologous polynucleotide may be inserted in-frame with the circular permutant coding sequence, and which when expressed are capable of reporting on the degree to which a polypeptide encoded by such an inserted heterologous polynucleotide is correctly folded by correlation with the degree of fluorescence exhibited.
Circular permutant GFP insertion folding reporters
Waldo, Geoffrey S [Santa Fe, NM]; Cabantous, Stephanie [Los Alamos, NM]
2011-06-14
Provided are methods of assaying and improving protein folding using circular permutants of fluorescent proteins, including circular permutants of GFP variants and combinations thereof. The invention further provides various nucleic acid molecules and vectors incorporating such nucleic acid molecules, comprising polynucleotides encoding fluorescent protein circular permutants derived from superfolder GFP, which polynucleotides include an internal cloning site into which a heterologous polynucleotide may be inserted in-frame with the circular permutant coding sequence, and which when expressed are capable of reporting on the degree to which a polypeptide encoded by such an inserted heterologous polynucleotide is correctly folded by correlation with the degree of fluorescence exhibited.
Circular permutant GFP insertion folding reporters
Waldo, Geoffrey S.; Cabantous, Stephanie
2013-04-16
Provided are methods of assaying and improving protein folding using circular permutants of fluorescent proteins, including circular permutants of GFP variants and combinations thereof. The invention further provides various nucleic acid molecules and vectors incorporating such nucleic acid molecules, comprising polynucleotides encoding fluorescent protein circular permutants derived from superfolder GFP, which polynucleotides include an internal cloning site into which a heterologous polynucleotide may be inserted in-frame with the circular permutant coding sequence, and which when expressed are capable of reporting on the degree to which a polypeptide encoded by such an inserted heterologous polynucleotide is correctly folded by correlation with the degree of fluorescence exhibited.
Cantalupo, Paul G.; Katz, Joshua P.
2015-01-01
ABSTRACT We searched The Cancer Genome Atlas (TCGA) database for viruses by comparing non-human reads present in transcriptome sequencing (RNA-Seq) and whole-exome sequencing (WXS) data to viral sequence databases. Human papillomavirus 18 (HPV18) is an etiologic agent of cervical cancer, and as expected, we found robust expression of HPV18 genes in cervical cancer samples. In agreement with previous studies, we also found HPV18 transcripts in non-cervical cancer samples, including those from the colon, rectum, and normal kidney. However, in each of these cases, HPV18 gene expression was low, and single-nucleotide variants and positions of genomic alignments matched the integrated portion of HPV18 present in HeLa cells. Chimeric reads that match a known virus-cell junction of HPV18 integrated in HeLa cells were also present in some samples. We hypothesize that HPV18 sequences in these non-cervical samples are due to nucleic acid contamination from HeLa cells. This finding highlights the problems that contamination presents in computational virus detection pipelines. IMPORTANCE Viruses associated with cancer can be detected by searching tumor sequence databases. Several studies involving searches of the TCGA database have reported the presence of HPV18, a known cause of cervical cancer, in a small number of additional cancers, including those of the rectum, kidney, and colon. We have determined that the sequences related to HPV18 in non-cervical samples are due to nucleic acid contamination from HeLa cells. To our knowledge, this is the first report of the misidentification of viruses in next-generation sequencing data of tumors due to contamination with a cancer cell line. These results raise awareness of the difficulty of accurately identifying viruses in human sequence databases. PMID:25631090
Optimized molecular design of ADAPT-based HER2-imaging probes labelled with 111In and 68Ga.
Lindbo, Sarah; Garousi, Javad; Mitran, Bogdan; Vorobyeva, Anzhelika; Oroujeni, Maryam; Orlova, Anna; Hober, Sophia; Tolmachev, Vladimir
2018-06-04
Radionuclide molecular imaging is a promising tool for visualization of cancer-associated molecular abnormalities in vivo and stratification of patients for specific therapies. ADAPTs are a new type of small engineered protein based on the scaffold of an albumin binding domain of protein G. ADAPTs have been utilized to select and develop high affinity binders to different proteinaceous targets. ADAPT6 binds to human epidermal growth factor receptor 2 (HER2) with low nanomolar affinity and can be used for its in vivo visualization. Molecular design of 111In-labeled anti-HER2 ADAPT has been optimized in several earlier studies. In this study, we made a direct comparison of two of the most promising variants, having either a DEAVDANS or a (HE)3DANS sequence at the N-terminus, conjugated with a maleimido derivative of DOTA to a GSSC amino acid sequence at the C-terminus. The variants (designated DOTA-C59-DEAVDANS-ADAPT6-GSSC and DOTA-C61-(HE)3DANS-ADAPT6-GSSC) were stably labeled with 111In for SPECT and 68Ga for PET. Biodistribution of labeled ADAPT variants was evaluated in nude mice bearing human tumor xenografts with different levels of HER2 expression. Both variants enabled clear discrimination between tumors with high and low levels of HER2 expression. 111In-labeled ADAPT6 derivatives provided higher tumor-to-organ ratios compared to 68Ga-labeled counterparts. The best performing variant was DOTA-C61-(HE)3DANS-ADAPT6-GSSC, providing tumor-to-blood ratios of 208±36 and 109±17 at 3 h for 111In and 68Ga labels, respectively.
Telomere biology and telomerase mutations in cirrhotic patients with hepatocellular carcinoma
Alves-Paiva, Raquel M.; Podlevsky, Joshua D.; Logeswaran, Dhenugen; Santana, Barbara A.; Teixeira, Andreza C.; Chen, Julian J.-L.; Calado, Rodrigo T.; Martinelli, Ana L. C.
2017-01-01
Telomeres are repetitive DNA sequences at linear chromosome termini, protecting chromosomes against end-to-end fusion and damage, providing chromosomal stability. Telomeres shorten with mitotic cellular division, but are maintained in cells with high proliferative capacity by telomerase. Loss-of-function mutations in telomere-maintenance genes are genetic risk factors for cirrhosis development in humans and murine models. Telomerase deficiency provokes accelerated telomere shortening and dysfunction, facilitating genomic instability and oncogenesis. Here we examined whether telomerase mutations and telomere shortening were associated with hepatocellular carcinoma (HCC) secondary to cirrhosis. Telomere length of peripheral blood leukocytes was measured by Southern blot and qPCR in 120 patients with HCC associated with cirrhosis and 261 healthy subjects. HCC patients were screened for telomerase gene variants (in TERT and TERC) by Sanger sequencing. Age-adjusted telomere length was comparable between HCC patients and healthy subjects by both Southern blot and qPCR. Four non-synonymous TERT heterozygous variants were identified in four unrelated patients, resulting in a significantly higher mutation carrier frequency (3.3%) in patients as compared to controls (p = 0.02). Three of the four variants (T726M, A1062T, and V1090M) were previously observed in patients with other telomere diseases (severe aplastic anemia, acute myeloid leukemia, and cirrhosis). A novel TERT variant, A243V, was identified in a 65-year-old male with advanced HCC and cirrhosis secondary to chronic hepatitis C virus (HCV) and alcohol ingestion, but direct assay measurements in vitro did not detect modulation of telomerase enzymatic activity or processivity. In summary, constitutional variants resulting in amino acid changes in the telomerase reverse transcriptase were found in a small proportion of patients with cirrhosis-associated HCC. PMID:28813500
The Variant p.(Arg183Trp) in SPTLC2 Causes Late-Onset Hereditary Sensory Neuropathy.
Suriyanarayanan, Saranya; Auranen, Mari; Toppila, Jussi; Paetau, Anders; Shcherbii, Maria; Palin, Eino; Wei, Yu; Lohioja, Tarja; Schlotter-Weigel, Beate; Schön, Ulrike; Abicht, Angela; Rautenstrauss, Bernd; Tyynismaa, Henna; Walter, Maggie C; Hornemann, Thorsten; Ylikallio, Emil
2016-03-01
Hereditary sensory and autonomic neuropathy 1 (HSAN1) is an autosomal dominant disorder that can be caused by variants in SPTLC1 or SPTLC2, encoding subunits of serine palmitoyl-CoA transferase. Disease variants alter the enzyme's substrate specificity and lead to accumulation of neurotoxic 1-deoxysphingolipids. We describe two families with autosomal dominant HSAN1C caused by a new variant in SPTLC2, c.547C>T, p.(Arg183Trp). The variant changed a conserved amino acid and was not found in public variant databases. All patients had a relatively mild progressive distal sensory impairment, with onset after age 50. Small fibers were affected early, leading to abnormalities on quantitative sensory testing. Sural biopsy revealed a severe chronic axonal neuropathy with subtotal loss of myelinated axons, a relatively preserved number of non-myelinated fibers and no signs of regeneration. Skin biopsy with PGP9.5 labeling showed lack of intraepidermal nerve endings early in the disease. Motor manifestations developed later in the disease course, but there was no evidence of autonomic involvement. Patients had elevated serum 1-deoxysphingolipids, and the variant protein produced elevated amounts of 1-deoxysphingolipids in vitro, which proved the pathogenicity of the variant. Our results expand the genetic spectrum of HSAN1C and provide further detail about the clinical characteristics. Sequencing of SPTLC2 should be considered in all patients presenting with mild late-onset sensory-predominant small or large fiber neuropathy.
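The coding-DNA and protein descriptions in the abstract above (c.547C>T and p.(Arg183Trp)) can be cross-checked with simple codon arithmetic. The following is a minimal sketch, assuming standard 1-based coding-DNA numbering in which position 1 is the A of the start codon; the function name `codon_of` is hypothetical and not taken from the study.

```python
def codon_of(cdna_pos: int) -> tuple[int, int]:
    """Map a 1-based coding-DNA position to (codon number, position within codon).

    Assumes position 1 is the first base of the start codon, so codon n
    covers coding positions 3n-2 .. 3n.
    """
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions are 1-based")
    codon_number = (cdna_pos - 1) // 3 + 1
    position_in_codon = (cdna_pos - 1) % 3 + 1
    return codon_number, position_in_codon


# c.547 falls on the first base of codon 183, matching p.(Arg183Trp).
print(codon_of(547))  # (183, 1)
```

A C>T change at the first base of an arginine codon yields tryptophan only if the reference codon is CGG (CGG to TGG), so the two descriptions are mutually consistent under that assumption.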
Criscione, Andrea; Cunsolo, Vincenzo; Tumino, Serena; Di Francesco, Antonella; Bordonaro, Salvatore; Muccilli, Vera; Saletti, Rosaria; Marletta, Donata
2018-06-01
In recent years, donkey milk has attracted renewed interest as a potential functional food and a breast milk substitute. In this light, the study of its protein composition assumes an important role. In particular, β-lactoglobulin (β-LG), which is considered one of the main allergenic milk proteins, consists in the donkey of two molecular forms, namely β-LG I and β-LG II. In the present research, a genetic analysis coupled with a proteomic approach showed the presence of a new allele, here named F, which is apparently associated with a null or a severely reduced expression of β-LG II protein. The new β-LG II F genetic variant shows a theoretical average mass (Mav) of 18,310.64 Da, a value practically corresponding with that of the variant D (∆ mass < 0.07 Da), but differs from β-LG II D by two amino acid substitutions: Thr100 (variant F) → Ala100 (variant D) and Thr118 (variant F) → Met118 (variant D). Proteomic investigation of the whey protein fraction of an individual milk sample, homozygous FF at the β-LG II locus, allowed identification of the new β-LG II F genetic variant as a very minor component. By MS/MS analysis of enzymatic digests, the sequence of β-LG II F was characterized, and the predicted genomic data were confirmed.
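The near-identical average masses of variants F and D follow from the two substitutions almost cancelling each other. The sketch below illustrates the arithmetic; the average residue masses are commonly tabulated values supplied here as an assumption (they are not given in the abstract), and the names `AVG_RESIDUE_MASS`, `F_TO_D` and `mass_shift` are hypothetical.

```python
# Commonly tabulated average residue masses in Da (assumed values,
# not taken from the abstract).
AVG_RESIDUE_MASS = {"Ala": 71.0788, "Thr": 101.1051, "Met": 131.1926}

# Substitutions going from variant F to variant D: (residue in F, residue in D).
F_TO_D = [("Thr", "Ala"), ("Thr", "Met")]


def mass_shift(substitutions):
    """Sum the average-mass change over a list of (old, new) residue pairs."""
    return sum(AVG_RESIDUE_MASS[new] - AVG_RESIDUE_MASS[old]
               for old, new in substitutions)


delta = mass_shift(F_TO_D)
# Thr->Ala loses about 30.03 Da and Thr->Met gains about 30.09 Da,
# so the net shift stays below the 0.07 Da bound quoted in the abstract.
print(f"{delta:.4f} Da")
```

With these values the net difference is roughly 0.06 Da, which is why the two variants cannot be told apart by intact average mass alone and sequence-level MS/MS evidence is needed.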
Liu, Yong; Cao, Yu; Li, Yaxiong; Lei, Dongyun; Li, Lin; Hou, Zong Liu; Han, Shen; Meng, Mingyao; Shi, Jianlin; Zhang, Yayong; Wang, Yi; Niu, Zhaoyi; Xie, Yanhua; Xiao, Benshan; Wang, Yuanfei; Li, Xiao; Yang, Lirong
2018-01-01
Background Recently, mutations in several genes have been described to be associated with sporadic atrial septal defect (ASD), but some genetic variants remain to be identified. The aim of this study was to use whole-exome sequencing (WES) combined with bioinformatics analysis to identify novel genetic variants in cases of sporadic congenital ASD, followed by validation by Sanger sequencing. Material/Methods Five Han patients with secundum ASD were recruited, and their tissue samples were analyzed by WES, followed by verification by Sanger sequencing of tissue and blood samples. Further evaluation using blood samples included 452 additional patients with sporadic secundum ASD (212 male and 240 female patients) and 519 healthy subjects (252 male and 267 female subjects) for further verification by a multiplexed MassARRAY system. Bioinformatic analyses were performed to identify novel genetic variants associated with sporadic ASD. Results From five patients with sporadic ASD, a total of 181,762 genomic variants in 33 exon loci, validated by Sanger sequencing, were selected and underwent MassARRAY analysis in 452 patients with ASD and 519 healthy subjects. Three loci with high mutation frequencies, the 138665410 FOXL2 gene variant, the 23862952 MYH6 gene variant, and the 71098693 HYDIN gene variant were found to be significantly associated with sporadic ASD (P<0.05); variants in FOXL2 and MYH6 were found in patients with isolated, sporadic ASD (P<5×10^-4). Conclusions This was the first study that demonstrated variants in FOXL2 and HYDIN associated with sporadic ASD, and supported the use of WES and bioinformatics analysis to identify disease-associated mutations. PMID:29505555
Liu, Yong; Cao, Yu; Li, Yaxiong; Lei, Dongyun; Li, Lin; Hou, Zong Liu; Han, Shen; Meng, Mingyao; Shi, Jianlin; Zhang, Yayong; Wang, Yi; Niu, Zhaoyi; Xie, Yanhua; Xiao, Benshan; Wang, Yuanfei; Li, Xiao; Yang, Lirong; Wang, Wenju; Jiang, Lihong
2018-03-05
BACKGROUND Recently, mutations in several genes have been described to be associated with sporadic atrial septal defect (ASD), but some genetic variants remain to be identified. The aim of this study was to use whole-exome sequencing (WES) combined with bioinformatics analysis to identify novel genetic variants in cases of sporadic congenital ASD, followed by validation by Sanger sequencing. MATERIAL AND METHODS Five Han patients with secundum ASD were recruited, and their tissue samples were analyzed by WES, followed by verification by Sanger sequencing of tissue and blood samples. Further evaluation using blood samples included 452 additional patients with sporadic secundum ASD (212 male and 240 female patients) and 519 healthy subjects (252 male and 267 female subjects) for further verification by a multiplexed MassARRAY system. Bioinformatic analyses were performed to identify novel genetic variants associated with sporadic ASD. RESULTS From five patients with sporadic ASD, a total of 181,762 genomic variants in 33 exon loci, validated by Sanger sequencing, were selected and underwent MassARRAY analysis in 452 patients with ASD and 519 healthy subjects. Three loci with high mutation frequencies, the 138665410 FOXL2 gene variant, the 23862952 MYH6 gene variant, and the 71098693 HYDIN gene variant were found to be significantly associated with sporadic ASD (P<0.05); variants in FOXL2 and MYH6 were found in patients with isolated, sporadic ASD (P<5×10^-4). CONCLUSIONS This was the first study that demonstrated variants in FOXL2 and HYDIN associated with sporadic ASD, and supported the use of WES and bioinformatics analysis to identify disease-associated mutations.
Pausch, Hubert; Wurmser, Christine; Reinhardt, Friedrich; Emmerling, Reiner; Fries, Ruedi
2015-06-01
Most association studies for pinpointing trait-associated variants are performed within breed. The availability of sequence data from key ancestors of several cattle breeds now enables immediate assessment of the frequency of trait-associated variants in populations different from the mapping population and their imputation into large validation populations. The objective of this study was to validate the effects of 4 putatively causative variants on milk production traits, male fertility, and stature in German Fleckvieh and Holstein-Friesian animals using targeted sequence imputation. We used whole-genome sequence data of 456 animals to impute 4 missense mutations in DGAT1, GHR, PRLR, and PROP1 into 10,363 Fleckvieh and 8,812 Holstein animals. The accuracy of the imputed genotypes exceeded 95% for all variants. Association testing with imputed variants revealed consistent antagonistic effects of the DGAT1 p.A232K and GHR p.F279Y variants on milk yield and protein and fat contents, respectively, in both breeds. The allele frequency of both polymorphisms has changed considerably in the past 20 yr, indicating that they were targets of recent selection for milk production traits. The PRLR p.S18N variant was associated with yield traits in Fleckvieh but not in Holstein, suggesting that it may be in linkage disequilibrium with a mutation affecting yield traits rather than being causal. The reported effects of the PROP1 p.H173R variant on milk production, male fertility, and stature could not be confirmed. Our results demonstrate that population-wide imputation of candidate causal variants from sequence data is feasible, enabling their rapid validation in large independent populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Al-Bustan, Suzanne A; Al-Serri, Ahmad; Annice, Babitha G; Alnaqeeb, Majed A; Al-Kandari, Wafa Y; Dashti, Mohammed
2018-01-01
The role interethnic genetic differences play in plasma lipid level variation across populations is a global health concern. Several genes involved in lipid metabolism and transport are strong candidates for genetic association with lipid level variation, especially lipoprotein lipase (LPL). The objective of this study was to re-sequence the full LPL gene in Kuwaiti Arabs, analyse the sequence variation and identify variants that could contribute to variation in plasma lipid levels for further genetic association. Samples (n = 100) of an Arab ethnic group from Kuwait were analysed for sequence variation by Sanger sequencing across the 30 Kb LPL gene and its flanking sequences. A total of 293 variants including 252 single nucleotide polymorphisms (SNPs) and 39 insertions/deletions (InDels) were identified among which 47 variants (32 SNPs and 15 InDels) were novel to Kuwaiti Arabs. This study is the first to report sequence data and analysis of frequencies of variants at the LPL gene locus in an Arab ethnic group with a novel "rare" variant (LPL:g.18704C>A) significantly associated with HDL (B = -0.181; 95% CI (-0.357, -0.006); p = 0.043), TG (B = 0.134; 95% CI (0.004-0.263); p = 0.044) and VLDL (B = 0.131; 95% CI (-0.001-0.263); p = 0.043) levels. Sequence variation in Kuwaiti Arabs was compared to other populations and was found to be similar with regard to the number of SNPs, InDels and distribution of the number of variants across the LPL gene locus and minor allele frequency (MAF). Moreover, comparison of the identified variants and their MAF with other reports provided a list of 46 potential variants across the LPL gene to be considered for future genetic association studies. The findings warrant further investigation into the association of g.18704C>A with lipid levels in other ethnic groups and with clinical manifestations of dyslipidemia.
Al-Serri, Ahmad; Annice, Babitha G.; Alnaqeeb, Majed A.; Al-Kandari, Wafa Y.; Dashti, Mohammed
2018-01-01
The role interethnic genetic differences play in plasma lipid level variation across populations is a global health concern. Several genes involved in lipid metabolism and transport are strong candidates for genetic association with lipid level variation, especially lipoprotein lipase (LPL). The objective of this study was to re-sequence the full LPL gene in Kuwaiti Arabs, analyse the sequence variation and identify variants that could contribute to variation in plasma lipid levels for further genetic association. Samples (n = 100) of an Arab ethnic group from Kuwait were analysed for sequence variation by Sanger sequencing across the 30 Kb LPL gene and its flanking sequences. A total of 293 variants including 252 single nucleotide polymorphisms (SNPs) and 39 insertions/deletions (InDels) were identified among which 47 variants (32 SNPs and 15 InDels) were novel to Kuwaiti Arabs. This study is the first to report sequence data and analysis of frequencies of variants at the LPL gene locus in an Arab ethnic group with a novel “rare” variant (LPL:g.18704C>A) significantly associated with HDL (B = -0.181; 95% CI (-0.357, -0.006); p = 0.043), TG (B = 0.134; 95% CI (0.004–0.263); p = 0.044) and VLDL (B = 0.131; 95% CI (-0.001–0.263); p = 0.043) levels. Sequence variation in Kuwaiti Arabs was compared to other populations and was found to be similar with regard to the number of SNPs, InDels and distribution of the number of variants across the LPL gene locus and minor allele frequency (MAF). Moreover, comparison of the identified variants and their MAF with other reports provided a list of 46 potential variants across the LPL gene to be considered for future genetic association studies. The findings warrant further investigation into the association of g.18704C>A with lipid levels in other ethnic groups and with clinical manifestations of dyslipidemia. PMID:29438437
Vinciguerra, Margherita; Passarello, Cristina; Cassarà, Filippo; Leto, Filippo; Cannata, Monica; Crivello, Anna; Di Salvo, Veronica; Maggio, Aurelio; Giambona, Antonino
2016-08-01
A 59-year-old Italian woman came to our center for re-evaluation of a previous diagnosis of polycythemia vera. The patient presented with a lifelong history of polycythemia, no increase in white blood cells (WBCs) and platelets, and a negative bone marrow biopsy. Analysis of hemoglobin (Hb) fractions showed an abnormal fast-moving Hb component. We aimed to determine if this variant was the cause of polycythemia in this patient. A complete blood count (CBC) was performed by an automated cell counter and Hb fractions were determined by high performance liquid chromatography (HPLC). Standard stability tests and oxygen affinity evaluation were also performed. Genomic DNA was extracted from peripheral blood leukocytes using the phenol chloroform method and the entire β-globin gene was analyzed by direct sequencing. At the hematological level, no anemia or hemolysis was observed but an abnormal Hb fraction was detected using cation exchange HPLC. Molecular analysis of the β-globin gene showed heterozygosity for an AAG > ACG substitution at codon 144, resulting in a Lys→Thr amino acid replacement. We demonstrated that this is a new Hb variant with increased oxygen affinity. Its altered physiology is caused by the reduction of 2,3-diphosphoglycerate (2,3-DPG) effects, due to an amino acid substitution in the central pocket near the C-terminal of the β chain. We named this new variant Hb San Cataldo after the native city of the proband.
GWASeq: targeted re-sequencing follow up to GWAS.
Salomon, Matthew P; Li, Wai Lok Sibon; Edlund, Christopher K; Morrison, John; Fortini, Barbara K; Win, Aung Ko; Conti, David V; Thomas, Duncan C; Duggan, David; Buchanan, Daniel D; Jenkins, Mark A; Hopper, John L; Gallinger, Steven; Le Marchand, Loïc; Newcomb, Polly A; Casey, Graham; Marjoram, Paul
2016-03-03
For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases, 899 controls. Using marginal tests of associations, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.
de Beer, Tjaart A P; Laskowski, Roman A; Parks, Sarah L; Sipos, Botond; Goldman, Nick; Thornton, Janet M
2013-01-01
The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans.
2010-02-01
specimen-specific viral gene sequences as determinants of virus type, A/HN subtype, virulence, host-range, and resistance to antiviral agents . Citation...Agriculture, Agriculture Research Service (USDA-ARS) Southeast Poultry Research Laboratory (SEPRL, Athens, GA) selected specimens from its reference...Flu assay as agents of flu-like illness are identified in Supplemental Information Figure S1. A single total nucleic acid preparation from a single
USDA-ARS's Scientific Manuscript database
Fine-mapping of causal variants is becoming feasible for complex traits in livestock GWAS, as an increasing number of animals are sequenced. Imputation has been routinely applied to ascertain sequence variants in large genotyped populations based on small reference populations of sequenced animals. ...
USDA-ARS's Scientific Manuscript database
Major whole genome sequencing projects promise to identify rare and causal variants within livestock species; however, the efficient selection of animals for sequencing remains a major problem within these surveys. The goal of this project was to develop a library of high accuracy genetic variants f...
USDA-ARS's Scientific Manuscript database
Imputation has been routinely applied to ascertain sequence variants in large genotyped populations based on reference populations of sequenced animals. With the implementation of the 1000 Bull Genomes Project and increasing numbers of animals sequenced, fine-mapping of causal variants is becoming f...
Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie
2018-01-01
Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202
Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan
2017-01-01
PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV), from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome, with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
Barbato, Ersilia; Traversa, Alice; Guarnieri, Rosanna; Giovannetti, Agnese; Genovesi, Maria Luce; Magliozzi, Maria Rosa; Paolacci, Stefano; Ciolfi, Andrea; Pizzi, Simone; Di Giorgio, Roberto; Tartaglia, Marco; Pizzuti, Antonio; Caputo, Viviana
2018-07-01
The aim of this study was the clinical and molecular characterization of a family segregating a trait consisting of a phenotype specifically involving the maxillary canines, including agenesis, impaction and ectopic eruption, characterized by incomplete penetrance and variable expressivity. Clinical standardized assessment of 14 family members and whole-exome sequencing (WES) of three affected subjects were performed. WES data analyses (sequence alignment, variant calling, annotation and prioritization) were carried out using an in-house implemented pipeline. Variant filtering retained coding and splice-site high quality private and rare variants. Variant prioritization was performed taking into account both the disruptive impact and the biological relevance of individual variants and genes. Sanger sequencing was performed to validate the variants of interest and to carry out segregation analysis. Prioritization of variants "by function" allowed the identification of multiple variants contributing to the trait, including two concomitant heterozygous variants in EDARADD (c.308C>T, p.Ser103Phe) and COL5A1 (c.1588G>A, p.Gly530Ser), specifically associated with a more severe phenotype (i.e. canine agenesis). In contrast, heterozygous variants in genes encoding proteins with a role in the WNT pathway were shared by subjects showing a phenotype of impacted/ectopic erupted canines. This study characterized the genetic contribution underlying a complex trait consisting of isolated canine anomalies in a medium-sized family, highlighting the role of WNT and EDA cell signaling pathways in tooth development. Copyright © 2018 Elsevier Ltd. All rights reserved.
Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K
2016-01-01
Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.
Leung, Ross Ka-Kit; Dong, Zhi Qiang; Sa, Fei; Chong, Cheong Meng; Lei, Si Wan; Tsui, Stephen Kwok-Wing; Lee, Simon Ming-Yuen
2014-02-01
Minor variants have significant implications in quasispecies evolution, early cancer detection and non-invasive fetal genotyping but their accurate detection by next-generation sequencing (NGS) is hampered by sequencing errors. We generated sequencing data from mixtures at predetermined ratios in order to provide insight into sequencing errors and variations that can arise for which simulation cannot be performed. The information also enables better parameterization in depth of coverage, read quality and heterogeneity, library preparation techniques, technical repeatability for mathematical modeling, theory development and simulation experimental design. We devised minor variant authentication rules that achieved 100% accuracy in both testing and validation experiments. The rules are free from tedious inspection of alignment accuracy, sequencing read quality or errors introduced by homopolymers. The authentication processes only require minor variants to: (1) have minimum depth of coverage larger than 30; (2) be reported by (a) four or more variant callers, or (b) DiBayes or LoFreq, plus SNVer (or BWA when no results are returned by SNVer), and with the interassay coefficient of variation (CV) no larger than 0.1. Quantification accuracy undermined by sequencing errors could neither be overcome by ultra-deep sequencing, nor recruiting more variant callers to reach a consensus, such that consistent underestimation and overestimation (i.e. low CV) were observed. To accommodate stochastic error and adjust the observed ratio within a specified accuracy, we presented a proof of concept for the use of a double calibration curve for quantification, which provides an important reference towards potential industrial-scale fabrication of calibrants for NGS.
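The authentication rules quoted in the abstract above can be sketched as a small predicate. This is one possible reading of the rule text, not the authors' implementation: it assumes that the coefficient of variation (CV) condition applies only to branch (b), that the CV is the sample standard deviation divided by the mean of the variant fractions measured across assays, and that all field and function names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import FrozenSet, Sequence


@dataclass(frozen=True)
class MinorVariantCall:
    min_depth: int                        # lowest depth of coverage observed
    callers: FrozenSet[str]               # tools that reported the variant
    replicate_fractions: Sequence[float]  # variant fraction in each assay
    snver_returned_results: bool = True   # False if SNVer gave no output


def interassay_cv(fractions: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean (assumed definition)."""
    if len(fractions) < 2:
        return float("inf")  # CV is undefined without replicate assays
    m = mean(fractions)
    return float("inf") if m == 0 else stdev(fractions) / m


def is_authenticated(call: MinorVariantCall, max_cv: float = 0.1) -> bool:
    # Rule (1): minimum depth of coverage larger than 30.
    if call.min_depth <= 30:
        return False
    # Rule (2a): reported by four or more variant callers.
    if len(call.callers) >= 4:
        return True
    # Rule (2b): DiBayes or LoFreq, plus SNVer (or BWA when SNVer returned
    # no results), with an interassay CV no larger than max_cv.
    has_primary = bool(call.callers & {"DiBayes", "LoFreq"})
    partner = "SNVer" if call.snver_returned_results else "BWA"
    return (
        has_primary
        and partner in call.callers
        and interassay_cv(call.replicate_fractions) <= max_cv
    )


# Illustrative call with made-up numbers (not data from the study).
example = MinorVariantCall(
    min_depth=120,
    callers=frozenset({"LoFreq", "SNVer"}),
    replicate_fractions=(0.0100, 0.0105, 0.0098),
)
print(is_authenticated(example))
```

If the CV condition is instead meant to apply to both branches, the check would move ahead of the rule (2a) return; the abstract's wording leaves this open.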
Ramas, Viviana; Mirazo, Santiago; Bonilla, Sylvia; Ruchansky, Dora; Arbiza, Juan
2018-05-15
This study aims to investigate the HPV16 variant distribution by sequence analyses of E6, E7 oncogenes and the Long Control Region (LCR), from cervical cells collected from Uruguayan women, and to reconstruct the phylogenetic relationships among variants. Forty-seven HPV16 variants, obtained from women with HSIL, LSIL, ASCUS and NILM cytological classes, were analyzed for LCR and 12 were further studied for E6 and E7. Detailed sequence comparison, genetic heterogeneity analyses and phylogenetic reconstruction were performed. A high variability was observed among LCR sequences, which were distributed in 18 different variants. E6 and E7 sequences exhibited novel non-synonymous substitutions. Uruguayan sequences mainly belonged to the European lineage, and only 5 sequences clustered in non-European branches; 3 of them in the Asian-American and North-American lineage and 2 in an African branch. Additionally, 6 new variants from European and African clusters were identified. HPV16 isolates mainly belonged to the European lineage, though strains from African and Asian-American lineages were also identified. Herein is reported for the first time the distribution and molecular characterization of HPV16 variants from Uruguay, providing novel insights on the molecular epidemiology of this infectious disease in South America. The high variability among HPV16 isolates, which mainly belonged to the European lineage, provides an extensive sequence dataset from a country with a high burden of cervical cancer. Copyright © 2018 Elsevier B.V. All rights reserved.
McClure, Matthew C.; Bickhart, Derek; Null, Dan; VanRaden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B.; Van Tassell, Curtis P.; Sonstegard, Tad S.
2014-01-01
The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array. PMID:24667746
Whole Exome Sequencing Identifies Rare Protein-Coding Variants in Behçet's Disease.
Ognenovski, Mikhail; Renauer, Paul; Gensterblum, Elizabeth; Kötter, Ina; Xenitidis, Theodoros; Henes, Jörg C; Casali, Bruno; Salvarani, Carlo; Direskeneli, Haner; Kaufman, Kenneth M; Sawalha, Amr H
2016-05-01
Behçet's disease (BD) is a systemic inflammatory disease with an incompletely understood etiology. Despite the identification of multiple common genetic variants associated with BD, rare genetic variants have been less explored. We undertook this study to investigate the role of rare variants in BD by performing whole exome sequencing in BD patients of European descent. Whole exome sequencing was performed in a discovery set comprising 14 German BD patients of European descent. For replication and validation, Sanger sequencing and Sequenom genotyping were performed in the discovery set and in 2 additional independent sets of 49 German BD patients and 129 Italian BD patients of European descent. Genetic association analysis was then performed in BD patients and 503 controls of European descent. Functional effects of associated genetic variants were assessed using bioinformatic approaches. Using whole exome sequencing, we identified 77 rare variants (in 74 genes) with predicted protein-damaging effects in BD. These variants were genotyped in 2 additional patient sets and then analyzed to reveal significant associations with BD at 2 genetic variants detected in all 3 patient sets that remained significant after Bonferroni correction. We detected genetic association between BD and LIMK2 (rs149034313), involved in regulating cytoskeletal reorganization, and between BD and NEIL1 (rs5745908), involved in base excision DNA repair (P = 3.22 × 10⁻⁴ and P = 5.16 × 10⁻⁴, respectively). The LIMK2 association is a missense variant with predicted protein damage that may influence functional interactions with proteins involved in cytoskeletal regulation by Rho GTPase, inflammation mediated by chemokine and cytokine signaling pathways, T cell activation, and angiogenesis (Bonferroni-corrected P = 5.63 × 10⁻¹⁴, P = 7.29 × 10⁻⁶, P = 1.15 × 10⁻⁵, and P = 6.40 × 10⁻³, respectively). The genetic association in NEIL1 is a predicted splice donor variant that may introduce a deleterious intron retention and result in a noncoding transcript variant. We used whole exome sequencing in BD for the first time and identified 2 rare putative protein-damaging genetic variants associated with this disease. These genetic variants might influence cytoskeletal regulation and DNA repair mechanisms in BD and might provide further insight into increased leukocyte tissue infiltration and the role of oxidative stress in BD. © 2016, American College of Rheumatology.
Tria, Antje; Hiort, Olaf; Sinnecker, Gernot H G
2004-01-01
Defects in the steroid 5alpha-reductase type 2 (SRD5A2) activity cause decreased formation of dihydrotestosterone (DHT) from testosterone (T), resulting in defective masculinization of external genitalia; the T/DHT ratio is increased. We investigated 10 patients with elevated T/DHT ratios in whom mutations in the SRD5A2 and AR genes had been excluded to find out whether structural alterations of the SRD5A1 gene could contribute to their genital malformations. Single-strand conformation polymorphism analysis and direct sequencing were used to detect variations in the SRD5A1 gene of the patients and of 49 adult fertile men who served as controls. The sequence analysis of exon 3 of the SRD5A1 gene indicated an adenine-to-guanine change (ACA vs. ACG), both triplets encoding the amino acid residue threonine. The ACG sequence was detected in 57% of all subjects and was equally distributed in patients and controls. The T/DHT ratio was significantly higher in controls with the ACG variant as compared with those having the ACA variant. However, no particular sequence aberration was found in the SRD5A1 genes of either group. Mutant SRD5A1 isoenzyme does not seem to play a crucial role in the development of hypospadias. Copyright 2004 S. Karger AG, Basel
Hwang, Kyu-Baek; Lee, In-Hee; Park, Jin-Ho; Hambuch, Tina; Choe, Yongjoon; Kim, MinHyeok; Lee, Kyungjoon; Song, Taemin; Neu, Matthew B; Gupta, Neha; Kohane, Isaac S; Green, Robert C; Kong, Sek Won
2014-08-01
As whole genome sequencing (WGS) uncovers variants associated with rare and common diseases, an immediate challenge is to minimize false-positive findings due to sequencing and variant calling errors. False positives can be reduced by combining results from orthogonal sequencing methods, but this is costly. Here, we present variant filtering approaches using logistic regression (LR) and ensemble genotyping to minimize false positives without sacrificing sensitivity. We evaluated the methods using paired WGS datasets of an extended family prepared using two sequencing platforms and a validated set of variants in NA12878. Using LR or ensemble genotyping based filtering, false-negative rates were significantly reduced by 1.1- to 17.8-fold at the same levels of false discovery rates (5.4% for heterozygous and 4.5% for homozygous single nucleotide variants (SNVs); 30.0% for heterozygous and 18.7% for homozygous insertions; 25.2% for heterozygous and 16.6% for homozygous deletions) compared to the filtering based on genotype quality scores. Moreover, ensemble genotyping excluded > 98% (105,080 of 107,167) of false positives while retaining > 95% (897 of 937) of true positives in de novo mutation (DNM) discovery in NA12878, and performed better than a consensus method using two sequencing platforms. Our proposed methods were effective in prioritizing phenotype-associated variants, and ensemble genotyping would be essential to minimize false-positive DNM candidates. © 2014 WILEY PERIODICALS, INC.
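The de novo mutation counts reported in the abstract above can be checked with a few lines of arithmetic. The sketch below only recomputes ratios from the four counts given in the text; the derived candidate-set precision is an illustration that assumes every candidate is either a true or a false positive, and it is not a figure reported by the authors.

```python
# Counts quoted in the abstract for de novo mutation (DNM) discovery in NA12878.
false_pos_total, false_pos_excluded = 107_167, 105_080
true_pos_total, true_pos_retained = 937, 897

# Fractions behind the "> 98%" and "> 95%" statements.
fp_excluded_fraction = false_pos_excluded / false_pos_total
tp_retained_fraction = true_pos_retained / true_pos_total

# Derived illustration (an assumption, not a reported result): share of
# remaining candidates that are true positives, before and after filtering.
false_pos_remaining = false_pos_total - false_pos_excluded
precision_before = true_pos_total / (true_pos_total + false_pos_total)
precision_after = true_pos_retained / (true_pos_retained + false_pos_remaining)

print(f"false positives excluded: {fp_excluded_fraction:.2%}")
print(f"true positives retained:  {tp_retained_fraction:.2%}")
print(f"precision before filtering: {precision_before:.2%}")
print(f"precision after filtering:  {precision_after:.2%}")
```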
Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J
2016-11-01
Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11):1119-1127.
Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. 
Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. 
Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. 
Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. 
Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. 
David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne
2014-01-01
Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775
Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J
2014-02-06
Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
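The "burden of rare or low-frequency variants" mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which burden test or frequency cutoff the authors used, so the function name `burden_scores`, the 5% cutoff, and the toy genotypes below are all assumptions made purely for illustration: for each individual, the alternate-allele counts at the rare sites of one gene are collapsed into a single score that could then be tested against LDL-C.

```python
from typing import Dict, List


def burden_scores(
    genotypes: Dict[str, List[int]],
    maf: List[float],
    max_maf: float = 0.05,  # assumed cutoff, not taken from the abstract
) -> Dict[str, int]:
    """Collapse rare variants in one gene into a per-individual burden score.

    genotypes maps a sample ID to alternate-allele counts (0, 1 or 2),
    one entry per variant site; maf holds the minor allele frequency
    of each site in the same order.
    """
    # Keep only the sites at or below the frequency cutoff.
    rare_sites = [i for i, freq in enumerate(maf) if freq <= max_maf]
    # The burden is the total number of rare alternate alleles carried.
    return {
        sample: sum(calls[i] for i in rare_sites)
        for sample, calls in genotypes.items()
    }


# Hypothetical toy data: three sites, the first one common.
toy_genotypes = {"s1": [0, 1, 2], "s2": [1, 0, 0]}
toy_maf = [0.20, 0.01, 0.03]
scores = burden_scores(toy_genotypes, toy_maf)
```

A collapsed score of this kind trades per-variant resolution for power, which is one common motivation for gene-level tests when individual variants are too rare to test on their own.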
Rare variants in RTEL1 are associated with familial interstitial pneumonia.
Cogan, Joy D; Kropski, Jonathan A; Zhao, Min; Mitchell, Daphne B; Rives, Lynette; Markin, Cheryl; Garnett, Errine T; Montgomery, Keri H; Mason, Wendi R; McKean, David F; Powers, Julia; Murphy, Elissa; Olson, Lana M; Choi, Leena; Cheng, Dong-Sheng; Blue, Elizabeth Marchani; Young, Lisa R; Lancaster, Lisa H; Steele, Mark P; Brown, Kevin K; Schwarz, Marvin I; Fingerlin, Tasha E; Schwartz, David A; Lawson, William E; Loyd, James E; Zhao, Zhongming; Phillips, John A; Blackwell, Timothy S
2015-03-15
Up to 20% of cases of idiopathic interstitial pneumonia cluster in families, comprising the syndrome of familial interstitial pneumonia (FIP); however, the genetic basis of FIP remains uncertain in most families. We sought to determine whether new disease-causing rare genetic variants could be identified using whole-exome sequencing of affected members from FIP families, providing additional insights into disease pathogenesis. Affected subjects from 25 kindreds were selected from an ongoing FIP registry for whole-exome sequencing from genomic DNA. Candidate rare variants were confirmed by Sanger sequencing, and cosegregation analysis was performed in families, followed by additional sequencing of affected individuals from another 163 kindreds. We identified a potentially damaging rare variant in the gene encoding regulator of telomere elongation helicase 1 (RTEL1) that segregated with disease and was associated with very short telomeres in peripheral blood mononuclear cells in 1 of 25 families in our original whole-exome sequencing cohort. Evaluation of affected individuals in 163 additional kindreds revealed another eight families (4.7%) with heterozygous rare variants in RTEL1 that segregated with clinical FIP. Probands and unaffected carriers of these rare variants had short telomeres (<10% for age) in peripheral blood mononuclear cells and increased T-circle formation, suggesting impaired RTEL1 function. Rare loss-of-function variants in RTEL1 represent a newly defined genetic predisposition for FIP, supporting the importance of telomere-related pathways in pulmonary fibrosis.
Xia, Z.; Patino, R.; Gale, W.L.; Maule, A.G.; Densmore, L.D.
1999-01-01
We obtained two channel catfish estrogen receptor (ccER) cDNA from liver of female fish using RT–PCR. The two fragments were identical in sequence except that the smaller one had an out-of-frame deletion in the E domain, suggesting the existence of ccER splice variants. The larger fragment was used to screen a cDNA library from liver of a prepubescent female. A cDNA was obtained that encoded a 581-amino-acid ER with a deduced molecular weight of 63.8 kDa. Extracts of COS-7 cells transfected with ccER cDNA bound estrogen with high affinity (Kd = 4.7 nM) and specificity. Maximum parsimony and Neighbor Joining analyses were used to generate a phylogenetic classification of ccER on the basis of 18 full-length ER sequences. The tree suggested the existence of two major ER branches. One branch contained two clearly divergent clades which included all piscine ER (except Japanese eel ER) and all tetrapod ERα, respectively. The second major branch contained the eel ER and the mammalian ERβ. The high degree of divergence between the eel ER and mammalian ERβ suggested that they also represent distinct piscine and tetrapod ER. These data suggest that ERα and ERβ are present throughout vertebrates and that these two major ER types evolved by duplication of an ancestral ER gene. Sequence alignments with other members of the nuclear hormone receptor superfamily indicated the presence of 8 amino acids in the E domain that align exclusively among ER. Four of these amino acids have not received prior research attention and their function is unknown. The novel finding of putative ER splice variants in a nonmammalian vertebrate and the novel phylogenetic classification of ER offer new perspectives in understanding the diversification and function of ER.
Pillai, Suja; Gopalan, Vinod; Lo, Chung Y; Liew, Victor; Smith, Robert A; Lam, Alfred King Y
2017-02-01
The goal of this pilot study was to develop a customized, cost-effective amplicon panel (Ampliseq) for target sequencing in a cohort of patients with sporadic phaeochromocytoma/paraganglioma. Phaeochromocytoma/paragangliomas from 25 patients were analysed by a targeted next-generation sequencing approach using an Ion Torrent PGM instrument. Primers for 15 target genes (NF1, RET, VHL, SDHA, SDHB, SDHC, SDHD, SDHAF2, TMEM127, MAX, MEN1, KIF1Bβ, EPAS1, CDKN2 and PHD2) were designed using Ion AmpliSeq Designer. Ion Reporter software and Ingenuity® Variant Analysis™ software (www.ingenuity.com/variants) from Ingenuity Systems were used to analyse the results. Overall, 713 variants were identified. The variants identified by Ion Reporter ranged from 64 to 161 per patient. Single nucleotide variants (SNV) were the most common. Further annotation with Ingenuity Variant Analysis revealed that 29 of these 713 variants were deletions. Of these, six variants were non-pathogenic and four were likely to be pathogenic. The remaining 19 variants were of uncertain significance. The most frequently altered gene in the cohort was KIF1B, followed by NF1. A novel KIF1B pathogenic variant, c.3375+1G>A, was identified. The mutation was noted in a patient with clinically confirmed neurofibromatosis. Chromosome 1 carried the largest number of variants. Targeted next-generation sequencing is a sensitive method for detecting genetic changes in patients with phaeochromocytoma/paraganglioma. The precise detection of these genetic changes helps in understanding the pathogenesis of these tumours. Copyright © 2016 Elsevier Inc. All rights reserved.
Al-Allaf, Faisal A; Athar, Mohammad; Abduljaleel, Zainularifeen; Taher, Mohiuddin M; Khan, Wajahatullah; Ba-Hammam, Faisal A; Abalkhail, Hala; Alashwal, Abdullah
2015-07-01
Familial hypercholesterolemia (FH) is an autosomal dominant inherited disease characterized by elevated plasma low-density lipoprotein cholesterol (LDL-C). It is caused by variants in Ldlr, ApoB or Pcsk9, which result in high levels of LDL-C leading to early coronary heart disease. Whole-genome sequencing is not suitable for screening FH variants because of its high cost. Hence, in this study we performed targeted customized sequencing of 12 FH genes (Ldlr, ApoB, Pcsk9, Abca1, Apoa2, Apoc3, Apon2, Arh, Ldlrap1, Apoc2, ApoE, and Lpl) that have been implicated in the homozygous phenotype of a proband pedigree, to identify candidate variants by NGS on the Ion Torrent PGM. Only three genes (Ldlr, ApoB, and Pcsk9) were found to be highly associated with FH based on the variant rate. The results showed that seven deleterious variants in the Ldlr, ApoB, and Pcsk9 genes were pathological and clinically significant based on SIFT and PolyPhen predictions. Targeted customized sequencing is an efficient technique for screening variants among targeted FH genes. Final validation of the seven deleterious variants by capillary sequencing confirmed only one novel variant, located in exon 14 of the Ldlr gene (c.2026delG, p.Gly676fs). This novel Ldlr variant was heterozygous and was derived from a male in the proband pedigree. Copyright © 2015 Elsevier B.V. All rights reserved.
Genomic Rearrangements in Arabidopsis Considered as Quantitative Traits.
Imprialou, Martha; Kahles, André; Steffen, Joshua G; Osborne, Edward J; Gan, Xiangchao; Lempe, Janne; Bhomra, Amarjit; Belfield, Eric; Visscher, Anne; Greenhalgh, Robert; Harberd, Nicholas P; Goram, Richard; Hein, Jotun; Robert-Seilaniantz, Alexandre; Jones, Jonathan; Stegle, Oliver; Kover, Paula; Tsiantis, Miltos; Nordborg, Magnus; Rätsch, Gunnar; Clark, Richard M; Mott, Richard
2017-04-01
To understand the population genetics of structural variants and their effects on phenotypes, we developed an approach to mapping structural variants that segregate in a population sequenced at low coverage. We avoid calling structural variants directly. Instead, the evidence for a potential structural variant at a locus is indicated by variation in the counts of short reads that map anomalously to that locus. These structural variant traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. Association between a structural variant trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3×) population sequence data from 488 recombinant inbred Arabidopsis thaliana genomes, we identified 6502 segregating structural variants. Remarkably, 25% of these were transpositions. While many structural variants cannot be delineated precisely, we validated 83% of 44 predicted transposition breakpoints by polymerase chain reaction. We show that specific structural variants may be causative for quantitative trait loci for germination and resistance to infection by the fungus Albugo laibachii, isolate Nc14. Further, we show that the phenotypic heritability attributable to read-mapping anomalies differs from, and, in the case of time to germination and bolting, exceeds that due to standard genetic variation. Genes within structural variants are also more likely to be silenced or dysregulated. This approach complements the prevalent strategy of structural variant discovery in fewer individuals sequenced at high coverage. It is generally applicable to large populations sequenced at low coverage, and is particularly suited to mapping transpositions. Copyright © 2017 by the Genetics Society of America.
Process of labeling specific chromosomes using recombinant repetitive DNA
Moyzis, R.K.; Meyne, J.
1988-02-12
Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
A survey of tools for variant analysis of next-generation genome sequencing data
Pabinger, Stephan; Dander, Andreas; Fischer, Maria; Snajder, Rene; Sperk, Michael; Efremova, Mirjana; Krabichler, Birgit; Speicher, Michael R.; Zschocke, Johannes
2014-01-01
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provides a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers. PMID:23341494
Johnson, Ben; Lowe, Gillian C.; Futterer, Jane; Lordkipanidzé, Marie; MacDonald, David; Simpson, Michael A.; Sanchez-Guiú, Isabel; Drake, Sian; Bem, Danai; Leo, Vincenzo; Fletcher, Sarah J.; Dawood, Ban; Rivera, José; Allsup, David; Biss, Tina; Bolton-Maggs, Paula HB; Collins, Peter; Curry, Nicola; Grimley, Charlotte; James, Beki; Makris, Mike; Motwani, Jayashree; Pavord, Sue; Talks, Katherine; Thachil, Jecko; Wilde, Jonathan; Williams, Mike; Harrison, Paul; Gissen, Paul; Mundell, Stuart; Mumford, Andrew; Daly, Martina E.; Watson, Steve P.; Morgan, Neil V.
2016-01-01
Inherited thrombocytopenias are a heterogeneous group of disorders characterized by abnormally low platelet counts which can be associated with abnormal bleeding. Next-generation sequencing has previously been employed in these disorders for the confirmation of suspected genetic abnormalities, and more recently in the discovery of novel disease-causing genes. However, its full potential has not yet been exploited. Over the past 6 years we have sequenced the exomes from 55 patients, including 37 index cases and 18 additional family members, all of whom were recruited to the UK Genotyping and Phenotyping of Platelets study. All patients had inherited or sustained thrombocytopenia of unknown etiology with platelet counts varying from 11 × 10^9/L to 186 × 10^9/L. Of the 51 patients phenotypically tested, 37 (73%) had an additional secondary qualitative platelet defect. Using whole exome sequencing analysis we have identified “pathogenic” or “likely pathogenic” variants in 46% (17/37) of our index patients with thrombocytopenia. In addition, we report variants of uncertain significance in 12 index cases, including novel candidate genetic variants in previously unreported genes in four index cases. These results demonstrate that whole exome sequencing is an efficient method for elucidating potential pathogenic genetic variants in inherited thrombocytopenia. Whole exome sequencing also has the added benefit of discovering potentially pathogenic genetic variants for further study in novel genes not previously implicated in inherited thrombocytopenia. PMID:27479822
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.
Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles
2015-07-01
The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
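The query limits stated in this abstract (up to five ranges, up to 50 nucleotide positions per range) could be checked on the client side before a request is sent. The sketch below is only an illustration of those two limits: the helper name `validate_ranges` and the representation of a range as an inclusive `(start, end)` pair are assumptions, and the code does not call or model the server's actual API.

```python
from typing import List, Tuple

MAX_RANGES = 5                 # "up to five ranges of nucleotides"
MAX_POSITIONS_PER_RANGE = 50   # "50 nucleotide positions per range"


def validate_ranges(ranges: List[Tuple[int, int]]) -> None:
    """Raise ValueError if a query exceeds the limits stated in the abstract.

    Each range is a hypothetical inclusive (start, end) pair of
    nucleotide numbers taken from a 3D structure.
    """
    if not ranges:
        raise ValueError("at least one range is required")
    if len(ranges) > MAX_RANGES:
        raise ValueError(f"too many ranges: {len(ranges)} > {MAX_RANGES}")
    for start, end in ranges:
        if start > end:
            raise ValueError(f"range start {start} is after end {end}")
        width = end - start + 1  # inclusive on both ends
        if width > MAX_POSITIONS_PER_RANGE:
            raise ValueError(
                f"range {start}-{end} spans {width} positions, "
                f"more than {MAX_POSITIONS_PER_RANGE}"
            )


def is_valid(ranges: List[Tuple[int, int]]) -> bool:
    """Convenience wrapper that returns a boolean instead of raising."""
    try:
        validate_ranges(ranges)
    except ValueError:
        return False
    return True
```

Checking locally in this way avoids a round trip for queries that would exceed the stated limits; the exact way the server numbers and delimits ranges should be taken from its own documentation.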
Inferring Short-Range Linkage Information from Sequencing Chromatograms
Beggel, Bastian; Neumann-Fraune, Maria; Kaiser, Rolf; Verheyen, Jens; Lengauer, Thomas
2013-01-01
Direct Sanger sequencing of viral genome populations yields multiple ambiguous sequence positions. It is not straightforward to derive linkage information from sequencing chromatograms, which in turn hampers the correct interpretation of the sequence data. We present a method for determining the variants existing in a viral quasispecies in the case of two nearby ambiguous sequence positions by exploiting the effect of sequence context-dependent incorporation of dideoxynucleotides. The computational model was trained on data from sequencing chromatograms of clonal variants and was evaluated on two test sets of in vitro mixtures. The approach achieved high accuracies in identifying the mixture components of 97.4% on a test set in which the positions to be analyzed are only one base apart from each other, and of 84.5% on a test set in which the ambiguous positions are separated by three bases. In silico experiments suggest two major limitations of our approach in terms of accuracy. First, due to a basic limitation of Sanger sequencing, it is not possible to reliably detect minor variants with a relative frequency of no more than 10%. Second, the model cannot distinguish between mixtures of two or four clonal variants, if one of two sets of linear constraints is fulfilled. Furthermore, the approach requires repetitive sequencing of all variants that might be present in the mixture to be analyzed. Nevertheless, the effectiveness of our method on the two in vitro test sets shows that short-range linkage information of two ambiguous sequence positions can be inferred from Sanger sequencing chromatograms without any further assumptions on the mixture composition. Additionally, our model provides new insights into the established and widely used Sanger sequencing technology. The source code of our method is made available at http://bioinf.mpi-inf.mpg.de/publications/beggel/linkageinformation.zip. PMID:24376502
Common and rare variants associated with kidney stones and biochemical traits
Oddsson, Asmundur; Sulem, Patrick; Helgason, Hannes; Edvardsson, Vidar O.; Thorleifsson, Gudmar; Sveinbjörnsson, Gardar; Haraldsdottir, Eik; Eyjolfsson, Gudmundur I.; Sigurdardottir, Olof; Olafsson, Isleifur; Masson, Gisli; Holm, Hilma; Gudbjartsson, Daniel F.; Thorsteinsdottir, Unnur; Indridason, Olafur S.; Palsson, Runolfur; Stefansson, Kari
2015-01-01
Kidney stone disease is a complex disorder with a strong genetic component. We conducted a genome-wide association study of 28.3 million sequence variants detected through whole-genome sequencing of 2,636 Icelanders that were imputed into 5,419 kidney stone cases, including 2,172 cases with a history of recurrent kidney stones, and 279,870 controls. We identify sequence variants associating with kidney stones at ALPL (rs1256328[T], odds ratio (OR)=1.21, P=5.8 × 10^-10) and a suggestive association at CASR (rs7627468[A], OR=1.16, P=2.0 × 10^-8). Focusing our analysis on coding sequence variants in 63 genes with preferential kidney expression we identify two rare missense variants SLC34A1 p.Tyr489Cys (OR=2.38, P=2.8 × 10^-5) and TRPV5 p.Leu530Arg (OR=3.62, P=4.1 × 10^-5) associating with recurrent kidney stones. We also observe associations of the identified kidney stone variants with biochemical traits in a large population set, indicating potential biological mechanism. PMID:26272126
Common and rare variants associated with kidney stones and biochemical traits.
Oddsson, Asmundur; Sulem, Patrick; Helgason, Hannes; Edvardsson, Vidar O; Thorleifsson, Gudmar; Sveinbjörnsson, Gardar; Haraldsdottir, Eik; Eyjolfsson, Gudmundur I; Sigurdardottir, Olof; Olafsson, Isleifur; Masson, Gisli; Holm, Hilma; Gudbjartsson, Daniel F; Thorsteinsdottir, Unnur; Indridason, Olafur S; Palsson, Runolfur; Stefansson, Kari
2015-08-14
Kidney stone disease is a complex disorder with a strong genetic component. We conducted a genome-wide association study of 28.3 million sequence variants detected through whole-genome sequencing of 2,636 Icelanders that were imputed into 5,419 kidney stone cases, including 2,172 cases with a history of recurrent kidney stones, and 279,870 controls. We identify sequence variants associating with kidney stones at ALPL (rs1256328[T], odds ratio (OR)=1.21, P=5.8 × 10^-10) and a suggestive association at CASR (rs7627468[A], OR=1.16, P=2.0 × 10^-8). Focusing our analysis on coding sequence variants in 63 genes with preferential kidney expression we identify two rare missense variants SLC34A1 p.Tyr489Cys (OR=2.38, P=2.8 × 10^-5) and TRPV5 p.Leu530Arg (OR=3.62, P=4.1 × 10^-5) associating with recurrent kidney stones. We also observe associations of the identified kidney stone variants with biochemical traits in a large population set, indicating potential biological mechanism.
From days to hours: reporting clinically actionable variants from whole genome sequencing.
Middha, Sumit; Baheti, Saurabh; Hart, Steven N; Kocher, Jean-Pierre A
2014-01-01
As the cost of whole genome sequencing (WGS) decreases, clinical laboratories will be looking at broadly adopting this technology to screen for variants of clinical significance. To fully leverage this technology in a clinical setting, results need to be reported quickly, as the turnaround time could potentially impact patient care. The latest sequencers can sequence a whole human genome in about 24 hours. However, depending on the computing infrastructure available, the processing of data can take several days, with the majority of computing time devoted to aligning reads to genomic regions that are to date not clinically interpretable. In an attempt to accelerate the reporting of clinically actionable variants, we have investigated the utility of a multi-step alignment algorithm focused on aligning reads and calling variants in genomic regions of clinical relevance prior to processing the remaining reads on the whole genome. This iterative workflow significantly accelerates the reporting of clinically actionable variants with no loss of accuracy when compared to genotypes obtained with the OMNI SNP platform or to variants detected with a standard workflow that combines Novoalign and GATK.
Evaluation of 10 genes encoding cardiac proteins in Doberman Pinschers with dilated cardiomyopathy.
O'Sullivan, M Lynne; O'Grady, Michael R; Pyle, W Glen; Dawson, John F
2011-07-01
To identify a causative mutation for dilated cardiomyopathy (DCM) in Doberman Pinschers by sequencing the coding regions of 10 cardiac genes known to be associated with familial DCM in humans. 5 Doberman Pinschers with DCM and congestive heart failure and 5 control mixed-breed dogs that were euthanized or died. RNA was extracted from frozen ventricular myocardial samples from each dog, and first-strand cDNA was synthesized via reverse transcription, followed by PCR amplification with gene-specific primers. Ten cardiac genes were analyzed: cardiac actin, α-actinin, α-tropomyosin, β-myosin heavy chain, metavinculin, muscle LIM protein, myosin-binding protein C, tafazzin, titin-cap (telethonin), and troponin T. Sequences for DCM-affected and control dogs and the published canine genome were compared. None of the coding sequences yielded a common causative mutation among all Doberman Pinscher samples. However, 3 variants were identified in the α-actinin gene in the DCM-affected Doberman Pinschers. One of these variants, identified in 2 of the 5 Doberman Pinschers, resulted in an amino acid change in the rod-forming triple coiled-coil domain. Mutations in the coding regions of several genes associated with DCM in humans did not appear to consistently account for DCM in Doberman Pinschers. However, an α-actinin variant was detected in some Doberman Pinschers that may contribute to the development of DCM given its potential effect on the structure of this protein. Investigation of additional candidate gene coding and noncoding regions and further evaluation of the role of α-actinin in development of DCM in Doberman Pinschers are warranted.
The genetic architecture of type 2 diabetes.
Fuchsberger, Christian; Flannick, Jason; Teslovich, Tanya M; Mahajan, Anubha; Agarwala, Vineeta; Gaulton, Kyle J; Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; 
Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; 
Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I
2016-08-04
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.
The genetic architecture of type 2 diabetes
Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, 
Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, 
Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I
2016-01-01
The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of heritability. To test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole genome sequencing in 2,657 Europeans with and without diabetes, and exome sequencing in a total of 12,940 subjects from five ancestral groups. To increase statistical power, we expanded sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support a major role for lower-frequency variants in predisposition to type 2 diabetes. PMID:27398621
Liu, Qing; Zhu, Shenghua; Mizuno, Sahoko; Kimura, Masatsugu; Liu, Peina; Isomura, Shin; Wang, Xingzhen; Kawamoto, Fumihiko
1998-01-01
By two PCR-based diagnostic methods, Plasmodium malariae infections have been rediscovered at two foci in the Sichuan province of China, a region where no cases of P. malariae have been officially reported for the last 2 decades. In addition, a variant form of P. malariae which has a deletion of 19 bp and seven substitutions of base pairs in the target sequence of the small-subunit (SSU) rRNA gene was detected with high frequency. Alignment analysis of Plasmodium sp. SSU rRNA gene sequences revealed that the 5′ region of the variant sequence is identical to that of P. vivax or P. knowlesi and its 3′ region is identical to that of P. malariae. The same sequence variations were also found in P. malariae isolates collected along the Thai-Myanmar border, suggesting a wide distribution of this variant form from southern China to Southeast Asia. PMID:9774600
MYO7A and USH2A gene sequence variants in Italian patients with Usher syndrome.
Sodi, Andrea; Mariottini, Alessandro; Passerini, Ilaria; Murro, Vittoria; Tachyla, Iryna; Bianchi, Benedetta; Menchini, Ugo; Torricelli, Francesca
2014-01-01
To analyze the spectrum of sequence variants in the MYO7A and USH2A genes in a group of Italian patients affected by Usher syndrome (USH). Thirty-six Italian patients with a diagnosis of USH were recruited. They received a standard ophthalmologic examination, visual field testing, optical coherence tomography (OCT) scan, and electrophysiological tests. Fluorescein angiography and fundus autofluorescence imaging were performed in selected cases. All the patients underwent an audiologic examination for the 0.25-8,000 Hz frequencies. Vestibular function was evaluated with specific tests. DNA samples were analyzed for sequence variants of the MYO7A gene (for USH1) and the USH2A gene (for USH2) with direct sequencing techniques. A few patients were analyzed for both genes. In the MYO7A gene, ten missense variants were found; three patients were compound heterozygous, and two were homozygous. Thirty-four USH2A gene variants were detected, including eight missense variants, nine nonsense variants, six splicing variants, and 11 duplications/deletions; 19 patients were compound heterozygous, and three were homozygous. Four MYO7A and 17 USH2A variants have already been described in the literature. Among the novel mutations there are four USH2A large deletions, detected with multiplex ligation dependent probe amplification (MLPA) technology. Two potentially pathogenic variants were found in 27 patients (75%). Affected patients showed variable clinical pictures without a clear genotype-phenotype correlation. Ten variants in the MYO7A gene and 34 variants in the USH2A gene were detected in Italian patients with USH at a high detection rate. A selective analysis of these genes may be valuable for molecular analysis, combining diagnostic efficiency with little time wastage and less resource consumption.
Natarajan, Chandrasekhar; Hoffmann, Federico G.; Lanier, Hayley C.; Wolf, Cole J.; Cheviron, Zachary A.; Spangler, Matthew L.; Weber, Roy E.; Fago, Angela; Storz, Jay F.
2015-01-01
Major challenges for illuminating the genetic basis of phenotypic evolution are to identify causative mutations, to quantify their functional effects, to trace their origins as new or preexisting variants, and to assess the manner in which segregating variation is transduced into species differences. Here, we report an experimental analysis of genetic variation in hemoglobin (Hb) function within and among species of Peromyscus mice that are native to different elevations. A multilocus survey of sequence variation in the duplicated HBA and HBB genes in Peromyscus maniculatus revealed that function-altering amino acid variants are widely shared among geographically disparate populations from different elevations, and numerous amino acid polymorphisms are also shared with closely related species. Variation in Hb-O2 affinity within and among populations of P. maniculatus is attributable to numerous amino acid mutations that have individually small effects. One especially surprising feature of the Hb polymorphism in P. maniculatus is that an appreciable fraction of functional standing variation in the two transcriptionally active HBA paralogs is attributable to recurrent gene conversion from a tandemly linked HBA pseudogene. Moreover, transpecific polymorphism in the duplicated HBA genes is not solely attributable to incomplete lineage sorting or introgressive hybridization; instead, it is mainly attributable to recurrent interparalog gene conversion that has occurred independently in different species. Partly as a result of concerted evolution between tandemly duplicated globin genes, the same amino acid changes that contribute to variation in Hb function within P. maniculatus also contribute to divergence in Hb function among different species of Peromyscus. In the case of function-altering Hb mutations in Peromyscus, there is no qualitative or quantitative distinction between segregating variants within species and fixed differences between species. PMID:25556236
Higher criticism approach to detect rare variants using whole genome sequencing data
2014-01-01
Because of low statistical power of single-variant tests for whole genome sequencing (WGS) data, the association test for variant groups is a key approach for genetic mapping. To address the features of sparse and weak genetic effects to be detected, the higher criticism (HC) approach has been proposed and theoretically has proven optimal for detecting sparse and weak genetic effects. Here we develop a strategy to apply the HC approach to WGS data that contains rare variants as the majority. By using Genetic Analysis Workshop 18 "dose" genetic data with simulated phenotypes, we assess the performance of HC under a variety of strategies for grouping variants and collapsing rare variants. The HC approach is compared with the minimal p-value method and the sequence kernel association test. The results show that the HC approach is preferred for detecting weak genetic effects. PMID:25519367
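The higher criticism statistic named in the abstract above can be sketched in a few lines. The abstract does not say which HC variant or grouping scheme the authors used, so this is only the commonly cited Donoho-Jin form, computed over the smallest fraction `alpha0` of the p-values in one variant group. The function name `higher_criticism` and the `alpha0` default are assumptions made for illustration.

```python
import math


def higher_criticism(p_values, alpha0=0.5):
    """Donoho-Jin style HC statistic for one group of p-values.

    Assumption: standard form, maximised over the smallest alpha0
    fraction of the ordered p-values. Not taken from the paper.
    """
    n = len(p_values)
    if n == 0:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    limit = max(1, int(alpha0 * n))
    best = -math.inf
    for i, p in enumerate(ordered[:limit], start=1):
        if p <= 0.0 or p >= 1.0:
            continue  # the variance term p * (1 - p) is zero here
        # Standardised gap between the empirical and uniform quantiles.
        z = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


# Hypothetical group of 100 variants: evenly spread p-values (no signal),
# then the same group with five very small p-values (sparse signal).
null_group = [(i - 0.5) / 100 for i in range(1, 101)]
signal_group = [1e-6] * 5 + null_group[5:]
```

A group with a handful of small p-values among many unremarkable ones scores far higher than a group whose p-values look uniform, which is the sparse-and-weak setting the abstract describes.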
Marsic, Damien; Govindasamy, Lakshmanan; Currlin, Seth; Markusic, David M; Tseng, Yu-Shan; Herzog, Roland W; Agbandje-McKenna, Mavis; Zolotukhin, Sergei
2014-01-01
Methodologies to improve existing adeno-associated virus (AAV) vectors for gene therapy include either rational approaches or directed evolution to derive capsid variants characterized by superior transduction efficiencies in targeted tissues. Here, we integrated both approaches in one unified design strategy of “virtual family shuffling” to derive a combinatorial capsid library whereby only variable regions on the surface of the capsid are modified. Individual sublibraries were first assembled in order to preselect compatible amino acid residues within restricted surface-exposed regions to minimize the generation of dead-end variants. Subsequently, the successful families were interbred to derive a combined library of ~8 × 10^5 complexity. Next-generation sequencing of the packaged viral DNA revealed capsid surface areas susceptible to directed evolution, thus providing guidance for future designs. We demonstrated the utility of the library by deriving an AAV2-based vector characterized by a 20-fold higher transduction efficiency in murine liver, now equivalent to that of AAV8. PMID:25048217
Albariño, César G; Guerrero, Lisa Wiggleton; Chakrabarti, Ayan K; Kainulainen, Markus H; Whitmer, Shannon L M; Welch, Stephen R; Nichol, Stuart T
2016-09-01
During the large outbreak of Ebola virus disease that occurred in Western Africa from late 2013 to early 2016, several hundred Ebola virus (EBOV) genomes have been sequenced and the virus genetic drift analyzed. In a previous report, we described an efficient reverse genetics system designed to generate recombinant EBOV based on a Makona variant isolate obtained in 2014. Using this system, we characterized the replication and fitness of 2 isolates of the Makona variant. These virus isolates are nearly identical at the genetic level, but have single amino acid differences in the VP30 and L proteins. The potential effects of these differences were tested using minigenomes and recombinant viruses. The results obtained with this approach are consistent with the role of VP30 and L as components of the EBOV RNA replication machinery. Moreover, the 2 isolates exhibited clear fitness differences in competitive growth assays. Published by Elsevier Inc.
Conservation and variability of West Nile virus proteins.
Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas
2009-01-01
West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was generally low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority of which are potential epitopes. The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
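The entropy measure in the abstract above can be illustrated with a minimal sketch. It assumes pre-aligned protein sequences, Shannon entropy with a base-2 logarithm, and that windows containing an alignment gap (`-`) are skipped; none of these details are stated in the abstract. The function name `nonamer_stats` and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def nonamer_stats(aligned_sequences, start, k=9):
    """Return (entropy in bits, fraction of non-predominant k-mers).

    Assumptions: sequences are already aligned, and windows that run
    past the end of a sequence or contain a gap are ignored.
    """
    peptides = [
        seq[start:start + k]
        for seq in aligned_sequences
        if len(seq) >= start + k
    ]
    peptides = [pep for pep in peptides if "-" not in pep]
    if not peptides:
        raise ValueError("no complete k-mers at this position")
    counts = Counter(peptides)
    total = len(peptides)
    # Shannon entropy: sum of p * log2(1 / p) over the distinct peptides.
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    predominant_count = counts.most_common(1)[0][1]
    variant_fraction = 1.0 - predominant_count / total
    return entropy, variant_fraction


# Toy alignment: nine identical sequences and one differing at the last residue.
toy = ["MKTAYIAKQR"] * 9 + ["MKTAYIAKQL"]
conserved = nonamer_stats(toy, start=0)  # nonamer identical in all ten
variable = nonamer_stats(toy, start=1)   # nonamer differs in one of ten
```

Under these assumptions, a position where every sequence carries the same nonamer has entropy 0, and a position would count as stable in the abstract's sense when the entropy is at most 1 and the variant fraction is at most 10%.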
French, Kinsley C; Makhatadze, George I
2012-12-21
PAPf39, a 39-residue peptide fragment from human prostatic acidic phosphatase, has been shown to form amyloid fibrils in semen (SEVI), which increase HIV infectivity by up to 5 orders of magnitude. The sequence of the PAPf39 fibrillar core was identified using hydrogen-deuterium exchange (HDX) mass spectrometry and protease protection assays. The central and C-terminal regions are highly protected from HDX and proteolytic cleavage and, thus, are part of the fibrillar core. Conversely, the N-terminal region is unprotected from HDX and proteolytic cleavage, suggesting that it is exposed and not part of the fibrillar core. This finding was tested using two N-terminal truncated variants, PAPf39Δ1-8 and PAPf39Δ1-13. Both variants formed amyloid fibrils at neutral pH. However, these variants showed a markedly different pH dependence of fibril formation versus that of PAPf39. PAPf39 fibrils can form at pH 7.7, but not at pH 5.5 or 2.5, while both N-terminally truncated variants can form fibrils at these pH values. Thus, the N-terminal region is not necessary for fibril formation but modulates the pH dependence of PAPf39 fibril formation. PAPf39Δ1-8 and PAPf39Δ1-13 are capable of seeding PAPf39 fibril formation at neutral pH, suggesting that these variants are structurally compatible with PAPf39, yet no mixed fibril formation occurs between the truncated variants and PAPf39 at low pH. This suggests that pH affects the PAPf39 monomer conformational ensemble, which is supported by far-UV circular dichroism spectroscopy. A conceptual model describing the pH dependence of PAPf39 aggregation is proposed and provides potential biological implications.
Piórkowska, K; Żukowski, K; Ropka-Molik, K; Tyra, M
2018-06-01
Variant calling analysis based on RNA sequencing data provides information about gene variants. RNA-seq is cheaper and faster than DNA sequencing. However, it requires individual hard filters during data processing due to post-transcriptional modifications such as splicing and RNA editing. In the present study, RNA-seq transcriptome data on two Polish pig breeds (Puławska, PUL, n = 8, and Polish Landrace, PL, n = 8) were included. The pig breeds are significantly different with regard to meat qualities such as texture, water exudation, growth traits and fat content in carcasses. A total of 2451 significant mutations were identified by chi-square tests, and functional analysis was carried out using Panther, KEGG and Kobas. Interesting missense gene variants and mutations located in regulatory regions were found in a few genes related to fatty acid metabolism and lipid storage such as ACSL5, ALDH3A2, FADS1, SCD, PLA2G12A and ATGL. A validation of mutational influences on pig traits was performed for ALDH3A2, ATGL, PLA2G12A and MYOM1 variants using association analysis including 215 pigs of the PL and PUL breeds. The ALDH3A2 ENSSSCT00000019636.2:c.470T>C polymorphism was found to affect the weight of the ham and loin eye area. In turn, an ENSSSCT00000004091.2:c.2836G>A MYOM1 mutation, which could be implicated in myofibrillar network organisation, had an effect on meatiness and loin texture parameters. The study aimed to estimate the usefulness of RNA-seq results for a purpose other than differentially expressed gene analysis. The analysis performed indicated interesting gene variants that could be used in the future as markers during selection. © 2018 Stichting International Foundation for Animal Genetics.
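A chi-square comparison of the kind mentioned in the abstract above might look like the following sketch. The abstract does not say what was tabulated, so this assumes a 2 × 2 table of alternate and reference allele counts in the two breeds, a Pearson statistic without continuity correction, and one degree of freedom. The function names and the example counts are invented for illustration and are not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: 8 animals per breed gives 16 alleles per breed.
# Rows are breeds, columns are (alternate, reference) allele counts.
chi2 = chi_square_2x2(14, 2, 3, 13)
p = p_value_1df(chi2)
```

With only 16 alleles per breed some expected cell counts are small, so an exact test may be a safer choice in practice; the sketch only shows the arithmetic of the chi-square comparison.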
Abu-Farha, Mohamed; Melhem, Motasem; Abubaker, Jehad; Behbehani, Kazem; Alsmadi, Osama; Elkum, Naser
2016-02-11
ANGPTL8 (betatrophin) has been recently identified as a regulator of lipid metabolism through its interaction with ANGPTL3. A sequence variant in ANGPTL8 has been shown to associate with lower level of Low Density Lipoprotein (LDL) and High Density Lipoprotein (HDL). The objective of this study is to identify sequence variants in ANGPTL8 gene in Arabs and investigate their association with ANGPTL8 plasma level and clinical parameters. A cross sectional study was designed to examine the level of ANGPTL8 in 283 non-diabetic Arabs, and to identify its sequence variants using Sanger sequencing and their association with various clinical parameters. Using Sanger sequencing, we sequenced the full ANGPTL8 gene in 283 Arabs identifying two single nucleotide polymorphisms (SNPs) Rs.892066 and Rs.2278426 in the coding region. Our data shows for the first time that Arabs with the heterozygote form of (c.194C > T Rs.2278426) had higher level of Fasting Blood Glucose (FBG) compared to the CC homozygotes. LDL and HDL level in these subjects did not show significant difference between the two subgroups. Circulation level of ANGPTL8 did not vary between the two forms. No significant changes were observed between the various forms of Rs.892066 variant and FBG, LDL or HDL. Our data shows for the first time that heterozygote form of ANGPTL8 Rs.2278426 variant was associated with higher FBG level in Arabs highlighting the importance of these variants in controlling the function of betatrophin.
Fowler, Elizabeth V; Peters, Jennifer M; Gatton, Michelle L; Chen, Nanhua; Cheng, Qin
2002-03-01
In Plasmodium falciparum a highly polymorphic multi-copy gene family, var, encodes the variant surface antigen P. falciparum erythrocyte membrane protein 1 (PfEMP1), which has an important role in cytoadherence and immune evasion. Using previously described universal PCR primers for the first Duffy binding-like domain (DBLalpha) of var we analysed the DBLalpha repertoires of Dd2 (originally from Thailand) and eight isolates from the Solomon Islands (n=4), Philippines (n=2), Papua New Guinea (n=1) and Africa (n=1). We found 15-32 unique DBLalpha sequence types among these isolates and estimated detectable DBLalpha repertoire sizes ranging from 33-38 to 52-57 copies per genome. Our data suggest that var gene repertoires generally consist of 40-50 copies per genome. Eighteen DBLalpha sequences appeared in more than one Asia-Pacific isolate with the number of sequences shared between any two isolates ranging from 0 to 6 (mean=2.0 +/-1.6). At the amino acid level DBLalpha sequence similarity within isolates ranged from 45.2 +/- 7.1 to 50.2 +/- 6.9%, and was not significantly different from the DBLalpha amino acid sequence similarity among isolates (P>0.1). Comparisons with published sequences also revealed little overlap among DBLalpha sequences from different regions. High DBLalpha sequence diversity and minimal overlap among these isolates suggest that the global var gene repertoire is immense, and may potentially be selected for by the host's protective immune response to the var gene products, PfEMP1.
Bianchi, Marzia; Amendola, Roberto; Federico, Rodolfo; Polticelli, Fabio; Mariottini, Paolo
2005-06-01
In mouse, at least two catalytically active splice variants (mSMOalpha and mSMOmicro) of the flavin-containing spermine oxidase enzyme are present. We have demonstrated previously that the cytosolic mSMOalpha is the major isoform, while the mSMOmicro enzyme is present in both nuclear and cytoplasmic compartments and has an extra protein domain corresponding to the additional exon VIa. By amino acid sequence comparison and molecular modeling of mSMO proteins, we identified a second domain that is necessary for nuclear localization of the mSMOmicro splice variant. A deletion mutant enzyme of this region was constructed to demonstrate its role in protein nuclear targeting by means of transient expression in the murine neuroblastoma cell line, N18TG2.
Tenney, Jeffrey R; Prada, Carlos E; Hopkin, Robert J; Hallinan, Barbara E
2013-12-01
Leigh syndrome, due to a dysfunction of mitochondrial energy metabolism, is a genetically heterogeneous and progressive neurologic disorder that usually occurs in infancy and childhood. Its clinical presentation and neuroimaging findings can be variable, especially early in the course of the disease. This report presents a patient with infantile Leigh syndrome who had atypical radiologic findings on serial neuroimaging studies with early and severe involvement of the cervical spinal cord and brainstem and injury to the thalami and basal ganglia occurring only late in the clinical course. Postmortem microscopic examination supported this timing of injury within the central nervous system. In addition, mitochondrial deoxyribonucleic acid sequencing showed a novel homoplasmic variant that could be responsible for this unique lethal form of Leigh syndrome.
Identification of missing variants by combining multiple analytic pipelines.
Ren, Yingxue; Reddy, Joseph S; Pottier, Cyril; Sarangi, Vivekananda; Tian, Shulan; Sinnwell, Jason P; McDonnell, Shannon K; Biernacka, Joanna M; Carrasquillo, Minerva M; Ross, Owen A; Ertekin-Taner, Nilüfer; Rademakers, Rosa; Hudson, Matthew; Mainzer, Liudmila Sergeevna; Asmann, Yan W
2018-04-16
After decades of identifying risk factors using array-based genome-wide association studies (GWAS), genetic research of complex diseases has shifted to sequencing-based rare variants discovery. This requires large sample sizes for statistical power and has brought up questions about whether the current variant calling practices are adequate for large cohorts. It is well-known that there are discrepancies between variants called by different pipelines, and that using a single pipeline always misses true variants exclusively identifiable by other pipelines. Nonetheless, it is common practice today to call variants by one pipeline due to computational cost and assume that false negative calls are a small percentage of the total. We analyzed 10,000 exomes from the Alzheimer's Disease Sequencing Project (ADSP) using multiple analytic pipelines consisting of different read aligners and variant calling strategies. We compared variants identified by using two aligners in 50, 100, 200, 500, 1000, and 1952 samples; and compared variants identified by adding single-sample genotyping to the default multi-sample joint genotyping in 50, 100, 500, 2000, 5000 and 10,000 samples. We found that using a single pipeline missed increasing numbers of high-quality variants correlated with sample sizes. By combining two read aligners and two variant calling strategies, we rescued 30% of pass-QC variants at sample size of 2000, and 56% at 10,000 samples. The rescued variants had higher proportions of low frequency (minor allele frequency [MAF] 1-5%) and rare (MAF < 1%) variants, which are the very type of variants of interest. In 660 Alzheimer's disease cases with earlier onset ages of ≤65, 4 out of 13 (31%) previously-published rare pathogenic and protective mutations in APP, PSEN1, and PSEN2 genes were undetected by the default one-pipeline approach but recovered by the multi-pipeline approach.
Identification of the complete variant set from sequencing data is the prerequisite of genetic association analyses. The current analytic practice of calling genetic variants from sequencing data using a single bioinformatics pipeline is no longer adequate for increasingly large projects. The number and percentage of variants that passed quality filters but were missed by the one-pipeline approach increased rapidly with sample size.
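The idea of rescuing variants by combining pipelines can be sketched as a set union over call sets. This is a toy illustration under stated assumptions, not the ADSP workflow: variants are represented as hypothetical (chromosome, position, ref, alt) tuples that have already passed QC, the function name `rescued_fraction` is invented here, and the rescued fraction is taken relative to the combined call set, which the abstract does not specify.

```python
def rescued_fraction(default_calls, extra_calls):
    """Return (fraction, rescued) for variants found only by the extra pipeline.

    The fraction is computed relative to the union of both call sets
    (an assumption made for this sketch).
    """
    combined = default_calls | extra_calls
    rescued = combined - default_calls
    fraction = len(rescued) / len(combined) if combined else 0.0
    return fraction, rescued


# Hypothetical pass-QC calls, keyed by (chromosome, position, ref, alt).
default_pipeline = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
}
second_pipeline = {
    ("chr1", 200, "C", "T"),
    ("chr2", 75, "T", "C"),
}

fraction, rescued = rescued_fraction(default_pipeline, second_pipeline)
print(f"rescued {len(rescued)} variant(s), {fraction:.0%} of the combined set")
```

In a real comparison the keys would need consistent normalization (for example left-aligned indels) before the set operations are meaningful.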
Functional characterization of rare FOXP2 variants in neurodevelopmental disorder.
Estruch, Sara B; Graham, Sarah A; Chinnappa, Swathi M; Deriziotis, Pelagia; Fisher, Simon E
2016-01-01
Heterozygous disruption of FOXP2 causes a rare form of speech and language impairment. Screens of the FOXP2 sequence in individuals with speech/language-related disorders have identified several rare protein-altering variants, but their phenotypic relevance is often unclear. FOXP2 encodes a transcription factor with a forkhead box DNA-binding domain, but little is known about the functions of protein regions outside this domain. We performed detailed functional analyses of seven rare FOXP2 variants found in affected cases, including three which have not been previously characterized, testing intracellular localization, transcriptional regulation, dimerization, and interaction with other proteins. To shed further light on molecular functions of FOXP2, we characterized the interaction between this transcription factor and co-repressor proteins of the C-terminal binding protein (CTBP) family. Finally, we analysed the functional significance of the polyglutamine tracts in FOXP2, since tract length variations have been reported in cases of neurodevelopmental disorder. We confirmed etiological roles of multiple FOXP2 variants. Of three variants that have been suggested to cause speech/language disorder, but never before been characterized, only one showed functional effects. For the other two, we found no effects on protein function in any assays, suggesting that they are incidental to the phenotype. We identified a CTBP-binding region within the N-terminal portion of FOXP2. This region includes two amino acid substitutions that occurred on the human lineage following the split from chimpanzees. However, we did not observe any effects of these amino acid changes on CTBP binding or other core aspects of FOXP2 function. 
Finally, we found that FOXP2 variants with reduced polyglutamine tracts did not exhibit altered behaviour in cellular assays, indicating that such tracts are non-essential for core aspects of FOXP2 function, and that tract variation is unlikely to be a highly penetrant cause of speech/language disorder. Our findings highlight the importance of functional characterization of novel rare variants in FOXP2 in assessing the contribution of such variants to speech/language disorder and provide further insights into the molecular function of the FOXP2 protein.
Park, Eunkuk; Kim, Bo-Young; Choi, Vit-Na; Yoo, Young-Hyun; Kim, Bom-Taeck; Jeong, Seon-Yong
2015-01-01
To identify novel susceptibility variants for osteoporosis in Korean postmenopausal women, we performed a genome-wide association analysis of 1180 nonsynonymous single nucleotide polymorphisms (nsSNPs) in 405 individuals with osteoporosis and 722 normal controls of the Korean Association Resource cohort. A logistic regression analysis revealed 72 nsSNPs that showed a significant association with osteoporosis (p<0.05). The top 10 nsSNPs showing the lowest p-values (p = 5.2×10⁻⁴–8.5×10⁻³) were further studied to investigate their effects at the protein level. Based on the results of an in silico prediction of the protein’s functional effect based on amino acid alterations and a sequence conservation evaluation of the amino acid residues at the positions of the nsSNPs among orthologues, we selected one nsSNP in the SQRDL gene (rs1044032, SQRDL I264T) as a meaningful genetic variant associated with postmenopausal osteoporosis. To assess whether the SQRDL I264T variant played a functional role in the pathogenesis of osteoporosis, we examined the in vitro effect of the nsSNP on bone remodeling. Overexpression of the SQRDL I264T variant in the preosteoblast MC3T3-E1 cells significantly increased alkaline phosphatase activity, mineralization, and the mRNA expression of osteoblastogenesis markers, Runx2, Sp7, and Bglap genes, whereas the SQRDL wild type had no effect or a negative effect on osteoblast differentiation. Overexpression of the SQRDL I264T variant did not affect osteoclast differentiation of the primary-cultured monocytes. The known effects of hydrogen sulfide (H2S) on bone remodeling may explain the findings of the current study, which demonstrated the functional role of the H2S-catalyzing enzyme SQRDL I264T variant in osteoblast differentiation.
In conclusion, the results of the statistical and experimental analyses indicate that the SQRDL I264T nsSNP may be a significant susceptibility variant for osteoporosis in Korean postmenopausal women that is involved in osteoblast differentiation. PMID:26258864
Mu, John C.; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B.; Wong, Wing H.; Lam, Hugo Y. K.
2015-01-01
A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools. PMID:26412485
SeSaM-Tv-II generates a protein sequence space that is unobtainable by epPCR.
Mundhada, Hemanshu; Marienhagen, Jan; Scacioc, Andreea; Schenk, Alexander; Roccatano, Danilo; Schwaneberg, Ulrich
2011-07-04
Generating high-quality mutant libraries in which each amino acid is equally targeted and substituted in a chemically diverse manner is crucial to obtain improved variants in small mutant libraries. The sequence saturation mutagenesis method (SeSaM-Tv(+)) offers the opportunity to generate such high-quality mutant libraries by introducing consecutive mutations and by enriching transversions. In this study, automated gel electrophoresis, real-time quantitative PCR, and a phosphorimager quantification system were developed and employed to optimize each step of the previously reported SeSaM-Tv(+) method. Advancements of the SeSaM-Tv(+) protocol and the use of a novel DNA polymerase quadrupled the number of transversions, by doubling the fraction of consecutive mutations (from 16.7 to 37.1 %). About 33 % of all amino acid substitutions observed in a model library are rarely introduced by epPCR methods, and around 10 % of all clones carried amino acid substitutions that are unobtainable by epPCR. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Stability and function of interdomain linker variants of glucoamylase 1 from Aspergillus niger.
Sauer, J; Christensen, T; Frandsen, T P; Mirgorodskaya, E; McGuire, K A; Driguez, H; Roepstorff, P; Sigurskjold, B W; Svensson, B
2001-08-07
Several variants of glucoamylase 1 (GA1) from Aspergillus niger were created in which the highly O-glycosylated peptide (aa 468-508) connecting the (alpha/alpha)(6)-barrel catalytic domain and the starch binding domain was substituted at the gene level by equivalent segments of glucoamylases from Hormoconis resinae, Humicola grisea, and Rhizopus oryzae encoding 5, 19, and 36 amino acid residues. Variants were constructed in which the H. resinae linker was elongated by proline-rich sequences as this linker itself apparently was too short to allow formation of the corresponding protein variant. Size and isoelectric point of GA1 variants reflected differences in linker length, posttranslational modification, and net charge. While calculated polypeptide chain molecular masses for wild-type GA1, a nonnatural proline-rich linker variant, H. grisea, and R. oryzae linker variants were 65,784, 63,777, 63,912, and 65,614 Da, respectively, MALDI-TOF-MS gave values of 82,042, 73,800, 73,413, and 90,793 Da, respectively, where the latter value could partly be explained by an N-glycosylation site introduced near the linker C-terminus. The k(cat) and K(m) for hydrolysis of maltooligodextrins and soluble starch, and the rate of hydrolysis of barley starch granules were essentially the same for the variants as for wild-type GA1. beta-Cyclodextrin, acarbose, and two heterobidentate inhibitors were found by isothermal titration calorimetry to bind to the catalytic and starch binding domains of the linker variants, indicating that the function of the active site and the starch binding site was maintained. The stability of GA1 linker variants toward GdnHCl and heat, however, was reduced compared to wild-type.
Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.
Lakshmikumaran, M; Negi, M S
1994-03-01
Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.
Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.
Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo
2015-01-01
Genetic screening in families at high risk of developing colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close to or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants that would need a small set of Sanger sequencing reactions to be solved.
The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve homopolymeric DNA regions not adequately assessed by NGS. The implementation of NGS approaches in routine diagnostics of familial CRC is cost-effective and significantly reduces diagnostic turnaround times.
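The coverage figures quoted in the abstract above follow from simple proportions of the 2225 COSMIC variants. A minimal sketch that recomputes them from the counts given in the text (the variable and function names are illustrative only):

```python
TOTAL_VARIANTS = 2225   # pathogenic variants assessed, as stated in the abstract
RESOLVED = 2108         # variants whose sequence context the panel resolves
IMPRECISE = 27          # imprecisely called, resolvable by visual inspection


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


near_homopolymer = TOTAL_VARIANTS - RESOLVED   # variants in or near homopolymers
blind_spots = near_homopolymer - IMPRECISE     # variants the panel cannot resolve

print(near_homopolymer, blind_spots)
print(pct(RESOLVED, TOTAL_VARIANTS),
      pct(IMPRECISE, TOTAL_VARIANTS),
      pct(blind_spots, TOTAL_VARIANTS))
```

The three categories (2108 + 27 + 90) account for all 2225 variants, which is a quick consistency check on the reported numbers.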
Kim, Do Gyun; Kim, Hyoung Jin; Kim, Hong-Jin
2016-10-01
Charge variants (acidic and basic) of recombinant monoclonal antibodies (Mabs) have received much attention due to their potential biological effects. C-terminal lysine variants are common in Mabs and their proportion is affected by the manufacturing process. In the present study, changes of trastuzumab charge variants brought about by carboxypeptidase B treatment and subsequent storage at 8 or 37 °C for up to 24 h were monitored by cation-exchange chromatography analysis to investigate the effects of C-terminal lysine cleavage and its subsequent reaction at 8 or 37 °C. C-terminal lysine cleavage at 8 °C reduced the fraction of basic species and had little effect on the fraction of acidic species. Analysis of individual peaks demonstrated that C-terminal lysine cleavage induced both increases and decreases in individual acidic variants, with the result that there was little overall change in the overall proportion of acidic species. It appeared that most of the basic variant Mab molecules but only a fraction of the acidic variant molecules had C-terminal lysines. Increasing the temperature to 37 °C appeared to increase the fraction of acidic species and decrease main species significantly, without a similar change in basic species. These results indicate that length of exposure to elevated temperature is a critical consideration in charge variant analysis.
Paasinen-Sohns, Aino; Koelzer, Viktor H; Frank, Angela; Schafroth, Julian; Gisler, Aline; Sachs, Melanie; Graber, Anne; Rothschild, Sacha I; Wicki, Andreas; Cathomas, Gieri; Mertz, Kirsten D
2017-03-01
Companion diagnostics rely on genomic testing of molecular alterations to enable effective cancer treatment. Here we report the clinical application and validation of the Oncomine Focus Assay (OFA), an integrated, commercially available next-generation sequencing (NGS) assay for the rapid and simultaneous detection of single nucleotide variants, short insertions and deletions, copy number variations, and gene rearrangements in 52 cancer genes with therapeutic relevance. Two independent patient cohorts were investigated to define the workflow, turnaround times, feasibility, and reliability of OFA targeted sequencing in clinical application and using archival material. Cohort I consisted of 59 diagnostic clinical samples from the daily routine submitted for molecular testing over a 4-month time period. Cohort II consisted of 39 archival melanoma samples that were up to 15 years old. Libraries were prepared from isolated nucleic acids and sequenced on the Ion Torrent PGM sequencer. Sequencing datasets were analyzed using the Ion Reporter software. Genomic alterations were identified and validated by orthogonal conventional assays including pyrosequencing and immunohistochemistry. Sequencing results of both cohorts, including archival formalin-fixed, paraffin-embedded material stored up to 15 years, were consistent with published variant frequencies. A concordance of 100% between established assays and OFA targeted NGS was observed. The OFA workflow enabled a turnaround of 3½ days. Taken together, OFA was found to be a convenient tool for fast, reliable, broadly applicable and cost-effective targeted NGS of tumor samples in routine diagnostics. Thus, OFA has strong potential to become an important asset for precision oncology. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Norovirus-like VP1 particles exhibit isolate dependent stability profiles
NASA Astrophysics Data System (ADS)
Pogan, Ronja; Schneider, Carola; Reimer, Rudolph; Hansman, Grant; Uetrecht, Charlotte
2018-02-01
Noroviruses are the main cause of viral gastroenteritis with new variants emerging frequently. There are three norovirus genogroups infecting humans. These genogroups are divided based on the sequence of their major capsid protein, which is able to form virus-like particles (VLPs) when expressed recombinantly. VLPs of the prototypical GI.1 Norwalk virus are known to disassemble into specific capsid protein oligomers upon alkaline treatment. Here, native mass spectrometry and electron microscopy on variants of GI.1 and of GII.17 were performed, revealing differences in terms of stability between these groups. Beyond that, these experiments indicate differences even between variants within a genotype. The capsid stability was monitored in different ammonium acetate solutions varying both in ionic strength and pH. The investigated GI.1 West Chester isolate showed comparable disassembly profiles to the previously studied GI.1 Norwalk virus isolate. However, differences were observed with the West Chester being more sensitive to alkaline pH. In stark contrast to that, capsids of the variant belonging to the currently prevalent genogroup GII were stable in all tested conditions. Both variants formed smaller capsid particles already at neutral pH. Certain amino acid substitutions in the S domain of West Chester relative to the Norwalk virus potentially result in the formation of these T = 1 capsids.
Saravanaperumal, Siva Arumugam; Pediconi, Dario; Renieri, Carlo; La Terza, Antonietta
2012-01-01
Stem cell factor (SCF) is a growth factor, essential for haemopoiesis, mast cell development and melanogenesis. In the hematopoietic microenvironment (HM), SCF is produced in either membrane-bound (−) or soluble (+) forms. Skin expression of SCF stimulates melanocyte migration, proliferation, differentiation, and survival. We report for the first time a novel mRNA splice variant of SCF from the skin of white merino sheep via cloning and sequencing. Reverse transcriptase (RT)-PCR and molecular prediction revealed two different cDNA products of SCF. Full-length cDNA libraries were enriched by the method of rapid amplification of cDNA ends (RACE-PCR). Nucleotide sequencing and molecular prediction revealed that the primary 1519 base pair (bp) cDNA encodes a precursor protein of 274 amino acids (aa), commonly known as the ‘soluble’ isoform. In contrast, the shorter (835 and/or 725 bp) cDNA was found to be a ‘novel’ mRNA splice variant. It contains an open reading frame (ORF) corresponding to a truncated protein of 181 aa (vs 245 aa) with a unique C-terminus lacking the primary proteolytic segment (28 aa) right after the D175G site which is necessary to produce the ‘soluble’ form of SCF. This alternative splice (AS) variant was explained by the complete nucleotide sequencing of splice junction covering exon 5-intron (5)-exon 6 (948 bp) with a premature termination codon (PTC) whereby exons 6 to 9/10 are skipped (Cassette Exon, CE 6–9/10). We also demonstrated that the Northern blot analysis at transcript level is mediated via an intron-5 splicing event. Our data refine the structure of the SCF gene and clarify the presence (+) and/or absence (−) of primary proteolytic-cleavage site specific SCF splice variants. This work provides a basis for understanding the functional role and regulation of SCF in hair follicle melanogenesis in sheep beyond what was known in mice, humans and other mammals. PMID:22719917
Bandarian, Fatemeh; Daneshpour, Maryam Sadat; Hedayati, Mehdi; Naseri, Mohsen; Azizi, Fereidoun
2016-01-01
Background: Apolipoprotein A2 (APOA2) is the second major apolipoprotein of the high-density lipoprotein cholesterol (HDL-C). The study aim was to identify APOA2 gene variation in individuals within two extreme tails of HDL-C levels and its relationship with HDL-C level. Methods: This cross-sectional survey was conducted on participants from the Tehran Lipid and Glucose Study (TLGS) at the Research Institute for Endocrine Sciences, Tehran, Iran from April 2012 to February 2013. In total, 79 individuals with extreme low HDL-C levels (≤5th percentile for age and gender) and 63 individuals with extreme high HDL-C levels (≥95th percentile for age and gender) were selected. Variants were identified using DNA amplification and direct sequencing. Results: Screening of all exons and the core promoter region of the APOA2 gene identified nine single nucleotide substitutions and one microsatellite; five of which were known and four were new variants. Of these nine variants, two were common tag single nucleotide polymorphisms (SNPs) and seven were rare SNPs. Both exonic substitutions were missense mutations and caused an amino acid change. There was a significant association between the new missense mutation (variant Chr.1:16119226, Ala98Pro) and HDL-C level. Conclusion: Neither of the two common tag SNPs (rs6413453 and rs5082) contributes to the HDL-C trait in the Iranian population, but a new missense mutation in APOA2 in our population has a significant association with HDL-C. PMID:26590203
Novel mutation in the CHST6 gene causes macular corneal dystrophy in a black South African family.
Carstens, Nadia; Williams, Susan; Goolam, Saadiah; Carmichael, Trevor; Cheung, Ming Sin; Büchmann-Møller, Stine; Sultan, Marc; Staedtler, Frank; Zou, Chao; Swart, Peter; Rice, Dennis S; Lacoste, Arnaud; Paes, Kim; Ramsay, Michèle
2016-07-20
Macular corneal dystrophy (MCD) is a rare autosomal recessive disorder that is characterized by progressive corneal opacity that starts in early childhood and ultimately progresses to blindness in early adulthood. The aim of this study was to identify the cause of MCD in a black South African family with two affected sisters. A multigenerational South African Sotho-speaking family with type I MCD was studied using whole exome sequencing. Variant filtering to identify the MCD-causal mutation included the disease inheritance pattern, variant minor allele frequency and potential functional impact. Ophthalmologic evaluation of the cases revealed a typical MCD phenotype and none of the other family members were affected. An average of 127 713 variants per individual was identified following exome sequencing and approximately 1.2 % were not present in any of the investigated public databases. Variant filtering identified a homozygous E71Q mutation in CHST6, a known MCD-causing gene encoding corneal N-acetyl glucosamine-6-O-sulfotransferase. This E71Q mutation results in a non-conservative amino acid change in a highly conserved functional domain of the human CHST6 that is essential for enzyme activity. We identified a novel E71Q mutation in CHST6 as the MCD-causal mutation in a black South African family with type I MCD. This is the first description of MCD in a black Sub-Saharan African family and therefore contributes valuable insights into the genetic aetiology of this disease, while improving genetic counselling for this and potentially other MCD families.