Science.gov

Sample records for acid sequence variants

  1. Amino acid substitutions in genetic variants of human serum albumin and in sequences inferred from molecular cloning

    SciTech Connect

    Takahashi, N.; Takahashi, Y.; Blumberg, B.S.; Putnam, F.W.

    1987-07-01

    The structural changes in four genetic variants of human serum albumin were analyzed by tandem high-pressure liquid chromatography (HPLC) of the tryptic peptides, HPLC mapping and isoelectric focusing of the CNBr fragments, and amino acid sequence analysis of the purified peptides. Lysine-372 of normal (common) albumin A was changed to glutamic acid both in albumin Naskapi, a widespread polymorphic variant of North American Indians, and in albumin Mersin found in Eti Turks. The two variants also exhibited anomalous migration in NaDodSO4/PAGE, which is attributed to a conformational change. The identity of albumins Naskapi and Mersin may have originated through descent from a common mid-Asiatic founder of the two migrating ethnic groups, or it may represent identical but independent mutations of the albumin gene. In albumin Adana, from Eti Turks, the substitution site was not identified but was localized to the region from positions 447 through 548. The substitution of aspartic acid-550 by glycine was found in albumin Mexico-2 from four individuals of the Pima tribe. Although only single-point substitutions have been found in these and in certain other genetic variants of human albumin, five differences exist in the amino acid sequences inferred from cDNA sequences by workers in three other laboratories. However, our results on albumin A and on 14 different genetic variants accord with the amino acid sequence of albumin deduced from the genomic sequence. The apparent amino acid substitutions inferred from comparison of individual cDNA sequences probably reflect artifacts in cloning or in cDNA sequence analysis rather than polymorphism of the coding sections of the albumin gene.

  2. Variant Calling From Next Generation Sequence Data.

    PubMed

    Hansen, Nancy F

    2016-01-01

    The use of next generation nucleotide sequencing to discover and genotype small sequence variants has led to numerous insights into the molecular causes of various diseases. This chapter describes the use of freely available software to align next generation sequencing reads to a reference and then to use the resulting alignments to call, annotate, view, and filter small sequence variants. The suggested variant calling workflow includes read alignment with novoalign, the removal of polymerase chain reaction duplicate sequences with samtools or bamUtils, and the detection of variants with Freebayes or bam2mpg software. ANNOVAR is then used to annotate the predicted variants using gene models, population frequencies, and predicted mutation severity, producing variant files which can be viewed and filtered with the variant display tool VarSifter.
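
    The final filtering step of such a workflow can be illustrated with a minimal Python sketch. It is not taken from the chapter and does not call novoalign, samtools, Freebayes, ANNOVAR, or VarSifter; it only assumes tab-separated VCF-style text lines. The names `Variant`, `parse_vcf_lines`, `filter_small_variants`, and the `min_qual` threshold are hypothetical.

```python
from typing import Iterator, NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    qual: float


def parse_vcf_lines(lines) -> Iterator[Variant]:
    """Yield one Variant per ALT allele from VCF-style text lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alts, qual = fields[:6]
        if qual == ".":
            continue  # no quality recorded for this call
        for alt in alts.split(","):
            yield Variant(chrom, int(pos), ref, alt, float(qual))


def filter_small_variants(lines, min_qual=30.0):
    """Keep calls whose QUAL is at least min_qual (illustrative threshold)."""
    return [v for v in parse_vcf_lines(lines) if v.qual >= min_qual]


# Made-up records, for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\t.",
    "chr1\t2000\t.\tC\tT\t10\tPASS\t.",
    "chr2\t3000\t.\tG\tGA,GAA\t99\tPASS\t.",
]
kept = filter_small_variants(example)
```

    In this sketch a multi-allelic record is split into one entry per ALT allele, so the low-quality call at position 2000 is dropped and the insertion record contributes two entries.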

  3. Efficient analysis of mouse genome sequences reveal many nonsense variants

    PubMed Central

    Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

    2016-01-01

    Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
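
    The in silico comparison described above (substitute the strain sequence into the reference coding sequence, translate both, and compare the amino acid sequences) can be sketched in Python as follows. This is an illustrative sketch under the standard genetic code, not the authors' pipeline; the function names and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to and including the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def gained_stop(ref_cds: str, alt_cds: str):
    """Return the 1-based codon position of a gained stop codon, or None."""
    ref_protein, alt_protein = translate(ref_cds), translate(alt_cds)
    if alt_protein.endswith("*") and len(alt_protein) < len(ref_protein):
        return len(alt_protein)
    return None


# Toy sequences, for illustration only.
reference = "ATGCAAGGTTGGAAATAA"  # M Q G W K stop
strain = "ATGCAAGGTTGAAAATAA"     # TGG -> TGA at codon 4
position = gained_stop(reference, strain)
```

    Comparing translated sequences rather than raw nucleotide differences means that synonymous changes are ignored automatically and only changes that shorten the protein are reported.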

  4. Nanopore sequencing detects structural variants in cancer.

    PubMed

    Norris, Alexis L; Workman, Rachael E; Fan, Yunfan; Eshleman, James R; Timp, Winston

    2016-01-01

    Despite advances in sequencing, structural variants (SVs) remain difficult to reliably detect due to the short read length (<300 bp) of 2nd generation sequencing. Not only do the reads (or paired-end reads) need to straddle a breakpoint, but repetitive elements often lead to ambiguities in the alignment of short reads. We propose to use the long-reads (up to 20 kb) possible with 3rd generation sequencing, specifically nanopore sequencing on the MinION. Nanopore sequencing relies on a similar concept to a Coulter counter, reading the DNA sequence from the change in electrical current resulting from a DNA strand being forced through a nanometer-sized pore embedded in a membrane. Though nanopore sequencing currently has a relatively high mismatch rate that precludes base substitution and small frameshift mutation detection, its accuracy is sufficient for SV detection because of its long reads. In fact, long reads in some cases may improve SV detection efficiency. We have tested nanopore sequencing to detect a series of well-characterized SVs, including large deletions, inversions, and translocations that inactivate the CDKN2A/p16 and SMAD4/DPC4 tumor suppressor genes in pancreatic cancer. Using PCR amplicon mixes, we have demonstrated that nanopore sequencing can detect large deletions, translocations and inversions at dilutions as low as 1:100, with as few as 500 reads per sample. Given the speed, small footprint, and low capital cost, nanopore sequencing could become the ideal tool for the low-level detection of cancer-associated SVs needed for molecular relapse, early detection, or therapeutic monitoring.
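
    The dilution and read-count figures above can be put in context with a short back-of-the-envelope calculation. The Python sketch below assumes, purely for illustration, that a 1:100 dilution means each read derives from the SV-bearing amplicon independently with probability 0.01; the abstract does not state this model, and the function names are hypothetical.

```python
from math import comb


def expected_variant_reads(n_reads: int, variant_fraction: float) -> float:
    """Expected number of reads from the variant amplicon."""
    return n_reads * variant_fraction


def prob_at_least(k: int, n_reads: int, variant_fraction: float) -> float:
    """Binomial probability of observing at least k variant reads."""
    p = variant_fraction
    p_fewer = sum(
        comb(n_reads, i) * p ** i * (1 - p) ** (n_reads - i) for i in range(k)
    )
    return 1.0 - p_fewer


# Assumed model: 500 reads, each variant-derived with probability 0.01.
mean_reads = expected_variant_reads(500, 0.01)
p_one_or_more = prob_at_least(1, 500, 0.01)
```

    Under this assumed model, about five of the 500 reads would be expected to come from the variant, and the chance of seeing at least one is above 99%.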

  5. Strategies to choose from millions of imputed sequence variants

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Millions of sequence variants are known, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Variant selection and imputation strategies were tested using 26 984 simulated reference bulls, of which 1 000 had 30 million sequence variants, 773 had 600 000 markers...

  6. Selecting sequence variants to improve genomic predictions for dairy cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Millions of genetic variants have been identified by population-scale sequencing projects, but subsets are needed for routine genomic predictions or to include on genotyping arrays. Methods of selecting sequence variants were compared using both simulated sequence genotypes and actual data from run ...

  7. αIIbβ3 variants defined by next-generation sequencing: Predicting variants likely to cause Glanzmann thrombasthenia

    PubMed Central

    Buitrago, Lorena; Rendon, Augusto; Liang, Yupu; Simeoni, Ilenia; Negri, Ana; Filizola, Marta; Ouwehand, Willem H.; Coller, Barry S.; Alessi, Marie-Christine; Ballmaier, Matthias; Bariana, Tadbir; Bellissimo, Daniel; Bertoli, Marta; Bray, Paul; Bury, Loredana; Carrell, Robin; Cattaneo, Marco; Collins, Peter; French, Deborah; Favier, Remi; Freson, Kathleen; Furie, Bruce; Germeshausen, Manuela; Ghevaert, Cedric; Gomez, Keith; Goodeve, Anne; Gresele, Paolo; Guerrero, Jose; Hampshire, Dan J.; Hadinnapola, Charaka; Heemskerk, Johan; Henskens, Yvonne; Hill, Marian; Hogg, Nancy; Johnsen, Jill; Kahr, Walter; Kerr, Ron; Kunishima, Shinji; Laffan, Michael; Natwani, Amit; Neerman-Arbez, Marguerite; Nurden, Paquita; Nurden, Alan; Ormiston, Mark; Othman, Maha; Ouwehand, Willem; Perry, David; Vilk, Shoshana Ravel; Reitsma, Pieter; Rondina, Matthew; Simeoni, Ilenia; Smethurst, Peter; Stephens, Jonathan; Stevenson, William; Szkotak, Artur; Turro, Ernest; Van Geet, Christel; Vries, Minka; Ward, June; Waye, John; Westbury, Sarah; Whiteheart, Sidney; Wilcox, David; Zhang, Bi

    2015-01-01

    Next-generation sequencing is transforming our understanding of human genetic variation but assessing the functional impact of novel variants presents challenges. We analyzed missense variants in the integrin αIIbβ3 receptor subunit genes ITGA2B and ITGB3 identified by whole-exome or -genome sequencing in the ThromboGenomics project, comprising ∼32,000 alleles from 16,108 individuals. We analyzed the results in comparison with 111 missense variants in these genes previously reported as being associated with Glanzmann thrombasthenia (GT), 20 associated with alloimmune thrombocytopenia, and 5 associated with aniso/macrothrombocytopenia. We identified 114 novel missense variants in ITGA2B (affecting ∼11% of the amino acids) and 68 novel missense variants in ITGB3 (affecting ∼9% of the amino acids). Of the variants, 96% had minor allele frequencies (MAF) < 0.1%, indicating their rarity. Based on sequence conservation, MAF, and location on a complete model of αIIbβ3, we selected three novel variants that affect amino acids previously associated with GT for expression in HEK293 cells. αIIb P176H and β3 C547G severely reduced αIIbβ3 expression, whereas αIIb P943A partially reduced αIIbβ3 expression and had no effect on fibrinogen binding. We used receiver operating characteristic curves of combined annotation-dependent depletion, Polyphen 2-HDIV, and sorting intolerant from tolerant to estimate the percentage of novel variants likely to be deleterious. At optimal cut-off values, which had 69–98% sensitivity in detecting GT mutations, between 27% and 71% of the novel αIIb or β3 missense variants were predicted to be deleterious. Our data have implications for understanding the evolutionary pressure on αIIbβ3 and highlight the challenges in predicting the clinical significance of novel missense variants. PMID:25827233
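
    The relationship between cohort size, allele count, and the MAF < 0.1% rarity threshold can be made concrete with a small Python sketch. It assumes diploid autosomal genotypes (two alleles per individual); the function names are hypothetical and are not from the study.

```python
def minor_allele_frequency(alt_count: int, n_individuals: int) -> float:
    """MAF from an alternate-allele count, assuming two alleles per person."""
    total_alleles = 2 * n_individuals
    freq = alt_count / total_alleles
    return min(freq, 1.0 - freq)


def is_rare(alt_count: int, n_individuals: int, threshold: float = 0.001) -> bool:
    """True if the variant's MAF is below the threshold (0.1% by default)."""
    return minor_allele_frequency(alt_count, n_individuals) < threshold


COHORT = 16108                  # individuals, from the abstract
TOTAL_ALLELES = 2 * COHORT      # 32,216, i.e. roughly 32,000 alleles

# Largest alternate-allele count that still counts as rare in this cohort.
max_rare_count = max(c for c in range(100) if is_rare(c, COHORT))
```

    With 16,108 individuals, a variant seen on up to 32 chromosomes still falls below the 0.1% threshold.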

  8. αIIbβ3 variants defined by next-generation sequencing: predicting variants likely to cause Glanzmann thrombasthenia.

    PubMed

    Buitrago, Lorena; Rendon, Augusto; Liang, Yupu; Simeoni, Ilenia; Negri, Ana; Filizola, Marta; Ouwehand, Willem H; Coller, Barry S

    2015-04-14

    Next-generation sequencing is transforming our understanding of human genetic variation but assessing the functional impact of novel variants presents challenges. We analyzed missense variants in the integrin αIIbβ3 receptor subunit genes ITGA2B and ITGB3 identified by whole-exome or -genome sequencing in the ThromboGenomics project, comprising ∼32,000 alleles from 16,108 individuals. We analyzed the results in comparison with 111 missense variants in these genes previously reported as being associated with Glanzmann thrombasthenia (GT), 20 associated with alloimmune thrombocytopenia, and 5 associated with aniso/macrothrombocytopenia. We identified 114 novel missense variants in ITGA2B (affecting ∼11% of the amino acids) and 68 novel missense variants in ITGB3 (affecting ∼9% of the amino acids). Of the variants, 96% had minor allele frequencies (MAF) < 0.1%, indicating their rarity. Based on sequence conservation, MAF, and location on a complete model of αIIbβ3, we selected three novel variants that affect amino acids previously associated with GT for expression in HEK293 cells. αIIb P176H and β3 C547G severely reduced αIIbβ3 expression, whereas αIIb P943A partially reduced αIIbβ3 expression and had no effect on fibrinogen binding. We used receiver operating characteristic curves of combined annotation-dependent depletion, Polyphen 2-HDIV, and sorting intolerant from tolerant to estimate the percentage of novel variants likely to be deleterious. At optimal cut-off values, which had 69-98% sensitivity in detecting GT mutations, between 27% and 71% of the novel αIIb or β3 missense variants were predicted to be deleterious. Our data have implications for understanding the evolutionary pressure on αIIbβ3 and highlight the challenges in predicting the clinical significance of novel missense variants.

  9. Rare variant detection using family-based sequencing analysis.

    PubMed

    Peng, Gang; Fan, Yu; Palculict, Timothy B; Shen, Peidong; Ruteshouser, E Cristy; Chi, Aung-Kyaw; Davis, Ronald W; Huff, Vicki; Scharfe, Curt; Wang, Wenyi

    2013-03-05

    Next-generation sequencing is revolutionizing genomic analysis, but this analysis can be compromised by high rates of missing true variants. To develop a robust statistical method capable of identifying variants that would otherwise not be called, we conducted sequence data simulations and both whole-genome and targeted sequencing data analysis of 28 families. Our method (Family-Based Sequencing Program, FamSeq) integrates Mendelian transmission information and raw sequencing reads. Sequence analysis using FamSeq reduced the number of false negative variants by 14-33% as assessed by HapMap sample genotype confirmation. In a large family affected with Wilms tumor, 84% of variants uniquely identified by FamSeq were confirmed by Sanger sequencing. In children with early-onset neurodevelopmental disorders from 26 families, de novo variant calls in disease candidate genes were corrected by FamSeq as mendelian variants, and the number of uniquely identified variants in affected individuals increased proportionally as additional family members were included in the analysis. To gain insight into maximizing variant detection, we studied factors impacting actual improvements of family-based calling, including pedigree structure, allele frequency (common vs. rare variants), prior settings of minor allele frequency, sequence signal-to-noise ratio, and coverage depth (∼20× to >200×). These data will help guide the design, analysis, and interpretation of family-based sequencing studies to improve the ability to identify new disease-associated genes.
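
    The idea of combining Mendelian transmission information with raw read evidence can be illustrated with a simplified Bayesian sketch in Python. This is not FamSeq's implementation: it treats the parental genotypes as known, uses a basic binomial read model with a fixed error rate, and all names are hypothetical.

```python
def transmission_prior(mother: int, father: int):
    """P(child carries 0, 1, or 2 alt alleles) given parental alt-allele counts."""
    pm, pf = mother / 2, father / 2  # chance each parent transmits the alt allele
    return [(1 - pm) * (1 - pf), pm * (1 - pf) + (1 - pm) * pf, pm * pf]


def read_likelihoods(alt_reads: int, total_reads: int, error: float = 0.01):
    """Likelihood of the observed reads under each genotype (0, 1, 2 alt alleles)."""
    alt_read_prob = [error, 0.5, 1 - error]
    return [
        p ** alt_reads * (1 - p) ** (total_reads - alt_reads)
        for p in alt_read_prob
    ]


def genotype_posterior(alt_reads, total_reads, prior, error=0.01):
    """Normalized posterior over the child's genotype."""
    joint = [
        lik * pr
        for lik, pr in zip(read_likelihoods(alt_reads, total_reads, error), prior)
    ]
    total = sum(joint)
    return [j / total for j in joint]


# Low-coverage site: 1 alt read out of 6 in the child (made-up numbers).
flat_prior = [1 / 3, 1 / 3, 1 / 3]
family_prior = transmission_prior(mother=1, father=1)  # both parents heterozygous

without_family = genotype_posterior(1, 6, flat_prior)
with_family = genotype_posterior(1, 6, family_prior)
```

    In this toy case the heterozygous call gains support once the family prior is used, which is the kind of borderline variant that read evidence alone might leave uncalled.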

  10. Selection of sequence variants to improve dairy cattle genomic predictions

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic prediction reliabilities improved when adding selected sequence variants from run 5 of the 1,000 bull genomes project. High density (HD) imputed genotypes for 26,970 progeny tested Holstein bulls were combined with sequence variants for 444 Holstein animals. The first test included 481,904 c...

  11. Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments

    PubMed Central

    Qi, Yuan; Liu, Xiuping; Liu, Chang-gong; Wang, Bailing; Hess, Kenneth R.; Symmans, W. Fraser; Shi, Weiwei; Pusztai, Lajos

    2015-01-01

    Nucleotide alterations detected by next generation sequencing are not always true biological changes but could represent sequencing errors. Even highly accurate methods can yield substantial error rates when applied to millions of nucleotides. In this study, we examined the reproducibility of nucleotide variant calls in replicate sequencing experiments of the same genomic DNA. We performed targeted sequencing of all known human protein kinase genes (kinome) (~3.2 Mb) using the SOLiD v4 platform. Seventeen breast cancer samples were sequenced in duplicate (n=14) or triplicate (n=3) to assess concordance of all calls and single nucleotide variant (SNV) calls. The concordance rates over the entire sequenced region were >99.99%, while the concordance rates for SNVs were 54.3-75.5%. There was substantial variation in basic sequencing metrics from experiment to experiment. The type of nucleotide substitution and genomic location of the variant had little impact on concordance but concordance increased with coverage level, variant allele count (VAC), variant allele frequency (VAF), variant allele quality and p-value of SNV-call. The most important determinants of concordance were VAC and VAF. Even using the highest stringency of QC metrics the reproducibility of SNV calls was around 80% suggesting that erroneous variant calling can be as high as 20-40% in a single experiment. The sequence data have been deposited into the European Genome-phenome Archive (EGA) with accession number EGAS00001000826. PMID:26136146
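
    A replicate comparison of this kind can be sketched in Python as follows. The abstract does not give the exact concordance formula, so this sketch assumes concordance is the share of SNV calls found in both replicates out of all calls found in either (a Jaccard index); the function names and example calls are hypothetical.

```python
def snv_concordance(calls_a, calls_b) -> float:
    """Fraction of distinct SNV calls shared by two replicates (Jaccard index)."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 1.0  # no calls in either replicate: trivially concordant
    return len(a & b) / len(union)


def max_discordant_sites(region_size: int, concordance: float) -> float:
    """Upper bound on discordant positions implied by an overall concordance rate."""
    return region_size * (1.0 - concordance)


# Made-up calls keyed by (chromosome, position, ref, alt).
replicate_1 = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 300, "G", "A")]
replicate_2 = [("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 400, "T", "C")]

concordance = snv_concordance(replicate_1, replicate_2)

# A 3.2 Mb target at 99.99% overall concordance still allows ~320 discordant sites.
discordant = max_discordant_sites(3_200_000, 0.9999)
```

    The second calculation shows why near-perfect concordance over the whole region is compatible with much lower concordance among the comparatively few positions called as SNVs.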

  12. Identification of STRA6 and SKI sequence variants in patients with anophthalmia/microphthalmia

    PubMed Central

    White, Tristan; Lu, Tianyi; Metlapally, Ravikanth; Katowitz, James; Kherani, Femida; Wang, Tian-Yuan; Tran-Viet, Khanh-Nhat

    2008-01-01

    Purpose: Anophthalmia and microphthalmia (A/M) are rare congenital ocular malformations presenting with the absence of eye components or small eyes with or without structural abnormalities. A/M can be isolated or syndromic. The stimulated by retinoic acid gene 6 (STRA6) and Sloan-Kettering viral oncogene homolog (SKI) genes are involved in vitamin A metabolism, and are implicated with A/M developmental abnormalities in human and animal studies. Vitamin A metabolism is vital to normal eye development and growth. This study explores the association of these genes in a cohort of subjects with A/M. Methods: STRA6 and SKI were screened for sequence variants by direct sequencing of genomic DNA samples from 18 affected subjects with A/M. The DNA samples of 4 external, unrelated controls were initially screened. Eighty-nine additional unrelated controls were screened to confirm that any sequence variants found in the affected subject DNA samples were related to the phenotype. Coding regions, intron-exon boundaries, and untranslated regions were sequenced by standard techniques. Derived DNA sequences were compared to known reference sequences from public genomic databases. Results: For STRA6, a novel coding non-synonymous sequence variant was found in one subject, resulting in an amino acid change from glycine to glutamic acid in residue 217. One novel nonsense sequence variant found in the same subject changed the STRA6 amino acid residue 592 from cytosine to thymine resulting in a premature stop codon. For SKI, a known coding non-synonymous sequence variant (rs28384811) was found in 3 subject DNA samples and 11/89 control DNA samples. Four novel coding-synonymous sequence variants were observed in SKI. Conclusions: The STRA6 sequence variants reported in this study could play a role in the pathogenesis of A/M by structural changes to the STRA6 protein. We can attribute 4% A/M incidence in this cohort to these sequence variants. Although no SKI sequence variants were found in

  13. Rich annotation of DNA sequencing variants by leveraging the Ensembl Variant Effect Predictor with plugins.

    PubMed

    Yourshaw, Michael; Taylor, S Paige; Rao, Aliz R; Martín, Martín G; Nelson, Stanley F

    2015-03-01

    High-throughput DNA sequencing has become a mainstay for the discovery of genomic variants that may cause disease or affect phenotype. A next-generation sequencing pipeline typically identifies thousands of variants in each sample. A particular challenge is the annotation of each variant in a way that is useful to downstream consumers of the data, such as clinical sequencing centers or researchers. These users may require that all data storage and analysis remain on secure local servers to protect patient confidentiality or intellectual property, may have unique and changing needs to draw on a variety of annotation data sets and may prefer not to rely on closed-source applications beyond their control. Here we describe scalable methods for using the plugin capability of the Ensembl Variant Effect Predictor to enrich its basic set of variant annotations with additional data on genes, function, conservation, expression, diseases, pathways and protein structure, and describe an extensible framework for easily adding additional custom data sets.

  14. Phosphodiesterase sequence variants may predispose to prostate cancer

    PubMed Central

    de Alexandre, Rodrigo Bertollo; Horvath, Anelia; Szarek, Eva; Manning, Allison D.; Leal, Leticia Ferro; Kardauke, Fabio; Epstein, Jonathan A.; Carraro, Dirce Maria; Soares, Fernando Augusto; Apanasovich, Tatiyana; Stratakis, Constantine A.; Faucz, Fabio Rueda

    2015-01-01

    We hypothesized that mutations that inactivate phosphodiesterase (PDE) activity and lead to increased cyclic AMP (cAMP) and cyclic GMP (cGMP) levels may be associated with prostate cancer (PCa). We sequenced the entire PDE coding sequences in the DNA of 16 biopsy samples from PCa patients. Novel mutations were confirmed in the somatic or germline state by Sanger sequencing. Data were then compared to the 1000 Genome Project. PDE, CREB and pCREB protein expression was also studied in all samples, in both normal and abnormal tissue, by immunofluorescence. We identified 3 previously described PDE sequence variants that were significantly higher in PCa. Four novel sequence variations, one each in the PDE4B, PDE6C, PDE7B and PDE10A genes, respectively, were also found in the PCa samples. Interestingly, PDE10A and PDE4B novel variants that were present in 19% and 6% of the patients, respectively, were found in the tumor tissue only. In patients carrying PDE defects, there was pCREB accumulation (p<0.001), and an increase of the pCREB/CREB ratio (patients 0.97 ± 0.03; controls 0.52 ± 0.03; p-value < 0.001) by immunohistochemical analysis. We conclude that PDE sequence variants may play a role in the predisposition and/or progression to PCa at the germline and/or somatic state, respectively. Larger such studies are needed to confirm these findings. PMID:25979379

  15. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  16. Fast single-pass alignment and variant calling using sequencing data

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. New program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts...

  17. Localized structural frustration for evaluating the impact of sequence variants

    PubMed Central

    Kumar, Sushant; Clarke, Declan; Gerstein, Mark

    2016-01-01

    Population-scale sequencing is increasingly uncovering large numbers of rare single-nucleotide variants (SNVs) in coding regions of the genome. The rarity of these variants makes it challenging to evaluate their deleteriousness with conventional phenotype–genotype associations. Protein structures provide a way of addressing this challenge. Previous efforts have focused on globally quantifying the impact of SNVs on protein stability. However, local perturbations may severely impact protein functionality without strongly disrupting global stability (e.g. in relation to catalysis or allostery). Here, we describe a workflow in which localized frustration, quantifying unfavorable local interactions, is employed as a metric to investigate such effects. Using this workflow on the Protein Databank, we find that frustration produces many immediately intuitive results: for instance, disease-related SNVs create stronger changes in localized frustration than non-disease related variants, and rare SNVs tend to disrupt local interactions to a larger extent than common variants. Less obviously, we observe that somatic SNVs associated with oncogenes and tumor suppressor genes (TSGs) induce very different changes in frustration. In particular, those associated with TSGs change the frustration more in the core than the surface (by introducing loss-of-function events), whereas those associated with oncogenes manifest the opposite pattern, creating gain-of-function events. PMID:27915290

  18. Omnipotent decoding potential resides in eukaryotic translation termination factor eRF1 of variant-code organisms and is modulated by the interactions of amino acid sequences within domain 1.

    PubMed

    Ito, Koichi; Frolova, Ludmila; Seit-Nebi, Alim; Karamyshev, Andrey; Kisselev, Lev; Nakamura, Yoshikazu

    2002-06-25

    In eukaryotes, a single translational release factor, eRF1, deciphers three stop codons, although its decoding mechanism remains puzzling. In the ciliate Tetrahymena thermophila, UAA and UAG codons are reassigned to Gln codons. A yeast eRF1-domain swap containing Tetrahymena domain 1 responded only to UGA in vitro and failed to complement a defect in yeast eRF1 in vivo at 37 °C. This finding demonstrates that decoding specificity of eRF1 from variant code organisms resides at domain 1. However, the wild-type eRF1 hybrid fully restored the growth of eRF1-deficient yeast at 30 °C. Tetrahymena eRF1 contains a variant sequence, KATNIKD, at the tip of domain 1. The TASNIKD variant of hybrid eRF1 rendered the eRF1-nullified yeast viable, although in an in vitro assay, the same hybrid eRF1 responded only to UGA. Nevertheless, the yeast eRF1 bearing the KATNIKD motif instead of the TASNIKS heptapeptide present in higher eukaryotes remains omnipotent in vivo. Collectively, these data suggest that variant genetic code organisms like Tetrahymena have an intrinsic potential to decode three stop codons in vivo, and that interaction within domain 1 between the KAT tripeptide and other sequences modulates the decoding specificity of Tetrahymena eRF1.

  19. Novel sequence feature variant type analysis of the HLA genetic association in systemic sclerosis

    PubMed Central

    Karp, David R.; Marthandan, Nishanth; Marsh, Steven G.E.; Ahn, Chul; Arnett, Frank C.; DeLuca, David S.; Diehl, Alexander D.; Dunivin, Raymond; Eilbeck, Karen; Feolo, Michael; Guidry, Paula A.; Helmberg, Wolfgang; Lewis, Suzanna; Mayes, Maureen D.; Mungall, Chris; Natale, Darren A.; Peters, Bjoern; Petersdorf, Effie; Reveille, John D.; Smith, Barry; Thomson, Glenys; Waller, Matthew J.; Scheuermann, Richard H.

    2010-01-01

    We describe a novel approach to genetic association analyses with proteins sub-divided into biologically relevant smaller sequence features (SFs), and their variant types (VTs). SFVT analyses are particularly informative for study of highly polymorphic proteins such as the human leukocyte antigen (HLA), given the nature of its genetic variation: the high level of polymorphism, the pattern of amino acid variability, and that most HLA variation occurs at functionally important sites, as well as its known role in organ transplant rejection, autoimmune disease development and response to infection. Further, combinations of variable amino acid sites shared by several HLA alleles (shared epitopes) are most likely better descriptors of the actual causative genetic variants. In a cohort of systemic sclerosis patients/controls, SFVT analysis shows that a combination of SFs implicating specific amino acid residues in peptide binding pockets 4 and 7 of HLA-DRB1 explains much of the molecular determinant of risk. PMID:19933168

  20. Re-ranking sequencing variants in the post-GWAS era for accurate causal variant identification.

    PubMed

    Faye, Laura L; Machiela, Mitchell J; Kraft, Peter; Bull, Shelley B; Sun, Lei

    2013-01-01

    Next generation sequencing has dramatically increased our ability to localize disease-causing variants by providing base-pair level information at costs increasingly feasible for the large sample sizes required to detect complex-trait associations. Yet, identification of causal variants within an established region of association remains a challenge. Counter-intuitively, certain factors that increase power to detect an associated region can decrease power to localize the causal variant. First, combining GWAS with imputation or low coverage sequencing to achieve the large sample sizes required for high power can have the unintended effect of producing differential genotyping error among SNPs. This tends to bias the relative evidence for association toward better genotyped SNPs. Second, re-use of GWAS data for fine-mapping exploits previous findings to ensure genome-wide significance in GWAS-associated regions. However, using GWAS findings to inform fine-mapping analysis can bias evidence away from the causal SNP toward the tag SNP and SNPs in high LD with the tag. Together these factors can reduce power to localize the causal SNP by more than half. Other strategies commonly employed to increase power to detect association, namely increasing sample size and using higher density genotyping arrays, can, in certain common scenarios, actually exacerbate these effects and further decrease power to localize causal variants. We develop a re-ranking procedure that accounts for these adverse effects and substantially improves the accuracy of causal SNP identification, often doubling the probability that the causal SNP is top-ranked. Application to the NCI BPC3 aggressive prostate cancer GWAS with imputation meta-analysis identified a new top SNP at 2 of 3 associated loci and several additional possible causal SNPs at these loci that may have otherwise been overlooked. This method is simple to implement using R scripts provided on the author's website.

  1. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  2. mitoSAVE: mitochondrial sequence analysis of variants in Excel.

    PubMed

    King, Jonathan L; Sajantila, Antti; Budowle, Bruce

    2014-09-01

    The mitochondrial genome (mtGenome) contains genetic information amenable to numerous applications such as medical research, population and evolutionary studies, and human identity testing. However, inconsistent nomenclature assignment makes haplotype comparison difficult and can lead to false exclusion of potentially useful profiles. Massively Parallel Sequencing (MPS) is a platform for sequencing large datasets and potentially whole populations with relative ease. However, the data generated are not easily parsed and interpreted. With this in mind, mitoSAVE has been developed to enable fast conversion of Variant Call Format (VCF) files. mitoSAVE is an Excel-based workbook that converts data within the VCF into mtDNA haplotypes using phylogenetically-established nomenclature as well as rule-based alignments consistent with current forensic standards. mitoSAVE is formatted for the human mitochondrial genome; however, it can easily be adapted to support other reasonably small genomes.
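
    As a rough illustration of the kind of VCF-to-haplotype conversion described above, the following minimal sketch reads VCF text and reports single-base substitutions as position-plus-base labels (for example "73G"). It is not mitoSAVE's logic: the function name `vcf_snvs_to_haplotype` is hypothetical, insertions, deletions and multi-allelic sites are simply skipped, and none of the forensic alignment rules are applied.

```python
def vcf_snvs_to_haplotype(vcf_text):
    """Return labels such as '73G' for simple single-base substitutions.

    Assumes tab-separated VCF records with the standard column order
    (CHROM, POS, ID, REF, ALT, ...). Indels and multi-allelic records
    are ignored in this simplified sketch.
    """
    calls = []
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the header row
        fields = line.split("\t")
        pos, ref, alt = int(fields[1]), fields[3], fields[4]
        if len(ref) == 1 and len(alt) == 1 and alt != ".":
            calls.append((pos, f"{pos}{alt}"))
    # Report the differences in positional order
    return [label for _, label in sorted(calls)]


example = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chrM\t263\t.\tA\tG\t.\tPASS\t.\n"
    "chrM\t73\t.\tA\tG\t.\tPASS\t.\n"
    "chrM\t310\t.\tT\tTC\t.\tPASS\t.\n"
)
print(vcf_snvs_to_haplotype(example))
```

    The insertion at position 310 is dropped here on purpose; handling length variants consistently is exactly the nomenclature problem the abstract refers to.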

  3. Whole-exome sequencing identifies variants in invasive pituitary adenomas

    PubMed Central

    Lan, Xiaolei; Gao, Hua; Wang, Fei; Feng, Jie; Bai, Jiwei; Zhao, Peng; Cao, Lei; Gui, Songbai; Gong, Lei; Zhang, Yazhuo

    2016-01-01

    Pituitary adenomas exhibit a wide range of behaviors. The prediction of invasion or malignant behavior in pituitary adenomas remains challenging. The objective of the present study was to identify the genetic abnormalities associated with invasion in sporadic pituitary adenomas. In the present study, the exomes of six invasive pituitary adenomas (IPA) and six non-invasive pituitary adenomas (nIPA) were sequenced by whole-exome sequencing. Variants were confirmed by dideoxynucleotide sequencing, and candidate driver genes were assessed in an additional 28 pituitary adenomas. A total of 15 identified variants were mainly associated with angiogenesis, metabolism, cell cycle phase, cellular component organization, cytoskeleton and biogenesis immune at a cellular level, including 13 variants that occurred as single nucleotide variants and 2 that comprised insertions. The messenger RNA (mRNA) levels of diffuse panbronchiolitis critical region 1 (DPCR1), KIAA0226, myxovirus (influenza virus) resistance, proline-rich protein BstNI subfamily 3, PR domain containing 2, with ZNF domain, RIZ1 (PRDM2), PR domain containing 8 (PRDM8), SPANX family member N2 (SPANXN2), TRIO and F-actin binding protein and zinc finger protein 717 in IPA specimens were 50% decreased compared with nIPA specimens. In particular, DPCR1, PRDM2, PRDM8 and SPANXN2 mRNA levels in IPA specimens were approximately four-fold lower compared with nIPA specimens (P=0.003, 0.007, 0.009 and 0.004, respectively). By contrast, the mRNA levels of dentin sialophospho protein, EGF like domain, multiple 7 (EGFL7), low density lipoprotein receptor-related protein 1B and dynein, axonemal, assembly factor 1 (LRRC50) were increased in IPA compared with nIPA specimens (P=0.041, 0.037, 0.022 and 0.013, respectively). Furthermore, decreased PRDM2 expression was associated with tumor recurrence. The findings of the present study indicate that DPCR1, EGFL7, the PRDM family and LRRC50 in pituitary adenomas are modifiers of

  4. Hypomorphic variants of cationic amino acid transporter 3 in males with autism spectrum disorders.

    PubMed

    Nava, Caroline; Rupp, Johanna; Boissel, Jean-Paul; Mignot, Cyril; Rastetter, Agnès; Amiet, Claire; Jacquette, Aurélia; Dupuits, Céline; Bouteiller, Delphine; Keren, Boris; Ruberg, Merle; Faudet, Anne; Doummar, Diane; Philippe, Anne; Périsse, Didier; Laurent, Claudine; Lebrun, Nicolas; Guillemot, Vincent; Chelly, Jamel; Cohen, David; Héron, Delphine; Brice, Alexis; Closs, Ellen I; Depienne, Christel

    2015-12-01

    Cationic amino acid transporters (CATs) mediate the entry of L-type cationic amino acids (arginine, ornithine and lysine) into the cells including neurons. CAT-3, encoded by the SLC7A3 gene on chromosome X, is one of the three CATs present in the human genome, with selective expression in brain. SLC7A3 is highly intolerant to variation in humans, as attested by the low frequency of deleterious variants in available databases, but the impact of variants in this gene in humans remains undefined. In this study, we identified a missense variant in SLC7A3, encoding the CAT-3 cationic amino acid transporter, on chromosome X by exome sequencing in two brothers with autism spectrum disorder (ASD). We then sequenced the SLC7A3 coding sequence in 148 male patients with ASD and identified three additional rare missense variants in unrelated patients. Functional analyses of the mutant transporters showed that two of the four identified variants cause severe or moderate loss of CAT-3 function due to altered protein stability or abnormal trafficking to the plasma membrane. The patient with the most deleterious SLC7A3 variant had high-functioning autism and epilepsy, and also carries a de novo 16p11.2 duplication possibly contributing to his phenotype. This study shows that rare hypomorphic variants of SLC7A3 exist in male individuals and suggests that SLC7A3 variants possibly contribute to the etiology of ASD in male subjects in association with other genetic factors.

  5. Predicted Molecular Effects of Sequence Variants Link to System Level of Disease

    PubMed Central

    Bromberg, Yana; Rost, Burkhard

    2016-01-01

    Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease. PMID:27536940

  6. Mutational analysis of the variant surface glycoprotein GPI-anchor signal sequence in Trypanosoma brucei.

    PubMed

    Böhme, Ulrike; Cross, George A M

    2002-02-15

    The variant surface glycoproteins (VSG) of Trypanosoma brucei are anchored to the cell surface via a glycosylphosphatidylinositol (GPI) anchor. All GPI-anchored proteins are synthesized with a C-terminal signal sequence, which is replaced by a GPI-anchor in a rapid post-translational transamidation reaction. VSG GPI signal sequences are extraordinarily conserved. They contain either 23 or 17 amino acids, a difference that distinguishes the two major VSG classes, and consist of a spacer sequence followed by a more hydrophobic region. The omega amino acid, to which GPI is transferred, is either Ser, Asp or Asn, the omega+2 amino acid is always Ser, and the omega+7 amino acid is almost always Lys. In order to determine whether this high conservation is necessary for GPI anchoring, we introduced several mutations into the signal peptide. Surprisingly, changing the most conserved amino acids, at positions omega+1, omega+2 and omega+7, had no detectable effect on the efficiency of GPI-anchoring or on protein abundance. Several more extensive changes also had no discernable impact on GPI-anchoring. Deleting the entire 23 amino-acid signal sequence or the 15 amino-acid hydrophobic region generated proteins that were not anchored. Instead of being secreted, these truncated proteins accumulated in the endoplasmic reticulum prior to lysosomal degradation. Replacing the GPI signal sequence with a proven cell-surface membrane-spanning domain reduced expression by about 99% and resulted not in cell surface expression but in accumulation close to the flagellar pocket and in non-lysosomal compartments. These results indicate that the high conservation of the VSG GPI signal sequence is not necessary for efficient expression and GPI attachment. Instead, the GPI anchor is essential for surface expression of VSG. However, because the VSG is a major virulence factor, it is possible that small changes in the efficiency of GPI anchoring, undetectable in our experiments, might have

  7. The current state of clinical interpretation of sequence variants.

    PubMed

    Hoskinson, Derick C; Dubuc, Adrian M; Mason-Suares, Heather

    2017-01-31

    Accurate and consistent variant classification is required for Precision Medicine, but clinical variant classification remains in its infancy. While recent guidelines put forth jointly by the American College of Medical Genetics and Genomics (ACMG) and Association of Molecular Pathology (AMP) for the classification of Mendelian variants have advanced the field, the degree of subjectivity allowed by these guidelines can still lead to inconsistent classification across clinical molecular genetic laboratories. In addition, there are currently no such guidelines for somatic cancer variants, only published institutional practices. Additional variant classification guidelines, including disease- or gene-specific criteria, along with inter-laboratory data sharing, are critical for accurate and consistent variant interpretation.

  8. Many amino acid substitution variants identified in DNA repair genes during human population screenings are predicted to impact protein function

    SciTech Connect

    Xi, T; Jones, I M; Mohrenweiser, H W

    2003-11-03

    Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant From Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant". Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or Possibly Damaging". Another 9-15% of the variants were classed as "Potentially Intolerant or Damaging". The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign". Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.
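
    The two classification rates quoted above follow directly from the counts given in the abstract. A small sketch of that arithmetic, in which the helper name `percent` is an illustrative choice and only the counts come from the record:

```python
def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts as reported in the abstract
sift_intolerant = percent(226, 508)      # SIFT "Intolerant"
polyphen_damaging = percent(165, 489)    # PolyPhen "Probably or Possibly Damaging"

print(f"SIFT: {sift_intolerant:.1f}%  PolyPhen: {polyphen_damaging:.1f}%")
```

    Rounded to whole percentages, these reproduce the 44% and 34% figures stated in the abstract.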

  9. Leveraging ancestry to improve causal variant identification in exome sequencing for monogenic disorders.

    PubMed

    Brown, Robert; Lee, Hane; Eskin, Ascia; Kichaev, Gleb; Lohmueller, Kirk E; Reversade, Bruno; Nelson, Stanley F; Pasaniuc, Bogdan

    2016-01-01

    Recent breakthroughs in exome-sequencing technology have made possible the identification of many causal variants of monogenic disorders. Although extremely powerful when closely related individuals (eg, child and parents) are simultaneously sequenced, sequencing of a single case is often unsuccessful due to the large number of variants that need to be followed up for functional validation. Many approaches filter out common variants above a given frequency threshold (eg, 1%), and then prioritize the remaining variants according to their functional, structural and conservation properties. Here we present methods that leverage the genetic structure across different populations to improve filtering performance while accounting for the finite sample size of the reference panels. We show that leveraging genetic structure reduces the number of variants that need to be followed up by 16% in simulations and by up to 38% in empirical data of 20 exomes from individuals with monogenic disorders for which the causal variants are known.
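
    The baseline strategy mentioned above, discarding variants that are common in reference panels, can be sketched as follows. This is only the simple threshold filter, not the authors' ancestry-aware method, and it does not model finite panel size. The record layout, the population labels and the function name `filter_rare` are assumptions made for illustration.

```python
def filter_rare(variants, threshold=0.01):
    """Keep variants whose allele frequency is at or below `threshold`
    in every reference population listed for that variant.

    `variants` is a list of dicts with hypothetical keys:
      "id"    -> variant identifier
      "freqs" -> mapping of population label to allele frequency
    A variant with no frequency data is treated as rare and kept.
    """
    kept = []
    for variant in variants:
        highest = max(variant["freqs"].values(), default=0.0)
        if highest <= threshold:
            kept.append(variant["id"])
    return kept


# Made-up example data, not taken from the study
candidates = [
    {"id": "var1", "freqs": {"popA": 0.002, "popB": 0.000}},
    {"id": "var2", "freqs": {"popA": 0.004, "popB": 0.030}},  # common in popB
    {"id": "var3", "freqs": {}},                              # unobserved
]
print(filter_rare(candidates))
```

    Taking the maximum across populations means a variant is dropped as soon as it is common anywhere, which is one simple way that population structure can shorten the follow-up list.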

  10. Characterization of alanine to valine sequence variants in the Fc region of nivolumab biosimilar produced in Chinese hamster ovary cells

    PubMed Central

    Li, Yantao; Fu, Tuo; Liu, Tao; Guo, Huaizu; Guo, Qingcheng; Xu, Jin; Zhang, Dapeng; Qian, Weizhu; Dai, Jianxin; Li, Bohua; Guo, Yajun; Hou, Sheng; Wang, Hao

    2016-01-01

    Nivolumab is a therapeutic fully human IgG4 antibody to programmed death 1 (PD-1). In this study, a nivolumab biosimilar, which was produced in our laboratory, was analyzed and characterized. Sequence variants that contain undesired amino acid sequences may cause concern during biosimilar bioprocess development. We found that low levels of sequence variants were detected in the heavy chain of the nivolumab biosimilar by ultra performance liquid chromatography (UPLC) and tandem mass spectrometry. It was further identified with UPLC-MS/MS by IdeS or trypsin digestion. The sequence variant was confirmed through addition of synthetic mutant peptide. Subsequently, the mixing base signal of normal and mutant sequence was detected through DNA sequencing. The relative levels of mutant A424V in the Fc region of the heavy chain have been detected and demonstrated to be 12.25% and 13.54%, via base peak intensity (BPI) and UV chromatography of the tryptic peptide mapping, respectively. A424V variant was also quantified by real-time PCR (RT-PCR) at the DNA and RNA level, which was 19.2% and 16.8%, respectively. The relative content of the mutant was consistent at the DNA, RNA and protein level, indicating that the A424V mutation may have little influence at transcriptional or translational levels. These results demonstrate that orthogonal state-of-the-art techniques such as LC-UV-MS and RT-PCR should be implemented to characterize recombinant proteins and cell lines for development of biosimilars. Our study suggests that it is important to establish an integrated and effective analytical method to monitor and characterize sequence variants during antibody drug development, especially for antibody biosimilar products. PMID:27050807

  11. Rare Variants Association Analysis in Large-Scale Sequencing Studies at the Single Locus Level

    PubMed Central

    Lu, Wenbin; Tzeng, Jung-Ying

    2016-01-01

    Genetic association analyses of rare variants in next-generation sequencing (NGS) studies are fundamentally challenging due to the presence of a very large number of candidate variants at extremely low minor allele frequencies. Recent developments often focus on pooling multiple variants to provide association analysis at the gene instead of the locus level. Nonetheless, pinpointing individual variants is a critical goal for genomic research as such information can facilitate the precise delineation of molecular mechanisms and functions of genetic factors on diseases. Due to the extreme rarity of mutations and high-dimensionality, significances of causal variants cannot easily stand out from those of noncausal ones. Consequently, standard false-positive control procedures, such as the Bonferroni and false discovery rate (FDR), are often impractical to apply, as a majority of the causal variants can only be identified along with a few but unknown number of noncausal variants. To provide informative analysis of individual variants in large-scale sequencing studies, we propose the Adaptive False-Negative Control (AFNC) procedure that can include a large proportion of causal variants with high confidence by introducing a novel statistical inquiry to determine those variants that can be confidently dispatched as noncausal. The AFNC provides a general framework that can accommodate a variety of models and significance tests. The procedure is computationally efficient and can adapt to the underlying proportion of causal variants and quality of significance rankings. Extensive simulation studies across a plethora of scenarios demonstrate that the AFNC is advantageous for identifying individual rare variants, whereas the Bonferroni and FDR are exceedingly over-conservative for rare variants association studies. In the analyses of the CoLaus dataset, AFNC has identified individual variants most responsible for gene-level significances. Moreover, single-variant results
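
    For context on the two standard procedures the abstract compares against, the sketch below implements the textbook Bonferroni correction and the Benjamini-Hochberg step-up FDR procedure. It does not implement AFNC, whose details are not given in this record; the function names and the example p-values are illustrative only.

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control.

    Finds the largest rank k such that p_(k) <= alpha * k / m and
    rejects the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if pvalues[index] <= alpha * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected


# Made-up p-values for five variants
pvals = [0.001, 0.012, 0.02, 0.039, 0.5]
print(bonferroni(pvals))
print(benjamini_hochberg(pvals))
```

    On these made-up values Bonferroni keeps only the smallest p-value while Benjamini-Hochberg keeps four. With the far larger number of tests in a sequencing study both thresholds become much stricter, which is the conservativeness the abstract describes.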

  12. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare

    PubMed Central

    2012-01-01

    Background The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Results Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. Conclusions This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids. PMID:22340285
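
    Average sequence coverage is conventionally total sequenced bases divided by genome length, so the two figures quoted above imply a denominator. The sketch below back-calculates it; the abstract does not state which genome length the authors used, so the result is an inference from the reported numbers, and the function name is hypothetical.

```python
def implied_genome_size_gb(total_bases_gb, mean_coverage):
    """Genome length (in Gb) implied by coverage = total bases / genome length."""
    return total_bases_gb / mean_coverage


# Figures reported in the abstract: 59.6 Gb of sequence at 24.7X average coverage
size = implied_genome_size_gb(59.6, 24.7)
print(f"Implied genome length: {size:.2f} Gb")
```

    The implied length of roughly 2.4 Gb can be compared against whichever reference assembly length a reader considers appropriate.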

  13. Population sequencing of two endocannabinoid metabolic genes identifies rare and common regulatory variants associated with extreme obesity and metabolite level

    PubMed Central

    2010-01-01

    Background Targeted re-sequencing of candidate genes in individuals at the extremes of a quantitative phenotype distribution is a method of choice to gain information on the contribution of rare variants to disease susceptibility. The endocannabinoid system mediates signaling in the brain and peripheral tissues involved in the regulation of energy balance, is highly active in obese patients, and represents a strong candidate pathway to examine for genetic association with body mass index (BMI). Results We sequenced two intervals (covering 188 kb) encoding the endocannabinoid metabolic enzymes fatty-acid amide hydrolase (FAAH) and monoglyceride lipase (MGLL) in 147 normal controls and 142 extremely obese cases. After applying quality filters, we called 1,393 high quality single nucleotide variants, 55% of which are rare, and 143 indels. Using single marker tests and collapsed marker tests, we identified four intervals associated with BMI: the FAAH promoter, the MGLL promoter, MGLL intron 2, and MGLL intron 3. Two of these intervals are composed of rare variants and the majority of the associated variants are located in promoter sequences or in predicted transcriptional enhancers, suggesting a regulatory role. The set of rare variants in the FAAH promoter associated with BMI is also associated with increased level of FAAH substrate anandamide, further implicating a functional role in obesity. Conclusions Our study, which is one of the first reports of a sequence-based association study using next-generation sequencing of candidate genes, provides insights into study design and analysis approaches and demonstrates the importance of examining regulatory elements rather than exclusively focusing on exon sequences. PMID:21118518

  14. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  15. Clinical Validation and Implementation of a Targeted Next-Generation Sequencing Assay to Detect Somatic Variants in Non-Small Cell Lung, Melanoma, and Gastrointestinal Malignancies

    PubMed Central

    Fisher, Kevin E.; Zhang, Linsheng; Wang, Jason; Smith, Geoffrey H.; Newman, Scott; Schneider, Thomas M.; Pillai, Rathi N.; Kudchadkar, Ragini R.; Owonikoko, Taofeek K.; Ramalingam, Suresh S.; Lawson, David H.; Delman, Keith A.; El-Rayes, Bassel F.; Wilson, Malania M.; Sullivan, H. Clifford; Morrison, Annie S.; Balci, Serdar; Adsay, N. Volkan; Gal, Anthony A.; Sica, Gabriel L.; Saxe, Debra F.; Mann, Karen P.; Hill, Charles E.; Khuri, Fadlo R.; Rossi, Michael R.

    2017-01-01

    We tested and clinically validated a targeted next-generation sequencing (NGS) mutation panel using 80 formalin-fixed, paraffin-embedded (FFPE) tumor samples. Forty non-small cell lung carcinoma (NSCLC), 30 melanoma, and 30 gastrointestinal (12 colonic, 10 gastric, and 8 pancreatic adenocarcinoma) FFPE samples were selected from laboratory archives. After appropriate specimen and nucleic acid quality control, 80 NGS libraries were prepared using the Illumina TruSight tumor (TST) kit and sequenced on the Illumina MiSeq. Sequence alignment, variant calling, and sequencing quality control were performed using vendor software and laboratory-developed analysis workflows. TST generated ≥500× coverage for 98.4% of the 13,952 targeted bases. Reproducible and accurate variant calling was achieved at ≥5% variant allele frequency with 8 to 12 multiplexed samples per MiSeq flow cell. TST detected 112 variants overall, and confirmed all known single-nucleotide variants (n = 27), deletions (n = 5), insertions (n = 3), and multinucleotide variants (n = 3). TST detected at least one variant in 85.0% (68/80), and two or more variants in 36.2% (29/80), of samples. TP53 was the most frequently mutated gene in NSCLC (13 variants; 13/32 samples), gastrointestinal malignancies (15 variants; 13/25 samples), and overall (30 variants; 28/80 samples). BRAF mutations were most common in melanoma (nine variants; 9/23 samples). Clinically relevant NGS data can be obtained from routine clinical FFPE solid tumor specimens using TST, benchtop instruments, and vendor-supplied bioinformatics pipelines. PMID:26801070
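
    The reporting threshold described above (variant calls at 5% variant allele frequency or higher) can be expressed as a simple filter. This is a minimal sketch: the function name `passes_vaf` and the read counts are illustrative, and a validated clinical pipeline would apply further quality checks that are not shown here.

```python
def passes_vaf(alt_reads, total_reads, min_vaf=0.05):
    """Return True when the variant allele frequency meets the threshold.

    VAF is the fraction of reads at a position that support the variant.
    Positions with no reads cannot support a call.
    """
    if total_reads == 0:
        return False
    return alt_reads / total_reads >= min_vaf


# Made-up read counts at 500x depth
print(passes_vaf(30, 500))   # VAF 6%
print(passes_vaf(10, 500))   # VAF 2%

# Per-sample detection rates quoted in the abstract
print(f"{100 * 68 / 80:.1f}%")   # samples with at least one variant
```

    At 500x depth a 5% threshold corresponds to 25 supporting reads, which suggests why high coverage matters for calling low-frequency somatic variants.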

  16. New cfiA variant and novel insertion sequence elements in carbapenem-resistant Bacteroides fragilis isolates from Korea.

    PubMed

    Roh, Kyoung Ho; Kim, Sinyoung; Kim, Chang-Ki; Yum, Jong Hwa; Kim, Myung Sook; Yong, Dongeun; Jeong, Seok Hoon; Lee, Kyungwon; Kim, June Myung; Chong, Yunsop

    2010-04-01

    Of 276 nonduplicate Bacteroides fragilis clinical isolates recovered from 1997 to 2004, 3 were resistant to carbapenem. cepA and cfiA alleles were detected by polymerase chain reaction in 240 (87.0%) and 11 (4.0%) of the isolates, respectively. Insertion sequence (IS) elements were found only in the 3 carbapenem-resistant B. fragilis isolates, which produced metallo-beta-lactamase at a level detectable by UV spectrophotometry. Sequence analysis showed 1 new cfiA variant, cfiA(11), and 2 novel IS elements. The cfiA(11) gene revealed 5 amino acid substitutions compared to cfiA, with 97.6% amino acid identity. The transposase, terminal inverted repeat sequence, and target site duplication sequence of the 2 novel IS elements were unique. This study reconfirmed the correlation between ISs and carbapenem resistance in B. fragilis.

  17. An integrated approach for analyzing clinical genomic variant data from next-generation sequencing.

    PubMed

    Crowgey, Erin L; Stabley, Deborah L; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M; Polson, Shawn W; Sol-Church, Katia; Wu, Cathy H

    2015-04-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource's iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease.

  18. An Integrated Approach for Analyzing Clinical Genomic Variant Data from Next-Generation Sequencing

    PubMed Central

    Stabley, Deborah L.; Chen, Chuming; Huang, Hongzhan; Robbins, Katherine M.; Polson, Shawn W.; Sol-Church, Katia; Wu, Cathy H.

    2015-01-01

    Next-generation sequencing (NGS) technologies provide the potential for developing high-throughput and low-cost platforms for clinical diagnostics. A limiting factor to clinical applications of genomic NGS is downstream bioinformatics analysis for data interpretation. We have developed an integrated approach for end-to-end clinical NGS data analysis from variant detection to functional profiling. Robust bioinformatics pipelines were implemented for genome alignment, single nucleotide polymorphism (SNP), small insertion/deletion (InDel), and copy number variation (CNV) detection of whole exome sequencing (WES) data from the Illumina platform. Quality-control metrics were analyzed at each step of the pipeline by use of a validated training dataset to ensure data integrity for clinical applications. We annotate the variants with data regarding the disease population and variant impact. Custom algorithms were developed to filter variants based on criteria, such as quality of variant, inheritance pattern, and impact of variant on protein function. The developed clinical variant pipeline links the identified rare variants to Integrated Genome Viewer for visualization in a genomic context and to the Protein Information Resource’s iProXpress for rich protein and disease information. With the application of our system of annotations, prioritizations, inheritance filters, and functional profiling and analysis, we have created a unique methodology for downstream variant filtering that empowers clinicians and researchers to interpret more effectively the relevance of genomic alterations within a rare genetic disease. PMID:25649353

  19. MYO7A and USH2A gene sequence variants in Italian patients with Usher syndrome

    PubMed Central

    Sodi, Andrea; Mariottini, Alessandro; Passerini, Ilaria; Murro, Vittoria; Bianchi, Benedetta; Menchini, Ugo; Torricelli, Francesca

    2014-01-01

    Purpose To analyze the spectrum of sequence variants in the MYO7A and USH2A genes in a group of Italian patients affected by Usher syndrome (USH). Methods Thirty-six Italian patients with a diagnosis of USH were recruited. They received a standard ophthalmologic examination, visual field testing, optical coherence tomography (OCT) scan, and electrophysiological tests. Fluorescein angiography and fundus autofluorescence imaging were performed in selected cases. All the patients underwent an audiologic examination for the 0.25–8,000 Hz frequencies. Vestibular function was evaluated with specific tests. DNA samples were analyzed for sequence variants of the MYO7A gene (for USH1) and the USH2A gene (for USH2) with direct sequencing techniques. A few patients were analyzed for both genes. Results In the MYO7A gene, ten missense variants were found; three patients were compound heterozygous, and two were homozygous. Thirty-four USH2A gene variants were detected, including eight missense variants, nine nonsense variants, six splicing variants, and 11 duplications/deletions; 19 patients were compound heterozygous, and three were homozygous. Four MYO7A and 17 USH2A variants have already been described in the literature. Among the novel mutations there are four USH2A large deletions, detected with multiplex ligation dependent probe amplification (MLPA) technology. Two potentially pathogenic variants were found in 27 patients (75%). Affected patients showed variable clinical pictures without a clear genotype-phenotype correlation. Conclusions Ten variants in the MYO7A gene and 34 variants in the USH2A gene were detected in Italian patients with USH at a high detection rate. A selective analysis of these genes may be valuable for molecular analysis, combining diagnostic efficiency with little time wastage and less resource consumption. PMID:25558175

  20. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  1. Amino Acid Changes in Disease-Associated Variants Differ Radically from Variants Observed in the 1000 Genomes Project Dataset

    PubMed Central

    de Beer, Tjaart A. P.; Laskowski, Roman A.; Parks, Sarah L.; Sipos, Botond; Goldman, Nick; Thornton, Janet M.

    2013-01-01

    The 1000 Genomes Project data provides a natural background dataset for amino acid germline mutations in humans. Since the direction of mutation is known, the amino acid exchange matrix generated from the observed nucleotide variants is asymmetric and the mutabilities of the different amino acids are very different. These differences predominantly reflect preferences for nucleotide mutations in the DNA (especially the high mutation rate of the CpG dinucleotide, which makes arginine mutability very much higher than other amino acids) rather than selection imposed by protein structure constraints, although there is evidence for the latter as well. The variants occur predominantly on the surface of proteins (82%), with a slight preference for sites which are more exposed and less well conserved than random. Mutations to functional residues occur about half as often as expected by chance. The disease-associated amino acid variant distributions in OMIM are radically different from those expected on the basis of the 1000 Genomes dataset. The disease-associated variants preferentially occur in more conserved sites, compared to 1000 Genomes mutations. Many of the amino acid exchange profiles appear to exhibit an anti-correlation, with common exchanges in one dataset being rare in the other. Disease-associated variants exhibit more extreme differences in amino acid size and hydrophobicity. More modelling of the mutational processes at the nucleotide level is needed, but these observations should contribute to an improved prediction of the effects of specific variants in humans. PMID:24348229
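
    The asymmetric exchange matrix mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical list of directed (reference, variant) amino acid pairs; the toy data and the names `exchange_matrix` and `mutability` are illustrative only and are not taken from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def exchange_matrix(substitutions):
    """Count directed substitutions: matrix[ref][alt] is the number of ref -> alt changes."""
    counts = Counter(substitutions)
    return {
        ref: {alt: counts[(ref, alt)] for alt in AMINO_ACIDS}
        for ref in AMINO_ACIDS
    }


def mutability(matrix, ref):
    """Total number of observed changes away from the reference residue `ref`."""
    return sum(matrix[ref].values())


# Hypothetical toy data, not real 1000 Genomes counts.
toy = [("R", "H"), ("R", "C"), ("R", "Q"), ("H", "R"), ("A", "V")]
m = exchange_matrix(toy)
```

    Because the direction of each change is kept, `m["R"]["C"]` and `m["C"]["R"]` are counted separately, which is what makes the matrix asymmetric.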

  2. The complete amino acid sequence of prochymosin.

    PubMed Central

    Foltmann, B; Pedersen, V B; Jacobsen, H; Kauffman, D; Wybrandt, G

    1977-01-01

    The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor. PMID:329280
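
    As a rough illustration of the counts quoted above, the sketch below counts alignment columns that are identical across all aligned sequences and expresses the reported 204 and 66 shared residues as a share of the 365 residues of prochymosin. Using the prochymosin length as the denominator is an assumption made here for illustration, and the short aligned strings are invented, not real zymogen sequences.

```python
def identical_positions(*aligned):
    """Count alignment columns where every sequence has the same residue (gap columns never count)."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1 for column in zip(*aligned)
        if column[0] != "-" and len(set(column)) == 1
    )


def percent_of(count, length):
    """Express a residue count as a percentage of a sequence length, to one decimal place."""
    return round(100.0 * count / length, 1)


# Invented toy alignment: three of five columns are identical.
toy_identical = identical_positions("ACD-F", "ACE-F")

# Counts from the abstract, relative to the 365 residues of prochymosin (assumed denominator).
shared_with_pepsinogen = percent_of(204, 365)
shared_by_all_three = percent_of(66, 365)
```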

  3. Viral population analysis and minority-variant detection using short read next-generation sequencing

    PubMed Central

    Watson, Simon J.; Welkers, Matthijs R. A.; Depledge, Daniel P.; Coulter, Eve; Breuer, Judith M.; de Jong, Menno D.; Kellam, Paul

    2013-01-01

    RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro. PMID:23382427

  4. Viral population analysis and minority-variant detection using short read next-generation sequencing.

    PubMed

    Watson, Simon J; Welkers, Matthijs R A; Depledge, Daniel P; Coulter, Eve; Breuer, Judith M; de Jong, Menno D; Kellam, Paul

    2013-03-19

    RNA viruses within infected individuals exist as a population of evolutionary-related variants. Owing to evolutionary change affecting the constitution of this population, the frequency and/or occurrence of individual viral variants can show marked or subtle fluctuations. Since the development of massively parallel sequencing platforms, such viral populations can now be investigated to unprecedented resolution. A critical problem with such analyses is the presence of sequencing-related errors that obscure the identification of true biological variants present at low frequency. Here, we report the development and assessment of the Quality Assessment of Short Read (QUASR) Pipeline (http://sourceforge.net/projects/quasr) specific for virus genome short read analysis that minimizes sequencing errors from multiple deep-sequencing platforms, and enables post-mapping analysis of the minority variants within the viral population. QUASR significantly reduces the error-related noise in deep-sequencing datasets, resulting in increased mapping accuracy and reduction of erroneous mutations. Using QUASR, we have determined influenza virus genome dynamics in sequential samples from an in vitro evolution of 2009 pandemic H1N1 (A/H1N1/09) influenza from samples sequenced on both the Roche 454 GSFLX and Illumina GAIIx platforms. Importantly, concordance between the 454 and Illumina sequencing allowed unambiguous minority-variant detection and accurate determination of virus population turnover in vitro.

  5. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
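
    The deduction step described in this abstract, in which each incorporated base is complementary to the template base at the active site, can be sketched as follows. This is a minimal illustration under the assumption of error-free base identification and a plain DNA alphabet; the function name is hypothetical and the code does not model the patented detection method itself.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def template_from_incorporations(incorporated):
    """Infer the template strand (5'->3') from the bases added to the growing strand.

    `incorporated` lists the bases in the temporal order they were added, which is
    the 5'->3' order of the growing strand, so the template read 5'->3' is its
    reverse complement.
    """
    return "".join(COMPLEMENT[base] for base in reversed(incorporated))


template = template_from_incorporations("ATGC")
```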

  6. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Re-sequencing expands our understanding of the phenotypic impact of variants at GWAS loci.

    PubMed

    Service, Susan K; Teslovich, Tanya M; Fuchsberger, Christian; Ramensky, Vasily; Yajnik, Pranav; Koboldt, Daniel C; Larson, David E; Zhang, Qunyuan; Lin, Ling; Welch, Ryan; Ding, Li; McLellan, Michael D; O'Laughlin, Michele; Fronick, Catrina; Fulton, Lucinda L; Magrini, Vincent; Swift, Amy; Elliott, Paul; Jarvelin, Marjo-Riitta; Kaakinen, Marika; McCarthy, Mark I; Peltonen, Leena; Pouta, Anneli; Bonnycastle, Lori L; Collins, Francis S; Narisu, Narisu; Stringham, Heather M; Tuomilehto, Jaakko; Ripatti, Samuli; Fulton, Robert S; Sabatti, Chiara; Wilson, Richard K; Boehnke, Michael; Freimer, Nelson B

    2014-01-01

    Genome-wide association studies (GWAS) have identified >500 common variants associated with quantitative metabolic traits, but in aggregate such variants explain at most 20-30% of the heritable component of population variation in these traits. To further investigate the impact of genotypic variation on metabolic traits, we conducted re-sequencing studies in >6,000 members of a Finnish population cohort (The Northern Finland Birth Cohort of 1966 [NFBC]) and a type 2 diabetes case-control sample (The Finland-United States Investigation of NIDDM Genetics [FUSION] study). By sequencing the coding sequence and 5' and 3' untranslated regions of 78 genes at 17 GWAS loci associated with one or more of six metabolic traits (serum levels of fasting HDL-C, LDL-C, total cholesterol, triglycerides, plasma glucose, and insulin), and conducting both single-variant and gene-level association tests, we obtained a more complete understanding of phenotype-genotype associations at eight of these loci. At all eight of these loci, the identification of new associations provides significant evidence for multiple genetic signals to one or more phenotypes, and at two loci, in the genes ABCA1 and CETP, we found significant gene-level evidence of association to non-synonymous variants with MAF<1%. Additionally, two potentially deleterious variants that demonstrated significant associations (rs138726309, a missense variant in G6PC2, and rs28933094, a missense variant in LIPC) were considerably more common in these Finnish samples than in European reference populations, supporting our prior hypothesis that deleterious variants could attain high frequencies in this isolated population, likely due to the effects of population bottlenecks. Our results highlight the value of large, well-phenotyped samples for rare-variant association analysis, and the challenge of evaluating the phenotypic impact of such variants.

  8. Somatic mutations and germline sequence variants in the expressed tyrosine kinase genes of patients with de novo acute myeloid leukemia

    PubMed Central

    Xiang, Zhifu; Walgren, Richard; Zhao, Yu; Kasai, Yumi; Miner, Tracie; Ries, Rhonda E.; Lubman, Olga; Fremont, Daved H.; McLellan, Michael D.; Payton, Jacqueline E.; Westervelt, Peter; DiPersio, John F.; Link, Daniel C.; Walter, Matthew J.; Graubert, Timothy A.; Watson, Mark; Baty, Jack; Heath, Sharon; Shannon, William D.; Nagarajan, Rakesh; Bloomfield, Clara D.; Mardis, Elaine R.; Wilson, Richard K.; Ley, Timothy J.

    2008-01-01

    Activating mutations in tyrosine kinase (TK) genes (eg, FLT3 and KIT) are found in more than 30% of patients with de novo acute myeloid leukemia (AML); many groups have speculated that mutations in other TK genes may be present in the remaining 70%. We performed high-throughput resequencing of the kinase domains of 26 TK genes (11 receptor TK; 15 cytoplasmic TK) expressed in most AML patients using genomic DNA from the bone marrow (tumor) and matched skin biopsy samples (“germline”) from 94 patients with de novo AML; sequence variants were validated in an additional 94 AML tumor samples (14.3 million base pairs of sequence were obtained and analyzed). We identified known somatic mutations in FLT3, KIT, and JAK2 TK genes at the expected frequencies and found 4 novel somatic mutations, JAK1 V623A, JAK1 T478S, DDR1 A803V, and NTRK1 S677N, once each in 4 respective patients of 188 tested. We also identified novel germline sequence changes encoding amino acid substitutions (ie, nonsynonymous changes) in 14 TK genes, including TYK2, which had the largest number of nonsynonymous sequence variants (11 total detected). Additional studies will be required to define the roles that these somatic and germline TK gene variants play in AML pathogenesis. PMID:18270328

  9. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    PubMed Central

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  10. Optimum designs for next-generation sequencing to discover rare variants for common complex disease.

    PubMed

    Shi, Gang; Rao, D C

    2011-09-01

    Recent advances in next-generation sequencing technologies make it affordable to search for rare and functional variants for common complex diseases systematically. We investigated strategies for enriching rare variants in the samples selected for sequencing so as to optimize the power for their discovery. In particular, we investigated the roles of alternative sources of enrichment in families through computer simulations. We showed that linkage information, extreme phenotype, and nonrandom ascertainment, such as multiply affected families, constitute different sources for enriching rare and functional variants in a sequencing study design. Linkage is well known to have limited power for detecting small genetic effects, and hence not considered to be a powerful tool for discovering variants for common complex diseases. However, those families with some degree of family-specific linkage evidence provide an effective sampling strategy to sub-select the most linkage-informative families for sequencing. Compared with selecting subjects with extreme phenotypes, linkage evidence performs better with larger families, while extreme-phenotype method is more efficient with smaller families. Families with multiple affected siblings were found to provide the largest enrichment of rare variants. Finally, we showed that combined strategies, such as selecting linkage-informative families from multiply affected families, provide much higher enrichment of rare functional variants than either strategy alone.

  11. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

    PubMed

    Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J

    2014-02-06

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments.

  12. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have
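
    For orientation, the classical linear (Cochran-Armitage) trend test that this record's statistic extends can be sketched as below. This is the standard test without any misclassification correction, not the LTTae,NGS statistic itself; the additive genotype weights (0, 1, 2), the function name, and the example counts are assumptions chosen for illustration.

```python
import math


def linear_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts for cases and controls.

    `cases` and `controls` give counts per genotype category (for example 0, 1, or 2
    copies of the variant allele). Returns (chi-square with 1 df, p-value).
    """
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases, n_controls = sum(cases), sum(controls)

    # T = sum_i w_i * (r_i * S - s_i * R), which simplifies to N*sum(w*r) - R*sum(w*n).
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))
    t_stat = n * sum_wr - n_cases * sum_wn

    # Variance of T under the null hypothesis of no association.
    variance = (n_cases * n_controls / n) * (n * sum_w2n - sum_wn ** 2)
    if variance <= 0:
        raise ValueError("variance is zero; the table carries no trend information")

    chi2 = t_stat ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Invented genotype counts showing an increasing trend among cases.
chi2, p_value = linear_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
```

    With identical genotype distributions in cases and controls the statistic is zero, which gives a quick sanity check on any implementation.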

  13. Whole exome sequencing identifies genetic variants in inherited thrombocytopenia with secondary qualitative function defects

    PubMed Central

    Johnson, Ben; Lowe, Gillian C.; Futterer, Jane; Lordkipanidzé, Marie; MacDonald, David; Simpson, Michael A.; Sanchez-Guiú, Isabel; Drake, Sian; Bem, Danai; Leo, Vincenzo; Fletcher, Sarah J.; Dawood, Ban; Rivera, José; Allsup, David; Biss, Tina; Bolton-Maggs, Paula HB; Collins, Peter; Curry, Nicola; Grimley, Charlotte; James, Beki; Makris, Mike; Motwani, Jayashree; Pavord, Sue; Talks, Katherine; Thachil, Jecko; Wilde, Jonathan; Williams, Mike; Harrison, Paul; Gissen, Paul; Mundell, Stuart; Mumford, Andrew; Daly, Martina E.; Watson, Steve P.; Morgan, Neil V.

    2016-01-01

    Inherited thrombocytopenias are a heterogeneous group of disorders characterized by abnormally low platelet counts which can be associated with abnormal bleeding. Next-generation sequencing has previously been employed in these disorders for the confirmation of suspected genetic abnormalities, and more recently in the discovery of novel disease-causing genes. However, its full potential has not yet been exploited. Over the past 6 years we have sequenced the exomes from 55 patients, including 37 index cases and 18 additional family members, all of whom were recruited to the UK Genotyping and Phenotyping of Platelets study. All patients had inherited or sustained thrombocytopenia of unknown etiology with platelet counts varying from 11×10^9/L to 186×10^9/L. Of the 51 patients phenotypically tested, 37 (73%) had an additional secondary qualitative platelet defect. Using whole exome sequencing analysis we have identified “pathogenic” or “likely pathogenic” variants in 46% (17/37) of our index patients with thrombocytopenia. In addition, we report variants of uncertain significance in 12 index cases, including novel candidate genetic variants in previously unreported genes in four index cases. These results demonstrate that whole exome sequencing is an efficient method for elucidating potential pathogenic genetic variants in inherited thrombocytopenia. Whole exome sequencing also has the added benefit of discovering potentially pathogenic genetic variants for further study in novel genes not previously implicated in inherited thrombocytopenia. PMID:27479822

  14. Assessment of in silico protein sequence analysis in the clinical classification of variants in cancer risk genes.

    PubMed

    Kerr, Iain D; Cox, Hannah C; Moyes, Kelsey; Evans, Brent; Burdett, Brianna C; van Kan, Aric; McElroy, Heather; Vail, Paris J; Brown, Krystal L; Sumampong, Dechie B; Monteferrante, Nicholas J; Hardman, Kennedy L; Theisen, Aaron; Mundt, Erin; Wenstrup, Richard J; Eggington, Julie M

    2017-01-03

    Missense variants represent a significant proportion of variants identified in clinical genetic testing. In the absence of strong clinical or functional evidence, the American College of Medical Genetics recommends that these findings be classified as variants of uncertain significance (VUS). VUSs may be reclassified to better inform patient care when new evidence is available. It is critical that the methods used for reclassification are robust in order to prevent inappropriate medical management strategies and unnecessary, life-altering surgeries. In an effort to provide evidence for classification, several in silico algorithms have been developed that attempt to predict the functional impact of missense variants through amino acid sequence conservation analysis. We report an analysis comparing internally derived, evidence-based classifications with the results obtained from six commonly used algorithms. We compiled a dataset of 1118 variants in BRCA1, BRCA2, MLH1, and MSH2 previously classified by our laboratory's evidence-based variant classification program. We compared internally derived classifications with those obtained from the following in silico tools: Align-GVGD, CONDEL, Grantham Analysis, MAPP-MMR, PolyPhen-2, and SIFT. Despite being based on similar underlying principles, all algorithms displayed marked divergence in accuracy, specificity, and sensitivity. Overall, accuracy ranged from 58.7 to 90.8% while the Matthews Correlation Coefficient ranged from 0.26-0.65. CONDEL, a weighted average of multiple algorithms, did not perform significantly better than its individual components evaluated here. These results suggest that the in silico algorithms evaluated here do not provide reliable evidence regarding the clinical significance of missense variants in genes associated with hereditary cancer.
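    The accuracy, sensitivity, specificity, and Matthews Correlation Coefficient reported above are all functions of a 2×2 confusion matrix (tool prediction versus reference classification). The following is a minimal sketch of those standard formulas; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the study.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity and MCC for a 2x2 table.

    tp/fn: pathogenic variants predicted deleterious / predicted benign
    tn/fp: benign variants predicted benign / predicted deleterious
    (Hypothetical helper for illustration only.)
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # MCC = (TP*TN - FP*FN) / sqrt of the product of the four marginals.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }

# Made-up counts, purely to show the call.
example = confusion_metrics(tp=40, fp=10, tn=45, fn=5)
print(example)
```

    Unlike accuracy, the MCC stays near zero for a classifier that does no better than chance, even when one class dominates the dataset.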

  15. Localization of association signal from risk and protective variants in sequencing studies.

    PubMed

    Brisbin, Abra; Jenkins, Gregory D; Ellsworth, Katarzyna A; Wang, Liewei; Fridley, Brooke L

    2012-01-01

    Aggregating information across multiple variants in a gene or region can improve power for rare variant association testing. Power is maximized when the aggregation region contains many causal variants and few neutral variants. In this paper, we present a method for the localization of the association signal in a region using a sliding-window based approach to rare variant association testing in a region. We first introduce a novel method for analysis of rare variants, the Difference in Minor Allele Frequency test (DMAF), which allows combined analysis of common and rare variants, and makes no assumptions about the direction of effects. In whole-region analyses of simulated data with risk and protective variants, DMAF and other methods which pool data across individuals were found to outperform methods which pool data across variants. We then implement a sliding-window version of DMAF, using a step-down permutation approach to control type I error with the testing of multiple windows. In simulations, the sliding-window DMAF improved power to detect a causal sub-region, compared to applying DMAF to the whole region. Sliding-window DMAF was also effective in localizing the causal sub-region. We also applied the DMAF sliding-window approach to test for an association between response to the drug gemcitabine and variants in the gene FKBP5 sequenced in 91 lymphoblastoid cell lines derived from white non-Hispanic individuals. The application of the sliding-window test procedure detected an association in a sub-region spanning an exon and two introns, when rare and common variants were analyzed together.
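    The sliding-window idea can be sketched as follows. This is an illustration under stated assumptions, not the published method: the abstract does not define the DMAF statistic, so the window score used here (the sum of absolute case-control differences in minor allele frequency) is a stand-in, and the single-step max-statistic permutation is a simpler substitute for the step-down procedure the authors describe. All function names are hypothetical.

```python
import random

def allele_freqs(genotypes):
    """Per-variant minor allele frequency from 0/1/2 allele counts."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]

def window_stats(genotypes, labels, width):
    """Assumed window score: summed |MAF(cases) - MAF(controls)|."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    f_case = allele_freqs(cases)
    f_ctrl = allele_freqs(controls)
    diffs = [abs(a - b) for a, b in zip(f_case, f_ctrl)]
    return [sum(diffs[s:s + width]) for s in range(len(diffs) - width + 1)]

def sliding_window_pvalues(genotypes, labels, width, n_perm=200, seed=0):
    """Family-wise adjusted p-value per window via max-statistic permutation."""
    rng = random.Random(seed)
    observed = window_stats(genotypes, labels, width)
    exceed = [0] * len(observed)
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute case/control status
        max_null = max(window_stats(genotypes, shuffled, width))
        for i, obs in enumerate(observed):
            if max_null >= obs:
                exceed[i] += 1
    return [(e + 1) / (n_perm + 1) for e in exceed]

# Toy data: variants 2 and 3 are carried by every case and by no control.
cases = [[0, 0, 1, 1, 0, 0] for _ in range(20)]
controls = [[0, 0, 0, 0, 0, 0] for _ in range(20)]
pvals = sliding_window_pvalues(cases + controls, [1] * 20 + [0] * 20, width=2)
print(pvals)
```

    Comparing each window against the permutation distribution of the maximum over all windows is what controls the type I error across the multiple windows tested.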

  16. Complete genome sequence of a variant of maize-associated totivirus from Ecuador.

    PubMed

    Alvarez-Quinto, Robert A; Espinoza-Lozano, Rodrigo F; Mora-Pinargote, Carlos A; Quito-Avila, Diego F

    2017-04-01

    The complete genomic sequence of a variant of the recently reported maize-associated totivirus (MATV) from China was obtained from commercial maize in Ecuador. The genome of MATV-Ec (Ecuador) (4,998 bp) is considerably longer than that of MATV-Ch (China) (3,956 bp), the main difference due to a ≈ 1-kb-long capsid-protein-encoding fragment that is completely absent from the Chinese genome. Sequence alignments between MATV-Ec and MATV-Ch showed an overall identity of 82% at the nucleotide level, whereas at the amino acid level, the viruses exhibited 95% and 94% identity for the putative capsid protein and the RNA-dependent RNA polymerase (RdRp), respectively. Phylogenetic analysis of the viral RdRp domain indicated that MATV-Ec and MATV-Ch share a common ancestor with other plant-associated totiviruses, with Panax notoginseng virus A as the closest relative. MATV-Ec was detected in 46% (n = 80) of maize plants tested in this study, but not in endophytic fungi isolated from plants positive for the virus.

  17. Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants

    PubMed Central

    Kim, Kyung; Seong, Moon-Woo; Chung, Won-Hyong; Park, Sung Sup; Leem, Sangseob; Park, Won; Kim, Jihyun; Lee, KiYoung; Park, Rae Woong; Kim, Namshin

    2015-01-01

    Sequencing depth, which is directly related to the cost and time required for the generation, processing, and maintenance of next-generation sequencing data, is an important factor in the practical utilization of such data in clinical fields. Unfortunately, identifying an exome sequencing depth adequate for clinical use is a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. Toward this, we sequenced ten germ-line blood samples from breast cancer patients on the Illumina platform GAII(x) at a high depth of ~200×. We observed that most function-related diverse variants in the human exonic regions could be detected at a sequencing depth of 120×. Furthermore, investigation using a diagnostic gene set showed that the number of clinical variants identified using exome sequencing reached a plateau at an average sequencing depth of about 120×. Moreover, the phenomena were consistent across the breast cancer samples. PMID:26175660

  18. A rare sequence variant in intron 1 of THAP1 is associated with primary dystonia

    PubMed Central

    Vemula, Satya R; Xiao, Jianfeng; Zhao, Yu; Bastian, Robert W; Perlmutter, Joel S; Racette, Brad A; Paniello, Randal C; Wszolek, Zbigniew K; Uitti, Ryan J; Van Gerpen, Jay A; Hedera, Peter; Truong, Daniel D; Blitzer, Andrew; Rudzińska, Monika; Momčilović, Dragana; Jinnah, Hyder A; Frei, Karen; Pfeiffer, Ronald F; LeDoux, Mark S

    2014-01-01

    Although coding variants in THAP1 have been causally associated with primary dystonia, the contribution of noncoding variants remains uncertain. Herein, we examine a previously identified Intron 1 variant (c.71+9C>A, rs200209986). Among 1672 subjects with mainly adult-onset primary dystonia, 12 harbored the variant in contrast to 1/1574 controls (P < 0.01). Dystonia classification included cervical dystonia (N = 3), laryngeal dystonia (adductor subtype, N = 3), jaw-opening oromandibular dystonia (N = 1), blepharospasm (N = 2), and unclassified (N = 3). Age of dystonia onset ranged from 25 to 69 years (mean = 54 years). In comparison to controls with no identified THAP1 sequence variants, the c.71+9C>A variant was associated with an elevated ratio of Isoform 1 (NM_018105) to Isoform 2 (NM_199003) in leukocytes. In silico and minigene analyses indicated that c.71+9C>A alters THAP1 splicing. Lymphoblastoid cells harboring the c.71+9C>A variant showed extensive apoptosis with relatively fewer cells in the G2 phase of the cell cycle. Differentially expressed genes from lymphoblastoid cells revealed that the c.71+9C>A variant exerts effects on DNA synthesis, cell growth and proliferation, cell survival, and cytotoxicity. In aggregate, these data indicate that THAP1 c.71+9C>A is a risk factor for adult-onset primary dystonia. PMID:24936516
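    The carrier counts above (12 of 1672 subjects with dystonia versus 1 of 1574 controls) form a 2×2 table. The abstract does not say which test produced P < 0.01; the sketch below assumes a two-sided Fisher's exact test, a common choice for sparse tables, and implements it with the standard library only. The function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    a, b: carriers / non-carriers among cases
    c, d: carriers / non-carriers among controls
    Sums the hypergeometric probabilities of all tables that are no more
    likely than the observed one, with row and column totals held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def pmf(x):
        # Probability that x of the col1 carriers fall in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = pmf(a)
    lo, hi = max(0, col1 - row2), min(col1, row1)
    # Small relative tolerance guards against floating-point ties.
    return sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-9))

# Counts from the abstract: 12/1672 cases and 1/1574 controls carry the variant.
p_value = fisher_exact_two_sided(12, 1672 - 12, 1, 1574 - 1)
print(f"two-sided p = {p_value:.4g}")
```

    Under this assumed test the counts give a p-value below 0.01, in line with the reported threshold.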

  19. Deep Sequencing Reveals Novel Genetic Variants in Children with Acute Liver Failure and Tissue Evidence of Impaired Energy Metabolism

    PubMed Central

    Valencia, C. Alexander; Wang, Xinjian; Wang, Jin; Peters, Anna; Simmons, Julia R.; Moran, Molly C.; Mathur, Abhinav; Husami, Ammar; Qian, Yaping; Sheridan, Rachel; Bove, Kevin E.; Witte, David; Huang, Taosheng; Miethke, Alexander G.

    2016-01-01

    Background & Aims: The etiology of acute liver failure (ALF) remains elusive in almost half of affected children. We hypothesized that inherited mitochondrial and fatty acid oxidation disorders were occult etiological factors in patients with idiopathic ALF and impaired energy metabolism. Methods: Twelve patients with elevated blood molar lactate/pyruvate ratio and indeterminate etiology were selected from a retrospective cohort of 74 subjects with ALF because their fixed and frozen liver samples were available for histological, ultrastructural, molecular and biochemical analysis. Results: A customized next-generation sequencing panel for 26 genes associated with mitochondrial and fatty acid oxidation defects revealed mutations and sequence variants in five subjects. Variants involved the genes ACAD9, POLG, POLG2, DGUOK, and RRM2B; the latter not previously reported in subjects with ALF. The explanted livers of the patients with heterozygous, truncating insertion mutations in RRM2B showed patchy micro- and macrovesicular steatosis, decreased mitochondrial DNA (mtDNA) content (<30% of controls), and reduced respiratory chain complex activity; both patients had good post-transplant outcome. One infant with severe lactic acidosis was found to carry two heterozygous variants in ACAD9, which was associated with isolated complex I deficiency and diffuse hypergranular hepatocytes. The two subjects with heterozygous variants of unknown clinical significance in POLG and DGUOK developed ALF following drug exposure. Their hepatocytes displayed abnormal mitochondria by electron microscopy. Conclusion: Targeted next generation sequencing and correlation with histological, ultrastructural and functional studies on liver tissue in children with elevated lactate/pyruvate ratio expand the spectrum of genes associated with pediatric ALF. PMID:27483465

  20. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    PubMed Central

    Warshauer, David H.; Churchill, Jennifer D.; Novroski, Nicole; King, Jonathan L.; Budowle, Bruce

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles. PMID:26391384

  1. Exome Sequencing Identifies a Rare HSPG2 Variant Associated with Familial Idiopathic Scoliosis

    PubMed Central

    Baschal, Erin E.; Wethey, Cambria I.; Swindle, Kandice; Baschal, Robin M.; Gowan, Katherine; Tang, Nelson L.S.; Alvarado, David M.; Haller, Gabe E.; Dobbs, Matthew B.; Taylor, Matthew R.G.; Gurnett, Christina A.; Jones, Kenneth L.; Miller, Nancy H.

    2014-01-01

    Idiopathic scoliosis occurs in 3% of individuals and has an unknown etiology. The objective of this study was to identify rare variants that contribute to the etiology of idiopathic scoliosis by using exome sequencing in a multigenerational family with idiopathic scoliosis. Exome sequencing was completed for three members of this multigenerational family with idiopathic scoliosis, resulting in the identification of a variant in the HSPG2 gene as a potential contributor to the phenotype. The HSPG2 gene was sequenced in a separate cohort of 100 unrelated individuals affected with idiopathic scoliosis and also was examined in an independent idiopathic scoliosis population. The exome sequencing and subsequent bioinformatics filtering resulted in 16 potentially damaging and rare coding variants. One of these variants, p.Asn786Ser, is located in the HSPG2 gene. The variant p.Asn786Ser also is overrepresented in a larger cohort of idiopathic scoliosis cases compared with a control population (P = 0.024). Furthermore, we identified additional rare HSPG2 variants that are predicted to be damaging in two independent cohorts of individuals with idiopathic scoliosis. The HSPG2 gene encodes a ubiquitous multifunctional protein within the extracellular matrix in which loss-of-function mutations are known to result in a musculoskeletal phenotype in both mice and humans. Based on these results, we conclude that rare variants in the HSPG2 gene potentially contribute to the idiopathic scoliosis phenotype in a subset of patients with idiopathic scoliosis. Further studies must be completed to confirm the effect of the HSPG2 gene on the idiopathic scoliosis phenotype. PMID:25504735

  2. VARIANT: Command Line, Web service and Web interface for fast and accurate functional characterization of variants found by Next-Generation Sequencing

    PubMed Central

    Medina, Ignacio; De Maria, Alejandro; Bleda, Marta; Salavert, Francisco; Alonso, Roberto; Gonzalez, Cristina Y.; Dopazo, Joaquin

    2012-01-01

    The massive use of Next-Generation Sequencing (NGS) technologies is uncovering an unexpected amount of variability. The functional characterization of such variability, particularly in the most common form of variation found, the Single Nucleotide Variants (SNVs), has become a priority that needs to be addressed in a systematic way. VARIANT (VARIant ANalysis Tool) reports information on the variants found that includes consequence type and annotations taken from different databases and repositories (SNPs and variants from dbSNP and 1000 genomes, and disease-related variants from the Genome-Wide Association Study (GWAS) catalog, Online Mendelian Inheritance in Man (OMIM), Catalog of Somatic Mutations in Cancer (COSMIC) mutations, etc). VARIANT also produces a rich variety of annotations that include information on the regulatory (transcription factor or miRNA-binding sites, etc.) or structural roles, or on the selective pressures on the sites affected by the variation. This information allows extending the conventional reports beyond the coding regions and expands the knowledge on the contribution of non-coding or synonymous variants to the phenotype studied. Contrary to other tools, VARIANT uses a remote database and operates through efficient RESTful Web Services that optimize search and transaction operations. In this way, local problems of installation, update or disk size limitations are overcome without the need to sacrifice speed (thousands of variants are processed per minute). VARIANT is available at: http://variant.bioinfo.cipf.es. PMID:22693211

  3. Novel pathogenic variants and genes for myopathies identified by whole exome sequencing

    PubMed Central

    Hunter, Jesse M; Ahearn, Mary Ellen; Balak, Christopher D; Liang, Winnie S; Kurdoglu, Ahmet; Corneveaux, Jason J; Russell, Megan; Huentelman, Matthew J; Craig, David W; Carpten, John; Coons, Stephen W; DeMello, Daphne E; Hall, Judith G; Bernes, Saunder M; Baumbach-Reardon, Lisa

    2015-01-01

    Neuromuscular diseases (NMD) account for a significant proportion of infant and childhood mortality and devastating chronic disease. Determining the specific diagnosis of NMD is challenging due to thousands of unique or rare genetic variants that result in overlapping phenotypes. We present four unique childhood myopathy cases characterized by relatively mild muscle weakness, slowly progressing course, mildly elevated creatine phosphokinase (CPK), and contractures. We also present two additional cases characterized by severe prenatal/neonatal myopathy. Prior extensive genetic testing and histology of these cases did not reveal the genetic etiology of disease. Here, we applied whole exome sequencing (WES) and bioinformatics to identify likely causal pathogenic variants in each pedigree. In two cases, we identified novel pathogenic variants in COL6A3. In a third case, we identified novel likely pathogenic variants in COL6A6 and COL6A3. We identified a novel splice variant in EMD in a fourth case. Finally, we classify two cases as calcium channelopathies with identification of novel pathogenic variants in RYR1 and CACNA1S. These are the first cases of myopathies reported to be caused by variants in COL6A6 and CACNA1S. Our results demonstrate the utility and genetic diagnostic value of WES in the broad class of NMD phenotypes. PMID:26247046

  4. Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

    DOE PAGES

    Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya; ...

    2015-10-20

    The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.

  5. Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

    SciTech Connect

    Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya; Zhang, Bing; Tuskan, Gerald A.; Hettich, Robert L.; Nookaew, Intawat

    2015-10-20

    The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.

  6. Exome Sequencing Identifies Potential Risk Variants for Mendelian Disorders at High Prevalence in Qatar

    PubMed Central

    Rodriguez-Flores, Juan L.; Fakhro, Khalid; Hackett, Neil R.; Salit, Jacqueline; Fuller, Jennifer; Agosto-Perez, Francisco; Gharbiah, Maey; Malek, Joel A.; Zirie, Mahmoud; Jayyousi, Amin; Badii, Ramin; Al-Marri, Ajayeb Al-Nabet; Chouchane, Lotfi; Stadler, Dora J.; Hunter-Zinck, Haley; Mezey, Jason G.; Crystal, Ronald G.

    2013-01-01

    Exome sequencing of families of related individuals has been highly successful in identifying genetic polymorphisms responsible for Mendelian disorders. Here, we demonstrate the value of the reverse approach, where we use exome sequencing of a sample of unrelated individuals to analyze allele frequencies of known causal mutations for Mendelian diseases. We sequenced the exomes of 100 individuals representing the three major genetic subgroups of the Qatari population (Q1 Bedouin, Q2 Persian-South Asian, Q3 African) and identified 37 variants in 33 genes with effects on 36 clinically significant Mendelian diseases. These include variants not present in 1000 Genomes and variants at high frequency when compared to 1000 Genomes populations. Several of these Mendelian variants were only segregating in one Qatari subpopulation, where the observed subpopulation specificity trends were confirmed in an independent population of 386 Qataris. Pre-marital genetic screening in Qatar tests for only 4 out of the 37, such that this study provides a set of Mendelian disease variants with potential impact on the epidemiological profile of the population that could be incorporated into the testing program if further experimental and clinical characterization confirms high penetrance. PMID:24123366

  7. HLA class II sequence variants influence tuberculosis risk in populations of European ancestry

    PubMed Central

    Sveinbjornsson, Gardar; Gudbjartsson, Daniel F.; Halldorsson, Bjarni V.; Kristinsson, Karl G.; Gottfredsson, Magnus; Barrett, Jeffrey C.; Gudmundsson, Larus J.; Blondal, Kai; Gylfason, Arnaldur; Gudjonsson, Sigurjon Axel; Helgadottir, Hafdis T.; Jonasdottir, Adalbjorg; Jonasdottir, Aslaug; Karason, Ari; Kardum, Ljiljana Bulat; Knežević, Jelena; Kristjansson, Helgi; Kristjansson, Mar; Love, Arthur; Luo, Yang; Magnusson, Olafur T.; Sulem, Patrick; Kong, Augustine; Masson, Gisli; Thorsteinsdottir, Unnur; Dembic, Zlatko; Nejentsev, Sergey; Blondal, Thorsteinn; Jonsdottir, Ingileif; Stefansson, Kari

    2016-01-01

    Mycobacterium tuberculosis (M. tuberculosis) infections cause 9.0 million new tuberculosis (TB) cases and 1.5 million deaths annually [1]. To search for sequence variants that confer risk of TB we tested 28.3 million variants identified through whole-genome sequencing of 2,636 Icelanders for association with TB (8,162 cases and 277,643 controls), pulmonary TB (PTB), and M. tuberculosis infection. We found association of three sequence variants in the HLA class II region: rs557011[T] (MAF=40.2%) with M. tuberculosis infection (OR=1.14, P=3.1×10^-13) and PTB (OR=1.25, P=5.8×10^-12) and rs9271378[G] (MAF=32.5%) with PTB (OR=0.78, P=2.5×10^-12), both located between HLA-DQA1 and HLA-DRB1. Finally, a missense variant p.Ala210Thr in HLA-DQA1 (MAF=19.1%, rs9272785) shows association with M. tuberculosis infection (P=9.3×10^-9, OR=1.14). The association of these variants with PTB was replicated in large samples of European ancestry from Russia and Croatia (P<5.9×10^-4). These findings demonstrate that the HLA class II region contributes to the complex genetic risk of tuberculosis, possibly through reduced presentation of protective M. tuberculosis antigens to T cells. PMID:26829749

  8. Expression of multiple membrane-associated phospholipase A1 beta transcript variants and lysophosphatidic acid receptors in Ewing tumor cells.

    PubMed

    Schmiedel, Benjamin Joachim; Hutter, Christoph; Hesse, Manuela; Staege, Martin Sebastian

    2011-10-01

    The prognosis for patients with advanced stages of Ewing family tumors (EFT) is very poor. EFT express high levels of phosphatidic acid specific membrane-associated phospholipase A1 beta (lipase I, LIPI). LIPI is a cancer/testis antigen and the high tumor specificity suggests that LIPI might be an attractive target for new diagnostic and/or therapeutic developments. By using reverse transcriptase-polymerase chain reaction (RT-PCR), we observed simultaneous presence of multiple LIPI transcript variants in EFT. We cloned and sequenced these transcript variants from EFT cell lines. Sequence analysis indicated that all transcript variants were derived by alternative splicing. Homology modeling of corresponding protein structures suggested that different transcript variants differ in their regulatory lid domains. In addition, expression of receptors for lysophosphatidic acid (LPA) was analyzed in a panel of EFT cell lines by RT-PCR. We observed that EFT cell lines expressed high levels of LPA receptors. Different LIPI transcript variants present in EFT might be involved in the pathogenesis of EFT by signaling via these LPA receptors.

  9. A splice variant in the ACSL5 gene relates migraine with fatty acid activation in mitochondria

    PubMed Central

    Matesanz, Fuencisla; Fedetz, María; Barrionuevo, Cristina; Karaky, Mohamad; Catalá-Rabasa, Antonio; Potenciano, Victor; Bello-Morales, Raquel; López-Guerrero, Jose-Antonio; Alcina, Antonio

    2016-01-01

    Genome-wide association studies (GWAS) in migraine are providing the molecular basis of this heterogeneous disease, but the understanding of its aetiology is still incomplete. Although some biomarkers have been accepted for migraine, many more studies are needed to identify new ones. The migraine-associated variant rs12355831:A>G (P=2×10^-6), described in a GWAS of the International Headache Genetic Consortium, is localized in a non-coding sequence with unknown function. We sought to identify the causal variant and the genetic mechanism involved in the migraine risk. To this end, we integrated data of RNA sequences from the Genetic European Variation in Health and Disease (GEUVADIS) and genotypes from 1000 GENOMES of 344 lymphoblastoid cell lines (LCLs), to determine the expression quantitative trait loci (eQTLs) in the region. We found that the migraine-associated variant belongs to a linkage disequilibrium block associated with the expression of an acyl-coenzyme A synthetase 5 (ACSL5) transcript lacking exon 20 (ACSL5-Δ20). We showed by exon-skipping assay a direct causality of rs2256368-G in the exon 20 skipping of approximately 20 to 40% of ACSL5 RNA molecules. In conclusion, we identified the functional variant (rs2256368:A>G) affecting ACSL5 exon 20 skipping, as a causal factor linked to the migraine-associated rs12355831:A>G, suggesting that the activation of long-chain fatty acids by the spliced ACSL5-Δ20 molecules, a mitochondrially located enzyme, is involved in migraine pathology. PMID:27189022

  10. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2011-08-16

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  11. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2011-05-31

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  12. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits [Vlaardingen, NL]; Gualfetti, Peter [San Francisco, CA]; Mitchinson, Colin [Half Moon Bay, CA]; Larenas, Edmund [Moss Beach, CA]

    2012-08-07

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  13. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2008-12-02

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  14. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2014-09-09

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  15. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2013-02-19

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  16. Variant Humicola grisea CBH1.1

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Larenas, Edmund

    2014-03-18

    Disclosed are variants of Humicola grisea Cel7A (CBH1.1), H. jecorina CBH1 variant or S. thermophilium CBH1, nucleic acids encoding the same and methods for producing the same. The variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted.

  17. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 

  18. Complete Genome Sequence of a Porcine Kobuvirus Variant Strain from Jiangxi, China.

    PubMed

    Peng, Qi; Song, Deping; Huang, Dongyan; Chen, Yanjun; Zhou, Xinrong; Zhang, Fanfan; Li, Anqi; Wu, Qiong; He, Houjun; Tang, Yuxin

    2017-02-02

    The complete genome sequence of a porcine kobuvirus (PKoV) variant strain, CH/KB-1/2014 from Jiangxi, China, with a 90-nucleotide deletion in the 2B gene, was determined and characterized. This study provides a better understanding of the molecular characteristics and evolution of PKoV in Jiangxi, China.

  19. A two-dimensional pooling strategy for rare variant detection on next-generation sequencing platforms.

    PubMed

    Zuzarte, Philip C; Denroche, Robert E; Fehringer, Gordon; Katzov-Eckert, Hagit; Hung, Rayjean J; McPherson, John D

    2014-01-01

    We describe a method for pooling and sequencing DNA from a large number of individual samples while preserving information regarding sample identity. DNA from 576 individuals was arranged into four 12 row by 12 column matrices and then pooled by row and by column resulting in 96 total pools with 12 individuals in each pool. Pooling of DNA was carried out in a two-dimensional fashion, such that DNA from each individual is present in exactly one row pool and exactly one column pool. By considering the variants observed in the rows and columns of a matrix we are able to trace rare variants back to the specific individuals that carry them. The pooled DNA samples were enriched over a 250 kb region previously identified by GWAS to significantly predispose individuals to lung cancer. All 96 pools (12 row and 12 column pools from 4 matrices) were barcoded and sequenced on an Illumina HiSeq 2000 instrument with an average depth of coverage greater than 4,000×. Verification based on Ion PGM sequencing confirmed the presence of 91.4% of confidently classified SNVs assayed. In this way, each individual sample is sequenced in multiple pools providing more accurate variant calling than a single pool or a multiplexed approach. This provides a powerful method for rare variant detection in regions of interest at a reduced cost to the researcher.
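
    The row-and-column decoding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names and the example carrier positions are hypothetical, and only one 12 by 12 matrix is modelled.

```python
from itertools import product

ROWS = COLS = 12  # one 12 x 12 matrix, as in the abstract


def build_pools(carriers):
    """Return the row pools and column pools in which a variant is observed.

    `carriers` is a set of (row, col) positions of individuals carrying the
    variant (hypothetical input for illustration).
    """
    row_hits = {r for r, _ in carriers}
    col_hits = {c for _, c in carriers}
    return row_hits, col_hits


def candidate_carriers(row_hits, col_hits):
    """Intersect positive row and column pools to list candidate individuals."""
    return list(product(sorted(row_hits), sorted(col_hits)))


# Pool arithmetic from the abstract: 4 matrices x (12 row + 12 column pools).
n_pools = 4 * (ROWS + COLS)
n_individuals = 4 * ROWS * COLS

# A single carrier is traced back unambiguously.
rows, cols = build_pools({(3, 7)})
single = candidate_carriers(rows, cols)

# Two carriers in different rows and columns give four candidates, so extra
# evidence would be needed to resolve them (an assumption of this sketch).
rows2, cols2 = build_pools({(3, 7), (5, 1)})
ambiguous = candidate_carriers(rows2, cols2)
```

    The sketch also shows why the design suits rare variants in particular: a variant carried by one individual in a matrix maps to exactly one row pool and one column pool, whereas several carriers in the same matrix produce ambiguous intersections.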

  20. Complete Genome Sequence of a Porcine Kobuvirus Variant Strain from Jiangxi, China

    PubMed Central

    Peng, Qi; Song, Deping; Huang, Dongyan; Chen, Yanjun; Zhou, Xinrong; Zhang, Fanfan; Li, Anqi; Wu, Qiong; He, Houjun

    2017-01-01

    ABSTRACT The complete genome sequence of a porcine kobuvirus (PKoV) variant strain, CH/KB-1/2014 from Jiangxi, China, with a 90-nucleotide deletion in the 2B gene, was determined and characterized. This study provides a better understanding of the molecular characteristics and evolution of PKoV in Jiangxi, China. PMID:28153909

  1. Molecular Cloning and Expression of Sequence Variants of Manganese Superoxide Dismutase Genes from Wheat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Reactive oxygen species (ROS) are very harmful to living organisms due to the potential oxidation of membrane lipids, DNA, proteins, and carbohydrates. Transformed E.coli strain QC 871, superoxide dismutase (SOD) double-mutant, with three sequence variant MnSOD1, MnSOD2, and MnSOD3 manganese supero...

  2. An integrated framework for discovery and genotyping of genomic variants from high-throughput sequencing experiments.

    PubMed

    Duitama, Jorge; Quintero, Juan Camilo; Cruz, Daniel Felipe; Quintero, Constanza; Hubmann, Georg; Foulquié-Moreno, Maria R; Verstrepen, Kevin J; Thevelein, Johan M; Tohme, Joe

    2014-04-01

    Recent advances in high-throughput sequencing (HTS) technologies and computing capacity have produced unprecedented amounts of genomic data that have unraveled the genetics of phenotypic variability in several species. However, operating and integrating current software tools for data analysis still require important investments in highly skilled personnel. Developing accurate, efficient and user-friendly software packages for HTS data analysis will lead to a more rapid discovery of genomic elements relevant to medical, agricultural and industrial applications. We therefore developed Next-Generation Sequencing Eclipse Plug-in (NGSEP), a new software tool for integrated, efficient and user-friendly detection of single nucleotide variants (SNVs), indels and copy number variants (CNVs). NGSEP includes modules for read alignment, sorting, merging, functional annotation of variants, filtering and quality statistics. Analysis of sequencing experiments in yeast, rice and human samples shows that NGSEP has superior accuracy and efficiency, compared with currently available packages for variants detection. We also show that only a comprehensive and accurate identification of repeat regions and CNVs allows researchers to properly separate SNVs from differences between copies of repeat elements. We expect that NGSEP will become a strong support tool to empower the analysis of sequencing data in a wide range of research projects on different species.

  3. BEST1 sequence variants in Italian patients with vitelliform macular dystrophy

    PubMed Central

    Sodi, Andrea; Passerini, Ilaria; Caputo, Roberto; Bacci, Giacomo Maria; Bodoj, Mirela; Torricelli, Francesca; Menchini, Ugo

    2012-01-01

    Purpose To analyze the spectrum of sequence variants in the BEST1 gene in a group of Italian patients affected by Best vitelliform macular dystrophy (VMD). Methods Thirty Italian patients with a diagnosis of VMD and 20 clinically healthy relatives were recruited. They belonged to 19 Italian families predominantly originating from central Italy. They received a standard ophthalmologic examination, OCT scan, and electrophysiological tests (ERG and EOG). Fluorescein and ICG angiographies and fundus autofluorescence imaging were performed in selected cases. DNA samples were analyzed for sequence variants of the BEST1 gene by direct sequencing techniques. Results Nine missense variants and one deletion were found in the affected patients; each patient carried one mutation. Five variants [c.73C>T (p.Arg25Trp), c.652C>T (p.Arg218Cys), c.652C>G (p.Arg218Gly), c.728C>T (p.Ala243Val), c.893T>C (p.Phe298Ser)] have already been described in literature while another five variants [c.217A>C (p.Ile73Leu), c.239T>G (p.Phe80Cys), c.883_885del (p.Ile295del), c.907G>A (p.Asp303Asn), c.911A>G (p.Asp304Gly)] had not previously been reported. Affected patients, sometimes even from the same family, occasionally showed variable phenotypes. One heterozygous variant was also found in five clinically healthy relatives with normal fundus, visual acuity and ERG but with abnormal EOG. Conclusions Ten variants in the BEST1 gene were detected in a group of individuals with clinically apparent VMD, and in some clinically normal individuals with an abnormal EOG. The high prevalence of novel variants and the frequent report of a specific variant (p.Arg25Trp) that has rarely been described in other ethnic groups suggests a distribution of BEST1 variants peculiar to Italian VMD patients. PMID:23213274

  4. Exploration of the arrest peptide sequence space reveals arrest-enhanced variants.

    PubMed

    Cymer, Florian; Hedman, Rickard; Ismail, Nurzian; von Heijne, Gunnar

    2015-04-17

    Translational arrest peptides (APs) are short stretches of polypeptides that induce translational stalling when synthesized on a ribosome. Mechanical pulling forces acting on the nascent chain can weaken or even abolish stalling. APs can therefore be used as in vivo force sensors, making it possible to measure the forces that act on a nascent chain during translation with single-residue resolution. It is also possible to score the relative strengths of APs by subjecting them to a given pulling force and ranking them according to stalling efficiency. Using the latter approach, we now report an extensive mutagenesis scan of a strong mutant variant of the Mannheimia succiniciproducens SecM AP and identify mutations that further increase the stalling efficiency. Combining three such mutations, we designed an AP that withstands the strongest pulling force we are able to generate at present. We further show that diproline stretches in a nascent protein act as very strong APs when translation is carried out in the absence of elongation factor P. Our findings highlight critical residues in APs, show that certain amino acid sequences induce very strong translational arrest and provide a toolbox of APs of varying strengths that can be used for in vivo force measurements.

  5. Deep sequencing unearths nuclear mitochondrial sequences under Leber's hereditary optic neuropathy-associated false heteroplasmic mitochondrial DNA variants.

    PubMed

    Petruzzella, Vittoria; Carrozzo, Rosalba; Calabrese, Claudia; Dell'Aglio, Rosa; Trentadue, Raffaella; Piredda, Roberta; Artuso, Lucia; Rizza, Teresa; Bianchi, Marzia; Porcelli, Anna Maria; Guerriero, Silvana; Gasparre, Giuseppe; Attimonelli, Marcella

    2012-09-01

    Leber's hereditary optic neuropathy (LHON) is associated with mitochondrial DNA (mtDNA) ND mutations that are mostly homoplasmic. However, these mutations are not sufficient to explain the peculiar features of penetrance and the tissue-specific expression of the disease and are believed to be causative in association with unknown environmental or other genetic factors. Discerning between clear-cut pathogenetic variants, such as those that appear to be heteroplasmic, and less penetrant variants, such as the homoplasmic, remains a challenging issue that we have addressed here using a next-generation sequencing approach. We set up a protocol to quantify MTND5 heteroplasmy levels in a family in which the proband manifests a LHON phenotype. Furthermore, to study this mtDNA haplotype, we applied the cybridization protocol. The results demonstrate that the mutations are mostly homoplasmic, whereas the suspected heteroplasmic feature of the observed mutations is due to the co-amplification of nuclear mitochondrial sequences.

  6. Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease.

    PubMed

    Carss, Keren J; Arno, Gavin; Erwood, Marie; Stephens, Jonathan; Sanchis-Juan, Alba; Hull, Sarah; Megy, Karyn; Grozeva, Detelina; Dewhurst, Eleanor; Malka, Samantha; Plagnol, Vincent; Penkett, Christopher; Stirrups, Kathleen; Rizzo, Roberta; Wright, Genevieve; Josifova, Dragana; Bitner-Glindzicz, Maria; Scott, Richard H; Clement, Emma; Allen, Louise; Armstrong, Ruth; Brady, Angela F; Carmichael, Jenny; Chitre, Manali; Henderson, Robert H H; Hurst, Jane; MacLaren, Robert E; Murphy, Elaine; Paterson, Joan; Rosser, Elisabeth; Thompson, Dorothy A; Wakeling, Emma; Ouwehand, Willem H; Michaelides, Michel; Moore, Anthony T; Webster, Andrew R; Raymond, F Lucy

    2017-01-05

    Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in CHM in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease.

  7. Targeted exome sequencing reveals distinct pathogenic variants in Iranians with colorectal cancer

    PubMed Central

    Ashktorab, Hassan; Mokarram, Pooneh; Azimi, Hamed; Olumi, Hasti; Varma, Sudhir; Nickerson, Michael L.; Brim, Hassan

    2017-01-01

    PURPOSE Next Generation Sequencing (NGS) is currently used to establish mutational profiles in many multigene diseases such as colorectal cancer (CRC), which is on the rise in many parts of the developing world, including Iran. Little is known about its genetic hallmarks in these populations. AIM To identify variants in 15 CRC-associated genes in patients of Iranian descent. RESULTS There were 51 validated variants distributed on 12 genes: 22% MSH3 (n = 11/51), 10% MSH6 (n = 5/51), 8% AMER1 (n = 4/51), 20% APC (n = 10/51), 2% BRAF (n = 1/51), 2% KRAS (n = 1/51), 12% PIK3CA (n = 6/51), 8% TGFβR2A (n = 4/51), 2% SMAD4 (n = 1/51), 4% SOX9 (n = 2/51), 6% TCF7L2 (n = 3/51), and 6% TP53 (n = 3/51). Most known and distinct variants were in mismatch repair genes (MMR, 32%) and APC (20%). Among oncogenes, PIK3CA was the top target (12%). MATERIALS AND METHODS CRC specimens from 63 Shirazi patients were used to establish the variant profile on an Ion Torrent platform by targeted exome sequencing. To rule out technical artifacts, the variants were validated in 13 of these samples using an Illumina NGS platform. Validated variants were annotated and compared to variants from publicly available databases. An in-silico functional analysis was performed. MSI status of the analyzed samples was established. CONCLUSION These results illustrate for the first time the CRC mutational profile in Iranian patients. MSH3, MSH6, APC and PIK3CA genes seem to play a bigger role in the path to cancer in this population. These findings will potentially lead to informed genetic diagnosis protocols and targeted therapeutic strategies. PMID:28002797
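
    The per-gene shares quoted above follow directly from the reported counts, as the short check below illustrates. It is only an arithmetic sketch: the counts are copied from the abstract, while the variable names and the assumption that "MMR" refers to MSH3 plus MSH6 are ours.

```python
# Validated variant counts per gene, as listed in the abstract.
counts = {
    "MSH3": 11, "MSH6": 5, "AMER1": 4, "APC": 10, "BRAF": 1, "KRAS": 1,
    "PIK3CA": 6, "TGFβR2A": 4, "SMAD4": 1, "SOX9": 2, "TCF7L2": 3, "TP53": 3,
}

total = sum(counts.values())

# Share of all validated variants per gene, rounded to a whole percent.
shares = {gene: round(100 * n / total) for gene, n in counts.items()}

# Assumption: the mismatch repair (MMR) group is MSH3 + MSH6.
mmr_sum_of_rounded = shares["MSH3"] + shares["MSH6"]
mmr_pooled = round(100 * (counts["MSH3"] + counts["MSH6"]) / total)
```

    Under that assumption, the 32% quoted for MMR genes equals the sum of the two rounded shares (22% + 10%), whereas pooling the counts first (16 of 51) rounds to 31%.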

  8. Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data

    NASA Astrophysics Data System (ADS)

    Sandmann, Sarah; de Graaf, Aniek O.; Karimi, Mohsen; van der Reijden, Bert A.; Hellström-Lindberg, Eva; Jansen, Joop H.; Dugas, Martin

    2017-02-01

    Valid variant calling results are crucial for the use of next-generation sequencing in clinical routine. However, there are numerous variant calling tools that usually differ in algorithms, filtering strategies, recommendations and thus, also in the output. We evaluated eight open-source tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data: GATK HaplotypeCaller, Platypus, VarScan, LoFreq, FreeBayes, SNVer, SAMtools and VarDict. We analysed two real datasets from patients with myelodysplastic syndrome, covering 54 Illumina HiSeq samples and 111 Illumina NextSeq samples. Mutations were validated by re-sequencing on the same platform, on a different platform and expert-based review. In addition we considered two simulated datasets with varying coverage and error profiles, covering 50 samples each. In all cases an identical target region consisting of 19 genes (42,322 bp) was analysed. Altogether, no tool succeeded in calling all mutations. High sensitivity was always accompanied by low precision. Influence of varying coverage and background noise on variant calling was generally low. Taking everything into account, VarDict performed best. However, our results indicate that there is a need to improve reproducibility of the results in the context of multithreading.

  9. Evaluating Variant Calling Tools for Non-Matched Next-Generation Sequencing Data

    PubMed Central

    Sandmann, Sarah; de Graaf, Aniek O.; Karimi, Mohsen; van der Reijden, Bert A.; Hellström-Lindberg, Eva; Jansen, Joop H.; Dugas, Martin

    2017-01-01

    Valid variant calling results are crucial for the use of next-generation sequencing in clinical routine. However, there are numerous variant calling tools that usually differ in algorithms, filtering strategies, recommendations and thus, also in the output. We evaluated eight open-source tools regarding their ability to call single nucleotide variants and short indels with allelic frequencies as low as 1% in non-matched next-generation sequencing data: GATK HaplotypeCaller, Platypus, VarScan, LoFreq, FreeBayes, SNVer, SAMtools and VarDict. We analysed two real datasets from patients with myelodysplastic syndrome, covering 54 Illumina HiSeq samples and 111 Illumina NextSeq samples. Mutations were validated by re-sequencing on the same platform, on a different platform and expert-based review. In addition we considered two simulated datasets with varying coverage and error profiles, covering 50 samples each. In all cases an identical target region consisting of 19 genes (42,322 bp) was analysed. Altogether, no tool succeeded in calling all mutations. High sensitivity was always accompanied by low precision. Influence of varying coverage and background noise on variant calling was generally low. Taking everything into account, VarDict performed best. However, our results indicate that there is a need to improve reproducibility of the results in the context of multithreading. PMID:28233799
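
    Sensitivity and precision, the two measures contrasted above, can be computed from a call set and a validated truth set as in the sketch below. The variant tuples and function names are hypothetical and are not taken from the study; the sketch only illustrates the standard definitions.

```python
def evaluate_calls(called, truth):
    """Compare called variants with a validated truth set.

    Variants are represented as hashable tuples, e.g. (chrom, pos, ref, alt).
    Returns true/false positive and false negative counts plus
    sensitivity = TP / (TP + FN) and precision = TP / (TP + FP).
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and validated
    fp = len(called - truth)   # called but not validated
    fn = len(truth - called)   # validated but missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "precision": precision}


# Hypothetical example data, for illustration only.
truth = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr3", 77, "T", "TA")]
called = [("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 40, "G", "A"), ("chr2", 90, "C", "G"),
          ("chr5", 12, "A", "T")]

result = evaluate_calls(called, truth)
```

    In this toy case the caller recovers three of four validated variants (sensitivity 0.75) while two of its five calls are false positives (precision 0.6), the kind of trade-off the abstract reports.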

  10. Sequencing of SCN5A identifies rare and common variants associated with cardiac conduction

    PubMed Central

    Magnani, Jared W.; Brody, Jennifer A.; Prins, Bram P.; Arking, Dan E.; Lin, Honghuang; Yin, Xiaoyan; Liu, Ching-Ti; Morrison, Alanna C.; Zhang, Feng; Spector, Tim D.; Alonso, Alvaro; Bis, Joshua C.; Heckbert, Susan R.; Lumley, Thomas; Sitlani, Colleen M.; Cupples, L. Adrienne; Lubitz, Steven A.; Soliman, Elsayed Z.; Pulit, Sara L.; Newton-Cheh, Christopher; O'Donnell, Christopher J.; Ellinor, Patrick T.; Benjamin, Emelia J.; Muzny, Donna M.; Gibbs, Richard A.; Santibanez, Jireh; Taylor, Herman A.; Rotter, Jerome I.; Lange, Leslie A.; Psaty, Bruce M.; Jackson, Rebecca; Rich, Stephen S.; Boerwinkle, Eric; Jamshidi, Yalda; Sotoodehnia, Nona

    2014-01-01

    Background The cardiac sodium channel SCN5A regulates atrioventricular and ventricular conduction. Genetic variants in this gene are associated with PR and QRS intervals. We sought to further characterize the contribution of rare and common coding variation in SCN5A to cardiac conduction. Methods and Results In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study (CHARGE), we performed targeted exonic sequencing of SCN5A (n=3699, European-ancestry individuals) and identified 4 common (minor allele frequency >1%) and 157 rare variants. Common and rare SCN5A coding variants were examined for association with PR and QRS intervals through meta-analysis of European ancestry participants from CHARGE, NHLBI’s Exome Sequencing Project (ESP, n=607) and the UK10K (n=1275) and by examining ESP African-ancestry participants (N=972). Rare coding SCN5A variants in aggregate were associated with PR interval in European and African-ancestry participants (P=1.3×10^-3). Three common variants were associated with PR and/or QRS interval duration among European-ancestry participants and one among African-ancestry participants. These included two well-known missense variants; rs1805124 (H558R) was associated with PR and QRS shortening in European-ancestry participants (P=6.25×10^-4 and P=5.2×10^-3 respectively) and rs7626962 (S1102Y) was associated with PR shortening in those of African ancestry (P=2.82×10^-3). Among European-ancestry participants, two novel synonymous variants, rs1805126 and rs6599230, were associated with cardiac conduction. Our top signal, rs1805126 was associated with PR and QRS lengthening (P=3.35×10^-7 and P=2.69×10^-4 respectively), and rs6599230 was associated with PR shortening (P=2.67×10^-5). Conclusions By sequencing SCN5A, we identified novel common and rare coding variants associated with cardiac conduction. PMID:24951663
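
    The common/rare split used above rests on a minor allele frequency (MAF) cut-off of 1%. A minimal sketch of that classification follows; it assumes a diploid autosomal locus with every individual genotyped, and the allele counts in the example are invented for illustration.

```python
def minor_allele_frequency(alt_allele_count, n_individuals):
    """MAF for a biallelic site, assuming two alleles per individual."""
    n_chromosomes = 2 * n_individuals
    alt_freq = alt_allele_count / n_chromosomes
    return min(alt_freq, 1.0 - alt_freq)


def classify_variant(maf, threshold=0.01):
    """Label a variant 'common' if MAF > threshold, otherwise 'rare'."""
    return "common" if maf > threshold else "rare"


N = 3699  # sequenced European-ancestry individuals reported in the abstract

# Hypothetical alternate-allele counts, not taken from the study.
maf_a = minor_allele_frequency(740, N)    # about 10% of 7,398 chromosomes
maf_b = minor_allele_frequency(5, N)      # a handful of carriers
maf_c = minor_allele_frequency(7000, N)   # alternate allele is the major one

labels = [classify_variant(m) for m in (maf_a, maf_b, maf_c)]
```

    Taking the minimum of the two allele frequencies means a site is still classified by its less frequent allele when the alternate allele is the major one.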

  11. Major sequence variants in E7 gene of human papillomavirus type 16 from cervical cancerous and noncancerous lesions of Korean women.

    PubMed

    Song, Y S; Kee, S H; Kim, J W; Park, N H; Kang, S B; Chang, W H; Lee, H P

    1997-08-01

    Geographic specificity of nucleotide sequence variations in the coding and noncoding regions of HPV 16 genome has been reported. Little has been known, however, regarding whether these naturally occurring sequence variations of HPV 16 may result in marked differences in biological properties, such as oncogenic potential. This study was performed to identify sequence variants in the HPV 16 E7 gene derived from Korean women with cervical cancerous and noncancerous lesions, and to assess the association between the sequence variant and the cervical cancer. We examined E7 variants of HPV 16 in a total of 157 patients with no cervical disease (NCD, n = 87) or cervical neoplasia (cervical intraepithelial neoplasia 3, n = 21; cervical carcinoma, n = 49), using the nested polymerase chain reaction (PCR) and the PCR-directed sequencing methods with outer consensus and inner type-specific primers. Forty-two (NCD, n = 9; CIN 3, n = 6; cervical carcinoma, n = 27) of 157 cervical samples contained HPV 16 E7 DNA, but only 8 had prototype sequences. Four variants of the HPV 16 E7 gene were identified. The variant with a single nucleotide change at position 647 (A --> G, Asn --> Ser) was found in about 60% of DNA samples with HPV 16. The second most common variant, found in 16.7% of cases, had three silent mutations at positions 732 (T --> C), 789 (T --> C), and 795 (T --> G). Two other variants were detected, one in a patient with cervical cancer and the other in a patient with no cervical disease. One had a single nucleotide change at position 666 (G --> A) and the other had one silent mutation at position 796 (T --> C). The most common variant in Korea has a change of nucleotide affecting the predicted amino acid related with high antigenicity and binding to retinoblastoma protein. There was a statistically significant trend for this variant to be more frequently detected in cancerous lesions of the uterine cervix than in noncancerous lesions. 
These data suggest that naturally

  12. De novo sequencing and variant calling with nanopores using PoreSeq.

    PubMed

    Szalay, Tamas; Golovchenko, Jene A

    2015-10-01

    The accuracy of sequencing single DNA molecules with nanopores is continually improving, but de novo genome sequencing and assembly using only nanopore data remain challenging. Here we describe PoreSeq, an algorithm that identifies and corrects errors in nanopore sequencing data and improves the accuracy of de novo genome assembly with increasing coverage depth. The approach relies on modeling the possible sources of uncertainty that occur as DNA transits through the nanopore and finds the sequence that best explains multiple reads of the same region. PoreSeq increases nanopore sequencing read accuracy of M13 bacteriophage DNA from 85% to 99% at 100× coverage. We also use the algorithm to assemble Escherichia coli with 30× coverage and the λ genome at a range of coverages from 3× to 50×. Additionally, we classify sequence variants at an order of magnitude lower coverage than is possible with existing methods.

  13. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein.

  14. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  15. Identification of polymorphisms and sequence variants in the human homologue of the mouse natural resistance-associated macrophage protein gene

    SciTech Connect

    Liu, Jing; Fujiwara, T.M.; Buu, N.T.; Sanchez, F.O.; Cellier, M.; Paradis, A.J.; Frappier, D.; Skamene, E.; Gros, P.; Morgan, K.

    1995-04-01

    The most common mycobacterial disease in humans is tuberculosis, and there is evidence for genetic factors in susceptibility to tuberculosis. In the mouse, the Bcg gene controls macrophage priming for activation and is a major gene for susceptibility to infection with mycobacteria. A candidate gene for Bcg was identified by positional cloning and was designated "natural resistance-associated macrophage protein gene" (Nramp1), and the human homologue (NRAMP1) has recently been cloned. Here we report (1) the physical mapping of NRAMP1 close to VIL in chromosome region 2q35 by PCR analysis of somatic cell hybrids and YAC cloning and (2) the identification of nine sequence variants in NRAMP1. Of the four variants in the coding region, there were two missense mutations and two silent substitutions. The missense mutations were a conservative alanine-to-valine substitution at codon 318 in exon 9 and an aspartic acid-to-asparagine substitution at codon 543 in the predicted cytoplasmic tail of the NRAMP1 protein. A microsatellite was located in the immediate 5' region of the gene, three variants were in introns, and one variant was located in the 3' UTR. The allele frequencies of each of the nine variants were determined in DNA samples of 60 Caucasians and 20 Asians. In addition, we have physically linked two highly polymorphic microsatellite markers, D2S104 and D2S173, to NRAMP1 on a 1.5-Mb YAC contig. These molecular markers will be useful to assess the role of NRAMP1 in susceptibility to tuberculosis and other macrophage-mediated diseases. 40 refs., 3 figs., 2 tabs.

  16. Identification and DNA sequence analysis of 15 new α1-antitrypsin variants, including two PI*QO alleles and one deficient PI*M allele

    SciTech Connect

    Faber, J.P.; Kirchgesser, M.; Schwaab, R.; Bidlingmaier, F.; Poller, W.; Weidinger, S.; Olek, K.

    1994-12-01

    The authors have investigated the molecular basis of 15 new α1-antitrypsin (α1AT) variants. Phenotyping by isoelectric focusing (IEF) was used as a screening method to detect α1AT variants at the protein level. Genotyping was then performed by sequence analysis of all coding exons, exon-intron junctions, and the hepatocyte-specific promoter region including exon Ic. Three of these rare variants are alleles of clinical relevance, associated with undetectable or very low serum levels of α1AT: the PI*Q0saarbruecken allele generated by a 1-bp C-nucleotide insertion within a stretch of seven cytosines spanning residues 360-362, resulting in a 3' frameshift and the acquisition of a stop codon at residue 376; a point mutation in the PI*Q0lisbon allele, resulting in a single amino acid substitution Thr68(ACC)→Ile(ATC); and an in-frame trinucleotide deletion ΔPhe51 (TTC) in the highly deficient PI*Mpalermo allele. The remaining 12 alleles are associated with normal α1AT serum levels and are characterized by point mutations causing single amino acid substitutions in all but one case. This exception is a silent mutation, which does not affect the amino acid sequence. The limitation of IEF compared with DNA sequence analysis, for identification of new variants, their generation by mutagenesis, and the clinical relevance of the three deficiency alleles are discussed.

  17. Whole-genome sequencing of two probands with hereditary spastic paraplegia reveals novel splice-donor region variant and known pathogenic variant in SPG11

    PubMed Central

    Chan, Anne Yin-Yan; Au, Wing Chi; Shen, Yun; Chan, Ting Fung

    2016-01-01

    Hereditary spastic paraplegias (HSPs) are a group of heterogeneous neurodegenerative disorders, which often present with overlapping phenotypes such as progressive paraparesis and spasticity. To assist the diagnosis of HSP subtypes, next-generation sequencing is often used to provide supporting evidence. In this study, we report the case of two probands from the same family with HSP symptoms, including bilateral lower limb weakness, unsteady gait, cognitive decline, dysarthria, and slurring of speech since the age of 14. Subsequent whole-genome sequencing revealed that the patients are compound heterozygous for variants in the SPG11 gene, including the paternally inherited c.6856C>T (p.Arg2286*) variant and the novel maternally inherited c.2316+5G>A splice-donor region variant. Variants in SPG11 are the common cause of autosomal recessive spastic paraplegia type 11. According to the ClinVar database, there are already 101 reported pathogenic variants in SPG11 that are associated with HSPs. To our knowledge, this is the first report of SPG11 variants in our local population. The novel splice variant identified in this study enriches the catalog of SPG11 variants, potentially leading to better genetic diagnosis of HSPs. PMID:27900367
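
    The coding and protein names of the nonsense variant above (c.6856C>T, p.Arg2286*) can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the reference codon is not given in the abstract, so the last step is an inference from the standard genetic code rather than a reported result.

```python
# Minimal sketch: relate a coding-DNA position (c.) to its codon number and
# position within the codon, then ask which arginine codon could become a
# stop codon through a C>T change at that position.
# All names here are hypothetical helpers, not part of any published tool.

def codon_index(c_pos: int) -> int:
    """1-based codon number containing 1-based coding position c_pos."""
    return (c_pos - 1) // 3 + 1

def codon_offset(c_pos: int) -> int:
    """Position within the codon (1, 2 or 3) of coding position c_pos."""
    return (c_pos - 1) % 3 + 1

# Standard genetic code facts used below.
ARG_CODONS = ["CGA", "CGC", "CGG", "CGT", "AGA", "AGG"]
STOP_CODONS = {"TAA", "TAG", "TGA"}

def arg_codons_giving_stop(offset: int) -> list[str]:
    """Arg codons that turn into a stop codon by C>T at the given offset."""
    hits = []
    for codon in ARG_CODONS:
        if codon[offset - 1] != "C":
            continue  # a C>T change needs a C at this position
        mutated = codon[:offset - 1] + "T" + codon[offset:]
        if mutated in STOP_CODONS:
            hits.append(codon)
    return hits

idx = codon_index(6856)       # codon number for c.6856
off = codon_offset(6856)      # position of c.6856 within that codon
candidates = arg_codons_giving_stop(off)
print(idx, off, candidates)
```

    Under these assumptions the codon number agrees with the protein-level name, and only one arginine codon is compatible with a C>T change that creates a stop codon at that position.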

  18. Rapid Detection of Rare Deleterious Variants by Next Generation Sequencing with Optional Microarray SNP Genotype Data.

    PubMed

    Watson, Christopher M; Crinnion, Laura A; Gurgel-Gianetti, Juliana; Harrison, Sally M; Daly, Catherine; Antanavicuite, Agne; Lascelles, Carolina; Markham, Alexander F; Pena, Sergio D J; Bonthron, David T; Carr, Ian M

    2015-09-01

    Autozygosity mapping is a powerful technique for the identification of rare, autosomal recessive, disease-causing genes. The ease with which this category of disease gene can be identified has greatly increased through the availability of genome-wide SNP genotyping microarrays and subsequently of exome sequencing. Although these methods have simplified the generation of experimental data, its analysis, particularly when disparate data types must be integrated, remains time consuming. Moreover, the huge volume of sequence variant data generated from next generation sequencing experiments opens up the possibility of using these data instead of microarray genotype data to identify disease loci. To allow these two types of data to be used in an integrated fashion, we have developed AgileVCFMapper, a program that performs both the mapping of disease loci by SNP genotyping and the analysis of potentially deleterious variants using exome sequence variant data, in a single step. This method does not require microarray SNP genotype data, although analysis with a combination of microarray and exome genotype data enables more precise delineation of disease loci, due to superior marker density and distribution.

  19. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  20. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  1. Novel scripts for improved annotation and selection of variants from whole exome sequencing in cancer research

    PubMed Central

    Hansen, Marcus Celik; Nederby, Line; Roug, Anne; Villesen, Palle; Kjeldsen, Eigil; Nyvold, Charlotte Guldborg; Hokland, Peter

    2015-01-01

    Sequencing the exome is quickly becoming the preferred method for discovering disease-inducing mutations. While obtaining data sets is a straightforward procedure, the subsequent analysis and interpretation of the data is a limiting step for clinical applications. Thus, while the initial mutation and variant calling can be performed by a bioinformatician or trained researcher, the output from robust packages such as MuTect and GATK is not directly informative for the general life scientist. In an attempt to obviate this problem we have created complementary Wolfram scripts, which enable easy downstream annotation and selection, presented here in the perspective of hematological relevance. They also provide the researcher with the opportunity to extend the analysis by having the full-fledged programming and analysis environment of Mathematica at hand. In brief, post-processing is performed by:
    • Mapping of germ line and somatic variants to coding regions, and defining variant sets within Mathematica.
    • Processing of variants in variant effect predictor.
    • Extended annotation, relevance scoring and defining focus areas through the provided functions.
    PMID:26150983

  2. Complete nucleotide sequence analysis of the norovirus GII.4 Sydney variant in South Korea.

    PubMed

    Park, Ji-Sun; Lee, Sung-Geun; Jin, Ji-Young; Cho, Han-Gil; Jheong, Weon-Hwa; Paik, Soon-Young

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines.

  3. Genetic Variants Identified from Epilepsy of Unknown Etiology in Chinese Children by Targeted Exome Sequencing.

    PubMed

    Wang, Yimin; Du, Xiaonan; Bin, Rao; Yu, Shanshan; Xia, Zhezhi; Zheng, Guo; Zhong, Jianmin; Zhang, Yunjian; Jiang, Yong-Hui; Wang, Yi

    2017-01-11

    Genetic factors play a major role in the etiology of epilepsy disorders. Recent genomics studies using next generation sequencing (NGS) technique have identified a large number of genetic variants including copy number (CNV) and single nucleotide variant (SNV) in a small set of genes from individuals with epilepsy. These discoveries have contributed significantly to evaluate the etiology of epilepsy in clinic and lay the foundation to develop molecular specific treatment. However, the molecular basis for a majority of epilepsy patients remains elusive, and furthermore, most of these studies have been conducted in Caucasian children. Here we conducted a targeted exome-sequencing of 63 trios of Chinese epilepsy families using a custom-designed NGS panel that covers 412 known and candidate genes for epilepsy. We identified pathogenic and likely pathogenic variants in 15 of 63 (23.8%) families in known epilepsy genes including SCN1A, CDKL5, STXBP1, CHD2, SCN3A, SCN9A, TSC2, MBD5, POLG and EFHC1. More importantly, we identified likely pathologic variants in several novel candidate genes such as GABRE, MYH1, and CLCN6. Our results provide the evidence supporting the application of custom-designed NGS panel in clinic and indicate a conserved genetic susceptibility for epilepsy between Chinese and Caucasian children.

  4. Genetic Variants Identified from Epilepsy of Unknown Etiology in Chinese Children by Targeted Exome Sequencing

    PubMed Central

    Wang, Yimin; Du, Xiaonan; Bin, Rao; Yu, Shanshan; Xia, Zhezhi; Zheng, Guo; Zhong, Jianmin; Zhang, Yunjian; Jiang, Yong-hui; Wang, Yi

    2017-01-01

    Genetic factors play a major role in the etiology of epilepsy disorders. Recent genomics studies using next generation sequencing (NGS) technique have identified a large number of genetic variants including copy number (CNV) and single nucleotide variant (SNV) in a small set of genes from individuals with epilepsy. These discoveries have contributed significantly to evaluate the etiology of epilepsy in clinic and lay the foundation to develop molecular specific treatment. However, the molecular basis for a majority of epilepsy patients remains elusive, and furthermore, most of these studies have been conducted in Caucasian children. Here we conducted a targeted exome-sequencing of 63 trios of Chinese epilepsy families using a custom-designed NGS panel that covers 412 known and candidate genes for epilepsy. We identified pathogenic and likely pathogenic variants in 15 of 63 (23.8%) families in known epilepsy genes including SCN1A, CDKL5, STXBP1, CHD2, SCN3A, SCN9A, TSC2, MBD5, POLG and EFHC1. More importantly, we identified likely pathologic variants in several novel candidate genes such as GABRE, MYH1, and CLCN6. Our results provide the evidence supporting the application of custom-designed NGS panel in clinic and indicate a conserved genetic susceptibility for epilepsy between Chinese and Caucasian children. PMID:28074849

  5. A sequence variant associating with educational attainment also affects childhood cognition

    PubMed Central

    Gunnarsson, Bjarni; Jónsdóttir, Guðrún A.; Björnsdóttir, Gyða; Konte, Bettina; Sulem, Patrick; Kristmundsdóttir, Snædís; Kehr, Birte; Gústafsson, Ómar; Helgason, Hannes; Iordache, Paul D.; Ólafsson, Sigurgeir; Frigge, Michael L.; Þorleifsson, Guðmar; Arnarsdóttir, Sunna; Stefánsdóttir, Berglind; Giegling, Ina; Djurovic, Srdjan; Sundet, Kjetil S.; Espeseth, Thomas; Melle, Ingrid; Hartmann, Annette M.; Thorsteinsdottir, Unnur; Kong, Augustine; Guðbjartsson, Daníel F.; Ettinger, Ulrich; Andreassen, Ole A.; Rujescu, Dan; Halldórsson, Jónas G.; Stefánsson, Hreinn; Halldórsson, Bjarni V.; Stefánsson, Kári

    2016-01-01

    Only a few common variants in the sequence of the genome have been shown to impact cognitive traits. Here we demonstrate that polygenic scores of educational attainment predict specific aspects of childhood cognition, as measured with IQ. Recently, three sequence variants were shown to associate with educational attainment, a confluence phenotype of genetic and environmental factors contributing to academic success. We show that one of these variants associating with educational attainment, rs4851266-T, also associates with Verbal IQ in dyslexic children (P = 4.3 × 10⁻⁴, β = 0.16 s.d.). The effect of 0.16 s.d. corresponds to 1.4 IQ points for heterozygotes and 2.8 IQ points for homozygotes. We verified this association in independent samples consisting of adults (P = 8.3 × 10⁻⁵, β = 0.12 s.d., combined P = 2.2 × 10⁻⁷, β = 0.14 s.d.). Childhood cognition is unlikely to be affected by education attained later in life, and the variant explains a greater fraction of the variance in verbal IQ than in educational attainment (0.7% vs 0.12%, P = 1.0 × 10⁻⁵). PMID:27811963

  6. Adjusted sequence kernel association test for rare variants controlling for cryptic and family relatedness.

    PubMed

    Oualkacha, Karim; Dastani, Zari; Li, Rui; Cingolani, Pablo E; Spector, Timothy D; Hammond, Christopher J; Richards, J Brent; Ciampi, Antonio; Greenwood, Celia M T

    2013-05-01

    Recent progress in sequencing technologies makes it possible to identify rare and unique variants that may be associated with complex traits. However, the results of such efforts depend crucially on the use of efficient statistical methods and study designs. Although family-based designs might enrich a data set for familial rare disease variants, most existing rare variant association approaches assume independence of all individuals. We introduce here a framework for association testing of rare variants in family-based designs. This framework is an adaptation of the sequence kernel association test (SKAT) which allows us to control for family structure. Our adjusted SKAT (ASKAT) combines the SKAT approach and the factored spectrally transformed linear mixed models (FaST-LMMs) algorithm to capture family effects based on a LMM incorporating the realized proportion of the genome that is identical by descent between pairs of individuals, and using restricted maximum likelihood methods for estimation. In simulation studies, we evaluated type I error and power of this proposed method and we showed that regardless of the level of the trait heritability, our approach has good control of type I error and good power. Since our approach uses FaST-LMM to calculate variance components for the proposed mixed model, ASKAT is reasonably fast and can analyze hundreds of thousands of markers. Data from the UK twins consortium are presented to illustrate the ASKAT methodology.

  7. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  8. Principles and Recommendations for Standardizing the Use of the Next-Generation Sequencing Variant File in Clinical Settings.

    PubMed

    Lubin, Ira M; Aziz, Nazneen; Babb, Lawrence J; Ballinger, Dennis; Bisht, Himani; Church, Deanna M; Cordes, Shaun; Eilbeck, Karen; Hyland, Fiona; Kalman, Lisa; Landrum, Melissa; Lockhart, Edward R; Maglott, Donna; Marth, Gabor; Pfeifer, John D; Rehm, Heidi L; Roy, Somak; Tezak, Zivana; Truty, Rebecca; Ullman-Cullere, Mollie; Voelkerding, Karl V; Worthey, Elizabeth; Zaranek, Alexander W; Zook, Justin M

    2017-03-15

    A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities.

  9. Polymorphisms and variants in the prion protein sequence of European moose (Alces alces), reindeer (Rangifer tarandus), roe deer (Capreolus capreolus) and fallow deer (Dama dama) in Scandinavia.

    PubMed

    Wik, Lotta; Mikko, Sofia; Klingeborn, Mikael; Stéen, Margareta; Simonsson, Magnus; Linné, Tommy

    2012-07-01

    The prion protein (PrP) sequence of European moose, reindeer, roe deer and fallow deer in Scandinavia has high homology to the PrP sequence of North American cervids. Variants in the European moose PrP sequence were found at amino acid position 109 as K or Q. The 109Q variant is unique in the PrP sequence of vertebrates. During the 1980s a wasting syndrome in Swedish moose, Moose Wasting Syndrome (MWS), was described. SNP analysis demonstrated a difference in the observed genotype proportions of the heterozygous Q/K and homozygous Q/Q variants in the MWS animals compared with the healthy animals. In MWS moose the allele frequencies for 109K and 109Q were 0.73 and 0.27, respectively, and for healthy animals 0.69 and 0.31. Both alleles were seen as heterozygotes and homozygotes. In reindeer, PrP sequence variation was demonstrated at codon 176 as D or N and codon 225 as S or Y. The PrP sequences in roe deer and fallow deer were identical with published GenBank sequences.

  10. Possession of ATM Sequence Variants as Predictor for Late Normal Tissue Responses in Breast Cancer Patients Treated With Radiotherapy

    SciTech Connect

    Ho, Alice Y.; Fan, Grace; Atencio, David P.; Green, Sheryl; Formenti, Silvia C.; Haffty, Bruce G.; Iyengar, Preetha B.A.; Bernstein, Jonine L.; Stock, Richard G.; Cesaretti, Jamie A.; Rosenstein, Barry S.

    2007-11-01

    Purpose: The ATM gene product is a central component of cell cycle regulation and genomic surveillance. We hypothesized that DNA sequence alterations in ATM predict for adverse effects after external beam radiotherapy for early breast cancer. Methods and Materials: A total of 131 patients with a minimum of 2 years follow-up who had undergone breast-conserving surgery and adjuvant radiotherapy were screened for sequence alterations in ATM using DNA from blood lymphocytes. Genetic variants were identified using denaturing high performance liquid chromatography. The Radiation Therapy Oncology Group late morbidity scoring schemes for skin and subcutaneous tissues were applied to quantify the radiation-induced effects. Results: Of the 131 patients, 51 possessed ATM sequence alterations located within exons or in short intron regions flanking each exon that encompass putative splice site regions. Of these 51 patients, 21 (41%) exhibited a minimum of a Grade 2 late radiation response. In contrast, of the 80 patients without an ATM sequence variation, only 18 (23%) had radiation-induced adverse responses, for an odds ratio of 2.4 (95% confidence interval, 1.1-5.2). Fifteen patients were heterozygous for the G→A polymorphism at nucleotide 5557, which causes substitution of asparagine for aspartic acid at position 1853 of the ATM protein. Of these 15 patients, 8 (53%) exhibited a Grade 2-4 late response compared with 31 (27%) of the 116 patients without this alteration, for an odds ratio of 3.1 (95% confidence interval, 1.1-9.4). Conclusion: Sequence variants located in the ATM gene, in particular the 5557 G→A polymorphism, may predict for late adverse radiation responses in breast cancer patients.
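
    The odds ratios quoted above can be recomputed from the patient counts given in the abstract. The sketch below is an illustration, not the authors' analysis: the abstract does not say how the confidence intervals were obtained, so the log-scale (Woolf) normal approximation used here is an assumption, and the function name is hypothetical.

```python
import math

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and approximate 95% CI for a 2x2 table.

    a: carriers with the outcome      b: carriers without the outcome
    c: non-carriers with the outcome  d: non-carriers without the outcome
    The interval uses the Woolf (log-scale normal) approximation, which is
    an assumption; the abstract does not name the method.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Any ATM alteration: 21 of 51 carriers vs 18 of 80 non-carriers responded.
or_any, lo_any, hi_any = odds_ratio_ci(21, 51 - 21, 18, 80 - 18)

# 5557 G>A: 8 of 15 carriers vs 31 of 116 non-carriers responded.
or_5557, lo_5557, hi_5557 = odds_ratio_ci(8, 15 - 8, 31, 116 - 31)

print(f"any alteration: OR {or_any:.2f} ({lo_any:.2f}-{hi_any:.2f})")
print(f"5557 G>A:       OR {or_5557:.2f} ({lo_5557:.2f}-{hi_5557:.2f})")
```

    With these counts the point estimates round to the reported 2.4 and 3.1, and the approximate intervals land close to the published ones; small differences at the bounds would be expected if the authors used an exact method.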

  11. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data.

    PubMed

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.

  12. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    PubMed Central

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637

  13. Sequence variants in oxytocin pathway genes and preterm birth: a candidate gene association study

    PubMed Central

    2013-01-01

    Background Preterm birth (PTB) is a complex disorder associated with significant neonatal mortality and morbidity and long-term adverse health consequences. Multiple lines of evidence suggest that genetic factors play an important role in its etiology. This study was designed to identify genetic variation associated with PTB in oxytocin pathway genes whose role in parturition is well known. Methods To identify common genetic variants predisposing to PTB, we genotyped 16 single nucleotide polymorphisms (SNPs) in the oxytocin (OXT), oxytocin receptor (OXTR), and leucyl/cystinyl aminopeptidase (LNPEP) genes in 651 case infants from the U.S. and one or both of their parents. In addition, we examined the role of rare genetic variation in susceptibility to PTB by conducting direct sequence analysis of OXTR in 1394 cases and 1112 controls from the U.S., Argentina, Denmark, and Finland. This study was further extended to maternal triads (maternal grandparents-mother of a case infant, N=309). We also performed in vitro analysis of selected rare OXTR missense variants to evaluate their functional importance. Results Maternal genetic effect analysis of the SNP genotype data revealed four SNPs in LNPEP that show significant association with prematurity. In our case–control sequence analysis, we detected fourteen coding variants in exon 3 of OXTR, all but four of which were found in cases only. Of the fourteen variants, three were previously unreported novel rare variants. When the sequence data from the maternal triads were analyzed using the transmission disequilibrium test, two common missense SNPs (rs4686302 and rs237902) in OXTR showed suggestive association for three gestational age subgroups. In vitro functional assays showed a significant difference in ligand binding between wild-type and two mutant receptors. Conclusions Our study suggests an association between maternal common polymorphisms in LNPEP and susceptibility to PTB. Maternal OXTR missense SNPs rs4686302

  14. Targeted next-generation sequencing reveals multiple deleterious variants in OPLL-associated genes

    PubMed Central

    Chen, Xin; Guo, Jun; Cai, Tao; Zhang, Fengshan; Pan, Shengfa; Zhang, Li; Wang, Shaobo; Zhou, Feifei; Diao, Yinze; Zhao, Yanbin; Chen, Zhen; Liu, Xiaoguang; Chen, Zhongqiang; Liu, Zhongjun; Sun, Yu; Du, Jie

    2016-01-01

    Ossification of the posterior longitudinal ligament of the spine (OPLL), which is characterized by ectopic bone formation in the spinal ligaments, can cause spinal-cord compression. To date, at least 11 susceptibility genes have been genetically linked to OPLL. In order to identify potential deleterious alleles in these OPLL-associated genes, we designed a capture array encompassing all coding regions of the target genes for next-generation sequencing (NGS) in a cohort of 55 unrelated patients with OPLL. By bioinformatics analyses, we successfully identified three novel and five extremely rare variants (MAF < 0.005). These variants were predicted to be deleterious by various commonly used algorithms, thereby resulting in missense mutations in four OPLL-associated genes (i.e., COL6A1, COL11A2, FGFR1, and BMP2). Furthermore, potential effects of the patient with p.Q89E of BMP2 were confirmed by a markedly increased BMP2 level in peripheral blood samples. Notably, seven of the variants were found to be associated with the patients with continuous subtype changes by cervical spinal radiological analyses. Taken together, our findings revealed for the first time that deleterious coding variants of the four OPLL-associated genes are potentially pathogenic in the patients with OPLL. PMID:27246988

  15. Deep Sequencing Reveals Potential Antigenic Variants at Low Frequencies in Influenza A Virus-Infected Humans

    PubMed Central

    Dinis, Jorge M.; Florek, Nicholas W.; Fatola, Omolayo O.; Moncla, Louise H.; Mutschler, James P.; Charlier, Olivia K.; Meece, Jennifer K.; Belongia, Edward A.

    2016-01-01

    ABSTRACT Influenza vaccines must be frequently reformulated to account for antigenic changes in the viral envelope protein, hemagglutinin (HA). The rapid evolution of influenza virus under immune pressure is likely enhanced by the virus's genetic diversity within a host, although antigenic change has rarely been investigated on the level of individual infected humans. We used deep sequencing to characterize the between- and within-host genetic diversity of influenza viruses in a cohort of patients that included individuals who were vaccinated and then infected in the same season. We characterized influenza HA segments from the predominant circulating influenza A subtypes during the 2012-2013 (H3N2) and 2013-2014 (pandemic H1N1; H1N1pdm) flu seasons. We found that HA consensus sequences were similar in nonvaccinated and vaccinated subjects. In both groups, purifying selection was the dominant force shaping HA genetic diversity. Interestingly, viruses from multiple individuals harbored low-frequency mutations encoding amino acid substitutions in HA antigenic sites at or near the receptor-binding domain. These mutations included two substitutions in H1N1pdm viruses, G158K and N159K, which were recently found to confer escape from virus-specific antibodies. These findings raise the possibility that influenza antigenic diversity can be generated within individual human hosts but may not become fixed in the viral population even when they would be expected to have a strong fitness advantage. Understanding constraints on influenza antigenic evolution within individual hosts may elucidate potential future pathways of antigenic evolution at the population level. IMPORTANCE Influenza vaccines must be frequently reformulated due to the virus's rapid evolution rate. We know that influenza viruses exist within each infected host as a “swarm” of genetically distinct viruses, but the role of this within-host diversity in the antigenic evolution of influenza has been unclear

  16. Identification of novel functional sequence variants in the gene for peptidase inhibitor 3

    PubMed Central

    Chowdhury, Mahboob A; Kuivaniemi, Helena; Romero, Roberto; Edwin, Samuel; Chaiworapongsa, Tinnakorn; Tromp, Gerard

    2006-01-01

    Background Peptidase inhibitor 3 (PI3) inhibits neutrophil elastase and proteinase-3, and has a potential role in skin and lung diseases as well as in cancer. Genome-wide expression profiling of chorioamniotic membranes revealed decreased expression of PI3 in women with preterm premature rupture of membranes. To elucidate the molecular mechanisms contributing to the decreased expression in amniotic membranes, the PI3 gene was searched for sequence variations and the functional significance of the identified promoter variants was studied. Methods Single nucleotide polymorphisms (SNPs) were identified by direct sequencing of PCR products spanning a region from 1,173 bp upstream to 1,266 bp downstream of the translation start site. Fourteen SNPs were genotyped from 112 and nine SNPs from 24 unrelated individuals. Putative transcription factor binding sites as detected by in silico search were verified by electrophoretic mobility shift assay (EMSA) using nuclear extracts from HeLa and amnion cells. Deviation from Hardy-Weinberg equilibrium (HWE) was tested by χ² goodness-of-fit test. Haplotypes were estimated using expectation maximization (EM) algorithm. Results Twenty-three sequence variations were identified by direct sequencing of polymerase chain reaction (PCR) products covering 2,439 nt of the PI3 gene (-1,173 nt of promoter sequences and all three exons). Analysis of 112 unrelated individuals showed that 20 variants had minor allele frequencies (MAF) ranging from 0.02 to 0.46 representing "true polymorphisms", while three had MAF ≤ 0.01. Eleven variants were in the promoter region; several putative transcription factor binding sites were found at these sites by database searches. Differential binding of transcription factors was demonstrated at two polymorphic sites by electrophoretic mobility shift assays, both in amniotic and HeLa cell nuclear extracts. Differential binding of the transcription factor GATA1 at -689C>G site was confirmed by a
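
    The Hardy-Weinberg check named in the Methods of this record can be sketched for a single biallelic SNP as follows. This is a generic textbook version, not the authors' code: the function name and the genotype counts are made up for illustration, and no genotype counts from the study appear in the abstract.

```python
import math

def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for one biallelic SNP and returns
    (chi2, p_value). One degree of freedom is used: three genotype classes,
    minus one, minus one for the allele frequency estimated from the data.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele a
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 degree of freedom, the upper-tail chi-square probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts, chosen only to exercise the function.
chi2_fit, p_fit = hwe_chi_square(25, 50, 25)   # exactly at HWE proportions
chi2_dev, p_dev = hwe_chi_square(30, 40, 30)   # deficit of heterozygotes
print(chi2_fit, p_fit)
print(chi2_dev, p_dev)
```

    For variants with very low minor allele frequency the expected count of rare homozygotes becomes small, and an exact test is usually preferred over the chi-square approximation.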

  17. Novel inhibitors of human leukocyte elastase and cathepsin G. Sequence variants of squash seed protease inhibitor with altered protease selectivity

    SciTech Connect

    McWherter, C.A.; Walkenhorst, W.F.; Glover, G.I.; Campbell, E.J.

    1989-07-11

    Novel peptide inhibitors of human leukocyte elastase (HLE) and cathepsin G (CG) were prepared by solid-phase peptide synthesis of P1 amino acid sequence variants of Cucurbita maxima trypsin inhibitor III (CMTI-III), a 29-residue peptide found in squash seed. A systematic study of P1 variants indicated that P1 Arg, Lys, Leu, Ala, Phe, and Met inhibit trypsin; P1 Val, Ile, Gly, Leu, Ala, Phe, and Met inhibit HLE; and P1 Leu, Ala, Phe, and Met inhibit CG and chymotrypsin. Variants with P1 Val, Ile, or Gly were selective inhibitors of HLE, while inhibition of trypsin required P1 amino acids with an unbranched β carbon. Studies of Val-5-CMTI-III (P1 Val) inhibition of HLE demonstrated a 1:1 binding stoichiometry with a (K_i)_app of 8.7 nM. Inhibition of HLE by Gly-5-CMTI-III indicated a significant role for reactive-site structural moieties other than the P1 side chain. Val-5-CMTI-III inhibited both HLE and human polymorphonuclear leukocyte (PMN) proteolysis of surface-bound ¹²⁵I-labeled fibronectin. Val-5-CMTI-III was more effective at preventing turnover of a peptide p-nitroanilide substrate than at halting dissolution of ¹²⁵I-labeled fibronectin. It was about as effective as human serum α1-proteinase inhibitor in preventing PMN degradation of the connective tissue substrate. In addition to providing interesting candidates for controlling inflammatory cell proteolytic injury, the CMTI-based inhibitors are ideal for studying molecular recognition because of their small size, their ease of preparation, and the availability of sensitive and quantitative assays for intermolecular interactions.

  18. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-)

    SciTech Connect

    Hirono, A.; Beutler, E.

    1988-06-01

    Glucose-6-phosphate dehydrogenase A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones or in the published cDNA sequence. Consisting of 138 bases and encoding 46 amino acids, this segment of cDNA apparently is derived from alternative splicing involving the 3′ end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C³³ → G, G²⁰² → A, and A³⁷⁶ → G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein.

  19. A new system identification approach to identify genetic variants in sequencing studies for a binary phenotype.

    PubMed

    Kang, Guolian; Bi, Wenjian; Zhao, Yanlong; Zhang, Ji-Feng; Yang, Jun J; Xu, Heng; Loh, Mignon L; Hunger, Stephen P; Relling, Mary V; Pounds, Stanley; Cheng, Cheng

    2014-01-01

    We propose in this paper a set-valued (SV) system model, which is a generalized form of logistic (LG) and probit (Probit) regression, to be considered as a method for discovering genetic variants, especially rare genetic variants in next-generation sequencing studies, for a binary phenotype. We propose a new SV system identification method to estimate all underlying key system parameters for the Probit model and compare it with the LG model in the setting of genetic association studies. Across an extensive series of simulation studies, the Probit method maintained type I error control and had similar or greater power than the LG method, and it was robust to different distributions of noise: logistic, normal, or t distributions. Additionally, the Probit association parameter estimate was 2.7- to 46.8-fold less variable than the LG log-odds ratio association parameter estimate. Less variability in the association parameter estimate translates to greater power and robustness across the spectrum of minor allele frequencies (MAFs), and these advantages are the most pronounced for rare variants. For instance, in a simulation that generated data from an additive logistic model with an odds ratio of 7.4 for a rare single nucleotide polymorphism with a MAF of 0.005 and a sample size of 2,300, the Probit method had 60% power whereas the LG method had 25% power at the α = 10⁻⁶ level. Consistent with these simulation results, the set of variants identified by the LG method was a subset of those identified by the Probit method in two example analyses. Thus, we suggest the Probit method may be a competitive alternative to the LG method in genetic association studies such as candidate gene, genome-wide, or next-generation sequencing studies for a binary phenotype.
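
    The logistic and probit models compared in this record differ in the link function that maps a linear predictor to a case probability. The following is a minimal sketch of that difference only, not the authors' set-valued identification method; the function names and the baseline intercept are assumptions made for illustration.

```python
import math


def logistic_cdf(x: float) -> float:
    """Inverse logit link: standard logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-x))


def probit_cdf(x: float) -> float:
    """Inverse probit link: standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def case_probability(minor_alleles: int, intercept: float, effect: float, link) -> float:
    """Additive model: P(case) = link(intercept + effect * minor allele count)."""
    return link(intercept + effect * minor_alleles)


def odds(p: float) -> float:
    return p / (1.0 - p)


# Under the logistic link, exp(effect) is the per-allele odds ratio.
# The intercept of -2.0 is an arbitrary illustrative baseline.
effect = math.log(7.4)
p0 = case_probability(0, -2.0, effect, logistic_cdf)
p1 = case_probability(1, -2.0, effect, logistic_cdf)
per_allele_odds_ratio = odds(p1) / odds(p0)
```

    Under the probit link the same coefficient is a shift on the standard normal scale rather than a log-odds ratio, so estimates from the two models are not directly comparable in magnitude.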

  20. Rare Variants in Neurodegeneration Associated Genes Revealed by Targeted Panel Sequencing in a German ALS Cohort

    PubMed Central

    Krüger, Stefanie; Battke, Florian; Sprecher, Andrea; Munz, Marita; Synofzik, Matthis; Schöls, Ludger; Gasser, Thomas; Grehl, Torsten; Prudlo, Johannes; Biskup, Saskia

    2016-01-01

    Amyotrophic lateral sclerosis (ALS) is a progressive fatal multisystemic neurodegenerative disorder caused by preferential degeneration of upper and lower motor neurons. To further delineate the genetic architecture of the disease, we used comprehensive panel sequencing in a cohort of 80 German ALS patients. The panel covered 39 confirmed ALS genes and candidate genes, as well as 238 genes associated with other entities of the neurodegenerative disease spectrum. In addition, we performed repeat length analysis for C9orf72. Our aim was to (1) identify potentially disease-causing variants, to (2) assess a proposed model of polygenic inheritance in ALS and to (3) connect ALS with other neurodegenerative entities. We identified 79 rare potentially pathogenic variants in 27 ALS associated genes in familial and sporadic cases. Five patients had pathogenic C9orf72 repeat expansions, a further four patients harbored intermediate length repeat expansions. Our findings demonstrate that a genetic background of the disease can actually be found in a large proportion of seemingly sporadic cases and that it is not limited to putative most frequently affected genes such as C9orf72 or SOD1. Assessing the polygenic nature of ALS, we identified 15 patients carrying at least two rare potentially pathogenic variants in ALS associated genes including pathogenic or intermediate C9orf72 repeat expansions. Multiple variants might influence severity or duration of disease or could account for intrafamilial phenotypic variability or reduced penetrance. However, we could not observe a correlation with age of onset in this study. We further detected potentially pathogenic variants in other neurodegeneration associated genes in 12 patients, supporting the hypothesis of common pathways in neurodegenerative diseases and linking ALS to other entities of the neurodegenerative spectrum. Most interestingly we found variants in GBE1 and SPG7 which might represent differential diagnoses. 
Based on our

  1. Rare Variants in Neurodegeneration Associated Genes Revealed by Targeted Panel Sequencing in a German ALS Cohort.

    PubMed

    Krüger, Stefanie; Battke, Florian; Sprecher, Andrea; Munz, Marita; Synofzik, Matthis; Schöls, Ludger; Gasser, Thomas; Grehl, Torsten; Prudlo, Johannes; Biskup, Saskia

    2016-01-01

    Amyotrophic lateral sclerosis (ALS) is a progressive fatal multisystemic neurodegenerative disorder caused by preferential degeneration of upper and lower motor neurons. To further delineate the genetic architecture of the disease, we used comprehensive panel sequencing in a cohort of 80 German ALS patients. The panel covered 39 confirmed ALS genes and candidate genes, as well as 238 genes associated with other entities of the neurodegenerative disease spectrum. In addition, we performed repeat length analysis for C9orf72. Our aim was to (1) identify potentially disease-causing variants, to (2) assess a proposed model of polygenic inheritance in ALS and to (3) connect ALS with other neurodegenerative entities. We identified 79 rare potentially pathogenic variants in 27 ALS associated genes in familial and sporadic cases. Five patients had pathogenic C9orf72 repeat expansions, a further four patients harbored intermediate length repeat expansions. Our findings demonstrate that a genetic background of the disease can actually be found in a large proportion of seemingly sporadic cases and that it is not limited to putative most frequently affected genes such as C9orf72 or SOD1. Assessing the polygenic nature of ALS, we identified 15 patients carrying at least two rare potentially pathogenic variants in ALS associated genes including pathogenic or intermediate C9orf72 repeat expansions. Multiple variants might influence severity or duration of disease or could account for intrafamilial phenotypic variability or reduced penetrance. However, we could not observe a correlation with age of onset in this study. We further detected potentially pathogenic variants in other neurodegeneration associated genes in 12 patients, supporting the hypothesis of common pathways in neurodegenerative diseases and linking ALS to other entities of the neurodegenerative spectrum. Most interestingly we found variants in GBE1 and SPG7 which might represent differential diagnoses. 
Based on our

  2. The Use of Non-Variant Sites to Improve the Clinical Assessment of Whole-Genome Sequence Data

    PubMed Central

    Griggio, Francesca; Garonzi, Marianna; Cantaloni, Chiara; Centomo, Cesare; Vargas, Sergio Marin; Descombes, Patrick; Marquis, Julien; Collino, Sebastiano; Franceschi, Claudio; Garagnani, Paolo; Salisbury, Benjamin A.; Harvey, John Max; Delledonne, Massimo

    2015-01-01

    Genetic testing, which is now a routine part of clinical practice and disease management protocols, is often based on the assessment of small panels of variants or genes. On the other hand, continuous improvements in the speed and per-base costs of sequencing have now made whole exome sequencing (WES) and whole genome sequencing (WGS) viable strategies for targeted or complete genetic analysis, respectively. Standard WGS/WES data analytical workflows generally rely on calling of sequence variants with respect to the reference genome sequence. However, the reference genome sequence contains a large number of sites represented by rare alleles, by known pathogenic alleles and by alleles strongly associated with disease by GWAS. It is thus critical, for clinical applications of WGS and WES, to interpret whether non-variant sites are homozygous for the reference allele or if the corresponding genotype cannot be reliably called. Here we show that an alternative analytical approach based on the analysis of both variant and non-variant sites from WGS data allows genotyping of more than 92% of sites corresponding to known SNPs, compared with 6% genotyped by standard variant analysis. These include homozygous reference sites of clinical interest, thus leading to a broad and comprehensive characterization of variation necessary for an accurate evaluation of disease risk. Altogether, our findings indicate that characterization of both variant and non-variant clinically informative sites in the genome is necessary to allow an accurate clinical assessment of a personal genome. Finally, we propose a highly efficient extended VCF (eVCF) file format which allows storage of genotype calls for sites of clinical interest while remaining compatible with current variant interpretation software. PMID:26147798
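
    The distinction drawn in this record, between a site that is confidently homozygous for the reference allele and a site whose genotype cannot be called, can be sketched as a three-way classification. This is an illustrative sketch only: the `Site` fields and both thresholds are hypothetical and are not taken from the paper or from the eVCF format.

```python
from dataclasses import dataclass


@dataclass
class Site:
    chrom: str
    pos: int
    depth: int      # total reads covering the site
    alt_reads: int  # reads supporting a non-reference allele


def classify_site(site: Site, min_depth: int = 10, max_alt_fraction: float = 0.1) -> str:
    """Return 'no-call', 'hom-ref', or 'variant' for one site.

    min_depth and max_alt_fraction are hypothetical cutoffs chosen for
    illustration; a real pipeline would use calibrated genotype likelihoods.
    """
    if site.depth < min_depth:
        # Too little evidence to say anything: absence of a variant call
        # here must not be read as a reference genotype.
        return "no-call"
    if site.alt_reads / site.depth <= max_alt_fraction:
        return "hom-ref"
    return "variant"


sites = [
    Site("chr1", 1000, depth=35, alt_reads=0),
    Site("chr1", 2000, depth=3, alt_reads=0),
    Site("chr1", 3000, depth=40, alt_reads=19),
]
calls = [classify_site(s) for s in sites]
```

    A workflow that reports only the third site leaves the first two indistinguishable, which is the gap the abstract describes.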

  3. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  4. Quantifying unobserved protein-coding variants in human populations provides a roadmap for large-scale sequencing projects

    PubMed Central

    Zou, James; Valiant, Gregory; Valiant, Paul; Karczewski, Konrad; Chan, Siu On; Samocha, Kaitlin; Lek, Monkol; Sunyaev, Shamil; Daly, Mark; MacArthur, Daniel G.

    2016-01-01

    As new proposals aim to sequence ever larger collections of humans, it is critical to have a quantitative framework to evaluate the statistical power of these projects. We developed a new algorithm, UnseenEst, and applied it to the exomes of 60,706 individuals to estimate the frequency distribution of all protein-coding variants, including rare variants that have not been observed yet in the current cohorts. Our results quantified the number of new variants that we expect to identify as sequencing cohorts reach hundreds of thousands of individuals. With 500K individuals, we find that we expect to capture 7.5% of all possible loss-of-function variants and 12% of all possible missense variants. We also estimate that 2,900 genes have loss-of-function frequency of <0.00001 in healthy humans, consistent with very strong intolerance to gene inactivation. PMID:27796292
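
    A simple binomial calculation shows why cohort size matters for rare variants. It is not the UnseenEst algorithm; it assumes that the 2n chromosomes of n diploid individuals are independent draws and that the allele frequency is known, and the function name is hypothetical.

```python
import math


def discovery_probability(allele_frequency: float, n_individuals: int) -> float:
    """P(allele seen at least once) = 1 - (1 - f)^(2n) for n diploid individuals.

    Assumes independent sampling of 2n chromosomes (an idealization).
    log1p/expm1 keep the result accurate for very small frequencies.
    """
    chromosomes = 2 * n_individuals
    return -math.expm1(chromosomes * math.log1p(-allele_frequency))


# An allele at frequency 1e-5 in the cohort size quoted in the abstract,
# and in a hypothetical cohort of 500,000 individuals.
p_current = discovery_probability(1e-5, 60_706)
p_large = discovery_probability(1e-5, 500_000)
```

    Under these assumptions the larger cohort is very likely to contain an allele at this frequency at least once, whereas the smaller cohort misses it a substantial fraction of the time.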

  5. RefCNV: Identification of Gene-Based Copy Number Variants Using Whole Exome Sequencing

    PubMed Central

    Chang, Lun-Ching; Das, Biswajit; Lih, Chih-Jian; Si, Han; Camalier, Corinne E.; McGregor, Paul M.; Polley, Eric

    2016-01-01

    With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that mapped to a genomic region correlated with the DNA copy number variants (CNVs). We propose a method RefCNV that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.96–0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman’s coefficient = 0.82) between RefCNV estimation and publicly available CNV data in Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis. PMID:27147817
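
    The comparison of observed coverage with expected normal coverage for each exon can be sketched as a z-score against a reference set. This is a simplified illustration rather than the RefCNV implementation: the z-score form, the function name, and the thresholds are assumptions, whereas the paper selects its thresholds to control the false-positive rate.

```python
from statistics import mean, stdev
from typing import Sequence


def call_exon(observed: float, reference: Sequence[float],
              gain_z: float = 3.0, loss_z: float = -3.0) -> str:
    """Classify one exon as 'gain', 'loss', or 'neutral'.

    `reference` holds normalized coverage of the same exon in a set of
    normal samples; gain_z and loss_z are hypothetical cutoffs.
    """
    mu = mean(reference)
    sigma = stdev(reference)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        raise ValueError("reference coverage has zero variance")
    z = (observed - mu) / sigma
    if z >= gain_z:
        return "gain"
    if z <= loss_z:
        return "loss"
    return "neutral"


reference_coverage = [100.0, 102.0, 98.0, 101.0, 99.0]
calls = [call_exon(x, reference_coverage) for x in (150.0, 100.0, 50.0)]
```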

  6. Genetic mapping and exome sequencing identify variants associated with five novel diseases.

    PubMed

    Puffenberger, Erik G; Jinks, Robert N; Sougnez, Carrie; Cibulskis, Kristian; Willert, Rebecca A; Achilly, Nathan P; Cassidy, Ryan P; Fiorentini, Christopher J; Heiken, Kory F; Lawrence, Johnny J; Mahoney, Molly H; Miller, Christopher J; Nair, Devika T; Politi, Kristin A; Worcester, Kimberly N; Setton, Roni A; Dipiazza, Rosa; Sherman, Eric A; Eastman, James T; Francklyn, Christopher; Robey-Bond, Susan; Rider, Nicholas L; Gabriel, Stacey; Morton, D Holmes; Strauss, Kevin A

    2012-01-01

    The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data.

  7. Analysis of Sequence Variation and Risk Association of Human Papillomavirus 52 Variants Circulating in Korea

    PubMed Central

    Choi, Youn Jin; Ki, Eun Young; Zhang, Chuqing; Ho, Wendy C. S.; Lee, Sung-Jong; Jeong, Min Jin

    2016-01-01

    Introduction Human papillomavirus (HPV) 52 is a carcinogenic, high-risk genotype frequently detected in cervical cancer cases from East Asia, including Korea. Materials and Methods Sequences of HPV52 detected in 91 cervical samples collected from women attending Seoul St. Mary’s Hospital were analyzed. HPV52 genomic sequences were obtained by polymerase chain reaction (PCR)-based sequencing and analyzed using Seq-Scape software, and phylogenetic trees were constructed using MEGA6 software. Results Of the 91 cervical samples, 40 were normal, 22 were low-grade lesions, 21 were high-grade lesions and 7 were squamous cell carcinomas. Four HPV52 variant lineages (A, B, C and D) were identified. Lineage B was the most frequently detected lineage, followed by lineage C. By analyzing the two most frequently detected lineages (B and C), we found that distinct variations existed in each lineage. We also found that a lineage B-specific mutation K93R (A379G) was associated with an increased risk of cervical neoplasia. Conclusions To our knowledge, we are the first to reveal the predominance of the HPV52 lineages B and C in Korea. We also found that these lineages harbored distinct genetic alterations that may affect oncogenicity. Our findings increase our understanding of the heterogeneity of HPV52 variants, and may be useful for the development of new diagnostic assays and therapeutic vaccines. PMID:27977741

  8. Detecting genomic clustering of risk variants from sequence data: cases versus controls.

    PubMed

    Schaid, Daniel J; Sinnwell, Jason P; McDonnell, Shannon K; Thibodeau, Stephen N

    2013-11-01

    As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method, Tango's statistic, to genomic sequence data. An advantage of Tango's method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled χ² distribution, making computation of p values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test. Although our version of Tango's statistic, which we call the "Kernel Distance" statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff's scan statistic had the greatest power over a range of clustering scenarios.

  9. Genetic Mapping and Exome Sequencing Identify Variants Associated with Five Novel Diseases

    PubMed Central

    Puffenberger, Erik G.; Jinks, Robert N.; Sougnez, Carrie; Cibulskis, Kristian; Willert, Rebecca A.; Achilly, Nathan P.; Cassidy, Ryan P.; Fiorentini, Christopher J.; Heiken, Kory F.; Lawrence, Johnny J.; Mahoney, Molly H.; Miller, Christopher J.; Nair, Devika T.; Politi, Kristin A.; Worcester, Kimberly N.; Setton, Roni A.; DiPiazza, Rosa; Sherman, Eric A.; Eastman, James T.; Francklyn, Christopher; Robey-Bond, Susan; Rider, Nicholas L.; Gabriel, Stacey; Morton, D. Holmes; Strauss, Kevin A.

    2012-01-01

    The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data. PMID:22279524

  10. Excess variants in AFF2 detected by massively parallel sequencing of males with autism spectrum disorder.

    PubMed

    Mondal, Kajari; Ramachandran, Dhanya; Patel, Viren C; Hagen, Katie R; Bose, Promita; Cutler, David J; Zwick, Michael E

    2012-10-01

    Autism spectrum disorder (ASD) is a heterogeneous disorder with substantial heritability, most of which is unexplained. ASD has a population prevalence of one percent and affects four times as many males as females. Patients with fragile X E (FRAXE) intellectual disability, which is caused by a silencing of the X-linked gene AFF2, display a number of ASD-like phenotypes. Duplications and deletions at the AFF2 locus have also been reported in cases with moderate intellectual disability and ASD. We hypothesized that other rare X-linked sequence variants at the AFF2 locus might contribute to ASD. We sequenced the AFF2 genomic region in 202 male ASD probands and found that 2.5% of males sequenced had missense mutations at highly conserved evolutionary sites. When compared with the frequency of missense mutations in 5545 X chromosomes from unaffected controls, we saw a statistically significant enrichment in patients with ASD (OR: 4.9; P < 0.014). In addition, we identified rare AFF2 3' UTR variants at conserved sites which alter gene expression in a luciferase assay. These data suggest that rare variation in AFF2 may be a previously unrecognized ASD susceptibility locus and may help explain some of the male excess of ASD.

  11. On the Behavior of a Variant of Hofstadter's Q-Sequence

    NASA Astrophysics Data System (ADS)

    Balamohan, B.; Kuznetsov, A.; Tanny, Stephen

    2007-06-01

    We completely solve the meta-Fibonacci recursion V(n) = V(n - V(n - 1)) + V(n - V(n - 4)), a variant of Hofstadter's meta-Fibonacci Q-sequence. For the initial conditions V(1) = V(2) = V(3) = V(4) = 1 we prove that the sequence V(n) is monotone, with successive terms increasing by 0 or 1, so the sequence hits every positive integer. We demonstrate certain special structural properties and fascinating periodicities of the associated frequency sequence (the number of times V(n) hits each positive integer) that make possible an iterative computation of V(n) for any value of n. Further, we derive a natural partition of the V-sequence into blocks of consecutive terms ("generations") with the property that terms in one block determine the terms in the next. We conclude by examining all the other sets of four initial conditions for which this meta-Fibonacci recursion has a solution; we prove that in each case the resulting sequence is essentially the same as the one with initial conditions all ones.
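
    The recursion and initial conditions stated in this record are enough to compute the sequence directly. The sketch below does so and checks, for a finite prefix only, the property the paper proves in general (successive terms increase by 0 or 1); the function names are chosen here for illustration.

```python
from collections import Counter
from typing import List


def v_sequence(n_terms: int) -> List[int]:
    """First n_terms of V(n) = V(n - V(n-1)) + V(n - V(n-4)), V(1..4) = 1."""
    v = [0, 1, 1, 1, 1]  # index 0 is padding so that v[n] == V(n)
    for n in range(5, n_terms + 1):
        v.append(v[n - v[n - 1]] + v[n - v[n - 4]])
    return v[1:n_terms + 1]


def steps_are_zero_or_one(seq: List[int]) -> bool:
    """True if every successive difference is 0 or 1."""
    return all(b - a in (0, 1) for a, b in zip(seq, seq[1:]))


first_terms = v_sequence(13)
# Frequency sequence: how many times each positive integer is hit.
frequency = Counter(v_sequence(200))
```

    Here `first_terms` is 1, 1, 1, 1, 2, 3, 4, 5, 5, 6, 6, 7, 8, and the value 1 is hit four times because of the four initial conditions.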

  12. NRAS germline variant G138R and multiple rare somatic mutations on APC in colorectal cancer patients in Taiwan by next generation sequencing.

    PubMed

    Chang, Pi-Yueh; Chen, Jinn-Shiun; Chang, Nai-Chung; Chang, Shih-Cheng; Wang, Mei-Chia; Tsai, Shu-Hui; Wen, Ying-Hao; Tsai, Wen-Sy; Chan, Err-Cheng; Lu, Jang-Jih

    2016-06-21

    Colorectal cancer (CRC) arises from mutations in a subset of genes. We investigated the germline and somatic mutation spectrum of patients with CRC in Taiwan by using the AmpliSeq Cancer Hotspot Panel V2. Fifty paired fresh-frozen stage 0-IV CRC tumors and adjacent normal tissues were collected. Blood DNA from 20 healthy donors was used for comparison of germline mutations. Variants were identified using an Ion Torrent Personal Genome Machine and subsequently confirmed by Sanger sequencing or pyrosequencing. Five nonsynonymous germline variants in 4 cancer susceptibility genes, CDH1, APC, MLH1, and NRAS, were observed in 6 patients with CRC (12%). Among them, the oncogene NRAS G138R variant was identified as having a predicted damaging effect on protein function and has never been reported by other laboratories. CDH1 T340A variants were present in 3 patients. The germline variants in the cancer patients differed completely from those found in asymptomatic controls. Furthermore, a total of 56 COSMIC and 21 novel somatic variants distributed in 20 genes were detected in 44 (88%) of the CRC samples. High inter- and intra-tumor heterogeneity levels were observed. Nine rare variants located in the β-catenin binding region of the APC gene were discovered, 7 of which could cause amino acid frameshift and might have a pathogenic effect. In conclusion, panel-based mutation detection by using a high-throughput sequencing platform can elucidate race-dependent cancer genomes. This approach facilitates identifying individuals at high risk and aids the recognition of novel mutations as targets for drug development.

  13. EXCAVATOR: detecting copy number variants from whole-exome sequencing data.

    PubMed

    Magi, Alberto; Tattini, Lorenzo; Cifola, Ingrid; D'Aurizio, Romina; Benelli, Matteo; Mangano, Eleonora; Battaglia, Cristina; Bonora, Elena; Kurg, Ants; Seri, Marco; Magini, Pamela; Giusti, Betti; Romeo, Giovanni; Pippucci, Tommaso; De Bellis, Gianluca; Abbate, Rosanna; Gensini, Gian Franco

    2013-01-01

    We developed a novel software tool, EXCAVATOR, for the detection of copy number variants (CNVs) from whole-exome sequencing data. EXCAVATOR combines a three-step normalization procedure with a novel heterogeneous hidden Markov model algorithm and a calling method that classifies genomic regions into five copy number states. We validate EXCAVATOR on three datasets and compare the results with three other methods. These analyses show that EXCAVATOR outperforms the other methods and is therefore a valuable tool for the investigation of CNVs in large-scale projects, as well as in clinical research and diagnostics. EXCAVATOR is freely available at http://sourceforge.net/projects/excavatortool/.

  14. Focus group discussions on secondary variants and next-generation sequencing technologies.

    PubMed

    Christenhusz, Gabrielle M; Devriendt, Koenraad; Van Esch, Hilde; Dierickx, Kris

    2015-04-01

    The clinical application of new genetic technologies will be and already is of great benefit to children with unexplained developmental disabilities or congenital anomalies. In most cases, it will be their parents who, together with medical professionals, make decisions about what should be disclosed and how the information will be used. We conducted eight exploratory focus group discussions with stakeholders to provide a broad sketch of concerns and ideas around the communication of results from next-generation sequencing technologies involving children. Stakeholders included those with (grand-) children of various ages and those without children; those involved professionally with genetics and those who were not; and a range of ages. Participants were asked to focus on which secondary variants they would and would not want disclosed about their (hypothetical) children or themselves. While the literature often concentrates on the medical and scientific characteristics of secondary variants, focus group participants were also interested in factors involving the parent-child relationship and the broader context. This resulted in more flexibility surrounding the types of secondary variants disclosed to parents than much of the literature currently supports. In addition, participants would on occasion use the same factors to argue opposing positions. The "Family Illness Paradigms model" can help explain this seeming contradiction. This model emphasises the importance of how the family reacts to personal and family experiences of disease and loss, more than the fact of having these experiences.

  15. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  16. Genetic and Functional Sequence Variants of the SIRT3 Gene Promoter in Myocardial Infarction

    PubMed Central

    Yin, Xiaoyun; Pang, Shuchao; Huang, Jian; Cui, Yinghua; Yan, Bo

    2016-01-01

    Coronary artery disease (CAD), including myocardial infarction (MI), is a common complex disease that is caused by atherosclerosis. Although a large number of genetic variants have been associated with CAD, only 10% of CAD cases could be explained. It has been proposed that low-frequency and rare genetic variants may be main causes of CAD. SIRT3, a mitochondrial deacetylase, plays important roles in mitochondrial function and metabolism. Lack of SIRT3 in experimental animals leads to several age-related diseases, including cardiovascular diseases. Therefore, SIRT3 gene variants may contribute to MI development. In this study, the SIRT3 gene promoter was genetically and functionally analyzed in large cohorts of MI patients (n = 319) and ethnic-matched controls (n = 322). A total of twenty-three DNA sequence variants (DSVs) were identified, including 10 single-nucleotide polymorphisms (SNPs). Six novel heterozygous DSVs, g.237307A>G, g.237270G>A, g.237023_25del, g.236653C>A, g.236628G>C, g.236557T>C, and two SNPs, g.237030C>T (rs12293349) and g.237022C>G (rs369344513), were identified in nine MI patients, but in none of the controls. Three SNPs, g.236473C>T (rs11246029), g.236380_81ins (rs71019893) and g.236370C>G (rs185277566), were significantly more frequent in MI patients than in controls (P<0.05). These DSVs and SNPs, except g.236557T>C, significantly decreased the transcriptional activity of the SIRT3 gene promoter in cultured HEK-293 cells and H9c2 cells. Therefore, these DSVs identified in MI patients may change SIRT3 level by affecting the transcriptional activity of the SIRT3 gene promoter, contributing to MI development as a risk factor. PMID:27078640
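
    The counts reported here (variants seen in nine of 319 MI patients and in none of 322 controls) can be used to illustrate a one-sided Fisher exact calculation. This is a worked illustration, not the authors' analysis: pooling the eight variants into a single carrier count is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_upper_tail(carrier_cases: int, cases: int, controls: int, carriers: int) -> float:
    """One-sided Fisher exact p-value from the hypergeometric distribution.

    Probability that at least `carrier_cases` of the `carriers` fall among
    the cases if carrier status were unrelated to case status.
    """
    total = cases + controls
    denominator = comb(total, cases)
    numerator = sum(
        comb(carriers, k) * comb(total - carriers, cases - k)
        for k in range(carrier_cases, min(carriers, cases) + 1)
    )
    return numerator / denominator


# Nine carriers, all among the 319 patients; none among the 322 controls.
p_value = fisher_upper_tail(9, cases=319, controls=322, carriers=9)
```

    With nearly equal group sizes, each carrier falls in the patient group with probability close to one half, so the tail probability is roughly 0.5 raised to the ninth power.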

  17. Novel Transcription Factor Variants through RNA-Sequencing: The Importance of Being “Alternative”

    PubMed Central

    Scarpato, Margherita; Federico, Antonio; Ciccodicola, Alfredo; Costa, Valerio

    2015-01-01

    Alternative splicing is a pervasive mechanism of RNA maturation in higher eukaryotes, which increases proteomic diversity and biological complexity. It has a key regulatory role in several physiological and pathological states. The spread of Next Generation Sequencing, particularly of RNA-Sequencing, has greatly accelerated the identification of novel transcripts, revealing that more than 95% of human genes undergo alternative splicing. The highest rate of alternative splicing occurs in transcription factor-encoding genes, mostly in Krüppel-associated box domains of zinc finger proteins. Since these molecules are responsible for gene expression, alternative splicing is a crucial mechanism to “regulate the regulators”. Indeed, different transcription factor isoforms may have different or even opposite functions. In this work, through a targeted re-analysis of our previously published RNA-Sequencing datasets, we identified nine novel transcripts in seven transcription factor genes. In silico analysis, combined with RT-PCR, cloning and Sanger sequencing, allowed us to experimentally validate these new variants. Through computational approaches we also predicted their novel structural and functional properties. Our findings indicate that alternative splicing is a major determinant of transcription factor diversity, confirming that accurate analysis of RNA-Sequencing data can reliably lead to the identification of novel transcripts, with potentially new functions. PMID:25590302

  18. NGS-Logistics: federated analysis of NGS sequence variants across multiple locations.

    PubMed

    Ardeshirdavani, Amin; Souche, Erika; Dehaspe, Luc; Van Houdt, Jeroen; Vermeesch, Joris Robert; Moreau, Yves

    2014-01-01

    As many personal genomes are being sequenced, collaborative analysis of those genomes has become essential. However, analysis of personal genomic data raises important privacy and confidentiality issues. We propose a methodology for federated analysis of sequence variants from personal genomes. Specific base-pair positions and/or regions are queried for samples to which the user has access but also for the whole population. The statistical results do not breach data confidentiality but allow further exploration of the data; researchers can negotiate access to relevant samples through pseudonymous identifiers. This approach minimizes the impact on data confidentiality while enabling powerful data analysis by gaining access to important rare samples. Our methodology is implemented in an open source tool called NGS-Logistics, freely available at https://ngsl.esat.kuleuven.be.

  19. Evaluating rare variants in complex disorders using next-generation sequencing.

    PubMed

    Ezewudo, Matthew; Zwick, Michael E

    2013-04-01

    Determining the genetic architecture of liability for complex neuropsychiatric disorders like autism spectrum disorders and schizophrenia poses a tremendous challenge for contemporary biomedical research. Here we discuss how genetic studies first tested, and rejected, the hypothesis that common variants with large effects account for the prevalence of these disorders. We then explore how the discovery of structural variation has contributed to our understanding of the etiology of these disorders. The rise of fast and inexpensive oligonucleotide sequencing and methods of targeted enrichment, and their influence on the search for rare genetic variation contributing to complex neuropsychiatric disorders, is the next focus of our article. Finally, we consider the technical challenges and future prospects for the use of next-generation sequencing to reveal the complex genetic architecture of complex neuropsychiatric disorders in both research and clinical settings.

  20. Targeted Re-Sequencing Approach of Candidate Genes Implicates Rare Potentially Functional Variants in Tourette Syndrome Etiology

    PubMed Central

    Alexander, John; Potamianou, Hera; Xing, Jinchuan; Deng, Li; Karagiannidis, Iordanis; Tsetsos, Fotis; Drineas, Petros; Tarnok, Zsanett; Rizzo, Renata; Wolanczyk, Tomasz; Farkas, Luca; Nagy, Peter; Szymanska, Urszula; Androutsos, Christos; Tsironi, Vaia; Koumoula, Anastasia; Barta, Csaba; Sandor, Paul; Barr, Cathy L.; Tischfield, Jay; Paschou, Peristera; Heiman, Gary A.; Georgitsi, Marianthi

    2016-01-01

    Although the genetic basis of Tourette Syndrome (TS) remains unclear, several candidate genes have been implicated. Using a set of 382 TS individuals of European ancestry we investigated four candidate genes for TS (HDC, SLITRK1, BTBD9, and SLC6A4) in an effort to identify possibly causal variants using a targeted re-sequencing approach by next generation sequencing technology. Identification of possible disease causing variants under different modes of inheritance was performed using the algorithms implemented in VAAST. We prioritized variants using Variant ranker and validated five rare variants via Sanger sequencing in HDC and SLITRK1, all of which are predicted to be deleterious. Intriguingly, one of the identified variants is in linkage disequilibrium with a variant that is included among the top hits of a genome-wide association study for response to citalopram treatment, an antidepressant drug with off-label use also in obsessive compulsive disorder. Our findings provide additional evidence for the implication of these two genes in TS susceptibility and the possible role of these proteins in the pathobiology of TS should be revisited. PMID:27708560

  1. Finding Disease Variants in Mendelian Disorders By Using Sequence Data: Methods and Applications

    PubMed Central

    Ionita-Laza, Iuliana; Makarov, Vlad; Yoon, Seungtai; Raby, Benjamin; Buxbaum, Joseph; Nicolae, Dan L.; Lin, Xihong

    2011-01-01

    Many sequencing studies are now underway to identify the genetic causes for both Mendelian and complex traits. Via exome-sequencing, genes harboring variants implicated in several Mendelian traits have already been identified. The underlying methodology in these studies is a multistep algorithm based on filtering variants identified in a small number of affected individuals and depends on whether they are novel (not yet seen in public resources such as dbSNP), shared among affected individuals, and other external functional information on the variants. Although intuitive, these filter-based methods are nonoptimal and do not provide any measure of statistical uncertainty. We describe here a formal statistical approach that has several distinct advantages: (1) it provides fast computation of approximate p values for individual genes, (2) it adjusts for the background variation in each gene, (3) it allows for incorporation of functional or linkage-based information, and (4) it accommodates designs based on both affected relative pairs and unrelated affected individuals. We show via simulations that the proposed approach can be used in conjunction with the existing filter-based methods to achieve a substantially better ranking of a gene relevant for disease when compared to currently used filter-based approaches; this is especially so in the presence of disease locus heterogeneity. We revisit recent studies on three Mendelian diseases and show that the proposed approach results in the implicated gene being ranked first in all studies, and approximate p values of 10^-6 for the Miller Syndrome gene, 1.0 × 10^-4 for the Freeman-Sheldon Syndrome gene, and 3.5 × 10^-5 for the Kabuki Syndrome gene. PMID:22137099
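
    The filter-based step described in this abstract (keep variants that are shared among affected individuals and not yet seen in a public resource) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the tuple representation of a variant, and the example data are hypothetical and are not taken from the paper, and the sketch does not implement the authors' statistical approach.

```python
def filter_candidate_variants(variants_by_individual, known_variants):
    """Return variants carried by every affected individual and absent
    from a set of known variants (e.g. an export from a public resource).

    variants_by_individual: dict mapping an individual ID to an iterable
        of hashable variant keys, here (chrom, pos, ref, alt) tuples.
    known_variants: iterable of variant keys to exclude as non-novel.
    """
    if not variants_by_individual:
        return set()
    # Variants shared among all affected individuals.
    per_individual = [set(v) for v in variants_by_individual.values()]
    shared = set.intersection(*per_individual)
    # Keep only the novel ones.
    return shared - set(known_variants)


# Hypothetical example data, for illustration only.
affected = {
    "ind1": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
    "ind2": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"),
             ("chr3", 300, "G", "A")],
}
known = {("chr2", 200, "C", "T")}
candidates = filter_candidate_variants(affected, known)
```

    As the abstract notes, a filter of this kind yields a candidate list but no measure of statistical uncertainty, which is the gap the proposed gene-level p values are meant to address.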

  2. EGFR variant heterogeneity in glioblastoma resolved through single-nucleus sequencing

    PubMed Central

    Francis, Joshua M.; Zhang, Cheng-Zhong; Maire, Cecile L.; Jung, Joonil; Manzo, Veronica E.; Adalsteinsson, Viktor A.; Homer, Heather; Haidar, Sam; Blumenstiel, Brendan; Pedamallu, Chandra Sekhar; Ligon, Azra H.; Love, J. Christopher; Meyerson, Matthew; Ligon, Keith L.

    2014-01-01

    Glioblastomas with EGFR amplification represent approximately 50% of newly diagnosed cases and recent studies have revealed frequent coexistence of multiple EGFR aberrations within the same tumor with implications for mutation cooperation and treatment resistance. However, bulk tumor sequencing studies cannot resolve the patterns of how the multiple EGFR aberrations coexist with other mutations within single tumor cells. Here we applied a population-based single-cell whole genome sequencing methodology to characterize genomic heterogeneity in EGFR amplified glioblastomas. Our analysis effectively identified clonal events, including a novel translocation of a super enhancer to the TERT promoter, as well as subclonal loss-of-heterozygosity and multiple EGFR mutational variants within tumors. Correlating the EGFR mutations onto the cellular hierarchy revealed that EGFR truncation variants (EGFRvII and EGFR Carboxyl-terminal deletions) identified in the bulk tumor segregate into non-overlapping subclonal populations. In vitro and in vivo functional studies show EGFRvII is oncogenic and sensitive to EGFR inhibitors currently in clinical trials. Thus the association between diverse activating mutations in EGFR and other subclonal mutations within a single tumor supports an intrinsic mechanism for proliferative and clonal diversification with broad implications in resistance to treatment. PMID:24893890

  3. A Simple Strategy for Reducing False Negatives in Calling Variants from Single-Cell Sequencing Data

    PubMed Central

    Ji, Cong; Miao, Zong; He, Xionglei

    2015-01-01

    Due to the growth of interest in single-cell genomics, computational methods for distinguishing true variants from artifacts are highly desirable. While special attention has been paid to false positives in variant or mutation calling from single-cell sequencing data, an equally important but often neglected issue is that of false negatives derived from allele dropout during the amplification of single cell genomes. In this paper, we propose a simple strategy to reduce the false negatives in single-cell sequencing data analysis. Simulation results show that this method is highly reliable, with an error rate of 4.94×10^-5, which is orders of magnitude lower than the expected false negative rate (~34%) estimated from a single-cell exome dataset, though the method is limited by the low SNP density in the human genome. We applied this method to analyze the exome data of a few dozen single tumor cells generated in previous studies, and extracted cell specific mutation information for a small set of sites. Interestingly, we found that there are difficulties in using the classical clonal model of tumor cell growth to explain the mutation patterns observed in some tumor cells. PMID:25876174

  4. A simple strategy for reducing false negatives in calling variants from single-cell sequencing data.

    PubMed

    Ji, Cong; Miao, Zong; He, Xionglei

    2015-01-01

    Due to the growth of interest in single-cell genomics, computational methods for distinguishing true variants from artifacts are highly desirable. While special attention has been paid to false positives in variant or mutation calling from single-cell sequencing data, an equally important but often neglected issue is that of false negatives derived from allele dropout during the amplification of single cell genomes. In this paper, we propose a simple strategy to reduce the false negatives in single-cell sequencing data analysis. Simulation results show that this method is highly reliable, with an error rate of 4.94×10^-5, which is orders of magnitude lower than the expected false negative rate (~34%) estimated from a single-cell exome dataset, though the method is limited by the low SNP density in the human genome. We applied this method to analyze the exome data of a few dozen single tumor cells generated in previous studies, and extracted cell specific mutation information for a small set of sites. Interestingly, we found that there are difficulties in using the classical clonal model of tumor cell growth to explain the mutation patterns observed in some tumor cells.

  5. Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples

    PubMed Central

    Wang, Jingwen; Skoog, Tiina; Einarsdottir, Elisabet; Kaartokallio, Tea; Laivuori, Hannele; Grauers, Anna; Gerdhem, Paul; Hytönen, Marjo; Lohi, Hannes; Kere, Juha; Jiao, Hong

    2016-01-01

    High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies. PMID:27633116
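
    One simple way to picture a MAF estimate from a pooled sample, and what a "one allele difference" from individual genotyping means, is the read-count sketch below. It assumes diploid individuals contributing equally to the pool and uses the plain ratio of alternate-allele reads to total reads; the function names and numbers are hypothetical, and the variant callers evaluated in the study use their own models rather than this calculation.

```python
def pooled_minor_allele_frequency(alt_reads, total_reads):
    """Estimate the minor allele frequency at one site from pooled reads."""
    if total_reads <= 0 or not 0 <= alt_reads <= total_reads:
        raise ValueError("read counts must satisfy 0 <= alt_reads <= total_reads, total_reads > 0")
    freq = alt_reads / total_reads
    # The minor allele is whichever allele is less frequent.
    return min(freq, 1.0 - freq)


def estimated_allele_count(alt_reads, total_reads, n_individuals):
    """Convert pooled read counts into an estimated number of alternate
    alleles among the 2 * n_individuals chromosomes in the pool."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return round(alt_reads * 2 * n_individuals / total_reads)


# Hypothetical site: 6 alternate reads out of 200 in a pool of 50 people.
maf = pooled_minor_allele_frequency(6, 200)
pooled_count = estimated_allele_count(6, 200, 50)
# Hypothetical count from individual genotyping of the same 50 people.
array_count = 4
allele_difference = abs(pooled_count - array_count)
```

    With these made-up numbers the pooled estimate is 3 alleles against 4 from genotyping, i.e. a one-allele difference of the kind the abstract counts separately from identical MAFs.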

  6. Los Alamos sequence analysis package for nucleic acids and proteins.

    PubMed Central

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored in nucleic acid sequences. PMID:6174934

  7. Variants of beta-glucosidases

    SciTech Connect

    Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian

    2014-10-07

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  8. Variants of beta-glucosidase

    SciTech Connect

    Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian

    2015-07-14

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  9. Variants of beta-glucosidase

    DOEpatents

    Fidantsef, Ana; Lamsa, Michael; Gorre-Clancy, Brian

    2009-12-29

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  10. Variants of beta-glucosidases

    DOEpatents

    Fidantsef, Ana; Lamsa, Michael; Clancy, Brian Gorre

    2008-08-19

    The present invention relates to variants of a parent beta-glucosidase, comprising a substitution at one or more positions corresponding to positions 142, 183, 266, and 703 of amino acids 1 to 842 of SEQ ID NO: 2 or corresponding to positions 142, 183, 266, and 705 of amino acids 1 to 844 of SEQ ID NO: 70, wherein the variant has beta-glucosidase activity. The present invention also relates to nucleotide sequences encoding the variant beta-glucosidases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  11. Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

    PubMed Central

    Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei

    2013-01-01

    Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. PMID:24116042

  12. Glanzmann thrombasthenia. Cooperation between sequence variants in cis during splice site selection.

    PubMed Central

    Jin, Y; Dietz, H C; Montgomery, R A; Bell, W R; McIntosh, I; Coller, B; Bray, P F

    1996-01-01

    Glanzmann thrombasthenia (GT), an autosomal recessive bleeding disorder, results from abnormalities in the platelet fibrinogen receptor, GP(IIb)-IIIa (integrin alpha(IIb)beta3). A patient with GT was identified as homozygous for a G-->A mutation 6 bp upstream of the GP(IIIa) exon 9 splice donor site. Patient platelet GP(IIIa) transcripts lacked exon 9 despite normal DNA sequence in all of the cis-acting sequences known to regulate splice site selection. In vitro analysis of transcripts generated from mini-gene constructs demonstrated that exon skipping occurred only when the G-->A mutation was cis to a polymorphism 116 bp upstream, providing precedence that two sequence variations in the same exon which do not alter consensus splice sites and do not generate missense or nonsense mutations, can affect splice site selection. The mutant transcript resulted from utilization of a cryptic splice acceptor site and returned the open reading frame. These data support the hypothesis that pre-mRNA secondary structure and allelic sequence variants can influence splicing and provide new insight into the regulated control of RNA processing. In addition, haplotype analysis suggested that the patient has two identical copies of chromosome 17. Markers studied on three other chromosomes suggested this finding was not due to consanguinity. The restricted phenotype in this patient may provide information regarding the expression of potentially imprinted genes on chromosome 17. PMID:8878424

  13. Making the most of RNA-seq: Pre-processing sequencing data with Opossum for reliable SNP variant detection

    PubMed Central

    2017-01-01

    Identifying variants from RNA-seq (transcriptome sequencing) data is a cost-effective and versatile alternative to whole-genome sequencing. However, current variant callers do not generally behave well with RNA-seq data due to reads encompassing intronic regions. We have developed a software programme called Opossum to address this problem. Opossum pre-processes RNA-seq reads prior to variant calling, and although it has been designed to work specifically with Platypus, it can be used equally well with other variant callers such as GATK HaplotypeCaller. In this work, we show that using Opossum in conjunction with either Platypus or GATK HaplotypeCaller maintains precision and improves the sensitivity for SNP detection compared to the GATK Best Practices pipeline. In addition, using it in combination with Platypus offers a substantial reduction in run times compared to the GATK pipeline so it is ideal when there are only limited time or computational resources available. PMID:28239666

  14. Identification of a Latin American-specific BabA adhesin variant through whole genome sequencing of Helicobacter pylori patient isolates from Nicaragua

    DOE PAGES

    Thorell, Kaisa; Hosseini, Shaghayegh; Palacios Gonzales, Reyna Victoria Palacios; ...

    2016-02-29

    Helicobacter pylori (H. pylori) is one of the most common bacterial infections in humans and this infection can lead to gastric ulcers and gastric cancer. H. pylori is one of the most genetically variable human pathogens and the ability of the bacterium to bind to the host epithelium as well as the presence of different virulence factors and genetic variants within these genes have been associated with disease severity. Nicaragua has particularly high gastric cancer incidence and we therefore studied Nicaraguan clinical H. pylori isolates for factors that could contribute to cancer risk. The complete genomes of fifty-two Nicaraguan H. pylori isolates were sequenced and assembled de novo, and phylogenetic and virulence factor analyses were performed. The Nicaraguan isolates showed phylogenetic relationship with West African isolates in whole-genome sequence comparisons and with Western and urban South and Central American isolates using MLSA (Multi-locus sequence analysis). A majority (77%) of the isolates carried the cancer-associated virulence gene cagA and also the s1/i1/m1 vacuolating cytotoxin, vacA allele combination, which is linked to increased severity of disease. Specifically, we also found that Nicaraguan isolates have a blood group-binding adhesin (BabA) variant highly similar to previously reported BabA sequences from Latin America, including from isolates belonging to other phylogenetic groups. These BabA sequences were found to be under positive selection at several amino acid positions that differed from the global collection of isolates. In conclusion, the discovery of a Latin American BabA variant, independent of overall phylogenetic background, suggests hitherto unknown host or environmental factors within the Latin American population giving H. pylori isolates carrying this adhesin variant a selective advantage, which could affect pathogenesis and risk for sequelae through specific adherence properties.

  15. Identification of a Latin American-specific BabA adhesin variant through whole genome sequencing of Helicobacter pylori patient isolates from Nicaragua

    SciTech Connect

    Thorell, Kaisa; Hosseini, Shaghayegh; Palacios Gonzales, Reyna Victoria Palacios; Chaotham, Chatchai; Graham, David Y.; Paszat, Lawrence; Rabeneck, Linda; Lundin, Samuel B.; Nookaew, Intawat; Sjoling, Asa

    2016-02-29

    Helicobacter pylori (H. pylori) is one of the most common bacterial infections in humans and this infection can lead to gastric ulcers and gastric cancer. H. pylori is one of the most genetically variable human pathogens and the ability of the bacterium to bind to the host epithelium as well as the presence of different virulence factors and genetic variants within these genes have been associated with disease severity. Nicaragua has particularly high gastric cancer incidence and we therefore studied Nicaraguan clinical H. pylori isolates for factors that could contribute to cancer risk. The complete genomes of fifty-two Nicaraguan H. pylori isolates were sequenced and assembled de novo, and phylogenetic and virulence factor analyses were performed. The Nicaraguan isolates showed phylogenetic relationship with West African isolates in whole-genome sequence comparisons and with Western and urban South and Central American isolates using MLSA (Multi-locus sequence analysis). A majority (77%) of the isolates carried the cancer-associated virulence gene cagA and also the s1/i1/m1 vacuolating cytotoxin, vacA allele combination, which is linked to increased severity of disease. Specifically, we also found that Nicaraguan isolates have a blood group-binding adhesin (BabA) variant highly similar to previously reported BabA sequences from Latin America, including from isolates belonging to other phylogenetic groups. These BabA sequences were found to be under positive selection at several amino acid positions that differed from the global collection of isolates. In conclusion, the discovery of a Latin American BabA variant, independent of overall phylogenetic background, suggests hitherto unknown host or environmental factors within the Latin American population giving H. pylori isolates carrying this adhesin variant a selective advantage, which could affect pathogenesis and risk for sequelae through specific adherence properties.

  16. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  17. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  18. The Clinical Next‐Generation Sequencing Database: A Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification

    PubMed Central

    Nishio, Shin‐ya

    2017-01-01

    Recent advances in next‐generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease‐specific databases. Here, we report a new database development tool, named the “Clinical NGS Database,” for improving clinical NGS workflow through the unified management of variant information and clinical information. This database software offers a two‐feature approach to variant pathogenicity classification. The first of these approaches is a phenotype similarity‐based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other approach is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between the case and the control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database. PMID:28008688

  19. The Clinical Next-Generation Sequencing Database: A Tool for the Unified Management of Clinical Information and Genetic Variants to Accelerate Variant Pathogenicity Classification.

    PubMed

    Nishio, Shin-Ya; Usami, Shin-Ichi

    2017-03-01

    Recent advances in next-generation sequencing (NGS) have given rise to new challenges due to the difficulties in variant pathogenicity interpretation and large dataset management, including many kinds of public population databases as well as public or commercial disease-specific databases. Here, we report a new database development tool, named the "Clinical NGS Database," for improving clinical NGS workflow through the unified management of variant information and clinical information. This database software offers a two-feature approach to variant pathogenicity classification. The first of these approaches is a phenotype similarity-based approach. This database allows the easy comparison of the detailed phenotype of each patient with the average phenotype of the same gene mutation at the variant or gene level. It is also possible to browse patients with the same gene mutation quickly. The other approach is a statistical approach to variant pathogenicity classification based on the use of the odds ratio for comparisons between the case and the control for each inheritance mode (families with apparently autosomal dominant inheritance vs. control, and families with apparently autosomal recessive inheritance vs. control). A number of case studies are also presented to illustrate the utility of this database.
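
    The odds ratio mentioned in this abstract compares how often a variant is carried in cases versus controls. A minimal sketch of that calculation from a 2x2 table is given below. The function name and the counts are hypothetical, and the 0.5 continuity correction for empty cells (Haldane-Anscombe) is an assumption added here for robustness, not something the abstract states the database uses.

```python
def odds_ratio(case_carriers, case_noncarriers,
               control_carriers, control_noncarriers, correction=0.5):
    """Odds ratio for carrying a variant in cases versus controls.

    If any cell of the 2x2 table is zero, `correction` is added to every
    cell (Haldane-Anscombe correction) so the ratio stays finite.
    """
    cells = [case_carriers, case_noncarriers,
             control_carriers, control_noncarriers]
    if any(c < 0 for c in cells):
        raise ValueError("counts must be non-negative")
    if any(c == 0 for c in cells):
        cells = [c + correction for c in cells]
    a, b, c, d = cells
    # (odds of carrying in cases) / (odds of carrying in controls)
    return (a * d) / (b * c)


# Hypothetical counts: 10 of 100 case families and 5 of 100 controls carry the variant.
or_example = odds_ratio(10, 90, 5, 95)
# Hypothetical counts with an empty cell: 3 of 10 cases, 0 of 10 controls.
or_zero_cell = odds_ratio(3, 7, 0, 10)
```

    In the setting the abstract describes, such a ratio would be computed separately for each inheritance mode (apparently autosomal dominant families vs. control, and apparently autosomal recessive families vs. control).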

  20. Detection of Clinically Relevant Genetic Variants in Autism Spectrum Disorder by Whole-Genome Sequencing

    PubMed Central

    Jiang, Yong-hui; Yuen, Ryan K.C.; Jin, Xin; Wang, Mingbang; Chen, Nong; Wu, Xueli; Ju, Jia; Mei, Junpu; Shi, Yujian; He, Mingze; Wang, Guangbiao; Liang, Jieqin; Wang, Zhe; Cao, Dandan; Carter, Melissa T.; Chrysler, Christina; Drmic, Irene E.; Howe, Jennifer L.; Lau, Lynette; Marshall, Christian R.; Merico, Daniele; Nalpathamkalam, Thomas; Thiruvahindrapuram, Bhooma; Thompson, Ann; Uddin, Mohammed; Walker, Susan; Luo, Jun; Anagnostou, Evdokia; Zwaigenbaum, Lonnie; Ring, Robert H.; Wang, Jian; Lajonchere, Clara; Wang, Jun; Shih, Andy; Szatmari, Peter; Yang, Huanming; Dawson, Geraldine; Li, Yingrui; Scherer, Stephen W.

    2013-01-01

    Autism Spectrum Disorder (ASD) demonstrates high heritability and familial clustering, yet the genetic causes remain only partially understood as a result of extensive clinical and genomic heterogeneity. Whole-genome sequencing (WGS) shows promise as a tool for identifying ASD risk genes as well as unreported mutations in known loci, but an assessment of its full utility in an ASD group has not been performed. We used WGS to examine 32 families with ASD to detect de novo or rare inherited genetic variants predicted to be deleterious (loss-of-function and damaging missense mutations). Among ASD probands, we identified deleterious de novo mutations in six of 32 (19%) families and X-linked or autosomal inherited alterations in ten of 32 (31%) families (some had combinations of mutations). The proportion of families identified with such putative mutations was larger than has been previously reported; this yield was in part due to the comprehensive and uniform coverage afforded by WGS. Deleterious variants were found in four unrecognized, nine known, and eight candidate ASD risk genes. Examples include CAPRIN1 and AFF2 (both linked to FMR1, which is involved in fragile X syndrome), VIP (involved in social-cognitive deficits), and other genes such as SCN2A and KCNQ2 (linked to epilepsy), NRXN1, and CHD7, which causes ASD-associated CHARGE syndrome. Taken together, these results suggest that WGS and thorough bioinformatic analyses for de novo and rare inherited mutations will improve the detection of genetic variants likely to be associated with ASD or its accompanying clinical symptoms. PMID:23849776

  1. Trainable high resolution melt curve machine learning classifier for large-scale reliable genotyping of sequence variants.

    PubMed

    Athamanolap, Pornpat; Parekh, Vishwa; Fraley, Stephanie I; Agarwal, Vatsal; Shin, Dong J; Jacobs, Michael A; Wang, Tza-Huei; Yang, Samuel

    2014-01-01

    High resolution melt (HRM) is gaining considerable popularity as a simple and robust method for genotyping sequence variants. However, accurate genotyping of an unknown sample for which a large number of possible variants may exist will require an automated HRM curve identification method capable of comparing unknowns against a large cohort of known sequence variants. Herein, we describe a new method for automated HRM curve classification based on machine learning methods and learned tolerance for reaction condition deviations. We tested this method in silico through multiple cross-validations using curves generated from 9 different simulated experimental conditions to classify 92 known serotypes of Streptococcus pneumoniae and demonstrated over 99% accuracy with 8 training curves per serotype. In vitro verification of the algorithm was tested using sequence variants of a cancer-related gene and demonstrated 100% accuracy with 3 training curves per sequence variant. The machine learning algorithm enabled reliable, scalable, and automated HRM genotyping analysis with broad potential clinical and epidemiological applications.

  2. Direct mutation analysis by high-throughput sequencing: from germline to low-abundant, somatic variants

    PubMed Central

    Gundry, Michael; Vijg, Jan

    2011-01-01

    DNA mutations are the source of genetic variation within populations. The majority of mutations with observable effects are deleterious. In humans mutations in the germ line can cause genetic disease. In somatic cells multiple rounds of mutations and selection lead to cancer. The study of genetic variation has progressed rapidly since the completion of the draft sequence of the human genome. Recent advances in sequencing technology, most importantly the introduction of massively parallel sequencing (MPS), have resulted in more than a hundred-fold reduction in the time and cost required for sequencing nucleic acids. These improvements have greatly expanded the use of sequencing as a practical tool for mutation analysis. While in the past the high cost of sequencing limited mutation analysis to selectable markers or small forward mutation targets assumed to be representative for the genome overall, current platforms allow whole genome sequencing for less than $5,000. This has already given rise to direct estimates of germline mutation rates in multiple organisms including humans by comparing whole genome sequences between parents and offspring. Here we present a brief history of the field of mutation research, with a focus on classical tools for the measurement of mutation rates. We then review MPS, how it is currently applied and the new insight into human and animal mutation frequencies and spectra that has been obtained from whole genome sequencing. While great progress has been made, we note that the single most important limitation of current MPS approaches for mutation analysis is the inability to address low-abundance mutations that turn somatic tissues into mosaics of cells. Such mutations are at the basis of intra-tumor heterogeneity, with important implications for clinical diagnosis, and could also contribute to somatic diseases other than cancer, including aging. Some possible approaches to gain access to low-abundance mutations are discussed, with a

  3. Direct mutation analysis by high-throughput sequencing: from germline to low-abundant, somatic variants.

    PubMed

    Gundry, Michael; Vijg, Jan

    2012-01-03

    DNA mutations are the source of genetic variation within populations. The majority of mutations with observable effects are deleterious. In humans mutations in the germ line can cause genetic disease. In somatic cells multiple rounds of mutations and selection lead to cancer. The study of genetic variation has progressed rapidly since the completion of the draft sequence of the human genome. Recent advances in sequencing technology, most importantly the introduction of massively parallel sequencing (MPS), have resulted in more than a hundred-fold reduction in the time and cost required for sequencing nucleic acids. These improvements have greatly expanded the use of sequencing as a practical tool for mutation analysis. While in the past the high cost of sequencing limited mutation analysis to selectable markers or small forward mutation targets assumed to be representative for the genome overall, current platforms allow whole genome sequencing for less than $5000. This has already given rise to direct estimates of germline mutation rates in multiple organisms including humans by comparing whole genome sequences between parents and offspring. Here we present a brief history of the field of mutation research, with a focus on classical tools for the measurement of mutation rates. We then review MPS, how it is currently applied and the new insight into human and animal mutation frequencies and spectra that has been obtained from whole genome sequencing. While great progress has been made, we note that the single most important limitation of current MPS approaches for mutation analysis is the inability to address low-abundance mutations that turn somatic tissues into mosaics of cells. Such mutations are at the basis of intra-tumor heterogeneity, with important implications for clinical diagnosis, and could also contribute to somatic diseases other than cancer, including aging. Some possible approaches to gain access to low-abundance mutations are discussed, with a brief

  4. Dual-color detection of DNA sequence variants by ligase-mediated analysis

    SciTech Connect

    Samiotaki, M.; Kwiatkowski, M.; Parik, J.; Landegren, U.

    1994-03-15

    Genetic screening for sequence variants associated with disease is assuming increasing importance in clinical medicine as well as in research. The authors describe an efficient method for such analyses, comprising a combination of practical features: (1) Amplified DNA samples are analyzed for their ability to serve as templates in standardized allele-specific ligation reactions between oligonucleotide probes; (2) Two allele-specific probes, differentially labeled with either of two lanthanide labels, compete for ligation to a third oligonucleotide (the signal from the two labeled probes can thus be directly compared in a sensitive time-resolved fluorescence detection reaction); and (3) Large sets of analyses are processed in parallel using a 96-pin capture manifold, serving to reduce pipetting steps and the risk of contamination. The authors present here the basis of the technique and its application to the screening for two common mutations causing cystic fibrosis and α₁-antitrypsin deficiency. 19 refs., 4 figs.

  5. Identification of Genome-Wide Variants and Discovery of Variants Associated with Brassica rapa Clubroot Resistance Gene Rcr1 through Bulked Segregant RNA Sequencing.

    PubMed

    Yu, Fengqun; Zhang, Xingguo; Huang, Zhen; Chu, Mingguang; Song, Tao; Falk, Kevin C; Deora, Abhinandan; Chen, Qilin; Zhang, Yan; McGregor, Linda; Gossen, Bruce D; McDonald, Mary Ruth; Peng, Gary

    2016-01-01

    Clubroot, caused by Plasmodiophora brassicae, is an important disease of Brassica species worldwide. A clubroot resistance gene, Rcr1, with efficacy against pathotype 3 of P. brassicae, was previously mapped to chromosome A03 of B. rapa in pak choy cultivar "Flower Nabana". In the current study, resistance to pathotypes 2, 5 and 6 was shown to be associated with the Rcr1 region on chromosome A03. Bulked segregant RNA sequencing was performed and short read sequences were assembled into 10 chromosomes of the B. rapa reference genome v1.5. For the resistant (R) bulks, a total of 351.8 million (M) sequences, 30,836.5 million bases (Mb) in length, produced 120-fold coverage of the reference genome. For the susceptible (S) bulks, 322.9 M sequences, 28,216.6 Mb in length, produced 109-fold coverage. In total, 776.2 K single nucleotide polymorphisms (SNPs) and 122.2 K insertions/deletions (InDels) in R bulks and 762.8 K SNPs and 118.7 K InDels in S bulks were identified; each chromosome had about 87% SNPs and 13% InDels, with 78% monomorphic and 22% polymorphic variants between the R and S bulks. Polymorphic variants on each chromosome were usually below 23%, but made up 34% of the variants on chromosome A03. There were 35 genes annotated in the Rcr1 target region and variants were identified in 21 genes. The numbers of polymorphic variants differed significantly among the genes. Four of them encode Toll-Interleukin-1 receptor / nucleotide-binding site / leucine-rich-repeat proteins; Bra019409 and Bra019410 harbored the higher numbers of polymorphic variants, which indicates that they are more likely candidates for Rcr1. Fourteen SNP markers in the target region were genotyped using the Kompetitive Allele Specific PCR method and were confirmed to associate with Rcr1. Selected SNP markers were analyzed with 26 recombinants obtained from a segregating population consisting of 1587 plants, indicating that they were completely linked to Rcr1. Nine SNP markers were used for marker

  6. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes.

    PubMed

    Kalbfleisch, Ted; Heaton, Michael P

    2013-01-01

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these species have provided unique insights into mammalian gene function. However, the number of species with reference genomes is small compared to those needed for studying molecular evolutionary relationships in the tree of life. For example, among the even-toed ungulates there are approximately 300 species whose phylogenetic relationships have been calculated in the 10k trees project. Only six of these have reference genomes: cattle, swine, sheep, goat, water buffalo, and bison. Although reference sequences will eventually be developed for additional hoof stock, the resources in terms of time, money, infrastructure and expertise required to develop a quality reference genome may be unattainable for most species for at least another decade. In this work we mapped 35 Gb of next generation sequence data of a Katahdin sheep to its own species' reference genome (Ovis aries Oar3.1) and to that of a species that diverged 15 to 30 million years ago (Bos taurus UMD3.1). In total, 56% of reads covered 76% of UMD3.1 to an average depth of 6.8 reads per site, 83 million variants were identified, of which 78 million were homozygous and likely represent interspecies nucleotide differences. Excluding repeat regions and sex chromosomes, nearly 3.7 million heterozygous sites were identified in this animal vs. bovine UMD3.1, representing polymorphisms occurring in sheep. Of these, 41% could be readily mapped to orthologous positions in ovine Oar3.1 with 80% corroborated as heterozygous. These variant sites, identified via interspecies mapping could be used for comparative genomics, disease association studies, and ultimately to understand mammalian gene

  7. iSVP: an integrated structural variant calling pipeline from high-throughput sequencing data

    PubMed Central

    2013-01-01

    Background Structural variations (SVs), such as insertions, deletions, inversions, and duplications, are a common feature in human genomes, and a number of studies have reported that such SVs are associated with human diseases. Although the progress of next generation sequencing (NGS) technologies has led to the discovery of a large number of SVs, accurate and genome-wide detection of SVs remains challenging. Thus far, various calling algorithms based on NGS data have been proposed. However, their strategies are diverse and there is no tool able to detect a full range of SVs accurately. Results We focused on evaluating the performance of existing deletion calling algorithms for various spanning ranges from low- to high-coverage simulation data. The simulation data was generated from a whole genome sequence with artificial SVs constructed based on the distribution of variants obtained from the 1000 Genomes Project. From the simulation analysis, deletion calls of various deletion sizes were obtained with each caller, and it was found that the performance differed considerably depending on the type of algorithm and the targeted deletion size. Based on these results, we propose an integrated structural variant calling pipeline (iSVP) that combines existing methods with newly devised filtering and merging processes. It achieved highly accurate deletion calling with >90% precision and >90% recall on the 30× read data for a broad range of sizes. We applied iSVP to the whole-genome sequence data of a CEU HapMap sample, and detected a large number of deletions, including notable peaks around 300 bp and 6,000 bp, which corresponded to Alus and long interspersed nuclear elements, respectively. In addition, many of the predicted deletions were highly consistent with experimentally validated ones by other studies. Conclusions We present iSVP, a new deletion calling pipeline to obtain a genome-wide landscape of deletions in a highly accurate manner. From simulation and real data

  8. The contribution of lactic acid to acidification of tumours: studies of variant cells lacking lactate dehydrogenase.

    PubMed Central

    Yamagata, M.; Hasuda, K.; Stamato, T.; Tannock, I. F.

    1998-01-01

    Solid tumours develop an acidic extracellular environment with high concentration of lactic acid, and lactic acid produced by glycolysis has been assumed to be the major cause of tumour acidity. Experiments using lactate dehydrogenase (LDH)-deficient ras-transfected Chinese hamster ovarian cells have been undertaken to address directly the hypothesis that lactic acid production is responsible for tumour acidification. The variant cells produce negligible quantities of lactic acid and consume minimal amounts of glucose compared with parental cells. Lactate-producing parental cells acidified lightly-buffered medium but variant cells did not. Tumours derived from parental and variant cells implanted into nude mice were found to have mean values of extracellular pH (pHe) of 7.03 ± 0.03 and 7.03 ± 0.05, respectively, both of which were significantly lower than that of normal muscle (pHe = 7.43 ± 0.03; P < 0.001). Lactic acid concentration in variant tumours (450 ± 90 µg g⁻¹ wet weight) was much lower than that in parental tumours (1880 ± 140 µg g⁻¹) and similar to that in serum (400 ± 35 µg g⁻¹). These data show discordance between mean levels of pHe and lactate content in tumours; the results support those of Newell et al (1993) and suggest that the production of lactic acid via glycolysis causes acidification of culture medium, but is not the only mechanism, and is probably not the major mechanism responsible for the development of an acidic environment within solid tumours. PMID:9667639

  9. Associations between variants of FADS genes and omega-3 and omega-6 milk fatty acids of Canadian Holstein cows

    PubMed Central

    2014-01-01

    Background Fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes code respectively for the enzymes delta-5 and delta-6 desaturases which are rate limiting enzymes in the synthesis of polyunsaturated omega-3 and omega-6 fatty acids (FAs). Omega-3 and -6 FAs as well as conjugated linoleic acid (CLA) are present in bovine milk and have demonstrated positive health effects in humans. Studies in humans have shown significant relationships between genetic variants in FADS1 and 2 genes with plasma and tissue concentrations of omega-3 and -6 FAs. The aim of this study was to evaluate the extent of sequence variations within these two genes in Canadian Holstein cows as well as the association between sequence variants and health promoting FAs in milk. Results Thirty-three SNPs were detected within the studied regions of genes including a synonymous mutation (FADS1-07, rs42187261, 306Tyr > Tyr) in exon 8 of FADS1, a non-synonymous mutation (FADS2-14, rs211580559, 294Ala > Val) within FADS2 exon 7, a splice site SNP (FADS2-05, rs211263660), a 3′UTR SNP (FADS2-23, rs109772589), and another 3′UTR SNP with an effect on a microRNA binding site within the FADS2 gene (FADS2-19, rs210169303). Association analyses showed significant relations between three out of seven tested SNPs and several FAs. Significant associations (FDR P < 0.05) were recorded between FADS2-23 (rs109772589) and two omega-6 FAs (dihomo-gamma-linolenic acid [C20:3n6] and arachidonic acid [C20:4n6]), FADS1-07 (rs42187261) and one omega-3 FA (eicosapentaenoic acid, C20:5n3) and tricosanoic acid (C23:0), and one intronic SNP, FADS1-01 (rs136261927) and C20:3n6. Conclusion Our study has demonstrated positive associations between three SNPs within FADS1 and FADS2 genes (a SNP within the 3′UTR, a synonymous SNP and an intronic SNP), with three milk PUFAs of Canadian Holstein cows thus suggesting possible involvement of synonymous and non-coding region variants in FA synthesis. These SNPs may serve as

  10. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data.

    PubMed

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-10-13

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while relying on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants is prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest.

  11. Recurrent triploidy due to a failure to complete maternal meiosis II: whole-exome sequencing reveals candidate variants

    PubMed Central

    Filges, I.; Manokhina, I.; Peñaherrera, M.S.; McFadden, D.E.; Louie, K.; Nosova, E.; Friedman, J.M.; Robinson, W.P.

    2015-01-01

    Triploidy is a relatively common cause of miscarriage; however, recurrent triploidy has rarely been reported. A healthy 34-year-old woman was ascertained because of 18 consecutive miscarriages with triploidy found in all 5 karyotyped losses. Molecular results in a sixth loss were also consistent with triploidy. Genotyping of markers near the centromere on multiple chromosomes suggested that all six triploid conceptuses occurred as a result of failure to complete meiosis II (MII). The proband's mother had also experienced recurrent miscarriage, with a total of 18 miscarriages. Based on the hypothesis that an inherited autosomal-dominant maternal predisposition would explain the phenotype, whole-exome sequencing of the proband and her parents was undertaken to identify potential candidate variants. After filtering for quality and rarity, potentially damaging variants shared between the proband and her mother were identified in 47 genes. Variants in genes coding for proteins implicated in oocyte maturation, oocyte activation or polar body extrusion were then prioritized. Eight of the most promising candidate variants were confirmed by Sanger sequencing. These included a novel change in the PLCD4 gene, and a rare variant in the OSBPL5 gene, which have been implicated in oocyte activation upon fertilization and completion of MII. Several variants in genes coding for proteins that play a role in oocyte maturation and early embryonic development were also identified. The genes identified may be candidates for study in other women experiencing recurrent triploidy or recurrent IVF failure. PMID:25504873

  12. Rare variant testing across methods and thresholds using the multi-kernel sequence kernel association test (MK-SKAT).

    PubMed

    Urrutia, Eugene; Lee, Seunggeun; Maity, Arnab; Zhao, Ni; Shen, Judong; Li, Yun; Wu, Michael C

    Analysis of rare genetic variants has focused on region-based analysis wherein a subset of the variants within a genomic region is tested for association with a complex trait. Two important practical challenges have emerged. First, it is difficult to choose which test to use. Second, it is unclear which group of variants within a region should be tested. Both depend on the unknown true state of nature. Therefore, we develop the Multi-Kernel SKAT (MK-SKAT) which tests across a range of rare variant tests and groupings. Specifically, we demonstrate that several popular rare variant tests are special cases of the sequence kernel association test which compares pair-wise similarity in trait value to similarity in the rare variant genotypes between subjects as measured through a kernel function. Choosing a particular test is equivalent to choosing a kernel. Similarly, choosing which group of variants to test also reduces to choosing a kernel. Thus, MK-SKAT uses perturbation to test across a range of kernels. Simulations and real data analyses show that our framework controls type I error while maintaining high power across settings: MK-SKAT loses power when compared to the kernel for a particular scenario but has much greater power than poor choices.
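
    The claim that choosing a rare variant test is equivalent to choosing a kernel can be illustrated with the kernel-machine score statistic Q = r'Kr, where r holds trait residuals and K is a similarity matrix built from the genotypes. The sketch below is not the MK-SKAT implementation: it assumes an intercept-only null model (residuals are deviations from the sample mean), uses toy data, omits the perturbation step and p-value calculation, and all function names are hypothetical.

```python
def score_per_variant(genotypes, trait):
    """Score S_j = sum_i (y_i - mean(y)) * G_ij for each variant j.

    genotypes: one row per subject, one column per variant (minor-allele counts).
    """
    mean = sum(trait) / len(trait)
    residuals = [y - mean for y in trait]
    n_variants = len(genotypes[0])
    return [
        sum(r * row[j] for r, row in zip(residuals, genotypes))
        for j in range(n_variants)
    ]


def skat_statistic(genotypes, trait, weights):
    """Weighted linear kernel K = G W W G': Q is the sum of squared weighted scores."""
    scores = score_per_variant(genotypes, trait)
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


def burden_statistic(genotypes, trait, weights):
    """Collapsing kernel K = G w w' G': Q is the square of the summed weighted scores."""
    scores = score_per_variant(genotypes, trait)
    return sum(w * s for w, s in zip(weights, scores)) ** 2


# Toy data (hypothetical): variant 1 raises the trait, variant 2 lowers it.
genotypes = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
]
trait = [2.0, 2.0, -2.0, -2.0, 0.0, 0.0]
weights = [1.0, 1.0, 1.0]

q_skat = skat_statistic(genotypes, trait, weights)
q_burden = burden_statistic(genotypes, trait, weights)
print("SKAT-type Q:", q_skat, "burden-type Q:", q_burden)
```

    On this toy data the opposite-direction effects cancel under the collapsing kernel but not under the weighted linear kernel, which shows why the better choice depends on the unknown true state of nature. Changing the weights, or zeroing them for some variants, likewise changes which group of variants is tested.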

  13. Enhanced copy number variants detection from whole-exome sequencing data using EXCAVATOR2

    PubMed Central

    D'Aurizio, Romina; Pippucci, Tommaso; Tattini, Lorenzo; Giusti, Betti; Pellegrini, Marco; Magi, Alberto

    2016-01-01

    Copy Number Variants (CNVs) are structural rearrangements contributing to phenotypic variation that have been shown to be associated with many disease states. In recent years, the identification of CNVs from whole-exome sequencing (WES) data has become common practice for research and clinical purposes and, consequently, the demand for more efficient and accurate methods has increased. In this paper, we demonstrate that more than 30% of WES data map outside the targeted regions and that these reads, usually discarded, can be exploited to enhance the identification of CNVs from WES experiments. Here, we present EXCAVATOR2, the first read count based tool that exploits all the reads produced by WES experiments to detect CNVs with a genome-wide resolution. To evaluate the performance of our novel tool we use it to analyse two WES data sets, a population data set sequenced by the 1000 Genomes Project and a tumor data set made of bladder cancer samples. The results obtained from these analyses demonstrate that EXCAVATOR2 outperforms four other state-of-the-art methods and that our combined approach enlarges the spectrum of detectable CNVs from WES data with an unprecedented resolution. EXCAVATOR2 is freely available at http://sourceforge.net/projects/excavator2tool/. PMID:27507884

  14. Specificity in transmembrane helix–helix interactions can define a hierarchy of stability for sequence variants

    PubMed Central

    Fleming, Karen G.; Engelman, Donald M.

    2001-01-01

    The folding, stability, and oligomerization of helical membrane proteins depend in part on a precise set of packing interactions between transmembrane helices. To understand the energetic principles of these helix–helix interactions, we have used alanine-scanning mutagenesis and sedimentation equilibrium analytical ultracentrifugation to quantitatively examine the sequence dependence of the glycophorin A transmembrane helix dimerization. In all cases, we found that mutations to alanine at interface positions cost free energy of association. In contrast, mutations to alanine away from the dimer interface showed free energies of association that are insignificantly different from wild-type or are slightly stabilizing. Our study further revealed that the energy of association is not evenly distributed across the interface, but that there are several “hot spots” for interaction including both glycines participating in a GxxxG motif. Inspection of the NMR structure indicates that simple principles of protein–protein interactions can explain the changes in energy that are observed. A comparison of the dimer stability between different hydrophobic environments suggested that the hierarchy of stability for sequence variants is conserved. Together, these findings imply that the protein–protein interaction portion of the overall association energy may be separable from the contributions arising from protein–lipid and lipid–lipid energy terms. This idea is a conceptual simplification of the membrane protein folding problem and has implications for prediction and design. PMID:11724930

  15. Assessing pathogenicity for novel mutation/sequence variants: the value of healthy older individuals.

    PubMed

    Zatz, Mayana; Pavanello, Rita de Cassia M; Lourenço, Naila Cristina V; Cerqueira, Antonia; Lazar, Monize; Vainzof, Mariz

    2012-12-01

    Improvement in DNA technology is increasingly revealing unexpected/unknown mutations in healthy persons and generating anxiety due to their still unknown health consequences. We report a 44-year-old healthy father of a 10-year-old daughter with bilateral coloboma and hearing loss, but without muscle weakness, in whom a whole-genome CGH revealed a deletion of exons 38-44 in the dystrophin gene. This mutation was inherited from her asymptomatic father, who was further clinically and molecularly evaluated for prognosis and genetic counseling (GC). This deletion was never identified by us in 982 Duchenne/Becker patients. To assess whether the present case represents a rare case of non-penetrance, and aiming to obtain more information for prognosis and GC, we suggested that healthy older relatives submit their DNA for analysis, and several complied. Mutation analysis revealed that his mother, brother, and 56-year-old maternal uncle also carry the 38-44 deletion, suggesting that it is an unlikely cause of muscle weakness. Genome sequencing will disclose mutations and variants whose health impact is still unknown, raising important problems in interpreting results, defining prognosis, and discussing GC. We suggest that, in addition to family history, keeping the DNA of older relatives could be very informative, in particular for those interested in having their genome sequenced.

  16. Pooled Sequencing of Candidate Genes Implicates Rare Variants in the Development of Asthma Following Severe RSV Bronchiolitis in Infancy.

    PubMed

    Torgerson, Dara G; Giri, Tusar; Druley, Todd E; Zheng, Jie; Huntsman, Scott; Seibold, Max A; Young, Andrew L; Schweiger, Toni; Yin-Declue, Huiqing; Sajol, Geneline D; Schechtman, Kenneth B; Hernandez, Ryan D; Randolph, Adrienne G; Bacharier, Leonard B; Castro, Mario

    2015-01-01

    Severe infection with respiratory syncytial virus (RSV) during infancy is strongly associated with the development of asthma. To identify genetic variation that contributes to asthma following severe RSV bronchiolitis during infancy, we sequenced the coding exons of 131 asthma candidate genes in 182 European and African American children with severe RSV bronchiolitis in infancy using anonymous pools for variant discovery, and then directly genotyped a set of 190 nonsynonymous variants. Association testing was performed for physician-diagnosed asthma before the 7th birthday (asthma) using genotypes from 6,500 individuals from the Exome Sequencing Project (ESP) as controls to gain statistical power. In addition, among patients with severe RSV bronchiolitis during infancy, we examined genetic associations with asthma, active asthma, persistent wheeze, and bronchial hyperreactivity (methacholine PC20) at age 6 years. We identified four rare nonsynonymous variants that were significantly associated with asthma following severe RSV bronchiolitis, including single variants in ADRB2, FLG and NCAM1 in European Americans (p = 4.6 × 10⁻⁴, 1.9 × 10⁻¹³ and 5.0 × 10⁻⁵, respectively), and NOS1 in African Americans (p = 2.3 × 10⁻¹¹). One of the variants was a highly functional nonsynonymous variant in ADRB2 (rs1800888), which was also nominally associated with asthma (p = 0.027) and active asthma (p = 0.013) among European Americans with severe RSV bronchiolitis without including the ESP. Our results suggest that rare nonsynonymous variants contribute to the development of asthma following severe RSV bronchiolitis in infancy, notably in ADRB2. Additional studies are required to explore the role of rare variants in the etiology of asthma and asthma-related traits following severe RSV bronchiolitis.

  17. Analysis of ANK3 and CACNA1C variants identified in bipolar disorder whole genome sequence data

    PubMed Central

    Fiorentino, Alessia; O'Brien, Niamh Louise; Locke, Devin Paul; McQuillin, Andrew; Jarram, Alexandra; Anjorin, Adebayo; Kandaswamy, Radhika; Curtis, David; Blizard, Robert Alan; Gurling, Hugh Malcolm Douglas

    2014-01-01

    Objectives: Genetic markers in the genes encoding ankyrin 3 (ANK3) and the α-calcium channel subunit (CACNA1C) are associated with bipolar disorder (BP). The associated variants in the CACNA1C gene are mainly within intron 3 of the gene. ANK3 BP-associated variants are in two distinct clusters at the ends of the gene, indicating disease allele heterogeneity. Methods: In order to screen both coding and non-coding regions to identify potential aetiological variants, we used whole-genome sequencing in 99 BP cases. Variants with markedly different allele frequencies in the BP samples and the 1000 Genomes Project European data were genotyped in 1,510 BP cases and 1,095 controls. Results: We found that the CACNA1C intron 3 variant, rs79398153, potentially affecting an ENCyclopedia of DNA Elements (ENCODE)-defined region, showed an association with BP (p = 0.015). We also found the ANK3 BP-associated variant rs139972937, responsible for an asparagine to serine change (p = 0.042). However, a previous study had not found support for an association between rs139972937 and BP. The variants at ANK3 and CACNA1C previously known to be associated with BP were not in linkage disequilibrium with either of the two variants that we identified and these are therefore independent of the previous haplotypes implicated by genome-wide association. Conclusions: Sequencing in additional BP samples is needed to find the molecular pathology that explains the previous association findings. If changes similar to those we have found can be shown to have an effect on the expression and function of ANK3 and CACNA1C, they might help to explain the so-called ‘missing heritability’ of BP. PMID:24716743

  18. Pooled Sequencing of Candidate Genes Implicates Rare Variants in the Development of Asthma Following Severe RSV Bronchiolitis in Infancy

    PubMed Central

    Torgerson, Dara G.; Giri, Tusar; Druley, Todd E.; Zheng, Jie; Huntsman, Scott; Seibold, Max A.; Young, Andrew L.; Schweiger, Toni; Yin-Declue, Huiqing; Sajol, Geneline D.; Schechtman, Kenneth B; Hernandez, Ryan D.; Randolph, Adrienne G.; Bacharier, Leonard B.; Castro, Mario

    2015-01-01

    Severe infection with respiratory syncytial virus (RSV) during infancy is strongly associated with the development of asthma. To identify genetic variation that contributes to asthma following severe RSV bronchiolitis during infancy, we sequenced the coding exons of 131 asthma candidate genes in 182 European and African American children with severe RSV bronchiolitis in infancy using anonymous pools for variant discovery, and then directly genotyped a set of 190 nonsynonymous variants. Association testing was performed for physician-diagnosed asthma before the 7th birthday (asthma) using genotypes from 6,500 individuals from the Exome Sequencing Project (ESP) as controls to gain statistical power. In addition, among patients with severe RSV bronchiolitis during infancy, we examined genetic associations with asthma, active asthma, persistent wheeze, and bronchial hyperreactivity (methacholine PC20) at age 6 years. We identified four rare nonsynonymous variants that were significantly associated with asthma following severe RSV bronchiolitis, including single variants in ADRB2, FLG and NCAM1 in European Americans (p = 4.6 × 10⁻⁴, 1.9 × 10⁻¹³ and 5.0 × 10⁻⁵, respectively), and NOS1 in African Americans (p = 2.3 × 10⁻¹¹). One of the variants was a highly functional nonsynonymous variant in ADRB2 (rs1800888), which was also nominally associated with asthma (p = 0.027) and active asthma (p = 0.013) among European Americans with severe RSV bronchiolitis without including the ESP. Our results suggest that rare nonsynonymous variants contribute to the development of asthma following severe RSV bronchiolitis in infancy, notably in ADRB2. Additional studies are required to explore the role of rare variants in the etiology of asthma and asthma-related traits following severe RSV bronchiolitis. PMID:26587832

  19. Sequence analysis of three pigmentation genes in the Newfoundland population of Canis latrans links the Golden Retriever Mc1r variant to white coat color in coyotes.

    PubMed

    Brockerville, Ryan M; McGrath, Michael J; Pilgrim, Brettney L; Marshall, H Dawn

    2013-04-01

    Three genes, Mc1r, Agouti, and CBD103, interact in a type-switching process that controls much of the pigmentation variation observed in mammals. A deletion in the CBD103 gene is responsible for dominant black color in dogs, while the white-phased black bear ("spirit bear") of British Columbia, Canada, is the lightest documented color variant caused by a mutation in Mc1r. Rare all-white animals have recently been discovered in a new northeastern population of the coyote in insular Newfoundland and Labrador, Canada. To investigate the causative gene and mutation of white coat in coyotes, we sequenced the three type-switching genes in white and dark-phased animals from Newfoundland. The only sequence variants unambiguously associated with white color were in Mc1r, and one of these variants causes the amino acid variant R306Ter, a premature stop codon also linked to coat color in Golden Retrievers and other dogs with yellow/red coats. The allele carrying R306Ter in coyotes matches that in the Golden Retriever at other variable amino acid sites and hence may have originated in these dogs. Coyotes experienced introgression with wolves and dogs as they colonized northeastern North America, and coyote/Golden Retriever interactions have been observed in Newfoundland. We speculate that natural selection, with or without a founder effect, may contribute to the observed frequency of white coyotes in Newfoundland, as it has contributed to the high frequency of white bears, and of a domestic dog-derived CBD allele in gray wolves.

  20. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  1. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  2. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of 
the

  3. A genetic polymorphism in coumarin 7-hydroxylation: Sequence of the human CYP2A genes and identification of variant CYP2A6 alleles

    SciTech Connect

    Fernandez-Salguero, P.; Hoffman, S.M.G.; Mohrenweiser, H.

    1995-09-01

    A group of human cytochrome P450 genes encompassing the CYP2A, CYP2B, and CYP2F subfamilies were cloned and assembled into a 350-kb contig localized on the long arm of chromosome 19. Three complete CYP2A genes - CYP2A6, CYP2A7, and CYP2A13 - plus two pseudogenes truncated after exon 5 were identified and sequenced. A variant CYP2A6 allele that differed from the corresponding CYP2A6 and CYP2A7 cDNAs previously sequenced was found and was designated CYP2A6v2. Sequence differences in the CYP2A6v2 gene are restricted to regions encompassing exons 3, 6, and 8, which bear sequence relatedness with the corresponding exons of the CYP2A7 gene, located downstream and centromeric of CYP2A6v2, suggesting recent gene-conversion events. The sequencing of all the CYP2A genes allowed the design of a PCR diagnostic test for the normal CYP2A6 allele, the CYP2A6v2 allele, and a variant - designated CYP2A6v1 - that encodes an enzyme with a single inactivating amino acid change. These variant alleles were found in individuals who were deficient in their ability to metabolize the CYP2A6 probe drug coumarin. The allelic frequencies of CYP2A6v1 and CYP2A6v2 differed significantly between Caucasian, Asian, and African-American populations. These studies establish the existence of a new cytochrome P450 genetic polymorphism. 30 refs., 4 figs., 2 tabs.

  4. Whole Exome Sequence Analysis Implicates Rare IL17REL Variants In Familial And Sporadic Inflammatory Bowel Disease

    PubMed Central

    Sasaki, Mark M; Skol, Andrew D; Hungate, Eric A; Bao, Riyue; Huang, Lei; Kahn, Stacy A; Allan, James M; Brant, Steven R; McGovern, Dermot PB; Peter, Inga; Silverberg, Mark S; Cho, Judy H; Kirschner, Barbara S; Onel, Kenan

    2015-01-01

    Background: Rare variants (<1%) likely contribute significantly to risk for common diseases such as inflammatory bowel disease (IBD) in specific patient subsets, such as those with high familiality. They are, however, extraordinarily challenging to identify. Methods: To discover candidate rare variants associated with IBD, we performed whole exome sequencing (WES) on six members of a pediatric-onset IBD family with multiple affected individuals. To determine whether the variants discovered in this family are also associated with non-familial IBD, we investigated their influence on disease in two large case-control (CC) series. Results: We identified two rare variants, rs142430606 and rs200958270, both in the established IBD-susceptibility gene IL17REL, carried by all four affected family members and their obligate-carrier parents. We then demonstrated that both variants are associated with sporadic ulcerative colitis (UC) in two independent datasets. For UC in CC 1: rs142430606 (OR=2.99, Padj=0.028; MAFcases=0.0063, MAFcontrols=0.0021); rs200958270 (OR=2.61, Padj=0.082; MAFcases=0.0045, MAFcontrols=0.0017). For UC in CC 2: rs142430606 (OR=1.94, P=0.0056; MAFcases=0.0071, MAFcontrols=0.0045); rs200958270 (OR=2.08, P=0.0028; MAFcases=0.0071, MAFcontrols=0.0042). Conclusions: We discover in a family and replicate in two CC datasets two rare susceptibility variants for IBD, both in IL17REL. Our results illustrate that WES performed on disease-enriched families to guide association testing can be an efficient strategy for the discovery of rare disease-associated variants. We speculate that rare variants identified in families and confirmed in the general population may be important modifiers of disease risk for patients with a family history, and that genetic testing of these variants may be warranted in this patient subset. PMID:26480299

  5. Exploring the feasibility of using copy number variants as genetic markers through large-scale whole genome sequencing experiments

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Copy number variants (CNV) are large scale duplications or deletions of genomic sequence that are caused by a diverse set of molecular phenomena that are distinct from single nucleotide polymorphism (SNP) formation. Due to their different mechanisms of formation, CNVs are often difficult to track us...

  6. A Systematic Assessment of Accuracy in Detecting Somatic Mosaic Variants by Deep Amplicon Sequencing: Application to NF2 Gene

    PubMed Central

    Sestini, Roberta; Candita, Luisa; Capone, Gabriele Lorenzo; Barbetti, Lorenzo; Falconi, Serena; Frusconi, Sabrina; Giotti, Irene; Giuliani, Costanza; Torricelli, Francesca; Benelli, Matteo; Papi, Laura

    2015-01-01

    The accurate detection of low-allelic variants is still challenging, particularly for the identification of somatic mosaicism, where a matched control sample is not available. High throughput sequencing, by the simultaneous and independent analysis of thousands of different DNA fragments, might overcome many of the limits of traditional methods, greatly increasing the sensitivity. However, it is necessary to take into account the high number of false positives that may arise due to the lack of matched control samples. Here, we applied deep amplicon sequencing to the analysis of samples with known genotype and variant allele fraction (VAF) followed by a tailored statistical analysis. This method allowed us to define a minimum value of VAF for detecting mosaic variants with high accuracy. Then, we exploited the estimated VAF to select candidate alterations in the NF2 gene in 34 samples with unknown genotype (30 blood and 4 tumor DNAs), demonstrating the suitability of our method. The strategy we propose optimizes the use of deep amplicon sequencing for the identification of low abundance variants. Moreover, our method can be applied to different high throughput sequencing approaches to estimate the background noise and define the accuracy of the experimental design. PMID:26066488

  7. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  8. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  9. A FACS-based screening strategy to assess sequence-specific RNA-binding of Pumilio protein variants in E. coli.

    PubMed

    Kellermann, Stefanie J; Rentmeister, Andrea

    2017-01-01

    Sequence-specific and programmable binding of proteins to RNA bears the potential to detect and manipulate target RNAs. Applications include analysis of subcellular RNA localization or post-transcriptional regulation but require sequence-specificity to be readily adjustable to any target RNA. The Pumilio homology domain binds an eight-nucleotide target sequence in a predictable manner, allowing for rational design of variants with new specificities. We describe a high-throughput system for screening Pumilio variants based on fluorescence-activated cell sorting of E. coli. Our approach should help optimize variants obtained from rational design regarding folding and stability or identify new variants with alternative binding modes.

  10. Exome Sequencing in an Admixed Isolated Population Indicates NFXL1 Variants Confer a Risk for Specific Language Impairment

    PubMed Central

    Villanueva, Pía; Nudel, Ron; Hoischen, Alexander; Fernández, María Angélica; Simpson, Nuala H.; Gilissen, Christian; Reader, Rose H.; Jara, Lillian; Echeverry, Maria Magdalena; Francks, Clyde; Baird, Gillian; Conti-Ramsden, Gina; O’Hare, Anne; Bolton, Patrick F.; Hennessy, Elizabeth R.; Palomino, Hernán; Carvajal-Carmona, Luis; Veltman, Joris A.; Cazier, Jean-Baptiste; De Barbieri, Zulema

    2015-01-01

    Children affected by Specific Language Impairment (SLI) fail to acquire age-appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10⁻⁴, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model. PMID:25781923

  11. Exome sequencing in an admixed isolated population indicates NFXL1 variants confer a risk for specific language impairment.

    PubMed

    Villanueva, Pía; Nudel, Ron; Hoischen, Alexander; Fernández, María Angélica; Simpson, Nuala H; Gilissen, Christian; Reader, Rose H; Jara, Lillian; Echeverry, María Magdalena; Francks, Clyde; Baird, Gillian; Conti-Ramsden, Gina; O'Hare, Anne; Bolton, Patrick F; Hennessy, Elizabeth R; Palomino, Hernán; Carvajal-Carmona, Luis; Veltman, Joris A; Cazier, Jean-Baptiste; De Barbieri, Zulema; Fisher, Simon E; Newbury, Dianne F

    2015-03-01

    Children affected by Specific Language Impairment (SLI) fail to acquire age-appropriate language skills despite adequate intelligence and opportunity. SLI is highly heritable, but the understanding of underlying genetic mechanisms has proved challenging. In this study, we use molecular genetic techniques to investigate an admixed isolated founder population from the Robinson Crusoe Island (Chile), who are affected by a high incidence of SLI, increasing the power to discover contributory genetic factors. We utilize exome sequencing in selected individuals from this population to identify eight coding variants that are of putative significance. We then apply association analyses across the wider population to highlight a single rare coding variant (rs144169475, Minor Allele Frequency of 4.1% in admixed South American populations) in the NFXL1 gene that confers a nonsynonymous change (N150K) and is significantly associated with language impairment in the Robinson Crusoe population (p = 2.04 × 10⁻⁴, 8 variants tested). Subsequent sequencing of NFXL1 in 117 UK SLI cases identified four individuals with heterozygous variants predicted to be of functional consequence. We conclude that coding variants within NFXL1 confer an increased risk of SLI within a complex genetic model.

  12. Compact variant-rich customized sequence database and a fast and sensitive database search for efficient proteogenomic analyses.

    PubMed

    Park, Heejin; Bae, Junwoo; Kim, Hyunwoo; Kim, Sangok; Kim, Hokeun; Mun, Dong-Gi; Joh, Yoonsung; Lee, Wonyeop; Chae, Sehyun; Lee, Sanghyuk; Kim, Hark Kyun; Hwang, Daehee; Lee, Sang-Won; Paek, Eunok

    2014-12-01

    In proteogenomic analysis, construction of a compact, customized database from mRNA-seq data and a sensitive search of both reference and customized databases are essential to accurately determine protein abundances and structural variations at the protein level. However, these tasks have not been systematically explored, but rather performed in an ad-hoc fashion. Here, we present an effective method for constructing a compact database containing comprehensive sequences of sample-specific variants--single nucleotide variants, insertions/deletions, and stop-codon mutations derived from Exome-seq and RNA-seq data. It, however, occupies less space by storing variant peptides, not variant proteins. We also present an efficient search method for both customized and reference databases. The separate searches of the two databases increase the search time, and a unified search is less sensitive to identify variant peptides due to the smaller size of the customized database, compared to the reference database, in the target-decoy setting. Our method searches the unified database once, but performs target-decoy validations separately. Experimental results show that our approach is as fast as the unified search and as sensitive as the separate searches. Our customized database includes mutation information in the headers of variant peptides, thereby facilitating the inspection of peptide-spectrum matches.
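
    The idea of searching one unified database while validating reference and variant matches separately can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the scoring scheme or the error-rate formula, so the simple decoys/targets ratio, the function name fdr_by_group, and the tuple layout (score, is_decoy, group) are all hypothetical.

```python
from collections import defaultdict

def fdr_by_group(psms, score_cutoff):
    """Estimate a target-decoy FDR separately for each match group.

    psms: iterable of (score, is_decoy, group) tuples, where group is a
    label such as "reference" or "variant" (hypothetical layout).
    Returns {group: decoys / targets} for matches scoring >= score_cutoff,
    or None for a group with no target matches above the cutoff.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [targets, decoys]
    for score, is_decoy, group in psms:
        if score >= score_cutoff:
            counts[group][1 if is_decoy else 0] += 1
    return {
        group: (decoys / targets if targets else None)
        for group, (targets, decoys) in counts.items()
    }
```

    Keeping the counts per group means that the much larger reference database does not dominate the error estimate for the small set of variant peptides, which is the concern the abstract raises about a plain unified search.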

  13. Identification of cancer predisposition variants in apparently healthy individuals using a next-generation sequencing-based family genomics approach.

    PubMed

    Karageorgos, Ioannis; Mizzi, Clint; Giannopoulou, Efstathia; Pavlidis, Cristiana; Peters, Brock A; Zagoriti, Zoi; Stenson, Peter D; Mitropoulos, Konstantinos; Borg, Joseph; Kalofonos, Haralabos P; Drmanac, Radoje; Stubbs, Andrew; van der Spek, Peter; Cooper, David N; Katsila, Theodora; Patrinos, George P

    2015-06-20

    Cancer, like many common disorders, has a complex etiology, often with a strong genetic component and with multiple environmental factors contributing to susceptibility. A considerable number of genomic variants have been previously reported to be causative of, or associated with, an increased risk for various types of cancer. Here, we adopted a next-generation sequencing approach in 11 members of two families of Greek descent to identify all genomic variants with the potential to predispose family members to cancer. Cross-comparison with data from the Human Gene Mutation Database identified a total of 571 variants, of which 47% were disease-associated polymorphisms, 26% disease-associated polymorphisms with additional supporting functional evidence, 19% functional polymorphisms with in vitro/laboratory or in vivo supporting evidence but no known disease association, 4% putative disease-causing mutations but with some residual doubt as to their pathological significance, and 3% disease-causing mutations. Subsequent analysis, focused on the latter variant class most likely to be involved in cancer predisposition, revealed two variants of prime interest, namely MSH2 c.2732T>A (p.L911R) and BRCA1 c.2955delC, the first of which is novel. KMT2D c.13895delC and c.1940C>A variants are additionally reported as incidental findings. The next-generation sequencing-based family genomics approach described herein has the potential to be applied to other types of complex genetic disorder in order to identify variants of potential pathological significance.

  14. Ultradeep Sequencing for Detection of Quasispecies Variants in the Major Hydrophilic Region of Hepatitis B Virus in Indonesian Patients.

    PubMed

    Yamani, Laura Navika; Yano, Yoshihiko; Utsumi, Takako; Juniastuti; Wandono, Hadi; Widjanarko, Doddy; Triantanoe, Ari; Wasityastuti, Widya; Liang, Yujiao; Okada, Rina; Tanahashi, Toshihito; Murakami, Yoshiki; Azuma, Takeshi; Soetjipto; Lusida, Maria Inge; Hayashi, Yoshitake

    2015-10-01

    Quasispecies of hepatitis B virus (HBV) with variations in the major hydrophilic region (MHR) of the HBV surface antigen (HBsAg) can evolve during infection, allowing HBV to evade neutralizing antibodies. These escape variants may contribute to chronic infections. In this study, we looked for MHR variants in HBV quasispecies using ultradeep sequencing and evaluated the relationship between these variants and clinical manifestations in infected patients. We enrolled 30 Indonesian patients with hepatitis B infection (11 with chronic hepatitis and 19 with advanced liver disease). The most common subgenotype/subtype of HBV was B3/adw (97%). The HBsAg titer was lower in patients with advanced liver disease than that in patients with chronic hepatitis. The MHR variants were grouped based on the percentage of the viral population affected: major, ≥20% of the total population; intermediate, 5% to <20%; and minor, 1% to <5%. The rates of MHR variation that were present in the major and intermediate viral population were significantly greater in patients with advanced liver disease than those in chronic patients. The most frequent MHR variants related to immune evasion in the major and intermediate populations were P120Q/T, T123A, P127T, Q129H/R, M133L/T, and G145R. The major population of MHR variants causing impaired HBsAg secretion (e.g., G119R, Q129R, T140I, and G145R) was detected only in advanced liver disease patients. This is the first study to use ultradeep sequencing for the detection of MHR variants of HBV quasispecies in Indonesian patients. We found that a greater number of MHR variations was related to disease severity and reduced likelihood of HBsAg titer.
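
    The grouping rule stated in this abstract (major, ≥20%; intermediate, 5% to <20%; minor, 1% to <5%) can be written as a small helper. This is a minimal sketch: the function name classify_mhr_variant and the "below_cutoff" label for variants under 1% are assumptions for illustration, since the abstract does not say how such variants were handled.

```python
def classify_mhr_variant(percent_of_population: float) -> str:
    """Bin an MHR variant by the percentage of the viral population carrying it.

    Thresholds follow the abstract: major >= 20%, intermediate 5% to < 20%,
    minor 1% to < 5%. Anything under 1% gets a hypothetical "below_cutoff"
    label, which is an assumption and not part of the source.
    """
    if percent_of_population >= 20.0:
        return "major"
    if percent_of_population >= 5.0:
        return "intermediate"
    if percent_of_population >= 1.0:
        return "minor"
    return "below_cutoff"
```

    Checking the thresholds from highest to lowest keeps each boundary value (20, 5 and 1) in the higher of its two neighbouring groups, matching the inclusive lower bounds given in the abstract.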

  15. Ultradeep Sequencing for Detection of Quasispecies Variants in the Major Hydrophilic Region of Hepatitis B Virus in Indonesian Patients

    PubMed Central

    Yamani, Laura Navika; Utsumi, Takako; Juniastuti; Wandono, Hadi; Widjanarko, Doddy; Triantanoe, Ari; Wasityastuti, Widya; Liang, Yujiao; Okada, Rina; Tanahashi, Toshihito; Murakami, Yoshiki; Azuma, Takeshi; Soetjipto; Lusida, Maria Inge; Hayashi, Yoshitake

    2015-01-01

    Quasispecies of hepatitis B virus (HBV) with variations in the major hydrophilic region (MHR) of the HBV surface antigen (HBsAg) can evolve during infection, allowing HBV to evade neutralizing antibodies. These escape variants may contribute to chronic infections. In this study, we looked for MHR variants in HBV quasispecies using ultradeep sequencing and evaluated the relationship between these variants and clinical manifestations in infected patients. We enrolled 30 Indonesian patients with hepatitis B infection (11 with chronic hepatitis and 19 with advanced liver disease). The most common subgenotype/subtype of HBV was B3/adw (97%). The HBsAg titer was lower in patients with advanced liver disease than that in patients with chronic hepatitis. The MHR variants were grouped based on the percentage of the viral population affected: major, ≥20% of the total population; intermediate, 5% to <20%; and minor, 1% to <5%. The rates of MHR variation that were present in the major and intermediate viral population were significantly greater in patients with advanced liver disease than those in chronic patients. The most frequent MHR variants related to immune evasion in the major and intermediate populations were P120Q/T, T123A, P127T, Q129H/R, M133L/T, and G145R. The major population of MHR variants causing impaired HBsAg secretion (e.g., G119R, Q129R, T140I, and G145R) was detected only in advanced liver disease patients. This is the first study to use ultradeep sequencing for the detection of MHR variants of HBV quasispecies in Indonesian patients. We found that a greater number of MHR variations was related to disease severity and reduced likelihood of HBsAg titer. PMID:26202119

  16. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  17. Complete nucleotide sequence of a new variant of grapevine fanleaf virus from northeastern China.

    PubMed

    Zhou, Jun; Fan, Xudong; Dong, Yafeng; Zhang, Zunping; Ren, Fang; Hu, Guojun; Li, Zhengnan

    2017-02-01

    The complete RNA1 and RNA2 sequences of a new grapevine fanleaf virus isolate (GFLV-SDHN) from northeastern China were determined. The two RNAs are 7,367 and 3,788 nucleotides (nt) in length, respectively, excluding the poly(A) tails. Compared to other GFLV isolates, GFLV-SDHN has a 22- to 24-nt insertion in the RNA1 5' untranslated region, and there was 19.1%-20.1% and 11.7%-13.0% sequence divergence in RNA1, and 15.5%-20.5% and 8.5%-13.5% in RNA2, at the nt and amino acid level, respectively. Phylogenetic analysis revealed that the origins of GFLV-SDHN are distinct from those of other GFLV isolates. One recombination event was identified in the 2A(HP) region of RNA2 in GFLV-SDHN.

  18. A unified test of linkage analysis and rare-variant association for analysis of pedigree sequence data.

    PubMed

    Hu, Hao; Roach, Jared C; Coon, Hilary; Guthery, Stephen L; Voelkerding, Karl V; Margraf, Rebecca L; Durtschi, Jacob D; Tavtigian, Sean V; Shankaracharya; Wu, Wilfred; Scheet, Paul; Wang, Shuoguo; Xing, Jinchuan; Glusman, Gustavo; Hubley, Robert; Li, Hong; Garg, Vidu; Moore, Barry; Hood, Leroy; Galas, David J; Srivastava, Deepak; Reese, Martin G; Jorde, Lynn B; Yandell, Mark; Huff, Chad D

    2014-07-01

    High-throughput sequencing of related individuals has become an important tool for studying human disease. However, owing to technical complexity and lack of available tools, most pedigree-based sequencing studies rely on an ad hoc combination of suboptimal analyses. Here we present pedigree-VAAST (pVAAST), a disease-gene identification tool designed for high-throughput sequence data in pedigrees. pVAAST uses a sequence-based model to perform variant and gene-based linkage analysis. Linkage information is then combined with functional prediction and rare variant case-control association information in a unified statistical framework. pVAAST outperformed linkage and rare-variant association tests in simulations and identified disease-causing genes from whole-genome sequence data in three human pedigrees with dominant, recessive and de novo inheritance patterns. The approach is robust to incomplete penetrance and locus heterogeneity and is applicable to a wide variety of genetic traits. pVAAST maintains high power across studies of monogenic, high-penetrance phenotypes in a single pedigree to highly polygenic, common phenotypes involving hundreds of pedigrees.

  19. Two Novel Toxin Variants Revealed by Whole-Genome Sequencing of 175 Clostridium botulinum Type E Strains

    PubMed Central

    Weedmark, K. A.; Lambert, D. L.; Mabon, P.; Hayden, K. L.; Urfano, C. J.; Leclair, D.; Van Domselaar, G.; Austin, J. W.

    2014-01-01

    We sequenced 175 Clostridium botulinum type E strains isolated from food, clinical, and environmental sources from northern Canada and analyzed their botulinum neurotoxin (bont) coding sequences (CDSs). In addition to bont/E1 and bont/E3 variant types, neurotoxin sequence analysis identified two novel BoNT type E variants termed E10 and E11. Strains producing type E10 were found along the eastern coastlines of Hudson Bay and the shores of Ungava Bay, while strains producing type E11 were only found in the Koksoak River region of Nunavik. Strains producing BoNT/E3 were widespread throughout northern Canada, with the exception of the coast of eastern Hudson Bay. PMID:25107978

  20. Prevalence of HPV 16 genomic variant carrying a 63 bp duplicated sequence within the E1 gene in Slovenian women.

    PubMed

    Bogovac, Zeljka; Lunar, Maja M; Kocjan, Boštjan J; Seme, Katja; Jančar, Nina; Poljak, Mario

    2011-09-01

    High-risk HPV, particularly HPV-16, is etiologically associated with the development of cervical cancer and its precursor lesions - cervical intraepithelial neoplasia (CIN). However, most precancerous lesions will not progress to cancer. Numerous studies have shown that HPV-16 consists of several genomic variants, which differ in their association with cervical cancer, viral persistence and the frequency of recurrence of cervical disease. Recently, a novel, presumably less pathogenic, HPV-16 E6-T350G genomic variant has been identified, carrying a 63-bp in-frame insertion in the E1 gene. No data from Slovenian patients have so far been reported for this specific HPV-16 variant. In the present study, therefore, a total of 390 HPV-16 positive samples obtained from the same number of women with normal cytology, CIN I, CIN II, CIN III or cervical cancer, were analyzed. The HPV-16 E1 insert variant was detected using real-time PCR-amplification of a 146-210-bp fragment of the E1 gene and PCR-sequencing of a 169-bp fragment of the E6 gene. The HPV-16 E1 insert variant was identified in 7/48 (14.6%), 1/21 (4.8%), 2/20 (10.0%), 9/131 (6.9%) and 12/170 (7.1%) of women with normal cytology, CIN I, CIN II, CIN III and cervical cancer, respectively. All HPV-16 E1 insert variants with an amplifiable E6 gene belonged to the European HPV-16 E6-350G variant group. No statistically significant differences in the prevalence of HPV-16 E1 insert genomic variant in women presenting with normal cytology and those with the different stages of HPV-16-induced disease were found.
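
    The group-wise prevalences reported above can be re-derived from the counts in the abstract. This is a minimal arithmetic sketch; the variable and function names are illustrative, and the pooled figure is a derived value that the abstract itself does not report.

```python
# (positive, total) counts per diagnostic group, as given in the abstract.
groups = {
    "normal cytology": (7, 48),
    "CIN I": (1, 21),
    "CIN II": (2, 20),
    "CIN III": (9, 131),
    "cervical cancer": (12, 170),
}


def prevalence_percent(positive: int, total: int) -> float:
    """Prevalence as a percentage rounded to one decimal place."""
    return round(100.0 * positive / total, 1)


prevalence = {name: prevalence_percent(pos, tot)
              for name, (pos, tot) in groups.items()}
total_samples = sum(tot for _, tot in groups.values())
total_positive = sum(pos for pos, _ in groups.values())

for name, pct in prevalence.items():
    print(f"{name}: {pct}%")
print(f"pooled: {total_positive}/{total_samples}")
```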

  1. Multiplexed highly-accurate DNA sequencing of closely-related HIV-1 variants using continuous long reads from single molecule, real-time sequencing.

    PubMed

    Dilernia, Dario A; Chien, Jung-Ting; Monaco, Daniela C; Brown, Michael P S; Ende, Zachary; Deymier, Martin J; Yue, Ling; Paxinos, Ellen E; Allen, Susan; Tirado-Ramos, Alfredo; Hunter, Eric

    2015-11-16

    Single Molecule, Real-Time (SMRT) Sequencing (Pacific Biosciences, Menlo Park, CA, USA) provides the longest continuous DNA sequencing reads currently available. However, the relatively high error rate in the raw read data requires novel analysis methods to deconvolute sequences derived from complex samples. Here, we present a workflow of novel computer algorithms able to reconstruct viral variant genomes present in mixtures with an accuracy of >QV50. This approach relies exclusively on Continuous Long Reads (CLR), which are the raw reads generated during SMRT Sequencing. We successfully implement this workflow for simultaneous sequencing of mixtures containing up to forty different >9 kb HIV-1 full genomes. This was achieved using a single SMRT Cell for each mixture and desktop computing power. This novel approach opens the possibility of solving complex sequencing tasks that currently lack a solution.

  2. Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing

    PubMed Central

    Yun, Sijung

    2017-01-01

    Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance. PMID:28333980

  3. De novo assembly and next-generation sequencing to analyse full-length gene variants from codon-barcoded libraries

    PubMed Central

    Cho, Namjin; Hwang, Byungjin; Yoon, Jung-ki; Park, Sangun; Lee, Joongoo; Seo, Han Na; Lee, Jeewon; Huh, Sunghoon; Chung, Jinsoo; Bang, Duhee

    2015-01-01

    Interpreting epistatic interactions is crucial for understanding evolutionary dynamics of complex genetic systems and unveiling structure and function of genetic pathways. Although high-resolution mapping of en masse variant libraries enables molecular biologists to address genotype-phenotype relationships, long-read sequencing technology remains indispensable to assess functional relationships between mutations that lie far apart. Here, we introduce JigsawSeq for multiplexed sequence identification of pooled gene variant libraries by combining a codon-based molecular barcoding strategy and de novo assembly of short-read data. We first validated JigsawSeq on small sub-pools and observed high precision and recall under various experimental settings. With extensive simulations, we then applied JigsawSeq to large-scale gene variant libraries to show that our method can be reliably scaled using next-generation sequencing. JigsawSeq may serve as a rapid screening tool for functional genomics and offer the opportunity to explore evolutionary trajectories of protein variants. PMID:26387459

  4. Evaluating alignment and variant-calling software for mutation identification in C. elegans by whole-genome sequencing.

    PubMed

    Smith, Harold E; Yun, Sijung

    2017-01-01

    Whole-genome sequencing is a powerful tool for analyzing genetic variation on a global scale. One particularly useful application is the identification of mutations obtained by classical phenotypic screens in model species. Sequence data from the mutant strain is aligned to the reference genome, and then variants are called to generate a list of candidate alleles. A number of software pipelines for mutation identification have been targeted to C. elegans, with particular emphasis on ease of use, incorporation of mapping strain data, subtraction of background variants, and similar criteria. Although success is predicated upon the sensitive and accurate detection of candidate alleles, relatively little effort has been invested in evaluating the underlying software components that are required for mutation identification. Therefore, we have benchmarked a number of commonly used tools for sequence alignment and variant calling, in all pair-wise combinations, against both simulated and actual datasets. We compared the accuracy of those pipelines for mutation identification in C. elegans, and found that the combination of BBMap for alignment plus FreeBayes for variant calling offers the most robust performance.

  5. Variant call concordance between two laboratory-developed, solid tumor targeted genomic profiling assays using distinct workflows and sequencing instruments.

    PubMed

    Hampel, Ken J; de Abreu, Francine B; Sidiropoulos, Nikoletta; Peterson, Jason D; Tsongalis, Gregory J

    2017-02-10

    Targeted genomic profiling (TGP) using massively parallel DNA sequencing is becoming the standard methodology in clinical laboratories for detecting somatic variants in solid tumors. The variety of methodologies and sequencing platforms in the marketplace for TGP has resulted in a variety of clinical TGP laboratory-developed tests (LDTs). The variability of LDTs is a challenge for test-to-test and laboratory-to-laboratory reliability. At the University of Vermont Medical Center (UVMMC), we validated a TGP assay for solid tumors which utilizes DNA hybridization capture and complete exon and selected intron sequencing of 29 clinically actionable genes. The validation samples were run on the Illumina MiSeq platform. Clinical specificity and sensitivity were evaluated by testing samples harboring genomic variants previously identified in CLIA-approved, CAP-accredited laboratories with clinically validated molecular assays. The Molecular Laboratory at Dartmouth Hitchcock Medical Center (DHMC) provided 11 FFPE specimens that had been analyzed on AmpliSeq Cancer Hotspot Panel version 2 (CHPv2) and run on the Ion Torrent PGM. A Venn diagram of the gene lists from the two institutions is shown. This provided an excellent opportunity to compare the inter-laboratory reliability using two different target sequencing methods and sequencing platforms. Our data demonstrated an exceptionally high level of concordance with respect to the sensitivity and specificity of the analyses. All clinically actionable SNV and InDel variant calls in genes covered by both panels (n=17) were identified by both laboratories. These data support the proposal that distinct gene panel designs and sequencing workflows are capable of making consistent variant calls in solid tumor FFPE-derived samples.

  6. Dietary fatty acids modulate associations between genetic variants and circulating fatty acids in plasma and erythrocyte membranes: meta-analysis of 9 studies in the CHARGE consortium

    PubMed Central

    Smith, Caren E.; Follis, Jack L.; Nettleton, Jennifer A.; Foy, Millennia; Wu, Jason H.Y.; Ma, Yiyi; Tanaka, Toshiko; Manichakul, Ani W.; Wu, Hongyu; Chu, Audrey Y.; Steffen, Lyn M.; Fornage, Myriam; Mozaffarian, Dariush; Kabagambe, Edmond K.; Ferruci, Luigi; da Chen, Yii-Der I; Rich, Stephen S.; Djoussé, Luc; Ridker, Paul M.; Tang, Weihong; McKnight, Barbara; Tsai, Michael Y.; Bandinelli, Stefania; Rotter, Jerome I.; Hu, Frank B.; Chasman, Daniel I.; Psaty, Bruce M.; Arnett, Donna K.; King, Irena B.; Sun, Qi; Wang, Lu; Lumley, Thomas; Chiuve, Stephanie E.; Siscovick, David S; Ordovás, José M.; Lemaitre, Rozenn N.

    2015-01-01

    Scope: Tissue concentrations of omega-3 fatty acids may reduce cardiovascular disease risk, and genetic variants are associated with circulating fatty acid concentrations. Whether dietary fatty acids interact with genetic variants to modify circulating omega-3 fatty acids is unclear. Objective: We evaluated interactions between genetic variants and fatty acid intakes for circulating alpha-linolenic acid (ALA), eicosapentaenoic acid (EPA), docosahexaenoic acid (DHA) and docosapentaenoic acid (DPA). Methods and Results: We conducted meta-analyses (N up to 11,668) evaluating interactions between dietary fatty acids and genetic variants (rs174538 and rs174548 in FADS1 (fatty acid desaturase 1), rs7435 in AGPAT3 (1-acyl-sn-glycerol-3-phosphate O-acyltransferase 3), rs4985167 in PDXDC1 (pyridoxal-dependent decarboxylase domain-containing 1), rs780094 in GCKR (glucokinase regulatory protein) and rs3734398 in ELOVL2 (fatty acid elongase 2)). Stratification by measurement compartment (plasma vs. erythrocyte) revealed compartment-specific interactions between FADS1 rs174538 and rs174548 and dietary ALA and linoleic acid for DHA and DPA. Conclusion: Our findings reinforce earlier reports that genetically based differences in circulating fatty acids may be partially due to differences in the conversion of fatty acid precursors. Further, the fatty acid measurement compartment may modify gene-diet relationships, and considering compartment may improve the detection of gene-fatty acid interactions for circulating fatty acid outcomes. PMID:25626431

  7. Amino acid sequence of mouse submaxillary gland renin.

    PubMed Central

    Misono, K S; Chang, J J; Inagami, T

    1982-01-01

    The complete amino acid sequences of the heavy chain and light chain of mouse submaxillary gland renin have been determined. The heavy chain consists of 288 amino acid residues having a Mr of 31,036 calculated from the sequence. The light chain contains 48 amino acid residues with a Mr of 5,458. The sequence of the heavy chain was determined by automated Edman degradations of the cyanogen bromide peptides and tryptic peptides generated after citraconylation, as well as other peptides generated therefrom. The sequence of the light chain was derived from sequence analyses of the peptides generated by cyanogen bromide cleavage or by digestion with Staphylococcus aureus protease. The sequences in the active site regions in renin containing two catalytically essential aspartyl residues 32 and 215 were found identical with those in pepsin, chymosin, and penicillopepsin. Comparison of the amino acid sequence of renin with that of porcine pepsin indicated a 42% sequence identity of the heavy chain with the amino-terminal and middle regions and a 46% identity of the light chain with the carboxyl-terminal region of the porcine pepsin sequence. Residues identical in renin and pepsin are distributed throughout the length of the molecules, suggesting a similarity in their overall structures. PMID:6812055

  8. Genetic variants of the unsaturated fatty acid receptor GPR120 relating to obesity in dogs.

    PubMed

    Miyabe, Masahiro; Gin, Azusa; Onozawa, Eri; Daimon, Mana; Yamada, Hana; Oda, Hitomi; Mori, Akihiro; Momota, Yutaka; Azakami, Daigo; Yamamoto, Ichiro; Mochizuki, Mariko; Sako, Toshinori; Tamura, Katsutoshi; Ishioka, Katsumi

    2015-10-01

    G protein-coupled receptor (GPR) 120 is an unsaturated fatty acid receptor, which is associated with various physiological functions. The GPR120 genetic variant p.Arg270His is reportedly detected more often in obese people, and this genetic variation is functionally related to obesity in humans. Obesity is also a common nutritional disorder in dogs, but its genetic factors have not yet been identified in dogs. In this study, we investigated the molecular structure of canine GPR120 and searched for candidate genetic variants which may relate to obesity in dogs. Canine GPR120 was highly homologous to those of other species, and seven transmembrane domains and two N-glycosylation sites were conserved. GPR120 mRNA was expressed in lung, jejunum, ileum, colon, hypothalamus, hippocampus, spinal cord, bone marrow, dermis and white adipose tissues in dogs, as in mice and humans. Genetic variants of GPR120 were explored in 141 client-owned dogs; 5 synonymous and 4 non-synonymous variants were found. The variant c.595C>A (p.Pro199Thr) was found in 40 dogs, and the gene frequency was significantly higher in dogs with higher body condition scores, i.e. 0.320 in BCS4-5 dogs, 0.175 in BCS3 dogs and 0.000 in BCS2 dogs. We conclude that c.595C>A (p.Pro199Thr) is a candidate variant relating to obesity, which may be helpful for nutritional management of dogs.
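
    A gene (allele) frequency such as those quoted per body condition score is conventionally computed from genotype counts at a diploid locus. The sketch below shows that standard calculation; the abstract does not report genotype counts, so the counts used here are hypothetical and chosen only to illustrate the arithmetic.

```python
def alt_allele_frequency(n_hom_alt: int, n_het: int, n_total: int) -> float:
    """Alternate-allele frequency at a diploid locus.

    Each homozygous-alternate animal carries two copies of the variant
    allele and each heterozygote carries one, out of 2 * n_total alleles.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if n_hom_alt < 0 or n_het < 0 or n_hom_alt + n_het > n_total:
        raise ValueError("genotype counts are inconsistent with n_total")
    return (2 * n_hom_alt + n_het) / (2 * n_total)


# Hypothetical counts (not from the study): 8 homozygous-variant and
# 16 heterozygous animals in a group of 50.
freq = alt_allele_frequency(n_hom_alt=8, n_het=16, n_total=50)
print(f"{freq:.3f}")
```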

  9. Effects of different amino acids in culture media on surfactin variants produced by Bacillus subtilis TD7.

    PubMed

    Liu, Jin-Feng; Yang, Juan; Yang, Shi-Zhong; Ye, Ru-Qiang; Mu, Bo-Zhong

    2012-04-01

    Surfactin produced by Bacillus subtilis has different variants, which are affected by the composition of substrate available. To demonstrate the effects of amino acids on surfactin variants, B. subtilis TD7 was cultivated under the same conditions but with different amino acids supplied in media, respectively, and the type as well as the proportion of surfactin variants produced was analyzed with electrospray ionization mass spectrometry and gas chromatography-mass spectrometry. The result shows that the addition of different amino acids significantly influences the proportion of surfactin variants with different fatty acids. When Arg, Gln, or Val was added to the culture medium of B. subtilis TD7, the proportion of produced surfactin variants with even β-hydroxy fatty acids significantly increased, while the addition of Cys, His, Ile, Leu, Met, Ser, or Thr enhanced the proportion of surfactin variants with odd β-hydroxy fatty acids markedly. This result may be of some reference value in enhancing the production of specific surfactin variants as well as in the research on the relationship between culture media and the corresponding products of a certain bacterium.

  10. Next-generation re-sequencing of genes involved in increased platelet reactivity in diabetic patients on acetylsalicylic acid.

    PubMed

    Postula, Marek; Janicki, Piotr K; Eyileten, Ceren; Rosiak, Marek; Kaplon-Cieslicka, Agnieszka; Sugino, Shigekazu; Wilimski, Radosław; Kosior, Dariusz A; Opolski, Grzegorz; Filipiak, Krzysztof J; Mirowska-Guzel, Dagmara

    2016-06-01

    The objective of this study was to investigate whether rare missense genetic variants in several genes related to platelet functions and acetylsalicylic acid (ASA) response are associated with platelet reactivity in patients with type 2 diabetes (T2D) on ASA therapy. Fifty-eight exons and corresponding introns of eight selected genes, including PTGS1, PTGS2, TBXAS1, PTGIS, ADRA2A, ADRA2B, TBXA2R, and P2RY1, were re-sequenced in 230 DNA samples from T2D patients by using pooled PCR amplification and next-generation sequencing on the Illumina HiSeq2000. The observed non-synonymous variants were confirmed by individual genotyping of 384 DNA samples comprising the individuals from the original discovery pools and an additional verification cohort of 154 ASA-treated T2D patients. The association between the investigated phenotypes (ASA-induced changes in platelet reactivity by PFA-100, VerifyNow and serum thromboxane B2 level [sTxB2]) and accumulation of rare missense variants (genetic burden) in the investigated genes was tested using statistical collapsing tests. We identified a total of 35 exonic variants, including 3 common missense variants, 15 rare missense variants, and 17 synonymous variants in the 8 investigated genes. The rare missense variants exhibited a statistically significant difference in the accumulation pattern between groups of patients with increased and normal platelet reactivity based on the PFA-100 assay. Our study suggests that the genetic burden of rare functional variants in eight genes may contribute to differences in platelet reactivity measured with the PFA-100 assay in T2D patients treated with ASA.

  11. Genome Sequence of Rough and Smooth Variants of Pleomorphic Strain Lactobacillus farciminis CNCM-I-3699

    PubMed Central

    Tareb, R.; Bernardeau, M.

    2015-01-01

    The probiotic Lactobacillus farciminis CNCM-I-3699 is a pleomorphic strain exhibiting smooth and rough variants. We report their complete genomes consisting of a chromosome of 2.4 Mb and a plasmid of 6,417 bp. The smooth variant differs by the presence of an additional plasmid of 35,418 bp. PMID:26383668

  12. Whole genome sequencing of an African American family highlights toll like receptor 6 variants in Kawasaki disease susceptibility

    PubMed Central

    Veeraraghavan, Narayanan; Levy, Eric; Ribeiro dos Santos, Andre M.; Yang, Hai; Hibberd, Martin L.; Tremoulet, Adriana H.; Harismendy, Olivier; Ohno-Machado, Lucila; Burns, Jane C.

    2017-01-01

    Kawasaki disease (KD) is the most common acquired pediatric heart disease. We analyzed Whole Genome Sequences (WGS) from a 6-member African American family in which KD affected two of four children. We sought rare, potentially causative genotypes by sequentially applying the following WGS filters: sequence quality scores, inheritance model (recessive homozygous and compound heterozygous), predicted deleteriousness, allele frequency, genes in KD-associated pathways or with significant associations in published KD genome-wide association studies (GWAS), and with differential expression in KD blood transcriptomes. Biologically plausible genotypes were identified in twelve variants in six genes in the two affected children. The affected siblings were compound heterozygous for the rare variants p.Leu194Pro and p.Arg247Lys in Toll-like receptor 6 (TLR6), which affect TLR6 signaling. The affected children were also homozygous for three common, linked (r2 = 1) intronic single nucleotide variants (SNVs) in TLR6 (rs56245262, rs56083757 and rs7669329), that have previously shown association with KD in cohorts of European descent. Using transcriptome data from pre-treatment whole blood of KD subjects (n = 146), expression quantitative trait loci (eQTL) analyses were performed. Subjects homozygous for the intronic risk allele (A allele of TLR6 rs56245262) had differential expression of Interleukin-6 (IL-6) as a function of genotype (p = 0.0007) and a higher erythrocyte sedimentation rate at diagnosis. TLR6 plays an important role in pathogen-associated molecular pattern recognition, and sequence variations may affect binding affinities that in turn influence KD susceptibility. This integrative genomic approach illustrates how the analysis of WGS in multiplex families with a complex genetic disease allows examination of both the common disease–common variant and common disease–rare variant hypotheses. PMID:28151979

  13. Whole-Genome Sequencing of a Canine Family Trio Reveals a FAM83G Variant Associated with Hereditary Footpad Hyperkeratosis.

    PubMed

    Sayyab, Shumaila; Viluma, Agnese; Bergvall, Kerstin; Brunberg, Emma; Jagannathan, Vidhya; Leeb, Tosso; Andersson, Göran; Bergström, Tomas F

    2016-01-08

    Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10-20 cases and 10-20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted resequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof-of-concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family trio WGS approach has been used successfully to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Using the software packages ANNOVAR and SIFT to functionally annotate coding sequence differences and to predict their functional effect resulted in seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs.
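
    The segregation filter described in this abstract (homozygous in the affected offspring, heterozygous in both healthy parents) can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: the genotype encoding, the record layout, the function names, and the toy variants are all assumptions.

```python
from typing import Dict, List, Tuple

# Genotype as a pair of allele indices: 0 = reference, 1 = alternate.
Genotype = Tuple[int, int]


def is_homozygous_alt(gt: Genotype) -> bool:
    """True when both alleles are the alternate allele."""
    return gt == (1, 1)


def is_heterozygous(gt: Genotype) -> bool:
    """True for one reference and one alternate allele, in either order."""
    return sorted(gt) == [0, 1]


def recessive_trio_candidates(variants: List[Dict]) -> List[str]:
    """Keep variants that fit autosomal recessive inheritance in a trio."""
    return [
        v["id"]
        for v in variants
        if is_homozygous_alt(v["offspring"])
        and is_heterozygous(v["sire"])
        and is_heterozygous(v["dam"])
    ]


# Toy records (hypothetical identifiers and genotypes).
toy_variants = [
    {"id": "var1", "offspring": (1, 1), "sire": (0, 1), "dam": (1, 0)},
    {"id": "var2", "offspring": (1, 1), "sire": (1, 1), "dam": (0, 1)},
    {"id": "var3", "offspring": (0, 1), "sire": (0, 1), "dam": (0, 1)},
]
candidates = recessive_trio_candidates(toy_variants)
print(candidates)
```

    In practice such a filter would run on genotypes parsed from a VCF file and be followed by functional annotation, as the abstract describes.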

  14. The eSNV-detect: a computational system to identify expressed single nucleotide variants from transcriptome sequencing data.

    PubMed

    Tang, Xiaojia; Baheti, Saurabh; Shameer, Khader; Thompson, Kevin J; Wills, Quin; Niu, Nifang; Holcomb, Ilona N; Boutet, Stephane C; Ramakrishnan, Ramesh; Kachergus, Jennifer M; Kocher, Jean-Pierre A; Weinshilboum, Richard M; Wang, Liewei; Thompson, E Aubrey; Kalari, Krishna R

    2014-12-16

    Rapid development of next-generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. There are a number of software pipelines available for calling single nucleotide variants from genomic DNA, but no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed eSNV-Detect, a novel computational system that utilizes data from multiple aligners to call, even at low read depths, and rank variants from RNA-Seq. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole-exome coding data showed 90.6-96.8% precision and 91.6-95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MDA-MB-231 breast cancer cell line delineated variant heterogeneity among the single cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue, validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/.
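
    Precision and sensitivity figures like those above are typically obtained by comparing a call set against a reference ("truth") set. The sketch below uses the standard definitions and is not taken from eSNV-Detect; the call sets and function names are hypothetical, and only the 29-of-31 Sanger count comes from the abstract.

```python
from typing import Set, Tuple

# A variant keyed by (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def compare_call_sets(called: Set[Variant],
                      truth: Set[Variant]) -> Tuple[float, float]:
    """Return (precision, sensitivity) of a call set against a truth set."""
    true_pos = len(called & truth)    # called and present in truth
    false_pos = len(called - truth)   # called but absent from truth
    false_neg = len(truth - called)   # in truth but missed by the caller
    if true_pos + false_pos == 0 or true_pos + false_neg == 0:
        raise ValueError("call set and truth set must both be non-empty")
    precision = true_pos / (true_pos + false_pos)
    sensitivity = true_pos / (true_pos + false_neg)
    return precision, sensitivity


# Hypothetical call sets for illustration only.
called = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 75, "G", "A")}
truth = {("chr1", 100, "A", "G"), ("chr2", 75, "G", "A"),
         ("chr3", 10, "T", "C"), ("chr3", 500, "A", "C")}
precision, sensitivity = compare_call_sets(called, truth)

# Sanger validation rate from the counts in the abstract (29 of 31).
sanger_validation_percent = round(100 * 29 / 31, 1)
print(precision, sensitivity, sanger_validation_percent)
```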

  15. Whole-Genome Sequencing of a Canine Family Trio Reveals a FAM83G Variant Associated with Hereditary Footpad Hyperkeratosis

    PubMed Central

    Sayyab, Shumaila; Viluma, Agnese; Bergvall, Kerstin; Brunberg, Emma; Jagannathan, Vidhya; Leeb, Tosso; Andersson, Göran; Bergström, Tomas F.

    2016-01-01

    Over 250 Mendelian traits and disorders caused by rare alleles have been mapped in the canine genome. Although each disease is rare in the dog as a species, they are collectively common and have a major impact on canine health. With SNP-based genotyping arrays, genome-wide association studies (GWAS) have proven to be a powerful method to map the genomic region of interest when 10–20 cases and 10–20 controls are available. However, to identify the genetic variant in associated regions, fine-mapping and targeted resequencing are required. Here we present a new approach using whole-genome sequencing (WGS) of a family trio without prior GWAS. As a proof-of-concept, we chose an autosomal recessive disease known as hereditary footpad hyperkeratosis (HFH) in Kromfohrländer dogs. To our knowledge, this is the first time this family trio WGS approach has been used successfully to identify a genetic variant that perfectly segregates with a canine disorder. The sequencing of three Kromfohrländer dogs from a family trio (an affected offspring and both its healthy parents) resulted in an average genome coverage of 9.2X per individual. After applying stringent filtering criteria for candidate causative coding variants, 527 single nucleotide variants (SNVs) and 15 indels were found to be homozygous in the affected offspring and heterozygous in the parents. Using the software packages ANNOVAR and SIFT to functionally annotate coding sequence differences and to predict their functional effect resulted in seven candidate variants located in six different genes. Of these, only FAM83G:c.155G>C (p.R52P) was found to be concordant in eight additional cases and 16 healthy Kromfohrländer dogs. PMID:26747202

  16. Clinically Relevant Variants Identified in Thoracic Aortic Aneurysm Patients by Research Exome Sequencing

    PubMed Central

    Schubert, Jeffrey A.; Landis, Benjamin J.; Shikany, Amy R.; Hinton, Robert B.; Ware, Stephanie M.

    2016-01-01

    Thoracic aortic aneurysm (TAA) is a genetically heterogeneous disease involving subclinical and progressive dilation of the thoracic aorta, which can lead to life-threatening complications such as dissection or rupture. Genetic testing is important for risk stratification and identification of at-risk family members, and clinically available genetic testing panels have been expanding rapidly. However, when past testing results are normal, there is little evidence to guide decision-making about the indications and timing to pursue additional clinical genetic testing. Results from research-based genetic testing can help inform this process. Here we present 10 TAA patients who have a family history of disease and who enrolled in research-based exome testing. Nine of these ten patients had previous clinical genetic testing that did not identify the cause of disease. We sought to determine the number of rare variants in 23 known TAA-associated genes identified by research-based exome testing. In total, we found 10 rare variants in six patients. Likely pathogenic variants included a TGFB2 variant in one patient and a SMAD3 variant in another. These variants have been reported previously in individuals with similar phenotypes. Variants of uncertain significance of particular interest included novel variants in MYLK and MFAP5, which were identified in a third patient. In total, clinically reportable rare variants were found in 6/10 (60%) patients, with at least 2/10 (20%) patients having likely pathogenic variants identified. These data indicate that consideration of re-testing is important in TAA patients with previous negative or inconclusive results. PMID:26854089

  17. The Caveolin‐3 G56S sequence variant of unknown significance: Muscle biopsy findings and functional cell biological analysis

    PubMed Central

    Kollipara, Laxmikanth; Zahedi, René P.; Beckmann, Alf; Mohanadas, Nilane; Bauer, Hartmut; Häusler, Martin; Thoma, Stéphanie; Kress, Wolfram; Senderek, Jan; Weis, Joachim

    2016-01-01

    Purpose In the era of next‐generation sequencing, we are increasingly confronted with sequence variants of unknown significance. This phenomenon is also known for variations in Caveolin‐3 and can complicate the molecular diagnosis of the disease. Here, we aimed to study the ambiguous character of the G56S Caveolin‐3 variant. Experimental design A comprehensive approach combining genetic and morphological studies of muscle derived from carriers of the G56S Caveolin‐3 variant was carried out and linked to biochemical assays (including phosphoblot studies and proteome profiling) and morphological investigations of cultured myoblasts. Results Muscles showed moderate chronic myopathic changes in all carriers of the variant. Myogenic RCMH cells expressing the G56S Caveolin‐3 protein presented irregular Caveolin‐3 deposits within the Golgi in addition to a regular localization of the protein to the plasma membrane. This result was associated with abnormal findings on the ultra‐structural level. Phosphoblot studies revealed that G56S affects EGFR‐signaling. Proteomic profiling demonstrated alterations in levels of physiologically relevant proteins which are indicative for antagonization of G56S Caveolin‐3 expression. Remarkably, some proteomic alterations were enhanced by osmotic/mechanical stress. Conclusions and clinical relevance Our studies suggest that G56S might influence the manifestation of myopathic changes upon the presence of additional cellular stress burden. Results of our studies moreover improve the current understanding of (genetic) causes of myopathic disorders classified as caveolinopathies. PMID:27739254

  18. Serotonin (5-HT) receptor 5A sequence variants affect human plasma triglyceride levels

    PubMed Central

    Zhang, Y.; Smith, E. M.; Baye, T. M.; Eckert, J. V.; Abraham, L. J.; Moses, E. K.; Kissebah, A. H.; Martin, L. J.

    2010-01-01

    Neurotransmitters such as serotonin (5-hydroxytryptamine, 5-HT) work closely with leptin and insulin to fine-tune the metabolic and neuroendocrine responses to dietary intake. Losing the sensitivity to excess food intake can lead to obesity, diabetes, and a multitude of behavioral disorders. It is largely unclear how different serotonin receptor subtypes respond to and integrate metabolic signals and which genetic variations in these receptor genes lead to individual differences in susceptibility to metabolic disorders. In an obese cohort of families of Northern European descent (n = 2,209), the serotonin type 5A receptor gene, HTR5A, was identified as a prominent factor affecting plasma levels of triglycerides (TG), supported by our data from both genome-wide linkage and targeted association analyses using 28 publicly available and 12 newly discovered single nucleotide polymorphisms (SNPs), of which 3 were strongly associated with plasma TG levels (P < 0.00125). Bayesian quantitative trait nucleotide (BQTN) analysis identified a putative causal promoter SNP (rs3734967) with substantial posterior probability (P = 0.59). Functional analysis of rs3734967 by electrophoretic mobility shift assay (EMSA) showed distinct binding patterns of the two alleles of this SNP with nuclear proteins from glioma cell lines. In conclusion, sequence variants in HTR5A are strongly associated with high plasma levels of TG in a Northern European population, suggesting a novel role of the serotonin receptor system in humans. This suggests a potential brain-specific regulation of plasma TG levels, possibly by alteration of the expression of HTR5A. PMID:20388841

  19. CBH1 homologs and variant CBH1 cellulases

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2008-11-18

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  20. CBH1 homologs and variant CBH1 cellulases

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2011-05-31

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  1. Next generation sequencing to identify novel genetic variants causative of autosomal dominant familial hypercholesterolemia associated with increased risk of coronary heart disease.

    PubMed

    Al-Allaf, Faisal A; Athar, Mohammad; Abduljaleel, Zainularifeen; Taher, Mohiuddin M; Khan, Wajahatullah; Ba-Hammam, Faisal A; Abalkhail, Hala; Alashwal, Abdullah

    2015-07-01

    Familial hypercholesterolemia (FH) is an autosomal dominant inherited disease characterized by elevated plasma low-density lipoprotein cholesterol (LDL-C). It is caused by variants in Ldlr, ApoB or Pcsk9, which result in high levels of LDL-C leading to early coronary heart disease. Whole-genome sequencing is not suitable for screening FH variants because of its high cost. Hence, in this study we performed targeted customized sequencing of 12 FH genes (Ldlr, ApoB, Pcsk9, Abca1, Apoa2, Apoc3, Apon2, Arh, Ldlrap1, Apoc2, ApoE, and Lpl) that have been implicated in the homozygous phenotype of a proband pedigree to identify candidate variants by NGS on the Ion Torrent PGM. Only three genes (Ldlr, ApoB, and Pcsk9) were found to be highly associated with FH based on the variant rate. The results showed that seven deleterious variants in the Ldlr, ApoB, and Pcsk9 genes were pathological and clinically significant based on SIFT and PolyPhen predictions. Targeted customized sequencing is an efficient technique for screening variants among targeted FH genes. Final validation of the seven deleterious variants by capillary sequencing confirmed only one novel variant, located in exon 14 of the Ldlr gene (c.2026delG, p. Gly676fs). The variant found in the Ldlr gene was a novel heterozygous variant derived from a male in the proband.

  2. Amino Acid Sequence of Human Cholinesterase

    DTIC Science & Technology

    1985-10-01

    liquid chromatography (HPLC). Activity testing of the aged, DFP-labeled cholinesterase showed that 99.8% of the active sites had been labeled, since...acids were quantitated by ninhydrin at the AAA Labs, or by derivatization with phenylisothiocyanate at the University of Michigan. The latter method

  3. Complete Genome Sequence of a Sapporo Virus GV.2 Variant from a 2016 Outbreak of Gastroenteritis in Sweden

    PubMed Central

    Hallström, Björn; Lagerqvist, Nina; Lind-Karlberg, Maria; Helgesson, Sofia; Follin, Per; Hergens, Maria-Pia; Nederby-Öhd, Joanna; Tolfvenstam, Thomas

    2017-01-01

    During an outbreak of acute gastroenteritis in Sweden when laboratory routine diagnostics failed to detect a causative agent, Sapporo virus was detected in stool specimens using electron microscopy (M.-P. Hergens, J. Nederby Öhd, E. Alm, H. Hervius Askling, S. Helgesson, M. Insulander, N. Lagerkvist, B. Svennungsson, M. Tihane, T. Tolfvenstam, P. Follin, unpublished data). Whole-genome sequencing revealed a Sapporo virus variant clustering with genogroup V. PMID:28153884

  4. Complete Genome Sequence of a Sapporo Virus GV.2 Variant from a 2016 Outbreak of Gastroenteritis in Sweden.

    PubMed

    Hallström, Björn; Lagerqvist, Nina; Lind-Karlberg, Maria; Helgesson, Sofia; Follin, Per; Hergens, Maria-Pia; Nederby-Öhd, Joanna; Tolfvenstam, Thomas; Alm, Erik

    2017-02-02

    During an outbreak of acute gastroenteritis in Sweden when laboratory routine diagnostics failed to detect a causative agent, Sapporo virus was detected in stool specimens using electron microscopy (M.-P. Hergens, J. Nederby Öhd, E. Alm, H. Hervius Askling, S. Helgesson, M. Insulander, N. Lagerkvist, B. Svennungsson, M. Tihane, T. Tolfvenstam, P. Follin, unpublished data). Whole-genome sequencing revealed a Sapporo virus variant clustering with genogroup V.

  5. Identification of rare DNA sequence variants in high-risk autism families and their prevalence in a large case/control population

    PubMed Central

    2014-01-01

    Background Genetics clearly plays a major role in the etiology of autism spectrum disorders (ASDs), but studies to date are only beginning to characterize the causal genetic variants responsible. Until recently, studies using multiple extended multi-generation families to identify ASD risk genes had not been undertaken. Methods We identified haplotypes shared among individuals with ASDs in large multiplex families, followed by targeted DNA capture and sequencing to identify potential causal variants. We also assayed the prevalence of the identified variants in a large ASD case/control population. Results We identified 584 non-conservative missense, nonsense, frameshift and splice site variants that might predispose to autism in our high-risk families. Eleven of these variants were observed to have odds ratios greater than 1.5 in a set of 1,541 unrelated children with autism and 5,785 controls. Three variants, in the RAB11FIP5, ABP1, and JMJD7-PLA2G4B genes, each were observed in a single case and not in any controls. These variants also were not seen in public sequence databases, suggesting that they may be rare causal ASD variants. Twenty-eight additional rare variants were observed only in high-risk ASD families. Collectively, these 39 variants identify 36 genes as ASD risk genes. Segregation of sequence variants and of copy number variants previously detected in these families reveals a complex pattern, with only a RAB11FIP5 variant segregating to all affected individuals in one two-generation pedigree. Some affected individuals were found to have multiple potential risk alleles, including sequence variants and copy number variants (CNVs), suggesting that the high incidence of autism in these families could be best explained by variants at multiple loci. Conclusions Our study is the first to use haplotype sharing to identify familial ASD risk loci. In total, we identified 39 variants in 36 genes that may confer a genetic risk of developing autism. The
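
    The case/control comparison described in this abstract (variants with odds ratios above 1.5 among 1,541 cases and 5,785 controls) can be illustrated with a minimal Python sketch. The group sizes come from the abstract; the carrier counts and the function name `odds_ratio` are hypothetical, and the 0.5 continuity correction for empty cells is a common convention, not necessarily what the authors used.

```python
def odds_ratio(case_carriers, n_cases, control_carriers, n_controls):
    """Odds ratio for carrying a variant in cases versus controls.

    If any cell of the 2x2 table is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) so that a variant seen only in cases
    still yields a finite estimate. This correction is an assumption of
    the sketch, not a method taken from the abstract.
    """
    a = case_carriers                   # cases carrying the variant
    b = n_cases - case_carriers         # cases without the variant
    c = control_carriers                # controls carrying the variant
    d = n_controls - control_carriers   # controls without the variant
    if min(a, b, c, d) < 0:
        raise ValueError("carrier counts must lie between 0 and the group size")
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


# Group sizes from the abstract; the carrier counts below are hypothetical.
N_CASES, N_CONTROLS = 1541, 5785
example_or = odds_ratio(4, N_CASES, 8, N_CONTROLS)
single_case_or = odds_ratio(1, N_CASES, 0, N_CONTROLS)
```

    With these made-up counts, `example_or` exceeds the 1.5 cutoff mentioned in the abstract. `single_case_or` shows why a variant seen in one case and no controls needs a correction, or an exact test, before its odds ratio can be interpreted.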

  6. Cystatin. Amino acid sequence and possible secondary structure.

    PubMed Central

    Schwabe, C; Anastasi, A; Crow, H; McDonald, J K; Barrett, A J

    1984-01-01

    The amino acid sequence of cystatin, the protein from chicken egg-white that is a tight-binding inhibitor of many cysteine proteinases, is reported. Cystatin is composed of 116 amino acid residues, and the Mr is calculated to be 13 143. No striking similarity to any other known sequence has been detected. The results of computer analysis of the sequence and c.d. spectrometry indicate that the secondary structure includes relatively little alpha-helix (about 20%) and that the remainder is mainly beta-structure. PMID:6712597

  7. Construction and Application of Variants of the Pseudomonas fluorescens EBC191 Arylacetonitrilase for Increased Production of Acids or Amides

    PubMed Central

    Sosedov, Olga; Baum, Stefanie; Bürger, Sibylle; Matzer, Kathrin; Kiziak, Christoph; Stolz, Andreas

    2010-01-01

    The arylacetonitrilase from Pseudomonas fluorescens EBC191 differs from previously studied arylacetonitrilases by its low enantiospecificity during the turnover of mandelonitrile and by the large amounts of amides that are formed in the course of this reaction. In the sequence of the nitrilase from P. fluorescens, a cysteine residue (Cys163) is present in direct neighborhood (toward the amino terminus) to the catalytic active cysteine residue, which is rather unique among bacterial nitrilases. Therefore, this cysteine residue was exchanged in the nitrilase from P. fluorescens EBC191 for various amino acid residues which are present in other nitrilases at the homologous position. The influence of these mutations on the reaction specificity and enantiospecificity was analyzed with (R,S)-mandelonitrile and (R,S)-2-phenylpropionitrile as substrates. The mutants obtained demonstrated significant differences in their amide-forming capacities. The exchange of Cys163 for asparagine or glutamine residues resulted in significantly increased amounts of amides formed. In contrast, a substitution for alanine or serine residues decreased the amounts of amides formed. The newly discovered mutation was combined with previously identified mutations which also resulted in increased amide formation. Thus, variants which possessed in addition to the mutation Cys163Asn also a deletion at the C terminus of the enzyme and/or the modification Ala165Arg were constructed. These constructs demonstrated increased amide formation capacity in comparison to the mutants carrying only single mutations. The recombinant plasmids that encoded enzyme variants which formed large amounts of mandeloamide or that formed almost stoichiometric amounts of mandelic acid from mandelonitrile were used to transform Escherichia coli strains that expressed a plant-derived (S)-hydroxynitrile lyase. The whole-cell biocatalysts obtained in this way converted benzaldehyde plus cyanide either to (S)-mandeloamide or (S

  8. Antagonistic lactic acid bacteria isolated from goat milk and identification of a novel nisin variant Lactococcus lactis

    PubMed Central

    2014-01-01

    Background The raw goat milk microbiota is considered a good source of novel bacteriocinogenic lactic acid bacteria (LAB) strains that can be exploited as an alternative for use as biopreservatives in foods. The constant demand for such alternative tools justifies studies that investigate the antimicrobial potential of such strains. Results The obtained data identified a predominance of Lactococcus and Enterococcus strains in raw goat milk microbiota with antimicrobial activity against Listeria monocytogenes ATCC 7644. Enzymatic assays confirmed the bacteriocinogenic nature of the antimicrobial substances produced by the isolated strains, and PCR reactions detected a variety of bacteriocin-related genes in their genomes. Rep-PCR identified broad genetic variability among the Enterococcus isolates, and close relations between the Lactococcus strains. The sequencing of PCR products from nis-positive Lactococcus allowed the identification of a predicted nisin variant not previously described and possessing a wide inhibitory spectrum. Conclusions Raw goat milk was confirmed as a good source of novel bacteriocinogenic LAB strains, having identified Lactococcus isolates possessing variations in their genomes that suggest the production of a nisin variant not yet described and with potential for use as biopreservatives in food due to its broad spectrum of action. PMID:24521354

  9. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
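
    The family definition used in this abstract (members sharing greater than 80% overall nucleic acid sequence similarity) can be sketched in Python as follows. This is an illustration only: the abstract does not describe the authors' actual procedure, the sequences are assumed to be pre-aligned to equal length, similarity is approximated as percent identity over ungapped columns, and the toy sequences and names (`Vk_a`, `Vk_b`, `Vk_c`) are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns, ignoring columns with a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def group_families(seqs, threshold=80.0):
    """Single-linkage grouping: join any pair with identity above threshold."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in seqs:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


# Hypothetical 10-nt toy sequences, not real Vk genes.
toy = {
    "Vk_a": "ACGTACGTAC",
    "Vk_b": "ACGTACGTAT",
    "Vk_c": "TTGCATGCAA",
}
families = group_families(toy)
```

    Single linkage is a simplification: two members of one group may fall below the threshold if a third sequence links them. The abstract's observation that members of different families can be very similar over large sequence portions suggests that such borderline cases occur in practice.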

  10. Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification.

    PubMed

    Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc'h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

    2017-01-01

    Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with De Novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies.

  11. Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification

    PubMed Central

    Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc’h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

    2017-01-01

    Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with De Novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies. PMID:28362878
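
    The idea of using position-specific nucleotide variation from NGS reads to flag non-clonal amplicons can be sketched in Python as below. The abstract does not state the cutoff used to call an amplicon clonal, so `threshold` is a free, hypothetical parameter; the 15% figure in the abstract refers to the level below which Sanger electropherograms showed no double peaks, not to a classification rule. The pileup format (one dictionary of base counts per position) and all function names are assumptions made for this illustration.

```python
def minor_fraction(counts):
    """Fraction of reads at one position that do not carry the major base."""
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no coverage: nothing to report at this position
    return 1.0 - max(counts.values()) / total


def variable_positions(pileup, threshold):
    """Indices of positions whose minor-base fraction exceeds the threshold."""
    return [i for i, counts in enumerate(pileup)
            if minor_fraction(counts) > threshold]


def is_clonal(pileup, threshold):
    """Call an amplicon clonal when no position exceeds the threshold."""
    return not variable_positions(pileup, threshold)


# Hypothetical per-position base counts for two amplicons.
mixed_amplicon = [{"A": 100}, {"C": 80, "T": 20}, {"G": 99, "A": 1}]
clean_amplicon = [{"A": 100}, {"G": 99, "A": 1}]

flagged = variable_positions(mixed_amplicon, threshold=0.05)
```

    With the hypothetical 5% threshold, `flagged` contains only the second position of `mixed_amplicon` (index 1), where 20% of reads carry the minor base, while `clean_amplicon` is called clonal because its 1% minor fraction is treated as sequencing noise.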

  12. High-Throughput Sequencing of mGluR Signaling Pathway Genes Reveals Enrichment of Rare Variants in Autism

    PubMed Central

    Hovhannisyan, Hayk; Trautman, Edwin; Pinard, Robert; Rathmell, Barbara; Carpenter, Randall; Margulies, David

    2012-01-01

    Identification of common molecular pathways affected by genetic variation in autism is important for understanding disease pathogenesis and devising effective therapies. Here, we test the hypothesis that rare genetic variation in the metabotropic glutamate-receptor (mGluR) signaling pathway contributes to autism susceptibility. Single-nucleotide variants in genes encoding components of the mGluR signaling pathway were identified by high-throughput multiplex sequencing of pooled samples from 290 non-syndromic autism cases and 300 ethnically matched controls on two independent next-generation platforms. This analysis revealed significant enrichment of rare functional variants in the mGluR pathway in autism cases. Higher burdens of rare, potentially deleterious variants were identified in autism cases for three pathway genes previously implicated in syndromic autism spectrum disorder, TSC1, TSC2, and SHANK3, suggesting that genetic variation in these genes also contributes to risk for non-syndromic autism. In addition, our analysis identified HOMER1, which encodes a postsynaptic density-localized scaffolding protein that interacts with Shank3 to regulate mGluR activity, as a novel autism-risk gene. Rare, potentially deleterious HOMER1 variants identified uniquely in the autism population affected functionally important protein regions or regulatory sequences and co-segregated closely with autism among children of affected families. We also identified rare ASD-associated coding variants predicted to have damaging effects on components of the Ras/MAPK cascade. Collectively, these findings suggest that altered signaling downstream of mGluRs contributes to the pathogenesis of non-syndromic autism. PMID:22558107

  13. Thiostrepton Variants Containing a Contracted Quinaldic Acid Macrocycle Result from Mutagenesis of the Second Residue

    PubMed Central

    Zhang, Feifei; Li, Chaoxuan

    2016-01-01

    The thiopeptides are a family of ribosomally synthesized and posttranslationally modified peptide metabolites, and the vast majority of thiopeptides characterized to date possess one highly modified macrocycle. A few members, including thiostrepton A, harbor a second macrocycle that incorporates a quinaldic acid moiety and the four N-terminal residues of the peptide. The antibacterial properties of thiostrepton A are well established, and its recently discovered ability to inhibit the proteasome has additional implications for the development of antimalarial and anticancer therapeutics. We have conducted the saturation mutagenesis of Ala2 in the precursor peptide, TsrA, to examine which variants can be transformed into a mature thiostrepton analogue. Although the thiostrepton biosynthetic system is somewhat restrictive towards substitutions at the second residue, eight thiostrepton Ala2 analogues were isolated. The TsrA Ala2Ile and Ala2Val variants were largely channeled through an alternate processing pathway wherein the first residue of the core peptide, Ile1, is removed and the resulting thiostrepton analogues bear quinaldic acid macrocycles abridged by one residue. This is the first report revealing that quinaldic acid loop size is amenable to alteration during the course of thiostrepton biosynthesis. Both the antibacterial and proteasome inhibitory properties of the thiostrepton Ala2 analogues were examined. While the identity of the residue at the second position of the core peptide influences thiostrepton biosynthesis, our report suggests it may not be crucial for antibacterial and proteasome inhibitory properties of the full-length variants. In contrast, the contracted quinaldic acid loop can, to differing degrees, affect both types of biological activity. PMID:26630475

  14. Multiple site-selective insertions of non-canonical amino acids into sequence-repetitive polypeptides

    PubMed Central

    Wu, I-Lin; Patterson, Melissa A.; Carpenter Desai, Holly E.; Mehl, Ryan A.; Giorgi, Gianluca

    2013-01-01

    A simple and efficient method is described for introduction of non-canonical amino acids at multiple, structurally defined sites within recombinant polypeptide sequences. E. coli MRA30, a bacterial host strain with attenuated activity for release factor 1 (RF1), is assessed for its ability to support the incorporation of a diverse range of non-canonical amino acids in response to multiple encoded amber (TAG) codons within genetic templates derived from superfolder GFP and an elastin-mimetic protein polymer. Suppression efficiency and isolated protein yield were observed to depend on the identity of the orthogonal aminoacyl-tRNA synthetase/tRNACUA pair and the non-canonical amino acid substrate. This approach afforded elastin-mimetic protein polymers containing non-canonical amino acid derivatives at up to twenty-two positions within the repeat sequence with high levels of substitution. The identity and position of the variant residues was confirmed by mass spectrometric analysis of the full-length polypeptides and proteolytic cleavage fragments resulting from thermolysin digestion. The accumulated data suggest that this multi-site suppression approach permits the preparation of protein-based materials in which novel chemical functionality can be introduced at precisely defined positions within the polypeptide sequence. PMID:23625817

  15. Exome sequencing and genome-wide linkage analysis in 17 families illustrate the complex contribution of TTN truncating variants to dilated cardiomyopathy.

    PubMed

    Norton, Nadine; Li, Duanxiang; Rampersaud, Evadnie; Morales, Ana; Martin, Eden R; Zuchner, Stephan; Guo, Shengru; Gonzalez, Michael; Hedges, Dale J; Robertson, Peggy D; Krumm, Niklas; Nickerson, Deborah A; Hershberger, Ray E

    2013-04-01

    BACKGROUND- Familial dilated cardiomyopathy (DCM) is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic DCM cases. METHODS AND RESULTS- We used an unbiased genome-wide approach using both linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families to identify genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak, multipoint logarithm of odds, 1.59. We identified 6 TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (logarithm of odds, 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable with 5 other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ≈5400 cases from the Exome Sequencing Project was ≈23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity logarithm of odds score of 1.74. CONCLUSIONS- These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.

  16. BRCA1 mutations and other sequence variants in a population-based sample of Australian women with breast cancer

    PubMed Central

    Southey, M C; Tesoriero, A A; Andersen, C R; Jennings, K M; Brown, S M; Dite, G S; Jenkins, M A; Osborne, R H; Maskiell, J A; Porter, L; Giles, G G; McCredie, M R E; Hopper, J L; Venter, D J

    1999-01-01

    The frequency, in women with breast cancer, of mutations and other variants in the susceptibility gene, BRCA1, was investigated using a population-based case–control-family study. Cases were women living in Melbourne or Sydney, Australia, with histologically confirmed, first primary, invasive breast cancer, diagnosed before the age of 40 years, recorded on the state Cancer Registries. Controls were women without breast cancer, frequency-matched for age, randomly selected from electoral rolls. Full manual sequencing of the coding region of BRCA1 was conducted in a randomly stratified sample of 91 cases; 47 with, and 44 without, a family history of breast cancer in a first- or second-degree relative. All detected variants were tested in a random sample of 67 controls. Three cases with a (protein-truncating) mutation were detected. Only one case had a family history; her mother had breast cancer, but did not carry the mutation. The proportion of Australian women with breast cancer before age 40 who carry a germline mutation in BRCA1 was estimated to be 3.8% (95% CI 0.3–12.6%). Seven rare variants were also detected, but for none was there evidence of a strong effect on breast cancer susceptibility. Therefore, on a population basis, rare variants are likely to contribute little to breast cancer incidence. © 1999 Cancer Research Campaign PMID:10408690

  17. Characterization of Genomic Variants Associated with Scout and Recruit Behavioral Castes in Honey Bees Using Whole-Genome Sequencing

    PubMed Central

    Southey, Bruce R.; Zhu, Ping; Carr-Markell, Morgan K.; Liang, Zhengzheng S.; Zayed, Amro; Li, Ruiqiang; Robinson, Gene E.; Rodriguez-Zas, Sandra L.

    2016-01-01

    Among forager honey bees, scouts seek new resources and return to the colony, enlisting recruits to collect these resources. Differentially expressed genes between these behaviors and genetic variability in scouting phenotypes have been reported. Whole-genome sequencing of 44 Apis mellifera scouts and recruits was undertaken to detect variants and further understand the genetic architecture underlying the behavioral differences between scouts and recruits. The median coverage depth in recruits and scouts was 10.01 and 10.7 X, respectively. Representation of bacterial species among the unmapped reads reflected a more diverse microbiome in scouts than recruits. Overall, 1,412,705 polymorphic positions were analyzed for associations with scouting behavior, and 212 significant (p-value < 0.0001) associations with scouting corresponding to 137 positions were detected. Most frequent putative transcription factor binding sites proximal to significant variants included Broad-complex 4, Broad-complex 1, Hunchback, and CF2-II. Three variants associated with scouting were located within coding regions of ncRNAs including one codon change (LOC102653644) and 2 frameshift indels (LOC102654879 and LOC102655256). Significant variants were also identified on the 5’UTR of membrin, and 3’UTRs of laccase 2 and diacylglycerol kinase theta. The 60 significant variants located within introns corresponded to 39 genes and most of these positions were > 1000 bp apart from each other. A number of these variants were mapped to ncRNA LOC100578102, solute carrier family 12 member 6-like gene, and LOC100576965 (meprin and TRAF-C homology domain containing gene). Functional categories represented among the genes corresponding to significant variants included: neuronal function, exoskeleton, immune response, salivary gland development, and enzymatic food processing. These categories offer a glimpse into the molecular support to the behaviors of scouts and recruits. The level of association

  18. Molecular modeling and molecular dynamic simulation of the effects of variants in the TGFBR2 kinase domain as a paradigm for interpretation of variants obtained by next generation sequencing

    PubMed Central

    Urrutia, Raul; Oliver, Gavin R.; Cousin, Margot A.; Bozeck, Nicole J.; Klee, Eric W.

    2017-01-01

    Variants in the TGFBR2 kinase domain cause several human diseases and can increase propensity for cancer. The widespread application of next generation sequencing within the setting of Individualized Medicine (IM) is increasing the rate at which TGFBR2 kinase domain variants are being identified. However, their clinical relevance is often uncertain. Consequently, we sought to evaluate the use of molecular modeling and molecular dynamics (MD) simulations for assessing the potential impact of variants within this domain. We documented the structural differences revealed by these models across 57 variants using independent MD simulations for each. Our simulations revealed various mechanisms by which variants may lead to functional alteration; some are revealed energetically, while others structurally or dynamically. We found that the ATP binding site and activation loop dynamics may be affected by variants at positions throughout the structure. This prediction cannot be made from the linear sequence alone. We present our structure-based analyses alongside those obtained using several commonly used genomics-based predictive algorithms. We believe the further mechanistic information revealed by molecular modeling will be useful in guiding the examination of clinically observed variants throughout the exome, as well as those likely to be discovered in the near future by clinical tests leveraging next-generation sequencing through IM efforts. PMID:28182693

  19. Exome-wide association analysis reveals novel coding sequence variants associated with lipid traits in Chinese

    PubMed Central

    Tang, Clara S.; Zhang, He; Cheung, Chloe Y. Y.; Xu, Ming; Ho, Jenny C. Y.; Zhou, Wei; Cherny, Stacey S.; Zhang, Yan; Holmen, Oddgeir; Au, Ka-Wing; Yu, Haiyi; Xu, Lin; Jia, Jia; Porsch, Robert M.; Sun, Lijie; Xu, Weixian; Zheng, Huiping; Wong, Lai-Yung; Mu, Yiming; Dou, Jingtao; Fong, Carol H. Y.; Wang, Shuyu; Hong, Xueyu; Dong, Liguang; Liao, Yanhua; Wang, Jiansong; Lam, Levina S. M.; Su, Xi; Yan, Hua; Yang, Min-Lee; Chen, Jin; Siu, Chung-Wah; Xie, Gaoqiang; Woo, Yu-Cho; Wu, Yangfeng; Tan, Kathryn C. B.; Hveem, Kristian; Cheung, Bernard M. Y.; Zöllner, Sebastian; Xu, Aimin; Eugene Chen, Y; Jiang, Chao Qiang; Zhang, Youyi; Lam, Tai-Hing; Ganesh, Santhi K.; Huo, Yong; Sham, Pak C.; Lam, Karen S. L.; Willer, Cristen J.; Tse, Hung-Fat; Gao, Wei

    2015-01-01

    Blood lipids are important risk factors for coronary artery disease (CAD). Here we perform an exome-wide association study by genotyping 12,685 Chinese, using a custom Illumina HumanExome BeadChip, to identify additional loci influencing lipid levels. Single-variant association analysis on 65,671 single nucleotide polymorphisms reveals 19 loci associated with lipids at exome-wide significance (P<2.69 × 10−7), including three Asian-specific coding variants in known genes (CETP p.Asp459Gly, PCSK9 p.Arg93Cys and LDLR p.Arg257Trp). Furthermore, missense variants at two novel loci—PNPLA3 p.Ile148Met and PKD1L3 p.Thr429Ser—also influence levels of triglycerides and low-density lipoprotein cholesterol, respectively. Another novel gene, TEAD2, is found to be associated with high-density lipoprotein cholesterol through gene-based association analysis. Most of these newly identified coding variants show suggestive association (P<0.05) with CAD. These findings demonstrate that exome-wide genotyping on samples of non-European ancestry can identify additional population-specific possible causal variants, shedding light on novel lipid biology and CAD. PMID:26690388

  20. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

    PubMed Central

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D.; Adir, Noam

    2016-01-01

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
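The triplet analysis described in this abstract can be illustrated with a minimal sketch. The abstract does not say how "expected frequencies" were computed, so the sketch assumes a simple composition-only null model in which residues are independent. The function names and the `ratio` and `min_expected` thresholds are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def triplet_counts(proteins):
    """Count every overlapping amino acid triplet across the sequences."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - 2):
            counts[seq[i:i + 3]] += 1
    return counts


def expected_counts(proteins):
    """Expected triplet counts if residues were independent (assumed null model)."""
    residue_counts = Counter("".join(proteins))
    total = sum(residue_counts.values())
    freq = {aa: residue_counts[aa] / total for aa in AMINO_ACIDS}
    n_windows = sum(max(len(seq) - 2, 0) for seq in proteins)
    return {a + b + c: n_windows * freq[a] * freq[b] * freq[c]
            for a, b, c in product(AMINO_ACIDS, repeat=3)}


def underrepresented(proteins, ratio=0.1, min_expected=5.0):
    """Triplets observed far less often than the null model predicts.

    ratio and min_expected are illustrative cut-offs, not values from the study.
    """
    observed = triplet_counts(proteins)
    expected = expected_counts(proteins)
    return sorted(t for t, e in expected.items()
                  if e >= min_expected and observed[t] < ratio * e)
```

With 20 residues there are 20^3 = 8,000 possible triplets, which is why `expected_counts` always returns 8,000 entries. A real analysis would need a proper statistical test and a whole proteome as input rather than a fixed ratio cut-off.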

  1. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.

  2. Characterization of the Two Intra-Individual Sequence Variants in the 18S rRNA Gene in the Plant Parasitic Nematode, Rotylenchulus reniformis

    PubMed Central

    Nyaku, Seloame T.; Sripathi, Venkateswara R.; Kantety, Ramesh V.; Gu, Yong Q.; Lawrence, Kathy; Sharma, Govind C.

    2013-01-01

    The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and because of its stable persistence through generations it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. It has more frequently been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis, labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report of the identification of two major variants of the 18S rRNA gene in the same single RN, and documents the specific base variation between the two variants, and hypothesizes on simultaneous co-existence of these two variants for this gene. PMID:23593343

  3. Enhanced Acid Tolerance in Bifidobacterium longum by Adaptive Evolution: Comparison of the Genes between the Acid-Resistant Variant and Wild-Type Strain.

    PubMed

    Jiang, Yunyun; Ren, Fazheng; Liu, Songling; Zhao, Liang; Guo, Huiyuan; Hou, Caiyun

    2016-03-01

    Acid stress can affect the viability of probiotics, especially Bifidobacterium. This study aimed to improve the acid tolerance of Bifidobacterium longum BBMN68 using adaptive evolution. The stress response, and genomic differences of the parental strain and the variant strain were compared by acid stress. The highest acid-resistant mutant strain (BBMN68m) was isolated from more than 100 asexual lines, which were adaptive to the acid stress for 10th, 20th, 30th, 40th, and 50th repeats, respectively. The variant strain showed a significant increase in acid tolerance under conditions of pH 2.5 for 2 h (from 7.92 to 4.44 log CFU/ml) compared with the wildtype strain (WT, from 7.87 to 0 log CFU/ml). The surface of the variant strain was also smoother. Comparative whole-genome analysis showed that the galactosyl transferase D gene (cpsD, bbmn68_1012), a key gene involved in exopolysaccharide (EPS) synthesis, was altered by two nucleotides in the mutant, causing alteration in amino acids, pI (from 8.94 to 9.19), and predicted protein structure. Meanwhile, cpsD expression and EPS production were also reduced in the variant strain (p < 0.05) compared with WT, and the exogenous WT-EPS in the variant strain reduced its acid-resistant ability. These results suggested EPS was related to acid responses of BBMN68.

  4. FamSeq: a variant calling program for family-based sequencing data using graphics processing units.

    PubMed

    Peng, Gang; Fan, Yu; Wang, Wenyi

    2014-10-01

    Various algorithms have been developed for variant calling using next-generation sequencing data, and various methods have been applied to reduce the associated false positive and false negative rates. Few variant calling programs, however, utilize the pedigree information when the family-based sequencing data are available. Here, we present a program, FamSeq, which reduces both false positive and false negative rates by incorporating the pedigree information from the Mendelian genetic model into variant calling. To accommodate variations in data complexity, FamSeq consists of four distinct implementations of the Mendelian genetic model: the Bayesian network algorithm, a graphics processing unit version of the Bayesian network algorithm, the Elston-Stewart algorithm and the Markov chain Monte Carlo algorithm. To make the software efficient and applicable to large families, we parallelized the Bayesian network algorithm that copes with pedigrees with inbreeding loops without losing calculation precision on an NVIDIA graphics processing unit. In order to compare the difference in the four methods, we applied FamSeq to pedigree sequencing data with family sizes that varied from 7 to 12. When there is no inbreeding loop in the pedigree, the Elston-Stewart algorithm gives analytical results in a short time. If there are inbreeding loops in the pedigree, we recommend the Bayesian network method, which provides exact answers. To improve the computing speed of the Bayesian network method, we parallelized the computation on a graphics processing unit. This allowed the Bayesian network method to process the whole genome sequencing data of a family of 12 individuals within two days, which was a 10-fold time reduction compared to the time required for this computation on a central processing unit.
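The idea of folding pedigree information into variant calling can be sketched for the simplest case, a father-mother-child trio. This is not FamSeq's code or any of its four algorithms. It is a brute-force enumeration under stated assumptions: a biallelic site, Hardy-Weinberg priors for the parents, Mendelian transmission with no de novo mutation, and per-sample genotype likelihoods supplied by the caller. All names and the default `alt_freq` are hypothetical.

```python
from itertools import product

GENOTYPES = (0, 1, 2)  # number of alternate alleles carried


def transmission(child, father, mother):
    """P(child genotype | parental genotypes) under Mendelian inheritance."""
    pf, pm = father / 2.0, mother / 2.0  # chance each parent passes the alt allele
    probs = {0: (1 - pf) * (1 - pm),
             1: pf * (1 - pm) + (1 - pf) * pm,
             2: pf * pm}
    return probs[child]


def hardy_weinberg_prior(alt_freq):
    """Genotype prior for an unrelated founder (assumed, not from the abstract)."""
    q = alt_freq
    return {0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q ** 2}


def child_posterior(lik_father, lik_mother, lik_child, alt_freq=0.01):
    """Posterior over the child's genotype, summing over both parents' genotypes."""
    prior = hardy_weinberg_prior(alt_freq)
    joint = {g: 0.0 for g in GENOTYPES}
    for f, m, c in product(GENOTYPES, repeat=3):
        joint[c] += (prior[f] * lik_father[f] * prior[m] * lik_mother[m]
                     * transmission(c, f, m) * lik_child[c])
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("likelihoods are incompatible with Mendelian inheritance")
    return {g: p / total for g, p in joint.items()}
```

For example, if both parents are confidently homozygous reference and the child's reads are ambiguous between genotypes 0 and 1, the posterior puts all weight on genotype 0, which is how pedigree information can suppress a false positive call. Enumeration over all genotype combinations grows exponentially with family size, which is one reason larger pedigrees call for algorithms such as those named in the abstract.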

  5. Candidate genes for congenital diaphragmatic hernia from animal models: sequencing of fog2 and pdgfra reveals rare variants in diaphragmatic hernia patients

    SciTech Connect

    Bleyl, S.B.; Moshrefi, A.; Shaw, G.M.; Saijoh, Y.; Schoenwolf, G.C.; Pennacchio, L.A.; Slavotinek, A.M.

    2007-05-11

    Congenital diaphragmatic hernia (CDH) is a common, life-threatening birth defect. Although there is strong evidence implicating genetic factors in its pathogenesis, few causative genes have been identified, and in isolated CDH, only one de novo, nonsense mutation has been reported in FOG2 in a female with posterior diaphragmatic eventration. We report here that the homozygous null mouse for the Pdgfra gene has posterolateral diaphragmatic defects and thus is a model for human CDH. We hypothesized that mutations in this gene could cause human CDH. We sequenced PDGFRa and FOG2 in 96 patients with CDH, of which 53 had isolated CDH (55.2 percent), 36 had CDH and additional anomalies (37.5 percent), and 7 had CDH and known chromosome aberrations (7.3 percent). For FOG2, we identified novel sequence alterations predicting p.M703L and p.T843A in two patients with isolated CDH that were absent in 526 and 564 control chromosomes respectively. These altered amino acids were highly conserved. However, due to the lack of available parental DNA samples we were not able to determine if the sequence alterations were de novo. For PDGFRa, we found a single variant predicting p.L967V in a patient with CDH and multiple anomalies that was absent in 768 control chromosomes. This patient also had one cell with trisomy 15 on skin fibroblast culture, a finding of uncertain significance. Although our study identified sequence variants in FOG2 and PDGFRa, we have not definitively established the variants as mutations and we found no evidence that CDH commonly results from mutations in these genes.

  6. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins from Leptospira serovar pomona. In order to obtain internal fragments for protein sequencing, enzymatic and chemical digestion was performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A fragment of 24 kDa that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A fragment with a molecular weight of 20 kDa, yielding a sequence of twenty amino acids, was obtained from the 40 kDa protein.

  7. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    DOE PAGES

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; ...

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  8. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    SciTech Connect

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; Giorgi, Elena; Bhattacharya, Tanmoy; Gnanakaran, S.; Lapedes, Alan S.; Learn, Gerald H.; Kreider, Edward F.; Li, Yingying; Shaw, George M.; Hahn, Beatrice H.; Montefiori, David C.; Alam, S. Munir; Bonsignori, Mattia; Moody, M. Anthony; Liao, Hua-Xin; Gao, Feng; Haynes, Barton

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  9. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) Identifies Immune-Selected HIV Variants

    PubMed Central

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; Giorgi, Elena E.; Bhattacharya, Tanmoy; Gnanakaran, S.; Lapedes, Alan S.; Learn, Gerald H.; Kreider, Edward F.; Li, Yingying; Shaw, George M.; Hahn, Beatrice H.; Montefiori, David C.; Alam, S. Munir; Bonsignori, Mattia; Moody, M. Anthony; Liao, Hua-Xin; Gao, Feng; Haynes, Barton F.

    2015-01-01

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. With well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines. PMID:26506369

  10. Detecting PKD1 variants in polycystic kidney disease patients by single-molecule long-read sequencing.

    PubMed

    Borràs, Daniel M; Vossen, Rolf; Liem, Michael; Buermans, Henk P J; Dauwerse, Hans; van Heusden, Dave; Gansevoort, Ron T; den Dunnen, Johan T; Janssen, Bart; Peters, Dorien J M; Losekoot, Monique; Anvar, Seyed Yahya

    2017-04-04

    A genetic diagnosis of autosomal dominant polycystic kidney disease (ADPKD) is challenging due to allelic heterogeneity, high GC-content, and homology of the PKD1 gene with six pseudogenes. Short-read next-generation sequencing (NGS) approaches, such as whole genome (WGS) and whole exome sequencing (WES), often fail at reliably characterizing complex regions such as PKD1. However, long-read single-molecule sequencing has been shown to be an alternative strategy that could overcome PKD1 complexities and discriminate between homologous regions of PKD1 and its pseudogenes. In this study, we present the increased power of resolution for complex regions using long-read sequencing to characterize a cohort of 19 patients with ADPKD. Our approach provided high sensitivity in identifying PKD1 pathogenic variants, diagnosing 94.7% of the patients. We show that reliable screening of ADPKD patients in a single test without interference of PKD1 homologous sequences, commonly introduced by residual amplification of PKD1 pseudogenes, by direct long-read sequencing is now possible. This strategy can be implemented in diagnostics and is highly suitable to sequence and resolve complex genomic regions that are of clinical relevance.

  11. Early strains of multidrug-resistant Salmonella enterica serovar Kentucky sequence type 198 from Southeast Asia harbor Salmonella genomic island 1-J variants with a novel insertion sequence.

    PubMed

    Le Hello, Simon; Weill, François-Xavier; Guibert, Véronique; Praud, Karine; Cloeckaert, Axel; Doublet, Benoît

    2012-10-01

    Salmonella genomic island 1 (SGI1) is a 43-kb integrative mobilizable element that harbors a great diversity of multidrug resistance gene clusters described in numerous Salmonella enterica serovars and also in Proteus mirabilis. The majority of SGI1 variants contain an In104-derivative complex class 1 integron inserted between resolvase gene res and open reading frame (ORF) S044 in SGI1. Recently, the international spread of ciprofloxacin-resistant S. enterica serovar Kentucky sequence type 198 (ST198) containing SGI1-K variants has been reported. A retrospective study was undertaken to characterize ST198 S. Kentucky strains isolated before the spread of the epidemic ST198-SGI1-K population in Africa and the Middle East. Here, we characterized 12 ST198 S. Kentucky strains isolated between 1969 and 1999, mainly from humans returning from Southeast Asia (n = 10 strains) or Israel (n = 1 strain) or from meat in Egypt (n = 1 strain). All these ST198 S. Kentucky strains did not belong to the XbaI pulsotype X1 associated with the African epidemic clone but to pulsotype X2. SGI1-J subgroup variants containing different complex integrons with a partial transposition module and inserted within ORF S023 of SGI1 were detected in six strains. The SGI1-J4 variant containing a partially deleted class 1 integron and thus showing a narrow resistance phenotype to sulfonamides was identified in two epidemiologically unrelated strains from Indonesia. The four remaining strains harbored a novel SGI1-J variant, named SGI1-J6, which contained aadA2, floR2, tetR(G)-tetA(G), and sul1 resistance genes within its complex integron. Moreover, in all these S. Kentucky isolates, a novel insertion sequence related to the IS630 family and named ISSen5 was found inserted upstream of the SGI1 complex integron in ORF S023. Thus, two subpopulations of S. Kentucky ST198 independently and exclusively acquired the SGI1 during the 1980s and 1990s. Unlike the ST198-X1 African epidemic subpopulation, the

  12. Draft Genome Sequences of Five Shiga Toxin-Producing Escherichia coli Isolates Harboring the New and Recently Described Subtilase Cytotoxin Allelic Variant subAB2-3

    PubMed Central

    Tasara, Taurai; Fierz, Lisa; Schmidt, Herbert

    2017-01-01

    We present here the draft genome sequences of five Shiga toxin-producing Escherichia coli (STEC) strains which tested positive in a primary subAB screening. Assembly and annotation of the draft genomes revealed that all strains harbored the recently described allelic variant subAB2-3. Based on the sequence data, primers were designed to identify and differentiate this variant. PMID:28232433

  13. Application of next generation sequencing to CEPH cell lines to discover variants associated with FDA approved chemotherapeutics

    PubMed Central

    2014-01-01

    Background. The goal of this study was to perform candidate gene association with cytotoxicity of chemotherapeutics in cell line models through resequencing and discovery of rare and low frequency variants along with common variations. Here, an association study of cytotoxicity response to 30 FDA approved drugs was conducted and we applied next generation targeted sequencing technology to discover variants from 103 candidate genes in 95 lymphoblastoid cell lines from 14 CEPH pedigrees. In this article, we called variants across 95 cell lines and performed association analysis for cytotoxic response using the Family Based Association Testing method and software. Results. We called 2281 variable SNP genotypes across the 103 genes for these cell lines and identified three genes of significant association within this marker set. Specifically, ATP-binding cassette, sub-family C, member 5 (ABCC5), metallothionein 1A (MT1A) and NAD(P)H dehydrogenase quinone1 (NQO1) were significantly associated with oxaliplatin drug response. The significant SNP on NQO1 (rs1800566) has been linked with poor survival rates in patients with non-small cell lung cancer treated with cisplatin (which belongs to the same class of drugs as oxaliplatin). A SNP (rs1846692) near the 5′ region of MT1A was associated with arsenic trioxide. Conclusions. The results from this study are promising and this serves as a proof-of-principle demonstration of the use of sequencing data in the cytotoxicity models of human cell lines. With increased sample sizes, such studies will be a fast and powerful way to associate common and rare variants with drug response, while overcoming the cost and time limitations to recruit cohorts for association study. PMID:24924344

  14. Association between SLC2A9 transporter gene variants and uric acid phenotypes in African American and white families

    PubMed Central

    de Andrade, Mariza; Matsumoto, Martha; Mosley, Tom H.; Kardia, Sharon; Turner, Stephen T.

    2011-01-01

    Objectives. SLC2A9 gene variants associate with serum uric acid in white populations, but little is known about African American populations. Since SLC2A9 is a transporter, gene variants may be expected to associate more closely with the fractional excretion of urate, a measure of renal tubular transport, than with serum uric acid, which is influenced by production and extrarenal clearance. Methods. Genotypes of single nucleotide polymorphisms (SNPs) distributed across the SLC2A9 gene were obtained in the Genetic Epidemiology Network of Arteriopathy cohorts. The associations of SNPs with serum uric acid, fractional excretion of urate and urine urate-to-creatinine ratio were assessed with adjustments for age, sex, diuretic use, BMI, homocysteine and triglycerides. Results. We identified SLC2A9 gene variants that were associated with serum uric acid in 1155 African American subjects (53 SNPs) and 1132 white subjects (63 SNPs). The most statistically significant SNPs in African American subjects (rs13113918) and white subjects (rs11723439) were in the latter half of the gene and explained 2.7 and 2.8% of the variation in serum uric acid, respectively. After adjustment for this SNP in African Americans, 0.9% of the variation in serum uric acid was explained by an SNP (rs1568318) in the first half of the gene. Unexpectedly, SLC2A9 gene variants had stronger associations with serum uric acid than with fractional excretion of urate. Conclusions. These findings support two different loci by which SLC2A9 variants affect uric acid levels in African Americans and suggest SLC2A9 variants affect serum uric acid level via renal and extrarenal clearance. PMID:21186168

  15. Extensive amino acid sequence homologies between animal lectins

    SciTech Connect

    Paroutaud, P.; Levi, G.; Teichberg, V.I.; Strosberg, A.D.

    1987-09-01

    The authors have established the amino acid sequence of the β-D-galactoside binding lectin from the electric eel and the sequences of several peptides from a similar lectin isolated from human placenta. These sequences were compared with the published sequences of peptides derived from the β-D-galactoside binding lectin from human lung and with sequences deduced from cDNAs assigned to the β-D-galactoside binding lectins from chicken embryo skin and human hepatomas. Significant homologies were observed. One of the highly conserved regions that contains a tryptophan residue and two glutamic acid residues is probably part of the β-D-galactoside binding site, which, on the basis of spectroscopic studies of the electric eel lectin, is expected to contain such residues. The similarity of the hydropathy profiles and the predicted secondary structure of the lectins from chicken skin and electric eel, in spite of differences in their amino acid sequences, strongly suggests that these proteins have maintained structural homologies during evolution and together with the other β-D-galactoside binding lectins were derived from a common ancestor gene.

  16. Amino acid sequence of porcine spleen cathepsin D.

    PubMed Central

    Shewale, J G; Tang, J

    1984-01-01

    The amino acid sequence of porcine spleen cathepsin D heavy chain has been determined and, hence, the complete structure of this enzyme is now known. The sequence of heavy chain was constructed by aligning the structures of peptides generated by cyanogen bromide, trypsin, and endo-proteinase Lys C cleavages. The structure of the light chain has been published previously. The cathepsin D molecule contains 339 amino acid residues in two polypeptide chains: a 97-residue light chain and a 242-residue heavy chain, with a combined Mr of 36,779 (without carbohydrate). There are two carbohydrate units linked to asparagine residues 70 and 192. The disulfide bond arrangement in cathepsin D is probably similar to that of pepsin, because the positions of six half-cystine residues are conserved. The active site aspartyl residues, corresponding to aspartic acid-32 and -215 of pepsin, are located at residues 33 and 224 in the cathepsin D molecule. The amino acid sequence around these aspartyl residues is strongly conserved. Cathepsin D shows a strong homology with other acid proteases. When the sequences of cathepsin D, renin, and pepsin are aligned, 32.7% of the residues are identical. The homology is observed throughout the length of the molecules, indicating that three-dimensional structures of all three molecules are similar. PMID:6587385

  17. Analysis of amino acid sequence variations and immunoglobulin E-binding epitopes of German cockroach tropomyosin.

    PubMed

    Jeong, Kyoung Yong; Lee, Jongweon; Lee, In-Yong; Ree, Han-Il; Hong, Chein-Soo; Yong, Tai-Soon

    2004-09-01

    The allergenicities of tropomyosins from different organisms have been reported to vary. The cDNA encoding German cockroach tropomyosin (Bla g 7) was isolated, expressed, and characterized previously. In the present study, the amino acid sequence variations in German cockroach tropomyosin were analyzed in order to investigate their influence on allergenicity. We also undertook the identification of immunodominant peptides containing immunoglobulin E (IgE) epitopes which may facilitate the development of diagnostic and immunotherapeutic strategies based on the recombinant proteins. Two-dimensional gel electrophoresis and immunoblot analysis with mouse anti-recombinant German cockroach tropomyosin serum were performed to investigate the isoforms at the protein level. Reverse transcriptase PCR (RT-PCR) was applied to examine the sequence diversity. Eleven different variants of the deduced amino acid sequences were identified by RT-PCR. German cockroach tropomyosin has only minor sequence variations that did not seem to affect its allergenicity significantly. These results support the molecular basis underlying the cross-reactivities of arthropod tropomyosins. Recombinant fragments were also generated by PCR, and IgE-binding epitopes were assessed by enzyme-linked immunosorbent assay. Sera from seven patients revealed heterogeneous IgE-binding responses. This study demonstrates multiple IgE-binding epitope regions in a single molecule, suggesting that full-length tropomyosin should be used for the development of diagnostic and therapeutic reagents.

  18. Complete genome sequence of Campylobacter jejuni RM1285 a rod-shaped morphological variant

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Campylobacter jejuni is a spiral-shaped Gram-negative food-borne human pathogen found on poultry products. Strain RM1285 is a rod-shaped variant of this species. The genome of RM1285 was determined to be 1,635,803 bp with a G+C content of 30.5%....
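
    G+C content, as reported for the RM1285 genome, is the share of G and C bases among all called bases. The sketch below shows one way to compute it; the function name and the toy sequence are hypothetical, and ignoring ambiguous bases such as N is an assumption rather than something the record states.

```python
def gc_content(sequence):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in called if base in "GC")
    return 100.0 * gc / len(called)


# Hypothetical toy sequence, not taken from the RM1285 genome.
toy_sequence = "ATGCATATTA"
toy_gc = gc_content(toy_sequence)

# Approximate number of G or C bases implied by the reported figures
# (1,635,803 bp at 30.5% G+C); approximate because 30.5% is itself rounded.
approx_gc_bases = round(1_635_803 * 0.305)
```

    For a full genome, the same function would be applied to the concatenated sequence read from a FASTA file.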

  19. Red-Shifted Aequorin Variants Incorporating Non-Canonical Amino Acids: Applications in In Vivo Imaging

    PubMed Central

    Grinstead, Kristen M.; Rowe, Laura; Ensor, Charles M.; Joel, Smita; Daftarian, Pirouz; Dikici, Emre; Zingg, Jean-Marc; Daunert, Sylvia

    2016-01-01

    The increased importance of in vivo diagnostics has posed new demands for imaging technologies. In that regard, there is a need for imaging molecules capable of expanding the applications of current state-of-the-art imaging in vivo diagnostics. To that end, there is a desire for new reporter molecules that provide strong signals, are non-toxic, and can be tailored to diagnose or monitor the progression of a number of diseases. Aequorin is a non-toxic photoprotein that can be used as a sensitive marker for bioluminescence in vivo imaging. The sensitivity of aequorin is due to the fact that bioluminescence is a rare phenomenon in nature and, therefore, it does not suffer from autofluorescence, which contributes to background emission. Emission of bioluminescence in the blue-region of the spectrum by aequorin only occurs when calcium, and its luciferin coelenterazine, are bound to the protein and trigger a biochemical reaction that results in light generation. It is this reaction that endows aequorin with unique characteristics, making it ideally suited for a number of applications in bioanalysis and imaging. Herein we report the site-specific incorporation of non-canonical or non-natural amino acids and several coelenterazine analogues, resulting in a catalog of 72 cysteine-free aequorin variants which expand the potential applications of these photoproteins by providing several red-shifted mutants better suited to use in vivo. In vivo studies in mouse models using the transparent tissue of the eye confirmed the activity of the aequorin variants incorporating L-4-iodophenylalanine and L-4-methoxyphenylalanine after injection into the eye and topical addition of coelenterazine. The signal also remained localized within the eye. This is the first time that aequorin variants incorporating non-canonical amino acids have been shown to be active in vivo and useful as reporters in bioluminescence imaging. PMID:27367859

  20. Mapping whole genome shotgun sequence and variant calling in mammalian species without their reference genomes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomics research in mammals has produced reference genome sequences that are essential for identifying variation associated with disease. High quality reference genome sequences are now available for humans, model species, and economically important agricultural animals. Comparisons between these s...

  1. Comparison of the distribution of the repetitive DNA sequences in three variants of Cucumis sativus reveals their phylogenetic relationships.

    PubMed

    Zhao, Xin; Lu, Jingyuan; Zhang, Zhonghua; Hu, Jiajin; Huang, Sanwen; Jin, Weiwei

    2011-01-01

    Repetitive DNA sequences with variability in copy number or/and sequence polymorphism can be employed as useful molecular markers to study phylogenetics and identify species/chromosomes when combined with fluorescence in situ hybridization (FISH). Cucumis sativus has three variants, Cucumis sativus L. var. sativus, Cucumis sativus L. var. hardwickii and Cucumis sativus L. var. xishuangbannesis. The phylogenetics among these three variants has not been well explored using cytological landmarks. Here, we concentrate on the organization and distribution of highly repetitive DNA sequences in cucumbers, with emphasis on the differences between cultivar and wild cucumber. The diversity of chromosomal karyotypes in cucumber and its relatives was detected in our study. Thereby, sequential FISH with three sets of multi-probe cocktails (combined repetitive DNA with chromosome-specific fosmid clones as probes) were conducted on the same metaphase cell, which helped us to simultaneously identify each of the 7 metaphase chromosomes of wild cucumber C. sativus var. hardwickii. A standardized karyotype of somatic metaphase chromosomes was constructed. Our data also indicated that the relationship between cultivar cucumber and C. s. var. xishuangbannesis was closer than that of C. s. var. xishuangbannesis and C. s. var. hardwickii.

  2. DNA Sequence Variants in the Five Prime Untranslated Region of the Cyclooxygenase-2 Gene Are Commonly Found in Healthy Dogs and Gray Wolves

    PubMed Central

    Safra, Noa; Hayward, Louisa J.; Aguilar, Miriam; Sacks, Benjamin N.; Westropp, Jodi L.; Mohr, F. Charles; Mellersh, Cathryn S.; Bannasch, Danika L.

    2015-01-01

    The aim of this study was to investigate the frequency of regional DNA variants upstream to the translation initiation site of the canine Cyclooxygenase-2 (Cox-2) gene in healthy dogs. Cox-2 plays a role in various disease conditions such as acute and chronic inflammation, osteoarthritis and malignancy. A role for Cox-2 DNA variants in genetic predisposition to canine renal dysplasia has been proposed and dog breeders have been encouraged to select against these DNA variants. We sequenced 272–422 bases in 152 dogs unaffected by renal dysplasia and found 19 different haplotypes including 11 genetic variants which had not been described previously. We genotyped 7 gray wolves to ascertain the wildtype variant and found that the wolves we analyzed had predominantly the second most common DNA variant found in dogs. Our results demonstrate an elevated level of regional polymorphism that appears to be a feature of healthy domesticated dogs. PMID:26244515

  3. Whole Genome Deep Sequencing of HIV-1 Reveals the Impact of Early Minor Variants Upon Immune Recognition During Acute Infection

    PubMed Central

    Henn, Matthew R.; Lennon, Niall J.; Power, Karen A.; Macalalad, Alexander R.; Berlin, Aaron M.; Malboeuf, Christine M.; Ryan, Elizabeth M.; Gnerre, Sante; Zody, Michael C.; Erlich, Rachel L.; Green, Lisa M.; Berical, Andrew; Wang, Yaoyu; Casali, Monica; Streeck, Hendrik; Bloom, Allyson K.; Dudek, Tim; Tully, Damien; Newman, Ruchi; Axten, Karen L.; Gladden, Adrianne D.; Battis, Laura; Kemper, Michael; Zeng, Qiandong; Shea, Terrance P.; Gujja, Sharvari; Zedlack, Carmen; Gasser, Olivier; Brander, Christian; Hess, Christoph; Günthard, Huldrych F.; Brumme, Zabrina L.; Brumme, Chanson J.; Bazner, Suzane; Rychert, Jenna; Tinsley, Jake P.; Mayer, Ken H.; Rosenberg, Eric; Pereyra, Florencia; Levin, Joshua Z.; Young, Sarah K.; Jessen, Heiko; Altfeld, Marcus; Birren, Bruce W.; Walker, Bruce D.; Allen, Todd M.

    2012-01-01

    Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. 
    These data support the critical need for vaccine-induced CD8+ T cell responses to target more highly constrained

  4. Whole genome deep sequencing of HIV-1 reveals the impact of early minor variants upon immune recognition during acute infection.

    PubMed

    Henn, Matthew R; Boutwell, Christian L; Charlebois, Patrick; Lennon, Niall J; Power, Karen A; Macalalad, Alexander R; Berlin, Aaron M; Malboeuf, Christine M; Ryan, Elizabeth M; Gnerre, Sante; Zody, Michael C; Erlich, Rachel L; Green, Lisa M; Berical, Andrew; Wang, Yaoyu; Casali, Monica; Streeck, Hendrik; Bloom, Allyson K; Dudek, Tim; Tully, Damien; Newman, Ruchi; Axten, Karen L; Gladden, Adrianne D; Battis, Laura; Kemper, Michael; Zeng, Qiandong; Shea, Terrance P; Gujja, Sharvari; Zedlack, Carmen; Gasser, Olivier; Brander, Christian; Hess, Christoph; Günthard, Huldrych F; Brumme, Zabrina L; Brumme, Chanson J; Bazner, Suzane; Rychert, Jenna; Tinsley, Jake P; Mayer, Ken H; Rosenberg, Eric; Pereyra, Florencia; Levin, Joshua Z; Young, Sarah K; Jessen, Heiko; Altfeld, Marcus; Birren, Bruce W; Walker, Bruce D; Allen, Todd M

    2012-01-01

    Deep sequencing technologies have the potential to transform the study of highly variable viral pathogens by providing a rapid and cost-effective approach to sensitively characterize rapidly evolving viral quasispecies. Here, we report on a high-throughput whole HIV-1 genome deep sequencing platform that combines 454 pyrosequencing with novel assembly and variant detection algorithms. In one subject we combined these genetic data with detailed immunological analyses to comprehensively evaluate viral evolution and immune escape during the acute phase of HIV-1 infection. The majority of early, low frequency mutations represented viral adaptation to host CD8+ T cell responses, evidence of strong immune selection pressure occurring during the early decline from peak viremia. CD8+ T cell responses capable of recognizing these low frequency escape variants coincided with the selection and evolution of more effective secondary HLA-anchor escape mutations. Frequent, and in some cases rapid, reversion of transmitted mutations was also observed across the viral genome. When located within restricted CD8 epitopes these low frequency reverting mutations were sufficient to prime de novo responses to these epitopes, again illustrating the capacity of the immune response to recognize and respond to low frequency variants. More importantly, rapid viral escape from the most immunodominant CD8+ T cell responses coincided with plateauing of the initial viral load decline in this subject, suggestive of a potential link between maintenance of effective, dominant CD8 responses and the degree of early viremia reduction. We conclude that the early control of HIV-1 replication by immunodominant CD8+ T cell responses may be substantially influenced by rapid, low frequency viral adaptations not detected by conventional sequencing approaches, which warrants further investigation. 
    These data support the critical need for vaccine-induced CD8+ T cell responses to target more highly constrained

  5. Ancient human sialic acid variant restricts an emerging zoonotic malaria parasite.

    PubMed

    Dankwa, Selasi; Lim, Caeul; Bei, Amy K; Jiang, Rays H Y; Abshire, James R; Patel, Saurabh D; Goldberg, Jonathan M; Moreno, Yovany; Kono, Maya; Niles, Jacquin C; Duraisingh, Manoj T

    2016-04-04

    Plasmodium knowlesi is a zoonotic parasite transmitted from macaques causing malaria in humans in Southeast Asia. Plasmodium parasites bind to red blood cell (RBC) surface receptors, many of which are sialylated. While macaques synthesize the sialic acid variant N-glycolylneuraminic acid (Neu5Gc), humans cannot because of a mutation in the enzyme CMAH that converts N-acetylneuraminic acid (Neu5Ac) to Neu5Gc. Here we reconstitute CMAH in human RBCs for the reintroduction of Neu5Gc, which results in enhancement of P. knowlesi invasion. We show that two P. knowlesi invasion ligands, PkDBPβ and PkDBPγ, bind specifically to Neu5Gc-containing receptors. A human-adapted P. knowlesi line invades human RBCs independently of Neu5Gc, with duplication of the sialic acid-independent invasion ligand, PkDBPα and loss of PkDBPγ. Our results suggest that absence of Neu5Gc on human RBCs limits P. knowlesi invasion, but that parasites may evolve to invade human RBCs through the use of sialic acid-independent pathways.

  6. Ancient human sialic acid variant restricts an emerging zoonotic malaria parasite

    PubMed Central

    Dankwa, Selasi; Lim, Caeul; Bei, Amy K.; Jiang, Rays H. Y.; Abshire, James R.; Patel, Saurabh D.; Goldberg, Jonathan M.; Moreno, Yovany; Kono, Maya; Niles, Jacquin C.; Duraisingh, Manoj T.

    2016-01-01

    Plasmodium knowlesi is a zoonotic parasite transmitted from macaques causing malaria in humans in Southeast Asia. Plasmodium parasites bind to red blood cell (RBC) surface receptors, many of which are sialylated. While macaques synthesize the sialic acid variant N-glycolylneuraminic acid (Neu5Gc), humans cannot because of a mutation in the enzyme CMAH that converts N-acetylneuraminic acid (Neu5Ac) to Neu5Gc. Here we reconstitute CMAH in human RBCs for the reintroduction of Neu5Gc, which results in enhancement of P. knowlesi invasion. We show that two P. knowlesi invasion ligands, PkDBPβ and PkDBPγ, bind specifically to Neu5Gc-containing receptors. A human-adapted P. knowlesi line invades human RBCs independently of Neu5Gc, with duplication of the sialic acid-independent invasion ligand, PkDBPα and loss of PkDBPγ. Our results suggest that absence of Neu5Gc on human RBCs limits P. knowlesi invasion, but that parasites may evolve to invade human RBCs through the use of sialic acid-independent pathways. PMID:27041489

  7. Role of common and rare APP DNA sequence variants in Alzheimer disease

    PubMed Central

    Hooli, B.V.; Mohapatra, G.; Mattheisen, M.; Parrado, A.R.; Roehr, J.T.; Shen, Y.; Gusella, J.F.; Moir, R.; Saunders, A.J.; Lange, C.; Tanzi, R.E.

    2012-01-01

    Objectives: More than 30 different rare mutations, including copy number variants (CNVs), in the amyloid precursor protein gene (APP) cause early-onset familial Alzheimer disease (EOFAD), whereas the contribution of common APP variants to disease risk remains controversial. In this study we systematically assessed the role of both rare and common APP DNA variants in Alzheimer disease (AD) families. Methods: Families with EOFAD genetically linked to the APP region were screened for missense mutations and locus duplications of APP. Further, using genome-wide DNA microarray data, we examined the APP locus for CNVs in a total of 797 additional early- and late-onset AD pedigrees. Finally, 423 single nucleotide polymorphisms (SNPs) in the APP locus, including 2 promoter polymorphisms previously associated with AD risk, were tested in up to 4,200 individuals from multiplex AD families. Results: Analyses of 8 21q21-linked families revealed one family carrying a nonsynonymous mutation in exon 17 (Val717Leu) and another family with a partially penetrant 3.5-Mb locus duplication encompassing APP. CNV analysis in the APP locus revealed an additional family carrying a fully penetrant 380-kb duplication, merely spanning APP. Last, contrary to previous reports, association analyses of more than 400 different SNPs in or near APP failed to show significant effects on AD risk. Conclusion: Our study shows that APP mutations and locus duplications are a very rare cause of EOFAD and that the contribution of common APP variants to AD susceptibility is insignificant. Furthermore, duplications of APP may not be fully penetrant, possibly indicating the existence of hitherto unknown protective genetic factors. PMID:22491860

  8. Active site amino acid sequence of human factor D.

    PubMed

    Davis, A E

    1980-08-01

    Factor D was isolated from human plasma by chromatography on CM-Sephadex C50, Sephadex G-75, and hydroxylapatite. Digestion of reduced, S-carboxymethylated factor D with cyanogen bromide resulted in three peptides which were isolated by chromatography on Sephadex G-75 (superfine) equilibrated in 20% formic acid. NH2-Terminal sequences were determined by automated Edman degradation with a Beckman 890C sequencer using a 0.1 M Quadrol program. The smallest peptide (CNBr III) consisted of the NH2-terminal 14 amino acids. The other two peptides had molecular weights of 17,000 (CNBr I) and 7000 (CNBr II). Overlap of the NH2-terminal sequence of factor D with the NH2-terminal sequence of CNBr I established the order of the peptides. The NH2-terminal 53 residues of factor D are somewhat more homologous with the group-specific protease of rat intestine than with other serine proteases. The NH2-terminal sequence of CNBr II revealed the active site serine of factor D. The typical serine protease active site sequence (Gly-Asp-Ser-Gly-Gly-Pro) was found at residues 12-17. The region surrounding the active site serine does not appear to be more highly homologous with any one of the other serine proteases. The structural data obtained point out the similarities between factor D and the other proteases. However, complete definition of the degree of relationship between factor D and other proteases will require determination of the remainder of the primary structure.
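
    Locating a conserved motif such as Gly-Asp-Ser-Gly-Gly-Pro in a peptide sequence amounts to a substring search once the residues are written in one-letter code. The sketch below illustrates the idea under stated assumptions: the helper names are hypothetical, and the toy peptide is a made-up poly-alanine filler with the motif placed at positions 12-17 only to mirror the residue numbering in the abstract. It is not the CNBr II sequence.

```python
# One-letter codes for the four residues that occur in the motif.
THREE_TO_ONE = {"Gly": "G", "Asp": "D", "Ser": "S", "Pro": "P"}


def motif_to_one_letter(motif):
    """Convert a hyphen-separated three-letter motif to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in motif.split("-"))


def find_motif(sequence, motif):
    """Return 1-based (start, end) positions of every occurrence of motif."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


active_site_motif = motif_to_one_letter("Gly-Asp-Ser-Gly-Gly-Pro")

# Hypothetical peptide: 11 filler residues, the motif, then 4 more fillers.
toy_peptide = "A" * 11 + active_site_motif + "A" * 4
toy_hits = find_motif(toy_peptide, active_site_motif)
```

    Searching from `start + 1` rather than past the end of each hit lets the function report overlapping occurrences, which matters for repetitive motifs.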

  9. The amino acid sequence of iguana (Iguana iguana) pancreatic ribonuclease.

    PubMed

    Zhao, W; Beintema, J J; Hofsteenge, J

    1994-01-15

    The pyrimidine-specific ribonuclease superfamily constitutes a group of homologous proteins so far found only in higher vertebrates. Four separate families are found in mammals, which have resulted from gene duplications in mammalian ancestors. To learn more about the evolutionary history of this superfamily, the primary structure and other characteristics of the pancreatic enzyme from iguana (Iguana iguana), a herbivorous lizard species belonging to the reptiles, have been determined. The polypeptide chain consists of 119 amino acid residues. The positions of insertions and deletions in the sequence are identical to those in the enzyme from snapping turtle. However, the two enzymes differ at 54% of the amino acid positions. Iguana ribonuclease contains no carbohydrate, although the enzyme possesses three recognition sites for carbohydrate attachment, and has a high number of acidic residues in a localized part of the sequence.

  10. Whole-exome Sequencing Identifies Rare Variants in ATP8B4 as a Risk Factor for Systemic Sclerosis

    PubMed Central

    Gao, Li; Emond, Mary J; Louie, Tin; Cheadle, Chris; Berger, Alan E.; Rafaels, Nicholas; Vergara, Candelaria; Kim, Yoonhee; Taub, Margaret A.; Ruczinski, Ingo; Mathai, Stephen C.; Rich, Stephen S; Nickerson, Deborah A; Hummers, Laura K.; Bamshad, Michael J; Hassoun, Paul M.; Mathias, Rasika A; Barnes, Kathleen C.

    2015-01-01

    Objective: To determine the contribution of rare variants as genetic modifiers of the expressivity, penetrance, and severity of systemic sclerosis (SSc). Methods: We performed whole-exome sequencing of 78 European American systemic sclerosis patients, including 35 patients without pulmonary arterial hypertension (SSc-PAH−) and 43 patients with PAH (SSc-PAH+). Association testing of case-control probability for rare variants was performed using the aSKAT-O method with small sample adjustment by comparing all SSc patients with a reference population of 3,179 controls from the ESP 5,500 exome dataset. Replication genotyping was performed in an independent sample of 3,263 patients (415 SSc and 2,848 controls). We conducted expression profiling of mRNA from 61 SSc patients (19 SSc-PAH− and 42 SSc-PAH+) and 41 corresponding controls. Results: The ATP8B4 gene was associated with a significant increase in the risk of SSc (P = 3.18 × 10^-7). Among the 64 ATP8B4 variants tested, a single missense variant, c.1308C>G (F436L, rs55687265), provided the most compelling evidence for association (P = 9.35 × 10^-10; OR = 6.11), which was confirmed in the replication cohort (P = 0.012; OR = 1.86) and meta-analysis (P = 1.92 × 10^-7; OR = 2.5). Genes involved in E3 ubiquitin-protein ligase complex (ASB10) and cyclic nucleotide gated channelopathies (CNGB3) as well as HLA-DRB5 and HSPB2 (aka heat shock protein 27) provided additional evidence for association (P < 10^-5). Differential ATP8B4 expression was observed among the SSc patients compared to the controls (P = 0.0005). Conclusion: ATP8B4 may represent a putative genetic risk factor for SSc and pulmonary vascular complications. PMID:26473621

  11. Detection of CYP2C19 Genetic Variants in Malaysian Orang Asli from Massively Parallel Sequencing Data

    PubMed Central

    Ang, Geik Yong; Yu, Choo Yee; Subramaniam, Vinothini; Abdul Khalid, Mohd Ikhmal Hanif; Tuan Abdu Aziz, Tuan Azlin; Johari James, Richard; Ahmad, Aminuddin; Abdul Rahman, Thuhairah; Mohd Nor, Fadzilah; Ismail, Adzrool Idzwan; Md. Isa, Kamarudzaman; Salleh, Hood; Teh, Lay Kek; Salleh, Mohd Zaki

    2016-01-01

    The human cytochrome P450 (CYP) is a superfamily of enzymes that have been a focus in research for decades due to their prominent role in drug metabolism. CYP2C is one of the major subfamilies which metabolize more than 10% of all clinically used drugs. In the context of CYP2C19, several key genetic variations that alter the enzyme’s activity have been identified and catalogued in the CYP allele nomenclature database. In this study, we investigated the presence of well-established variants as well as novel polymorphisms in the CYP2C19 gene of 62 Orang Asli from the Peninsular Malaysia. A total of 449 genetic variants were detected including 70 novel polymorphisms; 417 SNPs were located in introns, 23 in upstream, 7 in exons, and 2 in downstream regions. Five alleles and seven genotypes were inferred based on the polymorphisms that were found. Null alleles that were observed include CYP2C19*3 (6.5%), *2 (5.7%) and *35 (2.4%) whereas allele with increased function *17 was detected at a frequency of 4.8%. The normal metabolizer genotype was the most predominant (66.1%), followed by intermediate metabolizer (19.4%), rapid metabolizer (9.7%) and poor metabolizer (4.8%) genotypes. Findings from this study provide further insights into the CYP2C19 genetic profile of the Orang Asli as previously unreported variant alleles were detected through the use of massively parallel sequencing technology platform. The systematic and comprehensive analysis of CYP2C19 will allow uncharacterized variants that are present in the Orang Asli to be included in the genotyping panel in the future. PMID:27798644

  12. Targeted Sequencing Reveals Low-Frequency Variants in EPHA Genes as Markers of Paclitaxel-Induced Peripheral Neuropathy.

    PubMed

    Apellániz-Ruiz, María; Tejero, Héctor; Inglada-Pérez, Lucía; Sánchez-Barroso, Lara; Gutiérrez-Gutiérrez, Gerardo; Calvo, Isabel; Castelo, Beatriz; Redondo, Andrés; García-Donás, Jesús; Romero-Laorden, Nuria; Sereno, María; Merino, María; Currás-Freixes, María; Montero-Conde, Cristina; Mancikova, Veronika; Åvall-Lundqvist, Elisabeth; Green, Henrik; Al-Shahrour, Fátima; Cascón, Alberto; Robledo, Mercedes; Rodríguez-Antona, Cristina

    2017-03-01

    Purpose: Neuropathy is the dose-limiting toxicity of paclitaxel and a major cause for decreased quality of life. Genetic factors have been shown to contribute to paclitaxel neuropathy susceptibility; however, the major causes for interindividual differences remain unexplained. In this study, we identified genetic markers associated with paclitaxel-induced neuropathy through massive sequencing of candidate genes. Experimental Design: We sequenced the coding region of 4 EPHA genes, 5 genes involved in paclitaxel pharmacokinetics, and 30 Charcot-Marie-Tooth genes, in 228 cancer patients with no/low neuropathy or high-grade neuropathy during paclitaxel treatment. An independent validation series included 202 paclitaxel-treated patients. Variation-/gene-based analyses were used to compare variant frequencies among neuropathy groups, and Cox regression models were used to analyze neuropathy along treatment. Results: Gene-based analysis identified EPHA6 as the gene most significantly associated with paclitaxel-induced neuropathy. Low-frequency nonsynonymous variants in EPHA6 were present exclusively in patients with high neuropathy, and all affected the ligand-binding domain of the protein. Accumulated dose analysis in the discovery series showed a significantly higher neuropathy risk for EPHA5/6/8 low-frequency nonsynonymous variant carriers [HR, 14.60; 95% confidence interval (CI), 2.33-91.62; P = 0.0042], and an independent cohort confirmed an increased neuropathy risk (HR, 2.07; 95% CI, 1.14-3.77; P = 0.017). Combining the series gave an estimated 2.5-fold higher risk of neuropathy (95% CI, 1.46-4.31; P = 9.1 × 10^-4). Conclusions: This first study sequencing EPHA genes revealed that low-frequency variants in EPHA6, EPHA5, and EPHA8 contribute to the susceptibility to paclitaxel-induced neuropathy. Furthermore, EPHA's neuronal injury repair function suggests that these genes might constitute important neuropathy markers for many neurotoxic drugs. Clin Cancer Res; 23

  13. Distinctive Epstein-Barr Virus Variants Associated with Benign and Malignant Pediatric Pathologies: LMP1 Sequence Characterization and Linkage with Other Viral Gene Polymorphisms

    PubMed Central

    Gantuz, Magdalena; Altcheh, Jaime; De Matteo, Elena; Chabay, Paola Andrea; Preciado, María Victoria

    2012-01-01

    The ubiquitous Epstein-Barr virus (EBV) is related to the development of lymphoma and is also the etiological agent for infectious mononucleosis (IM). Sequence variations in the gene encoding LMP1 have been deeply studied in different pathologies and geographic regions. Controversial results propose the existence of tumor-related variants, while others argued in favor of a geographical distribution of these variants. Reports assessing EBV variants in IM were performed in adult patients who displayed multiple variant infections. In the present study, LMP1 variants in 15 pediatric patients with IM and 20 pediatric patients with EBV-associated lymphomas from Argentina were analyzed as representatives of benign and malignant infections in children, respectively. A 3-month follow-up study of LMP1 variants in peripheral blood cells and in oral secretions of patients with IM was performed. Moreover, an integrated linkage analysis was performed with variants of EBNA1 and the promoter region of BZLF1. Similar sequence polymorphisms were detected in both pathological conditions, IM and lymphoma, but these differ from those previously described in healthy donors from Argentina and Brazil. The results suggest that certain LMP1 polymorphisms, namely, the 30-bp deletion and high copy number of the 33-bp repeats, are associated with EBV-related pathologies, either benign or malignant, instead of just being tumor related. Additionally, this is the first study to describe the Alaskan variant in EBV-related lymphomas that previously was restricted to nasopharyngeal carcinomas from North America. PMID:22205789

  14. Identification of Novel Variants in LTBP2 and PXDN Using Whole-Exome Sequencing in Developmental and Congenital Glaucoma

    PubMed Central

    Micheal, Shazia; Siddiqui, Sorath Noorani; Zafar, Saemah Nuzhat; Iqbal, Aftab; Khan, Muhammad Imran; den Hollander, Anneke I.

    2016-01-01

    Background: Primary congenital glaucoma (PCG) is the most common form of glaucoma in children. PCG occurs due to the developmental defects in the trabecular meshwork and anterior chamber of the eye. The purpose of this study is to identify the causative genetic variants in three families with developmental and primary congenital glaucoma (PCG) with a recessive inheritance pattern. Methods: DNA samples were obtained from consanguineous families of Pakistani ancestry. The CYP1B1 gene was sequenced in the affected probands by conventional Sanger DNA sequencing. Whole exome sequencing (WES) was performed in DNA samples of four individuals belonging to three different CYP1B1-negative families. Variants identified by WES were validated by Sanger sequencing. Results: WES identified potentially causative novel mutations in the latent transforming growth factor beta binding protein 2 (LTBP2) gene in two PCG families. In the first family a novel missense mutation (c.4934G>A; p.Arg1645Glu) co-segregates with the disease phenotype, and in the second family a novel frameshift mutation (c.4031_4032insA; p.Asp1345Glyfs*6) was identified. In a third family with developmental glaucoma a novel mutation (c.3496G>A; p.Gly1166Arg) was identified in the PXDN gene, which segregates with the disease. Conclusions: We identified three novel mutations in glaucoma families using WES; two in the LTBP2 gene and one in the PXDN gene. The results will not only enhance our current understanding of the genetic basis of glaucoma, but may also contribute to a better understanding of the diverse phenotypic consequences caused by mutations in these genes. PMID:27409795

  15. Hydrogen Exchange Mass Spectrometry of Related Proteins with Divergent Sequences: A Comparative Study of HIV-1 Nef Allelic Variants

    NASA Astrophysics Data System (ADS)

    Wales, Thomas E.; Poe, Jerrod A.; Emert-Sedlak, Lori; Morgan, Christopher R.; Smithgall, Thomas E.; Engen, John R.

    2016-06-01

    Hydrogen exchange mass spectrometry can be used to compare the conformation and dynamics of proteins that are similar in tertiary structure. If relative deuterium levels are measured, differences in sequence, deuterium forward- and back-exchange, peptide retention time, and protease digestion patterns all complicate the data analysis. We illustrate what can be learned from such data sets by analyzing five variants (Consensus G2E, SF2, NL4-3, ELI, and LTNP4) of the HIV-1 Nef protein, both alone and when bound to the human Hck SH3 domain. Regions with similar sequence could be compared between variants. Although much of the hydrogen exchange features were preserved across the five proteins, the kinetics of Nef binding to Hck SH3 were not the same. These observations may be related to biological function, particularly for ELI Nef where we also observed an impaired ability to downregulate CD4 surface presentation. The data illustrate some of the caveats that must be considered for comparison experiments and provide a framework for investigations of other protein relatives, families, and superfamilies with HX MS.

  16. Complete genome sequence of two rabbit hemorrhagic disease virus variant b isolates detected on the Iberian Peninsula.

    PubMed

    Dalton, K P; Abrantes, J; Lopes, A M; Nicieza, I; Álvarez, Á L; Esteves, P J; Parra, F

    2015-03-01

    We report the complete genome sequences of two isolates (RHDV-N11 and CBVal16) of variant rabbit hemorrhagic disease virus (RHDVb). Isolate N11 was detected in young domestic animals during a rabbit hemorrhagic disease (RHD) outbreak that occurred in 2011 on a rabbit farm in Navarra, Spain, while CBVal16 was isolated from a wild rabbit found dead in Valpaços, Northern Portugal, a year later. The viral sequences reported show 84.8-85.1 % and 78.3-78.5 % identity to RHDVAst/89 and RCV-A1 MIC-07, representative members of the pathogenic genogroup 1 RHDV and apathogenic rabbit calicivirus, respectively. In comparison with other RHDV isolates belonging to the previously known genogroups 1-6, RHDVb shows marked phenotypic differences, as it causes disease preferentially in young rabbits under 40 days of age and shows modified red blood cell agglutination profiles as well as antigenic differences that allow this variant to escape protection by the currently available vaccines.

  17. Hypervariable prophage WO sequences describe an unexpected high number of Wolbachia variants in the mosquito Culex pipiens

    PubMed Central

    Duron, Olivier; Fort, Philippe; Weill, Mylène

    2005-01-01

    Wolbachia are maternally inherited endosymbiotic bacteria that infect many arthropod species and may induce cytoplasmic incompatibility (CI) resulting in abortive embryonic development. Among all the described host species, mosquitoes of the Culex pipiens complex display the highest variability of CI crossing types. Paradoxically, searches for polymorphism in Wolbachia infecting strains and field populations hitherto failed or produced very few markers. Here, we show that an abundant source of the long-sought polymorphism lies in WO prophage sequences present in multiple copies dispersed in the genome of Wolbachia infecting C. pipiens (wPip). We identified up to 66 different Wolbachia variants in C. pipiens strains and field populations and no occurrence of superinfection was observed. At least 49 different Wolbachia occurred in Southern Europe C. pipiens populations, and up to 10 different Wolbachia were even detected in a single population. This is in sharp contrast with North African and Cretan samples, which exhibited only six variants. The WO polymorphism appeared stable over time, and was exclusively transferred maternally. Interestingly, we found that the CI pattern previously described correlates with the variability of Gp15, a prophage protein similar to a bacterial virulence protein. WO prophage sequences thus represent variable markers that now open routes for approaching the molecular basis of CI, the host effects, the structure and dynamics of Wolbachia populations. PMID:16615218

  18. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  19. Amino acid sequence and comparative antigenicity of chicken metallothionein.

    PubMed Central

    McCormick, C C; Fullmer, C S; Garvey, J S

    1988-01-01

    The complete amino acid sequence of metallothionein (MT) from chicken liver is reported. The primary structure was determined by automated sequence analysis of peptides produced by limited acid hydrolysis and by trypsin digestion. The comparative antigenicity of chicken MT was determined by radioimmunoassay using rabbit anti-rat MT polyclonal antibody. Chicken MT consists of 63 amino acids as compared to 61 found in MTs from mammals. One insertion (and two substitutions) occurs in the amino-terminal region, a region considered invariant among mammalian MTs. Eighteen of the 20 cysteines in chicken MT were aligned with cysteines from other mammalian sequences. Two cysteines near the carboxyl terminus are shifted by one residue due to the insertion of proline in that region. Overall, the chicken protein showed approximately 68% sequence identity in a comparison with various mammalian MTs. The affinity of the polyclonal antibody for chicken MT was decreased by 2 orders of magnitude in comparison to that of a mammalian MT (rat MT isoforms). This reduced affinity is attributed to major substitutions in chicken MT in the regions of the principal determinants of mammalian MTs. Theoretical analysis of the primary structure predicted the secondary structure to consist of reverse turns and random coils with no stable beta or helix conformations. There is no evidence that chicken MT differs functionally from mammalian MTs. PMID:2448773

  20. Secondary Variants in Individuals Undergoing Exome Sequencing: Screening of 572 Individuals Identifies High-Penetrance Mutations in Cancer-Susceptibility Genes

    PubMed Central

    Johnston, Jennifer J.; Rubinstein, Wendy S.; Facio, Flavia M.; Ng, David; Singh, Larry N.; Teer, Jamie K.; Mullikin, James C.; Biesecker, Leslie G.

    2012-01-01

    Genome- and exome-sequencing costs are continuing to fall, and many individuals are undergoing these assessments as research participants and patients. The issue of secondary (so-called incidental) findings in exome analysis is controversial, and data are needed on methods of detection and their frequency. We piloted secondary variant detection by analyzing exomes for mutations in cancer-susceptibility syndromes in subjects ascertained for atherosclerosis phenotypes. We performed exome sequencing on 572 ClinSeq participants, and in 37 genes, we interpreted variants that cause high-penetrance cancer syndromes by using an algorithm that filtered results on the basis of mutation type, quality, and frequency and that filtered mutation-database entries on the basis of defined categories of causation. We identified 454 sequence variants that differed from the human reference. Exclusions were made on the basis of sequence quality (26 variants) and high frequency in the cohort (77 variants) or dbSNP (17 variants), leaving 334 variants of potential clinical importance. These were further filtered on the basis of curation of literature reports. Seven participants, four of whom were of Ashkenazi Jewish descent and three of whom did not meet family-history-based referral criteria, had deleterious BRCA1 or BRCA2 mutations. One participant had a deleterious SDHC mutation, which causes paragangliomas. Exome sequencing, coupled with multidisciplinary interpretation, detected clinically important mutations in cancer-susceptibility genes; four of such mutations were in individuals without a significant family history of disease. We conclude that secondary variants of high clinical importance will be detected at an appreciable frequency in exomes, and we suggest that priority be given to the development of more efficient modes of interpretation with trials in larger patient groups. PMID:22703879
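    The exclusion counts reported in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name and the dictionary keys are hypothetical labels, and the numbers are the ones quoted in the abstract.

```python
def remaining_after_exclusions(total_variants, exclusions):
    """Return how many variants are left once every exclusion count is subtracted."""
    return total_variants - sum(exclusions.values())


# Counts quoted in the abstract; the key names are labels chosen for this sketch.
exclusions = {
    "low sequence quality": 26,
    "high frequency in cohort": 77,
    "high frequency in dbSNP": 17,
}

remaining = remaining_after_exclusions(454, exclusions)
print(remaining)  # 334, matching the figure reported in the abstract
```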

  1. Common and rare von Willebrand factor (VWF) coding variants, VWF levels, and factor VIII levels in African Americans: the NHLBI Exome Sequencing Project.

    PubMed

    Johnsen, Jill M; Auer, Paul L; Morrison, Alanna C; Jiao, Shuo; Wei, Peng; Haessler, Jeffrey; Fox, Keolu; McGee, Sean R; Smith, Joshua D; Carlson, Christopher S; Smith, Nicholas; Boerwinkle, Eric; Kooperberg, Charles; Nickerson, Deborah A; Rich, Stephen S; Green, David; Peters, Ulrike; Cushman, Mary; Reiner, Alex P

    2013-07-25

    Several rare European von Willebrand disease missense variants of VWF (including p.Arg2185Gln and p.His817Gln) were recently reported to be common in apparently healthy African Americans (AAs). Using data from the NHLBI Exome Sequencing Project, we assessed the association of these and other VWF coding variants with von Willebrand factor (VWF) and factor VIII (FVIII) levels in 4468 AAs. Of 30 nonsynonymous VWF variants, 6 were significantly and independently associated (P < .001) with levels of VWF and/or FVIII. Each additional copy of the common VWF variants encoding p.Thr789Ala or p.Asp1472His was associated with 6 to 8 IU/dL higher VWF levels. The VWF variant encoding p.Arg2185Gln was associated with 7 to 13 IU/dL lower VWF and FVIII levels. The type 2N-related VWF variant encoding p.His817Gln was associated with 17 IU/dL lower FVIII level but normal VWF level. A novel, rare missense VWF variant that predicts disruption of an O-glycosylation site (p.Ser1486Leu) and a rare variant encoding p.Arg2287Trp were each associated with 30 to 40 IU/dL lower VWF level (P < .001). In summary, several common and rare VWF missense variants contribute to phenotypic differences in VWF and FVIII among AAs.

  2. In silico comparative characterization of pharmacogenomic missense variants

    PubMed Central

    2014-01-01

    Background Missense pharmacogenomic (PGx) variants refer to amino acid substitutions that potentially affect the pharmacokinetic (PK) or pharmacodynamic (PD) response to drug therapies. The PGx variants, as compared to disease-associated variants, have not been investigated as deeply. The ability to computationally predict future PGx variants is desirable; however, it is not clear what data sets should be used or what features are beneficial to this end. Hence we carried out a comparative characterization of PGx variants with annotated neutral and disease variants from UniProt, to test the predictive power of sequence conservation and structural information in discriminating these three groups. Results 126 PGx variants of high quality from PharmGKB were selected and two data sets were created: one set contained 416 variants with structural and sequence information, and the other set contained 1,265 variants with sequence information only. In terms of sequence conservation, PGx variants are more conserved than neutral variants and much less conserved than disease variants. A weighted random forest was used to strike a more balanced classification for PGx variants. Generally, structural features are helpful in discriminating PGx variants from the other two groups, but classification of PGx variants from neutral polymorphisms is still much less effective than classification between disease and neutral variants. Conclusions We found that PGx variants are much more similar to neutral variants than to disease variants in the feature space consisting of residue conservation, neighboring residue conservation, number of neighbors, and protein solvent accessibility. Such similarity poses great difficulty in the classification of PGx variants and polymorphisms. PMID:25057096

  3. Quantitative Analysis of Single Amino Acid Variant Peptides Associated with Pancreatic Cancer in Serum by an Isobaric Labeling Quantitative Method

    PubMed Central

    2015-01-01

    Single amino acid variations are highly associated with many human diseases. The direct detection of peptides containing single amino acid variants (SAAVs) derived from nonsynonymous single nucleotide polymorphisms (SNPs) in serum can provide unique opportunities for SAAV associated biomarker discovery. In the present study, an isobaric labeling quantitative strategy was applied to identify and quantify variant peptides in serum samples of pancreatic cancer patients and other benign controls. The largest number of SAAV peptides to date in serum including 96 unique variant peptides were quantified in this quantitative analysis, of which five variant peptides showed a statistically significant difference between pancreatic cancer and other controls (p-value < 0.05). Significant differences in the variant peptide SDNCEDTPEAGYFAVAVVK from serotransferrin were detected between pancreatic cancer and controls, which was further validated by selected reaction monitoring (SRM) analysis. The novel biomarker panel obtained by combining α-1-antichymotrypsin (AACT), Thrombospondin-1 (THBS1) and this variant peptide showed an excellent diagnostic performance in discriminating pancreatic cancer from healthy controls (AUC = 0.98) and chronic pancreatitis (AUC = 0.90). These results suggest that large-scale analysis of SAAV peptides in serum may provide a new direction for biomarker discovery research. PMID:25393578

  4. Coding variants at hexa-allelic amino acid 13 of HLA-DRB1 explain independent SNP associations with follicular lymphoma risk.

    PubMed

    Foo, Jia Nee; Smedby, Karin E; Akers, Nicholas K; Berglund, Mattias; Irwan, Ishak D; Jia, Xiaoming; Li, Yi; Conde, Lucia; Darabi, Hatef; Bracci, Paige M; Melbye, Mads; Adami, Hans-Olov; Glimelius, Bengt; Khor, Chiea Chuen; Hjalgrim, Henrik; Padyukov, Leonid; Humphreys, Keith; Enblad, Gunilla; Skibola, Christine F; de Bakker, Paul I W; Liu, Jianjun

    2013-07-11

    Non-Hodgkin lymphoma represents a diverse group of blood malignancies, of which follicular lymphoma (FL) is a common subtype. Previous genome-wide association studies (GWASs) have identified in the human leukocyte antigen (HLA) class II region multiple independent SNPs that are significantly associated with FL risk. To dissect these signals and determine whether coding variants in HLA genes are responsible for the associations, we conducted imputation, HLA typing, and sequencing in three independent populations for a total of 689 cases and 2,446 controls. We identified a hexa-allelic amino acid polymorphism at position 13 of the HLA-DR beta chain that showed the strongest association with FL within the major histocompatibility complex (MHC) region (multiallelic p = 2.3 × 10⁻¹⁵). Out of six possible amino acids that occurred at that position within the population, we classified two as high risk (Tyr and Phe), two as low risk (Ser and Arg), and two as moderate risk (His and Gly). There was a 4.2-fold difference in risk (95% confidence interval = 2.9-6.1) between subjects carrying two alleles encoding high-risk amino acids and those carrying two alleles encoding low-risk amino acids (p = 1.01 × 10⁻¹⁴). This coding variant might explain the complex SNP associations identified by GWASs and suggests a common HLA-DR antigen-driven mechanism for the pathogenesis of FL and rheumatoid arthritis.

  5. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  6. An Evaluation of Phylogenetic Methods for Reconstructing Transmitted HIV Variants using Longitudinal Clonal HIV Sequence Data

    PubMed Central

    McCloskey, Rosemary M.; Liang, Richard H.; Harrigan, P. Richard; Brumme, Zabrina L.

    2014-01-01

    ABSTRACT A population of human immunodeficiency virus (HIV) within a host often descends from a single transmitted/founder virus. The high mutation rate of HIV, coupled with long delays between infection and diagnosis, make isolating and characterizing this strain a challenge. In theory, ancestral reconstruction could be used to recover this strain from sequences sampled in chronic infection; however, the accuracy of phylogenetic techniques in this context is unknown. To evaluate the accuracy of these methods, we applied ancestral reconstruction to a large panel of published longitudinal clonal and/or single-genome-amplification HIV sequence data sets with at least one intrapatient sequence set sampled within 6 months of infection or seroconversion (n = 19,486 sequences, median [interquartile range] = 49 [20 to 86] sequences/set). The consensus of the earliest sequences was used as the best possible estimate of the transmitted/founder. These sequences were compared to ancestral reconstructions from sequences sampled at later time points using both phylogenetic and phylogeny-naive methods. Overall, phylogenetic methods conferred a 16% improvement in reproducing the consensus of early sequences, compared to phylogeny-naive methods. This relative advantage increased with intrapatient sequence diversity (P < 10⁻⁵) and the time elapsed between the earliest and subsequent samples (P < 10⁻⁵). However, neither approach performed well for reconstructing ancestral indel variation, especially within indel-rich regions of the HIV genome. Although further improvements are needed, our results indicate that phylogenetic methods for ancestral reconstruction significantly outperform phylogeny-naive alternatives, and we identify experimental conditions and study designs that can enhance accuracy of transmitted/founder virus reconstruction. IMPORTANCE When HIV is transmitted into a new host, most of the viruses fail to infect host cells. Consequently, an HIV infection tends to be

  7. Analysis of expression and amino acid sequence of the allergen Mag 3 in two species of house dust mites-Dermatophagoides farinae and D. pteronyssinus (Acari: Astigmata: Pyroglyphidae).

    PubMed

    Asman, Marek; Solarz, Krzysztof; Szilman, Ewa; Szilman, Piotr

    2010-01-01

    In the 1990s, two new and important house dust mite allergens were cloned and sequenced: Mag 1 and Mag 3. However, the second allergen has so far been identified only in extracts of Dermatophagoides farinae [DF]. In this work, we aimed to detect expression of this important allergen and, for the first time, to analyze its amino acid sequence in another species of house dust mite, Dermatophagoides pteronyssinus [DP]. We were able to confirm the expression of allergen Mag 3 in DF and to exclude it in DP. By sequencing the products of DNA amplification, we determined the nucleotide sequence encoding allergen Mag 3 in DF. This analysis enabled detection of 9 single base changes. An analysis of the amino acids encoded by the triplets containing substituted nucleotides revealed that 8 changes were polymorphic, and 1 was a mutation substituting GTG (valine) for ATG (methionine) at position 236. The presence of an amino acid sequence difference in this allergen might suggest that other isoforms exist, which could complicate both diagnosis and immunotherapy in persons who produce an allergic response to this allergen. The variants of allergen Mag 3 (group 14) are still not known, in contrast to the well-characterized variants of the other main groups 1, 2, 4, 5 or 7. Thus, the identification and definition of allergic properties of allergen Mag 3 variants needs to be further investigated.
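    Whether a single base change within a codon alters the encoded amino acid can be checked against the standard genetic code. The following is a minimal sketch, not the authors' analysis: the function names are hypothetical, and the table is the standard nuclear code written in the usual T, C, A, G order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated as TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Label a codon change as synonymous, missense, or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# The two codons named in the abstract differ at the first base only.
print(CODON_TABLE["GTG"], CODON_TABLE["ATG"])   # V M
print(classify_codon_change("GTG", "ATG"))      # missense
print(classify_codon_change("GTG", "GTA"))      # synonymous
```

    A change between GTG and ATG therefore alters the residue (valine versus methionine), whereas a third-position change such as GTG to GTA leaves valine in place.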

  8. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  9. Pooled sequencing and rare variant association tests for identifying the determinants of emerging drug resistance in malaria parasites.

    PubMed

    Cheeseman, Ian H; McDew-White, Marina; Phyo, Aung Pyae; Sriprawat, Kanlaya; Nosten, François; Anderson, Timothy J C

    2015-04-01

    We explored the potential of pooled sequencing to swiftly and economically identify selective sweeps due to emerging artemisinin (ART) resistance in a South-East Asian malaria parasite population. ART resistance is defined by slow parasite clearance from the blood of ART-treated patients and mutations in the kelch gene (chr. 13) have been strongly implicated to play a role. We constructed triplicate pools of 70 slow-clearing (resistant) and 70 fast-clearing (sensitive) infections collected from the Thai-Myanmar border and sequenced these to high (∼150-fold) read depth. Allele frequency estimates from pools showed almost perfect correlation (Lin's concordance = 0.98) with allele frequencies at 93 single nucleotide polymorphisms measured directly from individual infections, giving us confidence in the accuracy of this approach. By mapping genome-wide divergence (FST) between pools of drug-resistant and drug-sensitive parasites, we identified two large (>150 kb) regions (on chrs. 13 and 14) and 17 smaller candidate genome regions. To identify individual genes within these genome regions, we resequenced an additional 38 parasite genomes (16 slow and 22 fast-clearing) and performed rare variant association tests. These confirmed kelch as a major molecular marker for ART resistance (P = 6.03 × 10⁻⁶). This two-tier approach is powerful because pooled sequencing rapidly narrows down genome regions of interest, while targeted rare variant association testing within these regions can pinpoint the genetic basis of resistance. We show that our approach is robust to recurrent mutation and the generation of soft selective sweeps, which are predicted to be common in pathogen populations with large effective population sizes, and may confound more traditional gene mapping approaches.
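    Divergence between two pools at a single biallelic site can be illustrated with a simple heterozygosity-based FST. This is a sketch under stated assumptions: the abstract does not say which FST estimator was used, the function name is hypothetical, and the two pools are weighted equally with no correction for pool size or read depth.

```python
def fst_two_pools(p1, p2):
    """Heterozygosity-based FST for one biallelic site, given the allele
    frequency in each of two equally weighted pools: (HT - HS) / HT."""
    # Mean expected heterozygosity within the pools.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the combined population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    if ht == 0:
        return 0.0  # site is fixed for the same allele in both pools
    return (ht - hs) / ht


print(fst_two_pools(0.5, 0.5))  # 0.0: identical frequencies, no divergence
print(fst_two_pools(0.0, 1.0))  # 1.0: pools fixed for different alleles
```

    Scanning such a statistic along the genome and looking for runs of high values is one way to picture how regions of elevated divergence between resistant and sensitive pools could be flagged.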

  10. Copy number variants calling for single cell sequencing data by multi-constrained optimization.

    PubMed

    Xu, Bo; Cai, Hongmin; Zhang, Changsheng; Yang, Xi; Han, Guoqiang

    2016-08-01

    Variations in DNA copy number carry important information on genome evolution and regulation of DNA replication in cancer cells. The rapid development of single-cell sequencing technology allows one to explore gene expression heterogeneity among single cells, thus providing important cancer cell evolution information. Single-cell DNA/RNA sequencing data usually have low genome coverage, which requires an extra step of amplification to accumulate enough samples. However, such amplification introduces large bias and makes bioinformatics analysis challenging. Accurately modeling the distribution of sequencing data and effectively suppressing the influence of the bias is the key to successful variation analysis. Recent advances demonstrate the technical noises by amplification are more likely to follow negative binomial distribution, a special case of Poisson distribution. Thus, we tackle the problem of CNV detection by formulating it as a quadratic optimization problem involving two constraints, in which the underlying signals are corrupted by Poisson-distributed noise. By imposing the constraints of sparsity and smoothness, the reconstructed read depth signals from single-cell sequencing data are anticipated to fit the CNV patterns more accurately. An efficient numerical solution based on the classical alternating direction minimization method (ADMM) is tailored to solve the proposed model. We demonstrate the advantages of the proposed method using both synthetic and empirical single-cell sequencing data. Our experimental results demonstrate that the proposed method achieves excellent performance and shows high promise with single-cell sequencing data.

  11. A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree

    PubMed Central

    Eberle, Michael A.; Fritzilas, Epameinondas; Krusche, Peter; Källberg, Morten; Moore, Benjamin L.; Bekritsky, Mitchell A.; Iqbal, Zamin; Chuang, Han-Yu; Humphray, Sean J.; Halpern, Aaron L.; Kruglyak, Semyon; Margulies, Elliott H.; McVean, Gil; Bentley, David R.

    2017-01-01

    Improvement of variant calling in next-generation sequence data requires a comprehensive, genome-wide catalog of high-confidence variants called in a set of genomes for use as a benchmark. We generated deep, whole-genome sequence data of 17 individuals in a three-generation pedigree and called variants in each genome using a range of currently available algorithms. We used haplotype transmission information to create a phased “Platinum” variant catalog of 4.7 million single-nucleotide variants (SNVs) plus 0.7 million small (1–50 bp) insertions and deletions (indels) that are consistent with the pattern of inheritance in the parents and 11 children of this pedigree. Platinum genotypes are highly concordant with the current catalog of the National Institute of Standards and Technology for both SNVs (>99.99%) and indels (99.92%) and add a validated truth catalog that has 26% more SNVs and 45% more indels. Analysis of 334,652 SNVs that were consistent between informatics pipelines yet inconsistent with haplotype transmission (“nonplatinum”) revealed that the majority of these variants are de novo and cell-line mutations or reside within previously unidentified duplications and deletions. The reference materials from this study are a resource for objective assessment of the accuracy of variant calls throughout genomes. PMID:27903644
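    The idea of validating genotype calls by inheritance can be shown with a basic trio check. This is a deliberately simplified sketch: the study describes phased haplotype transmission across a 17-member pedigree, whereas the hypothetical function below only tests whether an unphased child genotype could be formed from one allele of each parent.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """Return True if the child's unphased genotype can be formed by taking
    one allele from the father and one from the mother."""
    child_sorted = sorted(child)
    return any(
        sorted((paternal, maternal)) == child_sorted
        for paternal, maternal in product(father, mother)
    )


# Genotypes are pairs of alleles at one site (illustrative values only).
print(mendelian_consistent(("A", "A"), ("A", "G"), ("A", "G")))  # True
print(mendelian_consistent(("A", "A"), ("A", "A"), ("A", "G")))  # False
```

    A call that fails such a check is either an error or a genuine de novo event, which is why the abstract treats inheritance-inconsistent variants as a separate category.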

  12. Variants of glycoside hydrolases

    DOEpatents

    Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung

    2013-02-26

    The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  13. Variants of glycoside hydrolases

    DOEpatents

    Teter, Sarah; Ward, Connie; Cherry, Joel; Jones, Aubrey; Harris, Paul; Yi, Jung

    2011-04-26

    The present invention relates to variants of a parent glycoside hydrolase, comprising a substitution at one or more positions corresponding to positions 21, 94, 157, 205, 206, 247, 337, 350, 373, 383, 438, 455, 467, and 486 of amino acids 1 to 513 of SEQ ID NO: 2, and optionally further comprising a substitution at one or more positions corresponding to positions 8, 22, 41, 49, 57, 113, 193, 196, 226, 227, 246, 251, 255, 259, 301, 356, 371, 411, and 462 of amino acids 1 to 513 of SEQ ID NO: 2, wherein the variants have glycoside hydrolase activity. The present invention also relates to nucleotide sequences encoding the variant glycoside hydrolases and to nucleic acid constructs, vectors, and host cells comprising the nucleotide sequences.

  14. Development of a Targeted Multi-Disorder High-Throughput Sequencing Assay for the Effective Identification of Disease-Causing Variants

    PubMed Central

    Delio, Maria; Patel, Kunjan; Maslov, Alex; Marion, Robert W.; McDonald, Thomas V.; Cadoff, Evan M.; Golden, Aaron; Greally, John M.; Vijg, Jan; Morrow, Bernice; Montagna, Cristina

    2015-01-01

    Background While next generation sequencing (NGS) is a useful tool for the identification of genetic variants to aid diagnosis and support therapy decisions, high sequencing costs have limited its application within routine clinical care, especially in economically depressed areas. To investigate the utility of a multi-disease NGS-based genetic test, we designed a custom sequencing assay targeting over thirty disease-associated areas including cardiac disorders, intellectual disabilities, hearing loss, collagenopathies, muscular dystrophy, Ashkenazi Jewish genetic disorders, and complex Mendelian disorders. We focused on these specific areas based on the interest of our collaborative clinical team, which suggested that these diseases are the ones in need of a sequencing-based screening assay. Results We targeted all coding regions, untranslated regions (UTRs) and flanking intronic regions of 650 known disease-associated genes using the Roche-NimbleGen EZ SeqCapV3 capture system and sequenced on the Illumina HiSeq 2500 Rapid Run platform. Eight controls with known variants and one HapMap sample were first sequenced to assess the performance of the panel. Subsequently, as a proof of principle and to explore the possible utility of our test, we analyzed test disease subjects (n = 16). Eight had known Mendelian disorders and eight had complex pediatric diseases. In addition, to assess whether copy number variation may be of utility as a companion assay relative to these specific disease areas, we used the Affymetrix Genome-Wide SNP Array 6.0 to analyze the same samples. Conclusion We identified potentially disease-associated variants: 22 missense, 4 nonsense, 1 frameshift, and 1 splice variant (16 previously identified, 12 novel among dbSNP and 15 novel among NHLBI Exome Variant Server). We found multi-disease targeted high-throughput sequencing to be a cost-efficient approach to detecting disease-associated variants to aid diagnosis. PMID:26214305

  15. Long insert whole genome sequencing for copy number variant and translocation detection

    PubMed Central

    Liang, Winnie S.; Aldrich, Jessica; Tembe, Waibhav; Kurdoglu, Ahmet; Cherni, Irene; Phillips, Lori; Reiman, Rebecca; Baker, Angela; Weiss, Glen J.; Carpten, John D.; Craig, David W.

    2014-01-01

    As next-generation sequencing continues to have an expanding presence in the clinic, the identification of the most cost-effective and robust strategy for identifying copy number changes and translocations in tumor genomes is needed. We hypothesized that performing shallow whole genome sequencing (WGS) of 900–1000-bp inserts (long insert WGS, LI-WGS) improves our ability to detect these events, compared with shallow WGS of 300–400-bp inserts. A priori analyses show that LI-WGS requires less sequencing compared with short insert WGS to achieve a target physical coverage, and that LI-WGS requires less sequence coverage to detect a heterozygous event with a power of 0.99. We thus developed an LI-WGS library preparation protocol based off of Illumina’s WGS library preparation protocol and illustrate the feasibility of performing LI-WGS. We additionally applied LI-WGS to three separate tumor/normal DNA pairs collected from patients diagnosed with different cancers to demonstrate our application of LI-WGS on actual patient samples for identification of somatic copy number alterations and translocations. With the evolution of sequencing technologies and bioinformatics analyses, we show that modifications to current approaches may improve our ability to interrogate cancer genomes. PMID:24071583
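
    The a priori argument in this abstract, that longer inserts need less sequencing to reach a target physical coverage, can be illustrated with a small sketch. Physical coverage here counts the whole insert spanned by a read pair, while sequence coverage counts only the sequenced bases. The genome size, read length, target coverage and the midpoint insert sizes (350 bp and 950 bp) are assumptions chosen for illustration and are not values reported by the authors.

```python
# Sketch: read pairs needed for a target physical coverage at two insert sizes.
# All numeric inputs below are illustrative assumptions, not values from the study.

def read_pairs_for_physical_coverage(target_coverage, insert_size_bp, genome_size_bp):
    """Read pairs needed so that inserts span the genome `target_coverage` times."""
    return target_coverage * genome_size_bp / insert_size_bp


def sequence_coverage(read_pairs, read_length_bp, genome_size_bp):
    """Sequence coverage obtained from paired-end reads of a given length."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 3.1e9          # assumed approximate human genome size
READ_LENGTH_BP = 100            # assumed paired-end read length
TARGET_PHYSICAL_COVERAGE = 30   # assumed target

# Midpoints of the 300-400 bp and 900-1000 bp ranges named in the abstract.
short_pairs = read_pairs_for_physical_coverage(TARGET_PHYSICAL_COVERAGE, 350, GENOME_SIZE_BP)
long_pairs = read_pairs_for_physical_coverage(TARGET_PHYSICAL_COVERAGE, 950, GENOME_SIZE_BP)

short_seq_cov = sequence_coverage(short_pairs, READ_LENGTH_BP, GENOME_SIZE_BP)
long_seq_cov = sequence_coverage(long_pairs, READ_LENGTH_BP, GENOME_SIZE_BP)

print(f"short inserts: {short_pairs:.3e} pairs, sequence coverage {short_seq_cov:.1f}x")
print(f"long inserts:  {long_pairs:.3e} pairs, sequence coverage {long_seq_cov:.1f}x")
```

    Under these assumptions the long-insert library reaches the same physical coverage with fewer read pairs, which is the direction of the effect the abstract describes.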

  16. Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.

    PubMed

    Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi

    2016-03-01

    Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding for octopamine and dopamine receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved, and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.

  17. Targeted sequencing identifies a novel SH2D1A pathogenic variant in a Chinese family: Carrier screening and prenatal genetic testing.

    PubMed

    Zhang, Jun-Yu; Chen, Song-Chang; Chen, Yi-Yao; Li, Shu-Yuan; Zhang, Lan-Lan; Shen, Ying-Hua; Chang, Chun-Xin; Xiang, Yu-Qian; Huang, He-Feng; Xu, Chen-Ming

    2017-01-01

    X-linked lymphoproliferative disease type 1 (XLP1) is a rare primary immunodeficiency characterized by a clinical triad consisting of severe EBV-induced hemophagocytic lymphohistiocytosis, B-cell lymphoma, and dysgammaglobulinemia. Mutations in the SH2D1A gene have been revealed as the cause of XLP1. In this study, a pregnant woman with a recurrent history of giving birth to children with immunodeficiency was screened for a pathogenic variant because the proband sample was unavailable. We aimed to clarify the genetic diagnosis and provide prenatal testing for the family. A next-generation sequencing (NGS)-based multigene panel was used in carrier screening of the pregnant woman. Variants of immunodeficiency-related genes were analyzed and prioritized. The candidate variant was verified by Sanger sequencing. The possible influence of the identified variant was evaluated through RNA assay. Amniocentesis, karyotyping, and Sanger sequencing were performed for prenatal testing. We identified a novel de novo frameshift SH2D1A pathogenic variant (c.251_255delTTTCA) in the pregnant carrier. Peripheral blood RNA assay indicated that the mutant transcript could escape nonsense-mediated mRNA decay (NMD) and might encode a C-terminal truncated protein. Information on the variant led to successful prenatal diagnosis of the fetus. In conclusion, our study clarified the genetic diagnosis and altered disease prevention for a pregnant carrier of XLP1.
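
    The variant name c.251_255delTTTCA describes a deletion of five coding nucleotides, and a deletion whose length is not a multiple of three shifts the reading frame. The sketch below applies such a deletion to a made-up coding sequence. The sequence is a hypothetical placeholder built so that positions 251-255 read TTTCA; it is not the SH2D1A sequence, and the helper names are invented for this illustration.

```python
# Sketch: apply an HGVS-style coding deletion (1-based, inclusive) and test for a frameshift.
# The coding sequence below is a synthetic placeholder, NOT the real SH2D1A sequence.

def apply_deletion(cds, start, end, deleted_bases):
    """Delete cds[start..end] (1-based, inclusive) after checking the expected bases."""
    observed = cds[start - 1:end]
    if observed != deleted_bases:
        raise ValueError(f"expected {deleted_bases} at {start}-{end}, found {observed}")
    return cds[:start - 1] + cds[end:]


def is_frameshift(start, end):
    """A deletion shifts the frame when its length is not a multiple of three."""
    return (end - start + 1) % 3 != 0


# Hypothetical 288-nt coding sequence: 250 nt of filler, then TTTCA at 251-255.
prefix = "ATG" + "GCT" * 82 + "G"      # 250 nucleotides
suffix = "GCT" * 10 + "TAA"            # 33 nucleotides
reference_cds = prefix + "TTTCA" + suffix

mutant_cds = apply_deletion(reference_cds, 251, 255, "TTTCA")

print(len(reference_cds), len(mutant_cds), is_frameshift(251, 255))
```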

  18. Targeted sequencing identifies a novel SH2D1A pathogenic variant in a Chinese family: Carrier screening and prenatal genetic testing

    PubMed Central

    Chen, Yi-Yao; Li, Shu-Yuan; Zhang, Lan-Lan; Shen, Ying-Hua; Chang, Chun-Xin; Xiang, Yu-Qian; Huang, He-Feng; Xu, Chen-Ming

    2017-01-01

    X-linked lymphoproliferative disease type 1 (XLP1) is a rare primary immunodeficiency characterized by a clinical triad consisting of severe EBV-induced hemophagocytic lymphohistiocytosis, B-cell lymphoma, and dysgammaglobulinemia. Mutations in the SH2D1A gene have been revealed as the cause of XLP1. In this study, a pregnant woman with a recurrent history of giving birth to children with immunodeficiency was screened for a pathogenic variant because the proband sample was unavailable. We aimed to clarify the genetic diagnosis and provide prenatal testing for the family. A next-generation sequencing (NGS)-based multigene panel was used in carrier screening of the pregnant woman. Variants of immunodeficiency-related genes were analyzed and prioritized. The candidate variant was verified by Sanger sequencing. The possible influence of the identified variant was evaluated through RNA assay. Amniocentesis, karyotyping, and Sanger sequencing were performed for prenatal testing. We identified a novel de novo frameshift SH2D1A pathogenic variant (c.251_255delTTTCA) in the pregnant carrier. Peripheral blood RNA assay indicated that the mutant transcript could escape nonsense-mediated mRNA decay (NMD) and might encode a C-terminal truncated protein. Information on the variant led to successful prenatal diagnosis of the fetus. In conclusion, our study clarified the genetic diagnosis and altered disease prevention for a pregnant carrier of XLP1. PMID:28231257

  19. Complete Nucleotide Sequence of IncP-1β Plasmid pDTC28 Reveals a Non-Functional Variant of the blaGES-Type Gene

    PubMed Central

    Dang, Bingjun; Mao, Daqing; Luo, Yi

    2016-01-01

    Plasmid pDTC28 was isolated from the sediments of Haihe River using E. coli CV601 (gfp-tagged) as recipient and indigenous bacteria from the sediment as donors. This plasmid confers reduced susceptibility to tetracycline and sulfamethoxazole. The complete sequence of plasmid pDTC28 was 61,503 bp in length with an average G+C content of 64.09%. Plasmid pDTC28 belongs to the IncP-1β group by phylogenetic analysis. The backbones of plasmid pDTC28 and other IncP-1β plasmids are very classical and conserved, whereas the accessory regions of these plasmids are diverse. A blaGES-5-like gene was found on the accessory region, and this blaGES-5-like gene contained 18 silent mutations and 7 missense mutations compared with the blaGES-5 gene. The mutations resulted in 7 amino acid substitutions in GES-5 carbapenemase, causing the loss of function of the blaGES-5-like gene on plasmid pDTC28 against carbapenems and even β-lactams. The enzyme produced by the blaGES-5-like gene cassette may be a new variant of GES-type enzymes. Thus, the plasmid sequenced in this study will expand our understanding of GES-type β-lactamases and provide insights into the genetic platforms used for the dissemination of GES-type genes. PMID:27152950

  20. Evaluation of a 5-tier scheme proposed for classification of sequence variants using bioinformatic and splicing assay data: inter-reviewer variability and promotion of minimum reporting guidelines.

    PubMed

    Walker, Logan C; Whiley, Phillip J; Houdayer, Claude; Hansen, Thomas V O; Vega, Ana; Santamarina, Marta; Blanco, Ana; Fachal, Laura; Southey, Melissa C; Lafferty, Alan; Colombo, Mara; De Vecchi, Giovanna; Radice, Paolo; Spurdle, Amanda B

    2013-10-01

    Splicing assays are commonly undertaken in the clinical setting to assess the clinical relevance of sequence variants in disease predisposition genes. A 5-tier classification system incorporating both bioinformatic and splicing assay information was previously proposed as a method to provide consistent clinical classification of such variants. Members of the ENIGMA Consortium Splicing Working Group undertook a study to assess the applicability of the scheme to published assay results, and the consistency of classifications across multiple reviewers. Splicing assay data were identified for 235 BRCA1 and 176 BRCA2 unique variants, from 77 publications. At least six independent reviewers from research and/or clinical settings comprehensively examined splicing assay methods and data reported for 22 variant assays of 21 variants in four publications, and classified the variants using the 5-tier classification scheme. Inconsistencies in variant classification occurred between reviewers for 17 of the variant assays. These could be attributed to a combination of ambiguity in presentation of the classification criteria, differences in interpretation of the data provided, nonstandardized reporting of results, and the lack of quantitative data for the aberrant transcripts. We propose suggestions for minimum reporting guidelines for splicing assays, and improvements to the 5-tier splicing classification system to allow future evaluation of its performance as a clinical tool.

  1. Evidence for differences in the binding of drugs to the two main genetic variants of human alpha 1-acid glycoprotein.

    PubMed Central

    Herve, F; Gomas, E; Duche, J C; Tillement, J P

    1993-01-01

    1. Human alpha 1-acid glycoprotein (AAG), a plasma transport protein, has three main genetic variants: F1, S and A. Native commercial AAG (a mixture of almost equal proportions of these three variants) has been separated by chromatography into variants which correspond to the proteins of the two genes which code for AAG in humans: the A variant and a mixture of the F1 and S variants (60% F1 and 40% S). Their binding properties towards imipramine, warfarin and mifepristone were studied by equilibrium dialysis. 2. The F1S variant mixture strongly bound warfarin and mifepristone with affinities of 1.89 x 10^6 and 2.06 x 10^6 l mol^-1, respectively, but had a low affinity for imipramine. Conversely, the A variant strongly bound imipramine with an affinity of 0.98 x 10^6 l mol^-1. The low degree of binding of warfarin and mifepristone to the A variant sample was explained by the presence of protein contaminants in this sample. These results indicate specific drug transport roles for each variant, with respect to its separate genetic origin. 3. Control binding experiments performed with (unfractionated) commercial AAG and with AAG isolated from individuals with either the F1/A or S/A phenotypes agreed with these findings. The results for the binding of warfarin and mifepristone by the AAG samples were similar to those obtained with the F1S mixture: the mean high-affinity association constant of the AAG samples for each drug was of the same order as that of the F1S mixture; the decrease in the number of binding sites of the AAG samples, as compared with the F1S mixture, was explained by the smaller proportion of variants F1 and/or S in these samples. Conversely, results of the imipramine binding study with the AAG samples concurred with those for the binding of this basic drug by the A variant, with respect to the proportion of the A variant in these samples. PMID:9114911

  2. Evidence for differences in the binding of drugs to the two main genetic variants of human alpha 1-acid glycoprotein.

    PubMed

    Herve, F; Gomas, E; Duche, J C; Tillement, J P

    1993-09-01

    1. Human alpha 1-acid glycoprotein (AAG), a plasma transport protein, has three main genetic variants: F1, S and A. Native commercial AAG (a mixture of almost equal proportions of these three variants) has been separated by chromatography into variants which correspond to the proteins of the two genes which code for AAG in humans: the A variant and a mixture of the F1 and S variants (60% F1 and 40% S). Their binding properties towards imipramine, warfarin and mifepristone were studied by equilibrium dialysis. 2. The F1S variant mixture strongly bound warfarin and mifepristone with affinities of 1.89 x 10^6 and 2.06 x 10^6 l mol^-1, respectively, but had a low affinity for imipramine. Conversely, the A variant strongly bound imipramine with an affinity of 0.98 x 10^6 l mol^-1. The low degree of binding of warfarin and mifepristone to the A variant sample was explained by the presence of protein contaminants in this sample. These results indicate specific drug transport roles for each variant, with respect to its separate genetic origin. 3. Control binding experiments performed with (unfractionated) commercial AAG and with AAG isolated from individuals with either the F1/A or S/A phenotypes agreed with these findings. The results for the binding of warfarin and mifepristone by the AAG samples were similar to those obtained with the F1S mixture: the mean high-affinity association constant of the AAG samples for each drug was of the same order as that of the F1S mixture; the decrease in the number of binding sites of the AAG samples, as compared with the F1S mixture, was explained by the smaller proportion of variants F1 and/or S in these samples. Conversely, results of the imipramine binding study with the AAG samples concurred with those for the binding of this basic drug by the A variant, with respect to the proportion of the A variant in these samples.
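
    To give a feel for what an association constant of this size means, the sketch below evaluates the textbook single-site binding relation, occupancy = Ka·[L] / (1 + Ka·[L]). The single independent site model and the free ligand concentration are assumptions made for illustration; the abstract reports only the association constants, and the authors' own analysis of the dialysis data may differ.

```python
# Sketch: fractional occupancy of a single independent binding site.
# The one-site model and the free ligand concentration are illustrative assumptions.

def fractional_occupancy(association_constant, free_ligand_molar):
    """Fraction of sites occupied: Ka*[L] / (1 + Ka*[L])."""
    product = association_constant * free_ligand_molar
    return product / (1.0 + product)


KA_WARFARIN_F1S = 1.89e6      # l/mol, F1S mixture, as quoted in the abstract
KA_IMIPRAMINE_A = 0.98e6      # l/mol, A variant, as quoted in the abstract
FREE_LIGAND = 1.0e-6          # mol/l, assumed free drug concentration

for label, ka in [("warfarin / F1S", KA_WARFARIN_F1S), ("imipramine / A", KA_IMIPRAMINE_A)]:
    print(f"{label}: occupancy {fractional_occupancy(ka, FREE_LIGAND):.2f}")
```

    In this model half of the sites are occupied when the free ligand concentration equals 1/Ka, so a larger association constant means half-saturation at a lower drug concentration.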

  3. Sequence variants of IDE are associated with the extent of beta-amyloid deposition in the Alzheimer's disease brain.

    PubMed

    Blomqvist, Mia E-L; Chalmers, Katy; Andreasen, Niels; Bogdanovic, Nenad; Wilcock, Gordon K; Cairns, Nigel J; Feuk, Lars; Brookes, Anthony J; Love, Seth; Blennow, Kaj; Kehoe, Patrick G; Prince, Jonathan A

    2005-06-01

    Insulin degrading enzyme, encoded by IDE, plays a primary role in the degradation of amyloid beta-protein (A beta), the deposition of which in senile plaques is one of the defining hallmarks of Alzheimer's disease (AD). We recently identified haplotypes in a broad linkage disequilibrium (LD) block encompassing IDE that associate with several AD-related quantitative traits. Here, by examining 32 polymorphic markers extending across IDE and testing quantitative measures of plaque density and cognitive function in three independent Swedish AD samples, we have refined the probable position of pathogenic sequences to a 3' region of IDE, with local maximum effects in the proximity of marker rs1887922. To replicate these findings, a subset of variants were examined against measures of brain A beta load in an independent English AD sample, whereby maximum effects were again observed for rs1887922. For both Swedish and English autopsy materials, variation at rs1887922 explained approximately 10% of the total variance in the respective histopathology traits. However, across all clinical materials studied to date, this variant site does not appear to associate directly with disease, suggesting that IDE may affect AD severity rather than risk. Results indicate that alleles of IDE contribute to variability in A beta deposition in the AD brain and suggest that this relationship may have relevance for the degree of cognitive dysfunction in AD patients.

  4. Association analysis for feet and legs disorders with whole-genome sequence variants in 3 dairy cattle breeds.

    PubMed

    Wu, Xiaoping; Guldbrandtsen, Bernt; Lund, Mogens Sandø; Sahana, Goutam

    2016-09-01

    Identification of genetic variants associated with feet and legs disorders (FLD) will aid in the genetic improvement of these traits by providing knowledge on genes that influence trait variations. In Denmark, FLD in cattle has been recorded since the 1990s. In this report, we used deregressed breeding values as response variables for a genome-wide association study. Bulls (5,334 Danish Holstein, 4,237 Nordic Red Dairy Cattle, and 1,180 Danish Jersey) with deregressed estimated breeding values were genotyped with the Illumina Bovine 54k single nucleotide polymorphism (SNP) genotyping array. Genotypes were imputed to whole-genome sequence variants, and then 22,751,039 SNP on 29 autosomes were used for an association analysis. A modified linear mixed-model approach (efficient mixed-model association eXpedited, EMMAX) and a linear mixed model were used for association analysis. We identified 5 (3,854 SNP), 3 (13,642 SNP), and 0 quantitative trait locus (QTL) regions associated with the FLD index in Danish Holstein, Nordic Red Dairy Cattle, and Danish Jersey populations, respectively. We did not identify any QTL that were common among the 3 breeds. In a meta-analysis of the 3 breeds, 4 QTL regions were significant, but no additional QTL region was identified compared with within-breed analyses. Comparison between top SNP locations within these QTL regions and known genes suggested that RASGRP1, LCORL, MOS, and MITF may be candidate genes for FLD in dairy cattle.

  5. Toll-Like Receptor (TLR)-Associated Sequence Variants and Prostate Cancer Risk among Men of African Descent

    PubMed Central

    Rogers, Erica N.; Jones, Dominique; Kidd, Nayla C.; Yeyeodu, Susan; Brock, Guy; Ragin, Camille; Jackson, Maria; McFarlane-Anderson, Norma; Tulloch-Reid, Marshall; Kimbro, K. Sean; Kidd, LaCreis R.

    2013-01-01

    BACKGROUND Recent advances demonstrate a relationship between chronic/recurrent inflammation and prostate cancer (PCA). Among inflammatory regulators, toll-like receptors (TLRs) play a critical role in innate immune responses. However, it remains unclear whether variant TLR genes influence PCA risk among men of African descent. Therefore, we evaluated the impact of 32 TLR-associated single nucleotide polymorphisms (SNPs) on PCA risk among African-Americans and Jamaicans. METHODS SNP profiles of 814 subjects were evaluated using Illumina’s Veracode genotyping platform. Single and combined effects of SNPs in relation to PCA risk were assessed using age-adjusted logistic regression and entropy-based multifactor dimensionality reduction (MDR) models. RESULTS Seven sequence variants detected in TLR6, TOLLIP, IRAK4, IRF3 were marginally related to PCA. However, none of these effects remained significant after adjusting for multiple hypothesis testing. Nevertheless, MDR modeling revealed a complex interaction between IRAK4 rs4251545 and TLR2 rs1898830 as a significant predictor of PCA risk among U.S. men (permutation testing p-value = 0.001). CONCLUSIONS MDR identified an interaction between IRAK4 and TLR2 as the best two factor model for predicting PCA risk among men of African descent. However, these findings require further assessment and validation. PMID:23657238
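
    The abstract notes that nominal single-SNP associations did not survive adjustment for multiple hypothesis testing across 32 SNPs. The sketch below shows one common adjustment, the Bonferroni correction. The abstract does not say which correction the authors used, so the method is an assumption, and the SNP labels and p-values are invented placeholders.

```python
# Sketch: Bonferroni adjustment for a panel of 32 tests.
# The correction method, SNP labels and p-values are assumptions for illustration.

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, n_tests, alpha=0.05):
    """Map each label to True if its p-value passes the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return {label: p <= threshold for label, p in p_values.items()}


N_SNPS = 32  # number of TLR-associated SNPs evaluated, from the abstract

hypothetical_p_values = {
    "snp_a": 0.03,    # nominally significant at 0.05, fails after correction
    "snp_b": 0.001,   # passes the corrected threshold
    "snp_c": 0.20,
}

print(bonferroni_threshold(0.05, N_SNPS))
print(significant_after_bonferroni(hypothetical_p_values, N_SNPS))
```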

  6. Using Sequence Variants in Linkage Disequilibrium with Causative Mutations to Improve Across-Breed Prediction in Dairy Cattle: A Simulation Study

    PubMed Central

    van den Berg, Irene; Boichard, Didier; Guldbrandtsen, Bernt; Lund, Mogens S.

    2016-01-01

    Sequence data are expected to increase the reliability of genomic prediction by containing causative mutations directly, especially in cases where low linkage disequilibrium between markers and causative mutations limits prediction reliability, such as across-breed prediction in dairy cattle. In practice, the causative mutations are unknown, and prediction with only variants in perfect linkage disequilibrium with the causative mutations is not realistic, leading to a reduced reliability compared to knowing the causative variants. Our objective was to use sequence data to investigate the potential benefits of sequence data for the prediction of genomic relationships, and consequently reliability of genomic breeding values. We used sequence data from five dairy cattle breeds, and a larger number of imputed sequences for two of the five breeds. We focused on the influence of linkage disequilibrium between markers and causative mutations, and assumed that a fraction of the causative mutations was shared across breeds and had the same effect across breeds. By comparing the loss in reliability of different scenarios, varying the distance between markers and causative mutations, using either all genome wide markers from commercial SNP chips, or only the markers closest to the causative mutations, we demonstrate the importance of using only variants very close to the causative mutations, especially for across-breed prediction. Rare variants improved prediction only if they were very close to rare causative mutations, and all causative mutations were rare. Our results show that sequence data can potentially improve genomic prediction, but careful selection of markers is essential. PMID:27317779

  7. Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer.

    PubMed

    Palazzo, Antonio; Lovero, Domenica; D'Addabbo, Pietro; Caizzi, Ruggiero; Marsano, René Massimiliano

    2016-01-01

    Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the evolutionary dynamics of the Bari transposon and increases our understanding of the biology of Tc1-mariner elements.

  8. Identification of Bari Transposons in 23 Sequenced Drosophila Genomes Reveals Novel Structural Variants, MITEs and Horizontal Transfer

    PubMed Central

    D’Addabbo, Pietro; Caizzi, Ruggiero

    2016-01-01

    Bari elements are members of the Tc1-mariner superfamily of DNA transposons, originally discovered in Drosophila melanogaster, and subsequently identified in silico in 11 sequenced Drosophila genomes and as experimentally isolated in four non-sequenced Drosophila species. Bari-like elements have also been studied for their mobility both in vivo and in vitro. We analyzed 23 Drosophila genomes and carried out a detailed characterization of the Bari elements identified, including those from the heterochromatic Bari1 cluster in D. melanogaster. We have annotated 401 copies of Bari elements classified either as putatively autonomous or inactive according to the structure of the terminal sequences and the presence of a complete transposase-coding region. Analyses of the integration sites revealed that Bari transposase prefers AT-rich sequences in which the TA target is cleaved and duplicated. Furthermore, evaluation of transposon co-occurrence near the integration sites of Bari elements showed a non-random distribution of other transposable elements. We also unveil the existence of a putatively autonomous Bari1 variant characterized by two identical long Terminal Inverted Repeats, in D. rhopaloa. In addition, we detected MITEs related to Bari transposons in 9 species. Phylogenetic analyses based on the transposase gene and the terminal sequences confirmed that Bari-like elements are distributed into three subfamilies. A few inconsistencies in the Bari phylogenetic tree with respect to the Drosophila species tree could be explained by the occurrence of horizontal transfer events, as also suggested by the results of dS analyses. This study further clarifies the evolutionary dynamics of the Bari transposon and increases our understanding of the biology of Tc1-mariner elements. PMID:27213270

  9. The complementary deoxyribonucleic acid sequence of guinea pig endometrial prorelaxin.

    PubMed

    Lee, Y A; Bryant-Greenwood, G D; Mandel, M; Greenwood, F C

    1992-03-01

    The nucleotide sequence of the relaxin gene transcript in the endometrium of the late pregnant guinea pig has been determined. The strategy used was a combination of polymerase chain reaction (PCR) with primers designed from the mRNA sequence of porcine preprorelaxin, rapid amplification of cDNA ends-PCR, and blunt end cloning in M13 mp18. With heterologous primers, a 226-basepair (bp) segment of the guinea pig relaxin gene sequence was obtained and was used to design a guinea pig-specific primer for use with the rapid amplification of cDNA ends-PCR method. The latter allowed completion of the sequence of 336 bp, with a 96-bp overlap. The sequence obtained shows greater homology at both the nucleotide and amino acid levels with porcine and human relaxins H1 and H2 than with rat relaxin, supporting the thesis that the guinea pig is not a rodent. The transcription of the guinea pig endometrial relaxin gene during pregnancy was confirmed by Northern analysis of guinea pig endometrial tissues with a species-specific cDNA probe. The endometrial relaxin gene is transcribed during pregnancy, but not in lactation, consistent with the observed immunostaining for relaxin.

  10. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  11. Panel sequencing for clinically oriented variant screening and copy number detection in 142 untreated multiple myeloma patients

    PubMed Central

    Kortuem, K M; Braggio, E; Bruins, L; Barrio, S; Shi, C S; Zhu, Y X; Tibes, R; Viswanatha, D; Votruba, P; Ahmann, G; Fonseca, R; Jedlowski, P; Schlam, I; Kumar, S; Bergsagel, P L; Stewart, A K

    2016-01-01

    We employed a customized Multiple Myeloma (MM)-specific Mutation Panel (M3P) to screen a homogenous cohort of 142 untreated MM patients for relevant mutations in a selection of disease-specific genes. M3Pv2.0 includes 77 genes selected for being either actionable targets, potentially related to drug–response or part of known key pathways in MM biology. We identified mutations in potentially actionable genes in 49% of patients and provided prognostic evidence of STAT3 mutations. This panel may serve as a practical alternative to more comprehensive sequencing approaches, providing genomic information in a timely and cost-effective manner, thus allowing clinically oriented variant screening in MM. PMID:26918361

  12. Enhancer Sequence Variants and Transcription Factor Deregulation Synergize to Construct Pathogenic Regulatory Circuits in B Cell Lymphoma

    PubMed Central

    Koues, Olivia I.; Kowalewski, Rodney A.; Chang, Li-Wei; Pyfrom, Sarah C.; Schmidt, Jennifer A.; Luo, Hong; Sandoval, Luis E.; Hughes, Tyler B.; Bednarski, Jeffrey J.; Cashen, Amanda F.; Payton, Jacqueline E.; Oltz, Eugene M.

    2014-01-01

    Summary Most B cell lymphomas arise in the germinal center (GC), where humoral immune responses evolve from potentially oncogenic cycles of mutation, proliferation, and clonal selection. Although lymphoma gene expression diverges significantly from GC-B cells, underlying mechanisms that alter the activities of corresponding regulatory elements (REs) remain elusive. Here we define the complete pathogenic circuitry of human follicular lymphoma (FL), which activates or decommissions REs from normal GC-B cells and commandeers enhancers from other lineages. Moreover, independent sets of transcription factors, whose expression was deregulated in FL, targeted commandeered versus decommissioned REs. Our approach revealed two distinct subtypes of low-grade FL, whose pathogenic circuitries resembled GC-B or activated B cells. FL-altered enhancers also were enriched for sequence variants, including somatic mutations, which disrupt transcription factor binding and expression of circuit-linked genes. Thus, the pathogenic regulatory circuitry of FL reveals distinct genetic and epigenetic etiologies for GC-B transformation. PMID:25607463

  13. Deep sequencing and variant analysis of an Italian pathogenic field strain of equine infectious anaemia virus.

    PubMed

    Cappelli, K; Cook, R F; Stefanetti, V; Passamonti, F; Autorino, G L; Scicluna, M T; Coletti, M; Verini Supplizi, A; Capomaccio, S

    2017-03-15

    Equine infectious anaemia virus (EIAV) is a lentivirus with an almost worldwide distribution that causes persistent infections in equids. Technical limitations have restricted genetic analysis of EIAV field isolates predominantly to gag sequences resulting in very little published information concerning the extent of inter-strain variation in pol, env and the three ancillary open reading frames (ORFs). Here, we describe the use of long-range PCR in conjunction with next-generation sequencing (NGS) for rapid molecular characterization of all viral ORFs and known transcription factor binding motifs within the long terminal repeat of two EIAV isolates from the 2006 Italian outbreak. These isolates were from foals believed to have been exposed to the same source material but with different clinical histories: one died 53 days post-infection (SA) while the other (DE) survived 5 months despite experiencing multiple febrile episodes. Nucleotide sequence identity between the isolates was 99.358% confirming infection with the same EIAV strain with most differences comprising single nucleotide polymorphisms in env and the second exon of rev. Although the synonymous:non-synonymous nucleotide substitution ratio was approximately 2:1 in gag and pol, the situation is reversed in env and ORF3 suggesting these sequences are subjected to host-mediated selective pressure. EIAV proviral quasispecies complexity in vivo has not been extensively investigated; however, analysis suggests it was relatively low in SA at the time of death. These results highlight advantages of NGS for molecular characterization of EIAV namely it avoids potential artefacts generated by traditional composite sequencing strategies and can provide information about viral quasispecies complexity.
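
    The figure of 99.358% nucleotide sequence identity is a percentage of matching positions between two aligned sequences. A minimal sketch of that calculation follows. It assumes the sequences are already aligned to equal length and skips any column containing a gap, which is one convention among several; the abstract does not state which convention the authors used, and the short sequences are placeholders, not EIAV data.

```python
# Sketch: percent identity between two pre-aligned nucleotide sequences.
# Gap handling (skip columns with '-') is an assumed convention; sequences are placeholders.

def percent_identity(aligned_a, aligned_b):
    """Percentage of identical bases over alignment columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


isolate_one = "ACGTACGT"   # hypothetical aligned fragment
isolate_two = "ACGTACGA"   # differs at the final position

print(f"{percent_identity(isolate_one, isolate_two):.3f}%")
```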

  14. Pooled sequencing of 531 genes in inflammatory bowel disease identifies an associated rare variant in BTNL2 and implicates other immune related genes.

    PubMed

    Prescott, Natalie J; Lehne, Benjamin; Stone, Kristina; Lee, James C; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M; Simpson, Michael A; Spain, Sarah L; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu'Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C; Mansfield, John C; Sanderson, Jeremy; Lewis, Cathryn M; Weale, Michael E; Schlitt, Thomas; Mathew, Christopher G

    2015-02-01

    The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn's disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65 × 10^-10, OR = 2.3 [95% CI = 1.75-3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1-5%) variants in 3 additional genes showed suggestive association (p<0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis.

  15. The KL-VS sequence variant of Klotho and cancer risk in BRCA1 and BRCA2 mutation carriers

    PubMed Central

    Laitman, Yael; Kuchenbaecker, Karoline B.; Rantala, Johanna; Hogervorst, Frans; Peock, Susan; Godwin, Andrew K.; Arason, Adalgeir; Kirchhoff, Tomas; Offit, Kenneth; Isaacs, Claudine; Schmutzler, Rita K.; Wappenschmidt, Barbara; Nevanlinna, Heli; Chen, Xiaoqing; Chenevix-Trench, Georgia; Healey, Sue; Couch, Fergus; Peterlongo, Paolo; Radice, Paolo; Nathanson, Katherine L.; Caligo, Maria Adelaide; Neuhausen, Susan L.; Ganz, Patricia; Sinilnikova, Olga M.; McGuffog, Lesley; Easton, Douglas F.; Antoniou, Antonis C.; Wolf, Ido

    2012-01-01

    Klotho (KL) is a putative tumor suppressor gene in breast and pancreatic cancers located at chromosome 13q12. A functional sequence variant of Klotho (KL-VS) was previously reported to modify breast cancer risk in Jewish BRCA1 mutation carriers. The effect of this variant on breast and ovarian cancer risks in non-Jewish BRCA1/BRCA2 mutation carriers has not been reported. The KL-VS variant was genotyped in women of European ancestry carrying a BRCA mutation: 5,741 BRCA1 mutation carriers (2,997 with breast cancer, 705 with ovarian cancer, and 2,039 cancer free women) and 3,339 BRCA2 mutation carriers (1,846 with breast cancer, 207 with ovarian cancer, and 1,286 cancer free women) from 16 centers. Genotyping was accomplished using TaqMan® allelic discrimination or matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Data were analyzed within a retrospective cohort approach, stratified by country of origin and Ashkenazi Jewish origin. The per-allele hazard ratio (HR) for breast cancer was 1.02 (95% CI 0.93–1.12, P = 0.66) for BRCA1 mutation carriers and 0.92 (95% CI 0.82–1.04, P = 0.17) for BRCA2 mutation carriers. Results remained unaltered when analysis excluded prevalent breast cancer cases. Similarly, the per-allele HR for ovarian cancer was 1.01 (95% CI 0.84–1.20, P = 0.95) for BRCA1 mutation carriers and 0.9 (95% CI 0.66–1.22, P = 0.45) for BRCA2 mutation carriers. The risk did not change when carriers of the 6174delT mutation were excluded. There was a lack of association of the KL-VS Klotho variant with either breast or ovarian cancer risk in BRCA1 and BRCA2 mutation carriers. PMID:22212556
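
    The hazard ratios in this abstract are reported with 95% confidence intervals and P values, and the three are linked on the log scale. The sketch below is an illustration only: it assumes a symmetric Wald interval on the log scale and a normal approximation, which the abstract does not state, and the function name is hypothetical. It recovers the standard error from the interval width and then the two-sided P value.

```python
import math


def wald_p_from_ci(estimate: float, lower: float, upper: float,
                   z_crit: float = 1.96) -> float:
    """Two-sided Wald p-value implied by a ratio estimate and its 95% CI."""
    # On the log scale the CI is log(estimate) +/- z_crit * SE,
    # so the interval width gives the standard error.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    # Two-sided normal tail: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Rounded per-allele breast cancer HRs quoted in the abstract.
p_brca1 = wald_p_from_ci(1.02, 0.93, 1.12)
p_brca2 = wald_p_from_ci(0.92, 0.82, 1.04)
```

    Because the inputs are rounded to two decimals, the results can only approximate the reported P values of 0.66 and 0.17.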

  16. Sequence variants in COL4A1 and COL4A2 genes in Ecuadorian families with keratoconus

    PubMed Central

    Karolak, Justyna A.; Kulinska, Karolina; Nowak, Dorota M.; Pitarque, Jose A.; Molinari, Andrea; Rydzanicz, Malgorzata; Bejjani, Bassem A.

    2011-01-01

    Purpose: Keratoconus (KTCN) is a non-inflammatory, usually bilateral disorder of the eye which results in the conical shape and the progressive thinning of the cornea. Several studies have suggested that genetic factors play a role in the etiology of the disease. Several loci were previously described as possible candidate regions for familial KTCN; however, no causative mutations in any genes have been identified for any of these loci. The purpose of this study was to evaluate the role of the collagen genes collagen type IV, alpha-1 (COL4A1) and collagen type IV, alpha-2 (COL4A2) in KTCN in Ecuadorian families. Methods: COL4A1 and COL4A2 in 15 Ecuadorian KTCN families were examined with polymerase chain reaction amplification, and direct sequencing of all exons, promoter and intron-exon junctions was performed. Results: Screening of COL4A1 and COL4A2 revealed numerous alterations in coding and non-coding regions of both genes. We detected three missense substitutions in COL4A1: c.19G>C (Val7Leu), c.1663A>C (Thr555Pro), and c.4002A>C (Gln1334His). Five non-synonymous variants were identified in COL4A2: c.574G>T (Val192Phe), c.1550G>A (Arg517Lys), c.2048G>C (Gly683Ala), c.2102A>G (Lys701Arg), and c.2152C>T (Pro718Ser). None of the identified sequence variants completely segregated with the affected phenotype. The Gln1334His variant was possibly damaging to protein function and structure. Conclusions: This is the first mutation screening of COL4A1 and COL4A2 genes in families with KTCN and linkage to a locus close to these genes. Analysis of COL4A1 and COL4A2 revealed no mutations, indicating that other genes are involved in KTCN causation in Ecuadorian families. PMID:21527998
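
    The substitutions listed in this abstract pair a coding-DNA position with an amino acid change, and the two are tied together because codon k spans coding positions 3k-2 to 3k. As a rough sketch (the regular expression, class and function names are illustrative, and only simple single-base substitutions written in this abstract's style are handled), the notation can be parsed and checked for internal consistency:

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    cdna_pos: int
    ref_base: str
    alt_base: str
    ref_aa: str
    codon: int
    alt_aa: str


# Matches e.g. "c.19G>C (Val7Leu)" or "c.2262A>C (p.Gln754His)".
PATTERN = re.compile(
    r"c\.(\d+)([ACGT])>([ACGT])\s*\((?:p\.)?([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})\)"
)


def parse_substitutions(text: str) -> List[Substitution]:
    """Extract every simple coding substitution with its protein consequence."""
    return [
        Substitution(int(m.group(1)), m.group(2), m.group(3),
                     m.group(4), int(m.group(5)), m.group(6))
        for m in PATTERN.finditer(text)
    ]


def codon_matches(sub: Substitution) -> bool:
    """True when the cDNA position falls inside the stated codon."""
    return (sub.cdna_pos + 2) // 3 == sub.codon


listed = (
    "c.19G>C (Val7Leu), c.1663A>C (Thr555Pro), c.4002A>C (Gln1334His), "
    "c.574G>T (Val192Phe), c.1550G>A (Arg517Lys), c.2048G>C (Gly683Ala), "
    "c.2102A>G (Lys701Arg), c.2152C>T (Pro718Ser)"
)
variants = parse_substitutions(listed)
all_consistent = all(codon_matches(v) for v in variants)
```

    The check only confirms that position and codon number agree; confirming the amino acid change itself would need the reference transcript sequence, which is not part of this record.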

  17. International Interlaboratory Digital PCR Study Demonstrating High Reproducibility for the Measurement of a Rare Sequence Variant.

    PubMed

    Whale, Alexandra S; Devonshire, Alison S; Karlin-Neumann, George; Regan, Jack; Javier, Leanne; Cowen, Simon; Fernandez-Gonzalez, Ana; Jones, Gerwyn M; Redshaw, Nicholas; Beck, Julia; Berger, Andreas W; Combaret, Valérie; Dahl Kjersgaard, Nina; Davis, Lisa; Fina, Frederic; Forshew, Tim; Fredslund Andersen, Rikke; Galbiati, Silvia; González Hernández, Álvaro; Haynes, Charles A; Janku, Filip; Lacave, Roger; Lee, Justin; Mistry, Vilas; Pender, Alexandra; Pradines, Anne; Proudhon, Charlotte; Saal, Lao H; Stieglitz, Elliot; Ulrich, Bryan; Foy, Carole A; Parkes, Helen; Tzonev, Svilen; Huggett, Jim F

    2017-02-07

    This study tested the claim that digital PCR (dPCR) can offer highly reproducible quantitative measurements in disparate laboratories. Twenty-one laboratories measured four blinded samples containing different quantities of a KRAS fragment encoding G12D, an important genetic marker for guiding therapy of certain cancers. This marker is challenging to quantify reproducibly using quantitative PCR (qPCR) or next generation sequencing (NGS) due to the presence of competing wild type sequences and the need for calibration. Using dPCR, 18 laboratories were able to quantify the G12D marker within 12% of each other in all samples. Three laboratories appeared to measure consistently outlying results; however, proper application of a follow-up analysis recommendation rectified their data. Our findings show that dPCR has demonstrable reproducibility across a large number of laboratories without calibration. This could enable the reproducible application of molecular stratification to guide therapy and, potentially, for molecular diagnostics.
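
    Digital PCR can quantify without a calibration curve because the count of positive partitions is converted to copies with Poisson statistics: if a fraction p of partitions is positive, the mean copies per partition is -ln(1 - p). The sketch below is a generic illustration of that standard correction, not the protocol used in this study. The function names and partition counts are hypothetical, and mutant and wild-type targets are assumed to distribute independently.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per partition."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total (saturated runs are not quantifiable)")
    return -math.log(1.0 - positive / total)


def fractional_abundance(mutant_pos: int, wildtype_pos: int, total: int) -> float:
    """Mutant copies as a fraction of all (mutant + wild-type) copies."""
    mutant = copies_per_partition(mutant_pos, total)
    wildtype = copies_per_partition(wildtype_pos, total)
    return mutant / (mutant + wildtype)


# Hypothetical run: 20,000 partitions, 2,000 positive for the target.
lam = copies_per_partition(2_000, 20_000)
# Hypothetical rare-variant run: 40 mutant-positive, 8,000 wild-type-positive.
frac = fractional_abundance(40, 8_000, 20_000)
```

    Dividing the copies-per-partition value by the partition volume gives a concentration, which is why an accurate partition volume matters for absolute results.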

  18. Deep Sequencing Analysis of Aptazyme Variants Based on a Pistol Ribozyme.

    PubMed

    Kobori, Shungo; Takahashi, Kei; Yokobayashi, Yohei

    2017-04-14

    Chemically regulated self-cleaving ribozymes, or aptazymes, are emerging as a promising class of genetic devices that allow dynamic control of gene expression in synthetic biology. However, further expansion of the limited repertoire of ribozymes and aptamers, and development of new strategies to couple the RNA elements to engineer functional aptazymes are highly desirable for synthetic biology applications. Here, we report aptazymes based on the recently identified self-cleaving pistol ribozyme class using a guanine aptamer as the molecular sensing element. Two aptazyme architectures were studied by constructing and assaying 17 728 mutants by deep sequencing. Although one of the architectures did not yield functional aptazymes, a novel aptazyme design in which the aptamer and the ribozyme were placed in tandem yielded a number of guanine-inhibited ribozymes. Detailed analysis of the extensive sequence-function data suggests a mechanism that involves a competition between two mutually exclusive RNA structures reminiscent of natural bacterial riboswitches.

  19. Whole-genome re-sequencing for the identification of high contribution susceptibility gene variants in patients with type 2 diabetes

    PubMed Central

    SUN, XIAOJUAN; SUI, WEIGUO; WANG, XIAOBING; HOU, XIANLIANG; OU, MINGLIN; DAI, YONG; XIANG, YUEYING

    2016-01-01

    There is increasing evidence that several genes are associated with an increased risk of type 2 diabetes (T2D); genome-wide association investigations and whole-genome re-sequencing investigations offer a useful approach for the identification of genes involved in common human diseases. To further investigate which polymorphisms confer susceptibility to T2D, the present study screened for high-contribution susceptibility gene variants in Chinese patients with T2D using whole-genome re-sequencing with DNA pooling. In total, 100 Chinese individuals with T2D and 100 healthy Chinese individuals were analyzed by whole-genome re-sequencing with DNA pooling. To minimize the likelihood of systematic bias in sampling, paired-end libraries with an insert size of 500 bp were prepared for all samples, which were then subjected to whole-genome sequencing. Each library contained four lanes. The average sequencing depth was 35.70. In the present study, 1.36 GB of clean sequence data were generated, and the resulting calculated T2D genome consensus sequence covered 99.88% of the hg19 sequence. A total of 3,974,307 single nucleotide polymorphisms were identified, of which 99.88% were in the dbSNP database. The present study also found 642,189 insertions and deletions, 5,590 structural variants (SVs), 4,713 copy number variants (CNVs) and 13,049 single nucleotide variants. A total of 1,884 somatic CNVs and 74 somatic SVs were significantly different between the cases and controls. Therefore, the present study provided validation of whole-genome re-sequencing using the DNA pooling approach. It also generated a whole-genome re-sequencing genotype database for future investigations of T2D. PMID:27035118

  20. cDNA sequences of variant forms of human placenta diamine oxidase

    SciTech Connect

    Zhang, X.; Kim, J.; McIntire, S.

    1995-08-01

    Genes for two forms of human placenta diamine oxidase (dao) were cloned from a cDNA library and sequenced. One gene, pdao1, is identical in length to human kidney dao but differs from it by two bases in the coding region and differs slightly in the 3′- and 5′-noncoding regions. The second gene, pdao2, is nearly identical to these genes in the coding region, except that it has an extra 57-nucleotide coding segment near the 3′ end of this region. This segment corresponds to the contiguous sequence of the 3′ end of intron 3 of human kidney dao. pdao2 also differs significantly from pdao1 and human kidney dao in a 13-base sequence in the t′-noncoding region. It is proposed that pdao1 and human kidney dao are polymorphic forms of the same allele. Whether pdao2 is a polymorph of these two is not certain, because of the significant differences in the coding and noncoding regions. pdao2 may represent a different allele. 21 refs., 2 figs.

  1. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca²⁺- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B₄. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the ³²P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A₄ hydrolase or Ca²⁺-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  2. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  3. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper, Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  4. The amino acid sequence of rabbit cardiac troponin I.

    PubMed Central

    Grand, R J; Wilkinson, J M

    1976-01-01

    The complete amino acid sequence of troponin I from rabbit cardiac muscle was determined by the isolation of four unique CNBr fragments, together with overlapping tryptic peptides containing radioactive methionine residues. Overlap data for residues 35-36, 93-94 and 140-145 are incomplete, the sequence at these positions being based on homology with the sequence of the fast-skeletal-muscle protein. Cardiac troponin I is a single polypeptide chain of 206 residues with mol.wt. 23550 and an extinction coefficient, E(1%, 1 cm) at 280 nm, of 4.37. The protein has a net positive charge of 14 and is thus somewhat more basic than troponin I from fast-skeletal muscle. Comparison of the sequences of troponin I from cardiac and fast skeletal muscle shows that the cardiac protein has 26 extra residues at the N-terminus which account for the larger size of the protein. In the remainder of the sequence there is a considerable degree of homology, this being greater in the C-terminal two-thirds of the molecule. The region in the cardiac protein corresponding to the peptide with inhibitory activity from the fast-skeletal-muscle protein is very similar and it seems unlikely that this is the cause of the difference in inhibitory activity between the two proteins. The region responsible for binding troponin C, however, possesses a lower degree of homology. Detailed evidence on which the sequence is based has been deposited as Supplementary Publication SUP 50072 (20 pages), at the British Library Lending Division, Boston Spa, Wetherby, West Yorkshire LS23 7QB, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1976) 153, 5. PMID:1008822

  5. Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies

    PubMed Central

    Lee, Seunggeun; Emond, Mary J.; Bamshad, Michael J.; Barnes, Kathleen C.; Rieder, Mark J.; Nickerson, Deborah A.; Christiani, David C.; Wurfel, Mark M.; Lin, Xihong

    2012-01-01

    We propose in this paper a unified approach for testing the association between rare variants and phenotypes in sequencing association studies. This approach maximizes power by adaptively using the data to optimally combine the burden test and the nonburden sequence kernel association test (SKAT). Burden tests are more powerful when most variants in a region are causal and the effects are in the same direction, whereas SKAT is more powerful when a large fraction of the variants in a region are noncausal or the effects of causal variants are in different directions. The proposed unified test maintains the power in both scenarios. We show that the unified test corresponds to the optimal test in an extended family of SKAT tests, which we refer to as SKAT-O. The second goal of this paper is to develop a small-sample adjustment procedure for the proposed methods for the correction of conservative type I error rates of SKAT family tests when the trait of interest is dichotomous and the sample size is small. Both small-sample-adjusted SKAT and the optimal unified test (SKAT-O) are computationally efficient and can easily be applied to genome-wide sequencing association studies. We evaluate the finite sample performance of the proposed methods using extensive simulation studies and illustrate their application using the acute-lung-injury exome-sequencing data of the National Heart, Lung, and Blood Institute Exome Sequencing Project. PMID:22863193
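
    The contrast between the burden test and SKAT described in this abstract can be seen in the test statistics themselves. In the SKAT-O family the statistic is a weighted mix, Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden, where Q_burden squares the sum of weighted per-variant scores and Q_SKAT sums their squares. The sketch below is a simplified illustration: it assumes a null model with no covariates (residuals are phenotype minus its mean), uses unit weights, and omits the p-value calculation and the search over rho that the full method performs. The function names and toy data are hypothetical.

```python
from typing import List, Sequence


def variant_scores(genotypes: Sequence[Sequence[int]],
                   phenotype: Sequence[float]) -> List[float]:
    """Per-variant score S_j = sum_i g_ij * (y_i - mean(y))."""
    mean_y = sum(phenotype) / len(phenotype)
    residuals = [y - mean_y for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]


def q_rho(scores: Sequence[float], weights: Sequence[float], rho: float) -> float:
    """Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden for 0 <= rho <= 1."""
    q_burden = sum(w * s for w, s in zip(weights, scores)) ** 2
    q_skat = sum((w * s) ** 2 for w, s in zip(weights, scores))
    return (1.0 - rho) * q_skat + rho * q_burden


# Toy data: rows are individuals, columns are rare variants (minor-allele counts).
# Variant 0 is carried only by cases, variant 1 only by controls.
genotypes = [[1, 0], [1, 0], [0, 1], [0, 1]]
phenotype = [1.0, 1.0, 0.0, 0.0]
scores = variant_scores(genotypes, phenotype)
q_skat_only = q_rho(scores, [1.0, 1.0], rho=0.0)
q_burden_only = q_rho(scores, [1.0, 1.0], rho=1.0)
```

    In this toy case the two variants act in opposite directions, so their scores cancel in the burden statistic while the SKAT statistic stays positive, which is the scenario in which the abstract says SKAT has more power.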

  6. Association of low-frequency and rare coding-sequence variants with blood lipids and coronary heart disease in 56,000 whites and blacks.

    PubMed

    Peloso, Gina M; Auer, Paul L; Bis, Joshua C; Voorman, Arend; Morrison, Alanna C; Stitziel, Nathan O; Brody, Jennifer A; Khetarpal, Sumeet A; Crosby, Jacy R; Fornage, Myriam; Isaacs, Aaron; Jakobsdottir, Johanna; Feitosa, Mary F; Davies, Gail; Huffman, Jennifer E; Manichaikul, Ani; Davis, Brian; Lohman, Kurt; Joon, Aron Y; Smith, Albert V; Grove, Megan L; Zanoni, Paolo; Redon, Valeska; Demissie, Serkalem; Lawson, Kim; Peters, Ulrike; Carlson, Christopher; Jackson, Rebecca D; Ryckman, Kelli K; Mackey, Rachel H; Robinson, Jennifer G; Siscovick, David S; Schreiner, Pamela J; Mychaleckyj, Josyf C; Pankow, James S; Hofman, Albert; Uitterlinden, Andre G; Harris, Tamara B; Taylor, Kent D; Stafford, Jeanette M; Reynolds, Lindsay M; Marioni, Riccardo E; Dehghan, Abbas; Franco, Oscar H; Patel, Aniruddh P; Lu, Yingchang; Hindy, George; Gottesman, Omri; Bottinger, Erwin P; Melander, Olle; Orho-Melander, Marju; Loos, Ruth J F; Duga, Stefano; Merlini, Piera Angelica; Farrall, Martin; Goel, Anuj; Asselta, Rosanna; Girelli, Domenico; Martinelli, Nicola; Shah, Svati H; Kraus, William E; Li, Mingyao; Rader, Daniel J; Reilly, Muredach P; McPherson, Ruth; Watkins, Hugh; Ardissino, Diego; Zhang, Qunyuan; Wang, Judy; Tsai, Michael Y; Taylor, Herman A; Correa, Adolfo; Griswold, Michael E; Lange, Leslie A; Starr, John M; Rudan, Igor; Eiriksdottir, Gudny; Launer, Lenore J; Ordovas, Jose M; Levy, Daniel; Chen, Y-D Ida; Reiner, Alexander P; Hayward, Caroline; Polasek, Ozren; Deary, Ian J; Borecki, Ingrid B; Liu, Yongmei; Gudnason, Vilmundur; Wilson, James G; van Duijn, Cornelia M; Kooperberg, Charles; Rich, Stephen S; Psaty, Bruce M; Rotter, Jerome I; O'Donnell, Christopher J; Rice, Kenneth; Boerwinkle, Eric; Kathiresan, Sekar; Cupples, L Adrienne

    2014-02-06

    Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncertain whether the PCSK9 example represents a paradigm or an isolated exception. We used the "Exome Array" to genotype >200,000 low-frequency and rare coding sequence variants across the genome in 56,538 individuals (42,208 European ancestry [EA] and 14,330 African ancestry [AA]) and tested these variants for association with LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides. Although we did not identify new genes associated with LDL-C, we did identify four low-frequency (frequencies between 0.1% and 2%) variants (ANGPTL8 rs145464906 [c.361C>T; p.Gln121*], PAFAH1B2 rs186808413 [c.482C>T; p.Ser161Leu], COL18A1 rs114139997 [c.331G>A; p.Gly111Arg], and PCSK7 rs142953140 [c.1511G>A; p.Arg504His]) with large effects on HDL-C and/or triglycerides. None of these four variants was associated with risk for CHD, suggesting that examples of low-frequency coding variants with robust effects on both lipids and CHD will be limited.

  7. Identification of Functional Variants for Cleft Lip with or without Cleft Palate in or near PAX7, FGFR2, and NOG by Targeted Sequencing of GWAS Loci

    PubMed Central

    Leslie, Elizabeth J.; Taub, Margaret A.; Liu, Huan; Steinberg, Karyn Meltz; Koboldt, Daniel C.; Zhang, Qunyuan; Carlson, Jenna C.; Hetmanski, Jacqueline B.; Wang, Hang; Larson, David E.; Fulton, Robert S.; Kousa, Youssef A.; Fakhouri, Walid D.; Naji, Ali; Ruczinski, Ingo; Begum, Ferdouse; Parker, Margaret M.; Busch, Tamara; Standley, Jennifer; Rigdon, Jennifer; Hecht, Jacqueline T.; Scott, Alan F.; Wehby, George L.; Christensen, Kaare; Czeizel, Andrew E.; Deleyiannis, Frederic W.-B.; Schutte, Brian C.; Wilson, Richard K.; Cornell, Robert A.; Lidral, Andrew C.; Weinstock, George M.; Beaty, Terri H.; Marazita, Mary L.; Murray, Jeffrey C.

    2015-01-01

    Although genome-wide association studies (GWASs) for nonsyndromic orofacial clefts have identified multiple strongly associated regions, the causal variants are unknown. To address this, we selected 13 regions from GWASs and other studies, performed targeted sequencing in 1,409 Asian and European trios, and carried out a series of statistical and functional analyses. Within a cluster of strongly associated common variants near NOG, we found that one, rs227727, disrupts enhancer activity. We furthermore identified significant clusters of non-coding rare variants near NTN1 and NOG and found several rare coding variants likely to affect protein function, including four nonsense variants in ARHGAP29. We confirmed 48 de novo mutations and, based on best biological evidence available, chose two of these for functional assays. One mutation in PAX7 disrupted the DNA binding of the encoded transcription factor in an in vitro assay. The second, a non-coding mutation, disrupted the activity of a neural crest enhancer downstream of FGFR2 both in vitro and in vivo. This targeted sequencing study provides strong functional evidence implicating several specific variants as primary contributory risk alleles for nonsyndromic clefting in humans. PMID:25704602

  8. Translational Repression of a Splice Variant of Cynomolgus Macaque CXCL1L by Its C-Terminal Sequence.

    PubMed

    Nomiyama, Hisayuki; Osada, Naoki; Takahashi, Ichiro; Terao, Keiji; Yamagata, Kazuya; Yoshie, Osamu

    2017-03-01

    We previously isolated a cDNA clone from cynomolgus macaque encoding a novel CXC chemokine that we termed CXCL1L from its close similarity to CXCL1. However, the cDNA consisted of 3 exons instead of 4 exons that were typically seen in other CXC chemokines. Here, we isolated a cDNA encoding the full-length variant of CXCL1L that we termed CXCL1Lβ. CXCL1Lβ is 50 amino acids longer than the original CXCL1L, which we now term CXCL1Lα. The CXCL1Lβ mRNA is much more abundantly expressed in the cynomolgus macaque tissues than CXCL1Lα mRNA. However, CXCL1Lβ protein was poorly produced by transfected cells compared with that of CXCL1Lα. When the coding region of the fourth exon was fused to the C-terminus of CXCL1 or even to a nonsecretory protein firefly luciferase, the fused proteins were also barely produced, although the mRNAs were abundantly expressed. The polysome profiling analysis suggested that the inhibition was mainly at the translational level. Furthermore, we demonstrated that the C-terminal 5 amino acids of CXCL1Lβ were critical for the translational repression. The present study, thus, reveals a unique translational regulation controlling the production of a splicing variant of CXCL1L. Since the CXCL1L gene is functional only in the Old World monkeys, we also discuss possible reasons for the conservation of the active CXCL1L gene in these monkeys during the primate evolution.

  9. Amino acid sequence of a mouse immunoglobulin mu chain.

    PubMed Central

    Kehry, M; Sibley, C; Fuhrman, J; Schilling, J; Hood, L E

    1979-01-01

    The complete amino acid sequence of the mouse mu chain from the BALB/c myeloma tumor MOPC 104E is reported. The C mu region contains four consecutive homology regions of approximately 110 residues and a COOH-terminal region of 19 residues. A comparison of this mu chain from mouse with a complete mu sequence from human (Ou) and a partial mu chain sequence from dog (Moo) reveals a striking gradient of increasing homology from the NH2-terminal to the COOH-terminal portion of these mu chains, with the former being the least and the latter the most highly conserved. Four of the five sites of carbohydrate attachment appear to be at identical residue positions when the constant regions of the mouse and human mu chains are compared. The mu chain of MOPC 104E has a carbohydrate moiety attached in the second hypervariable region. This is particularly interesting in view of the fact that MOPC 104E binds alpha-(1→3)-dextran, a simple carbohydrate. The structural and functional constraints imposed by these comparative sequence analyses are discussed. PMID:111247

  10. Deep sequencing of RYR3 gene identifies rare and common variants associated with increased carotid intima-media thickness (cIMT) in HIV-infected individuals.

    PubMed

    Zhi, Degui; Shendre, Aditi; Scherzer, Rebecca; Irvin, Marguerite R; Perry, Rodney T; Levy, Shawn; Arnett, Donna K; Grunfeld, Carl; Shrestha, Sadeep

    2015-02-01

    Carotid intima-media thickness (cIMT) is a subclinical measure of atherosclerosis with mounting evidence that higher cIMT confers an increased risk of cardiovascular disease. The ryanodine receptor 3 gene (RYR3) has previously been linked to increased cIMT; however, the causal variants have not yet been localized. Therefore, we sequenced 339,480 bp encompassing 104 exons and 2 kb flanking region of the RYR3 gene in 96 HIV-positive white men from the extremes of the distribution of common cIMT from the Fat Redistribution and Metabolic Changes in HIV infection study (FRAM). We identified 2710 confirmed variants (2414 single-nucleotide polymorphisms (SNPs) and 296 insertion/deletions (indels)), with a mean count of 736 SNPs (ranging from 528 to 1032) and 170 indels (ranging from 128 to 214) distributed in each individual. There were 39 variants in the exons and 15 of these were non-synonymous, of which only 4 were common variants and the remaining 11 were rare variants; one was a novel SNP. We confirmed that the common variant rs2229116 was significantly associated with cIMT in this design (P<7.9 × 10^-9), and observed seven other significantly associated SNPs (P<10^-8). These variants, including the private non-synonymous SNPs, need to be followed up in a larger sample size and also tested with clinical atherosclerotic outcomes.

  11. Whole-exome sequencing identifies rare, functional CFH variants in families with macular degeneration.

    PubMed

    Yu, Yi; Triebwasser, Michael P; Wong, Edwin K S; Schramm, Elizabeth C; Thomas, Brett; Reynolds, Robyn; Mardis, Elaine R; Atkinson, John P; Daly, Mark; Raychaudhuri, Soumya; Kavanagh, David; Seddon, Johanna M

    2014-10-01

    We sequenced the whole exome of 35 cases and 7 controls from 9 age-related macular degeneration (AMD) families in whom known common genetic risk alleles could not explain their high disease burden and/or their early-onset advanced disease. Two families harbored novel rare mutations in CFH (R53C and D90G). R53C segregates perfectly with AMD in 11 cases (heterozygous) and 1 elderly control (reference allele) (LOD = 5.07, P = 6.7 × 10^-7). In an independent cohort, 4 out of 1676 cases but none of the 745 examined controls or 4300 NHLBI Exome Sequencing Project (ESP) samples carried the R53C mutation (P = 0.0039). In another family of six siblings, D90G similarly segregated with AMD in five cases and one control (LOD = 1.22, P = 0.009). No other sample in our large cohort or the ESP had this mutation. Functional studies demonstrated that R53C decreased the ability of FH to perform decay accelerating activity. D90G exhibited a decrease in cofactor-mediated inactivation. Both of these changes would lead to a loss of regulatory activity, resulting in excessive alternative pathway activation. This study represents an initial application of the whole-exome strategy to families with early-onset AMD. It successfully identified high impact alleles leading to clearer functional insight into AMD etiopathogenesis.
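
    The carrier counts in the independent cohort (4 of 1676 cases, none of the controls or ESP samples) lend themselves to an exact test on a 2x2 table. The abstract does not say which test produced P = 0.0039, so the following is only a sketch of one plausible approach: a one-sided Fisher exact test that assumes the 745 controls and 4300 ESP samples are pooled into a single comparison group. The function name is hypothetical.

```python
from math import comb


def fisher_one_sided(case_carriers: int, n_cases: int,
                     control_carriers: int, n_controls: int) -> float:
    """P(at least `case_carriers` carriers fall among cases), margins fixed."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    # Hypergeometric tail: sum over tables at least as extreme as the observed one.
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, carriers)


# Counts from the abstract; pooling controls with ESP samples is an assumption.
p_r53c = fisher_one_sided(4, 1676, 0, 745 + 4300)
```

    Under these assumptions the tail probability comes out on the order of 0.004, the same magnitude as the reported value, though the authors' exact procedure may differ.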

  12. Novel mutation and three other sequence variants segregating with phenotype at keratoconus 13q32 susceptibility locus

    PubMed Central

    Czugala, Marta; Karolak, Justyna A; Nowak, Dorota M; Polakowski, Piotr; Pitarque, Jose; Molinari, Andrea; Rydzanicz, Malgorzata; Bejjani, Bassem A; Yue, Beatrice Y J T; Szaflik, Jacek P; Gajecka, Marzena

    2012-01-01

    Keratoconus (KTCN), a non-inflammatory corneal disorder characterized by stromal thinning, represents a major cause of corneal transplantations. Genetic and environmental factors have a role in the etiology of this complex disease. Previously reported linkage analysis revealed that chromosomal region 13q32 is likely to contain causative gene(s) for familial KTCN. Consequently, we have chosen eight positional candidate genes in this region: MBNL1, IPO5, FARP1, RNF113B, STK24, DOCK9, ZIC5 and ZIC2, and sequenced all of them in 51 individuals from Ecuadorian KTCN families and 105 matching controls. The mutation screening identified one mutation and three sequence variants showing 100% segregation under a dominant model with KTCN phenotype in one large Ecuadorian family. These substitutions were found in three different genes: c.2262A>C (p.Gln754His) and c.720+43A>G in DOCK9; c.2377-132A>C in IPO5 and c.1053+29G>C in STK24. PolyPhen analyses predicted that c.2262A>C (Gln754His) is possibly damaging for the protein function and structure. Our results suggest that c.2262A>C (p.Gln754His) mutation in DOCK9 may contribute to the KTCN phenotype in the large KTCN-014 family. PMID:22045297

  13. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  14. Variant Ionotropic Receptors in the Malaria Vector Mosquito Anopheles gambiae Tuned to Amines and Carboxylic Acids

    PubMed Central

    Pitts, R. Jason; Derryberry, Stephen L.; Zhang, Zhiwei; Zwiebel, Laurence J.

    2017-01-01

    The principal Afrotropical human malaria vector mosquito, Anopheles gambiae, remains a significant threat to global health. A critical component in the transmission of malaria is the ability of An. gambiae females to detect and respond to human-derived chemical kairomones in their search for blood meal hosts. The basis for host odor responses resides in olfactory receptor neurons (ORNs) that express chemoreceptors encoded by large gene families, including the odorant receptors (ORs) and the variant ionotropic receptors (IRs). While ORs have been the focus of extensive investigation, functional IR complexes and the chemical compounds that activate them have not been identified in An. gambiae. Here we report the transcriptional profiles and functional characterization of three An. gambiae IR (AgIr) complexes that specifically respond to amines or carboxylic acids - two classes of semiochemicals that have been implicated in mediating host-seeking by adult females but are not known to activate An. gambiae ORs (AgOrs). Our results suggest that AgIrs play critical roles in the detection and behavioral responses to important classes of host odors that are underrepresented in the AgOr chemical space. PMID:28067294

  15. Standards and Guidelines for the Interpretation of Sequence Variants: A Joint Consensus Recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology

    PubMed Central

    Richards, Sue; Aziz, Nazneen; Bale, Sherri; Bick, David; Das, Soma; Gastier-Foster, Julie; Grody, Wayne W.; Hegde, Madhuri; Lyon, Elaine; Spector, Elaine; Voelkerding, Karl; Rehm, Heidi L.

    2015-01-01

    The American College of Medical Genetics and Genomics (ACMG) previously developed guidance for the interpretation of sequence variants.1 In the past decade, sequencing technology has evolved rapidly with the advent of high-throughput next generation sequencing. By adopting and leveraging next generation sequencing, clinical laboratories are now performing an ever increasing catalogue of genetic testing spanning genotyping, single genes, gene panels, exomes, genomes, transcriptomes and epigenetic assays for genetic disorders. By virtue of increased complexity, this paradigm shift in genetic testing has been accompanied by new challenges in sequence interpretation. In this context, the ACMG convened a workgroup in 2013 comprised of representatives from the ACMG, the Association for Molecular Pathology (AMP) and the College of American Pathologists (CAP) to revisit and revise the standards and guidelines for the interpretation of sequence variants. The group consisted of clinical laboratory directors and clinicians. This report represents expert opinion of the workgroup with input from ACMG, AMP and CAP stakeholders. These recommendations primarily apply to the breadth of genetic tests used in clinical laboratories including genotyping, single genes, panels, exomes and genomes. This report recommends the use of specific standard terminology: ‘pathogenic’, ‘likely pathogenic’, ‘uncertain significance’, ‘likely benign’, and ‘benign’ to describe variants identified in Mendelian disorders. Moreover, this recommendation describes a process for classification of variants into these five categories based on criteria using typical types of variant evidence (e.g. population data, computational data, functional data, segregation data, etc.). Because of the increased complexity of analysis and interpretation of clinical genetic testing described in this report, the ACMG strongly recommends that clinical molecular genetic testing should be performed in a CLIA

  16. Fatty acid translocase gene CD36 rs1527483 variant influences oral fat perception in Malaysian subjects.

    PubMed

    Ong, Hing-Huat; Tan, Yen-Nee; Say, Yee-How

    2017-01-01

    We determined whether single nucleotide polymorphisms (SNPs; rs1761667 and rs1527483) in the fatty acid translocase CD36 gene - a receptor for fatty acids - are associated with oral fat perception (OFP) of different fat contents in custards and commercially-available foods, and obesity measures in Malaysian subjects (n=313; 118 males, 293 ethnic Chinese; 20 ethnic Indians). A 170-mm visual analogue scale was used to assess the ratings of perceived fat content, oiliness and creaminess of 0%, 2%, 6% and 10% fat content-by-weight custards and low-fat/regular versions of commercially-available milk, mayonnaise and cream crackers. Overall, the subjects managed to significantly discriminate the fat content, oiliness and creaminess between low-fat/regular versions of milk and mayonnaise. Females rated the perception of fat content and oiliness of both milks higher, but ethnicity, obesity and adiposity status did not seem to play a role in influencing most of OFP. The overall minor allele frequencies for rs1761667 and rs1527483 were 0.30 and 0.26, respectively. Females and individuals with rs1527483 TT genotype significantly perceived greater creaminess of 10% fat-by-weight custard. Also, individuals with rs1527483 TT genotype and T allele significantly perceived greater fat content of cream crackers, independent of fat concentration. rs1761667 SNP did not significantly affect OFP, except for cream crackers. Both gene variants were also not associated with obesity measures. Taken together, this study supports the notion that CD36, specifically rs1527483, plays a role in OFP, but not in influencing obesity in Malaysian subjects. Besides, gender is an important factor for OFP, where females had higher sensitivity.

  17. Next-Generation Sequencing and In Vitro Expression Study of ADAMTS13 Single Nucleotide Variants in Deep Vein Thrombosis

    PubMed Central

    Pagliari, Maria Teresa; Lotta, Luca A.; de Haan, Hugoline G.; Valsecchi, Carla; Casoli, Gloria; Pontiggia, Silvia; Martinelli, Ida; Passamonti, Serena M.; Rosendaal, Frits R.

    2016-01-01

    Background: Deep vein thrombosis (DVT) genetic predisposition is partially known. Objectives: This study aimed at assessing the functional impact of nine ADAMTS13 single nucleotide variants (SNVs) previously reported to be associated as a group with DVT in a burden test and the individual association of selected variants with DVT risk in two replication studies. Methods: Wild-type and mutant recombinant ADAMTS13 were transiently expressed in HEK293 cells. Antigen and activity of recombinant ADAMTS13 were measured by ELISA and FRETS-VWF73 assays, respectively. The replication studies were performed in an Italian case-control study (Milan study; 298/298 patients/controls) using a next-generation sequencing approach and in a Dutch case-control study (MEGA study; 4306/4887 patients/controls) by TaqMan assays. Results: In vitro results showed reduced ADAMTS13 activity for three SNVs (p.Val154Ile [15%; 95% confidence interval [CI] 14–16], p.Asp187His [19%; 95%[CI] 17–21], p.Arg421Cys [24%; 95%[CI] 22–26]) similar to reduced plasma ADAMTS13 levels of patients carrying these SNVs. Therefore these three SNVs were interrogated for risk association. The first replication study identified 3 heterozygous carriers (2 cases, 1 control) of p.Arg421Cys (odds ratio [OR] 2, 95%[CI] 0.18–22.25). The second replication study identified 2 heterozygous carriers (1 case, 1 control) of p.Asp187His ([OR] 1.14, 95%[CI] 0.07–18.15) and 10 heterozygous carriers (4 cases, 6 controls) of p.Arg421Cys ([OR] 0.76, 95%[CI] 0.21–2.68). Conclusions: Three SNVs (p.Val154Ile, p.Asp187His and p.Arg421Cys) showed reduced ex vivo and in vitro ADAMTS13 levels. However, the low frequency of these variants makes it difficult to confirm their association with DVT. PMID:27802307
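
    The odds ratios and confidence intervals quoted above can be approximated from the carrier counts and cohort sizes given in the abstract. The sketch below assumes the standard log-odds (Woolf) interval for a 2 × 2 table; the abstract does not name the method, so this is an illustration, not the authors' documented procedure. The function name `odds_ratio_ci` is a hypothetical helper.

```python
import math


def odds_ratio_ci(case_carriers, n_cases, control_carriers, n_controls,
                  z=1.959964):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    The 2 x 2 table is built from carrier counts and group sizes;
    z = 1.959964 corresponds to a two-sided 95% interval.
    Requires all four cells to be non-zero.
    """
    a = case_carriers                  # carriers among cases
    b = n_cases - case_carriers        # non-carriers among cases
    c = control_carriers               # carriers among controls
    d = n_controls - control_carriers  # non-carriers among controls

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # (label, case carriers, cases, control carriers, controls),
    # all taken from the abstract above.
    tables = [
        ("Milan p.Arg421Cys", 2, 298, 1, 298),
        ("MEGA p.Asp187His", 1, 4306, 1, 4887),
        ("MEGA p.Arg421Cys", 4, 4306, 6, 4887),
    ]
    for label, a, n1, c, n0 in tables:
        or_, lo, hi = odds_ratio_ci(a, n1, c, n0)
        print(f"{label}: OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

    With so few carriers the standard error of the log odds ratio is dominated by the 1/a and 1/c terms, which is why the intervals are wide and why the abstract concludes that the association is difficult to confirm.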

  18. Full genome sequence analysis of a wild, non-MLV-related type 2 Hungarian PRRSV variant isolated in Europe.

    PubMed

    Balka, Gyula; Wang, Xiong; Olasz, Ferenc; Bálint, Ádám; Kiss, István; Bányai, Krisztián; Rusvai, Miklós; Stadejek, Tomasz; Marthaler, Douglas; Murtaugh, Michael P; Zádori, Zoltán

    2015-03-16

    Porcine reproductive and respiratory syndrome virus (PRRSV) is a widespread pathogen of pigs causing significant economic losses to the swine industry. The expanding diversity of PRRSV strains makes the diagnosis, control and eradication of the disease more and more difficult. In the present study, the authors report the full genome sequencing of a type 2 PRRSV strain isolated from piglet carcasses in Hungary. Next generation sequencing was used to determine the complete genome sequence of the isolate (PRRSV-2/Hungary/102/2012). Recombination analysis performed with the available full-length genome sequences showed no evidence of any such event with other known PRRSV strains. Unique deletions and an insertion were found in the nsp2 region of PRRSV-2/Hungary/102/2012 when it was compared to the highly virulent VR2332 and JXA-1 prototype strains. The majority of amino acid alterations in GP4 and GP5 of the virus were in the known antigenic regions, suggesting an important role for immunological pressure in PRRSV-2/Hungary/102/2012 evolution. Phylogenetic analysis revealed that it belongs to lineage 1 or 2 of type 2 PRRSV. Considering the lack of related PRRSV in Europe, except for a partial sequence from Slovakia, the ancestor of PRRSV-2/Hungary/102/2012 was most probably transported from North America. It is the first documented type 2 PRRSV isolated in Europe that is not related to the Ingelvac MLV.

  19. Association of low-frequency and rare coding-sequence variants with blood lipids and Coronary Heart Disease in 56,000 whites and blacks

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncerta...

  20. Genome wide association study of uric acid in Indian population and interaction of identified variants with Type 2 diabetes

    PubMed Central

    Giri, Anil K; Banerjee, Priyanka; Chakraborty, Shraddha; Kauser, Yasmeen; Undru, Aditya; Roy, Suki; Parekatt, Vaisak; Ghosh, Saurabh; Tandon, Nikhil; Bharadwaj, Dwaipayan

    2016-01-01

    An abnormal level of Serum Uric Acid (SUA) is an important marker and risk factor for complex diseases including Type 2 Diabetes. Since the genetic determinants of uric acid in Indians are totally unexplored, we tried to identify common variants associated with SUA in Indians using a Genome Wide Association Study (GWAS). Association of five known variants in SLC2A9 and SLC22A11 genes with SUA level in 4,834 normoglycemics (1,109 in discovery and 3,725 in validation phase) was revealed with different effect sizes in Indians compared to other major ethnic populations of the world. Combined analysis of 1,077 T2DM subjects (772 in discovery and 305 in validation phase) and normoglycemics revealed an additional GWAS signal in the ABCG2 gene. Differences in effect sizes of ABCG2 and SLC2A9 gene variants were observed between normoglycemics and T2DM patients. We identified two novel variants near long non-coding RNA genes AL356739.1 and AC064865.1 with nearly genome wide significance level. Meta-analysis and in silico replication in 11,745 individuals from AUSTWIN consortium improved association for rs12206002 in AL356739.1 gene to sub-genome wide association level. Our results extend the association of SLC2A9, SLC22A11 and ABCG2 genes with SUA level in Indians and enrich the body of evidence for the interrelationship between SUA level and T2DM. PMID:26902266

  1. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  2. Structural and functional interaction of fatty acids with human liver fatty acid-binding protein (L-FABP) T94A variant.

    PubMed

    Huang, Huan; McIntosh, Avery L; Martin, Gregory G; Landrock, Kerstin K; Landrock, Danilo; Gupta, Shipra; Atshaves, Barbara P; Kier, Ann B; Schroeder, Friedhelm

    2014-05-01

    The human liver fatty acid-binding protein (L-FABP) T94A variant, the most common in the FABP family, has been associated with elevated liver triglyceride levels. How this amino acid substitution elicits these effects is not known. This issue was addressed using human recombinant wild-type (WT) and T94A variant L-FABP proteins as well as cultured primary human hepatocytes expressing the respective proteins (genotyped as TT, TC and CC). The T94A substitution did not alter or only slightly altered L-FABP binding affinities for saturated, monounsaturated or polyunsaturated long chain fatty acids, nor did it change the affinity for intermediates of triglyceride synthesis. Nevertheless, the T94A substitution markedly altered the secondary structural response of L-FABP induced by binding long chain fatty acids or intermediates of triglyceride synthesis. Finally, the T94A substitution markedly decreased the levels of induction of peroxisome proliferator-activated receptor α-regulated proteins such as L-FABP, fatty acid transport protein 5 and peroxisome proliferator-activated receptor α itself mediated by the polyunsaturated fatty acids eicosapentaenoic acid and docosahexaenoic acid in cultured primary human hepatocytes. Thus, although the T94A substitution did not alter the affinity of human L-FABP for long chain fatty acids, it significantly altered human L-FABP structure and stability, as well as the conformational and functional response to these ligands.

  3. Novel variants in MLL confer to bladder cancer recurrence identified by whole-exome sequencing

    PubMed Central

    Wang, Yongqiang; Huang, Yi; Liu, Huan; Li, Feida; He, Luyun; Sun, Da; Yu, Yuan; Li, Qiaoling; Huang, Peide; Zhang, Meng; Zhao, Xin; Bi, Tengteng; Zhuang, Xuehan; Zhang, Liyan; Lu, Jingxiao; Sun, Xiaojuan; Zhou, Fangjian; Liu, Chunxiao; Yang, Guosheng; Hou, Yong; Fan, Zusen; Cai, Zhiming

    2016-01-01

    Bladder cancer (BC) is distinguished by a high rate of recurrence after surgery, but the underlying mechanisms remain poorly understood. Here we performed whole-exome sequencing of 37 BC individuals including 20 primary and 17 recurrent samples in which the primary and recurrent samples were not from the same patient. We found that MLL, EP400, PRDM2, ANK3 and CHD5 were exclusively altered in recurrent BCs. Specifically, the recurrent BCs and bladder cancer cells with MLL mutation displayed increased histone H3 tri-methyl K4 (H3K4me3) modification at the tissue and cell levels and showed enhanced expression of GATA4 and ETS1 downstream. Moreover, MLL-mutated bladder cancer cells obtained with CRISPR/Cas9 showed greater resistance to epirubicin (a chemotherapy drug for bladder cancer) than wild-type cells. Additionally, the BC patients with high expression of GATA4 and ETS1 displayed significantly shorter lifespan than patients with low expression. Our study provided an overview of the genetic basis of recrudescent bladder cancer and discovered that genetic alterations of MLL were involved in BC relapse. The increased modification of H3K4me3 and expression of GATA4 and ETS1 would be promising targets for the diagnosis and therapy of relapsed bladder cancer. PMID:26625313

  4. A novel strategy for clustering major depression individuals using whole-genome sequencing variant data

    PubMed Central

    Yu, Chenglong; Baune, Bernhard T.; Licinio, Julio; Wong, Ma-Li

    2017-01-01

    Major depressive disorder (MDD) is highly prevalent, resulting in an exceedingly high disease burden. The identification of generic risk factors could lead to advance prevention and therapeutics. Current approaches examine genotyping data to identify specific variations between cases and controls. Compared to genotyping, whole-genome sequencing (WGS) allows for the detection of private mutations. In this proof-of-concept study, we establish a conceptually novel computational approach that clusters subjects based on the entirety of their WGS. Those clusters predicted MDD diagnosis. This strategy yielded encouraging results, showing that depressed Mexican-American participants were grouped closer; in contrast ethnically-matched controls grouped away from MDD patients. This implies that within the same ancestry, the WGS data of an individual can be used to check whether this individual is within or closer to MDD subjects or to controls. We propose a novel strategy to apply WGS data to clinical medicine by facilitating diagnosis through genetic clustering. Further studies utilising our method should examine larger WGS datasets on other ethnical groups. PMID:28287625

  5. Novel variants in MLL confer to bladder cancer recurrence identified by whole-exome sequencing.

    PubMed

    Wu, Song; Yang, Zhao; Ye, Rui; An, Dan; Li, Chong; Wang, Yitian; Wang, Yongqiang; Huang, Yi; Liu, Huan; Li, Feida; He, Luyun; Sun, Da; Yu, Yuan; Li, Qiaoling; Huang, Peide; Zhang, Meng; Zhao, Xin; Bi, Tengteng; Zhuang, Xuehan; Zhang, Liyan; Lu, Jingxiao; Sun, Xiaojuan; Zhou, Fangjian; Liu, Chunxiao; Yang, Guosheng; Hou, Yong; Fan, Zusen; Cai, Zhiming

    2016-01-19

    Bladder cancer (BC) is distinguished by a high rate of recurrence after surgery, but the underlying mechanisms remain poorly understood. Here we performed whole-exome sequencing of 37 BC individuals including 20 primary and 17 recurrent samples in which the primary and recurrent samples were not from the same patient. We found that MLL, EP400, PRDM2, ANK3 and CHD5 were exclusively altered in recurrent BCs. Specifically, the recurrent BCs and bladder cancer cells with MLL mutation displayed increased histone H3 tri-methyl K4 (H3K4me3) modification at the tissue and cell levels and showed enhanced expression of GATA4 and ETS1 downstream. Moreover, MLL-mutated bladder cancer cells obtained with CRISPR/Cas9 showed greater resistance to epirubicin (a chemotherapy drug for bladder cancer) than wild-type cells. Additionally, the BC patients with high expression of GATA4 and ETS1 displayed significantly shorter lifespan than patients with low expression. Our study provided an overview of the genetic basis of recrudescent bladder cancer and discovered that genetic alterations of MLL were involved in BC relapse. The increased modification of H3K4me3 and expression of GATA4 and ETS1 would be promising targets for the diagnosis and therapy of relapsed bladder cancer.

  6. Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.

    PubMed Central

    Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

    1992-01-01

    The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

  7. Sequence Variants and Haplotype Analysis of Cat ERBB2 Gene: A Survey on Spontaneous Cat Mammary Neoplastic and Non-Neoplastic Lesions

    PubMed Central

    Santos, Sara; Bastos, Estela; Baptista, Cláudia S.; Sá, Daniela; Caloustian, Christophe; Guedes-Pinto, Henrique; Gärtner, Fátima; Gut, Ivo G.; Chaves, Raquel

    2012-01-01

    The human ERBB2 proto-oncogene is widely considered a key gene involved in human breast cancer onset and progression. Among spontaneous tumors, mammary tumors are the most frequent cause of cancer death in cats and second most frequent in humans. In fact, naturally occurring tumors in domestic animals, more particularly cat mammary tumors, have been proposed as a good model for human breast cancer, but critical genetic and molecular information is still scarce. The aims of this study include the analysis of the cat ERBB2 gene partial sequences (between exon 17 and 20) in order to characterize a normal and a mammary lesion heterogeneous populations. Cat genomic DNA was extracted from normal frozen samples (n = 16) and from frozen and formalin-fixed paraffin-embedded mammary lesion samples (n = 41). We amplified and sequenced two cat ERBB2 DNA fragments comprising exons 17 to 20. It was possible to identify five sequence variants and six haplotypes in the total population. Two sequence variants and two haplotypes were shown to be specific for cat mammary tumor samples. Bioinformatics analysis predicts that four of the sequence variants can produce alternative transcripts or activate cryptic splicing sites. Also, a possible association was identified between clinicopathological traits and the variant haplotypes. As far as we know, this is the first attempt to examine ERBB2 genetic variations in cat mammary genome and its possible association with the onset and progression of cat mammary tumors. The demonstration of a possible association between primary tumor size (one of the two most important prognostic factors) and the number of masses with the cat ERBB2 variant haplotypes reveals the importance of the analysis of this gene in veterinary medicine. PMID:22489125

  8. Sequence variants and haplotype analysis of cat ERBB2 gene: a survey on spontaneous cat mammary neoplastic and non-neoplastic lesions.

    PubMed

    Santos, Sara; Bastos, Estela; Baptista, Cláudia S; Sá, Daniela; Caloustian, Christophe; Guedes-Pinto, Henrique; Gärtner, Fátima; Gut, Ivo G; Chaves, Raquel

    2012-01-01

    The human ERBB2 proto-oncogene is widely considered a key gene involved in human breast cancer onset and progression. Among spontaneous tumors, mammary tumors are the most frequent cause of cancer death in cats and second most frequent in humans. In fact, naturally occurring tumors in domestic animals, more particularly cat mammary tumors, have been proposed as a good model for human breast cancer, but critical genetic and molecular information is still scarce. The aims of this study include the analysis of the cat ERBB2 gene partial sequences (between exon 17 and 20) in order to characterize a normal and a mammary lesion heterogeneous populations. Cat genomic DNA was extracted from normal frozen samples (n = 16) and from frozen and formalin-fixed paraffin-embedded mammary lesion samples (n = 41). We amplified and sequenced two cat ERBB2 DNA fragments comprising exons 17 to 20. It was possible to identify five sequence variants and six haplotypes in the total population. Two sequence variants and two haplotypes were shown to be specific for cat mammary tumor samples. Bioinformatics analysis predicts that four of the sequence variants can produce alternative transcripts or activate cryptic splicing sites. Also, a possible association was identified between clinicopathological traits and the variant haplotypes. As far as we know, this is the first attempt to examine ERBB2 genetic variations in cat mammary genome and its possible association with the onset and progression of cat mammary tumors. The demonstration of a possible association between primary tumor size (one of the two most important prognostic factors) and the number of masses with the cat ERBB2 variant haplotypes reveals the importance of the analysis of this gene in veterinary medicine.

  9. Rare DNA variants in the brain-derived neurotrophic factor gene increase risk for attention-deficit hyperactivity disorder: a next-generation sequencing study.

    PubMed

    Hawi, Z; Cummins, T D R; Tong, J; Arcos-Burgos, M; Zhao, Q; Matthews, N; Newman, D P; Johnson, B; Vance, A; Heussler, H S; Levy, F; Easteal, S; Wray, N R; Kenny, E; Morris, D; Kent, L; Gill, M; Bellgrove, M A

    2017-04-01

    Attention-deficit hyperactivity disorder (ADHD) is a prevalent and highly heritable disorder of childhood with negative lifetime outcomes. Although candidate gene and genome-wide association studies have identified promising common variant signals, these explain only a fraction of the heritability of ADHD. The observation that rare structural variants confer substantial risk to psychiatric disorders suggests that rare variants might explain a portion of the missing heritability for ADHD. Here we believe we performed the first large-scale next-generation targeted sequencing study of ADHD in 152 child and adolescent cases and 188 controls across an a priori set of 117 genes. A multi-marker gene-level analysis of rare (<1% frequency) single-nucleotide variants (SNVs) revealed that the gene encoding brain-derived neurotrophic factor (BDNF) was associated with ADHD at Bonferroni corrected levels. Sanger sequencing confirmed the existence of all novel rare BDNF variants. Our results implicate BDNF as a genetic risk factor for ADHD, potentially by virtue of its critical role in neurodevelopment and synaptic plasticity.

  10. Gonococcal pilin variants in experimental gonorrhea.

    PubMed

    Swanson, J; Robbins, K; Barrera, O; Corwin, D; Boslego, J; Ciak, J; Blake, M; Koomey, J M

    1987-05-01

    When pilus+ Gc were introduced into a male subject's urethra, they gave rise to pilus+ variants whose pilin mRNAs differed from that of input Gc. The differences stemmed from the Gc genome's single complete pilin gene having undergone gene conversion by different partial pilin genes' sequences and by different length stretches of a single partial pilin gene. In some instances, the variant's pilin mRNA appeared to reflect two independent gene-conversion events that used sequences from two different partial pilin genes. The resulting variants' pilins exhibited antigenic differences compared with the pilin polypeptide of input Gc; these differences were discernible by immunoblotting with mAbs. Amino acid and antigenic changes occurred in a segment of the variants' pilin polypeptides that previously was thought to be conserved or constant in sequence.

  11. Improved resolution of tryptic digest fragments from haemoglobin variants using phytic acid in free zone capillary electrophoresis.

    PubMed

    Okafo, G N; Perrett, D; Camilleri, P

    1994-01-01

    Capillary electrophoresis can be applied to the rapid characterization of tryptic digests of proteins. The addition of phytic acid to the separation buffer was found to improve resolution considerably when the technique was applied to differentiate between tryptic digests derived from variant haemoglobins. Moreover, analysis time was of the order of 15 min, which is considerably shorter than that obtained using gradient reversed-phase high-performance liquid chromatography or two-dimensional paper chromatography-electrophoresis.

  12. Identification of genetic risk variants for deep vein thrombosis by multiplexed next-generation sequencing of 186 hemostatic/pro-inflammatory genes

    PubMed Central

    2012-01-01

    Background: Next-generation DNA sequencing is opening new avenues for genetic association studies in common diseases that, like deep vein thrombosis (DVT), have a strong genetic predisposition still largely unexplained by currently identified risk variants. In order to develop sequencing and analytical pipelines for the application of next-generation sequencing to complex diseases, we conducted a pilot study sequencing the coding area of 186 hemostatic/proinflammatory genes in 10 Italian cases of idiopathic DVT and 12 healthy controls. Results: A molecular-barcoding strategy was used to multiplex DNA target capture and sequencing, while retaining individual sequence information. Genomic libraries with barcode sequence-tags were pooled (in pools of 8 or 16 samples) and enriched for target DNA sequences. Sequencing was performed on ABI SOLiD-4 platforms. We produced > 12 gigabases of raw sequence data to sequence at high coverage (average: 42X) the 700-kilobase target area in 22 individuals. A total of 1876 high-quality genetic variants were identified (1778 single nucleotide substitutions and 98 insertions/deletions). Annotation on databases of genetic variation and human disease mutations revealed several novel, potentially deleterious mutations. We tested 576 common variants in a case-control association analysis, carrying the top-5 associations over to replication in up to 719 DVT cases and 719 controls. We also conducted an analysis of the burden of nonsynonymous variants in coagulation factor and anticoagulant genes. We found an excess of rare missense mutations in anticoagulant genes in DVT cases compared to controls and an association for a missense polymorphism of FGA (rs6050; p = 1.9 × 10⁻⁵, OR 1.45; 95% CI, 1.22-1.72; after replication in > 1400 individuals). Conclusions: We implemented a barcode-based strategy to efficiently multiplex sequencing of hundreds of candidate genes in several individuals. In the relatively small dataset of our pilot study we were
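
    As a rough consistency check on the FGA rs6050 association reported above, the odds ratio and its 95% confidence interval can be converted back into an approximate z-score and two-sided p-value. This is only a sketch: it assumes the interval is a symmetric Wald interval on the log-odds scale, which the abstract does not state, and the function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z-score and two-sided p-value from an OR and its 95% CI.

    Assumption (not stated in the abstract): the CI is a Wald interval,
    i.e. log(OR) +/- z_crit * SE on the log-odds scale.
    """
    # Standard error of log(OR), recovered from the width of the interval.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Values quoted in the abstract: OR 1.45; 95% CI, 1.22-1.72.
z, p = wald_p_from_ci(1.45, 1.22, 1.72)
print(f"z = {z:.2f}, p = {p:.1e}")
```

    Under that assumption the recovered p-value is on the order of 2 × 10⁻⁵, the same order of magnitude as the reported value; small differences are expected from rounding of the published interval.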

  13. Sequencing of sporadic Attention-Deficit Hyperactivity Disorder (ADHD) identifies novel and potentially pathogenic de novo variants and excludes overlap with genes associated with autism spectrum disorder.

    PubMed

    Kim, Daniel Seung; Burt, Amber A; Ranchalis, Jane E; Wilmot, Beth; Smith, Joshua D; Patterson, Karynne E; Coe, Bradley P; Li, Yatong K; Bamshad, Michael J; Nikolas, Molly; Eichler, Evan E; Swanson, James M; Nigg, Joel T; Nickerson, Deborah A; Jarvik, Gail P

    2017-03-22

    Attention-Deficit Hyperactivity Disorder (ADHD) has high heritability; however, studies of common variation account for <5% of ADHD variance. Using data from affected participants without a family history of ADHD, we sought to identify de novo variants that could account for sporadic ADHD. Considering a total of 128 families, two analyses were conducted in parallel: first, in 11 unaffected parent/affected proband trios (or quads with the addition of an unaffected sibling) we completed exome sequencing. Six de novo missense variants at highly conserved bases were identified and validated from four of the 11 families: the brain-expressed genes TBC1D9, DAGLA, QARS, CSMD2, TRPM2, and WDR83. Separately, in 117 unrelated probands with sporadic ADHD, we sequenced a panel of 26 genes implicated in intellectual disability (ID) and autism spectrum disorder (ASD) to evaluate whether variation in ASD/ID-associated genes was also present in participants with ADHD. Only one putative deleterious variant (Gln600STOP) in CHD1L was identified; this was found in a single proband. Notably, no other nonsense, splice, frameshift, or highly conserved missense variants in the 26 gene panel were identified and validated. These data suggest that de novo variant analysis in families with independently adjudicated sporadic ADHD diagnosis can identify novel genes implicated in ADHD pathogenesis. Moreover, the finding that only one of the 128 cases (0.8%; 11 exome-sequenced and 117 MIP-sequenced participants) had putative deleterious variants in the 26 genes related to ID and ASD suggests significant independence in the genetic pathogenesis of ADHD as compared to ASD and ID phenotypes. © 2017 Wiley Periodicals, Inc.

  14. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds.

    PubMed

    Stafuzza, Nedenia Bonvino; Zerlotini, Adhemar; Lobo, Francisco Pereira; Yamagishi, Michel Eduardo Beleza; Chud, Tatiane Cristina Seleguim; Caetano, Alexandre Rodrigues; Munari, Danísio Prado; Garrick, Dorian J; Machado, Marco Antonio; Martins, Marta Fonseca; Carvalho, Maria Raquel; Cole, John Bruce; Barbosa da Silva, Marcos Vinicius Gualberto

    2017-01-01

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs.

  15. Single nucleotide variants and InDels identified from whole-genome re-sequencing of Guzerat, Gyr, Girolando and Holstein cattle breeds

    PubMed Central

    Lobo, Francisco Pereira; Yamagishi, Michel Eduardo Beleza; Chud, Tatiane Cristina Seleguim; Caetano, Alexandre Rodrigues; Munari, Danísio Prado; Garrick, Dorian J.; Machado, Marco Antonio; Martins, Marta Fonseca; Carvalho, Maria Raquel; Cole, John Bruce; Barbosa da Silva, Marcos Vinicius Gualberto

    2017-01-01

    Whole-genome re-sequencing, alignment and annotation analyses were undertaken for 12 sires representing four important cattle breeds in Brazil: Guzerat (multi-purpose), Gyr, Girolando and Holstein (dairy production). A total of approximately 4.3 billion reads from an Illumina HiSeq 2000 sequencer generated 10.7- to 16.4-fold genome coverage for each animal. A total of 27,441,279 single nucleotide variations (SNVs) and 3,828,041 insertions/deletions (InDels) were detected in the samples, of which 2,557,670 SNVs and 883,219 InDels were novel. The submission of these genetic variants to the dbSNP database significantly increased the number of known variants, particularly for the indicine genome. The concordance rate between genotypes obtained using the Bovine HD BeadChip array and the same variants identified by sequencing was about 99.05%. The annotation of variants identified numerous non-synonymous SNVs and frameshift InDels which could affect phenotypic variation. Functional enrichment analysis revealed that variants in the olfactory transduction pathway were overrepresented in all four cattle breeds, while the ECM-receptor interaction pathway was overrepresented in the Girolando and Guzerat breeds, the ABC transporters pathway was overrepresented only in the Holstein breed, and metabolic pathways were overrepresented only in the Gyr breed. The genetic variants discovered here provide a rich resource to help identify potential genomic markers and their associated molecular mechanisms that impact economically important traits for Gyr, Girolando, Guzerat and Holstein breeding programs. PMID:28323836

  16. Haplotype combination of the bovine CFL2 gene sequence variants and association with growth traits in Qinchuan cattle.

    PubMed

    Sun, Yujia; Lan, Xianyong; Lei, Chuzhao; Zhang, Chunlei; Chen, Hong

    2015-06-01

    The aim of this study was to examine the association of cofilin2 (CFL2) gene polymorphisms with growth traits in Chinese Qinchuan (QC) cattle. Three single nucleotide polymorphisms (SNPs) were identified in the bovine CFL2 gene using DNA sequencing and (forced) PCR-RFLP methods. These polymorphisms included a missense mutation (NC_007319.5: g. C 2213 G) in exon 4, one synonymous mutation (NC_007319.5: g. T 1694 A) in exon 4, and a mutation (NC_007319.5: g. G 1500 A) in intron 2. In addition, we evaluated the haplotype frequency and linkage disequilibrium coefficient of the three sequence variants in 488 QC cattle. All three SNPs showed an intermediate level of genetic diversity (0.25 < PIC < 0.33). Association analysis indicated that SNP G 1500 A, T 1694 A and C 2213 G were significantly associated with growth traits in the QC population. The results of our study suggest that the CFL2 gene may be a strong candidate gene that affects growth traits in the QC cattle breeding program.
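
    The "intermediate level of genetic diversity" in this record is presumably judged by polymorphism information content (PIC); that reading is an assumption, since the abstract does not define the measure. A minimal sketch of the standard PIC formula and the conventional thresholds (low ≤ 0.25 < intermediate ≤ 0.5 < high) follows. The allele frequencies used are illustrative only and are not taken from the study.

```python
def pic(freqs):
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    `freqs` holds the allele frequencies, which must sum to 1.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross_term


def diversity_class(value):
    """Conventional three-level classification of a PIC value."""
    if value > 0.5:
        return "high"
    if value > 0.25:
        return "intermediate"
    return "low"


# Illustrative biallelic SNPs (hypothetical frequencies).
for freqs in ([0.5, 0.5], [0.8, 0.2], [0.9, 0.1]):
    value = pic(freqs)
    print(freqs, round(value, 4), diversity_class(value))
```

    For a biallelic SNP the PIC cannot exceed 0.375 (reached at equal allele frequencies), so a SNP can be at most "intermediate" under this classification.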

  17. Next-generation sequencing of hereditary hemochromatosis-related genes: Novel likely pathogenic variants found in the Portuguese population.

    PubMed

    Faria, Ricardo; Silva, Bruno; Silva, Catarina; Loureiro, Pedro; Queiroz, Ana; Fraga, Sofia; Esteves, Jorge; Mendes, Diana; Fleming, Rita; Vieira, Luís; Gonçalves, João; Faustino, Paula

    2016-10-01

    Hereditary hemochromatosis (HH) is an autosomal recessive disorder characterized by excessive iron absorption resulting in pathologically increased body iron stores. It is typically associated with common HFE gene mutations (p.Cys282Tyr and p.His63Asp). However, in Southern European populations up to one third of HH patients do not carry the risk genotypes. This study aimed to explore the use of next-generation sequencing (NGS) technology to analyse a panel of iron metabolism-related genes (HFE, TFR2, HJV, HAMP, SLC40A1, and FTL) in 87 non-classic HH Portuguese patients. A total of 1241 genetic alterations were detected corresponding to 53 different variants, 13 of which were not described in the available public databases. Among them, five were predicted to be potentially pathogenic: three novel mutations in TFR2 [two missense (p.Leu750Pro and p.Ala777Val) and one intronic splicing mutation (c.967-1G>C)], one missense mutation in HFE (p.Tyr230Cys), and one mutation in the 5'-UTR of HAMP gene (c.-25G>A). The results reported here illustrate the usefulness of NGS for targeted iron metabolism-related gene panels, as a likely cost-effective approach for molecular genetics diagnosis of non-classic HH patients. Simultaneously, it has contributed to the knowledge of the pathophysiology of those rare iron metabolism-related disorders.
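
    The protein-level variants in this record are written in three-letter HGVS style (for example p.Cys282Tyr). The sketch below converts simple missense descriptions of that form to the one-letter shorthand often seen elsewhere (C282Y). It is a minimal illustration, not a full HGVS parser: the function name is hypothetical, and it deliberately rejects anything other than a single amino acid substitution.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Matches only simple substitutions such as "p.Cys282Tyr".
_MISSENSE = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def short_form(hgvs_protein):
    """Convert e.g. 'p.Cys282Tyr' to 'C282Y'; raise ValueError otherwise."""
    match = _MISSENSE.match(hgvs_protein)
    if match is None:
        raise ValueError(f"not a simple missense description: {hgvs_protein!r}")
    ref, position, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {hgvs_protein!r}")
    return f"{THREE_TO_ONE[ref]}{position}{THREE_TO_ONE[alt]}"


# Missense variants named in the abstract.
for name in ("p.Cys282Tyr", "p.His63Asp", "p.Leu750Pro", "p.Ala777Val", "p.Tyr230Cys"):
    print(name, "->", short_form(name))
```

    Nucleotide-level descriptions such as c.967-1G>C or c.-25G>A are outside the scope of this sketch and raise ValueError.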

  18. Every amino acid matters: essential contributions of histone variants to mammalian development and disease

    PubMed Central

    Maze, Ian; Noh, Kyung-Min; Soshnev, Alexey A.; Allis, C. David

    2014-01-01

    Despite a conserved role for histones as general DNA packaging agents, it is now clear that another key function of these proteins is to confer variations in chromatin structure to ensure dynamic patterns of transcriptional regulation in eukaryotes. The incorporation of histone variants is particularly important to this process. Recent knockdown and knockout studies in various cellular systems, as well as direct mutational evidence from human cancers, now suggest a crucial role for histone variant regulation in processes as diverse as differentiation and proliferation, meiosis and nuclear reprogramming. In this Review, we provide an overview of histone variants in the context of their unique functions during mammalian germ cell and embryonic development, and examine the consequences of aberrant histone variant regulation in human disease. PMID:24614311

  19. Development and Validation of a Next-Generation Sequencing Assay for BRCA1 and BRCA2 Variants for the Clinical Laboratory

    PubMed Central

    Strom, Charles M.; Rivera, Steven; Elzinga, Christopher; Angeloni, Taraneh; Rosenthal, Sun Hee; Goos-Root, Dana; Siaw, Martin; Platt, Jamie; Braastadt, Cory; Cheng, Linda; Ross, David; Sun, Weimin

    2015-01-01

    The objective of this study was to design and validate a next-generation sequencing (NGS) assay to detect BRCA1 and BRCA2 mutations. We developed an assay using random shearing of genomic DNA followed by RNA bait tile hybridization and sequencing on both the Illumina MiSeq and Ion Personal Gene Machine (PGM). We determined that the MiSeq Reporter software supplied with the instrument could not detect deletions greater than 9 base pairs. Therefore, we developed an alternative alignment and variant calling software, Quest Sequencing Analysis Pipeline (QSAP), that was capable of detecting large deletions and insertions. In validation studies, we used DNA from 27 stem cell lines, all with known deleterious BRCA1 or BRCA2 mutations, and DNA from 67 consented control individuals who had a total of 352 benign variants. Both the MiSeq/QSAP combination and PGM/Torrent Suite combination had 100% sensitivity for the 379 known variants in the validation series. However, the PGM/Torrent Suite combination had a lower intra- and inter-assay precision of 96.2% and 96.7%, respectively, when compared to the MiSeq/QSAP combination of 100% and 99.4%, respectively. All PGM/Torrent Suite inconsistencies were false-positive variant assignments. We began commercial testing using both platforms and in the first 521 clinical samples MiSeq/QSAP had 100% sensitivity for BRCA1/2 variants, including a 64-bp deletion and a 10-bp insertion not identified by PGM/Torrent Suite, which also suffered from a high false-positive rate. Neither the MiSeq nor the PGM platform with its supplied alignment and variant calling software is appropriate for a clinical laboratory BRCA sequencing test. We have developed an NGS BRCA1/2 sequencing assay, MiSeq/QSAP, with 100% analytic sensitivity and specificity in the validation set consisting of 379 variants. The MiSeq/QSAP combination has sufficient performance for use in a clinical laboratory. PMID:26295337
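
    The validation counts in this record can be tied together with a short calculation: 27 known deleterious mutations plus 352 benign variants give the 379 known variants, and sensitivity is the fraction of known variants that the assay detects. The sketch below uses the counts from the abstract; the example with two missed variants is hypothetical and included only to show how the figure would change.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of known variants detected: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)


# Counts quoted in the abstract.
known_deleterious = 27
known_benign = 352
known_total = known_deleterious + known_benign  # 379 known variants

# Every known variant detected, as reported for the validation series.
print(f"{sensitivity(known_total, 0):.1%}")

# Hypothetical comparison: an assay that missed two of the known variants.
print(f"{sensitivity(known_total - 2, 2):.1%}")
```

    Note that the "precision" figures in the abstract describe intra- and inter-assay reproducibility, which this calculation does not cover.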

  20. Systematic Identification of Single Amino Acid Variants in Glioma Stem-Cell-Derived Chromosome 19 Proteins

    PubMed Central

    2015-01-01

    Novel proteoforms with single amino acid variations represent proteins that often have altered biological functions but are less explored in the human proteome. We have developed an approach, searching high quality shotgun proteomic data against an extended protein database, to identify expressed mutant proteoforms in glioma stem cell (GSC) lines. The systematic search of MS/MS spectra using PEAKS 7.0 as the search engine has recognized 17 chromosome 19 proteins in GSCs with altered amino acid sequences. The results were further verified by manual spectral examination, validating 19 proteoforms. One of the novel findings, a mutant form of branched-chain aminotransferase 2 (p.Thr186Arg), was verified at the transcript level and by targeted proteomics in several glioma stem cell lines. The structure of this proteoform was examined by molecular modeling in order to estimate conformational changes due to mutation that might lead to functional modifications potentially linked to glioma. Based on our initial findings, we believe that our approach presented could contribute to construct a more complete map of the human functional proteome. PMID:25399873

  1. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. PMID:3839076

  2. Molecular characterization of two Pepino mosaic virus variants from imported tomato seed reveals high levels of sequence identity between Chilean and US isolates.

    PubMed

    Ling, Kai-Shu

    2007-01-01

    Pepino mosaic virus (PepMV), a member of the genus Potexvirus, was first described in South America on pepino (Solanum muricatum A.). Only in recent years, it was reported to infect greenhouse-grown tomatoes. Genome nucleotide sequences from several European isolates showed extensive sequence identity (>99%). Recent genome nucleotide sequences from two US isolates (US1 and US2) however showed much greater sequence divergence from that of the European PepMV isolates. My interest in characterizing virus isolates from South America was due to an active commercial tomato seed production in Chile. Through genome sequence comparison and phylogenetic analyses, we may be able to understand the source of virus infection and control this devastating disease from further spreading into new tomato growing regions of the world. Complete genome nucleotide sequences from two PepMV variants (designated as Ch1 and Ch2) were determined from a virus isolate obtained from a commercial tomato seed lot produced in Chile. Using RT-PCR-based genome walking strategy, complete genome sequences from these two variants were determined. Excluding poly (A) tails, the genomes of PepMV Ch1 and Ch2 were 6414 and 6412 nucleotides (nt), respectively. Pairwise comparisons of PepMV Ch1 and Ch2 genomes with other PepMV isolates showed that the highest nucleotide sequence identity was with two US isolates, 98.7% between PepMV Ch1 and US1, and 90.7% between Ch2 and US2. Similar to PepMV US1 and US2, the two Chilean variants were the most divergent from one another (78% nt identity). These two Chilean PepMV variants also shared only 78-86% nucleotide sequence identity to that of five European isolates. The high level of nucleotide sequence identity between Chilean and US isolates suggests a common origin. Phylogenetic analyses with various gene products generated three distinct sequence clusters (or strains): US1 and Ch1 in the first group, US2 and Ch2 in the second, and the European tomato isolates in
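
    The pairwise nucleotide identities quoted in this record (for example 98.7% between Ch1 and US1) are percentages of matching positions in an alignment. A minimal sketch of that calculation is given below. It assumes the two sequences are already aligned to equal length and skips any column containing a gap, which is only one of several conventions; the abstract does not say which was used. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns where either sequence has a gap are excluded from the
    comparison (an assumed convention, not the study's stated method).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == gap or base_b == gap:
            continue  # skip gapped columns
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real PepMV sequence.
print(percent_identity("ACGTACGT", "ACGTACGA"))  # 7 of 8 columns match
```

    Counting gapped columns as mismatches instead would lower the reported identity, so the convention matters when comparing figures across studies.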

  3. The human liver fatty acid binding protein T94A variant alters the structure, stability, and interaction with fibrates.

    PubMed

    Martin, Gregory G; McIntosh, Avery L; Huang, Huan; Gupta, Shipra; Atshaves, Barbara P; Landrock, Kerstin K; Landrock, Danilo; Kier, Ann B; Schroeder, Friedhelm

    2013-12-23

    Although the human liver fatty acid binding protein (L-FABP) T94A variant arises from the most commonly occurring single-nucleotide polymorphism in the entire FABP family, there is a complete lack of understanding regarding the role of this polymorphism in human disease. It has been hypothesized that the T94A substitution results in the complete loss of ligand binding ability and function analogous to that seen with L-FABP gene ablation. This possibility was addressed using the recombinant human wild-type (WT) T94T and T94A variant L-FABP and cultured primary human hepatocytes. Nonconservative replacement of the medium-sized, polar, uncharged T residue with a smaller, nonpolar, aliphatic A residue at position 94 of the human L-FABP significantly increased the L-FABP α-helical structure content at the expense of β-sheet content and concomitantly decreased the thermal stability. T94A did not alter the binding affinities for peroxisome proliferator-activated receptor α (PPARα) agonist ligands (phytanic acid, fenofibrate, and fenofibric acid). While T94A did not alter the impact of phytanic acid and only slightly altered that of fenofibrate on the human L-FABP secondary structure, the active metabolite fenofibric acid altered the T94A secondary structure much more than that of the WT T94T L-FABP. Finally, in cultured primary human hepatocytes, the T94A variant exhibited a significantly reduced extent of fibrate-mediated induction of PPARα-regulated proteins such as L-FABP, FATP5, and PPARα itself. Thus, while the T94A substitution did not alter the affinity of the human L-FABP for PPARα agonist ligands, it significantly altered the human L-FABP structure, stability, and conformational and functional response to fibrate.

  4. Phylogenetic analysis of beta-papillomaviruses as inferred from nucleotide and amino acid sequence data.

    PubMed

    Gottschling, Marc; Köhler, Anja; Stockfleth, Eggert; Nindl, Ingo

    2007-01-01

    Human papillomaviruses (HPV) of the beta-group seem to be involved in the pathogenesis of non-melanoma skin cancer. Papillomaviruses are host specific and are considered to co-evolve closely with their hosts. Evolutionary incongruence between early genes and late genes has been reported among oncogenic genital alpha-papillomaviruses and considerably challenges phylogenetic reconstructions. We investigated the relationships of 29 beta-HPV (25 types plus four putative new types, subtypes, or variants) as inferred from codon aligned and amino acid sequence data of the genes E1, E2, E6, E7, L1, and L2 using likelihood, distance, and parsimony approaches. An analysis of a L1 fragment included additional nucleotide and amino acid sequences from seven non-human beta-papillomaviruses. The evolution of early genes and late genes did not conflict significantly in beta-papillomaviruses based on partition homogeneity tests (p ≥ 0.001). As inferred from the complete genome analyses, beta-papillomaviruses were monophyletic and segregated into four highly supported monophyletic assemblages corresponding to the species 1, 2, 3, and fused 4/5. They basically split into the species 1 and the remainder of beta-papillomaviruses, whose species 3, 4, and 5 constituted the sister group of species 2. beta-Papillomaviruses have been isolated from humans, apes, and monkeys, and phylogenetic analyses of the L1 fragment showed the non-human papillomaviruses to be highly polyphyletic, nesting within the HPV species. Thus, host and virus phylogenies were not congruent in beta-papillomaviruses, and multiple invasions across species borders may contribute (in addition to host-linked evolution) to their diversification.

  5. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  6. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  7. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  8. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  9. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  10. Sequencing analysis of ghrelin gene 5' flanking region: relations between the sequence variants, fasting plasma total ghrelin concentrations, and body mass index.

    PubMed

    Vartiainen, Johanna; Kesäniemi, Y Antero; Ukkola, Olavi

    2006-10-01

    Ghrelin is a 28-amino-acid peptide with several functions linked to energy metabolism. Low ghrelin plasma concentrations are associated with obesity, hypertension, and type 2 diabetes mellitus, whereas high concentrations reflect states of negative energy balance. Several studies addressing the hormonal and neural regulation of ghrelin gene expression have been carried out, but the role of genetic factors in the regulation of ghrelin plasma levels remains unclear. To elucidate the role of genetic factors in the regulation of ghrelin expression, we screened 1657 nucleotides of the ghrelin gene 5' flanking region (promoter and possible regulatory sites) for new sequence variations from patient samples with low (n = 50) and high (n = 50) fasting plasma total ghrelin concentrations (low- and high-ghrelin groups). Eleven single nucleotide polymorphisms (SNPs), 3 of which were rare variants (allelic frequency less than 1%), were found in our population. The genotype distribution patterns of the SNPs did not differ between the study groups, except for SNP-501A>C (P = .039). In addition, SNP-501A>C was associated with body mass index (BMI) (P = .018). This variant was studied further in our large and well-defined Oulu Project Elucidating Risk for Atherosclerosis (OPERA) cohort (n = 1045) by the restriction fragment length polymorphism (RFLP) technique. No significant association of SNP-501A>C genotypes with fasting ghrelin plasma concentrations was found in the whole OPERA population. However, the association of this SNP with BMI and with waist circumference reached statistical significance in OPERA (P = .047 and .049, respectively), remaining of borderline significance for BMI after adjustments (P = .055). The results indicate that factors other than the 11 SNPs found in this study in the 5' flanking region of ghrelin gene are the main determinants of ghrelin plasma levels. However, SNP-501A>C genotype distribution seems to be different in subjects having the highest

  11. Analysis of coding variants identified from exome sequencing resources for association with diabetic and non-diabetic nephropathy in African Americans.

    PubMed

    Cooke Bailey, Jessica N; Palmer, Nicholette D; Ng, Maggie C Y; Bonomo, Jason A; Hicks, Pamela J; Hester, Jessica M; Langefeld, Carl D; Freedman, Barry I; Bowden, Donald W

    2014-06-01

    Prior studies have identified common genetic variants influencing diabetic and non-diabetic nephropathy, diseases which disproportionately affect African Americans. Recently, exome sequencing techniques have facilitated identification of coding variants on a genome-wide basis in large samples. Exonic variants in known or suspected end-stage kidney disease (ESKD) or nephropathy genes can be tested for their ability to identify association either singly or in combination with known associated common variants. Coding variants in genes with prior evidence for association with ESKD or nephropathy were identified in the NHLBI-ESP GO database and genotyped in 5,045 African Americans (3,324 cases with type 2 diabetes associated nephropathy [T2D-ESKD] or non-T2D ESKD, and 1,721 controls) and 1,465 European Americans (568 T2D-ESKD cases and 897 controls). Logistic regression analyses were performed to assess association, with admixture and APOL1 risk status incorporated as covariates. Ten of 31 SNPs were associated in African Americans; four replicated in European Americans. In African Americans, SNPs in OR2L8, OR2AK2, C6orf167 (MMS22L), LIMK2, APOL3, APOL2, and APOL1 were nominally associated (P = 1.8 × 10(-4)-0.044). Haplotype analysis of common and coding variants increased evidence of association at the OR2L13 and APOL1 loci (P = 6.2 × 10(-5) and 4.6 × 10(-5), respectively). SNPs replicating in European Americans were in OR2AK2, LIMK2, and APOL2 (P = 0.0010-0.037). Meta-analyses highlighted four SNPs associated in T2D-ESKD and all-cause ESKD. Results from this study suggest a role for coding variants in the development of diabetic, non-diabetic, and/or all-cause ESKD in African Americans and/or European Americans.

  12. Whole-exome sequencing and imaging genetics identify functional variants for rate of change in hippocampal volume in mild cognitive impairment

    PubMed Central

    Nho, K; Corneveaux, JJ; Kim, S; Lin, H; Risacher, SL; Shen, L; Swaminathan, S; Ramanan, VK; Liu, Y; Foroud, T; Inlow, MH; Siniard, AL; Reiman, RA; Aisen, PS; Petersen, RC; Green, RC; Jack, CR; Weiner, MW; Baldwin, CT; Lunetta, K; Farrer, LA; Furney, SJ; Lovestone, S; Simmons, A; Mecocci, P; Vellas, B; Tsolaki, M; Kloszewska, I; Soininen, H; McDonald, BC; Farlow, MR; Ghetti, B; Huentelman, MJ; Saykin, AJ

    2013-01-01

    Whole-exome sequencing of individuals with mild cognitive impairment, combined with genotype imputation, was used to identify coding variants other than the apolipoprotein E (APOE) ε4 allele associated with rate of hippocampal volume loss using an extreme trait design. Matched unrelated APOE ε3 homozygous male Caucasian participants from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) were selected at the extremes of the 2-year longitudinal change distribution of hippocampal volume (eight subjects with rapid rates of atrophy and eight with slow/stable rates of atrophy). We identified 57 non-synonymous single nucleotide variants (SNVs) which were found exclusively in at least 4 of 8 subjects in the rapid atrophy group, but not in any of the 8 subjects in the slow atrophy group. Among these SNVs, the variants that accounted for the greatest group difference and were predicted in silico as ‘probably damaging’ missense variants were rs9610775 (CARD10) and rs1136410 (PARP1). To further investigate and extend the exome findings in a larger sample, we conducted quantitative trait analysis including whole-brain search in the remaining ADNI APOE ε3/ε3 group (N =315). Genetic variation within PARP1 and CARD10 was associated with rate of hippocampal neurodegeneration in APOE ε3/ε3. Meta-analysis across five independent cross sectional cohorts indicated that rs1136410 is also significantly associated with hippocampal volume in APOE ε3/ε3 individuals (N =923). Larger sequencing studies and longitudinal follow-up are needed for confirmation. The combination of next-generation sequencing and quantitative imaging phenotypes holds significant promise for discovery of variants involved in neurodegeneration. PMID:23608917
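
    The variant-selection rule in this extreme trait design (present in at least 4 of the 8 rapid-atrophy subjects and in none of the 8 slow-atrophy subjects) can be expressed as a simple set filter. The sketch below is an illustration of that rule only: the function name, subject labels, and variant identifiers are hypothetical, and real pipelines would work from genotype calls rather than hand-built sets.

```python
def exclusive_to_rapid(carriers, rapid, slow, min_rapid=4):
    """Return variants carried by >= min_rapid rapid-group subjects
    and by no slow-group subject.

    carriers: dict mapping variant id -> set of subject ids carrying it.
    rapid, slow: sets of subject ids in each extreme group.
    """
    selected = []
    for variant, subjects in carriers.items():
        n_rapid = len(subjects & rapid)
        n_slow = len(subjects & slow)
        if n_rapid >= min_rapid and n_slow == 0:
            selected.append(variant)
    return sorted(selected)


# Hypothetical groups of eight subjects each, as in the design described.
rapid = {f"R{i}" for i in range(1, 9)}
slow = {f"S{i}" for i in range(1, 9)}

# Hypothetical carrier lists for three variants.
carriers = {
    "var_a": {"R1", "R2", "R3", "R4"},              # 4 rapid, 0 slow: kept
    "var_b": {"R1", "R2", "R3", "R4", "R5", "S1"},  # seen in slow group: dropped
    "var_c": {"R1", "R2"},                          # too few rapid carriers: dropped
}

print(exclusive_to_rapid(carriers, rapid, slow))  # ['var_a']
```

    With only eight subjects per group, a filter like this is a prioritisation step rather than a statistical test, which is consistent with the abstract's follow-up analysis in a larger sample.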

  13. Whole-exome sequencing and imaging genetics identify functional variants for rate of change in hippocampal volume in mild cognitive impairment.

    PubMed

    Nho, K; Corneveaux, J J; Kim, S; Lin, H; Risacher, S L; Shen, L; Swaminathan, S; Ramanan, V K; Liu, Y; Foroud, T; Inlow, M H; Siniard, A L; Reiman, R A; Aisen, P S; Petersen, R C; Green, R C; Jack, C R; Weiner, M W; Baldwin, C T; Lunetta, K; Farrer, L A; Furney, S J; Lovestone, S; Simmons, A; Mecocci, P; Vellas, B; Tsolaki, M; Kloszewska, I; Soininen, H; McDonald, B C; Farlow, M R; Ghetti, B; Huentelman, M J; Saykin, A J

    2013-07-01

    Whole-exome sequencing of individuals with mild cognitive impairment, combined with genotype imputation, was used to identify coding variants other than the apolipoprotein E (APOE) ε4 allele associated with rate of hippocampal volume loss using an extreme trait design. Matched unrelated APOE ε3 homozygous male Caucasian participants from the Alzheimer's Disease Neuroimaging Initiative (ADNI) were selected at the extremes of the 2-year longitudinal change distribution of hippocampal volume (eight subjects with rapid rates of atrophy and eight with slow/stable rates of atrophy). We identified 57 non-synonymous single nucleotide variants (SNVs) which were found exclusively in at least 4 of 8 subjects in the rapid atrophy group, but not in any of the 8 subjects in the slow atrophy group. Among these SNVs, the variants that accounted for the greatest group difference and were predicted in silico as 'probably damaging' missense variants were rs9610775 (CARD10) and rs1136410 (PARP1). To further investigate and extend the exome findings in a larger sample, we conducted quantitative trait analysis including whole-brain search in the remaining ADNI APOE ε3/ε3 group (N=315). Genetic variation within PARP1 and CARD10 was associated with rate of hippocampal neurodegeneration in APOE ε3/ε3. Meta-analysis across five independent cross sectional cohorts indicated that rs1136410 is also significantly associated with hippocampal volume in APOE ε3/ε3 individuals (N=923). Larger sequencing studies and longitudinal follow-up are needed for confirmation. The combination of next-generation sequencing and quantitative imaging phenotypes holds significant promise for discovery of variants involved in neurodegeneration.

  14. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  15. A variant in the sonic hedgehog regulatory sequence (ZRS) is associated with triphalangeal thumb and deregulates expression in the developing limb

    PubMed Central

    Furniss, Dominic; Lettice, Laura A.; Taylor, Indira B.; Critchley, Paul S.; Giele, Henk; Hill, Robert E.; Wilkie, Andrew O.M.

    2008-01-01

    A locus for triphalangeal thumb, variably associated with pre-axial polydactyly, was previously identified in the zone of polarizing activity regulatory sequence (ZRS), a long range limb-specific enhancer of the Sonic Hedgehog (SHH) gene at human chromosome 7q36.3. Here, we demonstrate that a 295T>C variant in the human ZRS, previously thought to represent a neutral polymorphism, acts as a dominant allele with reduced penetrance. We found this variant in three independently ascertained probands from southern England with triphalangeal thumb, demonstrated significant linkage of the phenotype to the variant (LOD = 4.1), and identified a shared microsatellite haplotype around the ZRS, suggesting that the probands share a common ancestor. An individual homozygous for the 295C allele presented with isolated bilateral triphalangeal thumb resembling the heterozygous phenotype, suggesting that the variant is largely dominant to the wild-type allele. As a functional test of the pathogenicity of the 295C allele, we utilized a mutated ZRS construct to demonstrate that it can drive ectopic anterior expression of a reporter gene in the developing mouse forelimb. We conclude that the 295T>C variant is in fact pathogenic and, in southern England, appears to be the most common cause of triphalangeal thumb. Depending on the dispersal of the founding mutation, it may play a wider role in the aetiology of this disorder. PMID:18463159

  16. A novel small acid soluble protein variant is important for spore resistance of most Clostridium perfringens food poisoning isolates.

    PubMed

    Li, Jihong; McClane, Bruce A

    2008-05-02

    Clostridium perfringens is a major cause of food poisoning (FP) in developed countries. C. perfringens isolates usually induce the gastrointestinal symptoms of this FP by producing an enterotoxin that is encoded by a chromosomal (cpe) gene. Those typical FP strains also produce spores that are extremely resistant to food preservation approaches such as heating and chemical preservatives. This resistance favors their survival and subsequent germination in improperly cooked, prepared, or stored foods. The current study identified a novel alpha/beta-type small acid soluble protein, now named Ssp4, and showed that sporulating cultures of FP isolates producing resistant spores consistently express a variant Ssp4 with an Asp substitution at residue 36. In contrast, Gly was detected at Ssp4 residue 36 in C. perfringens strains producing sensitive spores. Studies with isogenic mutants and complementing strains demonstrated the importance of the Asp 36 Ssp4 variant for the exceptional heat and sodium nitrite resistance of spores made by most FP strains carrying a chromosomal cpe gene. Electrophoretic mobility shift assays and DNA binding studies showed that Ssp4 variants with an Asp at residue 36 bind more efficiently and tightly to DNA than do Ssp4 variants with Gly at residue 36. Besides suggesting one possible mechanistic explanation for the highly resistant spore phenotype of most FP strains carrying a chromosomal cpe gene, these findings may facilitate eventual development of targeted strategies to increase killing of the resistant spores in foods. They also provide the first indication that SASP variants can be important contributors to intra-species (and perhaps inter-species) variations in bacterial spore resistance phenotypes. Finally, Ssp4 may contribute to spore resistance properties throughout the genus Clostridium since ssp4 genes also exist in the genomes of other clostridial species.

  17. Computer selection of oligonucleotide probes from amino acid sequences for use in gene library screening.

    PubMed

    Yang, J H; Ye, J H; Wallace, D C

    1984-01-11

    We present a computer program, FINPROBE, which utilizes known amino acid sequence data to deduce minimum redundancy oligonucleotide probes for use in screening cDNA or genomic libraries or in primer extension. The user enters the amino acid sequence of interest, the desired probe length, the number of probes sought, and the constraints on oligonucleotide synthesis. The computer generates a table of possible probes listed in increasing order of redundancy and provides the location of each probe in the protein and mRNA coding sequence. Activation of a next function provides the amino acid and mRNA sequences of each probe of interest as well as the complementary sequence and the minimum dissociation temperature of the probe. A final routine prints out the amino acid sequence of the protein in parallel with the mRNA sequence listing all possible codons for each amino acid.
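
    The ranking idea described above can be sketched in a few lines. This is not the FINPROBE code; it assumes the standard genetic code, takes redundancy to be the product of synonymous codon counts across a window, and handles only probe lengths that are whole codons. The function and variable names are hypothetical.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def rank_probe_windows(protein, probe_nt, n_probes):
    """Return the n_probes least redundant windows as (redundancy, start, peptide).

    protein: amino acid sequence in one-letter code.
    probe_nt: probe length in nucleotides (must be a multiple of 3 in this sketch).
    start is the 1-based position of the window in the protein.
    """
    if probe_nt % 3 != 0:
        raise ValueError("this sketch only handles probe lengths that are whole codons")
    width = probe_nt // 3
    windows = []
    for start in range(len(protein) - width + 1):
        peptide = protein[start:start + width]
        # Upper bound on the number of distinct oligonucleotides encoding the peptide.
        redundancy = prod(CODON_COUNTS[aa] for aa in peptide)
        windows.append((redundancy, start + 1, peptide))
    # Sort by increasing redundancy, then by position.
    return sorted(windows)[:n_probes]


best = rank_probe_windows("MWLSR", probe_nt=6, n_probes=2)
```

    Windows rich in Met and Trp rank first because each has a single codon; a real tool would also apply the synthesis constraints mentioned in the abstract, which this sketch omits.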

  18. Sequence and Copy Number Analyses of HEXB Gene in Patients Affected by Sandhoff Disease: Functional Characterization of 9 Novel Sequence Variants

    PubMed Central

    Zampieri, Stefania; Cattarossi, Silvia; Oller Ramirez, Ana Maria; Rosano, Camillo; Lourenco, Charles Marques; Passon, Nadia; Moroni, Isabella; Uziel, Graziella; Pettinari, Antonella; Stanzial, Franco; de Kremer, Raquel Dodelson; Azar, Nydia Beatriz; Hazan, Filiz; Filocamo, Mirella; Bembi, Bruno; Dardis, Andrea

    2012-01-01

    Sandhoff disease (SD) is a lysosomal disorder caused by mutations in the HEXB gene. To date, 43 mutations of HEXB have been described, including 3 large deletions. Here, we have characterized 14 unrelated SD patients and developed a Multiplex Ligation-dependent Probe Amplification (MLPA) assay to investigate the presence of large HEXB deletions. Overall, we identified 16 alleles, 9 of which were novel, including 4 sequence variations leading to amino acid changes [c.626C>T (p.T209I), c.634C>A (p.H212N), c.926G>T (p.C309F), c.1451G>A (p.G484E)], 3 intronic mutations (c.1082+5G>A, c.1242+1G>A, c.1169+5G>A), 1 nonsense mutation c.146C>A (p.S49X) and 1 small in-frame deletion c.1260_1265delAGTTGA (p.V421_E422del). Using the new MLPA assay, 2 previously described deletions were identified. In vitro expression studies showed that proteins bearing the amino acid changes p.T209I and p.G484E had very low or absent activity, while proteins bearing the p.H212N and p.C309F changes retained significant residual activity. The detrimental effect of the 3 novel intronic mutations on HEXB mRNA processing was demonstrated using a minigene assay. Unprecedentedly, minigene studies revealed the presence of a novel alternatively spliced HEXB mRNA variant also present in normal cells. In conclusion, we provided new insights into the molecular basis of SD and validated an MLPA assay for detecting large HEXB deletions. PMID:22848519

  19. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  20. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  1. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  2. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  3. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  4. Sequencing the GRHL3 Coding Region Reveals Rare Truncating Mutations and a Common Susceptibility Variant for Nonsyndromic Cleft Palate

    PubMed Central

    Mangold, Elisabeth; Böhmer, Anne C.; Ishorst, Nina; Hoebel, Ann-Kathrin; Gültepe, Pinar; Schuenke, Hannah; Klamt, Johanna; Hofmann, Andrea; Gölz, Lina; Raff, Ruth; Tessmann, Peter; Nowak, Stefanie; Reutter, Heiko; Hemprich, Alexander; Kreusch, Thomas; Kramer, Franz-Josef; Braumann, Bert; Reich, Rudolf; Schmidt, Gül; Jäger, Andreas; Reiter, Rudolf; Brosch, Sibylle; Stavusis, Janis; Ishida, Miho; Seselgyte, Rimante; Moore, Gudrun E.; Nöthen, Markus M.; Borck, Guntram; Aldhorae, Khalid A.; Lace, Baiba; Stanier, Philip; Knapp, Michael; Ludwig, Kerstin U.

    2016-01-01

    Nonsyndromic cleft lip with/without cleft palate (nsCL/P) and nonsyndromic cleft palate only (nsCPO) are the most frequent subphenotypes of orofacial clefts. A common syndromic form of orofacial clefting is Van der Woude syndrome (VWS) where individuals have CL/P or CPO, often but not always associated with lower lip pits. Recently, ∼5% of VWS-affected individuals were identified with mutations in the grainy head-like 3 gene (GRHL3). To investigate GRHL3 in nonsyndromic clefting, we sequenced its coding region in 576 Europeans with nsCL/P and 96 with nsCPO. Most strikingly, nsCPO-affected individuals had a higher minor allele frequency for rs41268753 (0.099) than control subjects (0.049; p = 1.24 × 10^−2). This association was replicated in nsCPO/control cohorts from Latvia, Yemen, and the UK (p_combined = 2.63 × 10^−5; OR_allelic = 2.46 [95% CI 1.6–3.7]) and reached genome-wide significance in combination with imputed data from a GWAS in nsCPO triads (p = 2.73 × 10^−9). Notably, rs41268753 is not associated with nsCL/P (p = 0.45). rs41268753 encodes the highly conserved p.Thr454Met (c.1361C>T) (GERP = 5.3), which prediction programs denote as deleterious, has a CADD score of 29.6, and increases protein binding capacity in silico. Sequencing also revealed four novel truncating GRHL3 mutations including two that were de novo in four families, where all nine individuals harboring mutations had nsCPO. This is important for genetic counseling: given that VWS is rare compared to nsCPO, our data suggest that dominant GRHL3 mutations are more likely to cause nonsyndromic than syndromic CPO. Thus, with rare dominant mutations and a common risk variant in the coding region, we have identified an important contribution for GRHL3 in nsCPO. PMID:27018475

  5. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  6. The LITAF/SIMPLE I92V sequence variant results in an earlier age of onset of CMT1A/HNPP diseases.

    PubMed

    Sinkiewicz-Darol, Elena; Lacerda, Andressa Ferreira; Kostera-Pruszczyk, Anna; Potulska-Chromik, Anna; Sokołowska, Beata; Kabzińska, Dagmara; Brunetti, Craig R; Hausmanowa-Petrusewicz, Irena; Kochański, Andrzej

    2015-01-01

    Charcot-Marie-Tooth disease type 1A (CMT1A) and hereditary neuropathy with liability to pressure palsies (HNPP) represent the most common heritable neuromuscular disorders. Molecular diagnostics of CMT1A/HNPP diseases confirm clinical diagnosis, but their value is limited to the clinical course and prognosis. However, no biomarkers of CMT1A/HNPP have been identified. We decided to explore if the LITAF/SIMPLE gene shared a functional link to the PMP22 gene, whose duplication or deletion results in CMT1A and HNPP, respectively. By studying a large cohort of CMT1A/HNPP-affected patients, we found that the LITAF I92V sequence variant predisposes patients to an earlier age of onset of both the CMT1A and HNPP diseases. Using cell transfection experiments, we showed that the LITAF I92V sequence variant partially mislocalizes to the mitochondria in contrast to wild-type LITAF which localizes to the late endosome/lysosomes and is associated with a tendency for PMP22 to accumulate in the cells. Overall, this study shows that the I92V LITAF sequence variant would be a good candidate for a biomarker in the case of the CMT1A/HNPP disorders.

  7. Next-Generation Sequencing of 5' Untranslated Region of Hepatitis C Virus in Search of Minor Viral Variant in a Patient Who Revealed New Genotype While on Antiviral Treatment.

    PubMed

    Caraballo Cortes, Kamila; Bukowska-Ośko, Iwona; Pawełczyk, Agnieszka; Perlejewski, Karol; Płoski, Rafał; Lechowicz, Urszula; Stawiński, Piotr; Demkow, Urszula; Laskus, Tomasz; Radkowski, Marek

    2016-01-01

    The role of mixed infections with different hepatitis C virus (HCV) genotypes in viral persistence, treatment effects, and tissue tropism is unclear. Next-generation sequencing (NGS), which is suitable for analysis of large, genetically diverse populations, offers unparalleled advantages for the study of mixed infections. The aim of the study was to determine, using two different deep sequencing strategies (pyrosequencing - 454 Life Sciences/Roche and reversible terminator sequencing-by-synthesis by Illumina), the origin of a novel HCV genotype transiently detectable during antiviral therapy (pre-existing minor population vs. de novo superinfection). Secondly, we compared 5' untranslated region (5'-UTR) variants obtained by the two NGS approaches. 5'-UTR amplification products from 9 samples collected from a genotype 1b-infected patient before, during, and after treatment (4 serum and 5 peripheral blood mononuclear cell - PBMC - samples) were subjected to next-generation sequencing. The sequencing revealed the presence of two (454/Roche) and one (Illumina) genotype 4 variants in PBMC at Week 16. None of these variants were present in either the preceding or the following samples, as revealed by both platforms. 454/Roche sequencing detected 24 different 5'-UTR variants: 8 were present in serum and PBMC, 4 only in serum and 12 only in PBMC. Illumina sequencing detected 11 different 5'-UTR variants: 5 in serum and PBMC, 4 only in serum and 2 only in PBMC. Six variants were identical for both sequencing platforms. The difference in the number of variants was primarily due to variability in two 5'-UTR homopolymeric regions. In conclusion, longitudinal analysis of HCV variants, employing two independent deep sequencing methods, suggests that the transient presence of a different genotype strain in PBMC was a result of superinfection and not of selection of a pre-existing minor variant.

  8. Sugar and organic acid content of astringent, non-astringent, and pollination variant persimmons (abstract)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Although persimmons are native (Diospyros virginiana) to the United States, commercial production consists almost exclusively of the Asian persimmon, Diospyros kaki. Cultivars within this species are classified by their astringency type; non-astringent, astringent, and pollination variant. In the U...

  9. Exome Sequencing of SLC30A2 Identifies Novel Loss- and Gain-of-Function Variants Associated with Breast Cell Dysfunction.

    PubMed

    Alam, Samina; Hennigar, Stephen R; Gallagher, Carla; Soybel, David I; Kelleher, Shannon L

    2015-12-01

    The zinc (Zn) transporter ZnT2 (SLC30A2) is expressed in specialized secretory cells including breast, pancreas and prostate, and imports Zn into mitochondria and vesicles. Mutations in SLC30A2 substantially reduce milk Zn concentration ([Zn]) and cause severe Zn deficiency in exclusively breastfed infants. Recent studies show that ZnT2-null mice have low milk [Zn], in addition to profound defects in mammary gland function during lactation. Here, we used breast milk [Zn] to identify novel non-synonymous ZnT2 variants in a population of lactating women. We also asked whether specific variants induce disturbances in intracellular Zn management or cause cellular dysfunction in mammary epithelial cells. Healthy, breastfeeding women were stratified into quartiles by milk [Zn] and exonic sequencing of SLC30A2 was performed. We found that 36% of women tested carried non-synonymous ZnT2 variants, all of whom had milk Zn levels that were distinctly above or below those in women without variants. We identified 12 novel heterozygous variants. Two variants (D(103)E and T(288)S) were identified with high frequency (9 and 16%, respectively) and expression of T(288)S was associated with a known hallmark of breast dysfunction (elevated milk sodium/potassium ratio). Select variants (A(28)D, K(66)N, Q(71)H, D(103)E, A(105)P, Q(137)H, T(288)S and T(312)K) were characterized in vitro. Compared with wild-type ZnT2, these variants were inappropriately localized, and most resulted in either 'loss-of-function' or 'gain-of-function', and altered sub-cellular Zn pools, Zn secretion, and cell cycle check-points. Our study indicates that SLC30A2 variants are common in this population, dysregulate Zn management and can lead to breast cell dysfunction. This suggests that genetic variation in ZnT2 could be an important modifier of infant growth/development and reproductive health/disease. Importantly, milk [Zn] level may serve as a bio-reporter of breast function during lactation.

  10. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  11. Isolation of a hop-sensitive variant of Lactobacillus lindneri and identification of genetic markers for beer spoilage ability of lactic acid bacteria.

    PubMed

    Suzuki, Koji; Iijima, Kazumaru; Ozaki, Kazutaka; Yamashita, Hiroshi

    2005-09-01

    We have isolated a hop-sensitive variant of the beer spoilage bacterium Lactobacillus lindneri DSM 20692. The variant lost a plasmid carrying two contiguous open reading frames (ORFs) designated horB(L) and horC(L) that encode a putative regulator and multidrug transporter presumably belonging to the resistance-nodulation-cell division superfamily. The loss of hop resistance ability occurred with the loss of resistance to other drugs, including ethidium bromide, novobiocin, and cetyltrimethylammonium bromide. PCR and Southern blot analysis using 51 beer spoilage strains of various species of lactic acid bacteria (LAB) revealed that 49 strains possessed homologs of horB and horC. No false-positive results have been observed for nonspoilage LAB or frequently encountered brewery isolates. These features are superior to those of horA and ORF 5, previously reported genetic markers for determining the beer spoilage ability of LAB. It was further shown that the combined use of horB/horC and horA is able to detect all 51 beer spoilage strains examined in this study. Furthermore, sequence comparison of horB and horC homologs identified in four different beer spoilage species indicates these homologs are 96.6 to 99.5% identical, which is not typical of distinct species. The wide and exclusive distribution of horB and horC homologs among beer spoilage LAB and their sequence identities suggest that the hop resistance ability of beer spoilage LAB has been acquired through horizontal gene transfer. These insights provide a foundation for applying trans-species genetic markers to differentiating beer spoilage LAB including previously unencountered species.

  12. Isolation of a Hop-Sensitive Variant of Lactobacillus lindneri and Identification of Genetic Markers for Beer Spoilage Ability of Lactic Acid Bacteria

    PubMed Central

    Suzuki, Koji; Iijima, Kazumaru; Ozaki, Kazutaka; Yamashita, Hiroshi

    2005-01-01

    We have isolated a hop-sensitive variant of the beer spoilage bacterium Lactobacillus lindneri DSM 20692. The variant lost a plasmid carrying two contiguous open reading frames (ORFs) designated horBL and horCL that encode a putative regulator and multidrug transporter presumably belonging to the resistance-nodulation-cell division superfamily. The loss of hop resistance ability occurred with the loss of resistance to other drugs, including ethidium bromide, novobiocin, and cetyltrimethylammonium bromide. PCR and Southern blot analysis using 51 beer spoilage strains of various species of lactic acid bacteria (LAB) revealed that 49 strains possessed homologs of horB and horC. No false-positive results have been observed for nonspoilage LAB or frequently encountered brewery isolates. These features are superior to those of horA and ORF 5, previously reported genetic markers for determining the beer spoilage ability of LAB. It was further shown that the combined use of horB/horC and horA is able to detect all 51 beer spoilage strains examined in this study. Furthermore, sequence comparison of horB and horC homologs identified in four different beer spoilage species indicates these homologs are 96.6 to 99.5% identical, which is not typical of distinct species. The wide and exclusive distribution of horB and horC homologs among beer spoilage LAB and their sequence identities suggest that the hop resistance ability of beer spoilage LAB has been acquired through horizontal gene transfer. These insights provide a foundation for applying trans-species genetic markers to differentiating beer spoilage LAB including previously unencountered species. PMID:16151091

  13. Novel variant of tickborne encephalitis virus, Russia.

    PubMed

    Ternovoi, Vladimir A; Protopopova, Elena V; Chausov, Eugene V; Novikov, Dmitry V; Leonova, Galina N; Netesov, Sergey V; Loktev, Valery B

    2007-10-01

    We isolated a novel strain of tickborne encephalitis virus (TBEV), Glubinnoe/2004, from a patient with a fatal case in Russia. We sequenced the strain, whose landmark features included 57 amino acid substitutions and 5 modified cleavage sites. Phylogenetically, Glubinnoe/2004 is a novel variant that belongs to the Eastern type of TBEV.

  14. Human retroviruses and aids, 1992. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Korber, B.; Berzofsky, J.A.; Pavlakis, G.N.; Smith, R.F.

    1992-10-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions below of the parts of the compendium, the user should read the individual introductions for each part.

  15. Amino acid sequence diversity of the major human papillomavirus capsid protein: Implications for current and next generation vaccines

    PubMed Central

    Ahmed, Amina I.; Bissett, Sara L.; Beddows, Simon

    2013-01-01

    Despite the fidelity of host cell polymerases, the human papillomavirus (HPV) displays a degree of genomic polymorphism resulting in distinct genotypes and intra-type variants. The current HPV vaccines target the most prevalent genotypes associated with cervical cancer (HPV16/18) and genital warts (HPV6/11). Although these vaccines confer some measure of cross-protection, a multivalent HPV vaccine is in the pipeline that aims to broaden vaccine protection against other cervical cancer-associated genotypes including HPV31, HPV33, HPV45, HPV52 and HPV58. Both current and next generation vaccines comprise virus-like particles, based upon the major capsid protein, L1, and vaccine-induced, type-specific protection is likely mediated by neutralizing antibodies targeting L1 surface-exposed domains. The aim of this study was to perform an in silico analysis of existing full length L1 sequences representing vaccine-relevant HPV genotypes in order to address the degree of naturally-occurring, intra-type polymorphisms. In total, 1281 sequences from the Americas, Africa, Asia and Europe were assembled. Intra-type entropy was low and/or limited to non-surface-exposed residues for HPV6, HPV11 and HPV52 suggesting a minimal effect on vaccine antibodies for these genotypes. For HPV16, intra-type entropy was high but the present analysis did not reveal any significant polymorphisms not previously identified. For HPV31, HPV33, HPV58, however, intra-type entropy was high, mostly mapped to surface-exposed domains and in some cases within known neutralizing antibody epitopes. For HPV18 and HPV45 there were too few sequences for a definitive analysis, but HPV45 displayed some degree of surface-exposed residue diversity. In most cases, the reference sequence for each genotype represented a minority variant and the consensus L1 sequences for HPV18, HPV31, HPV45 and HPV58 did not reflect the L1 sequence of the currently available HPV pseudoviruses. 
    These data highlight a number of variant
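
    The abstract above does not say how intra-type entropy was computed. One common choice is per-position Shannon entropy over an alignment, sketched below under that assumption; the function name and the toy sequences are hypothetical and are not HPV L1 data.

```python
from collections import Counter
from math import log2


def column_entropy(sequences):
    """Shannon entropy (in bits) of each column of equal-length aligned sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    entropies = []
    for column in zip(*sequences):
        counts = Counter(column)          # residue -> number of occurrences
        total = len(column)
        # H = -sum(p * log2(p)) over the residues observed at this position.
        h = -sum((n / total) * log2(n / total) for n in counts.values())
        entropies.append(h)
    return entropies


# Toy alignment: position 1 is conserved, position 2 is split evenly between K and R.
toy_alignment = ["AK", "AK", "AR", "AR"]
per_site = column_entropy(toy_alignment)
```

    A conserved position scores 0 bits and an even two-way split scores 1 bit; positions with high values would then be checked against surface-exposed domains and known epitopes, as the abstract describes.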

  16. Infection with a plasmid-free variant Chlamydia related to Chlamydia trachomatis identified by using multiple assays for nucleic acid detection.

    PubMed Central

    An, Q; Radcliffe, G; Vassallo, R; Buxton, D; O'Brien, W J; Pelletier, D A; Weisburg, W G; Klinger, J D; Olive, D M

    1992-01-01

    Clinical samples in transport media from 40 patients exhibiting pathologies potentially caused by Chlamydia trachomatis infection were analyzed for chlamydial nucleic acid, and the results were compared with those of culture. Chlamydial culture was performed by a shell vial centrifugation method with HeLa 229 host cells. Polymerase chain reaction (PCR) assays were used to detect either regions on a 7.5-kb plasmid characteristic of C. trachomatis (plasmid-PCR) or a segment of the 16S rRNA genes (rRNA-PCR). All PCR results were confirmed by hybridization with probes for the specific amplified products in either a Southern or a dot blot format. An RNase protection (RNP) assay was used to detect genus-specific chlamydial 16S rRNA directly from the clinical samples. The PCR assays detected C. trachomatis but not other bacteria, including Chlamydia spp. C. trachomatis was isolated from six samples which were positive by the rDNA-PCR and plasmid-PCR assays. Five of the culture-positive specimens were positive by the RNP assay. Twenty-two samples were negative by all criteria. Surprisingly, nine samples were positive by rRNA-PCR and RNP assays only. Nucleic acid sequencing of the rRNA-PCR-amplified products indicated a close relationship between the variants and C. trachomatis. The data may indicate an unrecognized process in C. trachomatis infection or that these patients were infected by a variant strain of C. trachomatis which lacks the C. trachomatis-specific plasmid. PMID:1280642

  17. Relationship between a Common Variant in the Fatty Acid Desaturase (FADS) Cluster and Eicosanoid Generation in Humans

    PubMed Central

    Hester, Austin G.; Murphy, Robert C.; Uhlson, Charis J.; Ivester, Priscilla; Lee, Tammy C.; Sergeant, Susan; Miller, Leslie R.; Howard, Timothy D.; Mathias, Rasika A.; Chilton, Floyd H.

    2014-01-01

    Dramatic shifts in the Western diet have led to a marked increase in the dietary intake of the n-6 polyunsaturated fatty acid (PUFA), linoleic acid (LA). Dietary LA can then be converted to arachidonic acid (ARA) in three enzymatic steps. Two of these steps are encoded by the fatty acid desaturase (FADS) cluster (chromosome 11, 11q12.2-q13), and certain genetic variants within the cluster are highly associated with ARA levels. However, no study to date has examined whether these variants further influence pro-inflammatory cyclooxygenase and lipoxygenase eicosanoid products. This study examined the impact of a highly influential FADS SNP, rs174537, on leukotriene, HETE, prostaglandin, and thromboxane biosynthesis in stimulated whole blood. Thirty subjects were genotyped at rs174537 (GG, n = 11; GT, n = 13; TT, n = 6), a panel of fatty acids from whole serum was analyzed, and precursor-to-product PUFA ratios were calculated as a marker of the capacity of tissues (particularly the liver) to synthesize long chain PUFAs. Eicosanoids produced by stimulated human blood were measured by LC-MS/MS. We observed an association between rs174537 and the ratio of ARA/LA, leukotriene B4, and 5-HETE but no effect on levels of cyclooxygenase products. Our results suggest that variation at rs174537 impacts not only the synthesis of ARA but also the overall capacity of whole blood to synthesize 5-lipoxygenase products; these genotype-related changes in eicosanoid levels could have important implications in a variety of inflammatory diseases. PMID:24962583
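
    As a worked example of the genotype counts reported above (GG = 11, GT = 13, TT = 6), the allele frequencies in this sample can be derived by simple allele counting. This is only an illustrative sketch; the function name is hypothetical and the abstract itself does not report these frequencies.

```python
def allele_frequencies(n_gg, n_gt, n_tt):
    """Count alleles from biallelic genotype counts and return their frequencies."""
    total_alleles = 2 * (n_gg + n_gt + n_tt)
    g_alleles = 2 * n_gg + n_gt  # each GG carries two G, each GT carries one
    t_alleles = 2 * n_tt + n_gt  # each TT carries two T, each GT carries one
    return {"G": g_alleles / total_alleles, "T": t_alleles / total_alleles}


# Genotype counts at rs174537 as given in the abstract.
freqs = allele_frequencies(n_gg=11, n_gt=13, n_tt=6)
print({allele: round(f, 3) for allele, f in freqs.items()})
```

    With 30 subjects there are 60 alleles in total, so the two frequencies always sum to 1.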

  18. Targeted Deep Sequencing Identifies Rare ‘loss-of-function’ Variants in IFNGR1 for Risk of Atopic Dermatitis Complicated by Eczema Herpeticum

    PubMed Central

    Gao, Li; Rafaels, Nicholas M; Huang, Lili; Potee, Joseph; Ruczinski, Ingo; Beaty, Terri H.; Paller, Amy S.; Schneider, Lynda C.; Gallo, Rich; Hanifin, Jon M.; Beck, Lisa A.; Geha, Raif S.; Mathias, Rasika A.; Leung, Donald Y. M.

    2015-01-01

    Background: A subset of atopic dermatitis (AD) is associated with increased susceptibility to eczema herpeticum (ADEH+). We previously reported that common single nucleotide polymorphisms (SNPs) in interferon-gamma (IFNG) and interferon-gamma receptor 1 (IFNGR1) were associated with the ADEH+ phenotype. Objective: To interrogate the role of rare variants in IFN-pathway genes for risk of ADEH+. Methods: We performed targeted sequencing of interferon-pathway genes (IFNG, IFNGR1, IFNAR1 and IL12RB1) in 228 European American (EA) AD patients selected according to their EH status and severity measured by the Eczema Area and Severity Index (EASI). Replication genotyping was performed in independent samples of 219 EA and 333 African Americans (AA). Functional investigation of ‘loss-of-function’ variants was conducted using site-directed mutagenesis. Results: We identified 494 single nucleotide variants (SNVs) encompassing 105kb of sequence, including 145 common, 349 (70.6%) rare (minor allele frequency (MAF) <5%) and 86 (17.4%) novel variants, of which 2.8% were coding-synonymous, 93.3% were non-coding (64.6% intronic), and 3.8% were missense. We identified six rare IFNGR1 missense variants, including three damaging variants (Val14Met (V14M), Val61Ile and Tyr397Cys (Y397C)), conferring a higher risk for ADEH+ (P=0.031). Variants V14M and Y397C were confirmed to be deleterious, leading to partial IFNGR1 deficiency. Seven common IFNGR1 SNPs, along with common protective haplotypes (2 to 7-SNPs), conferred a reduced risk of ADEH+ (P=0.015-0.002, P=0.0015-0.0004, respectively), and both SNP and haplotype associations were replicated in an independent AA sample (P=0.004-0.0001 and P=0.001-0.0001, respectively). Conclusion: Our results provide evidence that both common and rare genetic variants in the gene encoding IFNGR1 are implicated in susceptibility to the ADEH+ phenotype. CAPSULE SUMMARY: We provided the first evidence that rare functional IFNGR1 mutations contribute to a defective systemic IFN-γ immune response that accounts
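
    The variant tallies in the Results above can be cross-checked with a short sketch. The counts (494, 145, 349, 86) and the 5% MAF cutoff come from the abstract; the helper name `classify_by_maf` and the example MAF values are hypothetical.

```python
TOTAL_SNVS = 494
COUNTS = {"common": 145, "rare": 349, "novel": 86}
RARE_MAF_CUTOFF = 0.05  # the abstract defines rare as MAF < 5%


def classify_by_maf(maf, cutoff=RARE_MAF_CUTOFF):
    """Label a variant 'rare' or 'common' from its minor allele frequency."""
    return "rare" if maf < cutoff else "common"


# Share of each category among all SNVs, rounded to one decimal place.
percent = {name: round(100 * n / TOTAL_SNVS, 1) for name, n in COUNTS.items()}
print(percent)

# Hypothetical MAF values, used only to exercise the classifier.
print(classify_by_maf(0.01), classify_by_maf(0.20))
```

    The common and rare counts partition the full set (145 + 349 = 494), while novel variants are a separate, overlapping category.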

  19. Completion of the amino acid sequence of the alpha 1 chain from type I calf skin collagen. Amino acid sequence of alpha 1(I)B8.

    PubMed Central

    Glanville, R W; Breitkreutz, D; Meitinger, M; Fietzek, P P

    1983-01-01

    The complete amino acid sequence of the 279-residue CNBr peptide CB8 from the alpha 1 chain of type I calf skin collagen is presented. It was determined by sequencing overlapping fragments of CB8 produced by Staphylococcus aureus V8 proteinase, trypsin, Endoproteinase Arg-C and hydroxylamine. Tryptic cleavages were also made specific for lysine by blocking arginine residues with cyclohexane-1,2-dione. This completes the amino acid sequence analysis of the 1054-residue-long alpha 1(I) chain of calf skin collagen. PMID:6354180

  20. Causal variants screened by whole exome sequencing in a patient with maternal uniparental isodisomy of chromosome 10 and a complicated phenotype

    PubMed Central

    LI, NIU; DING, YU; YU, TINGTING; LI, JUAN; SHEN, YONGNIAN; WANG, XIUMIN; FU, QIHUA; SHEN, YIPING; HUANG, XIAODONG; WANG, JIAN

    2016-01-01

    Uniparental disomy (UPD), which is the abnormal situation in which both copies of a chromosomal pair have been inherited from one parent, may cause clinical abnormalities by affecting genomic imprinting or causing autosomal recessive variation. Whole Exome Sequencing (WES) and chromosomal microarray analysis (CMA) are powerful technologies used to search for underlying causal variants. In the present study, WES was used to screen for candidate causal variants in the genome of a Chinese pediatric patient, who had been shown by CMA to have maternal uniparental isodisomy of chromosome 10. This was associated with numerous severe medical problems, including bilateral deafness, binocular blindness, stunted growth and leukoderma. A total of 13 rare homozygous variants of these genes were identified on chromosome 10. These included a classical splice variant in the HPS1 gene (c.398+5G>A), which causes Hermansky-Pudlak syndrome type 1 and may explain the patient's ocular and dermal disorders. In addition, six likely pathogenic genes on other chromosomes were found to be associated with the subject's ocular and aural disorders by phenotypic analysis. The results of the present study demonstrated that WES and CMA may be successfully combined in order to identify candidate causal genes. Furthermore, a connection between phenotype and genotype was established in this patient. PMID:27284308

  1. Whole-Exome Sequencing Identifies Loci Associated with Blood Cell Traits and Reveals a Role for Alternative GFI1B Splice Variants in Human Hematopoiesis.

    PubMed

    Polfus, Linda M; Khajuria, Rajiv K; Schick, Ursula M; Pankratz, Nathan; Pazoki, Raha; Brody, Jennifer A; Chen, Ming-Huei; Auer, Paul L; Floyd, James S; Huang, Jie; Lange, Leslie; van Rooij, Frank J A; Gibbs, Richard A; Metcalf, Ginger; Muzny, Donna; Veeraraghavan, Narayanan; Walter, Klaudia; Chen, Lu; Yanek, Lisa; Becker, Lewis C; Peloso, Gina M; Wakabayashi, Aoi; Kals, Mart; Metspalu, Andres; Esko, Tõnu; Fox, Keolu; Wallace, Robert; Franceshini, Nora; Matijevic, Nena; Rice, Kenneth M; Bartz, Traci M; Lyytikäinen, Leo-Pekka; Kähönen, Mika; Lehtimäki, Terho; Raitakari, Olli T; Li-Gao, Ruifang; Mook-Kanamori, Dennis O; Lettre, Guillaume; van Duijn, Cornelia M; Franco, Oscar H; Rich, Stephen S; Rivadeneira, Fernando; Hofman, Albert; Uitterlinden, André G; Wilson, James G; Psaty, Bruce M; Soranzo, Nicole; Dehghan, Abbas; Boerwinkle, Eric; Zhang, Xiaoling; Johnson, Andrew D; O'Donnell, Christopher J; Johnsen, Jill M; Reiner, Alexander P; Ganesh, Santhi K; Sankaran, Vijay G

    2016-08-04

    Circulating blood cell counts and indices are important indicators of hematopoietic function and a number of clinical parameters, such as blood oxygen-carrying capacity, inflammation, and hemostasis. By performing whole-exome sequence association analyses of hematologic quantitative traits in 15,459 community-dwelling individuals, followed by in silico replication in up to 52,024 independent samples, we identified two previously undescribed coding variants associated with lower platelet count: a common missense variant in CPS1 (rs1047891, MAF = 0.33, discovery + replication p = 6.38 × 10^-10) and a rare synonymous variant in GFI1B (rs150813342, MAF = 0.009, discovery + replication p = 1.79 × 10^-27). By performing CRISPR/Cas9 genome editing in hematopoietic cell lines and follow-up targeted knockdown experiments in primary human hematopoietic stem and progenitor cells, we demonstrate an alternative splicing mechanism by which the GFI1B rs150813342 variant suppresses formation of a GFI1B isoform that preferentially promotes megakaryocyte differentiation and platelet production. These results demonstrate how unbiased studies of natural variation in blood cell traits can provide insight into the regulation of human hematopoiesis.

  2. A sequence dimorphism in a conserved domain of human 28S rRNA. Uneven distribution of variant genes among individuals. Differential expression in HeLa cells.

    PubMed Central

    Qu, L H; Nicoloso, M; Bachellerie, J P

    1991-01-01

    In humans, cellular 28S rRNA displays a sequence dimorphism within an evolutionarily conserved motif, with the presence, at position +60, of either an A (like the metazoan consensus) or a G. The relative abundance of the two forms of variant genes in the genome exhibits large differences among individuals. The two variant forms are generally represented in cellular 28S rRNA in proportion to their relative abundance in the genome, at least for leucocytes. However, in some cases, one form of variant may be markedly underexpressed as compared to the other. Thus, in HeLa cells, A-form genes contribute to only 1% of the cellular content in mature 28S rRNA although amounting to 15% of the ribosomal genes. The differential expression seems to result from different transcriptional activities rather than from differences in pre-rRNA processing efficiency or in stabilities of mature rRNAs. G-form ribosomal genes were not detected in other mammals, including chimpanzee, which suggests that the fixation of this variant type is a rather recent event in primate evolution. PMID:2020541
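
    The degree of underexpression in HeLa cells can be made concrete with a small calculation. The 1% and 15% figures are from the abstract; the sketch assumes that, because the site is dimorphic, the G form accounts for the remaining 99% of mature 28S rRNA and 85% of the ribosomal genes. The function name is hypothetical.

```python
def output_per_gene(rrna_share, gene_share):
    """Relative rRNA contribution per gene copy: share of rRNA divided by share of genes."""
    return rrna_share / gene_share


# A-form values from the abstract; G-form values are the assumed complements.
a_form = output_per_gene(rrna_share=0.01, gene_share=0.15)
g_form = output_per_gene(rrna_share=0.99, gene_share=0.85)

# How many times more rRNA a G-form gene contributes than an A-form gene.
fold_difference = g_form / a_form
print(round(a_form, 3), round(g_form, 3), round(fold_difference, 1))
```

    A value of 1.0 per gene would mean expression in proportion to gene abundance, which the abstract reports as the general case in leucocytes.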

  3. Partial VP2 sequencing of canine parvovirus (CPV) strains circulating in the state of Rio de Janeiro, Brazil: detection of the new variant CPV-2c

    PubMed Central

    Castro, T.X.; Costa, E.M; Leite, J.P.G.; Labarthe, N.V.; Cubel Garcia, R.C.N.

    2010-01-01

    Canine parvovirus (CPV) is the most important enteric virus for dogs and it seems to be undergoing continuous evolution, generating new genetic and antigenic variants throughout the world. The aim of this study was to analyze the distribution of CPV variants from 1995 to 2009 and to investigate the circulation of the new variant CPV-2c in Rio de Janeiro, Brazil. In addition, the clinical features of CPV infection were also reported. After laboratory confirmation of CPV by HA/HI and PCR, thirty-two fecal samples were analyzed by sequencing a 583-bp fragment of the VP2 gene. One sample, collected in 2008, was typed as the new type CPV-2c. All samples from 1995 to 2003 were identified as “new CPV-2a”. From 2004 to 2006, both “new CPV-2a” and CPV-2b were observed. From 2006 to 2009, most of the samples were characterized as CPV-2b. The classical signs of CPV enteritis were observed in 16/18 CPV-2a and 5/13 CPV-2b infected puppies. These results show that continuous epidemiological surveillance of CPV strain distribution is essential for studying the patterns of CPV-2a and 2b spread and for determining whether the new variant CPV-2c has become permanently established in the Brazilian canine population. PMID:24031592

  4. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  5. ICO amplicon NGS data analysis: a Web tool for variant detection in common high-risk hereditary cancer genes analyzed by amplicon GS Junior next-generation sequencing.

    PubMed

    Lopez-Doriga, Adriana; Feliubadaló, Lídia; Menéndez, Mireia; Lopez-Doriga, Sergio; Morón-Duran, Francisco D; del Valle, Jesús; Tornero, Eva; Montes, Eva; Cuesta, Raquel; Campos, Olga; Gómez, Carolina; Pineda, Marta; González, Sara; Moreno, Victor; Capellá, Gabriel; Lázaro, Conxi

    2014-03-01

    Next-generation sequencing (NGS) has revolutionized genomic research and is set to have a major impact on genetic diagnostics thanks to the advent of benchtop sequencers and flexible kits for targeted libraries. Among the main hurdles in NGS are the difficulty of performing bioinformatic analysis of the huge volume of data generated and the high number of false positive calls that could be obtained, depending on the NGS technology and the analysis pipeline. Here, we present the development of a free and user-friendly Web data analysis tool that detects and filters sequence variants, provides coverage information, and allows the user to customize some basic parameters. The tool has been developed to provide accurate genetic analysis of targeted sequencing of common high-risk hereditary cancer genes using amplicon libraries run in a GS Junior System. The Web resource is linked to our own mutation database, to assist in the clinical classification of identified variants. We believe that this tool will greatly facilitate the use of the NGS approach in routine laboratories.
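
    The kind of user-adjustable filtering the abstract describes can be sketched as follows. The abstract does not list the tool's actual parameters, so the record fields, threshold names, and default values here are all assumptions chosen for illustration, not the behavior of the published Web tool.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    position: int
    ref: str
    alt: str
    depth: int      # total reads covering the position (hypothetical field)
    alt_reads: int  # reads supporting the alternate allele (hypothetical field)


def passes_filters(call, min_depth=30, min_alt_fraction=0.2):
    """Keep a call only if coverage and alternate-allele fraction meet the thresholds."""
    if call.depth < min_depth:
        return False
    return call.alt_reads / call.depth >= min_alt_fraction


# Made-up calls: one well supported, one with low coverage, one with a low allele fraction.
calls = [
    VariantCall(101, "A", "G", depth=120, alt_reads=58),
    VariantCall(202, "C", "T", depth=12, alt_reads=6),
    VariantCall(303, "G", "A", depth=200, alt_reads=8),
]
kept = [c for c in calls if passes_filters(c)]
print([c.position for c in kept])
```

    Raising either threshold trades sensitivity for fewer false positive calls, which is the balance the abstract identifies as a main hurdle in NGS analysis.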

  6. Use of heteroduplex mobility assays (HMA) for pre-sequencing screening and identification of variant strains of swine and avian hepatitis E viruses.

    PubMed

    Sun, Z F; Huang, F F; Halbur, P G; Schommer, S K; Pierson, F W; Toth, T E; Meng, X J

    2003-10-17

    Hepatitis E virus (HEV), the causative agent of human hepatitis E, is an important public health problem in many developing countries and is also endemic in many industrialized countries including the US. The discoveries of avian and swine HEVs by our group from chickens and pigs, respectively, suggest that hepatitis E may be a zoonosis. Current methods for molecular epidemiological studies of HEV require PCR amplification of field strains of HEV followed by DNA sequencing and sequence analyses, which are laborious and expensive. As novel or variant strains of HEV continue to evolve rapidly both in humans and other animals, it is important to develop a rapid pre-sequencing screening method to select field isolates for further molecular characterization. In this study, we developed two heteroduplex mobility assays (HMA) (one for swine HEV based on the ORF2 region, and the other for avian HEV based on the ORF1 region) to genetically differentiate field strains of avian and swine HEVs from known reference strains. The ORF2 regions of 22 swine HEV isolates and the ORF1 regions of 13 avian HEV isolates were amplified by PCR, sequenced and analyzed by HMA against reference prototype swine HEV strain and reference prototype avian HEV strain, respectively. We showed that, in general, the HMA profiles correlate well with nucleotide sequence identities and with phylogenetic clustering between field strains and the reference swine HEV or avian HEV strains. Field isolates with similar HMA patterns generally showed similar sequence identities with the reference strains and clustered together in the phylogenetic trees. Therefore, by using different HEV isolates as references, the HMA developed in this study can be used as a pre-sequencing screening tool to identify variant HEV isolates for further molecular epidemiological studies.

  7. Identification of a Novel De Novo Variant in the PAX3 Gene in Waardenburg Syndrome by Diagnostic Exome Sequencing: The First Molecular Diagnosis in Korea

    PubMed Central

    Jang, Mi-Ae; Lee, Taeheon; Lee, Junnam

    2015-01-01

    Waardenburg syndrome (WS) is a clinically and genetically heterogeneous hereditary auditory pigmentary disorder characterized by congenital sensorineural hearing loss and iris discoloration. Many genes have been linked to WS, including PAX3, MITF, SNAI2, EDNRB, EDN3, and SOX10, and many additional genes have been associated with disorders with phenotypic overlap with WS. To screen all possible genes associated with WS and congenital deafness simultaneously, we performed diagnostic exome sequencing (DES) in a male patient with clinical features consistent with WS. Using DES, we identified a novel missense variant (c.220C>G; p.Arg74Gly) in exon 2 of the PAX3 gene in the patient. Further analysis by Sanger sequencing of the patient and his parents revealed a de novo occurrence of the variant. Our findings show that DES can be a useful tool for the identification of pathogenic gene variants in WS patients and for differentiation between WS and similar disorders. To the best of our knowledge, this is the first report of genetically confirmed WS in Korea. PMID:25932447
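
    The reported variant (c.220C>G; p.Arg74Gly) can be checked for internal consistency: a coding position maps to a codon number and a position within that codon, and the base change should be compatible with the stated amino acid change. The sketch below assumes position 1 is the A of the start codon; the function name is hypothetical and the PAX3 reference sequence itself is not consulted.

```python
def codon_of(coding_pos):
    """Map a 1-based coding position to (codon number, position within codon 1..3)."""
    return (coding_pos - 1) // 3 + 1, (coding_pos - 1) % 3 + 1


codon_number, base_in_codon = codon_of(220)
print(codon_number, base_in_codon)

# Arginine codons that begin with C, and all glycine codons (standard genetic code).
ARG_CODONS_STARTING_WITH_C = ["CGT", "CGC", "CGA", "CGG"]
GLY_CODONS = {"GGT", "GGC", "GGA", "GGG"}

# Apply C>G at the first base of each candidate arginine codon.
mutated = ["G" + codon[1:] for codon in ARG_CODONS_STARTING_WITH_C]
all_become_gly = all(codon in GLY_CODONS for codon in mutated)
print(mutated, all_become_gly)
```

    Position 220 falls on the first base of codon 74, and a C>G change there turns any CGN arginine codon into a GGN glycine codon, in agreement with p.Arg74Gly.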

  8. Exome Sequencing Identifies a Missense Variant in EFEMP1 Co-Segregating in a Family with Autosomal Dominant Primary Open-Angle Glaucoma

    PubMed Central

    Mackay, Donna S.; Bennett, Thomas M.; Shiels, Alan

    2015-01-01

    Primary open-angle glaucoma (POAG) is a clinically important and genetically heterogeneous cause of progressive vision loss as a result of retinal ganglion cell death. Here we have utilized trio-based, whole-exome sequencing to identify the genetic defect underlying an autosomal dominant form of adult-onset POAG segregating in an African-American family. Exome sequencing identified a novel missense variant (c.418C>T, p.Arg140Trp) in exon-5 of the gene coding for epidermal growth factor (EGF) containing fibulin-like extracellular matrix protein 1 (EFEMP1) that co-segregated with disease in the family. Linkage and haplotype analyses with microsatellite markers indicated that the disease interval overlapped a known POAG locus (GLC1H) on chromosome 2p. The p.Arg140Trp substitution was predicted in silico to have damaging effects on protein function and transient expression studies in cultured cells revealed that the Trp140-mutant protein exhibited increased intracellular accumulation compared with wild-type EFEMP1. In situ hybridization of the mouse eye with oligonucleotide probes detected the highest levels of EFEMP1 transcripts in the ciliary body, cornea, inner nuclear layer of the retina, and the optic nerve head. The recent finding that a common variant near EFEMP1 was associated with optic nerve-head morphology supports the possibility that the EFEMP1 variant identified in this POAG family may be pathogenic. PMID:26162006

  9. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor.

  10. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  11. Identification of two novel HLA-A*02 variants, A*02:319 and A*02:01:64, in two Taiwanese marrow stem cell donors by sequence-based typing.

    PubMed

    Yang, K L; Lee, S K; Yang, S Y; Kao, R H; Lin, C L; Lin, P Y

    2012-06-01

    We report here two novel variants of HLA-A*02 allele, A*02:319 and A*02:01:64, discovered in two Taiwanese unrelated volunteer bone marrow donors by sequence-based typing (SBT) method. The DNA sequence of A*02:319 is identical to A*02:07 in exons 2 and 3 but varies with one nucleotide at codon 9 (TTC->TCC). The variation caused one amino acid substitution at residue 9 (F->S). On the other hand, the DNA sequence of A*02:01:64 is identical to the sequence of A*02:01:01:01 in exons 2 and 3 except a silent mutation at codon 114 (CAC->CAT). The probable HLA-A, HLA-B and HLA-DRB1 haplotypes in association with A*02:319 and A*02:01:64 were deduced as A*02:319-B*46:01-DRB1*04 and A*02:01:64-B*38:02-DRB1*16:02, respectively.
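
    The two codon changes reported above can be verified against the standard genetic code. The sketch below builds the standard codon table and classifies each change; the function name is hypothetical, and only the codons quoted in the abstract are used.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(ref_codon, alt_codon):
    """Return 'synonymous' or 'missense X->Y' for a single codon substitution."""
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    return f"missense {ref_aa}->{alt_aa}"


# Codon 9 in A*02:319 and codon 114 in A*02:01:64, as quoted in the abstract.
codon_9 = classify_codon_change("TTC", "TCC")
codon_114 = classify_codon_change("CAC", "CAT")
print(codon_9, codon_114)
```

    This matches the allele names: the amino acid change earns a new second-field number (A*02:319), whereas the silent change is recorded only in the third field (A*02:01:64).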

  12. Characterization of acidic and basic variants of IgG1 therapeutic monoclonal antibodies based on non-denaturing IEF fractionation.

    PubMed

    Dada, Oluwatosin O; Jaya, Nomalie; Valliere-Douglass, John; Salas-Solano, Oscar

    2015-08-20

    Characterization of both the acidic and basic regions of imaged capillary isoelectric focusing (icIEF) profile of an IgG1 antibody was achieved through preparative immobilized pH gradient isoelectric focusing (IPG-IEF) fractionation. Recent attempts at using this method to fractionate charge variants of monoclonal antibodies (mAbs) have shown promising results, but identification of the chemical modifications in the variants was limited to the basic species. We have optimized the method to achieve enrichment of each variant across the icIEF profile of an IgG1 mAb. The fractionation was followed by extended characterization to elucidate the composition of the acidic, main, and basic species observed in the icIEF profile. Deamidation, sialylation, glycation, and fragmentation were identified as the main modifications contributing to acidic variants of the mAb while C-terminal lysine, C-terminal proline amidation, and uncyclized N-terminal glutamine were the major species contributing to the basic variants. This characterization allows a better understanding of the modifications that contribute to the charge variants observed by icIEF, facilitating the evaluation of impacts on product safety and efficacy.

  13. Differences in Transcriptional Activation by the Two Allelic (L162V Polymorphic) Variants of PPARα after Omega-3 Fatty Acids Treatment

    PubMed Central

    Rudkowska, Iwona; Verreault, Mélanie; Barbier, Olivier; Vohl, Marie-Claude

    2009-01-01

    Omega-3 fatty acids (FAs) have the potential to regulate gene expression via the peroxisome proliferator-activated receptor α (PPARα); therefore, genetic variations in this gene may impact its transcriptional activity on target genes. It is hypothesized that the transcriptional activity by wild-type L162-PPARα is enhanced to a greater extent than the mutated variant (V162-PPARα) in the presence of eicosapentaenoic acid (EPA), docosahexaenoic acid (DHA) or a mixture of EPA:DHA. To examine the functional difference of the two allelic variants on receptor activity, transient co-transfections were performed in human hepatoma HepG2 cells activated with EPA, DHA and EPA:DHA mixtures. Results indicate that the addition of EPA or DHA has the potential to increase the transcriptional activity by PPARα with respect to basal level in both variants. Yet, the EPA:DHA mixtures enhanced the transcriptional activity to a greater extent than individual FAs, indicating possible additive effects of EPA and DHA. Additionally, the V162 allelic form of PPARα demonstrated consistently lower transcriptional activation when incubated with EPA, DHA or EPA:DHA mixtures than the wild-type variant. In conclusion, both allelic variants of the PPARα L162V are activated by omega-3 FAs; however, the V162 allelic form displays a lower transcriptional activity than the wild-type variant. PMID:19266045

  14. A Robust and Powerful Set-Valued Approach to Rare Variant Association Analyses of Secondary Traits in Case-Control Sequencing Studies.

    PubMed

    Kang, Guolian; Bi, Wenjian; Zhang, Hang; Pounds, Stanley; Cheng, Cheng; Shete, Sanjay; Zou, Fei; Zhao, Yanlong; Zhang, Ji-Feng; Yue, Weihua

    2017-03-01

    In many case-control designs of genome-wide association (GWAS) or next generation sequencing (NGS) studies, extensive data on secondary traits that may correlate and share the common genetic variants with the primary disease are available. Investigating these secondary traits can provide critical insights into the disease etiology or pathology, and enhance the GWAS or NGS results. Methods based on logistic regression (LG) were developed for this purpose. However, for the identification of rare variants (RVs), certain inadequacies in the LG models and algorithmic instability can cause severely inflated type I error, and significant loss of power, when the two traits are correlated and the RV is associated with the disease, especially at stringent significance levels. To address this issue, we propose a novel set-valued (SV) method that models a binary trait by dichotomization of an underlying continuous variable, and incorporate this into the genetic association model as a critical component. Extensive simulations and an analysis of seven secondary traits in a GWAS of benign ethnic neutropenia show that the SV method consistently controls type I error well at stringent significance levels, has larger power than the LG-based methods, and is robust in performance to effect pattern of the genetic variant (risk or protective), rare or common variants, rare or common diseases, and trait distributions. Because of the SV method's striking and profound advantage, we strongly recommend the SV method be employed instead of the LG-based methods for secondary traits analyses in case-control sequencing studies.

  15. Side-to-side range of movement variability in variants of the median and radial neurodynamic test sequences in asymptomatic people.

    PubMed

    Stalioraitis, Vaidas; Robinson, Kim; Hall, Toby

    2014-08-01

    Side-to-side discrepancy in range of motion (ROM) during upper limb neurodynamic testing is used in part to identify abnormal peripheral nerve mechanosensitivity and is one of three factors to consider in determining a positive test. Large side-to-side variability is reported for some variants of the upper limb neurodynamic test sequences; however, discrepancies for other test variants are unknown. Hence the purpose of this study was to evaluate side-to-side discrepancy in elbow flexion ROM during two variants of upper limb neurodynamic test sequence for the median and radial nerves. 51 asymptomatic subjects (26 females, mean age 29.69 years) were evaluated. A uniaxial electrogoniometer was used to measure elbow flexion ROM at onset of resistance (R1) and onset of discomfort (P1) during the median and radial neurodynamic tests on each side. Reliability was determined by testing 20 subjects twice and was found to be good (ICC greater than 0.88 and SEM less than 4.02°). There was no significant difference in mean ROM between sides. Lower-bound scores indicate that intra-individual, inter-limb differences of more than 15° for the median nerve and 11° for the radial nerve exceed the range of normal ROM asymmetry on neurodynamic testing at R1 and P1. Correlation of ROM between limbs was significant, with R^2 values of 0.62 and 0.85 for the median and radial nerves respectively. These findings provide clinicians with information regarding normal side-to-side variability in ROM during two commonly used variants of neurodynamic tests.

  16. Phenotype characterization and sequence analysis of BMP2 and BMP4 variants in two Mexican families with oligodontia.

    PubMed

    Mu, Y; Xu, Z; Contreras, C I; McDaniel, J S; Donly, K J; Chen, S

    2012-11-28

    Both BMP2 and BMP4 are involved in tooth development. We examined phenotypes and BMP2 and BMP4 gene variations in two Mexican oligodontia families. Physical and oral examinations and panoramic radiographs were performed on affected and unaffected members in these two families. The affected members lacked six or more teeth. DNA sequencing was performed to detect BMP2 and BMP4 gene variations. Three single nucleotide polymorphisms (SNPs) in BMP2 and BMP4 genes were identified in the two families, including one synonymous and two missense SNPs: BMP2 c261A>G, pS87S, BMP2 c570A>T, pR190S, and BMP4 c455T>C, pV152A. Among the six affected patients, 67% carried "GG" or "AG" genotype in BMP2 c261A>G and four were "TT" or "AT" genotype in BMP2 c570A>T (pR190S). Polymorphism of BMP4 c455T>C resulted in amino acid changes of Val/Ala (pV152A). BMP2 c261A>G and BMP4 c455T>C affect mRNA stability. This was the first time that BMP2 and BMP4 SNPs were observed in Mexican oligodontia families.

  17. The amino-acid sequence of the 2S sulphur-rich proteins from seeds of Brazil nut (Bertholletia excelsa H.B.K.).

    PubMed

    Ampe, C; Van Damme, J; de Castro, L A; Sampaio, M J; Van Montagu, M; Vandekerckhove, J

    1986-09-15

    Storage proteins of the albumin solubility fraction from seeds of Bertholletia excelsa H.B.K. were separated by reversed-phase high-performance liquid chromatography and their primary structures were determined by gas-phase sequencing on intact polypeptides and on the overlapping tryptic and thermolysin peptides. The 2S storage proteins consist of two subunits linked by disulphide bridges. The large subunit (8.5 kDa) is expressed in at least six different isoforms while the small subunit (3.6 kDa) consists of only one form. These proteins are extremely rich in glutamine, glutamic acid, arginine and the sulphur-containing amino acids cysteine and methionine. One of the variants even contains a sequence of six methionine residues in a row. Comparison with known sequences of 2S proteins of other dicotyledonous plants shows limited but distinct sequence homology. In particular, the positions of the cysteine residues relative to each other appear to be completely conserved, suggesting that tertiary structure constraints imposed by disulphide bridges dominate sequence conservation. It has been proposed that the two subunits of a related protein (the Brassica napus storage protein) are cleaved from a precursor polypeptide [Crouch, M. L., Tenbarge, K. M., Simon, A. E. & Ferl, R. (1983) J. Mol. Appl. Genet. 2, 273-283]. The amino acid sequence homology of the Brazil nut protein with the former suggests that a similar protein processing event could occur.

  18. Fractionation of the genetic variants of human alpha 1-acid glycoprotein in the native form by chromatography on an immobilized copper(II) affinity adsorbent. Heterogeneity of the separate variants by isoelectrofocusing and by concanavalin A affinity chromatography.

    PubMed

    Hervé, F; Gomas, E; Duché, J C; Tillement, J P

    1993-05-19

    Fractionation of the three main genetic variants (F1, S and A) of human alpha 1-acid glycoprotein (AAG), in their native (sialylated) form, by chromatography on immobilized copper(II) affinity adsorbent was investigated. This chromatographic method had been previously developed to fractionate the desialylated protein variants. For that purpose, the three main AAG phenotypes samples (F1S/A, F1/A and S/A), which had been previously isolated from individual human plasma samples, and an AAG sample from commercial source (a mixture of the phenotypes) were used in the native form. Affinity chromatography of these different samples on an iminodiacetate Sepharose-copper(II) gel at pH 7 resolved two protein peaks, irrespective of the origin of the native AAG sample used. The unbound peak 1 was found to consist of the F1, the S or both variants, depending on the phenotype of the AAG sample used in the chromatography. The bound peak 2 was found to consist of the A variant in a pure form. The fractionation results obtained with native AAG were found to be the same as those originally yielded by the desialylated protein. However, comparison of the interactions of native and desialylated AAG with immobilized copper(II) ions, using an affinity chromatographic method and a non-chromatographic equilibrium binding technique, respectively, showed that desialylation increased the non-specific interactions of the protein with immobilized copper(II) ions. The AAG variants were not fractionated when affinity chromatography was performed using immobilized zinc, nickel or cobalt(II) ions, instead of copper. After purification of each variant in the sialylated form (F1, S and A), their respective heterogeneity was studied by analytical isoelectrofocusing with carrier ampholytes in the pH range 2.5-4.5. In addition, the lectin-binding behaviour of the separate sialylated AAG variants was investigated by affinity chromatography on immobilized concanavalin A.

  19. Fatty acid desaturase gene variants, cardiovascular risk factors, and myocardial infarction in the Costa Rica Study

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genetic variation in fatty acid desaturases (FADS) has previously been linked to long-chain polyunsaturated fatty acids (PUFAs) in adipose tissue and cardiovascular risk. The goal of our study was to test associations between six common FADS polymorphisms (rs174556, rs3834458, rs174570, rs2524299, r...

  20. Lipoprotein lipase variants interact with polyunsaturated fatty acids to modulate obesity traits in Puerto Ricans

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Lipoprotein lipase (LPL) is a candidate gene for obesity based on its role in triglyceride hydrolysis and the partitioning of fatty acids towards storage or oxidation. Whether dietary fatty acids modify LPL associated obesity risk is unknown. We examined five single nucleotide polymorphisms (SNPs) (...

  1. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed by HPLC using on-line HPLC-tandem mass spectrometry and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the eluted EDTA UFF30 fraction; bioinformatic analysis using Protein Link Global Server charged with T. vaginalis protein database suggests that eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor.

  2. Variant discovery and breakpoint region prediction for studying the human 22q11.2 deletion using BAC clone and whole genome sequencing analysis.

    PubMed

    Guo, Xingyi; Delio, Maria; Haque, Nousin; Castellanos, Raquel; Hestand, Matthew S; Vermeesch, Joris R; Morrow, Bernice E; Zheng, Deyou

    2016-09-01

    Velo-cardio-facial syndrome/DiGeorge syndrome/22q11.2 deletion syndrome (22q11.2DS) is caused by meiotic non-allelic homologous recombination events between flanking low copy repeats termed LCR22A and LCR22D, resulting in a 3 million base pair (Mb) deletion. Due to their complex structure, large size and high sequence identity, genetic variation within LCR22s among different individuals has not been well characterized. In this study, we sequenced 13 BAC clones derived from LCR22A/D and aligned them with 15 previously available BAC sequences to create a new genetic variation map. The thousands of variants identified by this analysis were not uniformly distributed in the two LCR22s. Moreover, shared single nucleotide variants between LCR22A and LCR22D were enriched in the Breakpoint Cluster Region pseudogene (BCRP) block, suggesting the existence of a possible recombination hotspot there. Interestingly, breakpoints for atypical 22q11.2 rearrangements have previously been located to BCRPs. To further explore this finding, we carried out in-depth analyses of whole genome sequence (WGS) data from two unrelated probands harbouring a de novo 3Mb 22q11.2 deletion and their normal parents. By focusing primarily on WGS reads uniquely mapped to LCR22A, using the variation map from our BAC analysis to help resolve allele ambiguity, and by performing PCR analysis, we infer that the deletion breakpoints were most likely located near or within the BCRP module. In summary, we found a high degree of sequence variation in LCR22A and LCR22D and a potential recombination breakpoint near or within the BCRP block, providing a starting point for future breakpoint mapping using additional trios.

  3. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    SciTech Connect

    Abraham, Paul E; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert {Bob} L

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphism in Populus and potentially other species as well.
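
    Methionine oxidation can pass for an alanine-to-serine substitution because both add the mass of one oxygen atom to the peptide. The sketch below checks this with monoisotopic masses rounded to five decimals. The constants are standard reference values rather than figures from the abstract, and the helper name `substitution_shift` is hypothetical.

```python
# Monoisotopic residue masses in daltons (rounded standard reference values).
RESIDUE_MASS = {"Ala": 71.03711, "Ser": 87.03203}

# Mass added by one oxidation event (a single oxygen atom), in daltons.
OXYGEN_MASS = 15.99491


def substitution_shift(ref: str, alt: str) -> float:
    """Mass change in daltons caused by replacing residue ref with alt."""
    return RESIDUE_MASS[alt] - RESIDUE_MASS[ref]


ala_to_ser = substitution_shift("Ala", "Ser")
difference = abs(ala_to_ser - OXYGEN_MASS)

print(f"Ala->Ser shift:      {ala_to_ser:.5f} Da")
print(f"Met oxidation shift: {OXYGEN_MASS:.5f} Da")
print(f"difference:          {difference:.5f} Da")
```

    Because the two shifts are essentially identical in mass, precursor mass alone cannot separate them. Only fragment ions that pin the shift to a specific residue can, which is consistent with the abstract's emphasis on site-determining ions.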

  4. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.
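
    The logic of the two immobilization steps can be read as two successive filters: only molecules that survive both captures carry both sequence types. The sketch below is a toy model, not the patented procedure. The `Fragment` class, the `capture` function, and the sample molecules are all hypothetical names and data chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    sequence_types: frozenset  # labels of the sequence types on this molecule


def capture(fragments, probe_target):
    """Keep only the fragments that a probe for probe_target would bind."""
    return [f for f in fragments if probe_target in f.sequence_types]


# Hypothetical sample: two normal molecules and one carrying both types.
sample = [
    Fragment("normal_chr1", frozenset({"chr1"})),
    Fragment("normal_chr2", frozenset({"chr2"})),
    Fragment("translocation", frozenset({"chr1", "chr2"})),
]

# Step 1: the first immobilized probe isolates molecules with the first type.
first_set = capture(sample, "chr1")
# Step 2: the second probe is applied only to the molecules from step 1.
second_set = capture(first_set, "chr2")

aberrant = [f.name for f in second_set]
print(aberrant)
```

    Running the second capture on the output of the first, rather than on the whole sample, is what keeps molecules carrying only the second sequence type out of the final set.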

  5. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  6. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico.

    PubMed

    Williams, Amy L; Jacobs, Suzanne B R; Moreno-Macías, Hortensia; Huerta-Chagoya, Alicia; Churchhouse, Claire; Márquez-Luna, Carla; García-Ortíz, Humberto; Gómez-Vázquez, María José; Burtt, Noël P; Aguilar-Salinas, Carlos A; González-Villalpando, Clicerio; Florez, Jose C; Orozco, Lorena; Haiman, Christopher A; Tusié-Luna, Teresa; Altshuler, David

    2014-02-06

    Performing genetic studies in multiple human populations can identify disease risk alleles that are common in one population but rare in others, with the potential to illuminate pathophysiology, health disparities, and the population genetic origins of disease alleles. Here we analysed 9.2 million single nucleotide polymorphisms (SNPs) in each of 8,214 Mexicans and other Latin Americans: 3,848 with type 2 diabetes and 4,366 non-diabetic controls. In addition to replicating previous findings, we identified a novel locus associated with type 2 diabetes at genome-wide significance spanning the solute carriers SLC16A11 and SLC16A13 (P = 3.9 × 10(-13); odds ratio (OR) = 1.29). The association was stronger in younger, leaner people with type 2 diabetes, and replicated in independent samples (P = 1.1 × 10(-4); OR = 1.20). The risk haplotype carries four amino acid substitutions, all in SLC16A11; it is present at ~50% frequency in Native American samples and ~10% in east Asian, but is rare in European and African samples. Analysis of an archaic genome sequence indicated that the risk haplotype introgressed into modern humans via admixture with Neanderthals. The SLC16A11 messenger RNA is expressed in liver, and V5-tagged SLC16A11 protein localizes to the endoplasmic reticulum. Expression of SLC16A11 in heterologous cells alters lipid metabolism, most notably causing an increase in intracellular triacylglycerol levels. Despite type 2 diabetes having been well studied by genome-wide association studies in other populations, analysis in Mexican and Latin American individuals identified SLC16A11 as a novel candidate gene for type 2 diabetes with a possible role in triacylglycerol metabolism.

  7. Sequence variants in SLC16A11 are a common risk factor for type 2 diabetes in Mexico

    PubMed Central

    2014-01-01

    Performing genetic studies in multiple human populations can identify disease risk alleles that are common in one population but rare in others