dna sequence variant: Topics by Science.gov

Sample records for dna sequence variant

Process of labeling specific chromosomes using recombinant repetitive DNA

DOEpatents

Moyzis, R.K.; Meyne, J.

1988-02-12

Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Pooled-DNA Sequencing for Elucidating New Genomic Risk Factors, Rare Variants Underlying Alzheimer's Disease.

PubMed

Jin, Sheng Chih; Benitez, Bruno A; Deming, Yuetiva; Cruchaga, Carlos

2016-01-01

Analyses of genome-wide association studies (GWAS) for complex disorders usually identify common variants with a relatively small effect size that only explain a small proportion of phenotypic heritability. Several studies have suggested that a significant fraction of heritability may be explained by low-frequency (minor allele frequency (MAF) of 1-5%) and rare variants that are not contained in the commercial GWAS genotyping arrays (Schork et al., Curr Opin Genet Dev 19:212, 2009). Rare variants can also have relatively large effects on risk for developing human diseases or disease phenotype (Cruchaga et al., PLoS One 7:e31039, 2012). However, it is necessary to perform next-generation sequencing (NGS) studies in a large population (>4,000 samples) to detect a significant rare-variant association. Several NGS methods, such as custom capture sequencing and amplicon-based sequencing, are designed to screen a small proportion of the genome, but most of these methods are limited in the number of samples that can be multiplexed (i.e. most sequencing kits only provide 96 distinct indexes). Additionally, the sequencing library preparation for 4,000 samples remains expensive, and thus conducting NGS studies with the aforementioned methods is not feasible for most research laboratories. The need for low-cost, large-scale rare-variant detection makes pooled-DNA sequencing an efficient and cost-effective technique to identify rare variants in target regions by sequencing hundreds to thousands of samples. Our recent work has demonstrated that pooled-DNA sequencing can accurately detect rare variants in targeted regions in multiple DNA samples with high sensitivity and specificity (Jin et al., Alzheimers Res Ther 4:34, 2012).
In these studies we used a well-established pooled-DNA sequencing approach and a computational package, SPLINTER (short indel prediction by large deviation inference and nonlinear true frequency estimation by recursion) (Vallania et al., Genome Res 20:1711, 2010), for accurate identification of rare variants in large DNA pools. Given an average sequencing coverage of 30× per haploid genome, SPLINTER can detect rare variants and short indels up to 4 base pairs (bp) with high sensitivity and specificity (up to 1 haploid allele in a pool as large as 500 individuals). Step-by-step instructions on how to conduct pooled-DNA sequencing experiments and data analyses are described in this chapter.
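The pool-size and coverage figures quoted above can be checked with a little arithmetic. The following is a minimal sketch, not part of SPLINTER; it assumes diploid individuals pooled in equal amounts, and the function names are hypothetical.

```python
def pool_allele_fraction(n_individuals: int, n_variant_alleles: int = 1, ploidy: int = 2) -> float:
    """Expected fraction of reads carrying the variant in an equimolar pool (assumption)."""
    total_alleles = n_individuals * ploidy
    return n_variant_alleles / total_alleles


def total_pool_coverage(n_individuals: int, coverage_per_haploid: float = 30.0, ploidy: int = 2) -> float:
    """Total read depth needed so that each haploid genome is covered at the given depth."""
    return n_individuals * ploidy * coverage_per_haploid


# One variant allele in a pool of 500 diploid individuals, at 30x per haploid genome.
fraction = pool_allele_fraction(500)
depth = total_pool_coverage(500)
expected_variant_reads = fraction * depth
print(fraction, depth, expected_variant_reads)
```

Under these assumptions a singleton in a 500-person pool sits at an allele fraction of about 0.1%, so the stated 30× per haploid genome corresponds to roughly 30,000× total depth and about 30 variant-supporting reads.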
A Glimpse into the Satellite DNA Library in Characidae Fish (Teleostei, Characiformes)

PubMed Central

Utsunomia, Ricardo; Ruiz-Ruano, Francisco J.; Silva, Duílio M. Z. A.; Serrano, Érica A.; Rosa, Ivana F.; Scudeler, Patrícia E. S.; Hashimoto, Diogo T.; Oliveira, Claudio; Camacho, Juan Pedro M.; Foresti, Fausto

2017-01-01

Satellite DNA (satDNA) is an abundant fraction of repetitive DNA in eukaryotic genomes and plays an important role in genome organization and evolution. In general, satDNA sequences follow a concerted evolutionary pattern through the intragenomic homogenization of different repeat units. In addition, the satDNA library hypothesis predicts that related species share a series of satDNA variants descended from a common ancestor species, with differential amplification of different satDNA variants. The finding of the same satDNA family in species belonging to different genera within Characidae fish provided the opportunity to test both the concerted evolution and library hypotheses. For this purpose, we analyzed sequence variation and abundance of this satDNA family in ten species by a combination of next-generation sequencing (NGS), PCR and Sanger sequencing, and fluorescence in situ hybridization (FISH). We found extensive between-species variation in the number and size of pericentromeric FISH signals. At the genomic level, the analysis of thousands of DNA sequences obtained by Illumina sequencing and PCR amplification allowed us to define 150 haplotypes, which were linked in a common minimum spanning tree where different patterns of concerted evolution were apparent. This also provided a glimpse into the satDNA library of this group of species. Consistent with the library hypothesis, different variants of this satDNA showed large differences in abundance between species, from highly abundant to simply relictual variants. PMID:28855916
Detecting very low allele fraction variants using targeted DNA sequencing and a novel molecular barcode-aware variant caller.

PubMed

Xu, Chang; Nezami Ranjbar, Mohammad R; Wu, Zhong; DiCarlo, John; Wang, Yexun

2017-01-03

Detection of DNA mutations at very low allele fractions with high accuracy will significantly improve the effectiveness of precision medicine for cancer patients. To achieve this goal through next generation sequencing, researchers need a detection method that 1) captures rare mutation-containing DNA fragments efficiently in the mix of abundant wild-type DNA; 2) sequences the DNA library extensively to deep coverage; and 3) distinguishes low level true variants from amplification and sequencing errors with high accuracy. Targeted enrichment using PCR primers provides researchers with a convenient way to achieve deep sequencing for a small, yet most relevant region using benchtop sequencers. Molecular barcoding (or indexing) provides a unique solution for reducing sequencing artifacts analytically. Although different molecular barcoding schemes have been reported in recent literature, most variant calling has been done on limited targets, using simple custom scripts. The analytical performance of barcode-aware variant calling can be significantly improved by incorporating advanced statistical models. We present here a highly efficient, simple and scalable enrichment protocol that integrates molecular barcodes in multiplex PCR amplification. In addition, we developed smCounter, an open source, generic, barcode-aware variant caller based on a Bayesian probabilistic model. smCounter was optimized and benchmarked on two independent read sets with SNVs and indels at 5 and 1% allele fractions. Variants were called with very good sensitivity and specificity within coding regions. We demonstrated that we can accurately detect somatic mutations with allele fractions as low as 1% in coding regions using our enrichment protocol and variant caller.
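The idea of using molecular barcodes to separate true variants from amplification and sequencing errors can be illustrated with a simple consensus step. This is a minimal sketch that uses a majority vote per barcode; it is not smCounter's Bayesian model, and the function names, thresholds, and input layout are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def consensus_by_barcode(observations, min_reads=2, min_agreement=0.8):
    """Collapse (barcode, base) observations at one position into one base per barcode.

    Barcodes with too few reads, or without a sufficiently dominant base, are dropped.
    """
    by_barcode = defaultdict(list)
    for barcode, base in observations:
        by_barcode[barcode].append(base)

    consensus = {}
    for barcode, bases in by_barcode.items():
        if len(bases) < min_reads:
            continue  # a single read cannot be error-corrected
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[barcode] = base
    return consensus


def barcode_allele_fraction(consensus, alt):
    """Fraction of consensus molecules (not raw reads) that carry the alternate base."""
    if not consensus:
        return 0.0
    return sum(1 for base in consensus.values() if base == alt) / len(consensus)


observations = [
    ("AAC", "A"), ("AAC", "A"), ("AAC", "A"), ("AAC", "A"), ("AAC", "T"),  # one read-level error
    ("GGT", "T"), ("GGT", "T"),                                            # consistent variant molecule
    ("CCA", "A"), ("CCA", "A"),
    ("TTG", "T"),                                                          # singleton, dropped
]
molecules = consensus_by_barcode(observations)
print(molecules, barcode_allele_fraction(molecules, "T"))
```

Counting molecules rather than reads means a single erroneous read inside a barcode family does not contribute to the variant allele fraction.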
Fast single-pass alignment and variant calling using sequencing data

USDA-ARS's Scientific Manuscript database

Sequencing research requires efficient computation. Few programs use already known information about DNA variants when aligning sequence data to the reference map. New program findmap.f90 reads the previous variant list before aligning sequence, calling variant alleles, and summing the allele counts...
Simultaneous detection of human mitochondrial DNA and nuclear-inserted mitochondrial-origin sequences (NumtS) using forensic mtDNA amplification strategies and pyrosequencing technology.

PubMed

Bintz, Brittania J; Dixon, Groves B; Wilson, Mark R

2014-07-01

Next-generation sequencing technologies enable the identification of minor mitochondrial DNA variants with higher sensitivity than Sanger methods, allowing for enhanced identification of minor variants. In this study, mixtures of human mtDNA control region amplicons were subjected to pyrosequencing to determine the detection threshold of the Roche GS Junior® instrument (Roche Applied Science, Indianapolis, IN). In addition to expected variants, a set of reproducible variants was consistently found in reads from one particular amplicon. A BLASTn search of the variant sequence revealed identity to a segment of a 611-bp nuclear insertion of the mitochondrial control region (NumtS) spanning the primer-binding sites of this amplicon (Nature 1995;378:489). Primers (Hum Genet 2012;131:757; Hum Biol 1996;68:847) flanking the insertion were used to confirm the presence or absence of the NumtS in buccal DNA extracts from twenty donors. These results further our understanding of human mtDNA variation and are expected to have a positive impact on the interpretation of mtDNA profiles using deep-sequencing methods in casework. © 2014 American Academy of Forensic Sciences.
Chromosome specific repetitive DNA sequences

DOEpatents

Moyzis, Robert K.; Meyne, Julianne

1991-01-01

A method is provided for determining specific nucleotide sequences useful in forming a probe which can identify specific chromosomes, preferably through in situ hybridization within the cell itself. In one embodiment, chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family me... This invention is the result of a contract with the Department of Energy (Contract No. W-7405-ENG-36).
Mitochondrial DNA (mtDNA) variants in the European haplogroups HV, JT, and U do not have a major role in schizophrenia.

PubMed

Torrell, Helena; Salas, Antonio; Abasolo, Nerea; Morén, Constanza; Garrabou, Glòria; Valero, Joaquín; Alonso, Yolanda; Vilella, Elisabet; Costas, Javier; Martorell, Lourdes

2014-10-01

It has been reported that certain genetic factors involved in schizophrenia could be located in the mitochondrial DNA (mtDNA). Therefore, we hypothesized that mtDNA mutations and/or variants would be present in schizophrenia patients and may be related to schizophrenia characteristics and mitochondrial function. This study was performed in three steps: (1) identification of pathogenic mutations and variants in 14 schizophrenia patients with an apparent maternal inheritance of the disease by sequencing the entire mtDNA; (2) case-control association study of 23 variants identified in step 1 (16 missense, 3 rRNA, and 4 tRNA variants) in 495 patients and 615 controls, and (3) analyses of the associated variants according to the clinical, psychopathological, and neuropsychological characteristics and according to the oxidative and enzymatic activities of the mitochondrial respiratory chain. We did not identify pathogenic mtDNA mutations in the 14 sequenced patients. Two known variants were nominally associated with schizophrenia and were further studied. The MT-RNR2 1811A > G variant likely does not play a major role in schizophrenia, as it was not associated with clinical, psychopathological, or neuropsychological variables, and the MT-ATP6 9110T > C p.Ile195Thr variant did not result in differences in the oxidative and enzymatic functions of the mitochondrial respiratory chain. The patients with apparent maternal inheritance of schizophrenia did not exhibit any mutations in their mtDNA. The variants nominally associated with schizophrenia in the present study were not related either to phenotypic characteristics or to mitochondrial function. We did not find evidence pointing to a role for mtDNA sequence variation in schizophrenia. © 2014 Wiley Periodicals, Inc.
Segregation and recombination of a multipartite mitochondrial DNA in populations of the potato cyst nematode Globodera pallida.

PubMed

Armstrong, Miles R; Husmeier, Dirk; Phillips, Mark S; Blok, Vivian C

2007-06-01

The discovery that the potato cyst nematode Globodera pallida has a multipartite mitochondrial DNA (mtDNA) composed, at least in part, of six small circular mtDNAs (scmtDNAs) raised a number of questions concerning the population-level processes that might act on such a complex genome. Here we report our observations on the distribution of some scmtDNAs among a sample of European and South American G. pallida populations. The occurrence of sequence variants of scmtDNA IV in population P4A from South America, and that particular sequence variants are common to the individuals within a single cyst, is described. Evidence for recombination of sequence variants of scmtDNA IV in P4A is also reported. The mosaic structure of P4A scmtDNA IV sequences was revealed using several detection methods and recombination breakpoints were independently detected by maximum likelihood and Bayesian MCMC methods.
Multiplexed enrichment of rare DNA variants via sequence-selective and temperature-robust amplification

PubMed Central

Wu, Lucia R.; Chen, Sherry X.; Wu, Yalei; Patel, Abhijit A.; Zhang, David Yu

2018-01-01

Rare DNA-sequence variants hold important clinical and biological information, but existing detection techniques are expensive, complex, allele-specific, or don’t allow for significant multiplexing. Here, we report a temperature-robust polymerase-chain-reaction method, which we term blocker displacement amplification (BDA), that selectively amplifies all sequence variants, including single-nucleotide variants (SNVs), within a roughly 20-nucleotide window by 1,000-fold over wild-type sequences. This allows for easy detection and quantitation of hundreds of potential variants originally at ≤0.1% in allele frequency. BDA is compatible with inexpensive thermocycler instrumentation and employs a rationally designed competitive hybridization reaction to achieve comparable enrichment performance across annealing temperatures ranging from 56 °C to 64 °C. To show the sequence generality of BDA, we demonstrate enrichment of 156 SNVs and the reliable detection of single-digit copies. We also show that the BDA detection of rare driver mutations in cell-free DNA samples extracted from the blood plasma of lung-cancer patients is highly consistent with deep sequencing using molecular lineage tags, with a receiver operator characteristic accuracy of 95%. PMID:29805844
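The effect of a 1,000-fold selective amplification on a variant starting at 0.1% allele frequency can be worked out directly. This is a minimal sketch under the simplifying assumption that the variant template is multiplied by the enrichment factor relative to wild type; the function name is hypothetical and nothing here models the BDA chemistry itself.

```python
def enriched_allele_fraction(initial_fraction: float, fold_enrichment: float) -> float:
    """Variant allele fraction after the variant is amplified fold_enrichment times over wild type."""
    variant = initial_fraction * fold_enrichment
    wild_type = 1.0 - initial_fraction
    return variant / (variant + wild_type)


# A variant at 0.1% before enrichment, with the roughly 1,000-fold enrichment quoted above.
after = enriched_allele_fraction(0.001, 1000)
print(f"{after:.3f}")
```

Under this assumption the variant rises from 0.1% to roughly half of the amplified product, which is why it becomes readable on inexpensive downstream assays.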
DNA Sequence Variants in the Five Prime Untranslated Region of the Cyclooxygenase-2 Gene Are Commonly Found in Healthy Dogs and Gray Wolves.

PubMed

Safra, Noa; Hayward, Louisa J; Aguilar, Miriam; Sacks, Benjamin N; Westropp, Jodi L; Mohr, F Charles; Mellersh, Cathryn S; Bannasch, Danika L

2015-01-01

The aim of this study was to investigate the frequency of regional DNA variants upstream to the translation initiation site of the canine Cyclooxygenase-2 (Cox-2) gene in healthy dogs. Cox-2 plays a role in various disease conditions such as acute and chronic inflammation, osteoarthritis and malignancy. A role for Cox-2 DNA variants in genetic predisposition to canine renal dysplasia has been proposed and dog breeders have been encouraged to select against these DNA variants. We sequenced 272-422 bases in 152 dogs unaffected by renal dysplasia and found 19 different haplotypes including 11 genetic variants which had not been described previously. We genotyped 7 gray wolves to ascertain the wildtype variant and found that the wolves we analyzed had predominantly the second most common DNA variant found in dogs. Our results demonstrate an elevated level of regional polymorphism that appears to be a feature of healthy domesticated dogs.
Genotype-specific signal generation based on digestion of 3-way DNA junctions: application to KRAS variation detection.

PubMed

Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike

2006-10-01

Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples

PubMed Central

Wang, Jingwen; Skoog, Tiina; Einarsdottir, Elisabet; Kaartokallio, Tea; Laivuori, Hannele; Grauers, Anna; Gerdhem, Paul; Hytönen, Marjo; Lohi, Hannes; Kere, Juha; Jiao, Hong

2016-01-01

High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies. PMID:27633116
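The comparison described above, pooled MAF estimates against MAFs from individual genotypes, can be sketched as follows. This is an illustrative outline only: the function names and input layout are assumptions, and it does not reproduce how GATK or Freebayes derive allele counts.

```python
from math import sqrt


def pooled_maf(alt_reads: int, total_reads: int) -> float:
    """MAF estimated from read counts at one site in a pooled sample."""
    freq = alt_reads / total_reads
    return min(freq, 1.0 - freq)


def genotyped_maf(alt_allele_counts) -> float:
    """MAF from per-individual alternate-allele counts (0, 1 or 2 for diploids)."""
    freq = sum(alt_allele_counts) / (2 * len(alt_allele_counts))
    return min(freq, 1.0 - freq)


def allele_count_difference(maf_pool: float, maf_genotyped: float, n_individuals: int) -> int:
    """Difference between the two estimates, expressed in whole alleles."""
    total_alleles = 2 * n_individuals
    return abs(round(maf_pool * total_alleles) - round(maf_genotyped * total_alleles))


def pearson_r(xs, ys) -> float:
    """Pearson correlation between two equally long lists of MAF estimates."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(pooled_maf(30, 1000), genotyped_maf([0, 1, 0, 0, 2]))
print(allele_count_difference(0.03, 0.02, 50))
```

Expressing the discrepancy in whole alleles mirrors the abstract's "identical MAFs" versus "one allele difference" categories.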
Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

PubMed

Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

2010-07-01

We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
A pooling-based approach to mapping genetic variants associated with DNA methylation

PubMed Central

Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; McEwen, Lisa M.; Kobor, Michael S.; Fraser, Hunter B.

2015-01-01

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. We found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data. PMID:25910490
A pooling-based approach to mapping genetic variants associated with DNA methylation

DOE PAGES

Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.; ...

2015-04-24

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. Here we found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

PubMed

Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

2013-07-01

Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
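The filtering steps described above (quality filtering, screening against known polymorphisms, and narrowing to candidates that fit a recessive model) can be outlined in a few lines. This is a hypothetical sketch, not the Agile Suite: the `Variant` record, the quality threshold, and the use of homozygosity as the recessive criterion are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float
    genotype: str  # "hom" or "het" (assumed encoding)


def candidate_recessive_variants(variants, known_polymorphisms, min_quality=30.0):
    """Keep well-supported, homozygous variants that are absent from a known-polymorphism set."""
    candidates = []
    for v in variants:
        key = (v.chrom, v.pos, v.ref, v.alt)
        if v.quality < min_quality:
            continue  # fails the quality filter
        if key in known_polymorphisms:
            continue  # already catalogued, for example in dbSNP or the 1000 Genomes Project
        if v.genotype != "hom":
            continue  # simple recessive model: require homozygosity
        candidates.append(v)
    return candidates


calls = [
    Variant("chr1", 1000, "A", "G", 55.0, "hom"),
    Variant("chr1", 2000, "C", "T", 12.0, "hom"),   # low quality
    Variant("chr2", 3000, "G", "A", 60.0, "het"),   # heterozygous
    Variant("chr3", 4000, "T", "C", 70.0, "hom"),   # known polymorphism
]
known = {("chr3", 4000, "T", "C")}
print(candidate_recessive_variants(calls, known))
```

A fuller model would also consider compound heterozygotes; the sketch keeps only the homozygous case for brevity.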
Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

PubMed

Lakshmikumaran, M; Negi, M S

1994-03-01

Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.
VaDiR: an integrated approach to Variant Detection in RNA.

PubMed

Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

2018-02-01

Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered, but its cost currently limits its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
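The Tier 1 definition above, variants reported by all three callers, amounts to a set intersection, and precision and recall can then be computed against DNA-validated calls. This is a minimal sketch rather than the VaDiR pipeline; representing each variant as a `(chrom, pos, ref, alt)` tuple and the function names are assumptions.

```python
def tier1_variants(snpir, rvboost, mutect2):
    """Variants reported by all three callers."""
    return set(snpir) & set(rvboost) & set(mutect2)


def precision_recall(called, validated):
    """Precision and recall of a call set against DNA-validated variants."""
    called, validated = set(called), set(validated)
    true_positives = len(called & validated)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


snpir = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
rvboost = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")}
mutect2 = {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")}
dna_validated = {("chr1", 100, "A", "G"), ("chr3", 300, "G", "A")}

tier1 = tier1_variants(snpir, rvboost, mutect2)
print(tier1, precision_recall(tier1, dna_validated))
```

In this toy input the intersection keeps a single variant, which is validated, so precision is high while recall is lower. That matches the trade-off the abstract describes for the strictest combination.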
Whole Gene Capture Analysis of 15 CRC Susceptibility Genes in Suspected Lynch Syndrome Patients.

PubMed

Jansen, Anne M L; Geilenkirchen, Marije A; van Wezel, Tom; Jagmohan-Changur, Shantie C; Ruano, Dina; van der Klift, Heleen M; van den Akker, Brendy E W M; Laros, Jeroen F J; van Galen, Michiel; Wagner, Anja; Letteboer, Tom G W; Gómez-García, Encarna B; Tops, Carli M J; Vasen, Hans F; Devilee, Peter; Hes, Frederik J; Morreau, Hans; Wijnen, Juul T

2016-01-01

Lynch Syndrome (LS) is caused by pathogenic germline variants in one of the mismatch repair (MMR) genes. However, up to 60% of MMR-deficient colorectal cancer cases are categorized as suspected Lynch Syndrome (sLS) because no pathogenic MMR germline variant can be identified, which leads to difficulties in clinical management. We therefore analyzed the genomic regions of 15 CRC susceptibility genes in leukocyte DNA of 34 unrelated sLS patients and 11 patients with MLH1 hypermethylated tumors with a clear family history. Using targeted next-generation sequencing, we analyzed the entire non-repetitive genomic sequence, including intronic and regulatory sequences, of 15 CRC susceptibility genes. In addition, tumor DNA from 28 sLS patients was analyzed for somatic MMR variants. Of 1979 germline variants found in the leukocyte DNA of 34 sLS patients, one was a pathogenic variant (MLH1 c.1667+1delG). Leukocyte DNA of 11 patients with MLH1 hypermethylated tumors was negative for pathogenic germline variants in the tested CRC susceptibility genes and for germline MLH1 hypermethylation. Somatic DNA analysis of 28 sLS tumors identified eight (29%) cases with two pathogenic somatic variants, one with a VUS predicted to be pathogenic and LOH, and nine cases (32%) with one pathogenic somatic variant (n = 8) or one VUS predicted to be pathogenic (n = 1). This is the first study in sLS patients to include the entire genomic sequence of CRC susceptibility genes. An underlying somatic or germline MMR gene defect was identified in ten of 34 sLS patients (29%). In the remaining sLS patients, the underlying genetic defect explaining the MMR deficiency in their tumors might be found outside the genomic regions harboring the MMR and other known CRC susceptibility genes.

VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

PubMed Central

Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.

2016-01-01

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149
Evolutional dynamics of 45S and 5S ribosomal DNA in ancient allohexaploid Atropa belladonna.

PubMed

Volkov, Roman A; Panchuk, Irina I; Borisjuk, Nikolai V; Hosiawa-Baranska, Marta; Maluszynska, Jolanta; Hemleben, Vera

2017-01-23

Polyploid hybrids represent a rich natural resource to study molecular evolution of plant genes and genomes. Here, we applied a combination of karyological and molecular methods to investigate chromosomal structure, molecular organization and evolution of ribosomal DNA (rDNA) in nightshade, Atropa belladonna (fam. Solanaceae), one of the oldest known allohexaploids among flowering plants. Because of their abundance and specific molecular organization (evolutionarily conserved coding regions linked to variable intergenic spacers, IGS), 45S and 5S rDNA are widely used in plant taxonomic and evolutionary studies. Molecular cloning and nucleotide sequencing of A. belladonna 45S rDNA repeats revealed a general structure characteristic of other Solanaceae species, and a very high sequence similarity of two length variants, with the only difference in number of short IGS subrepeats. These results combined with the detection of three pairs of 45S rDNA loci on separate chromosomes, presumably inherited from both tetraploid and diploid ancestor species, exemplify intensive sequence homogenization that led to substitution/elimination of rDNA repeats of one parent. Chromosome silver-staining revealed that only four out of six 45S rDNA sites are frequently transcriptionally active, demonstrating nucleolar dominance. For 5S rDNA, three size variants of repeats were detected, with the major class represented by repeats containing all functional IGS elements required for transcription, the intermediate size repeats containing partially deleted IGS sequences, and the short 5S repeats containing severe defects both in the IGS and coding sequences. While shorter variants demonstrate an increased rate of base substitution, probably in their transition into pseudogenes, the functional 5S rDNA variants are nearly identical at the sequence level, pointing to their origin from a single parental species.
Localization of the 5S rDNA genes on two chromosome pairs further supports uniparental inheritance from the tetraploid progenitor. The obtained molecular, cytogenetic and phylogenetic data demonstrate complex evolutionary dynamics of rDNA loci in allohexaploid species of Atropa belladonna. The high level of sequence unification revealed in 45S and 5S rDNA loci of this ancient hybrid species has seemingly been achieved by different molecular mechanisms.
Genotyping of 25 leukemia-associated genes in a single work flow by next-generation sequencing technology with low amounts of input template DNA.

PubMed

Rinke, Jenny; Schäfer, Vivien; Schmidt, Mathias; Ziermann, Janine; Kohlmann, Alexander; Hochhaus, Andreas; Ernst, Thomas

2013-08-01

We sought to establish a convenient, sensitive next-generation sequencing (NGS) method for genotyping the 26 most commonly mutated leukemia-associated genes in a single work flow and to optimize this method for low amounts of input template DNA. We designed 184 PCR amplicons that cover all of the candidate genes. NGS was performed with genomic DNA (gDNA) from a cohort of 10 individuals with chronic myelomonocytic leukemia. The results were compared with NGS data obtained from sequencing of DNA generated by whole-genome amplification (WGA) of 20 ng template gDNA. Differences between gDNA and WGA samples in variant frequencies were determined for 2 different WGA kits. For gDNA samples, 25 of 26 genes were successfully sequenced with a sensitivity of 5%, which was achieved by a median coverage of 492 reads (range, 308-636 reads) per amplicon. We identified 24 distinct mutations in 11 genes. With WGA samples, we reliably detected all mutations above 5% sensitivity with a median coverage of 506 reads (range, 256-653 reads) per amplicon. With all variants included in the analysis, WGA amplification by the 2 kits tested yielded differences in variant frequencies that ranged from -28.19% to +9.94% [mean (SD) difference, -0.2% (4.08%)] and from -35.03% to +18.67% [mean difference, -0.75% (5.12%)]. Our method permits simultaneous analysis of a wide range of leukemia-associated target genes in a single sequencing run. NGS can be performed after WGA of template DNA for reliable detection of variants without introducing appreciable bias.
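The comparison of variant frequencies between gDNA and WGA samples described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names are hypothetical and the read counts are made-up values chosen only to show the arithmetic (per-variant difference, then mean and SD of the differences).

```python
from statistics import mean, stdev

def variant_frequency(variant_reads, total_reads):
    """Variant allele frequency in percent."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * variant_reads / total_reads

def frequency_differences(gdna, wga):
    """Per-variant difference (WGA minus gDNA) in percentage points.

    Both arguments map a variant label to (variant_reads, total_reads).
    Only variants present in both samples are compared.
    """
    shared = sorted(set(gdna) & set(wga))
    return {
        v: variant_frequency(*wga[v]) - variant_frequency(*gdna[v])
        for v in shared
    }

# Hypothetical read counts, not data from the study.
gdna_counts = {"var1": (246, 492), "var2": (50, 500), "var3": (30, 600)}
wga_counts = {"var1": (240, 500), "var2": (60, 500), "var3": (24, 600)}

diffs = frequency_differences(gdna_counts, wga_counts)
mean_diff = mean(diffs.values())
sd_diff = stdev(diffs.values())  # sample standard deviation
```

For scale, a variant present at the stated 5% sensitivity limit would be supported by roughly 0.05 × 492 ≈ 25 reads at the reported median gDNA coverage.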
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kaplow, Irene M.; MacIsaac, Julia L.; Mah, Sarah M.

DNA methylation is an epigenetic modification that plays a key role in gene regulation. Previous studies have investigated its genetic basis by mapping genetic variants that are associated with DNA methylation at specific sites, but these have been limited to microarrays that cover <2% of the genome and cannot account for allele-specific methylation (ASM). Other studies have performed whole-genome bisulfite sequencing on a few individuals, but these lack statistical power to identify variants associated with DNA methylation. We present a novel approach in which bisulfite-treated DNA from many individuals is sequenced together in a single pool, resulting in a truly genome-wide map of DNA methylation. Compared to methods that do not account for ASM, our approach increases statistical power to detect associations while sharply reducing cost, effort, and experimental variability. As a proof of concept, we generated deep sequencing data from a pool of 60 human cell lines; we evaluated almost twice as many CpGs as the largest microarray studies and identified more than 2000 genetic variants associated with DNA methylation. Here we found that these variants are highly enriched for associations with chromatin accessibility and CTCF binding but are less likely to be associated with traits indirectly linked to DNA, such as gene expression and disease phenotypes. In summary, our approach allows genome-wide mapping of genetic variants associated with DNA methylation in any tissue of any species, without the need for individual-level genotype or methylation data.
Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing.

PubMed

Ramos, Enrique; Levinson, Benjamin T; Chasnoff, Sara; Hughes, Andrew; Young, Andrew L; Thornton, Katherine; Li, Allie; Vallania, Francesco L M; Province, Michael; Druley, Todd E

2012-12-06

Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign. We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA, eliminating the need for PCR to incorporate indexes, and this was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. 
These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22-48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome. This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
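The index design constraint mentioned in this abstract (7 base pair indexes with a pairwise Hamming distance of at least 2) can be checked with a short script. This is a sketch under stated assumptions: the index sequences below are hypothetical examples, not the adapters used in the study, and the function names are illustrative.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))

def is_valid_index_set(indexes, length=7, min_distance=2):
    """True if every index has the given length and all pairs are far enough apart."""
    if any(len(idx) != length for idx in indexes):
        return False
    return min_pairwise_distance(indexes) >= min_distance

# Hypothetical index sequences for illustration only.
good_set = ["ACGTACG", "ACGTTGG", "TGCATGC"]
bad_set = ["ACGTACG", "ACGTACC"]  # differ at a single position

good_ok = is_valid_index_set(good_set)
bad_ok = is_valid_index_set(bad_set)
```

With a minimum distance of 2, a single sequencing error in an index read cannot convert one valid index into another, so such errors can be detected, although not corrected.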
Next-generation sequencing for genetic testing of familial colorectal cancer syndromes.

PubMed

Simbolo, Michele; Mafficini, Andrea; Agostini, Marco; Pedrazzani, Corrado; Bedin, Chiara; Urso, Emanuele D; Nitti, Donato; Turri, Giona; Scardoni, Maria; Fassan, Matteo; Scarpa, Aldo

2015-01-01

Genetic screening in families with high risk to develop colorectal cancer (CRC) prevents incurable disease and permits personalized therapeutic and follow-up strategies. The advancement of next-generation sequencing (NGS) technologies has revolutionized the throughput of DNA sequencing. A series of 16 probands for either familial adenomatous polyposis (FAP; 8 cases) or hereditary nonpolyposis colorectal cancer (HNPCC; 8 cases) were investigated for intragenic mutations in five CRC familial syndromes-associated genes (APC, MUTYH, MLH1, MSH2, MSH6) applying both a custom multigene Ion AmpliSeq NGS panel and conventional Sanger sequencing. Fourteen pathogenic variants were detected in 13/16 FAP/HNPCC probands (81.3 %); one FAP proband presented two co-existing pathogenic variants, one in APC and one in MUTYH. Thirteen of these 14 pathogenic variants were detected by both NGS and Sanger, while one MSH2 mutation (L280FfsX3) was identified only by Sanger sequencing. This is due to a limitation of the NGS approach in resolving sequences close to or within homopolymeric stretches of DNA. To evaluate the performance of our NGS custom panel we assessed its capability to resolve the DNA sequences corresponding to 2225 pathogenic variants reported in the COSMIC database for APC, MUTYH, MLH1, MSH2, MSH6. Our NGS custom panel resolves the sequences where 2108 (94.7 %) of these variants occur. The remaining 117 mutations reside inside or in close proximity to homopolymer stretches; of these 27 (1.2 %) are imprecisely identified by the software but can be resolved by visual inspection of the region, while the remaining 90 variants (4.0 %) are blind spots. In summary, our custom panel would miss 4 % (90/2225) of pathogenic variants, which would need a small set of Sanger sequencing reactions to be resolved. 
The multiplex NGS approach has the advantage of analyzing multiple genes in multiple samples simultaneously, requiring only a reduced number of Sanger sequences to resolve homopolymeric DNA regions not adequately assessed by NGS. The implementation of NGS approaches in routine diagnostics of familial CRC is cost-effective and significantly reduces diagnostic turnaround times.
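The idea of flagging variant positions that lie in or near homopolymer stretches can be illustrated with a short sketch. It is not the software used in the study: the minimum run length (`min_len`) and the flank size are assumptions, since the abstract does not state which thresholds define a homopolymer or "close proximity", and the example sequence is invented. The last lines re-derive the percentages quoted in the abstract from its own counts.

```python
from itertools import groupby

def homopolymer_runs(seq, min_len=4):
    """Return (start, end, base) for each run of a single base of at least
    min_len bases. Coordinates are 0-based and end-exclusive."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, pos + length, base))
        pos += length
    return runs

def near_homopolymer(seq, variant_pos, min_len=4, flank=2):
    """True if variant_pos lies inside a homopolymer run or within
    `flank` bases of one (both thresholds are assumptions)."""
    return any(
        start - flank <= variant_pos < end + flank
        for start, end, _ in homopolymer_runs(seq, min_len)
    )

def percent(part, total):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)

example = "ACGTAAAAAGCTTTTC"  # invented sequence with two runs
runs = homopolymer_runs(example)

# Counts taken from the abstract: resolved, imprecise, blind spots.
shares = [percent(n, 2225) for n in (2108, 27, 90)]
```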
Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

PubMed Central

Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

2018-01-01

Much of the within species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139
Random Mutagenesis, Clonal Events, and Embryonic or Somatic Origin Determine the mtDNA Variant Type and Load in Human Pluripotent Stem Cells.

PubMed

Zambelli, Filippo; Mertens, Joke; Dziedzicka, Dominika; Sterckx, Johan; Markouli, Christina; Keller, Alexander; Tropel, Philippe; Jung, Laura; Viville, Stephane; Van de Velde, Hilde; Geens, Mieke; Seneca, Sara; Sermon, Karen; Spits, Claudia

2018-06-07

In this study, we deep-sequenced the mtDNA of human embryonic and induced pluripotent stem cells (hESCs and hiPSCs) and their source cells and found that the majority of variants pre-existed in the cells used to establish the lines. Early-passage hESCs carried few and low-load heteroplasmic variants, similar to those identified in oocytes and inner cell masses. The number and heteroplasmic loads of these variants increased with prolonged cell culture. The study of 120 individual cells of early- and late-passage hESCs revealed a significant diversity in mtDNA heteroplasmic variants at the single-cell level and that the variants that increase during time in culture are always passenger to the appearance of chromosomal abnormalities. We found that early-passage hiPSCs carry much higher loads of mtDNA variants than hESCs, which single-fibroblast sequencing proved pre-existed in the source cells. Finally, we show that these variants are stably transmitted during short-term differentiation. Copyright © 2018 The Author(s). Published by Elsevier Inc. All rights reserved.
Mosaic CREBBP mutation causes overlapping clinical features of Rubinstein-Taybi and Filippi syndromes.

PubMed

de Vries, Tamar I; Monroe, Glen R; van Belzen, Martine J; van der Lans, Christian A; Savelberg, Sanne Mc; Newman, William G; van Haaften, Gijs; Nievelstein, Rutger A; van Haelst, Mieke M

2016-08-01

Rubinstein-Taybi syndrome (RTS, OMIM 180849) and Filippi syndrome (FLPIS, OMIM 272440) are both rare syndromes, with multiple congenital anomalies and intellectual deficit (MCA/ID). We present a patient with intellectual deficit, short stature, bilateral syndactyly of hands and feet, broad thumbs, ocular abnormalities, and dysmorphic facial features. These clinical features suggest both RTS and FLPIS. Initial DNA analysis of DNA isolated from blood did not identify variants to confirm either of these syndrome diagnoses. Whole-exome sequencing identified a homozygous variant in C9orf173, which was novel at the time of analysis. Further Sanger sequencing analysis of FLPIS cases tested negative for CKAP2L variants did not, however, reveal any further variants. Subsequent analysis using DNA isolated from buccal mucosa revealed a mosaic variant in CREBBP. This report highlights the importance of excluding mosaic variants in patients with a strong but atypical clinical presentation of an MCA/ID syndrome if no disease-causing variants can be detected in DNA isolated from blood samples. As the striking syndactyly observed in the present case is typical for FLPIS, we suggest CREBBP analysis in saliva samples for FLPIS syndrome cases in which no causal CKAP2L variant is detected.
[Structural organization of 5S ribosomal DNA of Rosa rugosa].

PubMed

Tynkevych, Iu O; Volkov, R A

2014-01-01

In order to clarify the molecular organization of the genomic region encoding 5S rRNA in the diploid species Rosa rugosa, several 5S rDNA repeated units were cloned and sequenced. Analysis of the obtained sequences revealed that only one length variant of 5S rDNA repeated units, which contains intact promoter elements in the intergenic spacer region (IGS) and appears to be transcriptionally active, is present in the genome. Additionally, a limited number of 5S rDNA pseudogenes lacking a portion of coding sequence and the complete IGS was detected. A high level of sequence similarity (from 93.7 to 97.5%) between the IGS of major 5S rDNA variants of East Asian R. rugosa and North American R. nitida was found, indicating comparatively recent divergence of these species.
Distribution and sequence homogeneity of an abundant satellite DNA in the beetle, Tenebrio molitor.

PubMed Central

Davis, C A; Wyatt, G R

1989-01-01

The mealworm beetle, Tenebrio molitor, contains an unusually abundant and homogeneous satellite DNA which constitutes up to 60% of its genome. The satellite DNA is shown to be present in all of the chromosomes by in situ hybridization. 18 dimers of the repeat unit were cloned and sequenced. The consensus sequence is 142 nt long and lacks any internal repeat structure. Monomers of the sequence are very similar, showing on average a 2% divergence from the calculated consensus. Variant nucleotides are scattered randomly throughout the sequence although some variants are more common than others. Neighboring repeat units are no more alike than randomly chosen ones. The results suggest that some mechanism, perhaps gene conversion, is acting to maintain the homogeneity of the satellite DNA despite its abundance and distribution on all of the chromosomes. PMID:2762148
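Computing a consensus from cloned monomers and each monomer's divergence from it, as this abstract describes, can be sketched as below. The sketch assumes the monomers are already aligned and of equal length with no insertions or deletions, which a real 142 nt satellite data set would not guarantee. The short sequences are invented for illustration and the function names are hypothetical.

```python
from collections import Counter

def consensus(monomers):
    """Column-wise majority consensus of pre-aligned, equal-length sequences."""
    length = len(monomers[0])
    if any(len(m) != length for m in monomers):
        raise ValueError("monomers must be aligned to equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*monomers)
    )

def divergence(seq, ref):
    """Percentage of positions at which seq differs from ref."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(a != b for a, b in zip(seq, ref)) / len(ref)

# Invented 10 nt toy monomers, not T. molitor sequence.
monomers = ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC", "ACGTACGTCC"]

cons = consensus(monomers)
per_monomer = [divergence(m, cons) for m in monomers]
mean_divergence = sum(per_monomer) / len(per_monomer)
```

Ties in a column are broken by whichever base is encountered first, so a production version would need an explicit tie rule or an ambiguity code.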
Evaluation of Two Highly-Multiplexed Custom Panels for Massively Parallel Semiconductor Sequencing on Paraffin DNA

PubMed Central

Kotoula, Vassiliki; Lyberopoulou, Aggeliki; Papadopoulou, Kyriaki; Charalambous, Elpida; Alexopoulou, Zoi; Gakou, Chryssa; Lakis, Sotiris; Tsolaki, Eleftheria; Lilakos, Konstantinos; Fountzilas, George

2015-01-01

Background and aim: Massively parallel sequencing (MPS) holds promise for expanding cancer translational research and diagnostics. As yet, it has been applied on paraffin DNA (FFPE) with commercially available highly multiplexed gene panels (100s of DNA targets), while custom panels of low multiplexing are used for re-sequencing. Here, we evaluated the performance of two highly multiplexed custom panels on FFPE DNA. Methods: Two custom multiplex amplification panels (B, 373 amplicons; T, 286 amplicons) were coupled with semiconductor sequencing on DNA samples from FFPE breast tumors and matched peripheral blood samples (n samples: 316; n libraries: 332). The two panels shared 37% DNA targets (common or shifted amplicons). Panel performance was evaluated in paired sample groups and quartets of libraries, where possible. Results: Amplicon read ratios yielded similar patterns per gene with the same panel in FFPE and blood samples; however, performance of common amplicons differed between panels (p<0.001). FFPE genotypes were compared for 1267 coding and non-coding variant replicates, 999 out of which (78.8%) were concordant in different paired sample combinations. Variant frequency was highly reproducible (Spearman’s rho 0.959). Repeatedly discordant variants were of high coverage / low frequency (p<0.001). Genotype concordance was (a) high, for intra-run duplicates with the same panel (mean±SD: 97.2±4.7, 95%CI: 94.8–99.7, p<0.001); (b) modest, when the same DNA was analyzed with different panels (mean±SD: 81.1±20.3, 95%CI: 66.1–95.1, p = 0.004); and (c) low, when different DNA samples from the same tumor were compared with the same panel (mean±SD: 59.9±24.0; 95%CI: 43.3–76.5; p = 0.282). Low coverage / low frequency variants were validated with Sanger sequencing even in samples with unfavourable DNA quality. Conclusions: Custom MPS may yield novel information on genomic alterations, provided that data evaluation is adjusted to tumor tissue FFPE DNA.
To this end, eligibility of all amplicons along with variant coverage and frequency need to be assessed. PMID:26039550
A low density microarray method for the identification of human papillomavirus type 18 variants.

PubMed

Meza-Menchaca, Thuluz; Williams, John; Rodríguez-Estrada, Rocío B; García-Bravo, Aracely; Ramos-Ligonio, Ángel; López-Monteon, Aracely; Zepeda, Rossana C

2013-09-26

We describe a novel microarray-based method for the screening of oncogenic human papillomavirus 18 (HPV-18) molecular variants. Because sequencing methodology may underestimate samples containing more than one variant, we designed a specific and sensitive stacking DNA hybridization assay. This technology can be used to discriminate between three possible phylogenetic branches of HPV-18. Probes were attached covalently on glass slides and hybridized with single-stranded DNA targets. Prior to hybridization with the probes, the target strands were pre-annealed with the three auxiliary contiguous oligonucleotides flanking the target sequences. HPV-18-positive cell lines and cervical samples were used to evaluate the performance of this HPV DNA microarray. Our results demonstrate that the HPV-18 variants hybridized specifically to probes, with no detection of unspecific signals. Specific probes successfully reveal detectable point mutations in these variants. The present DNA oligoarray system can be used as a reliable, sensitive and specific method for HPV-18 variant screening. Furthermore, this simple assay allows the use of inexpensive equipment, making it accessible in resource-poor settings.
A Low Density Microarray Method for the Identification of Human Papillomavirus Type 18 Variants

PubMed Central

Meza-Menchaca, Thuluz; Williams, John; Rodríguez-Estrada, Rocío B.; García-Bravo, Aracely; Ramos-Ligonio, Ángel; López-Monteon, Aracely; Zepeda, Rossana C.

2013-01-01

We describe a novel microarray-based method for the screening of oncogenic human papillomavirus 18 (HPV-18) molecular variants. Because sequencing methodology may underestimate samples containing more than one variant, we designed a specific and sensitive stacking DNA hybridization assay. This technology can be used to discriminate between three possible phylogenetic branches of HPV-18. Probes were attached covalently on glass slides and hybridized with single-stranded DNA targets. Prior to hybridization with the probes, the target strands were pre-annealed with the three auxiliary contiguous oligonucleotides flanking the target sequences. HPV-18-positive cell lines and cervical samples were used to evaluate the performance of this HPV DNA microarray. Our results demonstrate that the HPV-18 variants hybridized specifically to probes, with no detection of unspecific signals. Specific probes successfully reveal detectable point mutations in these variants. The present DNA oligoarray system can be used as a reliable, sensitive and specific method for HPV-18 variant screening. Furthermore, this simple assay allows the use of inexpensive equipment, making it accessible in resource-poor settings. PMID:24077317
Telomere extension by telomerase and ALT generates variant repeats by mechanistically distinct processes

PubMed Central

Lee, Michael; Hills, Mark; Conomos, Dimitri; Stutz, Michael D.; Dagg, Rebecca A.; Lau, Loretta M.S.; Reddel, Roger R.; Pickett, Hilda A.

2014-01-01

Telomeres are terminal repetitive DNA sequences on chromosomes, and are considered to comprise almost exclusively hexameric TTAGGG repeats. We have evaluated telomere sequence content in human cells using whole-genome sequencing followed by telomere read extraction in a panel of mortal cell strains and immortal cell lines. We identified a wide range of telomere variant repeats in human cells, and found evidence that variant repeats are generated by mechanistically distinct processes during telomerase- and ALT-mediated telomere lengthening. Telomerase-mediated telomere extension resulted in biased repeat synthesis of variant repeats that differed from the canonical sequence at positions 1 and 3, but not at positions 2, 4, 5 or 6. This indicates that telomerase is most likely an error-prone reverse transcriptase that misincorporates nucleotides at specific positions on the telomerase RNA template. In contrast, cell lines that use the ALT pathway contained a large range of variant repeats that varied greatly between lines. This is consistent with variant repeats spreading from proximal telomeric regions throughout telomeres in a stochastic manner by recombination-mediated templating of DNA synthesis. The presence of unexpectedly large numbers of variant repeats in cells utilizing either telomere maintenance mechanism suggests a conserved role for variant sequences at human telomeres. PMID:24225324
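The position-wise classification of variant repeats relative to the canonical TTAGGG hexamer can be sketched as follows. This is a simplified illustration, not the authors' read-extraction method: it assumes the read begins in frame with the repeat and contains no insertions or deletions, and the example read is invented.

```python
from collections import Counter

CANONICAL = "TTAGGG"

def variant_positions(hexamer):
    """1-based positions at which a hexamer differs from TTAGGG."""
    return [
        i + 1
        for i, (a, b) in enumerate(zip(hexamer, CANONICAL))
        if a != b
    ]

def tally_single_base_variants(read):
    """Split a read into consecutive in-frame hexamers and count, per
    position (1-6), the hexamers that differ from TTAGGG at exactly
    that one position."""
    counts = Counter()
    for i in range(0, len(read) - 5, 6):
        diffs = variant_positions(read[i:i + 6])
        if len(diffs) == 1:
            counts[diffs[0]] += 1
    return counts

# Invented read: canonical, position-1 variant, position-3 variant, canonical.
read = "TTAGGG" + "GTAGGG" + "TTGGGG" + "TTAGGG"
tally = tally_single_base_variants(read)
```

A tally concentrated at particular positions, such as positions 1 and 3 in the abstract, is the kind of bias this sort of count would expose.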
Characterization of alanine to valine sequence variants in the Fc region of nivolumab biosimilar produced in Chinese hamster ovary cells.

PubMed

Li, Yantao; Fu, Tuo; Liu, Tao; Guo, Huaizu; Guo, Qingcheng; Xu, Jin; Zhang, Dapeng; Qian, Weizhu; Dai, Jianxin; Li, Bohua; Guo, Yajun; Hou, Sheng; Wang, Hao

2016-07-01

Nivolumab is a therapeutic fully human IgG4 antibody to programmed death 1 (PD-1). In this study, a nivolumab biosimilar, which was produced in our laboratory, was analyzed and characterized. Sequence variants that contain undesired amino acid sequences may cause concern during biosimilar bioprocess development. We found that low levels of sequence variants were detected in the heavy chain of the nivolumab biosimilar by ultra performance liquid chromatography (UPLC) and tandem mass spectrometry. It was further identified with UPLC-MS/MS by IdeS or trypsin digestion. The sequence variant was confirmed through addition of synthetic mutant peptide. Subsequently, the mixing base signal of normal and mutant sequence was detected through DNA sequencing. The relative levels of mutant A424V in the Fc region of the heavy chain have been detected and demonstrated to be 12.25% and 13.54%, via base peak intensity (BPI) and UV chromatography of the tryptic peptide mapping, respectively. A424V variant was also quantified by real-time PCR (RT-PCR) at the DNA and RNA level, which was 19.2% and 16.8%, respectively. The relative content of the mutant was consistent at the DNA, RNA and protein level, indicating that the A424V mutation may have little influence at transcriptional or translational levels. These results demonstrate that orthogonal state-of-the-art techniques such as LC-UV-MS and RT-PCR should be implemented to characterize recombinant proteins and cell lines for development of biosimilars. Our study suggests that it is important to establish an integrated and effective analytical method to monitor and characterize sequence variants during antibody drug development, especially for antibody biosimilar products.
A statistical method for the detection of variants from next-generation resequencing of DNA pools.

PubMed

Bansal, Vikas

2010-06-15

Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
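The two kinds of evidence this abstract describes, (i) differences in allele counts across pools and (ii) the probability of seeing several non-reference calls from sequencing error alone, can be illustrated with standard statistics. The sketch below is not CRISP's implementation: it substitutes a plain Pearson chi-square statistic for the contingency-table comparison and a binomial tail for the error model, and the function names, error rate and counts are hypothetical.

```python
from math import comb

def prob_at_least_k_errors(n_reads, k_alt, error_rate):
    """P(X >= k_alt) for X ~ Binomial(n_reads, error_rate): the chance that
    sequencing error alone yields k_alt or more non-reference calls."""
    below = sum(
        comb(n_reads, i) * error_rate**i * (1 - error_rate) ** (n_reads - i)
        for i in range(k_alt)
    )
    return 1.0 - below

def chi_square_statistic(table):
    """Pearson chi-square statistic for a pools-by-allele count table.

    Each row is one pool: (reference_count, alternate_count).
    Every column total must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: 5 alternate calls in 100 reads at a 1% error rate.
p_error_only = prob_at_least_k_errors(100, 5, 0.01)

# Hypothetical pools: identical allele counts vs. one pool carrying the variant.
stat_same = chi_square_statistic([[90, 10], [90, 10]])
stat_diff = chi_square_statistic([[100, 0], [80, 20]])
```

A small tail probability argues against error as the sole explanation, and a large statistic indicates that the alternate allele is unevenly distributed across pools, which is what a real variant carried by few individuals would produce.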
Simple, multiplexed, PCR-based barcoding of DNA enables sensitive mutation detection in liquid biopsies using sequencing.

PubMed

Ståhlberg, Anders; Krzyzanowski, Paul M; Jackson, Jennifer B; Egyud, Matthew; Stein, Lincoln; Godfrey, Tony E

2016-06-20

Detection of cell-free DNA in liquid biopsies offers great potential for use in non-invasive prenatal testing and as a cancer biomarker. Fetal and tumor DNA fractions however can be extremely low in these samples and ultra-sensitive methods are required for their detection. Here, we report an extremely simple and fast method for introduction of barcodes into DNA libraries made from 5 ng of DNA. Barcoded adapter primers are designed with an oligonucleotide hairpin structure to protect the molecular barcodes during the first rounds of polymerase chain reaction (PCR) and prevent them from participating in mis-priming events. Our approach enables high-level multiplexing and next-generation sequencing library construction with flexible library content. We show that uniform libraries of 1-, 5-, 13- and 31-plex can be generated. Utilizing the barcodes to generate consensus reads for each original DNA molecule reduces background sequencing noise and allows detection of variant alleles below 0.1% frequency in clonal cell line DNA and in cell-free plasma DNA. Thus, our approach bridges the gap between the highly sensitive but specific capabilities of digital PCR, which only allows a limited number of variants to be analyzed, with the broad target capability of next-generation sequencing which traditionally lacks the sensitivity to detect rare variants. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
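The step of using molecular barcodes to build a consensus read for each original DNA molecule can be sketched as below. This is a minimal illustration, not the authors' software: the minimum family size, the rule of skipping families with unequal read lengths, and the example reads are all assumptions.

```python
from collections import Counter, defaultdict

def consensus_by_barcode(reads, min_family_size=3):
    """Group reads by molecular barcode and return one majority-vote
    consensus sequence per barcode family.

    reads: iterable of (barcode, sequence) pairs.
    Families smaller than min_family_size, or with reads of unequal
    length, are skipped.
    """
    families = defaultdict(list)
    for barcode, seq in reads:
        families[barcode].append(seq)

    result = {}
    for barcode, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        if len({len(s) for s in seqs}) != 1:
            continue
        result[barcode] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return result

# Invented reads: one family of three with a single error, one singleton.
reads = [
    ("AAC", "ACGT"),
    ("AAC", "ACGT"),
    ("AAC", "ACTT"),  # error at the third base, outvoted by the other two
    ("GGT", "ACGT"),  # family of one, dropped
]
consensus_reads = consensus_by_barcode(reads)
```

Because an error introduced during amplification or sequencing usually appears in only part of a barcode family, majority voting removes it, which is how consensus reads lower the background noise that otherwise limits detection of low-frequency alleles.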
Supplementation of Nucleosides During Selection can Reduce Sequence Variant Levels in CHO Cells Using GS/MSX Selection System.

PubMed

Tang, Danming; Lam, Cynthia; Louie, Salina; Hoi, Kam Hon; Shaw, David; Yim, Mandy; Snedecor, Brad; Misaghi, Shahram

2018-01-01

In the process of generating stable monoclonal antibody (mAb) producing cell lines, reagents such as methotrexate (MTX) or methionine sulfoximine (MSX) are often used. However, using such selection reagent(s) increases the likelihood of sequence variants in the expressed antibody molecules due to the effects of MTX or MSX on de novo nucleotide synthesis. Since MSX inhibits glutamine synthetase (GS) and results in both amino acid and nucleoside starvation, it was asked whether supplementing nucleosides into the media could lower sequence variant levels without affecting titer. The results show that the supplementation of nucleosides to the media during MSX selection decreased genomic DNA mutagenesis rates in the selected cells, probably by reducing nucleotide mis-incorporation into the DNA. Furthermore, addition of nucleosides enhanced clone recovery post selection and did not affect antibody expression. It was further observed that nucleoside supplements lowered DNA mutagenesis rates only at the initial stage of the clone selection and did not have any effect on DNA mutagenesis rates after stable cell lines were established. Therefore, the data suggest that addition of nucleosides during early stages of MSX selection can lower sequence variant levels without affecting titer or clone stability in antibody expression. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Mosaic CREBBP mutation causes overlapping clinical features of Rubinstein–Taybi and Filippi syndromes

PubMed Central

de Vries, Tamar I; R Monroe, Glen; van Belzen, Martine J; van der Lans, Christian A; Savelberg, Sanne MC; Newman, William G; van Haaften, Gijs; Nievelstein, Rutger A; van Haelst, Mieke M

2016-01-01

Rubinstein–Taybi syndrome (RTS, OMIM 180849) and Filippi syndrome (FLPIS, OMIM 272440) are both rare syndromes, with multiple congenital anomalies and intellectual deficit (MCA/ID). We present a patient with intellectual deficit, short stature, bilateral syndactyly of hands and feet, broad thumbs, ocular abnormalities, and dysmorphic facial features. These clinical features suggest both RTS and FLPIS. Initial analysis of DNA isolated from blood did not identify variants to confirm either of these syndrome diagnoses. Whole-exome sequencing identified a homozygous variant in C9orf173, which was novel at the time of analysis. Further Sanger sequencing analysis of FLPIS cases that had tested negative for CKAP2L variants did not, however, reveal any further variants. Subsequent analysis using DNA isolated from buccal mucosa revealed a mosaic variant in CREBBP. This report highlights the importance of excluding mosaic variants in patients with a strong but atypical clinical presentation of an MCA/ID syndrome if no disease-causing variants can be detected in DNA isolated from blood samples. As the striking syndactyly observed in the present case is typical for FLPIS, we suggest CREBBP analysis in saliva samples for FLPIS cases in which no causal CKAP2L variant is detected. PMID:26956253

Analysis of Multiallelic CNVs by Emulsion Haplotype Fusion PCR.

PubMed

Tyson, Jess; Armour, John A L

2017-01-01

Emulsion-fusion PCR recovers long-range sequence information by combining products in cis from individual genomic DNA molecules. Emulsion droplets act as very numerous small reaction chambers in which different PCR products from a single genomic DNA molecule are condensed into short joint products, to unite sequences in cis from widely separated genomic sites. These products can therefore provide information about the arrangement of sequences and variants at a larger scale than established long-read sequencing methods. The method has been useful in defining the phase of variants in haplotypes, the typing of inversions, and determining the configuration of sequence variants in multiallelic CNVs. In this description we outline the rationale for the application of emulsion-fusion PCR methods to the analysis of multiallelic CNVs, and give practical details for our own implementation of the method in that context.
Mitochondrial pathology in inclusion body myositis.

PubMed

Lindgren, Ulrika; Roos, Sara; Hedberg Oldfors, Carola; Moslemi, Ali-Reza; Lindberg, Christopher; Oldfors, Anders

2015-04-01

Inclusion body myositis (IBM) is usually associated with a large number of cytochrome c oxidase (COX)-deficient muscle fibers and acquired mitochondrial DNA (mtDNA) deletions. We studied the number of COX-deficient fibers and the amount of mtDNA deletions, and whether variants in nuclear genes involved in mtDNA maintenance may contribute to the occurrence of mtDNA deletions in IBM muscle. Twenty-six IBM patients were included. COX-deficient fibers were assayed by morphometry and mtDNA deletions by qPCR. POLG was analyzed in all patients by Sanger sequencing and C10orf2 (Twinkle), DNA2, MGME1, OPA1, POLG2, RRM2B, SLC25A4 and TYMP in six patients by next generation sequencing. Patients with many COX-deficient muscle fibers had a significantly higher proportion of mtDNA deletions than patients with few COX-deficient fibers. We found previously unreported variants in POLG and C10orf2, and IBM patients had a significantly higher frequency of an RRM2B variant than controls. POLG variants appeared more common in IBM patients with many COX-deficient fibers, but the difference was not statistically significant. We conclude that COX-deficient fibers in inclusion body myositis are associated with multiple mtDNA deletions. In IBM patients we found novel and also previously reported variants in genes of importance for mtDNA maintenance that warrant further studies. Copyright © 2014 Elsevier B.V. All rights reserved.
Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift

PubMed Central

Cingolani, Pablo; Patel, Viral M.; Coon, Melissa; Nguyen, Tung; Land, Susan J.; Ruden, Douglas M.; Lu, Xiangyi

2012-01-01

This paper describes a new program SnpSift for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels) in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate pre-mature stop codon mutation in each of the two allelic mutants whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals. 
PMID:22435069
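The filtering step described in the abstract above, selecting genes that carry a potentially disruptive variant in every allelic mutant strain, can be sketched in plain Python. This does not reproduce SnpSift's own interface: the record layout, the function name, and the set of effect labels treated as disruptive are assumptions for illustration.

```python
def shared_disrupted_genes(
    variants_by_strain,
    disruptive_effects=frozenset({"stop_gained", "frameshift_variant"}),
):
    """Return genes carrying a disruptive variant in every strain.

    variants_by_strain: dict mapping strain name to a list of annotated
    variant records, each a dict with "gene" and "effect" keys
    (a hypothetical layout, not a SnpEff/SnpSift format).
    """
    gene_sets = [
        {v["gene"] for v in variants if v["effect"] in disruptive_effects}
        for variants in variants_by_strain.values()
    ]
    # Allelic mutants should disrupt the same gene, possibly through
    # different lesions, so intersect the per-strain gene sets.
    return set.intersection(*gene_sets) if gene_sets else set()
```

Intersecting at the gene level rather than the variant level matters here, because the abstract reports a separate premature stop codon in each of the two allelic mutants.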
The germline variants in DNA repair genes in pediatric medulloblastoma: a challenge for current therapeutic strategies.

PubMed

Trubicka, Joanna; Żemojtel, Tomasz; Hecht, Jochen; Falana, Katarzyna; Piekutowska-Abramczuk, Dorota; Płoski, Rafał; Perek-Polnik, Marta; Drogosiewicz, Monika; Grajkowska, Wiesława; Ciara, Elżbieta; Moszczyńska, Elżbieta; Dembowska-Bagińska, Bożenna; Perek, Danuta; Chrzanowska, Krystyna H; Krajewska-Walasek, Małgorzata; Łastowska, Maria

2017-04-04

The defects in DNA repair genes are potentially linked to development and response to therapy in medulloblastoma. Therefore the purpose of this study was to establish the spectrum and frequency of germline variants in selected DNA repair genes and their impact on response to chemotherapy in medulloblastoma patients. The following genes were investigated in 102 paediatric patients: MSH2 and RAD50 using targeted gene panel sequencing and NBN variants (p.I171V and p.K219fs*19) by Sanger sequencing. In three patients with presence of rare life-threatening adverse events (AE) and no detected variants in the analyzed genes, whole exome sequencing was performed. Based on combination of molecular and immunohistochemical evaluations tumors were divided into molecular subgroups. Presence of variants was tested for potential association with the occurrence of rare life-threatening AE and other clinical features. We have identified altogether six new potentially pathogenic variants in MSH2 (p.A733T and p.V606I), RAD50 (p.R1093*), FANCM (p.L694*), ERCC2 (p.R695C) and EXO1 (p.V738L), in addition to two known NBN variants. Five out of twelve patients with defects in either of MSH2, RAD50 and NBN genes suffered from rare life-threatening AE, more frequently than in control group (p = 0.0005). When all detected variants were taken into account, the majority of patients (8 out of 15) suffered from life-threatening toxicity during chemotherapy. Our results, based on the largest systematic study performed in a clinical setting, provide preliminary evidence for a link between defects in DNA repair genes and treatment related toxicity in children with medulloblastoma. The data suggest that patients with DNA repair gene variants could need special vigilance during and after courses of chemotherapy.
Molecular diagnosis of populational variants of Anthonomus grandis (Coleoptera: Curculionidae) in North America.

PubMed

Barr, Norman; Ruiz-Arce, Raul; Obregón, Oscar; De Leon, Rosita; Foster, Nelson; Reuter, Chris; Boratynski, Theodore; Vacek, Don

2013-02-01

The utility of the cytochrome oxidase I (COI) DNA sequence used for DNA barcoding and a Sequence Characterized Amplified Region for diagnosing boll weevil, Anthonomus grandis Boheman, variants was evaluated. Maximum likelihood analysis of COI DNA sequences from 154 weevils collected from the United States and Mexico supports previous evidence for limited gene flow between weevil populations on wild cotton and commercial cotton in northern Mexico and southern United States. The wild cotton populations represent a variant of the species called the thurberia weevil, which is not regarded as a significant pest. The 31 boll weevil COI haplotypes observed in the study form two distinct haplogroups (A and B) that are supported by five fixed nucleotide differences and a phylogenetic analysis. Although wild and commercial cotton populations are closely associated with specific haplogroups, there is not a fixed difference between the thurberia weevil variant and other populations. The Sequence Characterized Amplified Region marker generated a larger number of inconclusive results than the COI gene but also supported evidence of shared genotypes between wild and commercial cotton weevil populations. These methods provide additional markers that can assist in the identification of pest weevil populations but not definitively diagnose samples.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ruggles, Kelly V.; Tang, Zuojian; Wang, Xuya

Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations and splice variants identified in cancer cells are translated. Herein we therefore describe a proteogenomic data integration tool (QUILTS) and illustrate its application to whole genome, transcriptome and global MS peptide sequence datasets generated from a pair of luminal and basal-like breast cancer patient derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS process replicates. Despite over thirty sample replicates, only about 10% of all SNV (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNV without a detectable mRNA transcript were also observed, demonstrating that the transcriptome coverage was also incomplete (~80%). In contrast to germ-line variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, the QUILTS program integrates DNA, RNA and peptide sequencing to assess the degree to which somatic mutations are translated and therefore biologically active. By identifying gaps in sequence coverage QUILTS benchmarks current technology and assesses progress towards whole cancer proteome and transcriptome analysis.
A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family.

PubMed

Lucotte, Gérard

2010-10-04

This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples; thus in this database the mutation frequency was about 0.00008 (0.008%). This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon.
A rare variant of the mtDNA HVS1 sequence in the hairs of Napoléon's family

PubMed Central

2010-01-01

This paper describes the finding of a rare variant in the sequence of the hypervariable segment (HVS1) of mitochondrial DNA (mtDNA) extracted from two preserved hairs, authenticated as belonging to the French Emperor Napoléon I (Napoléon Bonaparte). This rare variant is a mutation that changes the base C to T at position 16,184 (16184C→T), and it constitutes the only mutation found in this HVS1 sequence. This mutation is rare, because it was not found in a reference database (P < 0.05). In a personal database (M. Pala) comprising 37,000 different sequences, the 16184C→T mutation was found in only three samples; thus in this database the mutation frequency was about 0.00008 (0.008%). This mutation 16184C→T was also the only variant found subsequently in the HVS1 sequences of mtDNAs extracted from Napoléon's mother (Letizia) and from his youngest sister (Caroline), confirming that this mutation is maternally inherited. This 16184C→T variant could be used for genetic verification to authenticate any doubtful material and determine whether it should indeed be attributed to Napoléon. PMID:21092341
Mitochondrial DNA variant at HVI region as a candidate of genetic markers of type 2 diabetes

NASA Astrophysics Data System (ADS)

Gumilar, Gun Gun; Purnamasari, Yunita; Setiadi, Rahmat

2016-02-01

Mitochondrial DNA (mtDNA) is maternally inherited, and mtDNA mutations can contribute to the excess of maternal inheritance of type 2 diabetes. Due to its high mutation rate, one of the areas in the mtDNA that is often associated with the disease is the hypervariable region I (HVI). Therefore, this study was conducted to determine the genetic variants of human mtDNA HVI that are related to type 2 diabetes in four samples taken from four generations in one lineage. The steps taken include the lysis of hair follicles, amplification of the mtDNA HVI fragment using Polymerase Chain Reaction (PCR), detection of PCR products by agarose gel electrophoresis, measurement of the mtDNA concentration using a UV-Vis spectrophotometer, determination of the nucleotide sequence by direct sequencing, and analysis of the sequencing results using the SeqMan DNASTAR program. Comparison between the nucleotide sequences of the samples and the revised Cambridge Reference Sequence (rCRS) revealed the same six mutations in all samples: C16147T, T16189C, C16193del, T16127C, A16235G, and A16293C. After comparing the data obtained to secondary data from Mitomap and NCBI, it was found that two mutations, T16189C and T16217C, are candidate genetic markers of type 2 diabetes, even though the mutations were also found in the generations without a type 2 diabetes diagnosis. The results of this study are expected to contribute to the collection of human mtDNA database entries for genetic variants associated with metabolic diseases, so that in the future they can be utilized in various fields, especially in medicine.
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tsodikov, Oleg V.; Biswas, Tapan

An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
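As a rough illustration of how the high-affinity motif [T/C][T/A][G/A]TCCACA from the abstract above could be used to map candidate DnaA boxes, the sketch below scans both strands of a sequence with a regular expression. The function names and the 0-based coordinate convention are assumptions for illustration; the authors' own mapping procedure is not described in the abstract.

```python
import re

# The 9-bp motif from the abstract, written as a character-class regex.
# A lookahead is used so that overlapping matches are also reported.
DNAA_BOX = re.compile(r"(?=([TC][TA][GA]TCCACA))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_dnaa_boxes(sequence):
    """Return sorted (start, strand, site) hits on both strands.

    start is the 0-based position of the leftmost base of the site on
    the forward strand; site is reported 5'->3' on its own strand.
    """
    sequence = sequence.upper()
    n = len(sequence)
    hits = [(m.start(), "+", m.group(1)) for m in DNAA_BOX.finditer(sequence)]
    for m in DNAA_BOX.finditer(reverse_complement(sequence)):
        # Convert the reverse-strand offset back to forward coordinates.
        hits.append((n - (m.start() + 9), "-", m.group(1)))
    return sorted(hits)
```

Since the abstract notes that boxes in mycobacterial origins often deviate at the first 2 bp, a more permissive scan could relax those two positions, at the cost of more spurious matches.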
Comparison of Ion Personal Genome Machine Platforms for the Detection of Variants in BRCA1 and BRCA2.

PubMed

Hwang, Sang Mee; Lee, Ki Chan; Lee, Min Seob; Park, Kyoung Un

2018-01-01

Transition to next generation sequencing (NGS) for BRCA1/BRCA2 analysis in clinical laboratories is ongoing, but different platforms and/or data analysis pipelines give different results, which creates difficulties in implementation. We have evaluated the Ion Personal Genome Machine (PGM) Platforms (Ion PGM, Ion PGM Dx, Thermo Fisher Scientific) for the analysis of BRCA1/2. The results of Ion PGM with OTG-snpcaller, a pipeline based on Torrent mapping alignment program and Genome Analysis Toolkit, from 75 clinical samples and 14 reference DNA samples were compared with Sanger sequencing for BRCA1/BRCA2. Ten clinical samples and 14 reference DNA samples were additionally sequenced by Ion PGM Dx with Torrent Suite. Fifty types of variants, including 18 pathogenic variants or variants of unknown significance, were identified from 75 clinical samples, and known variants of the reference samples were confirmed by Sanger sequencing and/or NGS. One false-negative result was present for Ion PGM/OTG-snpcaller: an indel variant misidentified as a single nucleotide variant. However, eight discordant results were present for Ion PGM Dx/Torrent Suite, with both false-positive and false-negative results. A 40-bp deletion, a 4-bp deletion and a 1-bp deletion variant were not called, and a false-positive deletion was identified. Four other variants were misidentified as another variant. Ion PGM/OTG-snpcaller showed acceptable performance with good concordance with Sanger sequencing. However, Ion PGM Dx/Torrent Suite showed many discrepant results and is not suitable for use in a clinical laboratory without further optimization of the data analysis for calling variants.
FA-SAT Is an Old Satellite DNA Frozen in Several Bilateria Genomes

PubMed Central

Chaves, Raquel; Ferreira, Daniela; Mendes-da-Silva, Ana; Meles, Susana; Adega, Filomena

2017-01-01

In recent years, a growing body of evidence has recognized the tandem repeat sequences, and specifically satellite DNA, as a functional class of sequences in the genomic “dark matter.” Using an original, complementary, and thus an eclectic experimental design, we show that the cat archetypal satellite DNA sequence, FA-SAT, is “frozen” conservatively in several Bilateria genomes. We found different genomic FA-SAT architectures, and the interspersion pattern was conserved. In Carnivora genomes, the FA-SAT-related sequences are also amplified, with the predominance of a specific FA-SAT variant, at the heterochromatic regions. We inspected the cat genome project to locate FA-SAT array flanking regions and revealed an intensive intermingling with transposable elements. Our results also show that FA-SAT-related sequences are transcribed and that the most abundant FA-SAT variant is not always the most transcribed. We thus conclude that the DNA sequences of FA-SAT and their transcripts are “frozen” in these genomes. Future work is needed to disclose any putative function that these sequences may play in these genomes. PMID:29608678
Analysis of protein-coding genetic variation in 60,706 humans.

PubMed

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G

2016-08-18

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
Role of Mitochondrial Inheritance on Prostate Cancer Outcome in African American Men. Addendum

DTIC Science & Technology

2016-11-01

DNA sequencing technique developed by our collaborator using single amplicon long-range PCR that permits deep coverage (10,000-20,000X on average) of...the mitochondrial genome. We have sequenced 652 samples derived from frozen fully using this technology. The additional DNA samples derived from...paraffin embedded (FFPE) tissue were more challenging, but have now been sequenced . Mapping of DNA variants in our sequenced genomes to mitochondrial
Genomics in Cardiovascular Disease

PubMed Central

Roberts, Robert; Marian, A.J.; Dandona, Sonny; Stewart, Alexandre F.R.

2013-01-01

A paradigm shift towards biology occurred in the 1990s, subsequently catalyzed by the sequencing of the human genome in 2000. The cost of DNA sequencing has gone from millions to thousands of dollars, with sequencing of one’s entire genome costing only $1,000. Rapid DNA sequencing is being embraced for single gene disorders, particularly for sporadic cases and those from small families. Transmission of lethal genes, such as those associated with Huntington’s disease, to one’s offspring can be avoided through in-vitro fertilization. DNA sequencing will meet the challenge of elucidating the genetic predisposition for common polygenic diseases, especially in determining the function of the novel common genetic risk variants and identifying the rare variants, which may also partially ascertain the source of the missing heritability. The challenge for DNA sequencing remains great: despite human genome sequences being 99.5% identical, the 3 million single nucleotide polymorphisms (SNPs) responsible for most of the unique features add up to 60 new mutations per person which, for 7 billion people, is 420 billion mutations. It is claimed that DNA sequencing has increased 10,000 fold while information storage and retrieval has increased only 16 fold. The physician and health user will be challenged by the convergence of two major trends, whole genome sequencing and the storage/retrieval and integration of the data. PMID:23524054
Mapping Ribonucleotides Incorporated into DNA by Hydrolytic End-Sequencing.

PubMed

Orebaugh, Clinton D; Lujan, Scott A; Burkholder, Adam B; Clausen, Anders R; Kunkel, Thomas A

2018-01-01

Ribonucleotides embedded within DNA render the DNA sensitive to the formation of single-stranded breaks under alkali conditions. Here, we describe a next-generation sequencing method called hydrolytic end sequencing (HydEn-seq) to map ribonucleotides inserted into the genome of Saccharomyces cerevisiae strains deficient in ribonucleotide excision repair. We use this method to map several genomic features in wild-type and replicase variant yeast strains.
Mitochondrial Mutations in Subjects with Psychiatric Disorders

PubMed Central

Magnan, Christophe; van Oven, Mannis; Baldi, Pierre; Myers, Richard M.; Barchas, Jack D.; Schatzberg, Alan F.; Watson, Stanley J.; Akil, Huda; Bunney, William E.; Vawter, Marquis P.

2015-01-01

A considerable body of evidence supports the role of mitochondrial dysfunction in psychiatric disorders, and mitochondrial DNA (mtDNA) mutations are known to alter brain energy metabolism and neurotransmission, and to cause neurodegenerative disorders. Genetic studies focusing on common nuclear genome variants associated with these disorders have produced genome wide significant results but those studies have not directly studied mtDNA variants. The purpose of this study is to investigate, using next generation sequencing, the involvement of mtDNA variation in bipolar disorder, schizophrenia, major depressive disorder, and methamphetamine use. MtDNA extracted from multiple brain regions and blood was sequenced (121 mtDNA samples with an average of 8,800x coverage) and compared to an electronic database containing 26,850 mtDNA genomes. We confirmed novel and rare variants, and confirmed next generation sequencing error hotspots by traditional sequencing and genotyping methods. We observed a significant increase of non-synonymous mutations found in individuals with schizophrenia. Novel and rare non-synonymous mutations were found in psychiatric cases in mtDNA genes: ND6, ATP6, CYTB, and ND2. We also observed mtDNA heteroplasmy in brain at a locus previously associated with schizophrenia (T16519C). Large differences in heteroplasmy levels across brain regions within subjects suggest that somatic mutations accumulate differentially in brain regions. Finally, multiplasmy, a heteroplasmic measure of repeat length, was observed in brain from selective cases at a higher frequency than controls. These results offer support for increased rates of mtDNA substitutions in schizophrenia shown in our prior results. The variable levels of heteroplasmic/multiplasmic somatic mutations that occur in brain may be indicators of genetic instability in mtDNA. PMID:26011537
The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants

PubMed Central

Reuter, Miriam S.; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K.C.; Trost, Brett; Paton, Tara A.; Pereira, Sergio L.; Herbrick, Jo-Anne; Wintle, Richard F.; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R.; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W.L.; Wang, Zhuozhi; Patel, Rohan V.; Pellecchia, Giovanna; Wei, John; Strug, Lisa J.; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M.; Bassett, Anne S.; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D.; Stavropoulos, Dimitri J.; Bowdin, Sarah; Hildebrandt, Matthew R.; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M. Stephen; Monfared, Nasim; Hosseini, S. Mohsen; Joseph-George, Ann M.; Keeley, Fred W.; Cook, Ryan A.; Fiume, Marc; Lee, Hin C.; Marshall, Christian R.; Davies, Jill; Hazell, Allison; Buchanan, Janet A.; Szego, Michael J.; Scherer, Stephen W.

2018-01-01

BACKGROUND: The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. METHODS: Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. RESULTS: Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants — associated with cancer, cardiac or neurodegenerative phenotypes — remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. INTERPRETATION: Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. PMID:29431110
The Personal Genome Project Canada: findings from whole genome sequences of the inaugural 56 participants.

PubMed

Reuter, Miriam S; Walker, Susan; Thiruvahindrapuram, Bhooma; Whitney, Joe; Cohn, Iris; Sondheimer, Neal; Yuen, Ryan K C; Trost, Brett; Paton, Tara A; Pereira, Sergio L; Herbrick, Jo-Anne; Wintle, Richard F; Merico, Daniele; Howe, Jennifer; MacDonald, Jeffrey R; Lu, Chao; Nalpathamkalam, Thomas; Sung, Wilson W L; Wang, Zhuozhi; Patel, Rohan V; Pellecchia, Giovanna; Wei, John; Strug, Lisa J; Bell, Sherilyn; Kellam, Barbara; Mahtani, Melanie M; Bassett, Anne S; Bombard, Yvonne; Weksberg, Rosanna; Shuman, Cheryl; Cohn, Ronald D; Stavropoulos, Dimitri J; Bowdin, Sarah; Hildebrandt, Matthew R; Wei, Wei; Romm, Asli; Pasceri, Peter; Ellis, James; Ray, Peter; Meyn, M Stephen; Monfared, Nasim; Hosseini, S Mohsen; Joseph-George, Ann M; Keeley, Fred W; Cook, Ryan A; Fiume, Marc; Lee, Hin C; Marshall, Christian R; Davies, Jill; Hazell, Allison; Buchanan, Janet A; Szego, Michael J; Scherer, Stephen W

2018-02-05

The Personal Genome Project Canada is a comprehensive public data resource that integrates whole genome sequencing data and health information. We describe genomic variation identified in the initial recruitment cohort of 56 volunteers. Volunteers were screened for eligibility and provided informed consent for open data sharing. Using blood DNA, we performed whole genome sequencing and identified all possible classes of DNA variants. A genetic counsellor explained the implication of the results to each participant. Whole genome sequencing of the first 56 participants identified 207 662 805 sequence variants and 27 494 copy number variations. We analyzed a prioritized disease-associated data set (n = 1606 variants) according to standardized guidelines, and interpreted 19 variants in 14 participants (25%) as having obvious health implications. Six of these variants (e.g., in BRCA1 or mosaic loss of an X chromosome) were pathogenic or likely pathogenic. Seven were risk factors for cancer, cardiovascular or neurobehavioural conditions. Four other variants - associated with cancer, cardiac or neurodegenerative phenotypes - remained of uncertain significance because of discrepancies among databases. We also identified a large structural chromosome aberration and a likely pathogenic mitochondrial variant. There were 172 recessive disease alleles (e.g., 5 individuals carried mutations for cystic fibrosis). Pharmacogenomics analyses revealed another 3.9 potentially relevant genotypes per individual. Our analyses identified a spectrum of genetic variants with potential health impact in 25% of participants. When also considering recessive alleles and variants with potential pharmacologic relevance, all 56 participants had medically relevant findings. Although access is mostly limited to research, whole genome sequencing can provide specific and novel information with the potential of major impact for health care. © 2018 Joule Inc. or its licensors.
Novel variants of the 5S rRNA genes in Eruca sativa.

PubMed

Singh, K; Bhatia, S; Lakshmikumaran, M

1994-02-01

The 5S ribosomal RNA (rRNA) genes of Eruca sativa were cloned and characterized. They are organized into clusters of tandemly repeated units. Each repeat unit consists of a 119-bp coding region followed by a noncoding spacer region that separates it from the coding region of the next repeat unit. Our study reports novel gene variants of the 5S rRNA genes in plants. Two families of the 5S rDNA, the 0.5-kb size family and the 1-kb size family, coexist in the E. sativa genome. The 0.5-kb size family consists of the 5S rRNA genes (S4) that have coding regions similar to those of other reported plant 5S rDNA sequences, whereas the 1-kb size family consists of the 5S rRNA gene variants (S1) that exist as 1-kb BamHI tandem repeats. S1 is made up of two variant units (V1 and V2) of 5S rDNA where the BamHI site between the two units is mutated. Sequence heterogeneity among S4, V1, and V2 units exists throughout the sequence and is not limited to the noncoding spacer region only. The coding regions of V1 and V2 show approximately 20% dissimilarity to the coding regions of S4 and other reported plant 5S rDNA sequences. Such a large variation in the coding regions of the 5S rDNA units within the same plant species has been observed for the first time. Restriction site variation is observed between the two size classes of 5S rDNA in E. sativa.(ABSTRACT TRUNCATED AT 250 WORDS)

Unlocking hidden genomic sequence

PubMed Central

Keith, Jonathan M.; Cochran, Duncan A. E.; Lala, Gita H.; Adams, Peter; Bryant, Darryn; Mitchelson, Keith R.

2004-01-01

Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs. PMID:14973330
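As a rough illustration of why random mutation need not destroy the original sequence information, the Python sketch below simulates substitution-only variants of a toy target at the 18% level mentioned above and recovers the target by per-position majority vote. The majority-vote reconstruction, the function names `mutate` and `consensus`, the toy sequence, and the choice of 50 variants are all assumptions made for illustration; the abstract does not describe the authors' actual reconstruction method.

```python
import random
from collections import Counter

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Return a copy of seq with each base substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(variants):
    """Per-position majority vote across equal-length, pre-aligned variants."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*variants))

rng = random.Random(0)  # fixed seed so the toy run is repeatable
# Toy target (hypothetical): a poly(A) run, a GC-rich motif, an AT-rich motif.
target = "A" * 20 + "GCGCGGCCGC" + "ATATTATAAT"
variants = [mutate(target, 0.18, rng) for _ in range(50)]
recovered = consensus(variants)
print(recovered == target)
```

Each individual variant differs from the target, yet the information is spread across the set, so a simple vote is enough in this idealized, substitution-only setting. Real reads would also need alignment and error handling, which this sketch leaves out.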
Mapping DNA Methylation with High Throughput Nanopore Sequencing

PubMed Central

Rand, Arthur C.; Jain, Miten; Eizenga, Jordan M.; Musselman-Brown, Audrey; Olsen, Hugh E.; Akeson, Mark

2017-01-01

Chemical modifications to DNA regulate its biological function. We present a framework for mapping methylation to cytosine and adenosine with the Oxford Nanopore Technologies MinION using its ionic current signal. We map three cytosine variants and two adenine variants. The results show that our model is sensitive enough to detect changes in genomic DNA methylation levels as a function of growth phase in E. coli. PMID:28218897
The Fanconi anemia DNA damage repair pathway in the spotlight for germline predisposition to colorectal cancer.

PubMed

Esteban-Jurado, Clara; Franch-Expósito, Sebastià; Muñoz, Jenifer; Ocaña, Teresa; Carballal, Sabela; López-Cerón, Maria; Cuatrecasas, Miriam; Vila-Casadesús, Maria; Lozano, Juan José; Serra, Enric; Beltran, Sergi; Brea-Fernández, Alejandro; Ruiz-Ponte, Clara; Castells, Antoni; Bujanda, Luis; Garre, Pilar; Caldés, Trinidad; Cubiella, Joaquín; Balaguer, Francesc; Castellví-Bel, Sergi

2016-10-01

Colorectal cancer (CRC) is one of the most common neoplasms in the world. Fanconi anemia (FA) is a very rare genetic disease causing bone marrow failure, congenital growth abnormalities and cancer predisposition. The comprehensive FA DNA damage repair pathway requires the collaboration of 53 proteins and it is necessary to restore genome integrity by efficiently repairing damaged DNA. A link between FA genes in breast and ovarian cancer germline predisposition has been previously suggested. We selected 74 CRC patients from 40 unrelated Spanish families with strong CRC aggregation compatible with an autosomal dominant pattern of inheritance and without mutations in known hereditary CRC genes and performed germline DNA whole-exome sequencing with the aim of finding new candidate germline predisposition variants. After sequencing and data analysis, variant prioritization selected only those very rare alterations, producing a putative loss of function and located in genes with a role compatible with cancer. We detected an enrichment for variants in FA DNA damage repair pathway genes in our familial CRC cohort as 6 families carried heterozygous, rare, potentially pathogenic variants located in BRCA2/FANCD1, BRIP1/FANCJ, FANCC, FANCE and REV3L/POLZ. In conclusion, the FA DNA damage repair pathway may play an important role in the inherited predisposition to CRC.
SG-ADVISER mtDNA: a web server for mitochondrial DNA annotation with data from 200 samples of a healthy aging cohort.

PubMed

Rueda, Manuel; Torkamani, Ali

2017-08-18

Whole genome and exome sequencing usually include reads containing mitochondrial DNA (mtDNA). Yet, state-of-the-art pipelines and services for human nuclear genome variant calling and annotation do not handle mitochondrial genome data appropriately. As a consequence, any researcher desiring to add mtDNA variant analysis to their investigations is forced to explore the literature for mtDNA pipelines, evaluate them, and implement their own instance of the desired tool. This task is far from trivial, and can be prohibitive for non-bioinformaticians. We have developed SG-ADVISER mtDNA, a web server to facilitate the analysis and interpretation of mtDNA genomic data coming from next generation sequencing (NGS) experiments. The server was built in the context of our SG-ADVISER framework and on top of the MToolBox platform (Calabrese et al., Bioinformatics 30(21):3115-3117, 2014), and includes most of its functionalities (i.e., assembly of mitochondrial genomes, heteroplasmic fractions, haplogroup assignment, functional and prioritization analysis of mitochondrial variants) as well as a back-end and a front-end interface. The server has been tested with unpublished data from 200 individuals of a healthy aging cohort (Erikson et al., Cell 165(4):1002-1011, 2016) and their data is made publicly available here along with a preliminary analysis of the variants. We observed that individuals over ~90 years old carried low levels of heteroplasmic variants in their genomes. SG-ADVISER mtDNA is a fast and functional tool that allows for variant calling and annotation of human mtDNA data coming from NGS experiments. The server was built with simplicity in mind, and builds on our own experience in interpreting mtDNA variants in the context of sudden death and rare diseases. Our objective is to provide an interface for non-bioinformaticians aiming to acquire (or contrast) mtDNA annotations via MToolBox.
The SG-ADVISER web server is freely available to all users at https://genomics.scripps.edu/mtdna.
Molecular Cloning and Expression of Sequence Variants of Manganese Superoxide Dismutase Genes from Wheat

USDA-ARS's Scientific Manuscript database

Reactive oxygen species (ROS) are very harmful to living organisms due to the potential oxidation of membrane lipids, DNA, proteins, and carbohydrates. Transformed E.coli strain QC 871, superoxide dismutase (SOD) double-mutant, with three sequence variant MnSOD1, MnSOD2, and MnSOD3 manganese supero...
DNA methylation of the filaggrin gene adds to the risk of eczema associated with loss-of-function variants

PubMed Central

Ziyab, A. H.; Karmaus, W.; Holloway, J. W.; Zhang, H.; Ewart, S.; Arshad, S. H.

2012-01-01

Background: Loss-of-function variants within the filaggrin gene (FLG) are associated with a dysfunctional skin barrier that contributes to the development of eczema. Epigenetic modifications, such as DNA methylation, are genetic regulatory mechanisms that modulate gene expression without changing the DNA sequence. Objectives: To investigate whether genetic variants and adjacent differential DNA methylation within the FLG gene synergistically act on the development of eczema. Methods: A subsample (n = 245, only females aged 18 years) of the Isle of Wight birth cohort participants (n = 1,456) had available information for FLG variants R501X, 2282del4, and S3247X and DNA methylation levels for 10 CpG sites within the FLG gene. Log-binomial regression was used to estimate the risk ratios (RRs) of eczema associated with FLG variants at different methylation levels. Results: The period prevalence of eczema was 15.2% at age 18 years and 9.0% of participants were carriers (heterozygous) of FLG variants. Of the 10 CpG sites spanning the genomic region of FLG, methylation levels of CpG site ‘cg07548383’ showed a significant interaction with FLG sequence variants on the risk for eczema. At 86% methylation level, filaggrin haploinsufficient individuals had a 5.48-fold increased risk of eczema when compared to those with wild type FLG genotype (p-value = 0.0008). Conclusions: Our novel results indicated that the association between FLG loss-of-function variants and eczema is modulated by DNA methylation. Simultaneously assessing the joint effect of genetic and epigenetic factors within the FLG gene further highlights the importance of this genomic region for eczema manifestation. PMID:23003573
An efficient and scalable analysis framework for variant extraction and refinement from population-scale DNA sequence data.

PubMed

Jun, Goo; Wing, Mary Kate; Abecasis, Gonçalo R; Kang, Hyun Min

2015-06-01

The analysis of next-generation sequencing data is computationally and statistically challenging because of the massive volume of data and imperfect data quality. We present GotCloud, a pipeline for efficiently detecting and genotyping high-quality variants from large-scale sequencing data. GotCloud automates sequence alignment, sample-level quality control, variant calling, filtering of likely artifacts using machine-learning techniques, and genotype refinement using haplotype information. The pipeline can process thousands of samples in parallel and requires less computational resources than current alternatives. Experiments with whole-genome and exome-targeted sequence data generated by the 1000 Genomes Project show that the pipeline provides effective filtering against false positive variants and high power to detect true variants. Our pipeline has already contributed to variant detection and genotyping in several large-scale sequencing projects, including the 1000 Genomes Project and the NHLBI Exome Sequencing Project. We hope it will now prove useful to many medical sequencing studies. © 2015 Jun et al.; Published by Cold Spring Harbor Laboratory Press.
Engineering of a DNA Polymerase for Direct m6A Sequencing.

PubMed

Aschenbrenner, Joos; Werner, Stephan; Marchand, Virginie; Adam, Martina; Motorin, Yuri; Helm, Mark; Marx, Andreas

2018-01-08

Methods for the detection of RNA modifications are of fundamental importance for advancing epitranscriptomics. N6-methyladenosine (m6A) is the most abundant RNA modification in mammalian mRNA and is involved in the regulation of gene expression. Current detection techniques are laborious and rely on antibody-based enrichment of m6A-containing RNA prior to sequencing, since m6A modifications are generally "erased" during reverse transcription (RT). To overcome the drawbacks associated with indirect detection, we aimed to generate novel DNA polymerase variants for direct m6A sequencing. Therefore, we developed a screen to evolve an RT-active KlenTaq DNA polymerase variant that sets a mark for N6-methylation. We identified a mutant that exhibits increased misincorporation opposite m6A compared to unmodified A. Application of the generated DNA polymerase in next-generation sequencing allowed the identification of m6A sites directly from the sequencing data of untreated RNA samples. © 2017 The Authors. Published by Wiley-VCH Verlag GmbH & Co. KGaA.
[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions was published in 2005, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we review current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
Fanconi anemia and homologous recombination gene variants are associated with functional DNA repair defects in vitro and poor outcome in patients with advanced head and neck squamous cell carcinoma

PubMed Central

Verhagen, Caroline V.M.; Vossen, David M.; Borgmann, Kerstin; Hageman, Floor; Grénman, Reidar; Verwijs-Janssen, Manon; Mout, Lisanne; Kluin, Roel J.C.; Nieuwland, Marja; Severson, Tesa M.; Velds, Arno; Kerkhoven, Ron; O’Connor, Mark J.; van der Heijden, Martijn; van Velthuysen, Marie-Louise; Verheij, Marcel; Wreesmann, Volkert B.; Wessels, Lodewyk F.A.; van den Brekel, Michiel W.M.; Vens, Conchita

2018-01-01

Mutations in Fanconi Anemia or Homologous Recombination (FA/HR) genes can cause DNA repair defects and could therefore impact cancer treatment response and patient outcome. Their functional impact and clinical relevance in head and neck squamous cell carcinoma (HNSCC) is unknown. We therefore questioned whether functional FA/HR defects occurred in HNSCC and whether they are associated with FA/HR variants. We assayed a panel of 29 patient-derived HNSCC cell lines and found that a considerable fraction is hypersensitive to the crosslinker Mitomycin C and PARP inhibitors, a functional measure of FA/HR defects. DNA sequencing showed that these hypersensitivities are associated with the presence of bi-allelic rare germline and somatic FA/HR gene variants. We next questioned whether such variants are associated with prognosis and treatment response in HNSCC patients. DNA sequencing of 77 advanced stage HNSCC tumors revealed a 19% incidence of such variants. Importantly, these variants were associated with a poor prognosis (p = 0.027; HR = 2.6, 1.1–6.0) but favorable response to high cumulative cisplatin dose. We show how an integrated in vitro functional repair and genomic analysis can improve the prognostic value of genetic biomarkers. We conclude that repair defects are marked and frequent in HNSCC and are associated with clinical outcome. PMID:29719599
Fanconi anemia and homologous recombination gene variants are associated with functional DNA repair defects in vitro and poor outcome in patients with advanced head and neck squamous cell carcinoma.

PubMed

Verhagen, Caroline V M; Vossen, David M; Borgmann, Kerstin; Hageman, Floor; Grénman, Reidar; Verwijs-Janssen, Manon; Mout, Lisanne; Kluin, Roel J C; Nieuwland, Marja; Severson, Tesa M; Velds, Arno; Kerkhoven, Ron; O'Connor, Mark J; van der Heijden, Martijn; van Velthuysen, Marie-Louise; Verheij, Marcel; Wreesmann, Volkert B; Wessels, Lodewyk F A; van den Brekel, Michiel W M; Vens, Conchita

2018-04-06

Mutations in Fanconi Anemia or Homologous Recombination (FA/HR) genes can cause DNA repair defects and could therefore impact cancer treatment response and patient outcome. Their functional impact and clinical relevance in head and neck squamous cell carcinoma (HNSCC) is unknown. We therefore questioned whether functional FA/HR defects occurred in HNSCC and whether they are associated with FA/HR variants. We assayed a panel of 29 patient-derived HNSCC cell lines and found that a considerable fraction is hypersensitive to the crosslinker Mitomycin C and PARP inhibitors, a functional measure of FA/HR defects. DNA sequencing showed that these hypersensitivities are associated with the presence of bi-allelic rare germline and somatic FA/HR gene variants. We next questioned whether such variants are associated with prognosis and treatment response in HNSCC patients. DNA sequencing of 77 advanced stage HNSCC tumors revealed a 19% incidence of such variants. Importantly, these variants were associated with a poor prognosis (p = 0.027; HR = 2.6, 1.1-6.0) but favorable response to high cumulative cisplatin dose. We show how an integrated in vitro functional repair and genomic analysis can improve the prognostic value of genetic biomarkers. We conclude that repair defects are marked and frequent in HNSCC and are associated with clinical outcome.
BlackOPs: increasing confidence in variant detection through mappability filtering.

PubMed

Cabanski, Christopher R; Wilkerson, Matthew D; Soloway, Matthew; Parker, Joel S; Liu, Jinze; Prins, Jan F; Marron, J S; Perou, Charles M; Hayes, D Neil

2013-10-01

Identifying variants using high-throughput sequencing data is currently a challenge because true biological variants can be indistinguishable from technical artifacts. One source of technical artifact results from incorrectly aligning experimentally observed sequences to their true genomic origin ('mismapping') and inferring differences in mismapped sequences to be true variants. We developed BlackOPs, an open-source tool that simulates experimental RNA-seq and DNA whole exome sequences derived from the reference genome, aligns these sequences by custom parameters, detects variants and outputs a blacklist of positions and alleles caused by mismapping. Blacklists contain thousands of artifact variants that are indistinguishable from true variants and, for a given sample, are expected to be almost completely false positives. We show that these blacklist positions are specific to the alignment algorithm and read length used, and BlackOPs allows users to generate a blacklist specific to their experimental setup. We queried the dbSNP and COSMIC variant databases and found numerous variants indistinguishable from mapping errors. We demonstrate how filtering against blacklist positions reduces the number of potential false variants using an RNA-seq glioblastoma cell line data set. In summary, accounting for mapping-caused variants tuned to experimental setups reduces false positives and, therefore, improves genome characterization by high-throughput sequencing.
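The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not BlackOPs itself: the record layout, the function name `filter_variants`, and every coordinate below are hypothetical. It assumes only what the abstract states, namely that a blacklist entry combines a position with an allele.

```python
def filter_variants(variants, blacklist):
    """Drop variant calls whose (chrom, pos, alt) triple appears in the blacklist."""
    return [
        v for v in variants
        if (v["chrom"], v["pos"], v["alt"]) not in blacklist
    ]

# Hypothetical blacklist of (chromosome, position, alternate allele) triples,
# standing in for positions and alleles attributed to mismapping.
blacklist = {("chr1", 1000, "T"), ("chr2", 2000, "A")}

# Hypothetical variant calls from one sample.
calls = [
    {"chrom": "chr1", "pos": 1000, "ref": "C", "alt": "T"},  # blacklisted
    {"chrom": "chr1", "pos": 1000, "ref": "C", "alt": "G"},  # same position, other allele
    {"chrom": "chr3", "pos": 3000, "ref": "G", "alt": "A"},  # not blacklisted
]

kept = filter_variants(calls, blacklist)
print(len(kept))
```

Matching on the allele as well as the position keeps a call at a blacklisted position when its alternate allele differs from the artifact. Because the abstract reports that blacklists depend on the aligner and read length, a blacklist built for one setup should not be assumed to transfer to another.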
Mitochondrial DNA sequence data reveals association of haplogroup U with psychosis in bipolar disorder.

PubMed

Frye, Mark A; Ryu, Euijung; Nassan, Malik; Jenkins, Gregory D; Andreazza, Ana C; Evans, Jared M; McElroy, Susan L; Oglesbee, Devin; Highsmith, W Edward; Biernacka, Joanna M

2017-01-01

Converging genetic, postmortem gene-expression, cellular, and neuroimaging data implicate mitochondrial dysfunction in bipolar disorder. This study was conducted to investigate whether mitochondrial DNA (mtDNA) haplogroups and single nucleotide variants (SNVs) are associated with sub-phenotypes of bipolar disorder. MtDNA from 224 patients with Bipolar I disorder (BPI) was sequenced, and association of sequence variations with 3 sub-phenotypes (psychosis, rapid cycling, and adolescent illness onset) was evaluated. Gene-level tests were performed to evaluate overall burden of minor alleles for each phenotype. The haplogroup U was associated with a higher risk of psychosis. Secondary analyses of SNVs provided nominal evidence for association of psychosis with variants in the tRNA, ND4 and ND5 genes. The association of psychosis with ND4 (gene that encodes NADH dehydrogenase 4) was further supported by gene-level analysis. Preliminary analysis of mtDNA sequence data suggests a higher risk of psychosis with the U haplogroup and variation in the ND4 gene implicated in electron transport chain energy regulation. Further investigation of the functional consequences of this mtDNA variation is encouraged. Copyright © 2016. Published by Elsevier Ltd.
Genomic prediction using preselected DNA variants from a GWAS with whole-genome sequence data in Holstein-Friesian cattle.

PubMed

Veerkamp, Roel F; Bouwman, Aniek C; Schrooten, Chris; Calus, Mario P L

2016-12-01

Whole-genome sequence data is expected to capture genetic variation more completely than common genotyping panels. Our objective was to compare the proportion of variance explained and the accuracy of genomic prediction by using imputed sequence data or preselected SNPs from a genome-wide association study (GWAS) with imputed whole-genome sequence data. Phenotypes were available for 5503 Holstein-Friesian bulls. Genotypes were imputed up to whole-genome sequence (13,789,029 segregating DNA variants) by using run 4 of the 1000 bull genomes project. The program GCTA was used to perform GWAS for protein yield (PY), somatic cell score (SCS) and interval from first to last insemination (IFL). From the GWAS, subsets of variants were selected and genomic relationship matrices (GRM) were used to estimate the variance explained in 2087 validation animals and to evaluate the genomic prediction ability. Finally, two GRM were fitted together in several models to evaluate the effect of selected variants that were in competition with all the other variants. The GRM based on full sequence data explained only marginally more genetic variation than that based on common SNP panels: for PY, SCS and IFL, genomic heritability improved from 0.81 to 0.83, 0.83 to 0.87 and 0.69 to 0.72, respectively. Sequence data also helped to identify more variants linked to quantitative trait loci and resulted in clearer GWAS peaks across the genome. The proportion of total variance explained by the selected variants combined in a GRM was considerably smaller than that explained by all variants (less than 0.31 for all traits). When selected variants were used, accuracy of genomic predictions decreased and bias increased. Although 35 to 42 variants were detected that together explained 13 to 19% of the total variance (18 to 23% of the genetic variance) when fitted alone, there was no advantage in using dense sequence information for genomic prediction in the Holstein data used in our study. 
Detection and selection of variants within a single breed are difficult due to long-range linkage disequilibrium. Stringent selection of variants resulted in more biased genomic predictions, although this might be due to the training population being the same dataset from which the selected variants were identified.
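For readers unfamiliar with genomic relationship matrices, the sketch below builds one from allele counts in plain Python. It uses a widely used standardized form, A_jk = (1/N) * sum_i (x_ij - 2p_i)(x_ik - 2p_i) / (2p_i(1 - p_i)). The abstract names GCTA but does not state the formula or settings used, so that choice is an assumption. The function name `grm`, the handling of monomorphic variants, and the toy genotypes are also hypothetical.

```python
def grm(genotypes):
    """Genomic relationship matrix from genotypes[animal][variant] coded 0/1/2.

    Allele frequencies p_i are estimated from the same sample (an assumption).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(g[i] for g in genotypes) / (2 * n) for i in range(m)]
    # Monomorphic variants carry no information and would divide by zero.
    used = [i for i in range(m) if 0.0 < freqs[i] < 1.0]
    if not used:
        raise ValueError("no segregating variants")
    a = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j, n):
            total = 0.0
            for i in used:
                p = freqs[i]
                total += (
                    (genotypes[j][i] - 2 * p) * (genotypes[k][i] - 2 * p)
                    / (2 * p * (1 - p))
                )
            a[j][k] = a[k][j] = total / len(used)
    return a

# Toy data (hypothetical): 4 animals x 4 variants; the last variant is monomorphic.
toy = [
    [0, 1, 2, 0],
    [0, 1, 2, 0],
    [2, 1, 0, 0],
    [1, 1, 1, 0],
]
A = grm(toy)
print(A[0][0], A[0][2])
```

Building one matrix from a selected subset of variants and another from the remainder, as the abstract describes, would amount to calling such a function twice on different columns. Fitting the matrices in a mixed model is outside the scope of this sketch.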
Comparison of variable region 3 sequences of human immunodeficiency virus type 1 from infected children with the RNA and DNA sequences of the virus populations of their mothers.

PubMed Central

Scarlatti, G; Leitner, T; Halapi, E; Wahlberg, J; Marchisio, P; Clerici-Schoeller, M A; Wigzell, H; Fenyö, E M; Albert, J; Uhlén, M

1993-01-01

We have compared the variable region 3 sequences from 10 human immunodeficiency virus type 1 (HIV-1)-infected infants to virus sequences from the corresponding mothers. The sequences were derived from DNA of uncultured peripheral blood mononuclear cells (PBMC), DNA of cultured PBMC, and RNA from serum collected at or shortly after delivery. The infected infants, in contrast to the mothers, harbored homogeneous virus populations. Comparison of sequences from the children and clones derived from DNA of the corresponding mothers showed that the transmitted virus represented either a minor or a major virus population of the mother. In contrast to an earlier study, we found no evidence of selection of minor virus variants during transmission. Furthermore, the transmitted virus variant did not show any characteristic molecular features. In some cases the transmitted virus was more related to the virus RNA population of the mother and in other cases it was more related to the virus DNA population. This suggests that either cell-free or cell-associated virus may be transmitted. These data will help AIDS researchers to understand the mechanism of transmission and to plan strategies for prevention of transmission. PMID:8446584
Comparative oncology DNA sequencing of canine T cell lymphoma via human hotspot panel

PubMed Central

Beheshti, Afshin; Pilichowska, Monika; Burgess, Kristine; Ricks-Santi, Luisel; McNiel, Elizabeth; London, Cheryl B.; Ravi, Dashnamoorthy; Evens, Andrew M.

2018-01-01

T-cell lymphoma (TCL) is an uncommon and aggressive form of human cancer. Lymphoma is the most common hematopoietic tumor in canines (companion animals), with TCL representing approximately 30% of diagnoses. Collectively, the canine is an appealing model for cancer research given the spontaneous occurrence of cancer, intact immune system, and phylogenetic proximity to humans. We sought to establish mutational congruence of the canine with known human TCL mutations in order to identify potential actionable oncogenic pathways. Following pathologic confirmation, DNA was sequenced in 16 canine TCL (cTCL) cases using a custom Human Cancer Hotspot Panel of 68 genes commonly mutated in human TCL. Sequencing identified 4,527,638 total reads with average length of 229 bases containing 346 unique variants and 1,474 total variants; each sample had an average of 92 variants. Among these, there were 258 germline and 32 somatic variants. Among the 32 somatic variants there were 8 missense variants, 1 splice junction variant and the remaining were intron or synonymous variants. A frequency of 4 somatic mutations per sample was noted, with >7 mutations detected in MET, KDR, STK11 and BRAF. Expression of these associated proteins was also detected via Western blot analyses. In addition, Sanger sequencing confirmed three variants of high quality (MYC, MET, and TP53 missense mutation). Taken together, the mutational spectrum and protein analyses showed mutations in signaling pathways similar to human TCL and also identified novel mutations that may serve as drug targets as well as potential biomarkers. PMID:29854308
Longitudinal studies on maternal HIV-1 variants by biological phenotyping, sequence analysis and viral load.

PubMed

Renta, J Y; Cadilla, C L; Vega, M E; Hillyer, G V; Estrada, C; Jiménez, E; Abreu, E; Méndez, I; Gandía, J; Meléndez-Guerrero, L M

1997-11-01

In this study, the HIV-1 variant viruses from ten pregnant women and their infants were isolated and characterized longitudinally in order to determine the role that viral envelope (gp120-V3 loop) gene variation and viral tropism play in vertical transmission. Biological phenotyping of each HIV variant was accomplished by growth in MT-2 cells and in macrophages from healthy, non-HIV-infected donors. Genetic characterization of the variants was accomplished by DNA sequence analysis. All the women enrolled in this study received ZDV therapy. Virus was cultured from eight out of ten env V3-PCR positive mothers. HIV-1 isolates were all non-syncytium-inducing variants. None of the mothers were found to transmit HIV, as determined by DNA PCR and quantitative co-cultures on their infants, who were seronegative for HIV-1 through one year after birth. Viral cultures from infant blood samples were negative and infants were all healthy. However, nested env V3-PCR detected proviral DNA in five out of ten infants. In contrast, conventional gag-PCR was negative in the same five infants. Sequences of the five maternal-infant pairs were different, suggesting unique infant HIV-1 variants. The three highest maternal viral load values corresponded to infants that were env V3-PCR positive. These results suggest that HIV-1 particles are transmitted from ZDV-treated mothers to infants. Infant follow up is recommended to determine if HIV-1 has been inhibited by the immune system of the infants.
High-resolution melting (HRM) re-analysis of a polyposis patients cohort reveals previously undetected heterozygous and mosaic APC gene mutations.

PubMed

Out, Astrid A; van Minderhout, Ivonne J H M; van der Stoep, Nienke; van Bommel, Lysette S R; Kluijt, Irma; Aalfs, Cora; Voorendt, Marsha; Vossen, Rolf H A M; Nielsen, Maartje; Vasen, Hans F A; Morreau, Hans; Devilee, Peter; Tops, Carli M J; Hes, Frederik J

2015-06-01

Familial adenomatous polyposis is most frequently caused by pathogenic variants in either the APC gene or the MUTYH gene. The detection rate of pathogenic variants depends on the severity of the phenotype and sensitivity of the screening method, including sensitivity for mosaic variants. For 171 patients with multiple colorectal polyps without previously detectable pathogenic variant, APC was reanalyzed in leukocyte DNA by one uniform technique: high-resolution melting (HRM) analysis. Serial dilution of heterozygous DNA resulted in a lowest detectable allelic fraction of 6% for the majority of variants. HRM analysis and subsequent sequencing detected pathogenic fully heterozygous APC variants in 10 (6%) of the patients and pathogenic mosaic variants in 2 (1%). All these variants were previously missed by various conventional scanning methods. In parallel, HRM APC scanning was applied to DNA isolated from polyp tissue of two additional patients with apparently sporadic polyposis and without detectable pathogenic APC variant in leukocyte DNA. In both patients a pathogenic mosaic APC variant was present in multiple polyps. The detection of pathogenic APC variants in 7% of the patients, including mosaics, illustrates the usefulness of a complete APC gene reanalysis of previously tested patients, by a supplementary scanning method. HRM is a sensitive and fast pre-screening method for reliable detection of heterozygous and mosaic variants, which can be applied to leukocyte and polyp derived DNA.
SV40 host-substituted variants: a new look at the monkey DNA inserts and recombinant junctions.

PubMed

Singer, Maxine; Winocour, Ernest

2011-04-10

The available monkey genomic data banks were examined in order to determine the chromosomal locations of the host DNA inserts in 8 host-substituted SV40 variant DNAs. Five of the 8 variants contained more than one linked monkey DNA insert per tandem repeat unit and in all cases but one, the 19 monkey DNA inserts in the 8 variants mapped to different locations in the monkey genome. The 50 parental DNAs (32 monkey and 18 SV40 DNA segments) which spanned the crossover and flanking regions that participated in monkey/monkey and monkey/SV40 recombinations were characterized by substantial levels of microhomology of up to 8 nucleotides in length; the parental DNAs also exhibited direct and inverted repeats at or adjacent to the crossover sequences. We discuss how the host-substituted SV40 variants arose and the nature of the recombination mechanisms involved. Copyright © 2011 Elsevier Inc. All rights reserved.
Molecular Characterization of Anaplasma phagocytophilum and Borrelia burgdorferi in Ixodes scapularis Ticks from Pennsylvania

PubMed Central

Courtney, Joshua W.; Dryden, Richard L.; Montgomery, Jill; Schneider, Bradley S.; Smith, Gary; Massung, Robert F.

2003-01-01

Ixodes scapularis ticks were collected in 2000 and 2001 from two areas in Pennsylvania and tested for the presence of Anaplasma phagocytophilum and Borrelia burgdorferi by PCR and DNA sequencing. Of the ticks collected from northwestern and southeastern Pennsylvania, 162 of 263 (61.6%) and 25 of 191 (13.1%), respectively, were found to be positive for B. burgdorferi. DNA sequencing showed >99% identity with B. burgdorferi strains B31 and JD1. PCR testing for A. phagocytophilum revealed that 5 of 263 (1.9%) from northwestern Pennsylvania and 76 of 191 (39.8%) from southeastern Pennsylvania were positive. DNA sequencing revealed two genotypes of A. phagocytophilum, the human granulocytic ehrlichiosis (HGE) agent and a variant (AP-Variant 1) that has not been associated with human infection. Although only the HGE agent was present in northwestern Pennsylvania, both genotypes were found in southeastern Pennsylvania. These data add to a growing body of evidence showing that AP-Variant 1 is the predominant agent in areas where both genotypes coexist. PMID:12682147
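The infection rates quoted in this abstract follow directly from the reported counts. A minimal sketch that recomputes them is shown here; the dictionary layout and the function name `percent_positive` are illustrative assumptions and not part of the study.

```python
# Counts (positive, tested) as reported in the abstract above.
# The data structure and names are hypothetical, chosen for illustration.
TICK_COUNTS = {
    ("B. burgdorferi", "northwestern"): (162, 263),
    ("B. burgdorferi", "southeastern"): (25, 191),
    ("A. phagocytophilum", "northwestern"): (5, 263),
    ("A. phagocytophilum", "southeastern"): (76, 191),
}


def percent_positive(positive: int, tested: int) -> float:
    """Return the percentage of tested ticks that were PCR positive, to one decimal."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    return round(100 * positive / tested, 1)


if __name__ == "__main__":
    for (agent, region), (positive, tested) in TICK_COUNTS.items():
        rate = percent_positive(positive, tested)
        print(f"{agent}, {region} Pennsylvania: {positive}/{tested} = {rate}%")
```

The recomputed values agree with the percentages given in the abstract (61.6%, 13.1%, 1.9% and 39.8%).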

Structure of allelic variants of subtype 5 of histone H1 in pea Pisum sativum L.

PubMed

Bogdanova, V S; Lester, D R; Berdnikov, V A; Andersson, I

2005-06-01

The pea genome contains seven histone H1 genes encoding different subtypes. Previously, the DNA sequence of only one gene, His1, coding for the subtype H1-1, had been identified. We isolated a histone H1 allele from a pea genomic DNA library. Data from the electrophoretic mobility of the pea H1 subtypes and their N-bromosuccinimide cleavage products indicated that the newly isolated gene corresponded to the H1-5 subtype encoded by His5. We confirmed this result by sequencing the gene from three pea lines with H1-5 allelic variants of altered electrophoretic mobility. The allele of the slow H1-5 variant differed from the standard allele by a nucleotide substitution that caused the replacement of the positively charged lysine with asparagine in the DNA-interacting domain of the histone molecule. A temperature-related occurrence had previously been demonstrated for this H1-5 variant in a study on a worldwide collection of pea germplasm. The variant tended to occur at higher frequencies in geographic regions with a cold climate. The fast allelic variant of H1-5 displayed a deletion resulting in the loss of a duplicated pentapeptide in the C-terminal domain.
Tinnitus in patients with hearing loss due to mitochondrial DNA pathogenic variants.

PubMed

Lechowicz, Urszula; Pollak, Agnieszka; Raj-Koziak, Danuta; Dziendziel, Beata; Skarżyński, Piotr Henryk; Skarżyński, Henryk; Ołdak, Monika

2018-06-23

Tinnitus, described as the individual perception of phantom sound, constitutes a significant medical problem and has become an essential subject of many studies conducted worldwide. In the study, we aimed to examine the prevalence of tinnitus among Polish hearing loss (HL) patients with identified mitochondrial DNA (mtDNA) variants. Among the selected group of unrelated HL patients with known mtDNA pathogenic variants, two questionnaires were conducted, i.e. Tinnitus Handicap Inventory translated into Polish (THI-POL) and Visual Analogue Scale (VAS) for measuring subjectively perceived tinnitus loudness, distress, annoyance and possibility of coping with this condition (VASs). Pathogenic mtDNA variants were detected with real-time PCR and sequencing of the whole mtDNA. This is the first extensive tinnitus characterization using THI-POL and VASs questionnaires in patients with HL due to mtDNA variants. We have established the prevalence of tinnitus among the studied group at 23.5%. We found that there are no statistically significant differences in the prevalence of tinnitus and its characteristic features between HL patients with known HL mtDNA variants and the general Polish population. In Polish HL patients with tinnitus, m.7511T>C was significantly more frequent than in patients without tinnitus. We observed that the prevalence of tinnitus is lower in Polish patients with m.1555A>G as compared to other available data. Our data suggest that the mtDNA variants causative of HL may affect tinnitus development but this effect seems to be ethnic-specific.
In vitro selection of zinc fingers with altered DNA-binding specificity.

PubMed

Jamieson, A C; Kim, S H; Wells, J A

1994-05-17

We have used random mutagenesis and phage display to alter the DNA-binding specificity of Zif268, a transcription factor that contains three zinc finger domains. Four residues in the helix of finger 1 of Zif268 that potentially mediate DNA binding were identified from an X-ray structure of the Zif268-DNA complex. A library was constructed in which these residues were randomly mutated and the Zif268 variants were fused to a truncated version of the gene III coat protein on the surface of M13 filamentous phage particles. The phage displayed the mutant proteins in a monovalent fashion and were sorted by repeated binding and elution from affinity matrices containing different DNA sequences. When the matrix contained the natural nine base pair operator sequence 5'-GCG-TGG-GCG-3', native-like zinc fingers were isolated. New finger 1 variants were found by sorting with two different operators in which the singly modified triplets, GTG and TCG, replaced the native finger 1 triplet, GCG. Overall, the selected finger 1 variants contained a preponderance of polar residues at the four sites. Interestingly, the net charge of the four residues in any selected finger never deviated more than one unit from neutrality despite the fact that about half the variants contained three or four charged residues over the four sites. Measurements of the dissociation constants for two of these purified finger 1 variants by gel-shift assay showed their specificities to vary over a 10-fold range, with the greatest affinity being for the DNA binding site for which they were sorted. (ABSTRACT TRUNCATED AT 250 WORDS)
Association of low-frequency and rare coding-sequence variants with blood lipids and Coronary Heart Disease in 56,000 whites and blacks

USDA-ARS's Scientific Manuscript database

Low-frequency coding DNA sequence variants in the proprotein convertase subtilisin/kexin type 9 gene (PCSK9) lower plasma low-density lipoprotein cholesterol (LDL-C), protect against risk of coronary heart disease (CHD), and have prompted the development of a new class of therapeutics. It is uncerta...
Deep sequencing shows that oocytes are not prone to accumulate mtDNA heteroplasmic mutations during ovarian ageing.

PubMed

Boucret, L; Bris, C; Seegers, V; Goudenège, D; Desquiret-Dumas, V; Domin-Bernhard, M; Ferré-L'Hotellier, V; Bouet, P E; Descamps, P; Reynier, P; Procaccio, V; May-Panloup, P

2017-10-01

Does ovarian ageing increase the number of heteroplasmic mitochondrial DNA (mtDNA) point mutations in oocytes? Our results suggest that oocytes are not subject to the accumulation of mtDNA point mutations during ovarian ageing. Ageing is associated with the alteration of mtDNA integrity in various tissues. Primary oocytes, present in the ovary since embryonic life, may accumulate mtDNA mutations during the process of ovarian ageing. This was an observational study of 53 immature oocyte-cumulus complexes retrieved from 35 women undergoing IVF at the University Hospital of Angers, France, from March 2013 to March 2014. The women were classified in two groups, one including 19 women showing signs of ovarian ageing objectified by a diminished ovarian reserve (DOR), and the other, including 16 women with a normal ovarian reserve (NOR), which served as a control group. mtDNA was extracted from isolated oocytes, and from their corresponding cumulus cells (CCs) considered as a somatic cell compartment. The average mtDNA content of each sample was assessed by using a quantitative real-time PCR technique. Deep sequencing was performed using the Ion Torrent Proton for Next-Generation Sequencing. Signal processing and base calling were done by the embedded pre-processing pipeline and the variants were analyzed using an in-house workflow. The distribution of the different variants between DOR and NOR patients, on one hand, and oocyte and CCs, on the other, was analyzed with the generalized mixed linear model to take into account the cluster of cells belonging to a given mother. There were no significant differences between the numbers of mtDNA variants between the DOR and the NOR patients, either in the oocytes (P = 0.867) or in the surrounding CCs (P = 0.154). There were also no differences in terms of variants with potential functional consequences. 
De-novo mtDNA variants were found in 28% of the oocytes and in 66% of the CCs with the mean number of variants being significantly different (respectively 0.321, SD = 0.547 and 1.075, SD = 1.158) (P < 0.0001). Variants with a potential functional consequence were also overrepresented in CCs compared with oocytes (P = 0.0019). Limitations may be due to the use of immature oocytes discarded during the assisted reproductive technology procedure, the small size of the sample, and the high-throughput sequencing technology that might not have detected heteroplasmy levels lower than 2%. The alteration of mtDNA integrity in oocytes during ovarian ageing is a recurring question to which our pilot study suggests a reassuring answer. This work was supported by the University Hospital of Angers, the University of Angers, France, and the French national research centers, INSERM and the CNRS. There are no competing interests. © The Author 2017. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology. All rights reserved.
DNA sequence-level analyses reveal potential phenotypic modifiers in a large family with psychiatric disorders.

PubMed

Ryan, Niamh M; Lihm, Jayon; Kramer, Melissa; McCarthy, Shane; Morris, Stewart W; Arnau-Soler, Aleix; Davies, Gail; Duff, Barbara; Ghiban, Elena; Hayward, Caroline; Deary, Ian J; Blackwood, Douglas H R; Lawrie, Stephen M; McIntosh, Andrew M; Evans, Kathryn L; Porteous, David J; McCombie, W Richard; Thomson, Pippa A

2018-06-07

Psychiatric disorders are a group of genetically related diseases with highly polygenic architectures. Genome-wide association analyses have made substantial progress towards understanding the genetic architecture of these disorders. More recently, exome- and whole-genome sequencing of cases and families have identified rare, highly penetrant variants that provide direct functional insight. There remains, however, a gap in the heritability explained by these complementary approaches. To understand how multiple genetic variants combine to modify both severity and penetrance of a highly penetrant variant, we sequenced 48 whole genomes from a family with a high loading of psychiatric disorder linked to a balanced chromosomal translocation. The (1;11)(q42;q14.3) translocation directly disrupts three genes (DISC1, DISC2 and DISC1FP) and has been linked to multiple brain imaging and neurocognitive outcomes in the family. Using DNA sequence-level linkage analysis, functional annotation and population-based association, we identified common and rare variants in GRM5 (minor allele frequency (MAF) > 0.05), PDE4D (MAF > 0.2) and CNTN5 (MAF < 0.01) that may help explain the individual differences in phenotypic expression in the family. We suggest that whole-genome sequencing in large families will improve the understanding of the combined effects of the rare and common sequence variation underlying psychiatric phenotypes.
Sequence Variants and Haplotype Analysis of Cat ERBB2 Gene: A Survey on Spontaneous Cat Mammary Neoplastic and Non-Neoplastic Lesions

PubMed Central

Santos, Sara; Bastos, Estela; Baptista, Cláudia S.; Sá, Daniela; Caloustian, Christophe; Guedes-Pinto, Henrique; Gärtner, Fátima; Gut, Ivo G.; Chaves, Raquel

2012-01-01

The human ERBB2 proto-oncogene is widely considered a key gene involved in human breast cancer onset and progression. Among spontaneous tumors, mammary tumors are the most frequent cause of cancer death in cats and second most frequent in humans. In fact, naturally occurring tumors in domestic animals, more particularly cat mammary tumors, have been proposed as a good model for human breast cancer, but critical genetic and molecular information is still scarce. The aims of this study include the analysis of partial sequences of the cat ERBB2 gene (between exons 17 and 20) in order to characterize a normal population and a heterogeneous mammary lesion population. Cat genomic DNA was extracted from normal frozen samples (n = 16) and from frozen and formalin-fixed paraffin-embedded mammary lesion samples (n = 41). We amplified and sequenced two cat ERBB2 DNA fragments comprising exons 17 to 20. It was possible to identify five sequence variants and six haplotypes in the total population. Two sequence variants and two haplotypes were shown to be specific for cat mammary tumor samples. Bioinformatics analysis predicts that four of the sequence variants can produce alternative transcripts or activate cryptic splicing sites. Also, a possible association was identified between clinicopathological traits and the variant haplotypes. As far as we know, this is the first attempt to examine ERBB2 genetic variations in the cat mammary genome and their possible association with the onset and progression of cat mammary tumors. The demonstration of a possible association between primary tumor size (one of the two most important prognostic factors) and the number of masses with the cat ERBB2 variant haplotypes reveals the importance of the analysis of this gene in veterinary medicine. PMID:22489125
Mutations Affecting Expression of the rosy Locus in Drosophila melanogaster

PubMed Central

Lee, Chong Sung; Curtis, Daniel; McCarron, Margaret; Love, Carol; Gray, Mark; Bender, Welcome; Chovnick, Arthur

1987-01-01

The rosy locus in Drosophila melanogaster codes for the enzyme xanthine dehydrogenase (XDH). Previous studies defined a "control element" near the 5' end of the gene, where variant sites affected the amount of rosy mRNA and protein produced. We have determined the DNA sequence of this region from both genomic and cDNA clones, and from the ry+10 underproducer strain. This variant strain had many sequence differences, so that the site of the regulatory change could not be fixed. A mutagenesis was also undertaken to isolate new regulatory mutations. We induced 376 new mutations with 1-ethyl-1-nitrosourea (ENU) and screened them to isolate those that reduced the amount of XDH protein produced, but did not change the properties of the enzyme. Genetic mapping was used to find mutations located near the 5' end of the gene. DNA from each of seven mutants was cloned and sequenced through the 5' region. Mutant base changes were identified in all seven; they appear to affect splicing and translation of the rosy mRNA. In a related study (T. P. Keith et al. 1987), the genomic and cDNA sequences are extended through the 3' end of the gene; the combined sequences define the processing pattern of the rosy transcript and predict the amino acid sequence of XDH. PMID:3036645
A homozygous transthyretin variant associated with senile systemic amyloidosis: evidence for a late-onset disease of genetic etiology.

PubMed Central

Jacobson, D R; Gorevic, P D; Buxbaum, J N

1990-01-01

Senile systemic amyloidosis (SSA) is a late-onset disease characterized by deposition of amyloid fibrils containing transthyretin (TTR). Amino acid sequencing of protein isolated from the amyloid fibrils of a patient with SSA identified TTR containing a position-122 isoleucine-for-valine substitution. This change led to the prediction of a genomic G-to-A transition, destroying an MaeIII restriction site. We confirmed the presence of the variant DNA fragment both by Southern blotting and by visualization of MaeIII digests of DNA amplified around codon 122, by using the polymerase chain reaction. The patient's DNA was entirely resistant to MaeIII cleavage; therefore, only the mutant sequence was present. The variant was not found in DNA from any of 24 controls or six other SSA patients. Quantitative Southern blotting demonstrated that the patient's DNA contained two copies of the TTR gene per genome; the mutation was therefore homozygous rather than hemizygous. In the present case, the homozygous mutation TTR (122 Val→Ile) is associated with SSA, a finding which is consistent with autosomal recessive inheritance of this condition. PMID:2349941
Whole Exome Sequencing Identifies Rare Protein-Coding Variants in Behçet's Disease.

PubMed

Ognenovski, Mikhail; Renauer, Paul; Gensterblum, Elizabeth; Kötter, Ina; Xenitidis, Theodoros; Henes, Jörg C; Casali, Bruno; Salvarani, Carlo; Direskeneli, Haner; Kaufman, Kenneth M; Sawalha, Amr H

2016-05-01

Behçet's disease (BD) is a systemic inflammatory disease with an incompletely understood etiology. Despite the identification of multiple common genetic variants associated with BD, rare genetic variants have been less explored. We undertook this study to investigate the role of rare variants in BD by performing whole exome sequencing in BD patients of European descent. Whole exome sequencing was performed in a discovery set comprising 14 German BD patients of European descent. For replication and validation, Sanger sequencing and Sequenom genotyping were performed in the discovery set and in 2 additional independent sets of 49 German BD patients and 129 Italian BD patients of European descent. Genetic association analysis was then performed in BD patients and 503 controls of European descent. Functional effects of associated genetic variants were assessed using bioinformatic approaches. Using whole exome sequencing, we identified 77 rare variants (in 74 genes) with predicted protein-damaging effects in BD. These variants were genotyped in 2 additional patient sets and then analyzed to reveal significant associations with BD at 2 genetic variants detected in all 3 patient sets that remained significant after Bonferroni correction. We detected genetic association between BD and LIMK2 (rs149034313), involved in regulating cytoskeletal reorganization, and between BD and NEIL1 (rs5745908), involved in base excision DNA repair (P = 3.22 × 10⁻⁴ and P = 5.16 × 10⁻⁴, respectively). The LIMK2 association is a missense variant with predicted protein damage that may influence functional interactions with proteins involved in cytoskeletal regulation by Rho GTPase, inflammation mediated by chemokine and cytokine signaling pathways, T cell activation, and angiogenesis (Bonferroni-corrected P = 5.63 × 10⁻¹⁴, P = 7.29 × 10⁻⁶, P = 1.15 × 10⁻⁵, and P = 6.40 × 10⁻³, respectively).
The genetic association in NEIL1 is a predicted splice donor variant that may introduce a deleterious intron retention and result in a noncoding transcript variant. We used whole exome sequencing in BD for the first time and identified 2 rare putative protein-damaging genetic variants associated with this disease. These genetic variants might influence cytoskeletal regulation and DNA repair mechanisms in BD and might provide further insight into increased leukocyte tissue infiltration and the role of oxidative stress in BD. © 2016, American College of Rheumatology.
Human Chromosome Y and Haplogroups; introducing YDHS Database.

PubMed

Tiirikka, Timo; Moilanen, Jukka S

2015-12-01

As high-throughput sequencing efforts generate more biological information, scientists from different disciplines are interpreting the polymorphisms that make us unique. In addition, there is an increasing trend among the general public to research their own genealogy, find distant relatives and learn more about their biological background. Commercial vendors are providing analyses of mitochondrial and Y-chromosomal markers for such purposes. Clearly, an easy-to-use free interface to the existing data on the identified variants would be in the interest of the general public and professionals less familiar with the field. Here we introduce a novel metadatabase YDHS that aims to provide such an interface for Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants. The database uses the ISOGG Y-DNA tree as the source of mutations and haplogroups and, by using genomic positions of the mutations, links them to genes and other biological entities. YDHS contains analysis tools for deeper Y-SNP analysis. YDHS addresses the shortage of Y-DNA related databases. We have tested our database using a set of different cases from literature ranging from infertility to autism. The database is at http://www.semanticgen.net/ydhs. Y-chromosomal DNA (Y-DNA) haplogroups and sequence variants have not been in the scientific limelight, excluding certain specialized fields like forensics, mainly because there is not much freely available information or it is scattered in different sources. However, as we have demonstrated, Y-SNPs do play a role in various cases on the haplogroup level and it is possible to create a free Y-DNA dedicated bioinformatics resource.
Identification of BRCA1 missense substitutions that confer partial functional activity: potential moderate risk variants?

PubMed

Lovelock, Paul K; Spurdle, Amanda B; Mok, Myth T S; Farrugia, Daniel J; Lakhani, Sunil R; Healey, Sue; Arnold, Stephen; Buchanan, Daniel; Couch, Fergus J; Henderson, Beric R; Goldgar, David E; Tavtigian, Sean V; Chenevix-Trench, Georgia; Brown, Melissa A

2007-01-01

Many of the DNA sequence variants identified in the breast cancer susceptibility gene BRCA1 remain unclassified in terms of their potential pathogenicity. Both multifactorial likelihood analysis and functional approaches have been proposed as a means to elucidate likely clinical significance of such variants, but analysis of the comparative value of these methods for classifying all sequence variants has been limited. We have compared the results from multifactorial likelihood analysis with those from several functional analyses for the four BRCA1 sequence variants A1708E, G1738R, R1699Q, and A1708V. Our results show that multifactorial likelihood analysis, which incorporates sequence conservation, co-inheritance, segregation, and tumour immunohistochemical analysis, may improve classification of variants. For A1708E, previously shown to be functionally compromised, analysis of oestrogen receptor, cytokeratin 5/6, and cytokeratin 14 tumour expression data significantly strengthened the prediction of pathogenicity, giving a posterior probability of pathogenicity of 99%. For G1738R, shown to be functionally defective in this study, immunohistochemistry analysis confirmed previous findings of inconsistent 'BRCA1-like' phenotypes for the two tumours studied, and the posterior probability for this variant was 96%. The posterior probabilities of R1699Q and A1708V were 54% and 69%, respectively, only moderately suggestive of increased risk. Interestingly, results from functional analyses suggest that both of these variants have only partial functional activity. R1699Q was defective in foci formation in response to DNA damage and displayed intermediate transcriptional transactivation activity but showed no evidence for centrosome amplification. In contrast, A1708V displayed an intermediate transcriptional transactivation activity and normal foci formation in response to DNA damage but induced centrosome amplification.
These data highlight the need for a range of functional studies to be performed in order to identify variants with partially compromised function. The results also raise the possibility that A1708V and R1699Q may be associated with a low or moderate risk of cancer. While data pooling strategies may provide more information for multifactorial analysis to improve the interpretation of the clinical significance of these variants, it is likely that the development of current multifactorial likelihood approaches and the consideration of alternative statistical approaches will be needed to determine whether these individually rare variants do confer a low or moderate risk of breast cancer.
Sequencing Structural Variants in Cancer for Precision Therapeutics.

PubMed

Macintyre, Geoff; Ylstra, Bauke; Brenton, James D

2016-09-01

The identification of mutations that guide therapy selection for patients with cancer is now routine in many clinical centres. The majority of assays used for solid tumour profiling use DNA sequencing to interrogate somatic point mutations because they are relatively easy to identify and interpret. Many cancers, however, including high-grade serous ovarian, oesophageal, and small-cell lung cancer, are driven by somatic structural variants that are not measured by these assays. Therefore, there is currently an unmet need for clinical assays that can cheaply and rapidly profile structural variants in solid tumours. In this review we survey the landscape of 'actionable' structural variants in cancer and identify promising detection strategies based on massively-parallel sequencing. Copyright © 2016 Elsevier Ltd. All rights reserved.
Determination of a Screening Metric for High Diversity DNA Libraries.

PubMed

Guido, Nicholas J; Handerson, Steven; Joseph, Elaine M; Leake, Devin; Kung, Li A

2016-01-01

The fields of antibody engineering, enzyme optimization and pathway construction rely increasingly on screening complex variant DNA libraries. These highly diverse libraries allow researchers to sample a maximized sequence space; and therefore, more rapidly identify proteins with significantly improved activity. The current state of the art in synthetic biology allows for libraries with billions of variants, pushing the limits of researchers' ability to qualify libraries for screening by measuring the traditional quality metrics of fidelity and diversity of variants. Instead, when screening variant libraries, researchers typically use a generic, and often insufficient, oversampling rate based on a common rule-of-thumb. We have developed methods to calculate a library-specific oversampling metric, based on fidelity, diversity, and representation of variants, which informs researchers, prior to screening the library, of the amount of oversampling required to ensure that the desired fraction of variant molecules will be sampled. To derive this oversampling metric, we developed a novel alignment tool to efficiently measure frequency counts of individual nucleotide variant positions using next-generation sequencing data. Next, we apply a method based on the "coupon collector" probability theory to construct a curve of upper bound estimates of the sampling size required for any desired variant coverage. The calculated oversampling metric will guide researchers to maximize their efficiency in using highly variant libraries.
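The "coupon collector" reasoning mentioned in this abstract can be illustrated with a short sketch. The code below is not the authors' oversampling metric, which also accounts for fidelity and unequal representation of variants; it is the textbook expected-value calculation under the simplifying assumption that every variant is equally likely to be sampled. The function names `expected_draws` and `oversampling_ratio` are hypothetical.

```python
import math


def expected_draws(n_variants: int, coverage: float) -> float:
    """Expected number of random draws needed to observe at least `coverage`
    (a fraction in (0, 1]) of `n_variants` distinct, equally likely variants.

    Assumption: uniform representation, which real libraries rarely satisfy.
    """
    if n_variants <= 0:
        raise ValueError("n_variants must be positive")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    # Number of distinct variants that must be seen; the small epsilon guards
    # against floating-point noise pushing the ceiling up by one.
    k = math.ceil(coverage * n_variants - 1e-9)
    # Once i distinct variants have been seen, each draw yields a new one with
    # probability (n - i) / n, so the expected wait is n / (n - i) draws.
    return sum(n_variants / (n_variants - i) for i in range(k))


def oversampling_ratio(n_variants: int, coverage: float) -> float:
    """Expected draws expressed as a multiple of the library size."""
    return expected_draws(n_variants, coverage) / n_variants


if __name__ == "__main__":
    for target in (0.90, 0.95, 0.99, 1.00):
        ratio = oversampling_ratio(1_000_000, target)
        print(f"coverage {target:.2f}: about {ratio:.2f}x oversampling")
```

Under this uniform model the required oversampling grows slowly up to moderate coverage and sharply as the target approaches 100%, which is one reason a single generic rule-of-thumb rate can be insufficient for a specific library.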
Highly multiplexed targeted DNA sequencing from single nuclei.

PubMed

Leung, Marco L; Wang, Yong; Kim, Charissa; Gao, Ruli; Jiang, Jerry; Sei, Emi; Navin, Nicholas E

2016-02-01

Single-cell DNA sequencing methods are challenged by poor physical coverage, high technical error rates and low throughput. To address these issues, we developed a single-cell DNA sequencing protocol that combines flow-sorting of single nuclei, time-limited multiple-displacement amplification (MDA), low-input library preparation, DNA barcoding, targeted capture and next-generation sequencing (NGS). This approach represents a major improvement over our previous single nucleus sequencing (SNS) Nature Protocols paper in terms of generating higher-coverage data (>90%), thereby enabling the detection of genome-wide variants in single mammalian cells at base-pair resolution. Furthermore, by pooling 48-96 single-cell libraries together for targeted capture, this approach can be used to sequence many single-cell libraries in parallel in a single reaction. This protocol greatly reduces the cost of single-cell DNA sequencing, and it can be completed in 5-6 d by advanced users. This single-cell DNA sequencing protocol has broad applications for studying rare cells and complex populations in diverse fields of biological research and medicine.
Hypervariability of ribosomal DNA at multiple chromosomal sites in lake trout (Salvelinus namaycush).

PubMed

Zhuo, L; Reed, K M; Phillips, R B

1995-06-01

Variation in the intergenic spacer (IGS) of the ribosomal DNA (rDNA) of lake trout (Salvelinus namaycush) was examined. Digestion of genomic DNA with restriction enzymes showed that almost every individual had a unique combination of length variants with most of this variation occurring within rather than between populations. Sequence analysis of a 2.3 kilobase (kb) EcoRI-DraI fragment spanning the 3' end of the 28S coding region and approximately 1.8 kb of the IGS revealed two blocks of repetitive DNA. Putative transcriptional termination sites were found approximately 220 bases (b) downstream from the end of the 28S coding region. Comparison of the 2.3-kb fragments with two longer (3.1 kb) fragments showed that the major difference in length resulted from variation in the number of short (89 b) repeats located 3' to the putative terminator. Repeat units within a single nucleolus organizer region (NOR) appeared relatively homogeneous and genetic analysis found variants to be stably inherited. A comparison of the number of spacer-length variants with the number of NORs found that the number of length variants per individual was always less than the number of NORs. Examination of spacer variants in five populations showed that populations with more NORs had more spacer variants, indicating that variants are present at different rDNA sites on nonhomologous chromosomes.
mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

PubMed

Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

2016-07-08

Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
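To make the idea of a heteroplasmy level concrete, the following sketch computes the minor allele fraction at a single mtDNA position from per-base read counts and compares it with a 1% threshold. This is an illustrative simplification, not the mtDNA-Server detection model, which the abstract describes as also handling artefacts and contamination; the function names and the input format are assumptions.

```python
from typing import Mapping


def minor_allele_fraction(base_counts: Mapping[str, int]) -> float:
    """Fraction of reads supporting the second most common base at a position.

    `base_counts` maps a base (e.g. "A", "C", "G", "T") to its read count.
    """
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    ordered = sorted(base_counts.values(), reverse=True)
    second = ordered[1] if len(ordered) > 1 else 0
    return second / total


def is_heteroplasmic(base_counts: Mapping[str, int], threshold: float = 0.01) -> bool:
    """Flag a position whose minor allele fraction reaches `threshold`.

    The 1% default mirrors the detection level quoted in the abstract; a real
    caller would also model sequencing error and strand bias.
    """
    return minor_allele_fraction(base_counts) >= threshold


if __name__ == "__main__":
    position = {"A": 4900, "G": 100, "C": 0, "T": 0}
    print(minor_allele_fraction(position), is_heteroplasmic(position))
```

A plain threshold such as this one cannot distinguish true heteroplasmy from sequencing error at low coverage, so the read depth at each position needs to be considered alongside the fraction.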
Mitochondrial genomic analysis of late onset Alzheimer's disease reveals protective haplogroups H6A1A/H6A1B: the Cache County Study on Memory in Aging.

PubMed

Ridge, Perry G; Maxwell, Taylor J; Corcoran, Christopher D; Norton, Maria C; Tschanz, Joann T; O'Brien, Elizabeth; Kerber, Richard A; Cawthon, Richard M; Munger, Ronald G; Kauwe, John S K

2012-01-01

Alzheimer's disease (AD) is the most common cause of dementia, and AD risk clusters within families. Part of the familial aggregation of AD is accounted for by excess maternal vs. paternal inheritance, a pattern consistent with mitochondrial inheritance. The role of specific mitochondrial DNA (mtDNA) variants and haplogroups in AD risk is uncertain. We determined the complete mitochondrial genome sequence of 1007 participants in the Cache County Study on Memory in Aging, a population-based prospective cohort study of dementia in northern Utah. AD diagnoses were made with a multi-stage protocol that included clinical examination and review by a panel of clinical experts. We used TreeScanning, a statistically robust approach based on haplotype networks, to analyze the mtDNA sequence data. Participants with major mitochondrial haplotypes H6A1A and H6A1B showed a reduced risk of AD (p=0.017, corrected for multiple comparisons). The protective haplotypes were defined by three variants: m.3915G>A, m.4727A>G, and m.9380G>A. These three variants characterize two different major haplogroups. Together m.4727A>G and m.9380G>A define H6A1, and it has been suggested that m.3915G>A defines H6A. Additional variants differentiate H6A1A and H6A1B; however, none of these variants had a significant relationship with AD case-control status. Our findings provide evidence of a reduced risk of AD for individuals with mtDNA haplotypes H6A1A and H6A1B. These findings are the results of the largest study to date with complete mtDNA genome sequence data, yet the functional significance of the associated haplotypes remains unknown and replication in other studies is necessary.
Mitochondrial Genomic Analysis of Late Onset Alzheimer’s Disease Reveals Protective Haplogroups H6A1A/H6A1B: The Cache County Study on Memory in Aging

PubMed Central

Ridge, Perry G.; Maxwell, Taylor J.; Corcoran, Christopher D.; Norton, Maria C.; Tschanz, JoAnn T.; O’Brien, Elizabeth; Kerber, Richard A.; Cawthon, Richard M.; Munger, Ronald G.; Kauwe, John S. K.

2012-01-01

Background: Alzheimer’s disease (AD) is the most common cause of dementia, and AD risk clusters within families. Part of the familial aggregation of AD is accounted for by excess maternal vs. paternal inheritance, a pattern consistent with mitochondrial inheritance. The role of specific mitochondrial DNA (mtDNA) variants and haplogroups in AD risk is uncertain. Methodology/Principal Findings: We determined the complete mitochondrial genome sequence of 1007 participants in the Cache County Study on Memory in Aging, a population-based prospective cohort study of dementia in northern Utah. AD diagnoses were made with a multi-stage protocol that included clinical examination and review by a panel of clinical experts. We used TreeScanning, a statistically robust approach based on haplotype networks, to analyze the mtDNA sequence data. Participants with major mitochondrial haplotypes H6A1A and H6A1B showed a reduced risk of AD (p = 0.017, corrected for multiple comparisons). The protective haplotypes were defined by three variants: m.3915G>A, m.4727A>G, and m.9380G>A. These three variants characterize two different major haplogroups. Together m.4727A>G and m.9380G>A define H6A1, and it has been suggested that m.3915G>A defines H6A. Additional variants differentiate H6A1A and H6A1B; however, none of these variants had a significant relationship with AD case-control status. Conclusions/Significance: Our findings provide evidence of a reduced risk of AD for individuals with mtDNA haplotypes H6A1A and H6A1B. These findings are the results of the largest study to date with complete mtDNA genome sequence data, yet the functional significance of the associated haplotypes remains unknown and replication in other studies is necessary. PMID:23028804
Design of DNA pooling to allow incorporation of covariates in rare variants analysis.

PubMed

Guan, Weihua; Li, Chun

2014-01-01

Rapid advances in next-generation sequencing technologies facilitate genetic association studies of an increasingly wide array of rare variants. To capture the rare or less common variants, a large number of individuals will be needed. However, the cost of a large-scale study using whole genome or exome sequencing is still high. DNA pooling can serve as a cost-effective approach, but with a potential limitation: the identity of individual genomes would be lost, and therefore individual characteristics and environmental factors could not be adjusted for in association analysis, which may result in power loss and a biased estimate of genetic effect. For case-control studies, we propose a design strategy for pool creation and an analysis strategy that allows covariate adjustment, using a multiple imputation technique. Simulations show that our approach can obtain a reasonable estimate of the genotypic effect with only a slight loss of power compared to the much more expensive approach of sequencing individual genomes. Our design and analysis strategies enable more powerful and cost-effective sequencing studies of complex diseases, while allowing incorporation of covariate adjustment.

Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

PubMed

Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

2012-02-17

The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.
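As a quick consistency check on the figures above, mean coverage is conventionally total sequenced bases divided by the length of the sequence being covered. The sketch below back-calculates the length implied by 59.6 Gb at 24.7X. The function name is hypothetical, and the result is an implied value derived here, not a number stated in the abstract.

```python
def implied_genome_size_gb(total_sequence_gb, mean_coverage):
    """Genome length (Gb) implied by coverage = total bases / genome length."""
    return total_sequence_gb / mean_coverage


# Figures from the abstract: 59.6 Gb of sequence at an average of 24.7X.
size = implied_genome_size_gb(59.6, 24.7)
print(f"{size:.2f} Gb")
```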
Haplotype block structure study of the CFTR gene. Most variants are associated with the M470 allele in several European populations.

PubMed

Pompei, Fiorenza; Ciminelli, Bianca Maria; Bombieri, Cristina; Ciccacci, Cinzia; Koudova, Monika; Giorgi, Silvia; Belpinati, Francesca; Begnini, Angela; Cerny, Milos; Des Georges, Marie; Claustres, Mireille; Ferec, Claude; Macek, Milan; Modiano, Guido; Pignatti, Pier Franco

2006-01-01

An average of about 1700 CFTR (cystic fibrosis transmembrane conductance regulator) alleles from normal individuals from different European populations were extensively screened for DNA sequence variation. A total of 80 variants were observed: 61 coding SNSs (results already published), 13 noncoding SNSs, three STRs, two short deletions, and one nucleotide insertion. Eight DNA variants were classified as non-CF causing due to their high frequency of occurrence. Through this survey the CFTR has become the most exhaustively studied gene for its coding sequence variability and, though to a lesser extent, for its noncoding sequence variability as well. Interestingly, most variation was associated with the M470 allele, while the V470 allele showed an 'extended haplotype homozygosity' (EHH). These findings lead us to suggest a role for selection acting either on the M470V itself or through a hitchhiking mechanism involving a second site. The possible ancient origin of the V allele in an 'out of Africa' time frame is discussed.
Sequence analysis of the mitochondrial DNA control region of ciscoes (genus Coregonus): taxonomic implications for the Great Lakes species flock.

PubMed

Reed, K M; Dorschner, M O; Todd, T N; Phillips, R B

1998-09-01

Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

PubMed

Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

2005-02-01

We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.
Detection of de novo single nucleotide variants in offspring of atomic-bomb survivors close to the hypocenter by whole-genome sequencing.

PubMed

Horai, Makiko; Mishima, Hiroyuki; Hayashida, Chisa; Kinoshita, Akira; Nakane, Yoshibumi; Matsuo, Tatsuki; Tsuruda, Kazuto; Yanagihara, Katsunori; Sato, Shinya; Imanishi, Daisuke; Imaizumi, Yoshitaka; Hata, Tomoko; Miyazaki, Yasushi; Yoshiura, Koh-Ichiro

2018-03-01

Ionizing radiation released by the atomic bombs at Hiroshima and Nagasaki, Japan, in 1945 caused many long-term illnesses, including increased risks of malignancies such as leukemia and solid tumours. Radiation has demonstrated genetic effects in animal models, leading to concerns over the potential hereditary effects of atomic bomb-related radiation. However, no direct analyses of whole DNA have yet been reported. We therefore investigated de novo variants in offspring of atomic-bomb survivors by whole-genome sequencing (WGS). We collected peripheral blood from three trios, each comprising a father (atomic-bomb survivor with acute radiation symptoms), a non-exposed mother, and their child, none of whom had any past history of haematological disorders. One trio of non-exposed individuals was included as a control. DNA was extracted and the numbers of de novo single nucleotide variants in the children were counted by WGS with sequencing confirmation. Gross structural variants were also analysed. Written informed consent was obtained from all participants prior to the study. There were 62, 81, and 42 de novo single nucleotide variants in the children of atomic-bomb survivors, compared with 48 in the control trio. There were no gross structural variants in any trio. These findings are in accord with previously published results that also showed no significant genetic effects of atomic-bomb radiation on second-generation survivors.
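The trio comparison described above rests on a simple rule: a candidate de novo variant is an allele seen in the child but in neither parent. The sketch below shows only that core filter. The function name, genotype representation, and example sites are hypothetical, and the study's actual calling and confirmation steps are not reproduced here.

```python
def is_de_novo(child, father, mother):
    """True if the child carries an allele observed in neither parent.

    Genotypes are tuples of alleles, e.g. ("A", "G").
    """
    parental_alleles = set(father) | set(mother)
    return any(allele not in parental_alleles for allele in child)


# Hypothetical genotype calls at three positions in one trio.
sites = [
    {"pos": 101, "child": ("A", "G"), "father": ("A", "A"), "mother": ("A", "A")},
    {"pos": 202, "child": ("C", "T"), "father": ("C", "T"), "mother": ("C", "C")},
    {"pos": 303, "child": ("G", "G"), "father": ("G", "G"), "mother": ("G", "G")},
]

de_novo_positions = [
    s["pos"] for s in sites if is_de_novo(s["child"], s["father"], s["mother"])
]
print(de_novo_positions)
```

In practice, candidates from such a filter are dominated by genotyping errors, which is why the abstract mentions confirmation by sequencing.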
Targeted exome sequencing of suspected mitochondrial disorders

PubMed Central

Lieber, Daniel S.; Calvo, Sarah E.; Shanahan, Kristy; Slate, Nancy G.; Liu, Shangtao; Hershman, Steven G.; Gold, Nina B.; Chapman, Brad A.; Thorburn, David R.; Berry, Gerard T.; Schmahmann, Jeremy D.; Borowsky, Mark L.; Mueller, David M.; Sims, Katherine B.

2013-01-01

Objective: To evaluate the utility of targeted exome sequencing for the molecular diagnosis of mitochondrial disorders, which exhibit marked phenotypic and genetic heterogeneity. Methods: We considered a diverse set of 102 patients with suspected mitochondrial disorders based on clinical, biochemical, and/or molecular findings, and whose disease ranged from mild to severe, with varying age at onset. We sequenced the mitochondrial genome (mtDNA) and the exons of 1,598 nuclear-encoded genes implicated in mitochondrial biology, mitochondrial disease, or monogenic disorders with phenotypic overlap. We prioritized variants likely to underlie disease and established molecular diagnoses in accordance with current clinical genetic guidelines. Results: Targeted exome sequencing yielded molecular diagnoses in established disease loci in 22% of cases, including 17 of 18 (94%) with prior molecular diagnoses and 5 of 84 (6%) without. The 5 new diagnoses implicated 2 genes associated with canonical mitochondrial disorders (NDUFV1, POLG2), and 3 genes known to underlie other neurologic disorders (DPYD, KARS, WFS1), underscoring the phenotypic and biochemical overlap with other inborn errors. We prioritized variants in an additional 26 patients, including recessive, X-linked, and mtDNA variants that were enriched 2-fold over background and await further support of pathogenicity. In one case, we modeled patient mutations in yeast to provide evidence that recessive mutations in ATP5A1 can underlie combined respiratory chain deficiency. Conclusion: The results demonstrate that targeted exome sequencing is an effective alternative to the sequential testing of mtDNA and individual nuclear genes as part of the investigation of mitochondrial disease. Our study underscores the ongoing challenge of variant interpretation in the clinical setting. PMID:23596069
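The diagnostic yields reported in the Results can be reproduced from the stated counts. The short check below uses only numbers from the abstract; the helper name is hypothetical.

```python
def percent(numerator, denominator):
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


with_prior = percent(17, 18)          # patients with prior molecular diagnoses
without_prior = percent(5, 84)        # patients without prior diagnoses
overall = percent(17 + 5, 18 + 84)    # all 102 patients combined

print(with_prior, without_prior, overall)
```

The combined figure shows that the overall 22% yield is the sum of the two subgroups (22 of 102 patients).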
Exome Sequence Analysis of 14 Families With High Myopia.

PubMed

Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

2017-04-01

To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.
Variant ribosomal RNA alleles are conserved and exhibit tissue-specific expression

PubMed Central

Parks, Matthew M.; Kurylo, Chad M.; Dass, Randall A.; Bojmar, Linda; Lyden, David; Vincent, C. Theresa; Blanchard, Scott C.

2018-01-01

The ribosome, the integration point for protein synthesis in the cell, is conventionally considered a homogeneous molecular assembly that only passively contributes to gene expression. Yet, epigenetic features of the ribosomal DNA (rDNA) operon and changes in the ribosome’s molecular composition have been associated with disease phenotypes, suggesting that the ribosome itself may possess inherent regulatory capacity. Analyzing whole-genome sequencing data from the 1000 Genomes Project and the Mouse Genomes Project, we find that rDNA copy number varies widely across individuals, and we identify pervasive intra- and interindividual nucleotide variation in the 5S, 5.8S, 18S, and 28S ribosomal RNA (rRNA) genes of both human and mouse. Conserved rRNA sequence heterogeneities map to functional centers of the assembled ribosome, variant rRNA alleles exhibit tissue-specific expression, and ribosomes bearing variant rRNA alleles are present in the actively translating ribosome pool. These findings provide a critical framework for exploring the possibility that the expression of genomically encoded variant rRNA alleles gives rise to physically and functionally heterogeneous ribosomes that contribute to mammalian physiology and human disease. PMID:29503865
Checking of individuality by DNA profiling.

PubMed

Brdicka, R; Nürnberg, P

1993-08-25

A review of methods of DNA analysis used in forensic medicine for identification, paternity testing, etc. is provided. Among other techniques, DNA fingerprinting using different probes and polymerase chain reaction-based techniques such as amplified sequence polymorphisms and minisatellite variant repeat mapping are thoroughly described and both theoretical and practical aspects are discussed.
Strategic approaches to unraveling genetic causes of cardiovascular diseases

USDA-ARS's Scientific Manuscript database

DNA sequence variants are major components of the "causal field" for virtually all medical phenotypes, whether single gene familial disorders or complex traits without a clear familial aggregation. The causal variants in single gene disorders are necessary and sufficient to impart large effects. In ...
A Directed Molecular Evolution Approach to Improved Immunogenicity of the HIV-1 Envelope Glycoprotein

PubMed Central

Du, Sean X.; Xu, Li; Zhang, Wenge; Tang, Susan; Boenig, Rebecca I.; Chen, Helen; Mariano, Ellaine B.; Zwick, Michael B.; Parren, Paul W. H. I.; Burton, Dennis R.; Wrin, Terri; Petropoulos, Christos J.; Ballantyne, John A.; Chambers, Michael; Whalen, Robert G.

2011-01-01

A prophylactic vaccine is needed to slow the spread of HIV-1 infection. Optimization of the wild-type envelope glycoproteins to create immunogens that can elicit effective neutralizing antibodies is a high priority. Starting with ten genes encoding subtype B HIV-1 gp120 envelope glycoproteins and using in vitro homologous DNA recombination, we created chimeric gp120 variants that were screened for their ability to bind neutralizing monoclonal antibodies. Hundreds of variants were identified with novel antigenic phenotypes that exhibit considerable sequence diversity. Immunization of rabbits with these gp120 variants demonstrated that the majority can induce neutralizing antibodies to HIV-1. One novel variant, called ST-008, induced significantly improved neutralizing antibody responses when assayed against a large panel of primary HIV-1 isolates. Further study of various deletion constructs of ST-008 showed that the enhanced immunogenicity results from a combination of effective DNA priming, an enhanced V3-based response, and an improved response to the constant backbone sequences. PMID:21738594
Evidence for two transferrin loci in the Salmo trutta genome.

PubMed

Rozman, T; Dovc, P; Marić, S; Kokalj-Vokac, N; Erjavec-Skerget, A; Rab, P; Snoj, A

2008-12-01

To determine the organization of transferrin (TF) locus in the Salmo trutta genome, partial DNA and cDNA sequencing, fluorescent in situ hybridization (FISH) and Salmo salar BAC analysis were performed. TF expression levels and copy number prediction were assessed using real-time PCR. In addition to two previously reported DNA TF variant sequences of S. trutta and Salmo marmoratus (TF1), two novel variant sequences (TF2) were revealed in both species. Variant-specific sequence tags, characterizing two variants for each TF type (TF1 and TF2), were identified in genomic clones from each of the F1 hybrids between S. trutta and S. marmoratus. These clearly documented double heterozygote status at the TF loci. The real-time PCR data showed that each of the two TF types (TF1 and TF2) existed in one copy only and that the transcription of TF2 was considerably lower compared with TF1. Using FISH, hybridization signals were observed on two medium-sized acrocentric chromosomes of S. trutta karyotype. A TF type-specific PCR followed by a restriction analysis revealed the presence of two TF loci in the majority of analysed BAC clones. It was concluded that the TF gene is duplicated in the genome of S. trutta, and that the two TF loci are located adjacent to one another on the same chromosome. The differing transcription levels of TF1 and TF2 appear to depend on the corresponding promoter activity, which at least for TF2 seems to vary between different Salmo congeners.
MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing

PubMed Central

Diroma, Maria Angela; Santorsola, Mariangela; Guttà, Cristiano; Gasparre, Giuseppe; Picardi, Ernesto; Pesole, Graziano; Attimonelli, Marcella

2014-01-01

Motivation: The increasing availability of mitochondria-targeted and off-target sequencing data in whole-exome and whole-genome sequencing studies (WXS and WGS) has raised the demand for effective pipelines to accurately measure heteroplasmy and to easily recognize the most functionally important mitochondrial variants among a huge number of candidates. To this end, we developed MToolBox, a highly automated pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. Results: MToolBox implements an effective computational strategy for mitochondrial genome assembly and haplogroup assignment, also including a prioritization analysis of detected variants. MToolBox provides a Variant Call Format file featuring, for the first time, allele-specific heteroplasmy and annotation files with prioritized variants. MToolBox was tested on simulated samples and applied to 1000 Genomes WXS datasets. Availability and implementation: MToolBox package is available at https://sourceforge.net/projects/mtoolbox/. Contact: marcella.attimonelli@uniba.it Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25028726
Deep sequencing shows low-level oncogenic hepatitis B virus variants persists post-liver transplant despite potent anti-HBV prophylaxis.

PubMed

Lau, K C K; Osiowy, C; Giles, E; Lusina, B; van Marle, G; Burak, K W; Coffin, C S

2018-06-01

Recent studies suggest that withdrawal of hepatitis B immune globulin (HBIG) and nucleos(t)ide analogues (NA) prophylaxis may be considered in HBV surface antigen (HBsAg)-negative liver transplant (LT) recipients with a low risk of disease recurrence. However, the frequency of occult HBV infection (OBI) and HBV variants after LT in the current era of potent NA therapy is unknown. Twelve LT recipients on prophylaxis were tested in matched plasma and peripheral blood mononuclear cells (PBMCs) for HBV quasispecies by in-house nested PCR and next-generation sequencing of amplicons. HBV covalently closed circular DNA (cccDNA) was detected in Hirt DNA isolated from PBMCs with cccDNA-specific primers and confirmed by nucleic acid hybridization and Sanger sequencing. HBV mRNA in PBMC was detected with reverse-transcriptase nested PCR. In LT recipients on immunosuppressive therapy (10/12 male; median age 57.5 [IQR: 39.8-66.5]; median follow-up post-LT 60 months; 6 pre-LT hepatocellular carcinoma [HCC]), 9 were HBsAg-. HBV DNA was detected in all plasma and PBMC tested; cccDNA and/or mRNA was detected in the PBMC of 10/12 patients. Significant HBV quasispecies diversity (ie 143-2212 nonredundant HBV species) was noted in both sites, and single nucleotide polymorphisms associated with cirrhosis and HCC were detected at varying frequencies. In conclusion, OBI and HBV variants associated with severe liver disease persist in LT recipients on prophylaxis. Although HBV control and cccDNA transcriptional silencing may occur despite immunosuppression, complete virological eradication does not occur in LT recipients with a history of HBV-related end-stage liver disease. © 2018 John Wiley & Sons Ltd.
A comprehensive characterization of rare mitochondrial DNA variants in neuroblastoma.

PubMed

Calabrese, Francesco Maria; Clima, Rosanna; Pignataro, Piero; Lasorsa, Vito Alessandro; Hogarty, Michael D; Castellano, Aurora; Conte, Massimo; Tonini, Gian Paolo; Iolascon, Achille; Gasparre, Giuseppe; Capasso, Mario

2016-08-02

Neuroblastoma, a tumor of the developing sympathetic nervous system, is a common childhood neoplasm that is often lethal. Mitochondrial DNA (mtDNA) mutations have been found in most tumors including neuroblastoma. We extracted mtDNA data from a cohort of neuroblastoma samples that had undergone Whole Exome Sequencing (WES) and also used snap-frozen samples in which mtDNA was entirely sequenced by Sanger technology. We next undertook the challenge of determining those mutations that are relevant to, or have arisen during, tumor development. The bioinformatics pipeline used to extract mitochondrial variants from matched tumor/blood samples was enriched by a set of filters inclusive of heteroplasmic fraction, nucleotide variability, and in silico prediction of pathogenicity. Our in silico multistep workflow, applied to both WES and Sanger-sequenced neuroblastoma samples, allowed us to identify a limited burden of somatic and germline mitochondrial mutations with a potential pathogenic impact. The few singleton germline and somatic mitochondrial mutations that emerged do not, according to our in silico analysis, appear to affect the development of neuroblastoma. Our findings are consistent with the hypothesis that most mitochondrial somatic mutations can be considered as 'passengers' and consequently have no discernible effect in this type of cancer.
Characterization of the two intra-individual sequence variants in the 18S rRNA gene in the plant parasitic nematode, Rotylenchulus reniformis.

PubMed

Nyaku, Seloame T; Sripathi, Venkateswara R; Kantety, Ramesh V; Gu, Yong Q; Lawrence, Kathy; Sharma, Govind C

2013-01-01

The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and, because of its stable persistence through generations, it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More frequently, it has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis, labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only the RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report identifying two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes the simultaneous co-existence of these two variants of this gene.
Characterization of the Two Intra-Individual Sequence Variants in the 18S rRNA Gene in the Plant Parasitic Nematode, Rotylenchulus reniformis

PubMed Central

Nyaku, Seloame T.; Sripathi, Venkateswara R.; Kantety, Ramesh V.; Gu, Yong Q.; Lawrence, Kathy; Sharma, Govind C.

2013-01-01

The 18S rRNA gene is fundamental to cellular and organismal protein synthesis and, because of its stable persistence through generations, it is also used in phylogenetic analysis among taxa. Sequence variation in this gene within a single species is rare, but it has been observed in a few metazoan organisms. More frequently, it has been reported in the non-transcribed spacer region. Here, we have identified two sequence variants within the near-full coding region of the 18S rRNA gene from a single reniform nematode (RN) Rotylenchulus reniformis, labeled as reniform nematode variant 1 (RN_VAR1) and variant 2 (RN_VAR2). All sequences from three of the four isolates had both RN variants in their sequences; however, isolate 13B had only the RN variant 2 sequence. Specific variable base sites (96 or 5.5%) were found within the 18S rRNA gene that can clearly distinguish the two 18S rDNA variants of RN, in 11 (25.0%) and 33 (75.0%) of the 44 RN clones, for RN_VAR1 and RN_VAR2, respectively. Neighbor-joining trees show that the RN_VAR1 is very similar to the previously existing R. reniformis sequence in GenBank, while the RN_VAR2 sequence is more divergent. This is the first report identifying two major variants of the 18S rRNA gene in the same single RN; it documents the specific base variation between the two variants and hypothesizes the simultaneous co-existence of these two variants of this gene. PMID:23593343
Towards Clinical Molecular Diagnosis of Inherited Cardiac Conditions: A Comparison of Bench-Top Genome DNA Sequencers

PubMed Central

Wilkinson, Samuel L.; John, Shibu; Walsh, Roddy; Novotny, Tomas; Valaskova, Iveta; Gupta, Manu; Game, Laurence; Barton, Paul J R.; Cook, Stuart A.; Ware, James S.

2013-01-01

Background Molecular genetic testing is recommended for diagnosis of inherited cardiac disease, to guide prognosis and treatment, but access is often limited by cost and availability. Recently introduced high-throughput bench-top DNA sequencing platforms have the potential to overcome these limitations. Methodology/Principal Findings We evaluated two next-generation sequencing (NGS) platforms for molecular diagnostics. The protein-coding regions of six genes associated with inherited arrhythmia syndromes were amplified from 15 human samples using parallelised multiplex PCR (Access Array, Fluidigm), and sequenced on the MiSeq (Illumina) and Ion Torrent PGM (Life Technologies). Overall, 97.9% of the target was sequenced adequately for variant calling on the MiSeq, and 96.8% on the Ion Torrent PGM. Regions missed tended to be of high GC-content, and most were problematic for both platforms. Variant calling was assessed using 107 variants detected using Sanger sequencing: within adequately sequenced regions, variant calling on both platforms was highly accurate (Sensitivity: MiSeq 100%, PGM 99.1%. Positive predictive value: MiSeq 95.9%, PGM 95.5%). At the time of the study the Ion Torrent PGM had a lower capital cost and individual runs were cheaper and faster. The MiSeq had a higher capacity (requiring fewer runs), with reduced hands-on time and simpler laboratory workflows. Both provide significant cost and time savings over conventional methods, even allowing for adjunct Sanger sequencing to validate findings and sequence exons missed by NGS. Conclusions/Significance MiSeq and Ion Torrent PGM both provide accurate variant detection as part of a PCR-based molecular diagnostic workflow, and provide alternative platforms for molecular diagnosis of inherited cardiac conditions. Though there were performance differences at this throughput, platforms differed primarily in terms of cost, scalability, protocol stability and ease of use. 
Compared with current molecular genetic diagnostic tests for inherited cardiac arrhythmias, these NGS approaches are faster, less expensive, and yet more comprehensive. PMID:23861798
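
The accuracy figures quoted in this abstract follow the standard definitions of sensitivity and positive predictive value. A minimal sketch of that arithmetic is shown below; the function names and the example counts are illustrative assumptions, not values taken from the paper (only the total of 107 Sanger-confirmed variants is stated in the abstract).

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of known (e.g. Sanger-confirmed) variants that were called."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of called variants that are real."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, assumed for illustration only:
# 106 of 107 reference variants recovered, plus 5 unconfirmed extra calls.
sens = sensitivity(true_pos=106, false_neg=1)
ppv = positive_predictive_value(true_pos=106, false_pos=5)
print(f"sensitivity = {sens:.1%}, PPV = {ppv:.1%}")
```

With these assumed counts the printed values round to the same percentages reported for one platform, but the actual per-platform counts are not given in the abstract.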
Whole genome sequences of a male and female supercentenarian, ages greater than 114 years.

PubMed

Sebastiani, Paola; Riva, Alberto; Montano, Monty; Pham, Phillip; Torkamani, Ali; Scherba, Eugene; Benson, Gary; Milton, Jacqueline N; Baldwin, Clinton T; Andersen, Stacy; Schork, Nicholas J; Steinberg, Martin H; Perls, Thomas T

2011-01-01

Supercentenarians (age 110+ years old) generally delay or escape age-related diseases and disability well beyond the age of 100 and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of male and female supercentenarians, both age >114 years old. We show that: (1) the sequence variant spectrum of these two individuals' DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to-date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging.
Whole Genome Sequences of a Male and Female Supercentenarian, Ages Greater than 114 Years

PubMed Central

Sebastiani, Paola; Riva, Alberto; Montano, Monty; Pham, Phillip; Torkamani, Ali; Scherba, Eugene; Benson, Gary; Milton, Jacqueline N.; Baldwin, Clinton T.; Andersen, Stacy; Schork, Nicholas J.; Steinberg, Martin H.; Perls, Thomas T.

2012-01-01

Supercentenarians (age 110+ years old) generally delay or escape age-related diseases and disability well beyond the age of 100 and this exceptional survival is likely to be influenced by a genetic predisposition that includes both common and rare genetic variants. In this report, we describe the complete genomic sequences of male and female supercentenarians, both age >114 years old. We show that: (1) the sequence variant spectrum of these two individuals’ DNA sequences is largely comparable to existing non-supercentenarian genomes; (2) the two individuals do not appear to carry most of the well-established human longevity enabling variants already reported in the literature; (3) they have a comparable number of known disease-associated variants relative to most human genomes sequenced to-date; (4) approximately 1% of the variants these individuals possess are novel and may point to new genes involved in exceptional longevity; and (5) both individuals are enriched for coding variants near longevity-associated variants that we discovered through a large genome-wide association study. These analyses suggest that there are both common and rare longevity-associated variants that may counter the effects of disease-predisposing variants and extend lifespan. The continued analysis of the genomes of these and other rare individuals who have survived to extremely old ages should provide insight into the processes that contribute to the maintenance of health during extreme aging. PMID:22303384

A comprehensive approach to identification of pathogenic FANCA variants in Fanconi anemia patients and their families.

PubMed

Kimble, Danielle C; Lach, Francis P; Gregg, Siobhan Q; Donovan, Frank X; Flynn, Elizabeth K; Kamat, Aparna; Young, Alice; Vemulapalli, Meghana; Thomas, James W; Mullikin, James C; Auerbach, Arleen D; Smogorzewska, Agata; Chandrasekharappa, Settara C

2018-02-01

Fanconi anemia (FA) is a rare recessive DNA repair deficiency resulting from mutations in one of at least 22 genes. Two-thirds of FA families harbor mutations in FANCA. To genotype patients in the International Fanconi Anemia Registry (IFAR) we employed multiple methodologies, screening 216 families for FANCA mutations. We describe identification of 57 large deletions and 261 sequence variants, in 159 families. All but seven families harbored distinct combinations of two mutations demonstrating high heterogeneity. Pathogenicity of the 18 novel missense variants was analyzed functionally by determining the ability of the mutant cDNA to improve the survival of a FANCA-null cell line when treated with MMC. Overexpressed pathogenic missense variants were found to reside in the cytoplasm, and nonpathogenic in the nucleus. RNA analysis demonstrated that two variants (c.522G > C and c.1565A > G), predicted to encode missense variants, which were determined to be nonpathogenic by a functional assay, caused skipping of exons 5 and 16, respectively, and are most likely pathogenic. We report 48 novel FANCA sequence variants. Defining both variants in a large patient cohort is a major step toward cataloging all FANCA variants, and permitting studies of genotype-phenotype correlations. © Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Control of artefactual variation in reported inter-sample relatedness during clinical use of a Mycobacterium tuberculosis sequencing pipeline.

PubMed

Wyllie, David H; Sanderson, Nicholas; Myers, Richard; Peto, Tim; Robinson, Esther; Crook, Derrick W; Smith, E Grace; Walker, A Sarah

2018-06-06

Contact tracing requires reliable identification of closely related bacterial isolates. When we noticed the reporting of artefactual variation between M. tuberculosis isolates during routine next generation sequencing of Mycobacterium spp, we investigated its basis in 2,018 consecutive M. tuberculosis isolates. In the routine process used, clinical samples were decontaminated and inoculated into broth cultures; from positive broth cultures DNA was extracted, sequenced, reads mapped, and consensus sequences determined. We investigated the process of consensus sequence determination, which selects the most common nucleotide at each position. Having determined the high-quality read depth and depth of minor variants across 8,006 M. tuberculosis genomic regions, we quantified the relationship between the minor variant depth and the amount of non-Mycobacterial bacterial DNA, which originates from commensal microbes killed during sample decontamination. In the presence of non-Mycobacterial bacterial DNA, we found significant increases in minor variant frequencies of more than 1.5 fold in 242 regions covering 5.1% of the M. tuberculosis genome. Included within these were four high variation regions strongly influenced by the amount of non-Mycobacterial bacterial DNA. Excluding these four regions from pairwise distance comparisons reduced biologically implausible variation from 5.2% to 0% in an independent validation set derived from 226 individuals. Thus, we have demonstrated an approach identifying critical genomic regions contributing to clinically relevant artefactual variation in bacterial similarity searches. The approach described monitors the outputs of the complex multi-step laboratory and bioinformatics process, allows periodic process adjustments, and will have application to quality control of routine bacterial genomics. Copyright © 2018 Wyllie et al.
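
The consensus step described in this abstract (selecting the most common nucleotide at each position) and the notion of minor variant frequency can be sketched as follows. This is a simplified illustration, not the pipeline's actual code: the input representation (one string of base calls per genomic position) and the function name are assumptions, and quality filtering and read mapping are omitted.

```python
from collections import Counter


def consensus_and_minor_fraction(pileup_columns):
    """For each non-empty column of base calls, return (consensus base, minor fraction).

    The consensus base is the most common nucleotide in the column; the minor
    fraction is the share of calls that disagree with it. Ties are broken by
    first occurrence in the column.
    """
    results = []
    for column in pileup_columns:
        counts = Counter(column)
        base, major_depth = counts.most_common(1)[0]
        depth = sum(counts.values())
        results.append((base, (depth - major_depth) / depth))
    return results


# Hypothetical pileup: three positions with eight reads each.
columns = ["AAAAAAAA", "CCCCCCTT", "GGGGGGGG"]
calls = consensus_and_minor_fraction(columns)
consensus = "".join(base for base, _ in calls)
print(consensus, [round(frac, 2) for _, frac in calls])
```

In this toy input the second position carries a minor variant fraction of 0.25, the kind of signal the study relates to the amount of non-mycobacterial DNA in the sample.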
Biochemical and biophysical characterization of the major outer surface protein, OSP-A from North American and European isolates of Borrelia burgdorferi

DOE Office of Scientific and Technical Information (OSTI.GOV)

McGrath, B.C.; Dunn, J.J.; France, L.L.

1995-12-31

Lyme borreliosis, caused by the spirochete Borrelia burgdorferi, is the most common vector-borne disease in North America and Western Europe. Because they elicit the major delayed immune response in humans, a better understanding of the major outer surface lipoproteins OspA and OspB is of much interest. These proteins have been shown to exhibit three distinct phylogenetic genotypes based on their DNA sequences. This paper describes the cloning of genomic DNA for each variant and its amplification by PCR. DNA sequence data were used for computer-driven phylogenetic analysis and to deduce amino acid sequences. Overproduction of variant OspAs was carried out in E. coli using a T7-based expression system. Circular dichroism and fluorescence studies were carried out on the recombinant B31 OspA, yielding evidence supporting a B31 protein containing 11% alpha-helix, 34% antiparallel beta-sheet, and 12% parallel beta-sheet.
[Application of the polymerase chain reaction (PCR) in the diagnosis of Hb S-beta(+)-thalassemia].

PubMed

Harano, K; Harano, T; Kushida, Y; Ueda, S

1991-08-01

Isoelectric focusing of the hemolysate prepared from a two-year-old American black boy with microcytic hypochromia showed the presence of a high percentage (63.3%) of such Hb variant as Hb S, while the levels of Hb A, Hb F and Hb A2 were 20.0%, 12.7%, and 4.0%, respectively. The ratio of the non-alpha-chain to the alpha-chain of the biosynthesized globin chains was 0.49. The variant was identified as Hb S by amino acid analysis of the abnormal peptide (beta T-1) and digestion of DNA amplified by the polymerase chain reaction with enzyme Eco 81 I. This was further confirmed by DNA sequencing. DNA sequencing of a beta-gene without the beta s-mutation revealed a nucleotide change of T to C in the polyadenylation signal sequence AATAAA 3' to the beta-gene, resulting in beta(+)-thalassemia. These results are consistent with the existence of a beta s-gene and a beta(+)-thalassemia gene in trans.
Comparison and evaluation of two exome capture kits and sequencing platforms for variant calling.

PubMed

Zhang, Guoqiang; Wang, Jianfeng; Yang, Jin; Li, Wenjie; Deng, Yutian; Li, Jing; Huang, Jun; Hu, Songnian; Zhang, Bing

2015-08-05

To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variants of target genomic regions at low cost. Ion Proton, the latest updated semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole exome sequencing that achieves a rapid turnaround time. However, few studies have comprehensively compared and evaluated the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. The Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit together make up TargetSeq-Proton, whereas SureSelect-HiSeq is based on the Agilent SureSelect Human All Exon v4 Kit and the HiSeq 2000 sequencer. Here, we sequenced exonic DNA from four human blood samples using both TargetSeq-Proton and SureSelect-HiSeq. We then called variants in the exonic regions that overlapped between the two exome capture kits (33.6 Mb). The rates of shared variant loci called by the two sequencing platforms ranged from 68.0 to 75.3% in the four samples, whereas the concordance of co-detected variant loci reached 99%. Sanger sequencing validation revealed that the validated rate of concordant single nucleotide polymorphisms (SNPs) (91.5%) was higher than that of the SNPs specific to TargetSeq-Proton (60.0%) or specific to SureSelect-HiSeq (88.3%). With regard to 1-bp small insertions and deletions (InDels), the Sanger sequencing validated rates of concordant variants (100.0%) and SureSelect-HiSeq-specific variants (89.6%) were higher than that of TargetSeq-Proton-specific variants (15.8%). In the sequencing of exonic regions, combining the two sequencing strategies (SureSelect-HiSeq and TargetSeq-Proton) increased the variant calling specificity for concordant variant loci and the sensitivity for variant loci called by any one platform. 
However, for the sequencing of platform-specific variants, the accuracy of variant calling by HiSeq 2000 was higher than that of Ion Proton, specifically for the InDel detection. Moreover, the variant calling software also influences the detection of SNPs and, specifically, InDels in Ion Proton exome sequencing.
Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches.

PubMed

Guidugli, Lucia; Shimelis, Hermela; Masica, David L; Pankratz, Vernon S; Lipton, Gary B; Singh, Namit; Hu, Chunling; Monteiro, Alvaro N A; Lindor, Noralane M; Goldgar, David E; Karchin, Rachel; Iversen, Edwin S; Couch, Fergus J

2018-01-17

Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ≥99% probability of pathogenicity, and 73 had ≥95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Cystinuria Associated with Different SLC7A9 Gene Variants in the Cat

PubMed Central

Raj, Karthik; Osborne, Carl; Giger, Urs

2016-01-01

Cystinuria is a classical inborn error of metabolism characterized by a selective proximal renal tubular defect affecting cystine, ornithine, lysine, and arginine (COLA) reabsorption, which can lead to uroliths and urinary obstruction. In humans, dogs and mice, cystinuria is caused by variants in one of two genes, SLC3A1 and SLC7A9, which encode the rBAT and bo,+AT subunits of the bo,+ basic amino acid transporter system, respectively. In this study, exons and flanking regions of the SLC3A1 and SLC7A9 genes were sequenced from genomic DNA of cats (Felis catus) with COLAuria and cystine calculi. Relative to the Felis catus-6.2 reference genome sequence, DNA sequences from these affected cats revealed 3 unique homozygous SLC7A9 missense variants: one in exon 5 (p.Asp236Asn) from a non-purpose-bred medium-haired cat, one in exon 7 (p.Val294Glu) in a Maine Coon and a Sphinx cat, and one in exon 10 (p.Thr392Met) from a non-purpose-bred long-haired cat. A genotyping assay subsequently identified another cystinuric domestic medium-haired cat that was homozygous for the variant originally identified in the purebred cats. These missense variants result in deleterious amino acid substitutions of highly conserved residues in the bo,+AT protein. A limited population survey supported that the variants found were likely causative. The remaining 2 sequenced domestic short-haired cats had a heterozygous variant at a splice donor site in intron 10 and a homozygous single nucleotide variant at a branchpoint in intron 11 of SLC7A9, respectively. This study identifies the first SLC7A9 variants causing feline cystinuria and reveals that, as in humans and dogs, this disease is genetically heterogeneous in cats. PMID:27404572
[Genetic variants in miRNAs and its association with breast cancer].

PubMed

Méndez-Gómez, Susana; Ruiz Esparza-Garrido, Ruth; Velázquez-Flores, Miguel; Dolores-Vergara, Maria; Salamanca-Gómez, Fabio; Arenas-Aranda, Diego Julio

2014-01-01

In Mexico, breast cancer is the leading cause of cancer death in females. At the molecular level, non-coding RNAs and especially microRNAs have played an important role in the origin and development of this neoplasm. In the Anglo-Saxon population, diverse genetic variants in microRNA genes and in their targets are associated with the development of this disease. In the Mexican population it is not known if these or other variants exist. Identification of these or new variants in our population is fundamental in order to have a better understanding of cancer development and to help establish a better diagnostic strategy. DNA was isolated from mammary tumors, adjacent tissue and peripheral blood of Mexican females with or without cancer. From DNA, five microRNA genes and three of their targets were amplified and sequenced. Genetic variants associated with breast cancer in an Anglo-Saxon population have been previously identified in these sequences. In the samples studied we identified seven single nucleotide polymorphisms (SNPs). Two had not been previously described and were identified only in women with cancer. The new variants may be genetic predisposition factors for the development of breast cancer in our population. Further experiments are needed to determine the involvement of these variants in the development, establishment and progression of breast cancer.
Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data

PubMed Central

Jun, Goo; Flickinger, Matthew; Hetrick, Kurt N.; Romm, Jane M.; Doheny, Kimberly F.; Abecasis, Gonçalo R.; Boehnke, Michael; Kang, Hyun Min

2012-01-01

DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array-based genotype data alone. Analysis of sequencing reads allows contamination detection after sequence data is generated but prior to variant calling; analysis of array-based genotype data allows contamination detection prior to generation of costly sequence data. Through a combination of analysis of in silico and experimentally contaminated samples, we show that our methods can reliably detect and estimate levels of contamination as low as 1%. We evaluate the impact of DNA contamination on genotype accuracy and propose effective strategies to screen for and prevent DNA contamination in sequencing studies. PMID:23103226
Weighted Watson-Crick automata

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tamrin, Mohd Izzuddin Mohd; Turaev, Sherzod; Sembok, Tengku Mohd Tengku

There are tremendous works in biotechnology especially in area of DNA molecules. The computer society is attempting to develop smaller computing devices through computational models which are based on the operations performed on the DNA molecules. A Watson-Crick automaton, a theoretical model for DNA based computation, has two reading heads, and works on double-stranded sequences of the input related by a complementarity relation similar with the Watson-Crick complementarity of DNA nucleotides. Over the time, several variants of Watson-Crick automata have been introduced and investigated. However, they cannot be used as suitable DNA based computational models for molecular stochastic processes and fuzzy processes that are related to important practical problems such as molecular parsing, gene disease detection, and food authentication. In this paper we define new variants of Watson-Crick automata, called weighted Watson-Crick automata, developing theoretical models for molecular stochastic and fuzzy processes. We define weighted Watson-Crick automata adapting weight restriction mechanisms associated with formal grammars and automata. We also study the generative capacities of weighted Watson-Crick automata, including probabilistic and fuzzy variants. We show that weighted variants of Watson-Crick automata increase their generative power.
Weighted Watson-Crick automata

NASA Astrophysics Data System (ADS)

Tamrin, Mohd Izzuddin Mohd; Turaev, Sherzod; Sembok, Tengku Mohd Tengku

2014-07-01

There are tremendous works in biotechnology especially in area of DNA molecules. The computer society is attempting to develop smaller computing devices through computational models which are based on the operations performed on the DNA molecules. A Watson-Crick automaton, a theoretical model for DNA based computation, has two reading heads, and works on double-stranded sequences of the input related by a complementarity relation similar with the Watson-Crick complementarity of DNA nucleotides. Over the time, several variants of Watson-Crick automata have been introduced and investigated. However, they cannot be used as suitable DNA based computational models for molecular stochastic processes and fuzzy processes that are related to important practical problems such as molecular parsing, gene disease detection, and food authentication. In this paper we define new variants of Watson-Crick automata, called weighted Watson-Crick automata, developing theoretical models for molecular stochastic and fuzzy processes. We define weighted Watson-Crick automata adapting weight restriction mechanisms associated with formal grammars and automata. We also study the generative capacities of weighted Watson-Crick automata, including probabilistic and fuzzy variants. We show that weighted variants of Watson-Crick automata increase their generative power.
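
The complementarity relation on which Watson-Crick automata operate can be illustrated with a short check that two aligned strands pair position by position. This is only a sketch of the relation itself, assuming the standard DNA pairing A-T and C-G; it does not implement the two-headed automaton or the weighting mechanisms defined in the paper, and the function name is hypothetical.

```python
# Standard Watson-Crick pairing of DNA nucleotides.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_watson_crick_pair(upper: str, lower: str) -> bool:
    """Return True if the strands have equal length and every aligned
    pair of symbols is complementary (A-T, T-A, C-G, G-C)."""
    return len(upper) == len(lower) and all(
        COMPLEMENT.get(u) == l for u, l in zip(upper, lower)
    )


print(is_watson_crick_pair("ATCG", "TAGC"))  # well-formed double strand
print(is_watson_crick_pair("ATCG", "TAGG"))  # last position is mismatched
```

An automaton of this family would read such a double strand with one head per strand; the check above only validates that the input is a legal double-stranded sequence.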
Chimeric TALE recombinases with programmable DNA sequence specificity.

PubMed

Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

2012-11-01

Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.
Identification of BRCA1 missense substitutions that confer partial functional activity: potential moderate risk variants?

PubMed Central

Lovelock, Paul K; Spurdle, Amanda B; Mok, Myth TS; Farrugia, Daniel J; Lakhani, Sunil R; Healey, Sue; Arnold, Stephen; Buchanan, Daniel; Investigators, kConFab; Couch, Fergus J; Henderson, Beric R; Goldgar, David E; Tavtigian, Sean V; Chenevix-Trench, Georgia; Brown, Melissa A

2007-01-01

Introduction Many of the DNA sequence variants identified in the breast cancer susceptibility gene BRCA1 remain unclassified in terms of their potential pathogenicity. Both multifactorial likelihood analysis and functional approaches have been proposed as a means to elucidate likely clinical significance of such variants, but analysis of the comparative value of these methods for classifying all sequence variants has been limited. Methods We have compared the results from multifactorial likelihood analysis with those from several functional analyses for the four BRCA1 sequence variants A1708E, G1738R, R1699Q, and A1708V. Results Our results show that multifactorial likelihood analysis, which incorporates sequence conservation, co-inheritance, segregation, and tumour immunohistochemical analysis, may improve classification of variants. For A1708E, previously shown to be functionally compromised, analysis of oestrogen receptor, cytokeratin 5/6, and cytokeratin 14 tumour expression data significantly strengthened the prediction of pathogenicity, giving a posterior probability of pathogenicity of 99%. For G1738R, shown to be functionally defective in this study, immunohistochemistry analysis confirmed previous findings of inconsistent 'BRCA1-like' phenotypes for the two tumours studied, and the posterior probability for this variant was 96%. The posterior probabilities of R1699Q and A1708V were 54% and 69%, respectively, only moderately suggestive of increased risk. Interestingly, results from functional analyses suggest that both of these variants have only partial functional activity. R1699Q was defective in foci formation in response to DNA damage and displayed intermediate transcriptional transactivation activity but showed no evidence for centrosome amplification. In contrast, A1708V displayed an intermediate transcriptional transactivation activity and a normal foci formation response in response to DNA damage but induced centrosome amplification. 
Conclusion These data highlight the need for a range of functional studies to be performed in order to identify variants with partially compromised function. The results also raise the possibility that A1708V and R1699Q may be associated with a low or moderate risk of cancer. While data pooling strategies may provide more information for multifactorial analysis to improve the interpretation of the clinical significance of these variants, it is likely that the development of current multifactorial likelihood approaches and the consideration of alternative statistical approaches will be needed to determine whether these individually rare variants do confer a low or moderate risk of breast cancer. PMID:18036263
Sequence analysis of the mitochondrial DNA control region of ciscoes (genus Coregonus): Taxonomic implications for the Great Lakes species flock

USGS Publications Warehouse

Reed, Kent M.; Dorschner, Michael O.; Todd, Thomas N.; Phillips, Ruth B.

1998-01-01

Sequence variation in the control region (D-loop) of the mitochondrial DNA (mtDNA) was examined to assess the genetic distinctiveness of the shortjaw cisco (Coregonus zenithicus). Individuals from within the Great Lakes Basin as well as inland lakes outside the basin were sampled. DNA fragments containing the entire D-loop were amplified by PCR from specimens of C. zenithicus and the related species C. artedi, C. hoyi, C. kiyi, and C. clupeaformis. DNA sequence analysis revealed high similarity within and among species and shared polymorphism for length variants. Based on this analysis, the shortjaw cisco is not genetically distinct from other cisco species.
Low incidence of DNA sequence variation in human induced pluripotent stem cells generated by non-integrating plasmid expression

PubMed Central

Cheng, Linzhao; Hansen, Nancy F.; Zhao, Ling; Du, Yutao; Zou, Chunlin; Donovan, Frank X.; Chou, Bin-Kuan; Zhou, Guangyu; Li, Shijie; Dowey, Sarah N.; Ye, Zhaohui; Chandrasekharappa, Settara C.; Yang, Huanming; Mullikin, James C.; Liu, P. Paul

2012-01-01

Summary The utility of induced pluripotent stem cells (iPSCs) as models to study diseases and as sources for cell therapy depends on the integrity of their genomes. Despite recent publications of DNA sequence variations in the iPSCs, the true scope of such changes for the entire genome is not clear. Here we report the whole-genome sequencing of three human iPSC lines derived from two cell types of an adult donor by episomal vectors. The vector sequence was undetectable in the deeply sequenced iPSC lines. We identified 1058–1808 heterozygous single nucleotide variants (SNVs), but no copy number variants, in each iPSC line. Six to twelve of these SNVs were within coding regions in each iPSC line, but ~50% of them are synonymous changes and the remaining are not selectively enriched for known genes associated with cancers. Our data thus suggest that episome-mediated reprogramming is not inherently mutagenic during integration-free iPSC induction. PMID:22385660
Sensitivity of BRCA1/2 testing in high-risk breast/ovarian/male breast cancer families: little contribution of comprehensive RNA/NGS panel testing.

PubMed

Byers, Helen; Wallis, Yvonne; van Veen, Elke M; Lalloo, Fiona; Reay, Kim; Smith, Philip; Wallace, Andrew J; Bowers, Naomi; Newman, William G; Evans, D Gareth

2016-11-01

The sensitivity of testing BRCA1 and BRCA2 remains unresolved as the frequency of deep intronic splicing variants has not been defined in high-risk familial breast/ovarian cancer families. This variant category is reported at significant frequency in other tumour predisposition genes, including NF1 and MSH2. We carried out comprehensive whole gene RNA analysis on 45 high-risk breast/ovary and male breast cancer families with no identified pathogenic variant on exonic sequencing and copy number analysis of BRCA1/2. In addition, we undertook variant screening of a 10-gene high/moderate risk breast/ovarian cancer panel by next-generation sequencing. DNA testing identified the causative variant in 50/56 (89%) breast/ovarian/male breast cancer families with Manchester scores of ≥50 with two variants being confirmed to affect splicing on RNA analysis. RNA sequencing of BRCA1/BRCA2 on 45 individuals from high-risk families identified no deep intronic variants and did not suggest loss of RNA expression as a cause of lost sensitivity. Panel testing in 42 samples identified a known RAD51D variant, a high-risk ATM variant in another breast ovary family and a truncating CHEK2 mutation. Current exonic sequencing and copy number analysis variant detection methods of BRCA1/2 have high sensitivity in high-risk breast/ovarian cancer families. Sequence analysis of RNA does not identify any variants undetected by current analysis of BRCA1/2. However, RNA analysis clarified the pathogenicity of variants of unknown significance detected by current methods. The low diagnostic uplift achieved through sequence analysis of the other known breast/ovarian cancer susceptibility genes indicates that further high-risk genes remain to be identified.
Germline pathogenic variants in PALB2 and other cancer-predisposing genes in families with hereditary diffuse gastric cancer without CDH1 mutation: a whole-exome sequencing study.

PubMed

Fewings, Eleanor; Larionov, Alexey; Redman, James; Goldgraben, Mae A; Scarth, James; Richardson, Susan; Brewer, Carole; Davidson, Rosemarie; Ellis, Ian; Evans, D Gareth; Halliday, Dorothy; Izatt, Louise; Marks, Peter; McConnell, Vivienne; Verbist, Louis; Mayes, Rebecca; Clark, Graeme R; Hadfield, James; Chin, Suet-Feung; Teixeira, Manuel R; Giger, Olivier T; Hardwick, Richard; di Pietro, Massimiliano; O'Donovan, Maria; Pharoah, Paul; Caldas, Carlos; Fitzgerald, Rebecca C; Tischkowitz, Marc

2018-04-26

Germline pathogenic variants in the E-cadherin gene (CDH1) are strongly associated with the development of hereditary diffuse gastric cancer. There is a paucity of data to guide risk assessment and management of families with hereditary diffuse gastric cancer that do not carry a CDH1 pathogenic variant, making it difficult to make informed decisions about surveillance and risk-reducing surgery. We aimed to identify new candidate genes associated with predisposition to hereditary diffuse gastric cancer in affected families without pathogenic CDH1 variants. We did whole-exome sequencing on DNA extracted from the blood of 39 individuals (28 individuals diagnosed with hereditary diffuse gastric cancer and 11 unaffected first-degree relatives) in 22 families without pathogenic CDH1 variants. Genes with loss-of-function variants were prioritised using gene-interaction analysis to identify clusters of genes that could be involved in predisposition to hereditary diffuse gastric cancer. Protein-affecting germline variants were identified in probands from six families with hereditary diffuse gastric cancer; variants were found in genes known to predispose to cancer and in lesser-studied DNA repair genes. A frameshift deletion in PALB2 was found in one member of a family with a history of gastric and breast cancer. Two different MSH2 variants were identified in two unrelated affected individuals, including one frameshift insertion and one previously described start-codon loss. One family had a unique combination of variants in the DNA repair genes ATR and NBN. Two variants in the DNA repair gene RECQL5 were identified in two unrelated families: one missense variant and a splice-acceptor variant. The results of this study suggest a role for the known cancer predisposition gene PALB2 in families with hereditary diffuse gastric cancer and no detected pathogenic CDH1 variants. We also identified new candidate genes associated with disease risk in these families. 
Funding: UK Medical Research Council (Sackler programme), European Research Council under the European Union's Seventh Framework Programme (2007-13), National Institute for Health Research Cambridge Biomedical Research Centre, Experimental Cancer Medicine Centres, and Cancer Research UK. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY 4.0 license. All rights reserved.
iMETHYL: an integrative database of human DNA methylation, gene expression, and genomic variation.

PubMed

Komaki, Shohei; Shiwa, Yuh; Furukawa, Ryohei; Hachiya, Tsuyoshi; Ohmomo, Hideki; Otomo, Ryo; Satoh, Mamoru; Hitomi, Jiro; Sobue, Kenji; Sasaki, Makoto; Shimizu, Atsushi

2018-01-01

We launched an integrative multi-omics database, iMETHYL (http://imethyl.iwate-megabank.org). iMETHYL provides whole-DNA methylation (~24 million autosomal CpG sites), whole-genome (~9 million single-nucleotide variants), and whole-transcriptome (>14 000 genes) data for CD4+ T-lymphocytes, monocytes, and neutrophils collected from approximately 100 subjects. These data were obtained from whole-genome bisulfite sequencing, whole-genome sequencing, and whole-transcriptome sequencing, making iMETHYL a comprehensive database.
Chromatin accessibility prediction via a hybrid deep convolutional neural network.

PubMed

Liu, Qiao; Xia, Fei; Yin, Qijin; Jiang, Rui

2018-03-01

A majority of known genetic variants associated with human-inherited diseases lie in non-coding regions that lack adequate interpretation, making it indispensable to systematically discover functional sites at the whole genome level and precisely decipher their implications in a comprehensive manner. Although computational approaches have been complementing high-throughput biological experiments towards the annotation of the human genome, it still remains a big challenge to accurately annotate regulatory elements in the context of a specific cell type via automatic learning of the DNA sequence code from large-scale sequencing data. Indeed, the development of an accurate and interpretable model to learn the DNA sequence signature and further enable the identification of causative genetic variants has become essential in both genomic and genetic studies. We proposed Deopen, a hybrid framework mainly based on a deep convolutional neural network, to automatically learn the regulatory code of DNA sequences and predict chromatin accessibility. In a series of comparisons with existing methods, we show the superior performance of our model in not only the classification of accessible regions against background sequences sampled at random, but also the regression of DNase-seq signals. Besides, we further visualize the convolutional kernels and show the match of identified sequence signatures and known motifs. We finally demonstrate the sensitivity of our model in finding causative noncoding variants in the analysis of a breast cancer dataset. We expect to see wide applications of Deopen with either public or in-house chromatin accessibility data in the annotation of the human genome and the identification of non-coding variants associated with diseases. Deopen is freely available at https://github.com/kimmo1019/Deopen. Contact: ruijiang@tsinghua.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
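The abstract above mentions convolutional kernels that match known sequence motifs. As a rough illustration of that general idea only (this is not Deopen's code, and the function names `one_hot` and `scan` and the toy sequences are hypothetical), a minimal Python sketch can one-hot encode a DNA string and slide a single motif-shaped kernel along it:

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).
    Unknown bases such as N become all-zero rows."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def scan(seq_mat, kernel):
    """Slide a (k x 4) kernel along the sequence and return one score
    per window (a 'valid' cross-correlation, as in a 1D conv layer)."""
    k = len(kernel)
    scores = []
    for start in range(len(seq_mat) - k + 1):
        score = sum(
            seq_mat[start + i][j] * kernel[i][j]
            for i in range(k)
            for j in range(len(BASES))
        )
        scores.append(score)
    return scores

# Toy example: a kernel that is simply the one-hot encoding of "TATA",
# so each score counts the positions matching that motif.
motif_kernel = one_hot("TATA")
scores = scan(one_hot("GGTATAAC"), motif_kernel)
best_start = scores.index(max(scores))
```

In a trained network the kernel weights are learned rather than fixed, and many kernels are applied in parallel; the sketch only shows why a kernel's highest-scoring windows can be read as a motif.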
Method for creating polynucleotide and polypeptide sequences

NASA Technical Reports Server (NTRS)

Arnold, Frances (Inventor); Volkov, Alexander (Inventor); Shao, Zhixin (Inventor)

2003-01-01

The invention provides methods for evolving a polynucleotide toward acquisition of a desired property. Such methods entail incubating a population of parental polynucleotide variants under conditions to generate annealed polynucleotides comprising heteroduplexes. The heteroduplexes are then exposed to a cellular DNA repair system to convert the heteroduplexes to parental polynucleotide variants or recombined polynucleotide variants. The resulting polynucleotides are then screened or selected for the desired property.

CYP3A4 allelic variants with amino acid substitutions in exons 7 and 12: evidence for an allelic variant with altered catalytic activity.

PubMed

Sata, F; Sapone, A; Elizondo, G; Stocker, P; Miller, V P; Zheng, W; Raunio, H; Crespi, C L; Gonzalez, F J

2000-01-01

To determine the existence of mutant and variant CYP3A4 alleles in three racial groups and to assess functions of the variant alleles by complementary deoxyribonucleic acid (cDNA) expression. A bacterial artificial chromosome that contains the complete CYP3A4 gene was isolated and the exons and surrounding introns were directly sequenced to develop primers to polymerase chain reaction (PCR) amplify and sequence the gene from lymphocyte DNA. DNA samples from Chinese, black, and white subjects were screened. Mutating the affected amino acid in the wild-type cDNA and expressing the variant enzyme with use of the baculovirus system was used to functionally evaluate the variant allele having a missense mutation. To investigate the existence of mutant and variant CYP3A4 alleles in humans, all 13 exons and the 5'-flanking region of the human CYP3A4 gene in three racial groups were sequenced and four alleles were identified. An A-->G point mutation in the 5'-flanking region of the human CYP3A4 gene, designated CYP3A4*1B, was found in the three different racial groups. The frequency of this allele in a white population was 4.2%, whereas it was 66.7% in black subjects. The CYP3A4*1B allele was not found in Chinese subjects. A second variant allele, designated CYP3A4*2, having a Ser222Pro change, was found at a frequency of 2.7% in the white population and was absent in the black subjects and Chinese subjects analyzed. Baculovirus-directed cDNA expression revealed that the CYP3A4*2 P450 had a lower intrinsic clearance for the CYP3A4 substrate nifedipine compared with the wild-type enzyme but was not significantly different from the wild-type enzyme for testosterone 6beta-hydroxylation. Another rare allele, designated CYP3A4*3, was found in a single Chinese subject who had a Met445Thr change in the conserved heme-binding region of the P450. These are the first examples of potential function polymorphisms resulting from missense mutations in the CYP3A4 gene. The CYP3A4*2 allele was found to encode a P450 with substrate-dependent altered kinetics compared with the wild-type P450.
Rare Variation in TET2 Is Associated with Clinically Relevant Prostate Carcinoma in African-Americans

PubMed Central

Koboldt, Daniel C.; Kanchi, Krishna L.; Gui, Bin; Larson, David E.; Fulton, Robert S.; Isaacs, William B.; Kraja, Aldi; Borecki, Ingrid B.; Jia, Li; Wilson, Richard K.; Mardis, Elaine R.; Kibel, Adam S.

2016-01-01

Background: Common variants have been associated with prostate cancer risk. Unfortunately, few are reproducibly linked to aggressive disease, the phenotype of greatest clinical relevance. One possible explanation is that rare genetic variants underlie a significant proportion of the risk for aggressive disease. Method: To identify such variants, we performed a two-stage approach using whole exome sequencing followed by targeted sequencing of 800 genes in 652 aggressive prostate cancer patients and 752 disease-free controls in both African and European Americans. In each population, we tested rare variants for association using two gene-based aggregation tests. We established a study-wide significance threshold of 3.125 × 10⁻⁵ to correct for multiple testing. Results: TET2 in African-Americans was associated with aggressive disease, with 24.4% of cases harboring a rare deleterious variant compared to 9.6% of controls (FET p = 1.84 × 10⁻⁵, OR = 3.0; SKAT-O p = 2.74 × 10⁻⁵). We report 8 additional genes with suggestive evidence of association, including the DNA repair genes PARP2 and MSH6. Finally, we observed an excess of rare truncation variants in 5 genes including the DNA repair genes MSH6, BRCA1 and BRCA2. This adds to the growing body of evidence that DNA repair pathway defects may influence susceptibility to aggressive prostate cancer. Conclusion: Our findings suggest that rare variants influence risk of clinically relevant prostate cancer and, if validated, could serve to identify men for screening, prophylaxis and treatment. Impact: This study provides evidence that rare variants in TET2 may help identify African-American men at increased risk for clinically relevant prostate cancer. PMID:27486019
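The two headline numbers in this abstract can be checked with simple arithmetic. The abstract does not say how the threshold was derived; one reading that reproduces it is a Bonferroni correction of 0.05 over 1,600 tests (for example 800 genes × 2), which is an assumption here. The odds ratio is recomputed from the reported carrier percentages alone; the Fisher's exact p-value cannot be reproduced without the underlying counts, which the abstract does not give.

```python
ALPHA = 0.05
N_GENES = 800          # targeted genes, from the abstract
TESTS_PER_GENE = 2     # assumption: e.g. two populations or two tests per gene

# Bonferroni-style threshold: alpha divided by the number of tests.
threshold = ALPHA / (N_GENES * TESTS_PER_GENE)

def odds_ratio(p_cases, p_controls):
    """Odds ratio from carrier proportions in cases and controls."""
    odds_cases = p_cases / (1.0 - p_cases)
    odds_controls = p_controls / (1.0 - p_controls)
    return odds_cases / odds_controls

# Reported TET2 carrier rates: 24.4% of cases vs 9.6% of controls.
or_tet2 = odds_ratio(0.244, 0.096)
```

Under these assumptions the threshold comes out at about 3.125 × 10⁻⁵ and the odds ratio at roughly 3.0, consistent with the reported values.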
An XRCC4 Splice Mutation Associated With Severe Short Stature, Gonadal Failure, and Early-Onset Metabolic Syndrome

PubMed Central

de Bruin, Christiaan; Mericq, Verónica; Andrew, Shayne F.; van Duyvenvoorde, Hermine A.; Verkaik, Nicole S.; Losekoot, Monique; Porollo, Aleksey; Garcia, Hernán; Kuang, Yi; Hanson, Dan; Clayton, Peter; van Gent, Dik C.; Wit, Jan M.; Hwa, Vivian

2015-01-01

Context: Severe short stature can be caused by defects in numerous biological processes including defects in IGF-1 signaling, centromere function, cell cycle control, and DNA damage repair. Many syndromic causes of short stature are associated with medical comorbidities including hypogonadism and microcephaly. Objective: To identify an underlying genetic etiology in two siblings with severe short stature and gonadal failure. Design: Clinical phenotyping, genetic analysis, complemented by in vitro functional studies of the candidate gene. Setting: An academic pediatric endocrinology clinic. Patients or Other Participants: Two adult siblings (male patient [P1] and female patient 2 [P2]) presented with a history of severe postnatal growth failure (adult heights: P1, −6.8 SD score; P2, −4 SD score), microcephaly, primary gonadal failure, and early-onset metabolic syndrome in late adolescence. In addition, P2 developed a malignant gastrointestinal stromal tumor at age 28. Intervention(s): Single nucleotide polymorphism microarray and exome sequencing. Results: Combined microarray analysis and whole exome sequencing of the two affected siblings and one unaffected sister identified a homozygous variant in XRCC4 as the probable candidate variant. Sanger sequencing and mRNA studies revealed a splice variant resulting in an in-frame deletion of 23 amino acids. Primary fibroblasts (P1) showed a DNA damage repair defect. Conclusions: In this study we have identified a novel pathogenic variant in XRCC4, a gene that plays a critical role in non-homologous end-joining DNA repair. This finding expands the spectrum of DNA damage repair syndromes to include XRCC4 deficiency causing severe postnatal growth failure, microcephaly, gonadal failure, metabolic syndrome, and possibly tumor predisposition. PMID:25742519
Company profile: Complete Genomics Inc.

PubMed

Reid, Clifford

2011-02-01

Complete Genomics Inc. is a life sciences company that focuses on complete human genome sequencing. It is taking a completely different approach to DNA sequencing than other companies in the industry. Rather than building a general-purpose platform for sequencing all organisms and all applications, it has focused on a single application - complete human genome sequencing. The company's Complete Genomics Analysis Platform (CGA™ Platform) comprises an integrated package of biochemistry, instrumentation and software that sequences human genomes at the highest quality, lowest cost and largest scale available. Complete Genomics offers a turnkey service that enables customers to outsource their human genome sequencing to the company's genome sequencing center in Mountain View, CA, USA. Customers send in their DNA samples, the company does all the library preparation, DNA sequencing, assembly and variant analysis, and customers receive research-ready data that they can use for biological discovery.
The NnCenH3 protein and centromeric DNA sequence profiles of Nelumbo nucifera Gaertn. (sacred lotus) reveal the DNA structures and dynamics of centromeres in basal eudicots.

PubMed

Zhu, Zhixuan; Gui, Songtao; Jin, Jing; Yi, Rong; Wu, Zhihua; Qian, Qian; Ding, Yi

2016-09-01

Centromeres on eukaryotic chromosomes consist of large arrays of DNA repeats that undergo very rapid evolution. Nelumbo nucifera Gaertn. (sacred lotus) is a phylogenetic relict and an aquatic perennial basal eudicot. Studies concerning the centromeres of this basal eudicot species could provide ancient evolutionary perspectives. In this study, we characterized the centromeric marker protein NnCenH3 (sacred lotus centromere-specific histone H3 variant), and used a chromatin immunoprecipitation (ChIP)-based technique to recover the NnCenH3 nucleosome-associated sequences of sacred lotus. The properties of the centromere-binding protein and DNA sequences revealed notable divergence between sacred lotus and other flowering plants, including the following factors: (i) an NnCenH3 alternative splicing variant comprising only a partial centromere-targeting domain, (ii) active genes with low transcription levels in the NnCenH3 nucleosomal regions, and (iii) the prevalence of the Ty1/copia class of long terminal repeat (LTR) retrotransposons in the centromeres of sacred lotus chromosomes. In addition, the dynamic natures of the centromeric region showed that some of the centromeric repeat DNA sequences originated from telomeric repeats, and a pair of centromeres on the dicentric chromosome 1 was inactive in the metaphase cells of sacred lotus. Our characterization of the properties of centromeric DNA structure within the sacred lotus genome describes a centromeric profile in ancient basal eudicots and might provide evidence of the origins and evolution of centromeres. Furthermore, the identification of centromeric DNA sequences is of great significance for the assembly of the sacred lotus genome. © 2016 The Authors The Plant Journal © 2016 John Wiley & Sons Ltd.
Developmental validation of a Nextera XT mitogenome Illumina MiSeq sequencing method for high-quality samples.

PubMed

Peck, Michelle A; Sturk-Andreaggi, Kimberly; Thomas, Jacqueline T; Oliver, Robert S; Barritt-Ross, Suzanne; Marshall, Charla

2018-05-01

Generating mitochondrial genome (mitogenome) data from reference samples in a rapid and efficient manner is critical to harnessing the greater power of discrimination of the entire mitochondrial DNA (mtDNA) marker. The method of long-range target enrichment, Nextera XT library preparation, and Illumina sequencing on the MiSeq is a well-established technique for generating mitogenome data from high-quality samples. To this end, a validation was conducted for this mitogenome method processing up to 24 samples simultaneously along with analysis in the CLC Genomics Workbench and utilizing the AQME (AFDIL-QIAGEN mtDNA Expert) tool to generate forensic profiles. This validation followed the Federal Bureau of Investigation's Quality Assurance Standards (QAS) for forensic DNA testing laboratories and the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines. The evaluation of control DNA, non-probative samples, blank controls, mixtures, and nonhuman samples demonstrated the validity of this method. Specifically, the sensitivity was established at ≥25 pg of nuclear DNA input for accurate mitogenome profile generation. Unreproducible low-level variants were observed in samples with low amplicon yields. Further, variant quality was shown to be a useful metric for identifying sequencing error and crosstalk. Success of this method was demonstrated with a variety of reference sample substrates and extract types. These studies further demonstrate the advantages of using NGS techniques by highlighting the quantitative nature of heteroplasmy detection. The results presented herein from more than 175 samples processed in ten sequencing runs, show this mitogenome sequencing method and analysis strategy to be valid for the generation of reference data. Copyright © 2018 Elsevier B.V. All rights reserved.
Method of generating polynucleotides encoding enhanced folding variants

DOEpatents

Bradbury, Andrew M.; Kiss, Csaba; Waldo, Geoffrey S.

2017-05-02

The invention provides directed evolution methods for improving the folding, solubility and stability (including thermostability) characteristics of polypeptides. In one aspect, the invention provides a method for generating folding and stability-enhanced variants of proteins, including but not limited to fluorescent proteins, chromophoric proteins and enzymes. In another aspect, the invention provides methods for generating thermostable variants of a target protein or polypeptide via an internal destabilization baiting strategy. Internal destabilization of a protein of interest is achieved by inserting a heterologous, folding-destabilizing sequence (folding interference domain) within DNA encoding the protein of interest, then evolving the protein sequences adjacent to the heterologous insertion to overcome the destabilization (using any number of mutagenesis methods), thereby creating a library of variants. The variants in the library are expressed, and those with enhanced folding characteristics are selected.
Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

PubMed Central

2012-01-01

Background: The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results: In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P < 0.01) mean allele frequency differentials between the low and high fertility groups were observed for 720 SNPs (58 NSS). Allele frequencies for 43 of the SNPs were also determined by genotyping the 150 individual animals (Sequenom® MassARRAY). No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions: The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. 
Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving interval plausibly harbouring causative variants contributing to heritable variation. To our knowledge, this is the first report describing sequencing of targeted genomic regions in any livestock species using groups with divergent phenotypes for an economically important trait. PMID:22235840
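The variant counts reported in this abstract are internally consistent, which a short Python check makes explicit. The second half is only a sketch of the usual idea behind pooled allele frequency estimation (alternate-allele reads divided by total reads at a site); the function name `pooled_allele_frequency` and the read counts are hypothetical, and the study's actual pipeline is not described in the abstract.

```python
# Counts taken from the abstract.
snps, indels = 4135, 893
total_variants = snps + indels

by_region = {"UTR": 952, "intronic": 3612, "exonic": 464}
percent = {
    region: round(100 * count / total_variants)
    for region, count in by_region.items()
}

def pooled_allele_frequency(alt_reads, total_reads):
    """Estimate the allele frequency in a DNA pool as the fraction of
    reads carrying the alternate allele (assumes equal DNA contribution
    per animal and unbiased sequencing)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Hypothetical read counts at one SNP in the two fertility pools.
freq_low = pooled_allele_frequency(alt_reads=120, total_reads=400)
freq_high = pooled_allele_frequency(alt_reads=60, total_reads=400)
differential = freq_low - freq_high
```

The three region counts sum to the 5,028 variants implied by the SNP and indel totals, and the rounded percentages match the 19%, 72% and 9% quoted in the text.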
Widespread Site-Dependent Buffering of Human Regulatory Polymorphism

PubMed Central

Kutyavin, Tanya; Stamatoyannopoulos, John A.

2012-01-01

The average individual is expected to harbor thousands of variants within non-coding genomic regions involved in gene regulation. However, it is currently not possible to interpret reliably the functional consequences of genetic variation within any given transcription factor recognition sequence. To address this, we comprehensively analyzed heritable genome-wide binding patterns of a major sequence-specific regulator (CTCF) in relation to genetic variability in binding site sequences across a multi-generational pedigree. We localized and quantified CTCF occupancy by ChIP-seq in 12 related and unrelated individuals spanning three generations, followed by comprehensive targeted resequencing of the entire CTCF–binding landscape across all individuals. We identified hundreds of variants with reproducible quantitative effects on CTCF occupancy (both positive and negative). While these effects paralleled protein–DNA recognition energetics when averaged, they were extensively buffered by striking local context dependencies. In the significant majority of cases buffering was complete, resulting in silent variants spanning every position within the DNA recognition interface irrespective of level of binding energy or evolutionary constraint. The prevalence of complex partial or complete buffering effects severely constrained the ability to predict reliably the impact of variation within any given binding site instance. Surprisingly, 40% of variants that increased CTCF occupancy occurred at positions of human–chimp divergence, challenging the expectation that the vast majority of functional regulatory variants should be deleterious. Our results suggest that, even in the presence of “perfect” genetic information afforded by resequencing and parallel studies in multiple related individuals, genomic site-specific prediction of the consequences of individual variation in regulatory DNA will require systematic coupling with empirical functional genomic measurements. PMID:22457641
How proteins bind to DNA: target discrimination and dynamic sequence search by the telomeric protein TRF1

PubMed Central

2017-01-01

Target search as performed by DNA-binding proteins is a complex process, in which multiple factors contribute to both thermodynamic discrimination of the target sequence from overwhelmingly abundant off-target sites and kinetic acceleration of dynamic sequence interrogation. TRF1, the protein that binds to telomeric tandem repeats, faces an intriguing variant of the search problem where target sites are clustered within short fragments of chromosomal DNA. In this study, we use extensive (>0.5 ms in total) MD simulations to study the dynamical aspects of sequence-specific binding of TRF1 at both telomeric and non-cognate DNA. For the first time, we describe the spontaneous formation of a sequence-specific native protein–DNA complex in atomistic detail, and study the mechanism by which proteins avoid off-target binding while retaining high affinity for target sites. Our calculated free energy landscapes reproduce the thermodynamics of sequence-specific binding, while statistical approaches allow for a comprehensive description of intermediate stages of complex formation. PMID:28633355
CRAWview: for viewing splicing variation, gene families, and polymorphism in clusters of ESTs and full-length sequences.

PubMed

Chou, A; Burke, J

1999-05-01

DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect, and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Uncovering the Ancestry of B Chromosomes in Moenkhausia sanctaefilomenae (Teleostei, Characidae)

PubMed Central

Utsunomia, Ricardo; Silva, Duílio Mazzoni Zerbinato de Andrade; Ruiz-Ruano, Francisco J.; Araya-Jaime, Cristian; Pansonato-Alves, José Carlos; Scacchetti, Priscilla Cardim; Hashimoto, Diogo Teruo; Oliveira, Claudio; Trifonov, Vladmir A.; Porto-Foresti, Fábio; Camacho, Juan Pedro M.; Foresti, Fausto

2016-01-01

B chromosomes constitute a heterogeneous mixture of genomic parasites that are sometimes derived intraspecifically from the standard genome of the host species, but result from interspecific hybridization in other cases. The mode of origin determines the DNA content, with the B chromosomes showing high similarity with the A genome in the first case, but presenting higher similarity with a different species in the second. The characid fish Moenkhausia sanctaefilomenae harbours highly invasive B chromosomes, which are present in all populations analyzed to date in the Parana and Tietê rivers. To investigate the origin of these B chromosomes, we analyzed two natural populations: one carrying B chromosomes and the other lacking them, using a combination of molecular cytogenetic techniques, nucleotide sequence analysis and high-throughput sequencing (Illumina HiSeq2000). Our results showed that i) B chromosomes have not yet reached the Paranapanema River basin; ii) B chromosomes are mitotically unstable; iii) there are two types of B chromosomes, the most frequent of which is lightly C-banded (similar to euchromatin in A chromosomes) (B1), while the other is darkly C-banded (heterochromatin-like) (B2); iv) the two B types contain the same tandem repeat DNA sequences (18S ribosomal DNA, H3 histone genes, MS3 and MS7 satellite DNA), with a higher content of 18S rDNA in the heterochromatic variant; v) all of these repetitive DNAs are present together only in the paracentromeric region of autosome pair no. 6, suggesting that the B chromosomes are derived from this A chromosome; vi) the two B chromosome variants show MS3 sequences that are highly divergent from each other and from the 0B genome, although the B2-derived sequences exhibit higher similarity with the 0B genome (this suggests an independent origin of the two B variants, with the less frequent, B2 type presumably being younger); and vii) the dN/dS ratio for the H3.2 histone gene is almost 4–6 times higher for B chromosomes than for A chromosome sequences, suggesting that purifying selection is relaxed for the DNA sequences located on the B chromosomes, presumably because they are mostly inactive. PMID:26934481
Sequence polymorphism data of the hypervariable regions of mitochondrial DNA in the Yadav population of Haryana.

PubMed

Verma, Kapil; Sharma, Sapna; Sharma, Arun; Dalal, Jyoti; Bhardwaj, Tapeshwar

2018-06-01

Genetic variations among humans occur both within and among populations and range from single nucleotide changes to multiple-nucleotide variants. These multiple-nucleotide variants are useful for studying the relationships among individuals or various population groups. The study of human genetic variations can help scientists understand how different population groups are biologically related to one another. Sequence analysis of hypervariable regions of human mitochondrial DNA (mtDNA) has been successfully used for the genetic characterization of different population groups for forensic purposes. It is well established that different ethnic or population groups differ significantly in their mtDNA distributions. In the last decade, very little research has been conducted on mtDNA variations in the Indian population, although such data would be useful for elucidating the history of human population expansion across the world. Moreover, forensic studies on mtDNA variations in the Indian subcontinent are also scarce, particularly in the northern part of India. In this report, variations in the hypervariable regions of mtDNA were analyzed in the Yadav population of Haryana. Different molecular diversity indices were computed. Further, the obtained haplotypes were classified into different haplogroups and the phylogenetic relationship between different haplogroups was inferred.
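The abstract does not say which molecular diversity indices were computed. As one common example only, haplotype (gene) diversity in Nei's form is H = n/(n − 1) × (1 − Σ pᵢ²), where n is the sample size and pᵢ the frequency of haplotype i. The Python sketch below applies it to hypothetical haplotype labels, not to data from the study.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels:
    H = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical sample: four individuals, three distinct mtDNA haplotypes.
sample = ["H1", "H1", "H2", "H3"]
h = haplotype_diversity(sample)
```

H is 0 when every individual shares one haplotype and 1 when every sampled haplotype is different, which is why it is often quoted alongside the power of discrimination in forensic mtDNA datasets.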
Skipping of Exons by Premature Termination of Transcription and Alternative Splicing within Intron-5 of the Sheep SCF Gene: A Novel Splice Variant

PubMed Central

Saravanaperumal, Siva Arumugam; Pediconi, Dario; Renieri, Carlo; La Terza, Antonietta

2012-01-01

Stem cell factor (SCF) is a growth factor, essential for haemopoiesis, mast cell development and melanogenesis. In the hematopoietic microenvironment (HM), SCF is produced either as a membrane-bound (−) or a soluble (+) form. Skin expression of SCF stimulates melanocyte migration, proliferation, differentiation, and survival. We report for the first time a novel mRNA splice variant of SCF from the skin of white merino sheep via cloning and sequencing. Reverse transcriptase (RT)-PCR and molecular prediction revealed two different cDNA products of SCF. Full-length cDNA libraries were enriched by the method of rapid amplification of cDNA ends (RACE-PCR). Nucleotide sequencing and molecular prediction revealed that the primary 1519 base pair (bp) cDNA encodes a precursor protein of 274 amino acids (aa), commonly known as the ‘soluble’ isoform. In contrast, the shorter (835 and/or 725 bp) cDNA was found to be a ‘novel’ mRNA splice variant. It contains an open reading frame (ORF) corresponding to a truncated protein of 181 aa (vs 245 aa) with a unique C-terminus lacking the primary proteolytic segment (28 aa) right after the D175G site, which is necessary to produce the ‘soluble’ form of SCF. This alternative splice (AS) variant was explained by the complete nucleotide sequencing of the splice junction covering exon 5-intron (5)-exon 6 (948 bp) with a premature termination codon (PTC) whereby exons 6 to 9/10 are skipped (Cassette Exon, CE 6–9/10). We also demonstrated that the Northern blot analysis at transcript level is mediated via an intron-5 splicing event. Our data refine the structure of the SCF gene and clarify the presence (+) and/or absence (−) of primary proteolytic-cleavage site specific SCF splice variants. This work provides a basis for understanding the functional role and regulation of SCF in hair follicle melanogenesis in sheep beyond what was known in mice, humans and other mammals. PMID:22719917
Novel sequence variants in the TMIE gene in families with autosomal recessive nonsyndromic hearing impairment

PubMed Central

Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim

2010-01-01

To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551
Isolation of a complementary DNA clone for the human complement protein C2 and its use in the identification of a restriction fragment length polymorphism.

PubMed Central

Woods, D E; Edge, M D; Colten, H R

1984-01-01

Complementary DNA (cDNA) clones corresponding to the major histocompatibility (MHC) class III antigen, complement protein C2, have been isolated from human liver cDNA libraries with the use of a complex mixture of synthetic oligonucleotides (17 mer) that contains 576 different oligonucleotide sequences. The C2 cDNA clones were used to identify a DNA restriction enzyme fragment length polymorphism that provides a genetic marker within the MHC that was not detectable at the protein level. An extensive search for genomic polymorphisms using a cDNA clone for another MHC class III gene, factor B, failed to reveal any DNA variants. The genomic variants detected with the C2 cDNA probe provide an additional genetic marker for analysis of MHC-linked diseases. PMID:6086718
Using sheep genomes from diverse U.S. breeds to identify missense variants in genes affecting fecundity

USDA-ARS's Scientific Manuscript database

Background: Access to sheep genome sequences significantly improves the chances of identifying genes that may influence the health, welfare, and productivity of these animals. Methods: A public, searchable DNA sequence resource for U.S. sheep was created with whole genome sequence (WGS) of 96 rams. ...
DNA sequence of the lymphotropic variant of minute virus of mice, MVM(i), and comparison with the DNA sequence of the fibrotropic prototype strain.

PubMed

Astell, C R; Gardiner, E M; Tattersall, P

1986-02-01

The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acid changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct repeat in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
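The transition/transversion tally reported above (129 and 34 out of 163 base changes) can be illustrated with a minimal Python sketch. It assumes two already aligned, equal-length sequences; the function names and the toy sequences are hypothetical and are not taken from the MVM(i)/MVM(p) comparison.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' (purine<->purine or pyrimidine<->pyrimidine) or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
        raise ValueError("expected single unambiguous DNA bases")
    if ref == alt:
        raise ValueError("ref and alt are identical")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_changes(seq_a: str, seq_b: str) -> dict:
    """Count transitions and transversions between two aligned sequences.

    Positions holding anything other than A, C, G or T (gaps, N) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b and a in "ACGT" and b in "ACGT":
            counts[classify_substitution(a, b)] += 1
    return counts


# Toy example (hypothetical sequences): A>G and C>T are transitions, G>T is a transversion.
example = count_changes("ACGTAC", "GCTTAT")
```

Applied to a real alignment, the two counts should sum to the total number of substituted positions, as 129 + 34 = 163 does in the abstract.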
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification.

PubMed

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc'h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies.
Deep sequencing is an appropriate tool for the selection of unique Hepatitis C virus (HCV) variants after single genomic amplification

PubMed Central

Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc’h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine

2017-01-01

Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate their functional properties in vitro, it is necessary to define their quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. The detection of non-clonal amplicons is necessary for excluding them to facilitate further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies. PMID:28362878
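The idea of screening an amplicon for position-specific nucleotide variation can be sketched in Python as follows. This is an illustration only, not the authors' pipeline: the input format (one dictionary of base counts per position), the function names, and the `ngs_cutoff` value are assumptions. Only the 15% figure comes from the abstract, where it marks the level below which no double peaks were seen on Sanger electropherograms.

```python
def minor_fraction(counts: dict) -> float:
    """Fraction of reads at one position that do not carry the most frequent base."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("position has no coverage")
    return 1.0 - max(counts.values()) / total


def classify_amplicon(pileup: list, ngs_cutoff: float = 0.02,
                      sanger_limit: float = 0.15) -> dict:
    """Flag an amplicon as clonal or non-clonal from per-position base counts.

    ngs_cutoff is a hypothetical noise threshold; sanger_limit is the 15%
    variation level quoted in the abstract.
    """
    mixed = []
    for index, counts in enumerate(pileup):
        fraction = minor_fraction(counts)
        if fraction >= ngs_cutoff:
            mixed.append((index, fraction))
    return {
        "clonal": not mixed,
        "mixed_positions": [i for i, _ in mixed],
        # Mixed positions that Sanger sequencing would probably not reveal.
        "below_sanger_limit": [i for i, f in mixed if f < sanger_limit],
    }


# Toy pileup (hypothetical): position 1 has 10% variation, position 2 has 50%.
report = classify_amplicon([{"A": 100}, {"A": 90, "G": 10}, {"C": 50, "T": 50}])
```

In this toy case the amplicon is non-clonal at two positions, and one of them falls under the 15% level, which is the kind of site where deep sequencing adds information over Sanger reads.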

Allelic variants of hereditary prions: The bimodularity principle.

PubMed

Tikhodeyev, Oleg N; Tarasov, Oleg V; Bondarev, Stanislav A

2017-01-02

Modern biology requires modern genetic concepts equally valid for all discovered mechanisms of inheritance, either "canonical" (mediated by DNA sequences) or epigenetic. Applying basic genetic terms such as "gene" and "allele" to protein hereditary factors is one of the necessary steps toward these concepts. The basic idea that different variants of the same prion protein can be considered as alleles has been previously proposed by Chernoff and Tuite. In this paper, the notion of prion allele is further developed. We propose the idea that any prion allele is a bimodular hereditary system that depends on a certain DNA sequence (DNA determinant) and a certain epigenetic mark (epigenetic determinant). Alteration of any of these 2 determinants may lead to establishment of a new prion allele. The bimodularity principle is valid not only for hereditary prions; it seems to be universal for any epigenetic hereditary factor.
HPV Genotyping of Modified General Primer-Amplicons Is More Analytically Sensitive and Specific by Sequencing than by Hybridization

PubMed Central

Meisal, Roger; Rounge, Trine Ballestad; Christiansen, Irene Kraus; Eieland, Alexander Kirkeby; Worren, Merete Molton; Molden, Tor Faksvaag; Kommedal, Øyvind; Hovig, Eivind; Leegaard, Truls Michael

2017-01-01

Sensitive and specific genotyping of human papillomaviruses (HPVs) is important for population-based surveillance of carcinogenic HPV types and for monitoring vaccine effectiveness. Here we compare HPV genotyping by Next Generation Sequencing (NGS) to an established DNA hybridization method. In DNA isolated from urine, the overall analytical sensitivity of NGS was found to be 22% higher than that of hybridization. NGS was also found to be the most specific method and expanded the detection repertoire beyond the 37 types of the DNA hybridization assay. Furthermore, NGS provided an increased resolution by identifying genetic variants of individual HPV types. The same Modified General Primers (MGP)-amplicon was used in both methods. The NGS method is described in detail to facilitate implementation in the clinical microbiology laboratory and includes suggestions for new standards for detection and calling of types and variants with improved resolution. PMID:28045981
HPV Genotyping of Modified General Primer-Amplicons Is More Analytically Sensitive and Specific by Sequencing than by Hybridization.

PubMed

Meisal, Roger; Rounge, Trine Ballestad; Christiansen, Irene Kraus; Eieland, Alexander Kirkeby; Worren, Merete Molton; Molden, Tor Faksvaag; Kommedal, Øyvind; Hovig, Eivind; Leegaard, Truls Michael; Ambur, Ole Herman

2017-01-01

Sensitive and specific genotyping of human papillomaviruses (HPVs) is important for population-based surveillance of carcinogenic HPV types and for monitoring vaccine effectiveness. Here we compare HPV genotyping by Next Generation Sequencing (NGS) to an established DNA hybridization method. In DNA isolated from urine, the overall analytical sensitivity of NGS was found to be 22% higher than that of hybridization. NGS was also found to be the most specific method and expanded the detection repertoire beyond the 37 types of the DNA hybridization assay. Furthermore, NGS provided an increased resolution by identifying genetic variants of individual HPV types. The same Modified General Primers (MGP)-amplicon was used in both methods. The NGS method is described in detail to facilitate implementation in the clinical microbiology laboratory and includes suggestions for new standards for detection and calling of types and variants with improved resolution.
Allelic variants of hereditary prions: The bimodularity principle

PubMed Central

Tikhodeyev, Oleg N.; Tarasov, Oleg V.; Bondarev, Stanislav A.

2017-01-01

Modern biology requires modern genetic concepts equally valid for all discovered mechanisms of inheritance, either “canonical” (mediated by DNA sequences) or epigenetic. Applying basic genetic terms such as “gene” and “allele” to protein hereditary factors is one of the necessary steps toward these concepts. The basic idea that different variants of the same prion protein can be considered as alleles has been previously proposed by Chernoff and Tuite. In this paper, the notion of prion allele is further developed. We propose the idea that any prion allele is a bimodular hereditary system that depends on a certain DNA sequence (DNA determinant) and a certain epigenetic mark (epigenetic determinant). Alteration of any of these 2 determinants may lead to establishment of a new prion allele. The bimodularity principle is valid not only for hereditary prions; it seems to be universal for any epigenetic hereditary factor. PMID:28281926
Pan-cancer analysis reveals technical artifacts in TCGA germline variant calls.

PubMed

Buckley, Alexandra R; Standish, Kristopher A; Bhutani, Kunal; Ideker, Trey; Lasken, Roger S; Carter, Hannah; Harismendy, Olivier; Schork, Nicholas J

2017-06-12

Cancer research to date has largely focused on somatically acquired genetic aberrations. In contrast, the degree to which germline, or inherited, variation contributes to tumorigenesis remains unclear, possibly due to a lack of accessible germline variant data. Here we called germline variants on 9618 cases from The Cancer Genome Atlas (TCGA) database representing 31 cancer types. We identified batch effects affecting loss of function (LOF) variant calls that can be traced back to differences in the way the sequence data were generated both within and across cancer types. Overall, LOF indel calls were more sensitive to technical artifacts than LOF Single Nucleotide Variant (SNV) calls. In particular, whole genome amplification of DNA prior to sequencing led to an artificially increased burden of LOF indel calls, which confounded association analyses relating germline variants to tumor type despite stringent indel filtering strategies. The samples affected by these technical artifacts include all acute myeloid leukemia and practically all ovarian cancer samples. We demonstrate how technical artifacts induced by whole genome amplification of DNA can lead to false positive germline-tumor type associations and suggest TCGA whole genome amplified samples be used with caution. This study draws attention to the need to be sensitive to problems associated with a lack of uniformity in data generation in TCGA data.
ERASE-Seq: Leveraging replicate measurements to enhance ultralow frequency variant detection in NGS data

PubMed Central

Kamps-Hughes, Nick; McUsic, Andrew; Kurihara, Laurie; Harkins, Timothy T.; Pal, Prithwish; Ray, Claire

2018-01-01

The accurate detection of ultralow allele frequency variants in DNA samples is of interest in both research and medical settings, particularly in liquid biopsies where cancer mutational status is monitored from circulating DNA. Next-generation sequencing (NGS) technologies employing molecular barcoding have shown promise but significant sensitivity and specificity improvements are still needed to detect mutations in a majority of patients before the metastatic stage. To address this we present analytical validation data for ERASE-Seq (Elimination of Recurrent Artifacts and Stochastic Errors), a method for accurate and sensitive detection of ultralow frequency DNA variants in NGS data. ERASE-Seq differs from previous methods by creating a robust statistical framework to utilize technical replicates in conjunction with background error modeling, providing a 10 to 100-fold reduction in false positive rates compared to published molecular barcoding methods. ERASE-Seq was tested using spiked human DNA mixtures with clinically realistic DNA input quantities to detect SNVs and indels between 0.05% and 1% allele frequency, the range commonly found in liquid biopsy samples. Variants were detected with greater than 90% sensitivity and a false positive rate below 0.1 calls per 10,000 possible variants. The approach represents a significant performance improvement compared to molecular barcoding methods and does not require changing molecular reagents. PMID:29630678
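As a rough illustration of combining technical replicates with a background error rate, the Python sketch below calls a variant only when every replicate shows more alternate reads than the background would plausibly produce. This is not the ERASE-Seq statistical framework, which the abstract does not specify. The binomial model, the `background_rate` and `alpha` values, and the function names are all assumptions made for the example.

```python
import math


def binom_sf(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def call_variant(replicates: list, background_rate: float, alpha: float = 1e-3) -> bool:
    """Call a variant only if every replicate exceeds the background error model.

    replicates: list of (alt_reads, depth) tuples, one per technical replicate.
    background_rate: assumed per-read error rate at this position (hypothetical).
    """
    return all(binom_sf(alt, depth, background_rate) < alpha
               for alt, depth in replicates)


# Hypothetical counts: about 0.12-0.15% allele frequency at 10,000x depth.
supported = call_variant([(12, 10000), (15, 10000)], background_rate=1e-4)
# A signal seen in only one replicate is rejected as a likely stochastic error.
unsupported = call_variant([(12, 10000), (1, 10000)], background_rate=1e-4)
```

Requiring agreement across replicates is one simple way to suppress stochastic errors; recurrent artifacts would additionally need a position-specific background estimate, which is outside this sketch.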
Novel approach to genetic analysis and results in 3000 hemophilia patients enrolled in the My Life, Our Future initiative

PubMed Central

Johnsen, Jill M.; Fletcher, Shelley N.; Huston, Haley; Roberge, Sarah; Martin, Beth K.; Kircher, Martin; Josephson, Neil C.; Shendure, Jay; Ruuska, Sarah; Koerper, Marion A.; Morales, Jaime; Pierce, Glenn F.; Aschman, Diane J.

2017-01-01

Hemophilia A and B are rare, X-linked bleeding disorders. My Life, Our Future (MLOF) is a collaborative project established to genotype and study hemophilia. Patients were enrolled at US hemophilia treatment centers (HTCs). Genotyping was performed centrally using next-generation sequencing (NGS) with an approach that detected common F8 gene inversions simultaneously with F8 and F9 gene sequencing followed by confirmation using standard genotyping methods. Sixty-nine HTCs enrolled the first 3000 patients in under 3 years. Clinically reportable DNA variants were detected in 98.1% (2357/2401) of hemophilia A and 99.3% (595/599) of hemophilia B patients. Of the 924 unique variants found, 285 were novel. Predicted gene-disrupting variants were common in severe disease; missense variants predominated in mild–moderate disease. Novel DNA variants accounted for ∼30% of variants found and were detected continuously throughout the project, indicating that additional variation likely remains undiscovered. The NGS approach detected >1 reportable variants in 36 patients (10 females), a finding with potential clinical implications. NGS also detected incidental variants unlikely to cause disease, including 11 variants previously reported in hemophilia. Although these genes are thought to be conserved, our findings support caution in interpretation of new variants. In summary, MLOF has contributed significantly toward variant annotation in the F8 and F9 genes. In the near future, investigators will be able to access MLOF data and repository samples for research to advance our understanding of hemophilia. PMID:29296726
A comprehensive characterization of rare mitochondrial DNA variants in neuroblastoma

PubMed Central

Pignataro, Piero; Lasorsa, Vito Alessandro; Hogarty, Michael D.; Castellano, Aurora; Conte, Massimo; Tonini, Gian Paolo; Iolascon, Achille; Gasparre, Giuseppe; Capasso, Mario

2016-01-01

Background: Neuroblastoma, a tumor of the developing sympathetic nervous system, is a common childhood neoplasm that is often lethal. Mitochondrial DNA (mtDNA) mutations have been found in most tumors including neuroblastoma. We extracted mtDNA data from a cohort of neuroblastoma samples that had undergone Whole Exome Sequencing (WES) and also used snap-frozen samples in which mtDNA was entirely sequenced by Sanger technology. We next undertook the challenge of determining those mutations that are relevant to, or have arisen during, tumor development. The bioinformatics pipeline used to extract mitochondrial variants from matched tumor/blood samples was enriched by a set of filters inclusive of heteroplasmic fraction, nucleotide variability, and in silico prediction of pathogenicity. Results: Our in silico multistep workflow, applied to both WES and Sanger-sequenced neuroblastoma samples, allowed us to identify a limited burden of somatic and germline mitochondrial mutations with a potential pathogenic impact. Conclusions: The few singleton germline and somatic mitochondrial mutations that emerged do not, according to our in silico analysis, appear to affect the development of neuroblastoma. Our findings are consistent with the hypothesis that most mitochondrial somatic mutations can be considered as ‘passengers’ and consequently have no discernible effect in this type of cancer. PMID:27351283
An Analysis of the Sensitivity of Proteogenomic Mapping of Somatic Mutations and Novel Splicing Events in Cancer.

PubMed

Ruggles, Kelly V; Tang, Zuojian; Wang, Xuya; Grover, Himanshu; Askenazi, Manor; Teubl, Jennifer; Cao, Song; McLellan, Michael D; Clauser, Karl R; Tabb, David L; Mertins, Philipp; Slebos, Robbert; Erdmann-Gilmore, Petra; Li, Shunqiang; Gunawardena, Harsha P; Xie, Ling; Liu, Tao; Zhou, Jian-Ying; Sun, Shisheng; Hoadley, Katherine A; Perou, Charles M; Chen, Xian; Davies, Sherri R; Maher, Christopher A; Kinsinger, Christopher R; Rodland, Karen D; Zhang, Hui; Zhang, Zhen; Ding, Li; Townsend, R Reid; Rodriguez, Henry; Chan, Daniel; Smith, Richard D; Liebler, Daniel C; Carr, Steven A; Payne, Samuel; Ellis, Matthew J; Fenyő, David

2016-03-01

Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
A patient with congenital hyperlactataemia and Leigh syndrome: an uncommon mitochondrial variant.

PubMed

Ching, C K; Mak, Chloe M; Au, K M; Chan, K Y; Yuen, Y P; Yau, Eric K C; Ma, Louis C K; Chow, H L; Chan, Albert Y W

2013-08-01

We report an uncommon mitochondrial variant in a baby girl with congenital hyperlactataemia and Leigh syndrome. The patient presented with a single episode of generalised clonic convulsion at day 19, and was found to have isolated and persistent hyperlactataemia ranging from 3.34 to 9.26 mmol/L. She had elevated serum lactate-to-pyruvate ratios of up to 35 and high plasma alanine concentration, indicative of a respiratory chain defect. At the age of 8 months, she developed evolving neurological and imaging features compatible with Leigh syndrome. Genetic testing for common mitochondrial DNA mutations, large mitochondrial DNA deletions, and selected nuclear genes was negative. Further analysis of lymphocyte mitochondrial DNA by sequencing revealed an uncommon heteroplasmic variant, NC_012920.1(MT-ND5):m.13094T>C (p.Val253Ala), which was previously shown to reduce complex I activity. In patients in whom there was a high suspicion of mitochondrial disorder, entire mitochondrial DNA analysis may be warranted if initial screening of common mitochondrial DNA mutations is negative.
HGVS Recommendations for the Description of Sequence Variants: 2016 Update.

PubMed

den Dunnen, Johan T; Dalgleish, Raymond; Maglott, Donna R; Hart, Reece K; Greenblatt, Marc S; McGowan-Jordan, Jean; Roux, Anne-Francoise; Smith, Timothy; Antonarakis, Stylianos E; Taschner, Peter E M

2016-06-01

The consistent and unambiguous description of sequence variants is essential to report and exchange information on the analysis of a genome. In particular, DNA diagnostics critically depends on accurate and standardized description and sharing of the variants detected. The sequence variant nomenclature system proposed in 2000 by the Human Genome Variation Society has been widely adopted and has developed into an internationally accepted standard. The recommendations are currently commissioned through a Sequence Variant Description Working Group (SVD-WG) operating under the auspices of three international organizations: the Human Genome Variation Society (HGVS), the Human Variome Project (HVP), and the Human Genome Organization (HUGO). Requests for modifications and extensions go through the SVD-WG following a standard procedure including a community consultation step. Version numbers are assigned to the nomenclature system to allow users to specify the version used in their variant descriptions. Here, we present the current recommendations, HGVS version 15.11, and briefly summarize the changes that were made since the 2000 publication. Most focus has been on removing inconsistencies and tightening definitions allowing automatic data processing. An extensive version of the recommendations is available online, at http://www.HGVS.org/varnomen. © 2016 WILEY PERIODICALS, INC.
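Because the recommendations aim at descriptions that allow automatic data processing, a small Python sketch of parsing one common form, a coding DNA substitution such as `c.241C>T`, may help. It covers only simple substitutions with an optional intronic offset (for example `c.212-2A>C`), written with an ASCII hyphen. The regular expression, the `CodingSubstitution` class, and the function name are hypothetical and fall far short of the full nomenclature, which also includes deletions, duplications, insertions, UTR positions, and reference-sequence prefixes.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches e.g. "c.241C>T" or "c.212-2A>C"; anything else is rejected.
_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-]\d+)?([ACGT])>([ACGT])$")


@dataclass(frozen=True)
class CodingSubstitution:
    position: int  # position in the coding DNA reference sequence
    offset: int    # intronic offset relative to that position (0 if exonic)
    ref: str       # reference nucleotide
    alt: str       # variant nucleotide


def parse_coding_substitution(description: str) -> Optional[CodingSubstitution]:
    """Parse a simple coding DNA substitution; return None if the form is unsupported."""
    match = _SUBSTITUTION.match(description.strip())
    if match is None:
        return None
    position, offset, ref, alt = match.groups()
    return CodingSubstitution(int(position), int(offset) if offset else 0, ref, alt)


exonic = parse_coding_substitution("c.241C>T")
intronic = parse_coding_substitution("c.212-2A>C")
unsupported = parse_coding_substitution("c.241delC")
```

For production use, a maintained parser that tracks the versioned recommendations would be preferable to a hand-written pattern like this one.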
Rare variants in RTEL1 are associated with familial interstitial pneumonia.

PubMed

Cogan, Joy D; Kropski, Jonathan A; Zhao, Min; Mitchell, Daphne B; Rives, Lynette; Markin, Cheryl; Garnett, Errine T; Montgomery, Keri H; Mason, Wendi R; McKean, David F; Powers, Julia; Murphy, Elissa; Olson, Lana M; Choi, Leena; Cheng, Dong-Sheng; Blue, Elizabeth Marchani; Young, Lisa R; Lancaster, Lisa H; Steele, Mark P; Brown, Kevin K; Schwarz, Marvin I; Fingerlin, Tasha E; Schwartz, David A; Lawson, William E; Loyd, James E; Zhao, Zhongming; Phillips, John A; Blackwell, Timothy S

2015-03-15

Up to 20% of cases of idiopathic interstitial pneumonia cluster in families, comprising the syndrome of familial interstitial pneumonia (FIP); however, the genetic basis of FIP remains uncertain in most families. The objective was to determine whether new disease-causing rare genetic variants could be identified using whole-exome sequencing of affected members from FIP families, providing additional insights into disease pathogenesis. Affected subjects from 25 kindreds were selected from an ongoing FIP registry for whole-exome sequencing from genomic DNA. Candidate rare variants were confirmed by Sanger sequencing, and cosegregation analysis was performed in families, followed by additional sequencing of affected individuals from another 163 kindreds. We identified a potentially damaging rare variant in the gene encoding regulator of telomere elongation helicase 1 (RTEL1) that segregated with disease and was associated with very short telomeres in peripheral blood mononuclear cells in 1 of 25 families in our original whole-exome sequencing cohort. Evaluation of affected individuals in 163 additional kindreds revealed another eight families (4.7%) with heterozygous rare variants in RTEL1 that segregated with clinical FIP. Probands and unaffected carriers of these rare variants had short telomeres (<10% for age) in peripheral blood mononuclear cells and increased T-circle formation, suggesting impaired RTEL1 function. Rare loss-of-function variants in RTEL1 represent a newly defined genetic predisposition for FIP, supporting the importance of telomere-related pathways in pulmonary fibrosis.
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma.

PubMed

Wrzeszczynski, Kazimierz O; Frank, Mayu O; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A; Moore Vogel, Julia L; Bruce, Jeffrey N; Lassman, Andrew B; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V; Zody, Michael C; Jobanputra, Vaidehi; Royyuru, Ajay K; Darnell, Robert B

2017-08-01

The objective was to analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. NCT02725684.
Characterization of mtDNA variation in a cohort of South African paediatric patients with mitochondrial disease.

PubMed

van der Walt, Elizna M; Smuts, Izelle; Taylor, Robert W; Elson, Joanna L; Turnbull, Douglass M; Louw, Roan; van der Westhuizen, Francois H

2012-06-01

Mitochondrial disease can be attributed to both mitochondrial and nuclear gene mutations. It has a heterogeneous clinical and biochemical profile, which is compounded by the diversity of the genetic background. Disease-based epidemiological information has expanded significantly in recent decades, but little information is known that clarifies the aetiology in African patients. The aim of this study was to investigate mitochondrial DNA variation and pathogenic mutations in the muscle of diagnosed paediatric patients from South Africa. A cohort of 71 South African paediatric patients was included and a high-throughput nucleotide sequencing approach was used to sequence full-length muscle mtDNA. The average coverage of the mtDNA genome was 81±26 per position. After assigning haplogroups, it was determined that although the nature of non-haplogroup-defining variants was similar in African and non-African haplogroup patients, the number of substitutions was significantly higher in African patients. We describe previously reported disease-associated and novel variants in this cohort. We observed a general lack of commonly reported syndrome-associated mutations, which supports clinical observations and confirms general observations in African patients when using single mutation screening strategies based on (predominantly non-African) mtDNA disease-based information. It is finally concluded that this first extensive report on muscle mtDNA sequences in African paediatric patients highlights the need for a full-length mtDNA sequencing strategy, which applies to all populations where specific mutations are not present. This, in addition to nuclear DNA gene mutation and pathogenicity evaluations, will be required to better unravel the aetiology of these disorders in African patients.
Ribosomal DNA Organization Before and After Magnification in Drosophila melanogaster

PubMed Central

Bianciardi, Alessio; Boschi, Manuela; Swanson, Ellen E.; Belloni, Massimo; Robbins, Leonard G.

2012-01-01

In all eukaryotes, the ribosomal RNA genes are stably inherited redundant elements. In Drosophila melanogaster, the presence of a Ybb− chromosome in males, or the maternal presence of the Ribosomal exchange (Rex) element, induces magnification: a heritable increase of rDNA copy number. To date, several alternative classes of mechanisms have been proposed for magnification: in situ replication or extra-chromosomal replication, either of which might act on short or extended strings of rDNA units, or unequal sister chromatid exchange. To eliminate some of these hypotheses, none of which has been clearly proven, we examined molecular-variant composition and compared genetic maps of the rDNA in the bb2 mutant and in some magnified bb+ alleles. The genetic markers used are molecular-length variants of IGS sequences and of R1 and R2 mobile elements present in many 28S sequences. Direct comparison of PCR products does not reveal any particularly intensified electrophoretic bands in magnified alleles compared to the nonmagnified bb2 allele. Hence, the increase of rDNA copy number is diluted among multiple variants. We can therefore reject mechanisms of magnification based on multiple rounds of replication of short strings. Moreover, we find no changes of marker order when pre- and postmagnification maps are compared. Thus, we can further restrict the possible mechanisms to two: replication in situ of an extended string of rDNA units or unequal exchange between sister chromatids. PMID:22505623
Methods for decoding Cas9 protospacer adjacent motif (PAM) sequences: A brief overview.

PubMed

Karvelis, Tautvydas; Gasiunas, Giedrius; Siksnys, Virginijus

2017-05-15

Recently, Cas9, an RNA-guided DNA endonuclease, has emerged as a powerful tool for targeted genome manipulation. The Cas9 protein can be reprogrammed to cleave, bind or nick any DNA target simply by changing the crRNA sequence; however, a short nucleotide sequence, termed the PAM, is required to initiate crRNA hybridization to the DNA target. The PAM sequence is recognized by the Cas9 protein and must be determined experimentally for each Cas9 variant. Exploration of Cas9 orthologs could offer a diversity of PAM sequences and novel biochemical properties that may be beneficial for genome editing applications. Here we briefly review and compare Cas9 PAM identification assays that can be adopted for other PAM-dependent CRISPR-Cas systems. Copyright © 2017 Elsevier Inc. All rights reserved.
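The PAM requirement described in this abstract can be illustrated with a minimal sketch. It is not taken from the reviewed assays; it assumes the well-known NGG PAM of Streptococcus pyogenes Cas9, a 20-nt protospacer, and a forward-strand-only scan. The function name `find_ngg_targets` and the example sequence are hypothetical.

```python
import re

def find_ngg_targets(seq: str, protospacer_len: int = 20):
    """Return (start, protospacer, pam) for every forward-strand site
    where a protospacer is immediately followed by an NGG PAM.

    Assumption: NGG is the PAM of S. pyogenes Cas9; other Cas9
    variants need a different pattern.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{protospacer_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

# Hypothetical example sequence: a 20-nt protospacer followed by "AGG".
example = "ACGTACGTACGTACGTACGTAGGTT"
for start, protospacer, pam in find_ngg_targets(example):
    print(start, protospacer, pam)
```

Scanning the reverse strand would require running the same function on the reverse complement, which is omitted here for brevity.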
A thermostable variant of fructose bisphosphate aldolase constructed by directed evolution also shows increased stability in organic solvents.

PubMed

Hao, Jijun; Berry, Alan

2004-09-01

Thermostable variants of the Class II fructose bisphosphate aldolase have been isolated following four rounds of directed evolution using DNA shuffling of the fda genes from Escherichia coli and Edwardsiella ictaluri. Variants from all four generations of evolution have been purified and characterized. The variants show increased thermostability with no loss of catalytic function at room temperature. The temperature at which 50% of the initial enzyme activity is lost after incubation for 10 min (T50) of the most stable variant, 4-43D6, is increased by 11-12 degrees C over the wild-type enzymes and the half-life of activity at 53 degrees C is increased approximately 190-fold. In addition, variant 4-43D6 shows increased stability to treatment with organic solvents. DNA sequencing of the evolved variants has identified the mutations which have been introduced and which lead to increased thermostability, and the role of the mutations introduced is discussed.
Whole exome or genome sequencing: nurses need to prepare families for the possibilities.

PubMed

Prows, Cynthia A; Tran, Grace; Blosser, Beverly

2014-12-01

A discussion of whole exome sequencing and the type of possible results patients and families should be aware of before samples are obtained. To find the genetic cause of a rare disorder, whole exome sequencing analyses all known and suspected human genes from a single sample. Over 20,000 detected DNA variants in each individual exome must be considered as possibly causing disease or disregarded as not relevant to the person's disease. In the process, unexpected gene variants associated with known diseases unrelated to the primary purpose of the test may be incidentally discovered. Because family members' DNA samples are often needed, gene variants associated with known genetic diseases or predispositions for diseases can also be discovered in their samples. Discussion paper. PubMed 2009-2013, list of references in retrieved articles, Google Scholar. Nurses need a general understanding of the scope of potential genomic information that may be revealed with whole exome sequencing to provide support and guidance to individuals and families during their decision-making process, while waiting for results and after disclosure. Nurse scientists who want to use whole exome sequencing in their study design and methods must decide early in study development if they will return primary whole exome sequencing research results and if they will give research participants choices about learning incidental research results. It is critical that nurses translate their knowledge about whole exome sequencing into their patient education and patient advocacy roles and relevant programmes of research. © 2014 John Wiley & Sons Ltd.
Whole-Genome Sequencing and Variant Analysis of Human Papillomavirus 16 Infections.

PubMed

van der Weele, Pascal; Meijer, Chris J L M; King, Audrey J

2017-10-01

Human papillomavirus (HPV) is a strongly conserved DNA virus, high-risk types of which can cause cervical cancer in persistent infections. The most common type found in HPV-attributable cancer is HPV16, which can be subdivided into four lineages (A to D) with different carcinogenic properties. Studies have shown HPV16 sequence diversity in different geographical areas, but only limited information is available regarding HPV16 diversity within a population, especially at the whole-genome level. We analyzed HPV16 major variant diversity and conservation in persistent infections and performed a single nucleotide polymorphism (SNP) comparison between persistent and clearing infections. Materials were obtained in the Netherlands from a cohort study with longitudinal follow-up for up to 3 years. Our analysis shows a remarkably large variant diversity in the population. Whole-genome sequences were obtained for 57 persistent and 59 clearing HPV16 infections, resulting in 109 unique variants. Interestingly, persistent infections were completely conserved through time. One reinfection event was identified where the initial and follow-up samples clustered differently. Non-A1/A2 variants seemed to clear preferentially (P = 0.02). Our analysis shows that population-wide HPV16 sequence diversity is very large. In persistent infections, the HPV16 sequence was fully conserved. Sequencing can identify HPV16 reinfections, although occurrence is rare. SNP comparison identified no strongly acting effect of the viral genome affecting HPV16 infection clearance or persistence in up to 3 years of follow-up. These findings suggest the progression of an early HPV16 infection could be host related. IMPORTANCE Human papillomavirus 16 (HPV16) is the predominant type found in cervical cancer.
Progression of initial infection to cervical cancer has been linked to sequence properties; however, knowledge of variants circulating in European populations, especially with longitudinal follow-up, is limited. By sequencing a number of infections with known follow-up for up to 3 years, we gained initial insights into the genetic diversity of HPV16 and the effects of the viral genome on the persistence of infections. A SNP comparison between sequences obtained from clearing and persistent infections did not identify strongly acting DNA variations responsible for these infection outcomes. In addition, we identified an HPV16 reinfection event where sequencing of initial and follow-up samples showed different HPV16 variants. Based on conventional genotyping, this infection would incorrectly be considered a persistent HPV16 infection. In the context of vaccine efficacy and monitoring studies, such infections could potentially cause reduced reported efficacy or efficiency. Copyright © 2017 van der Weele et al.
Many amino acid substitution variants identified in DNA repair genes during human population screenings are predicted to impact protein function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xi, T; Jones, I M; Mohrenweiser, H W

2003-11-03

Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant From Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant". Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or Possibly Damaging". Another 9-15% of the variants were classed as "Potentially Intolerant or Damaging". The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign". Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.
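The percentages quoted in this abstract follow directly from the stated counts. As a small arithmetic check (the helper name `percent` is hypothetical; the counts are the ones given above):

```python
def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)

# Counts as reported in the abstract.
sift_intolerant = percent(226, 508)     # SIFT "Intolerant"
polyphen_damaging = percent(165, 489)   # PolyPhen "Probably or Possibly Damaging"

print(f"SIFT: {sift_intolerant}%  PolyPhen: {polyphen_damaging}%")
```

Note that the two algorithms scored different numbers of variants (508 versus 489), so the two percentages do not share a denominator.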

Genome characterization of the selected long- and short-sleep mouse lines.

PubMed

Dowell, Robin; Odell, Aaron; Richmond, Phillip; Malmer, Daniel; Halper-Stromberg, Eitan; Bennett, Beth; Larson, Colin; Leach, Sonia; Radcliffe, Richard A

2016-12-01

The Inbred Long- and Short-Sleep (ILS, ISS) mouse lines were selected for differences in acute ethanol sensitivity using the loss of righting response (LORR) as the selection trait. The lines show an over tenfold difference in LORR and, along with a recombinant inbred panel derived from them (the LXS), have been widely used to dissect the genetic underpinnings of acute ethanol sensitivity. Here we have sequenced the genomes of the ILS and ISS to investigate the DNA variants that contribute to their sensitivity difference. We identified ~2.7 million high-confidence SNPs and small indels and ~7000 structural variants between the lines; variants were found to occur in 6382 annotated genes. Using a hidden Markov model, we were able to reconstruct the genome-wide ancestry patterns of the eight inbred progenitor strains from which the ILS and ISS were derived, and found that quantitative trait loci that have been mapped for LORR were slightly enriched for DNA variants. Finally, by mapping and quantifying RNA-seq reads from the ILS and ISS to their strain-specific genomes rather than to the reference genome, we found a substantial improvement in a differential expression analysis between the lines. This work will help in identifying and characterizing the DNA sequence variants that contribute to the difference in ethanol sensitivity between the ILS and ISS and will also aid in accurate quantification of RNA-seq data generated from the LXS RIs.
Molecular dynamics simulations revealed structural differences among WRKY domain-DNA interaction in barley (Hordeum vulgare).

PubMed

Pandey, Bharati; Grover, Abhinav; Sharma, Pradeep

2018-02-12

The WRKY transcription factors are a class of DNA-binding proteins that are involved in diverse plant processes and play critical roles in response to abiotic and biotic stresses. Genome-wide divergence analysis of the WRKY gene family in Hordeum vulgare provided a framework for molecular evolution and functional roles. So far, the crystal structure of WRKY from barley has not been resolved; moreover, knowledge of the three-dimensional structure of the WRKY domain is a prerequisite for exploring the protein-DNA recognition mechanisms. A homology modelling-based approach was used to generate structures for the WRKY DNA binding domain (DBD) and its variants using AtWRKY1 as a template. Finally, the stability and conformational changes of the generated model in unbound and bound form were examined through atomistic molecular dynamics (MD) simulations over a 100 ns time period. In this study, we investigated the comparative binding pattern of the WRKY domain and its variants with the W-box cis-regulatory element using molecular docking and molecular dynamics (MD) simulation assays. The atomic insight into the WRKY domain exhibited significant variation in the intermolecular hydrogen bonding pattern, leading to structural anomalies in the variant type and differences in the DNA-binding specificities. Based on the MD analysis, residual contribution and interaction contour, wild-type WRKY (HvWRKY46) was found to interact with DNA through the highly conserved heptapeptide in the pre- and post-MD simulated complexes, whereas heptapeptide interaction with DNA was missing in variants (I and II) in post-MD complexes. Consequently, through principal component analysis, wild-type WRKY was also found to be more stable, occupying a reduced conformational space compared with variant I (HvWRKY34). Lastly, high binding free energy for wild-type and variant II allowed us to conclude that the wild-type WRKY-DNA complex was more stable relative to variant I.
The results of our study revealed complete dynamic and structural information about WRKY domain-DNA interactions. However, no structure-based information has been reported to date for WRKY variants and their mechanism of interaction with DNA. Our findings highlighted the importance of selecting a sequence to generate newer transgenic plants that would be increasingly tolerant of stress conditions.
Time-course Interactions between Cell Proliferation and DNA Sequence Variants in a Mouse Model of Latent Carcinogenicity

EPA Science Inventory

A fundamental principle of non-mutagenic chemical carcinogenesis is that increased cell proliferation enhances spontaneous DNA damage. Over time, this damage drives mutations in oncogenic genes that ultimately lead to cancer. This concept is a central part of cancer mode of actio...
Pooled Resequencing of 122 Ulcerative Colitis Genes in a Large Dutch Cohort Suggests Population-Specific Associations of Rare Variants in MUC2.

PubMed

Visschedijk, Marijn C; Alberts, Rudi; Mucha, Soren; Deelen, Patrick; de Jong, Dirk J; Pierik, Marieke; Spekhorst, Lieke M; Imhann, Floris; van der Meulen-de Jong, Andrea E; van der Woude, C Janneke; van Bodegraven, Adriaan A; Oldenburg, Bas; Löwenberg, Mark; Dijkstra, Gerard; Ellinghaus, David; Schreiber, Stefan; Wijmenga, Cisca; Rivas, Manuel A; Franke, Andre; van Diemen, Cleo C; Weersma, Rinse K

2016-01-01

Genome-wide association studies have revealed several common genetic risk variants for ulcerative colitis (UC). However, little is known about the contribution of rare, large effect genetic variants to UC susceptibility. In this study, we performed a deep targeted re-sequencing of 122 genes in Dutch UC patients in order to investigate the contribution of rare variants to the genetic susceptibility to UC. The selection of genes consists of 111 established human UC susceptibility genes and 11 genes that lead to spontaneous colitis when knocked-out in mice. In addition, we sequenced the promoter regions of 45 genes where known variants exert cis-eQTL-effects. Targeted pooled re-sequencing was performed on DNA of 790 Dutch UC cases. The Genome of the Netherlands project provided sequence data of 500 healthy controls. After quality control and prioritization based on allele frequency and pathogenicity probability, follow-up genotyping of 171 rare variants was performed on 1021 Dutch UC cases and 1166 Dutch controls. Single-variant association and gene-based analyses identified an association of rare variants in the MUC2 gene with UC. The associated variants in the Dutch population could not be replicated in a German replication cohort (1026 UC cases, 3532 controls). In conclusion, this study has identified a putative role for MUC2 on UC susceptibility in the Dutch population and suggests a population-specific contribution of rare variants to UC.
A double mutation in exon 6 of the β-hexosaminidase α subunit in a patient with the B1 variant of Tay-Sachs disease

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ainsworth, P.J.; Coulter-Mackie, M.B.

1992-10-01

The B1 variant form of Tay-Sachs disease is enzymologically unique in that the causative mutation(s) appear to affect the active site in the α subunit of β-hexosaminidase A without altering its ability to associate with the β subunit. Most previously reported B1 variant mutations were found in exon 5 within codon 178. The coding sequence of the α subunit gene of a patient with the B1 variant form was examined with a combination of reverse transcription of mRNA to cDNA, PCR, and dideoxy sequencing. A double mutation in exon 6 has been identified: a G574→C transversion causing a val192→leu change and a G598→A transition resulting in a val200→met alteration. The amplified cDNAs were otherwise normal throughout their sequence. The 574 and 598 alterations have been confirmed by amplification directly from genomic DNA from the patient and her mother. Transient-expression studies of the two exon 6 mutations (singly or together) in COS-1 cells show that the G574→C change is sufficient to cause the loss of enzyme activity. The biochemical phenotype of the 574 alteration in transfection studies is consistent with that expected for a B1 variant mutation. As such, this mutation differs from previously reported B1 variant mutations, all of which occur in exon 5. 31 refs., 2 figs., 2 tabs.
Somatic APC mosaicism and oligogenic inheritance in genetically unsolved colorectal adenomatous polyposis patients.

PubMed

Ciavarella, Michele; Miccoli, Sara; Prossomariti, Anna; Pippucci, Tommaso; Bonora, Elena; Buscherini, Francesco; Palombo, Flavia; Zuntini, Roberta; Balbi, Tiziana; Ceccarelli, Claudio; Bazzoli, Franco; Ricciardiello, Luigi; Turchetti, Daniela; Piazzi, Giulia

2018-03-01

Germline variants in the APC gene cause familial adenomatous polyposis. Inherited variants in MutYH, POLE, POLD1, NTHL1, and MSH3 genes and somatic APC mosaicism have been reported as alternative causes of polyposis. However, ~30-50% of cases of polyposis remain genetically unsolved. Thus, the aim of this study was to investigate the genetic causes of unexplained adenomatous polyposis. Eight sporadic cases with >20 adenomatous polyps by 35 years of age or >50 adenomatous polyps by 55 years of age, and no causative germline variants in APC and/or MutYH, were enrolled from a cohort of 56 subjects with adenomatous colorectal polyposis. APC gene mosaicism was investigated on DNA from colonic adenomas by Sanger sequencing or Whole Exome Sequencing (WES). Mosaicism extension to other tissues (peripheral blood, saliva, hair follicles) was evaluated using Sanger sequencing and/or digital PCR. APC second hit was investigated in adenomas from mosaic patients. WES was performed on DNA from peripheral blood to identify additional polyposis candidate variants. We identified APC mosaicism in 50% of patients. In three cases mosaicism was restricted to the colon, while in one it also extended to the duodenum and saliva. One patient without APC mosaicism, carrying an APC in-frame deletion of uncertain significance, was found to harbor rare germline variants in OGG1, POLQ, and EXO1 genes. In conclusion, our restrictive selection criteria improved the detection of mosaic APC patients. In addition, we showed for the first time that an oligogenic inheritance of rare variants might have a cooperative role in sporadic colorectal polyposis onset.
HapFABIA: Identification of very short segments of identity by descent characterized by rare variants in large sequencing data

PubMed Central

Hochreiter, Sepp

2013-01-01

Identity by descent (IBD) can be reliably detected for long shared DNA segments, which are found in related individuals. However, many studies contain cohorts of unrelated individuals that share only short IBD segments. New sequencing technologies facilitate identification of short IBD segments through rare variants, which convey more information on IBD than common variants. Current IBD detection methods, however, are not designed to use rare variants for the detection of short IBD segments. Short IBD segments reveal genetic structures at high resolution. Therefore, they can help to improve imputation and phasing, to increase genotyping accuracy for low-coverage sequencing and to increase the power of association studies. Since short IBD segments are further assumed to be old, they can shed light on the evolutionary history of humans. We propose HapFABIA, a computational method that applies biclustering to identify very short IBD segments characterized by rare variants. HapFABIA is designed to detect short IBD segments in genotype data that were obtained from next-generation sequencing, but can also be applied to DNA microarray data. Especially in next-generation sequencing data, HapFABIA exploits rare variants for IBD detection. HapFABIA significantly outperformed competing algorithms at detecting short IBD segments on artificial and simulated data with rare variants. HapFABIA identified 160 588 different short IBD segments characterized by rare variants with a median length of 23 kb (mean 24 kb) in data for chromosome 1 of the 1000 Genomes Project. These short IBD segments contain 752 000 single nucleotide variants (SNVs), which account for 39% of the rare variants and 23.5% of all variants. The vast majority—152 000 IBD segments—are shared by Africans, while only 19 000 and 11 000 are shared by Europeans and Asians, respectively. 
IBD segments that match the Denisova or the Neandertal genome are found significantly more often in Asians and Europeans but also, in some cases exclusively, in Africans. The lengths of IBD segments and their sharing between continental populations indicate that many short IBD segments from chromosome 1 existed before humans migrated out of Africa. Thus, rare variants that tag these short IBD segments predate human migration from Africa. The software package HapFABIA is available from Bioconductor. All data sets, result files and programs for data simulation, preprocessing and evaluation are supplied at http://www.bioinf.jku.at/research/short-IBD. PMID:24174545
Single-cell whole exome and targeted sequencing in NPM1/FLT3 positive pediatric acute myeloid leukemia.

PubMed

Walter, Christiane; Pozzorini, Christian; Reinhardt, Katarina; Geffers, Robert; Xu, Zhenyu; Reinhardt, Dirk; von Neuhoff, Nils; Hanenberg, Helmut

2018-02-01

The small portion of leukemic stem cells (LSCs) in acute myeloid leukemia (AML) present in children and adolescents is often masked by the high background of AML blasts and normal hematopoietic cells. The aim of the current study was to establish a simple workflow for reliable genetic analysis of single LSC-enriched blasts from pediatric patients. For three AMLs with mutations in nucleophosmin 1 and/or fms-like tyrosine kinase 3, we performed whole genome amplification on sorted single-cell DNA followed by whole exome sequencing (WES). The corresponding bulk bone marrow DNAs were also analyzed by WES and by targeted sequencing (TS) that included 54 genes associated with myeloid malignancies. Analysis revealed that read coverage statistics were comparable between single-cell and bulk WES data, indicating high-quality whole genome amplification. From 102 single-cell variants, 72 single nucleotide variants and insertions or deletions (70%) were consistently found in the two bulk DNA analyses. Variants reliably detected in single cells were also present in TS. However, initial screening by WES with read counts between 50-72× failed to detect rare AML subclones in the bulk DNAs. In summary, our study demonstrated that single-cell WES combined with bulk DNA TS is a promising tool set for detecting AML subclones and possibly LSCs. © 2017 Wiley Periodicals, Inc.
Quantitation of heteroplasmy of mtDNA sequence variants identified in a population of AD patients and controls by array-based resequencing.

PubMed

Coon, Keith D; Valla, Jon; Szelinger, Szabolics; Schneider, Lonnie E; Niedzielko, Tracy L; Brown, Kevin M; Pearson, John V; Halperin, Rebecca; Dunckley, Travis; Papassotiropoulos, Andreas; Caselli, Richard J; Reiman, Eric M; Stephan, Dietrich A

2006-08-01

The role of mitochondrial dysfunction in the pathogenesis of Alzheimer's disease (AD) has been well documented. Though evidence for the role of mitochondria in AD seems incontrovertible, the impact of mitochondrial DNA (mtDNA) mutations in AD etiology remains controversial. Though mutations in mitochondrially encoded genes have repeatedly been implicated in the pathogenesis of AD, many of these studies have been plagued by lack of replication as well as potential contamination of nuclear-encoded mitochondrial pseudogenes. To assess the role of mtDNA mutations in the pathogenesis of AD, while avoiding the pitfalls of nuclear-encoded mitochondrial pseudogenes encountered in previous investigations and showcasing the benefits of a novel resequencing technology, we sequenced the entire coding region (15,452 bp) of mtDNA from 19 extremely well-characterized AD patients and 18 age-matched, unaffected controls utilizing a new, reliable, high-throughput array-based resequencing technique, the Human MitoChip. High-throughput, array-based DNA resequencing of the entire mtDNA coding region from platelets of 37 subjects revealed the presence of 208 loci displaying a total of 917 sequence variants. There were no statistically significant differences in overall mutational burden between cases and controls, however, 265 independent sites of statistically significant change between cases and controls were identified. Changed sites were found in genes associated with complexes I (30.2%), III (3.0%), IV (33.2%), and V (9.1%) as well as tRNA (10.6%) and rRNA (14.0%). Despite their statistical significance, the subtle nature of the observed changes makes it difficult to determine whether they represent true functional variants involved in AD etiology or merely naturally occurring dissimilarity. 
Regardless, this study demonstrates the tremendous value of this novel mtDNA resequencing platform, which avoids the pitfalls of erroneously amplifying nuclear-encoded mtDNA pseudogenes, and our proposed analysis paradigm, which utilizes the availability of raw signal intensity values for each of the four potential alleles to facilitate quantitative estimates of mtDNA heteroplasmy. This information provides a potential new target for burgeoning diagnostics and therapeutics that could truly assist those suffering from this devastating disorder.
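The abstract mentions using raw signal intensities for each of the four potential alleles to estimate heteroplasmy. The sketch below is one simple way such an estimate could be formed and is not the authors' published analysis: it assumes background-corrected, non-negative intensities per base and treats each base's share of the total signal as its allele fraction. The function names and the example intensities are hypothetical.

```python
def allele_fractions(intensities: dict[str, float]) -> dict[str, float]:
    """Convert per-base signal intensities at one position into fractions.

    Assumption: intensities are already background-corrected and
    non-negative, so each base's share of the total approximates
    its allele fraction.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {base: value / total for base, value in intensities.items()}

def heteroplasmy(intensities: dict[str, float], reference_base: str) -> float:
    """Fraction of signal that does not come from the reference base."""
    return 1.0 - allele_fractions(intensities)[reference_base]

# Hypothetical intensities at a single mtDNA position.
site = {"A": 900.0, "C": 0.0, "G": 100.0, "T": 0.0}
print(allele_fractions(site))
print(heteroplasmy(site, reference_base="A"))
```

A real array analysis would also need to handle probe-specific noise and cross-hybridization, which this sketch ignores.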
Structure and Function of the Splice Variants of TMPRSS2-ERG, a Prevalent Genomic Alteration in Prostate Cancer

DTIC Science & Technology

2009-09-01

binding ETS domain) and five type II (without ETS domain). Fusion-positive type I– and type II–containing phages were amplified with T3 and T7 primers...will be performed to identify the authentic 3’ UTRs from the mRNA pool from CaP patient specimens. Using phage excision strategy, we will use to... phage DNA sequences plasmids (cDNA) clones were generated by using phage excision strategy. Figure 1. ERG splice variants in prostate cancer
Whole-Exome Sequencing to Decipher the Genetic Heterogeneity of Hearing Loss in a Chinese Family with Deaf by Deaf Mating

PubMed Central

Qing, Jie; Yan, Denise; Zhou, Yuan; Liu, Qiong; Wu, Weijing; Xiao, Zian; Liu, Yuyuan; Liu, Jia; Du, Lilin; Xie, Dinghua; Liu, Xue Zhong

2014-01-01

Inherited deafness has been shown to have high genetic heterogeneity. For many decades, linkage analysis and candidate gene approaches have been the main tools to elucidate the genetics of hearing loss. However, these study designs are costly, time-consuming, and unsuitable for small families, mainly because of the inadequate numbers of available affected individuals, locus heterogeneity, and assortative mating. Owing to recent advances in next-generation sequencing (NGS) technologies, exome sequencing has now become a technically feasible and cost-effective method for detecting disease variants underlying Mendelian disorders. In the present study, we combined the Deafness Gene Mutation Detection Array and exome sequencing to identify deafness-causative variants in a large Chinese composite family with deaf by deaf mating. Simultaneous screening of the 9 common deafness mutations using the allele-specific PCR-based universal array resulted in the identification of the 1555A>G mutation in the mitochondrial DNA (mtDNA) 12S rRNA gene in affected individuals in one branch of the family. We then subjected the mutation-negative cases to exome sequencing and identified novel causative variants in the MYH14 and WFS1 genes. This report confirms the effective use of an NGS technique to detect pathogenic mutations in affected individuals who were not candidates for classical genetic studies. PMID:25289672
Detection and Heterogeneity of Herpesviruses Causing Pacheco's Disease in Parrots

PubMed Central

Tomaszewski, Elizabeth; Wilson, Van G.; Wigle, William L.; Phalen, David N.

2001-01-01

Pacheco's disease (PD) is a common, often fatal, disease of parrots. We cloned a virus isolate from a parrot that had characteristic lesions of PD. Three viral clones were partially sequenced, demonstrating that this virus was an alphaherpesvirus most closely related to the gallid herpesvirus 1. Five primer sets were developed from these sequences. The primer sets were used with PCR to screen tissues or tissue culture media suspected to contain viruses from 54 outbreaks of PD. The primer sets amplified DNA from all but one sample. Ten amplification patterns were detected, indicating that PD is caused by a genetically heterogeneous population of viruses. A single genetic variant (psittacid herpesvirus variant 1) amplified with all primer sets and was the most common virus variant (62.7%). A single primer set (23F) amplified DNA from all of the positive samples, suggesting that PCR could be used as a rapid postmortem assay for these viruses. PCR was found to be significantly more sensitive than tissue culture for the detection of psittacid herpesviruses. PMID:11158102
Intragenomic sequence variation at the ITS1 - ITS2 region and at the 18S and 28S nuclear ribosomal DNA genes of the New Zealand mud snail, Potamopyrgus antipodarum (Hydrobiidae: mollusca)

USGS Publications Warehouse

Hoy, Marshal S.; Rodriguez, Rusty J.

2013-01-01

Molecular genetic analysis was conducted on two populations of the invasive non-native New Zealand mud snail (Potamopyrgus antipodarum), one from a freshwater ecosystem in Devil's Lake (Oregon, USA) and the other from an ecosystem of higher salinity in the Columbia River estuary (Hammond Harbor, Oregon, USA). To elucidate potential genetic differences between the two populations, three segments of nuclear ribosomal DNA (rDNA), the ITS1-ITS2 regions and the 18S and 28S rDNA genes were cloned and sequenced. Variant sequences within each individual were found in all three rDNA segments. Folding models were utilized for secondary structure analysis and results indicated that there were many sequences which contained structure-altering polymorphisms, which suggests they could be nonfunctional pseudogenes. In addition, analysis of molecular variance (AMOVA) was used for hierarchical analysis of genetic variance to estimate variation within and among populations and within individuals. AMOVA revealed significant variation in the ITS region between the populations and among clones within individuals, while in the 5.8S rDNA significant variation was revealed among individuals within the two populations. High levels of intragenomic variation were found in the ITS regions, which are known to be highly variable in many organisms. More interestingly, intragenomic variation was also found in the 18S and 28S rDNA, which has rarely been observed in animals and is so far unreported in Mollusca. We postulate that in these P. antipodarum populations the effects of concerted evolution are diminished due to the fact that not all of the rDNA genes in their polyploid genome should be essential for sustaining cellular function. This could lead to a lessening of selection pressures, allowing mutations to accumulate in some copies, changing them into variant sequences.
Molecular characterization of canine parvovirus strains in Argentina: Detection of the pathogenic variant CPV2c in vaccinated dogs.

PubMed

Calderon, Marina Gallo; Mattion, Nora; Bucafusco, Danilo; Fogel, Fernando; Remorini, Patricia; La Torre, Jose

2009-08-01

PCR amplification with sequence-specific primers was used to detect canine parvovirus (CPV) DNA in 38 rectal swabs from Argentine domestic dogs with symptoms compatible with parvovirus disease. Twenty-seven out of 38 samples analyzed were CPV positive. The classical CPV2 strain was not detected in any of the samples, but nine samples were identified as CPV2a variant and 18 samples as CPV2b variant. Further sequence analysis revealed a mutation at amino acid 426 of the VP2 gene (Asp426Glu), characteristic of the CPV2c variant, in 14 out of 18 of the samples identified initially by PCR as CPV2b. The appearance of CPV2c variant in Argentina might be dated at least to the year 2003. Three different pathogenic CPV variants circulating currently in the Argentine domestic dog population were identified, with CPV2c being the only variant affecting vaccinated and unvaccinated dogs during the year 2008.
Reanalysis of BRCA1/2 negative high risk ovarian cancer patients reveals novel germline risk loci and insights into missing heritability

PubMed Central

Dyson, Gregory; Levin, Nancy K.; Chaudhry, Sophia; Rosati, Rita; Kalpage, Hasini; Simon, Michael S.; Tainsky, Michael A.

2017-01-01

While up to 25% of ovarian cancer (OVCA) cases are thought to be due to inherited factors, the majority of genetic risk remains unexplained. To address this gap, we sought to identify previously undescribed OVCA risk variants through whole exome sequencing (WES) and candidate gene analysis of 48 women with ovarian cancer who were selected for high risk of genetic inheritance yet were negative for any known pathogenic variants in either BRCA1 or BRCA2. In silico SNP analysis was employed to identify suspect variants, followed by validation using Sanger DNA sequencing. We identified five pathogenic variants in our sample, four of which are in two genes featured on current multi-gene panels (RAD51D and ATM). In addition, we found a pathogenic FANCM variant (R1931*) which has been recently implicated in familial breast cancer risk. Numerous rare and predicted to be damaging variants of unknown significance were detected in genes on current commercial testing panels, most prominently in ATM (n = 6) and PALB2 (n = 5). The BRCA2 variant p.K3326*, resulting in a 93 amino acid truncation, was overrepresented in our sample (odds ratio = 4.95, p = 0.01) and coexisted in the germline of these women with other deleterious variants, suggesting a possible role as a modifier of genetic penetrance. Furthermore, we detected loss of function variants in non-panel genes involved in OVCA relevant pathways (DNA repair and cell cycle control), including CHEK1, TP53I3, REC8, HMMR, RAD52, RAD1, POLK, POLQ, and MCM4. In summary, our study implicates novel risk loci and highlights the clinical utility of retesting BRCA1/2 negative OVCA patients by genomic sequencing and analysis of genes in relevant pathways. PMID:28591191
Brief Report: Late-Onset Cryopyrin-Associated Periodic Syndrome Due to Myeloid-Restricted Somatic NLRP3 Mosaicism.

PubMed

Mensa-Vilaro, Anna; Teresa Bosque, María; Magri, Giuliana; Honda, Yoshitaka; Martínez-Banaclocha, Helios; Casorran-Berges, Marta; Sintes, Jordi; González-Roca, Eva; Ruiz-Ortiz, Estibaliz; Heike, Toshio; Martínez-Garcia, Juan J; Baroja-Mazo, Alberto; Cerutti, Andrea; Nishikomori, Ryuta; Yagüe, Jordi; Pelegrín, Pablo; Delgado-Beltran, Concha; Aróstegui, Juan I

2016-12-01

Gain-of-function NLRP3 mutations cause cryopyrin-associated periodic syndrome (CAPS), with gene mosaicism playing a relevant role in the pathogenesis. This study was undertaken to characterize the genetic cause underlying late-onset but otherwise typical CAPS. We studied a 64-year-old patient who presented with recurrent episodes of urticaria-like rash, fever, conjunctivitis, and oligoarthritis at age 56 years. DNA was extracted from both unfractionated blood and isolated leukocyte and CD34+ subpopulations. Genetic studies were performed using both the Sanger method of DNA sequencing and next-generation sequencing (NGS) methods. In vitro and ex vivo analyses were performed to determine the consequences of the detected variant for the normal structure or function of the protein. NGS analyses revealed the novel p.Gln636Glu NLRP3 variant in unfractionated blood, with an allele frequency (18.4%) compatible with gene mosaicism. Sanger sequence chromatograms revealed a small peak corresponding to the variant allele. Amplicon-based deep sequencing revealed somatic NLRP3 mosaicism restricted to myeloid cells (31.8% in monocytes, 24.6% in neutrophils, and 11.2% in circulating CD34+ common myeloid progenitor cells) and its complete absence in lymphoid cells. Functional analyses confirmed the gain-of-function behavior of the gene variant and hyperactivity of the NLRP3 inflammasome in the patient. Treatment with anakinra resulted in good control of the disease. We identified the novel gain-of-function p.Gln636Glu NLRP3 mutation, which was detected as a somatic mutation restricted to myeloid cells, as the cause of late-onset but otherwise typical CAPS. Our results expand the diversity of CAPS toward milder phenotypes than previously reported, including those starting during adulthood. © 2016, American College of Rheumatology.
Toward rules relating zinc finger protein sequences and DNA binding site preferences.

PubMed

Desjarlais, J R; Berg, J M

1992-08-15

Zinc finger proteins of the Cys2-His2 type consist of tandem arrays of domains, where each domain appears to contact three adjacent base pairs of DNA through three key residues. We have designed and prepared a series of variants of the central zinc finger within the DNA binding domain of Sp1 by using information from an analysis of a large data base of zinc finger protein sequences. Through systematic variations at two of the three contact positions (underlined), relatively specific recognition of sequences of the form 5'-GGGGN(G or T)GGG-3' has been achieved. These results provide the basis for rules that may develop into a code that will allow the design of zinc finger proteins with preselected DNA site specificity.
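The site family quoted in the abstract above, 5'-GGGGN(G or T)GGG-3', can be written as a simple pattern for scanning sequences. The following is a minimal sketch, not part of the original study: the function name `find_sites` is hypothetical, only the strand given is searched, and a pattern match says nothing about binding affinity.

```python
import re

# Site family from the abstract: 5'-GGGGN(G or T)GGG-3'
# N = any base; the sixth position is restricted to G or T.
# The lookahead lets overlapping occurrences be reported as well.
SITE = re.compile(r"(?=(GGGG[ACGT][GT]GGG))")

def find_sites(seq):
    """Return (0-based start, matched 9-mer) for each occurrence on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in SITE.finditer(seq)]

if __name__ == "__main__":
    print(find_sites("ATGGGGCGGGGTA"))
```

To scan both strands, the same function could be applied to the reverse complement of the input.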
Mitochondrial DNA Variant in COX1 Subunit Significantly Alters Energy Metabolism of Geographically Divergent Wild Isolates in Caenorhabditis elegans

PubMed Central

Dingley, Stephen D.; Polyak, Erzsebet; Ostrovsky, Julian; Srinivasan, Satish; Lee, Icksoo; Rosenfeld, Amy B.; Tsukikawa, Mai; Xiao, Rui; Selak, Mary A.; Coon, Joshua J.; Hebert, Alexander S.; Grimsrud, Paul A.; Kwon, Young Joon; Pagliarini, David J.; Gai, Xiaowu; Schurr, Theodore G.; Hüttemann, Maik; Nakamaru-Ogiso, Eiko; Falk, Marni J.

2014-01-01

Mitochondrial DNA (mtDNA) sequence variation can influence the penetrance of complex diseases and climatic adaptation. While studies in geographically defined human populations suggest that mtDNA mutations become fixed when they have conferred metabolic capabilities optimally suited for a specific environment, it has been challenging to definitively assign adaptive functions to specific mtDNA sequence variants in mammals. We investigated whether mtDNA genome variation functionally influences Caenorhabditis elegans wild isolates of distinct mtDNA lineages and geographic origins. We found that, relative to N2 (England) wild-type nematodes, CB4856 wild isolates from a warmer native climate (Hawaii) had a unique p.A12S amino acid substitution in the mtDNA-encoded COX1 core catalytic subunit of mitochondrial complex IV (CIV). Relative to N2, CB4856 worms grown at 20 °C had significantly increased CIV enzyme activity, mitochondrial matrix oxidant burden, and sensitivity to oxidative stress but had significantly reduced lifespan and mitochondrial membrane potential. Interestingly, mitochondrial membrane potential was significantly increased in CB4856 grown at its native temperature of 25 °C. A transmitochondrial cybrid worm strain, chpIR (M, CB4856 > N2), was bred as homoplasmic for the CB4856 mtDNA genome in the N2 nuclear background. The cybrid strain also displayed significantly increased CIV activity, demonstrating that this difference results from the mtDNA-encoded p.A12S variant. However, chpIR (M, CB4856 > N2) worms had significantly reduced median and maximal lifespan relative to CB4856, which may relate to their nuclear-mtDNA genome mismatch. Overall, these data suggest that C. elegans wild isolates of varying geographic origins may adapt to environmental challenges through mtDNA variation to modulate critical aspects of mitochondrial energy metabolism. PMID:24534730
Rare Variants in RTEL1 Are Associated with Familial Interstitial Pneumonia

PubMed Central

Cogan, Joy D.; Zhao, Min; Mitchell, Daphne B.; Rives, Lynette; Markin, Cheryl; Garnett, Errine T.; Montgomery, Keri H.; Mason, Wendi R.; McKean, David F.; Powers, Julia; Murphy, Elissa; Olson, Lana M.; Choi, Leena; Cheng, Dong-Sheng; Blue, Elizabeth Marchani; Young, Lisa R.; Lancaster, Lisa H.; Steele, Mark P.; Brown, Kevin K.; Schwarz, Marvin I.; Fingerlin, Tasha E.; Schwartz, David A.; Lawson, William E.; Loyd, James E.; Zhao, Zhongming; Phillips, John A.; Blackwell, Timothy S.

2015-01-01

Rationale: Up to 20% of cases of idiopathic interstitial pneumonia cluster in families, comprising the syndrome of familial interstitial pneumonia (FIP); however, the genetic basis of FIP remains uncertain in most families. Objectives: To determine if new disease-causing rare genetic variants could be identified using whole-exome sequencing of affected members from FIP families, providing additional insights into disease pathogenesis. Methods: Affected subjects from 25 kindreds were selected from an ongoing FIP registry for whole-exome sequencing from genomic DNA. Candidate rare variants were confirmed by Sanger sequencing, and cosegregation analysis was performed in families, followed by additional sequencing of affected individuals from another 163 kindreds. Measurements and Main Results: We identified a potentially damaging rare variant in the gene encoding for regulator of telomere elongation helicase 1 (RTEL1) that segregated with disease and was associated with very short telomeres in peripheral blood mononuclear cells in 1 of 25 families in our original whole-exome sequencing cohort. Evaluation of affected individuals in 163 additional kindreds revealed another eight families (4.7%) with heterozygous rare variants in RTEL1 that segregated with clinical FIP. Probands and unaffected carriers of these rare variants had short telomeres (<10% for age) in peripheral blood mononuclear cells and increased T-circle formation, suggesting impaired RTEL1 function. Conclusions: Rare loss-of-function variants in RTEL1 represent a newly defined genetic predisposition for FIP, supporting the importance of telomere-related pathways in pulmonary fibrosis. PMID:25607374
Local alignment of two-base encoded DNA sequence

PubMed Central

Homer, Nils; Merriman, Barry; Nelson, Stanley F

2009-01-01

Background: DNA sequence comparison is based on optimal local alignment of two sequences using a similarity score. However, some new DNA sequencing technologies do not directly measure the base sequence, but rather an encoded form, such as the two-base encoding considered here. In order to compare such data to a reference sequence, the data must be decoded into sequence. The decoding is deterministic, but the possibility of measurement errors requires searching among all possible error modes and resulting alignments to achieve an optimal balance of fewer errors versus greater sequence similarity. Results: We present an extension of the standard dynamic programming method for local alignment, which simultaneously decodes the data and performs the alignment, maximizing a similarity score based on a weighted combination of errors and edits, and allowing an affine gap penalty. We also present simulations that demonstrate the performance characteristics of our two-base encoded alignment method and contrast those with standard DNA sequence alignment under the same conditions. Conclusion: The new local alignment algorithm for two-base encoded data has substantial power to properly detect and correct measurement errors while identifying underlying sequence variants, and facilitating genome re-sequencing efforts based on this form of sequence data. PMID:19508732
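To illustrate why decoding and alignment interact, here is a minimal sketch of one common two-base ("color space") scheme, in which each color is the XOR of the 2-bit codes of adjacent bases. The abstract does not spell out the encoding, so the mapping A=0, C=1, G=2, T=3, the primer base "T", and the function names are assumptions made for illustration; this is not the authors' alignment algorithm.

```python
BASES = "ACGT"
CODE = {base: i for i, base in enumerate(BASES)}  # assumed mapping: A=0, C=1, G=2, T=3

def encode(seq, primer="T"):
    """Encode bases as colors; each color depends on the previous and current base."""
    prev = CODE[primer]
    colors = []
    for base in seq:
        cur = CODE[base]
        colors.append(prev ^ cur)
        prev = cur
    return colors

def decode(colors, primer="T"):
    """Deterministically decode colors back to bases, starting from the known primer base."""
    prev = CODE[primer]
    out = []
    for color in colors:
        prev ^= color
        out.append(BASES[prev])
    return "".join(out)

if __name__ == "__main__":
    colors = encode("ACGT")
    print(colors, decode(colors))
    # Corrupt one color: every base from that position onward decodes differently.
    bad = list(colors)
    bad[1] ^= 1
    print(bad, decode(bad))
```

Under this scheme a single miscalled color changes every downstream base when decoded naively, which is consistent with the abstract's point that measurement errors have to be considered together with the alignment rather than decoded away beforehand.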

Mapping neurofibromatosis 1 homologous loci by fluorescence in situ hybridization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Viskochil, D.; Breidenbach, H.H.; Cawthon, R.

Neurofibromatosis 1 maps to chromosome band 17q11.2 and the NF1 gene is comprised of 59 exons that span approximately 335 kb of genomic DNA. In order to further analyze the structure of NF1 from exons 2 through 27b, we isolated a number of cosmid and bacteriophage P-1 genomic clones using NF1-exon probes under high-stringency hybridization conditions. Using tagged, intron-based primers and DNA from various clones as a template, we PCR-amplified and sequenced individual NF1 exons. The exon sequences in PCR products from several genomic clones differed from the exon sequence derived from cloned NF1 cDNAs. Clones with variant sequences were mapped by fluorescence in situ hybridization under high-stringency conditions. Three clones mapped to chromosome band 15q11.2, one mapped to 14q11.2, one mapped to both 2q14.1-14.3 and 14q11.2, one mapped to 2q33-34, and one mapped to both 18q11.2 and 21q21. Even though some PCR-product sequences retained proper splice junctions and open reading frames, we have yet to identify cDNAs that correspond to the variant exon sequences. We are now sequencing clones that map to NF1-homologous loci in order to develop discriminating primer pairs for the exclusive amplification of NF1-specific sequences in our efforts to develop a comprehensive NF1 mutation screen using genomic DNA as template. The role that NF1-homologous sequences may play in neurofibromatosis 1 is not clear.
Distinct Patterns of Somatic Mosaicism in the APC Gene in Neoplasms From Patients With Unexplained Adenomatous Polyposis.

PubMed

Jansen, Anne M L; Crobach, Stijn; Geurts-Giele, Willemina R R; van den Akker, Brendy E W M; Garcia, Marina Ventayol; Ruano, Dina; Nielsen, Maartje; Tops, Carli M J; Wijnen, Juul T; Hes, Frederik J; van Wezel, Tom; Dinjens, Winand N M; Morreau, Hans

2017-02-01

We investigated the presence and patterns of mosaicism in the APC gene in patients with colon neoplasms not associated with any other genetic variants; we performed deep sequence analysis of APC in at least 2 adenomas or carcinomas per patient. We identified mosaic variants in APC in adenomas from 9 of the 18 patients with 21 to approximately 100 adenomas. Mosaic variants of APC were variably detected in leukocyte DNA and/or non-neoplastic intestinal mucosa of these patients. In a comprehensive sequence analysis of 1 patient, we found no evidence for mosaicism in APC in non-neoplastic intestinal mucosa. One patient was found to carry a mosaic c.4666dupA APC variant in only 10 of 16 adenomas, indicating the importance of screening 2 or more adenomas for genetic variants. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
Two different size classes of 5S rDNA units coexisting in the same tandem array in the razor clam Ensis macha: is this region suitable for phylogeographic studies?

PubMed

Fernández-Tajes, Juan; Méndez, Josefina

2009-12-01

For a study of 5S ribosomal genes (rDNA) in the razor clam Ensis macha, the 5S rDNA region was amplified and sequenced. Two variants, so-called type I or short repeat (approximately 430 bp) and type II or long repeat (approximately 735 bp), appeared to be the main components of the 5S rDNA of this species. Their spacers differed markedly, both in length and nucleotide composition. The organization of the two variants was investigated by amplifying the genomic DNA with primers based on the sequence of the type I and type II spacers. PCR amplification products with primers EMLbF and EMSbR showed that the long and short repeats are associated within the same tandem array, suggesting an intermixed arrangement of both spacers. Nevertheless, amplifications carried out with inverse primers EMSinvF/R and EMLinvF/R revealed that some short and long repeats are contiguous in the same tandem array. This is the first report of the coexistence of two variable spacers in the same tandem array in bivalve mollusks.
Next-Generation Sequencing in Oncology: Genetic Diagnosis, Risk Prediction and Cancer Classification

PubMed Central

Kamps, Rick; Brandão, Rita D.; van den Bosch, Bianca J.; Paulussen, Aimee D. C.; Xanthoulea, Sofia; Blok, Marinus J.; Romano, Andrea

2017-01-01

Next-generation sequencing (NGS) technology has expanded in the last decades with significant improvements in the reliability, sequencing chemistry, pipeline analyses, data interpretation and costs. Such advances make the use of NGS feasible in clinical practice today. This review describes the recent technological developments in NGS applied to the field of oncology. A number of clinical applications are reviewed, i.e., mutation detection in inherited cancer syndromes based on DNA-sequencing, detection of spliceogenic variants based on RNA-sequencing, DNA-sequencing to identify risk modifiers and application for pre-implantation genetic diagnosis, cancer somatic mutation analysis, pharmacogenetics and liquid biopsy. Conclusive remarks, clinical limitations, implications and ethical considerations that relate to the different applications are provided. PMID:28146134
MYO7A and USH2A gene sequence variants in Italian patients with Usher syndrome.

PubMed

Sodi, Andrea; Mariottini, Alessandro; Passerini, Ilaria; Murro, Vittoria; Tachyla, Iryna; Bianchi, Benedetta; Menchini, Ugo; Torricelli, Francesca

2014-01-01

To analyze the spectrum of sequence variants in the MYO7A and USH2A genes in a group of Italian patients affected by Usher syndrome (USH). Thirty-six Italian patients with a diagnosis of USH were recruited. They received a standard ophthalmologic examination, visual field testing, optical coherence tomography (OCT) scan, and electrophysiological tests. Fluorescein angiography and fundus autofluorescence imaging were performed in selected cases. All the patients underwent an audiologic examination for the 0.25-8,000 Hz frequencies. Vestibular function was evaluated with specific tests. DNA samples were analyzed for sequence variants of the MYO7A gene (for USH1) and the USH2A gene (for USH2) with direct sequencing techniques. A few patients were analyzed for both genes. In the MYO7A gene, ten missense variants were found; three patients were compound heterozygous, and two were homozygous. Thirty-four USH2A gene variants were detected, including eight missense variants, nine nonsense variants, six splicing variants, and 11 duplications/deletions; 19 patients were compound heterozygous, and three were homozygous. Four MYO7A and 17 USH2A variants have already been described in the literature. Among the novel mutations there are four USH2A large deletions, detected with multiplex ligation dependent probe amplification (MLPA) technology. Two potentially pathogenic variants were found in 27 patients (75%). Affected patients showed variable clinical pictures without a clear genotype-phenotype correlation. Ten variants in the MYO7A gene and 34 variants in the USH2A gene were detected in Italian patients with USH at a high detection rate. A selective analysis of these genes may be valuable for molecular analysis, combining diagnostic efficiency with little time wastage and less resource consumption.
Assessing the spectrum of germline variation in Fanconi anemia genes among patients with head and neck carcinoma before age 50.

PubMed

Chandrasekharappa, Settara C; Chinn, Steven B; Donovan, Frank X; Chowdhury, Naweed I; Kamat, Aparna; Adeyemo, Adebowale A; Thomas, James W; Vemulapalli, Meghana; Hussey, Caroline S; Reid, Holly H; Mullikin, James C; Wei, Qingyi; Sturgis, Erich M

2017-10-15

Patients with Fanconi anemia (FA) have an increased risk for head and neck squamous cell carcinoma (HNSCC). The authors sought to determine the prevalence of undiagnosed FA and FA carriers among patients with HNSCC as well as an age cutoff for FA genetic screening. Germline DNA samples from 417 patients with HNSCC aged <50 years were screened for sequence variants by targeted next-generation sequencing of the entire length of 16 FA genes. The sequence revealed 194 FA gene variants in 185 patients (44%). The variant spectrum comprised 183 nonsynonymous point mutations, 9 indels, 1 large deletion, and 1 synonymous variant that was predicted to affect splicing. One hundred eight patients (26%) had at least 1 rare variant that was predicted to be damaging, and 57 (14%) had at least 1 rare variant that was predicted to be damaging and had been previously reported. Fifteen patients carried 2 rare variants or an X-linked variant in an FA gene. Overall, an age cutoff for FA screening was not identified among young patients with HNSCC, because there were no significant differences in mutation rates when patients were stratified by age, tumor site, ethnicity, smoking status, or human papillomavirus status. However, an increased burden, or mutation load, of FA gene variants was observed in carriers of the genes FA complementation group D2 (FANCD2), FANCE, and FANCL in the HNSCC patient cohort relative to the 1000 Genomes population. FA germline functional variants offer a novel area of study in HNSCC tumorigenesis. FANCE and FANCL, which are components of the core complex, are known to be responsible for the recruitment and ubiquitination, respectively, of FANCD2, a critical step in the FA DNA repair pathway. In the current cohort, the increased mutation load of FANCD2, FANCE, and FANCL variants among younger patients with HNSCC indicates the importance of the FA pathway in HNSCC. Cancer 2017;123:3943-54. © 2017 American Cancer Society.
Denoising DNA deep sequencing data—high-throughput sequencing errors and their correction

PubMed Central

Laehnemann, David; Borkhardt, Arndt

2016-01-01

Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here. PMID:26026159
Engineered external guide sequences are highly effective in inhibiting gene expression and replication of hepatitis B virus in cultured cells.

PubMed

Zhang, Zhigang; Vu, Gia-Phong; Gong, Hao; Xia, Chuan; Chen, Yuan-Chuan; Liu, Fenyong; Wu, Jianguo; Lu, Sangwei

2013-01-01

External guide sequences (EGSs) are RNA molecules that consist of a sequence complementary to a target mRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, for specific degradation of the target mRNA. We have previously used an in vitro selection procedure to generate EGS variants that efficiently induce human RNase P to cleave a target mRNA in vitro. In this study, we constructed EGSs from a variant to target the overlapping region of the S mRNA, pre-S/L mRNA, and pregenomic RNA (pgRNA) of hepatitis B virus (HBV), which are essential for viral replication and infection. The EGS variant was about 50-fold more efficient in inducing human RNase P to cleave the mRNA in vitro than the EGS derived from a natural tRNA. Following Salmonella-mediated gene delivery, the EGSs were expressed in cultured HBV-carrying cells. A reduction of about 97% and 75% in the level of HBV RNAs and proteins and an inhibition of about 6,000- and 130-fold in the levels of capsid-associated HBV DNA were observed in cells treated with Salmonella vectors carrying the expression cassette for the variant and the tRNA-derived EGS, respectively. Our study provides direct evidence that the EGS variant is more effective in blocking HBV gene expression and DNA replication than the tRNA-derived EGS. Furthermore, these results demonstrate the feasibility of developing Salmonella-mediated gene delivery of highly active EGS RNA variants as a novel approach for gene-targeting applications such as anti-HBV therapy.
A novel pathogenic variant in an Iranian Ataxia telangiectasia family revealed by next-generation sequencing followed by in silico analysis.

PubMed

Tabatabaiefar, Mohammad Amin; Alipour, Paria; Pourahmadiyan, Azam; Fattahi, Najmeh; Shariati, Laleh; Golchin, Neda; Mohammadi-Asl, Javad

2017-08-15

Ataxia telangiectasia (A-T) is a neurodegenerative autosomal recessive disorder with the main characteristics of progressive cerebellar degeneration, sensitivity to ionizing radiation, immunodeficiency, telangiectasia, premature aging, recurrent sinopulmonary infections, and increased risk of malignancy, especially of lymphoid origin. The Ataxia Telangiectasia Mutated gene, ATM, the causative gene for the A-T disorder, encodes the ATM protein, which plays an important role in the activation of cell-cycle checkpoints and initiation of DNA repair in response to DNA damage. Targeted next-generation sequencing (NGS) was performed on an Iranian 5-year-old boy who presented with truncal and limb ataxia, telangiectasia of the eye, Hodgkin lymphoma, hyperpigmentation, total alopecia, hepatomegaly, and dysarthria. Sanger sequencing was used to confirm the candidate pathogenic variants. Computational docking was done using the HEX software to examine how this change affects the interactions of ATM with the upstream and downstream proteins. Three different variants were identified, comprising two homozygous SNPs and one novel homozygous frameshift variant (c.8046_8047delTA, p.Thr2682ThrfsX5), which creates a stop codon in exon 57, leaving the protein truncated at its C-terminal portion. Therefore, the activation and phosphorylation of target proteins are lost. Moreover, the HEX software confirmed that the mutated protein lost its interaction with upstream and downstream proteins. The variant was classified as pathogenic based on the American College of Medical Genetics and Genomics guideline. This study expands the spectrum of ATM pathogenic variants in Iran and demonstrates the utility of targeted NGS in genetic diagnostics. Copyright © 2017. Published by Elsevier B.V.
Isolation of centromeric-tandem repetitive DNA sequences by chromatin affinity purification using a HaloTag7-fused centromere-specific histone H3 in tobacco.

PubMed

Nagaki, Kiyotaka; Shibata, Fukashi; Kanatani, Asaka; Kashihara, Kazunari; Murata, Minoru

2012-04-01

The centromere is a multi-functional complex comprising centromeric DNA and a number of proteins. To isolate unidentified centromeric DNA sequences, centromere-specific histone H3 variants (CENH3) and chromatin immunoprecipitation (ChIP) have been utilized in some plant species. However, anti-CENH3 antibody for ChIP must be raised in each species because of its species specificity. Production of the antibodies is time-consuming and costly, and it is not easy to produce ChIP-grade antibodies. In this study, we applied a HaloTag7-based chromatin affinity purification system to isolate centromeric DNA sequences in tobacco. This system required no specific antibody, and made it possible to apply a highly stringent wash to remove contaminated DNA. As a result, we succeeded in isolating five tandem repetitive DNA sequences in addition to the centromeric retrotransposons that were previously identified by ChIP. Three of the tandem repeats were centromere-specific sequences located on different chromosomes. These results confirm the validity of the HaloTag7-based chromatin affinity purification system as an alternative method to ChIP for isolating unknown centromeric DNA sequences. The discovery of more than two chromosome-specific centromeric DNA sequences indicates the mosaic structure of tobacco centromeres. © Springer-Verlag 2011
Myopathic mtDNA Depletion Syndrome Due to Mutation in TK2 Gene.

PubMed

Martín-Hernández, Elena; García-Silva, María Teresa; Quijada-Fraile, Pilar; Rodríguez-García, María Elena; Rivera, Henry; Hernández-Laín, Aurelio; Coca-Robinot, David; Fernández-Toral, Joaquín; Arenas, Joaquín; Martín, Miguel A; Martínez-Azorín, Francisco

2017-01-01

Whole-exome sequencing was used to identify the disease gene(s) in a Spanish girl with failure to thrive, muscle weakness, mild facial weakness, elevated creatine kinase, deficiency of mitochondrial complex III and depletion of mtDNA. With whole-exome sequencing data, it was possible to obtain the whole mtDNA sequence and rule out any pathogenic variant in this genome. The analysis of the whole exome uncovered a homozygous pathogenic mutation in the thymidine kinase 2 gene (TK2; NM_004614.4:c.323C>T, p.T108M). TK2 mutations have been identified mainly in patients with the myopathic form of mtDNA depletion syndromes. This patient presents an atypical TK2-related myopathic form of mtDNA depletion syndromes because, despite having a very low content of mtDNA (<20%), she presents a slower and less severe evolution of the disease. In conclusion, our data confirm the role of the TK2 gene in mtDNA depletion syndromes and expand the phenotypic spectrum.
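The coding-level and protein-level descriptions of the TK2 variant above (c.323C>T and p.T108M) can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, numbering is assumed to start at the A of the start codon, the standard nuclear genetic code is assumed, and the reference codon it infers is a deduction, not something stated in the abstract.

```python
# Standard genetic code (assumed): the four threonine codons and the single methionine codon.
THR_CODONS = {"ACA", "ACC", "ACG", "ACT"}
MET_CODONS = {"ATG"}

def codon_of(c_pos):
    """Map a 1-based coding position to (codon number, position within codon)."""
    if c_pos < 1:
        raise ValueError("coding positions are 1-based")
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3 + 1

def thr_codons_giving_met(pos_in_codon, ref="C", alt="T"):
    """List Thr codons that become a Met codon when ref -> alt at the given codon position."""
    hits = []
    for codon in sorted(THR_CODONS):
        if codon[pos_in_codon - 1] != ref:
            continue
        mutated = codon[:pos_in_codon - 1] + alt + codon[pos_in_codon:]
        if mutated in MET_CODONS:
            hits.append(codon)
    return hits

if __name__ == "__main__":
    codon, offset = codon_of(323)
    print(codon, offset)                  # codon number and base position within it
    print(thr_codons_giving_met(offset))  # Thr codons compatible with a C>T change to Met
```

Position 323 falls on the second base of codon 108, and a C>T change at that base turns a threonine codon into methionine only if the codon is ACG, so the two notations are mutually consistent.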
Annotation of Sequence Variants in Cancer Samples: Processes and Pitfalls for Routine Assays in the Clinical Laboratory.

PubMed

Lee, Lobin A; Arvai, Kevin J; Jones, Dan

2015-07-01

As DNA sequencing of multigene panels becomes routine for cancer samples in the clinical laboratory, an efficient process for classifying variants has become more critical. Determining which germline variants are significant for cancer disposition and which somatic mutations are integral to cancer development or therapy response remains difficult, even for well-studied genes such as BRCA1 and TP53. We compare and contrast the general principles and lines of evidence commonly used to distinguish the significance of cancer-associated germline and somatic genetic variants. The factors important in each step of the analysis pipeline are reviewed, as are some of the publicly available annotation tools. Given the range of indications and uses of cancer sequencing assays, including diagnosis, staging, prognostication, theranostics, and residual disease detection, the need for flexible methods for scoring of variants is discussed. The usefulness of protein prediction tools and multimodal risk-based or Bayesian approaches are highlighted. Using TET2 variants encountered in hematologic neoplasms, several examples of this multifactorial approach to classifying sequence variants of unknown significance are presented. Although there are still significant gaps in the publicly available data for many cancer genes that limit the broad application of explicit algorithms for variant scoring, the elements of a more rigorous model are outlined. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Whole-Exome Sequencing of 10 Scientists: Evaluation of the Process and Outcomes.

PubMed

Lindor, Noralane M; Schahl, Kimberly A; Johnson, Kiley J; Hunt, Katherine S; Mensink, Kara A; Wieben, Eric D; Klee, Eric; Black, John L; Highsmith, W Edward; Thibodeau, Stephen N; Ferber, Matthew J; Aypar, Umut; Ji, Yuan; Graham, Rondell P; Fiksdal, Alexander S; Sarangi, Vivek; Ormond, Kelly E; Riegert-Johnson, Douglas L; McAllister, Tammy M; Farrugia, Gianrico; McCormick, Jennifer B

2015-10-01

To understand motivations, educational needs, and concerns of individuals contemplating whole-exome sequencing (WES) and determine what amount of genetic information might be obtained by sequencing a generally healthy cohort so as to more effectively counsel future patients. From 2012 to 2014, 40 medically educated, generally healthy scientists at Mayo Clinic were invited to have WES conducted on a research basis; 26 agreed to be in a drawing from which 10 participants were selected. The study involved pre- and posttest genetic counseling and completion of 4 surveys related to the experience and outcomes. Whole-exome sequencing was conducted on DNA from blood from each person. Most variants (76,305 per person; range, 74,505-77,387) were known benign allelic variants, variants in genes of unknown function, or variants of uncertain significance in genes of known function. The results of suspected pathogenic/pathogenic variants in Mendelian disorders and pharmacogenomic variants were disclosed. The mean number of suspected pathogenic/pathogenic variants was 2.2 per person (range, 1-4). Four pharmacogenomic genes were included for reporting; variants were found in 9 of 10 participants. This study provides data that may be useful in establishing reality-based patient expectations, outlines specific points to cover during counseling, and increases confidence in the feasibility of providing adequate preparation and counseling for WES in generally healthy individuals. Copyright © 2015 Mayo Foundation for Medical Education and Research. Published by Elsevier Inc. All rights reserved.
i-rDNA: alignment-free algorithm for rapid in silico detection of ribosomal gene fragments from metagenomic sequence data sets.

PubMed

Mohammed, Monzoorul Haque; Ghosh, Tarini Shankar; Chadaram, Sudha; Mande, Sharmila S

2011-11-30

Obtaining accurate estimates of microbial diversity using rDNA profiling is the first step in most metagenomics projects. Consequently, most metagenomic projects spend considerable amounts of time, money and manpower for experimentally cloning, amplifying and sequencing the rDNA content in a metagenomic sample. In the second step, the entire genomic content of the metagenome is extracted, sequenced and analyzed. Since DNA sequences obtained in this second step also contain rDNA fragments, rapid in silico identification of these rDNA fragments would drastically reduce the cost, time and effort of current metagenomic projects by entirely bypassing the experimental steps of primer based rDNA amplification, cloning and sequencing. In this study, we present an algorithm called i-rDNA that can facilitate the rapid detection of 16S rDNA fragments from amongst millions of sequences in metagenomic data sets with high detection sensitivity. Performance evaluation with data sets/database variants simulating typical metagenomic scenarios indicates the significantly high detection sensitivity of i-rDNA. Moreover, i-rDNA can process a million sequences in less than an hour on a simple desktop with modest hardware specifications. In addition to the speed of execution, high sensitivity and low false positive rate, the utility of the algorithmic approach discussed in this paper is immense given that it would help in bypassing the entire experimental step of primer-based rDNA amplification, cloning and sequencing. Application of this algorithmic approach would thus drastically reduce the cost, time and human efforts invested in all metagenomic projects. A web-server for the i-rDNA algorithm is available at http://metagenomics.atc.tcs.com/i-rDNA/
Novel rare variations of the oxytocin receptor (OXTR) gene in autism spectrum disorder individuals.

PubMed

Liu, Xiaoxi; Kawashima, Minae; Miyagawa, Taku; Otowa, Takeshi; Latt, Khun Zaw; Thiri, Myo; Nishida, Hisami; Sugiyama, Toshiro; Tsurusaki, Yoshinori; Matsumoto, Naomichi; Mabuchi, Akihiko; Tokunaga, Katsushi; Sasaki, Tsukasa

2015-01-01

The oxytocin receptor (OXTR) gene has been implicated as a risk gene for autism spectrum disorder (ASD)-a neurodevelopmental disorder with essential features of impairments in social communication and reciprocal interaction. The genetic associations between common variations in OXTR and ASD have been reported in multiple ethnic populations. However, little is known about the distribution of rare variations within OXTR in ASD patients. In this study, we resequenced the full length of OXTR in 105 ASD individuals using an approach that combined the power of next-generation sequencing technology, long-range PCR and DNA pooling. We demonstrated that rare variants with minor allele frequency as low as 0.05% could be reliably detected by our method. We identified 28 novel variants including potential functional variants in the intron region and one rare missense variant (R150S). We subsequently performed Sanger sequencing and validated five novel variants located in previously suggested candidate regions in ASD individuals. Further sequencing of 312 healthy subjects showed that the burden of rare variants is significantly higher in ASDs compared with healthy individuals. Our results support that the rare variation in OXTR gene might be involved in ASD.
A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids

USDA-ARS's Scientific Manuscript database

Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...
Molecular characterization and phylogenetic analysis of Citrus viroid VI variants from citrus in China

USDA-ARS's Scientific Manuscript database

Citrus viroid VI (CVd-VI) was originally found from citrus and persimmon in Japan. We report here the identification and molecular characterization of CVd-VI from four production regions of China. A total of 90 cDNA clones from nine infected citrus cultivars were sequenced. The sequence homologies o...
Using an online genome resource to identify myostatin variation in U.S. sheep

USDA-ARS's Scientific Manuscript database

We created a public, searchable DNA sequence resource for sheep that contained approximately 14x whole genome sequence of 96 rams. The animals represent 10 popular U.S. breeds and share minimal pedigree relationships, making the resource suitable for viewing gene variants in the user-friendly Integ...
Exploring DNA variant segregation types in pooled genome sequencing enables effective mapping of weeping trait in Malus

USDA-ARS's Scientific Manuscript database

In recent years, next generation sequencing (NGS) based bulked segregant analysis (BSA) has become a powerful approach for allele discovery in non-model plant species. However, challenges remain, particularly for out-crossing species with complex genomes. Here, the genetic control of a weeping bran...
The 5S rDNA family evolves through concerted and birth-and-death evolution in fish genomes: an example from freshwater stingrays

PubMed Central

2011-01-01

Background Ribosomal 5S genes are well known for the critical role they play in ribosome folding and functionality. These genes are thought to evolve in a concerted fashion, with high rates of homogenization of gene copies. However, the majority of previous analyses regarding the evolutionary process of rDNA repeats were conducted in invertebrates and plants. Studies have also been conducted on vertebrates, but these analyses were usually restricted to the 18S, 5.8S and 28S rRNA genes. The recent identification of divergent 5S rRNA gene paralogs in the genomes of elasmobranches and teleost fishes indicates that the eukaryotic 5S rRNA gene family has a more complex genomic organization than previously thought. The availability of new sequence data from lower vertebrates such as teleosts and elasmobranches enables an enhanced evolutionary characterization of 5S rDNA among vertebrates. Results We identified two variant classes of 5S rDNA sequences in the genomes of Potamotrygonidae stingrays, similar to the genomes of other vertebrates. One class of 5S rRNA genes was shared only by elasmobranches. A broad comparative survey among 100 vertebrate species suggests that the 5S rRNA gene variants in fishes originated from rounds of genome duplication. These variants were then maintained or eliminated by birth-and-death mechanisms, under intense purifying selection. Clustered multiple copies of 5S rDNA variants could have arisen due to unequal crossing over mechanisms. Simultaneously, the distinct genome clusters were independently homogenized, resulting in the maintenance of clusters of highly similar repeats through concerted evolution. Conclusions We believe that 5S rDNA molecular evolution in fish genomes is driven by a mixed mechanism that integrates birth-and-death and concerted evolution. PMID:21627815

Mutational load of the mitochondrial genome predicts pathological features and biochemical recurrence in prostate cancer.

PubMed

Kalsbeek, Anton M F; Chan, Eva F K; Grogan, Judith; Petersen, Desiree C; Jaratlerdsiri, Weerachai; Gupta, Ruta; Lyons, Ruth J; Haynes, Anne-Maree; Horvath, Lisa G; Kench, James G; Stricker, Phillip D; Hayes, Vanessa M

2016-10-05

Prostate cancer management is complicated by extreme disease heterogeneity, which is further limited by availability of prognostic biomarkers. Recognition of prostate cancer as a genetic disease has prompted a focus on the nuclear genome for biomarker discovery, with little attention given to the mitochondrial genome. While it is evident that mitochondrial DNA (mtDNA) mutations are acquired during prostate tumorigenesis, no study has evaluated the prognostic value of mtDNA variation. Here we used next-generation sequencing to interrogate the mitochondrial genomes from prostate tissue biopsies and matched blood of 115 men having undergone a radical prostatectomy for which there was a mean of 107 months clinical follow-up. We identified 74 unique prostate cancer specific somatic mtDNA variants in 50 patients, providing significant expansion to the growing catalog of prostate cancer mtDNA mutations. While no single variant or variant cluster showed recurrence across multiple patients, we observe a significant positive correlation between the total burden of acquired mtDNA variation and elevated Gleason Score at diagnosis and biochemical relapse. We add to accumulating evidence that total acquired genomic burden, rather than specific mtDNA mutations, has diagnostic value. This is the first study to demonstrate the prognostic potential of mtDNA mutational burden in prostate cancer.
Next generation sequencing analysis reveals a relationship between rDNA unit diversity and locus number in Nicotiana diploids

PubMed Central

2012-01-01

Background Tandemly arranged nuclear ribosomal DNA (rDNA), encoding 18S, 5.8S and 26S ribosomal RNA (rRNA), exhibit concerted evolution, a pattern thought to result from the homogenisation of rDNA arrays. However rDNA homogeneity at the single nucleotide polymorphism (SNP) level has not been detailed in organisms with more than a few hundred copies of the rDNA unit. Here we study rDNA complexity in species with arrays consisting of thousands of units. Methods We examined homogeneity of genic (18S) and non-coding internally transcribed spacer (ITS1) regions of rDNA using Roche 454 and/or Illumina platforms in four angiosperm species, Nicotiana sylvestris, N. tomentosiformis, N. otophora and N. kawakamii. We compared the data with Southern blot hybridisation revealing the structure of intergenic spacer (IGS) sequences and with the number and distribution of rDNA loci. Results and Conclusions In all four species the intragenomic homogeneity of the 18S gene was high; a single ribotype makes up over 90% of the genes. However greater variation was observed in the ITS1 region, particularly in species with two or more rDNA loci, where >55% of rDNA units were a single ribotype, with the second most abundant variant accounting for >18% of units. IGS heterogeneity was high in all species. The increased number of ribotypes in ITS1 compared with 18S sequences may reflect rounds of incomplete homogenisation with strong selection for functional genic regions and relaxed selection on ITS1 variants. The relationship between the number of ITS1 ribotypes and the number of rDNA loci leads us to propose that rDNA evolution and complexity are influenced by locus number and/or amplification of orphaned rDNA units at new chromosomal locations. PMID:23259460
A multiple-alignment based primer design algorithm for genetically highly variable DNA targets

PubMed Central

2013-01-01

Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
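Two of the constraints named in this abstract, degenerate sites and melting-temperature matching, can be illustrated with a minimal Python sketch. It is not the PrimerDesign program and makes no claim about how that program works: the function names are hypothetical, the input is assumed to be gap-free, upper-case, equal-length aligned sequences, and the melting temperature uses the simple Wallace rule, which is only a rough estimate for short, non-degenerate oligonucleotides.

```python
from math import prod

# IUPAC nucleotide codes, keyed by the alphabetically sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}


def degenerate_consensus(aligned):
    """Collapse equal-length, gap-free aligned sequences into one IUPAC string."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    return "".join(IUPAC["".join(sorted(set(col)))] for col in zip(*aligned))


def degeneracy(primer):
    """Number of distinct sequences encoded by an IUPAC primer."""
    sizes = {code: len(bases) for bases, code in IUPAC.items()}
    return prod(sizes[b] for b in primer)


def wallace_tm(primer):
    """Wallace rule (assumption): 2 degrees C per A/T plus 4 per G/C."""
    at = sum(primer.count(b) for b in "AT")
    gc = sum(primer.count(b) for b in "GC")
    return 2 * at + 4 * gc
```

For the hypothetical aligned variants `ACGT`, `ACAT` and `ATGT`, `degenerate_consensus` yields `AYRT`, which encodes four distinct sequences. Higher degeneracy covers more of the population but dilutes each individual primer, which is one reason such constraints interact.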
Comparing sequencing assays and human-machine analyses in actionable genomics for glioblastoma

PubMed Central

Wrzeszczynski, Kazimierz O.; Frank, Mayu O.; Koyama, Takahiko; Rhrissorrakrai, Kahn; Robine, Nicolas; Utro, Filippo; Emde, Anne-Katrin; Chen, Bo-Juen; Arora, Kanika; Shah, Minita; Vacic, Vladimir; Norel, Raquel; Bilal, Erhan; Bergmann, Ewa A.; Moore Vogel, Julia L.; Bruce, Jeffrey N.; Lassman, Andrew B.; Canoll, Peter; Grommes, Christian; Harvey, Steve; Parida, Laxmi; Michelini, Vanessa V.; Zody, Michael C.; Jobanputra, Vaidehi; Royyuru, Ajay K.

2017-01-01

Objective: To analyze a glioblastoma tumor specimen with 3 different platforms and compare potentially actionable calls from each. Methods: Tumor DNA was analyzed by a commercial targeted panel. In addition, tumor-normal DNA was analyzed by whole-genome sequencing (WGS) and tumor RNA was analyzed by RNA sequencing (RNA-seq). The WGS and RNA-seq data were analyzed by a team of bioinformaticians and cancer oncologists, and separately by IBM Watson Genomic Analytics (WGA), an automated system for prioritizing somatic variants and identifying drugs. Results: More variants were identified by WGS/RNA analysis than by targeted panels. WGA completed a comparable analysis in a fraction of the time required by the human analysts. Conclusions: The development of an effective human-machine interface in the analysis of deep cancer genomic datasets may provide potentially clinically actionable calls for individual patients in a more timely and efficient manner than currently possible. ClinicalTrials.gov identifier: NCT02725684. PMID:28740869
A gene variation of 14-3-3 zeta isoform in rat hippocampus.

PubMed

Murakami, K; Situ, S Y; Eshete, F

1996-11-14

A variant form of 14-3-3 zeta was isolated from the rat hippocampal cDNA library. The cloned cDNA is 1687 bp in length and it contains an entire ORF (nt = 63-797) with 245 amino acids that is characteristic of the 14-3-3 zeta subtype. By comparing with reported sequences of 14-3-3 zeta, we found three nucleotide substitutions within the coding sequence in our clone; a C<-->T transition at nt = 325 and G<-->C transversions at nt = 387 and 388. Both codon changes are missense mutations, converting ACG (Thr) to ATG (Met) and CGT (Arg) to GCT (Ala) at residues 88 and 109, respectively. Our results show that at least three different genetic variants of 14-3-3 zeta are present in rat species, which results in protein variations. Such mutation in the amino acid sequence is an important indication of the diverse functions of this protein and may also contribute to the recent contradictory observations regarding the role of the 14-3-3 zeta subtype.
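The coordinates in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes 1-based nucleotide numbering with the ORF starting at nt 63, as stated, and uses only the four codons named in the abstract; the function names are illustrative.

```python
# Standard genetic code entries for the four codons named in the abstract.
CODON_TO_AA = {"ACG": "Thr", "ATG": "Met", "CGT": "Arg", "GCT": "Ala"}

ORF_START = 63   # first nucleotide of the ORF (1-based), from the abstract
ORF_END = 797    # last nucleotide of the ORF, from the abstract


def codon_of(nt):
    """Return (residue number, position within codon), both 1-based."""
    offset = nt - ORF_START
    return offset // 3 + 1, offset % 3 + 1


def substitute(codon, changes):
    """Apply {position_in_codon: new_base} changes to a codon."""
    bases = list(codon)
    for pos, base in changes.items():
        bases[pos - 1] = base
    return "".join(bases)


# nt 325 is the second base of codon 88; nt 387 and 388 are the first
# and second bases of codon 109.
thr_to_met = substitute("ACG", {codon_of(325)[1]: "T"})
arg_to_ala = substitute("CGT", {codon_of(387)[1]: "G", codon_of(388)[1]: "C"})
```

Under these assumptions the ORF spans 735 nucleotides, or 245 codons, and the three substitutions fall in codons 88 and 109, matching the reported Thr-to-Met and Arg-to-Ala changes.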
Response to DNA damage of CHEK2 missense mutations in familial breast cancer

PubMed Central

Roeb, Wendy; Higgins, Jake; King, Mary-Claire

2012-01-01

Comprehensive sequencing of tumor suppressor genes to evaluate inherited predisposition to cancer yields many individually rare missense alleles of unknown functional and clinical consequence. To address this problem for CHEK2 missense alleles, we developed a yeast-based assay to assess in vivo CHEK2-mediated response to DNA damage. Of 25 germline CHEK2 missense alleles detected in familial breast cancer patients, 12 alleles had complete loss of DNA damage response, 8 had partial loss and 5 exhibited a DNA damage response equivalent to that mediated by wild-type CHEK2. Variants exhibiting reduced response to DNA damage were found in all domains of the CHEK2 protein. Assay results were in agreement with epidemiologic assessments of breast cancer risk for those variants sufficiently common for case–control studies to have been undertaken. Assay results were largely concordant with consensus predictions of in silico tools, particularly for damaging alleles in the kinase domain. However, of the 25 variants, 6 were not consistently classifiable by in silico tools. An in vivo assay of cellular response to DNA damage by mutant CHEK2 alleles may complement and extend epidemiologic and genetic assessment of their clinical consequences. PMID:22419737
Response to DNA damage of CHEK2 missense mutations in familial breast cancer.

PubMed

Roeb, Wendy; Higgins, Jake; King, Mary-Claire

2012-06-15

Comprehensive sequencing of tumor suppressor genes to evaluate inherited predisposition to cancer yields many individually rare missense alleles of unknown functional and clinical consequence. To address this problem for CHEK2 missense alleles, we developed a yeast-based assay to assess in vivo CHEK2-mediated response to DNA damage. Of 25 germline CHEK2 missense alleles detected in familial breast cancer patients, 12 alleles had complete loss of DNA damage response, 8 had partial loss and 5 exhibited a DNA damage response equivalent to that mediated by wild-type CHEK2. Variants exhibiting reduced response to DNA damage were found in all domains of the CHEK2 protein. Assay results were in agreement with epidemiologic assessments of breast cancer risk for those variants sufficiently common for case-control studies to have been undertaken. Assay results were largely concordant with consensus predictions of in silico tools, particularly for damaging alleles in the kinase domain. However, of the 25 variants, 6 were not consistently classifiable by in silico tools. An in vivo assay of cellular response to DNA damage by mutant CHEK2 alleles may complement and extend epidemiologic and genetic assessment of their clinical consequences.
A novel homozygous missense variant in NECTIN4 (PVRL4) causing ectodermal dysplasia cutaneous syndactyly syndrome.

PubMed

Ahmad, Farooq; Nasir, Abdul; Thiele, Holger; Umair, Muhammad; Borck, Guntram; Ahmad, Wasim

2018-02-12

Ectodermal dysplasia syndactyly syndrome 1 (EDSS1) is a rare form of ectodermal dysplasia including anomalies of hair, nails, and teeth along with bilateral cutaneous syndactyly of hands and feet. In the present report, we performed a clinical and genetic characterization of a consanguineous Pakistani family with four individuals affected by EDSS1. We performed exome sequencing using DNA of one affected individual. Exome data analysis identified a novel homozygous missense variant (c.242T>C; p.(Leu81Pro)) in NECTIN4 (PVRL4). Sanger sequencing validated this variant and confirmed its cosegregation with the disease phenotype in the family members. Thus, our report adds a novel variant to the NECTIN4 mutation spectrum and contributes to the NECTIN4-related clinical characterization. © 2018 John Wiley & Sons Ltd/University College London.
Molecular Cytogenetics Guides Massively Parallel Sequencing of a Radiation-Induced Chromosome Translocation in Human Cells.

PubMed

Cornforth, Michael N; Anur, Pavana; Wang, Nicholas; Robinson, Erin; Ray, F Andrew; Bedford, Joel S; Loucas, Bradford D; Williams, Eli S; Peto, Myron; Spellman, Paul; Kollipara, Rahul; Kittler, Ralf; Gray, Joe W; Bailey, Susan M

2018-05-11

Chromosome rearrangements are large-scale structural variants that are recognized drivers of oncogenic events in cancers of all types. Cytogenetics allows for their rapid, genome-wide detection, but does not provide gene-level resolution. Massively parallel sequencing (MPS) promises DNA sequence-level characterization of the specific breakpoints involved, but is strongly influenced by bioinformatics filters that affect detection efficiency. We sought to characterize the breakpoint junctions of chromosomal translocations and inversions in the clonal derivatives of human cells exposed to ionizing radiation. Here, we describe the first successful use of DNA paired-end analysis to locate and sequence across the breakpoint junctions of a radiation-induced reciprocal translocation. The analyses employed, with varying degrees of success, several well-known bioinformatics algorithms, a task made difficult by the involvement of repetitive DNA sequences. As for underlying mechanisms, the results of Sanger sequencing suggested that the translocation in question was likely formed via microhomology-mediated non-homologous end joining (mmNHEJ). To our knowledge, this represents the first use of MPS to characterize the breakpoint junctions of a radiation-induced chromosomal translocation in human cells. Curiously, these same approaches were unsuccessful when applied to the analysis of inversions previously identified by directional genomic hybridization (dGH). We conclude that molecular cytogenetics continues to provide critical guidance for structural variant discovery, validation and in "tuning" analysis filters to enable robust breakpoint identification at the base pair level.
Protein Science by DNA Sequencing: How Advances in Molecular Biology Are Accelerating Biochemistry.

PubMed

Higgins, Sean A; Savage, David F

2018-01-09

A fundamental goal of protein biochemistry is to determine the sequence-function relationship, but the vastness of sequence space makes comprehensive evaluation of this landscape difficult. However, advances in DNA synthesis and sequencing now allow researchers to assess the functional impact of every single mutation in many proteins, but challenges remain in library construction and the development of general assays applicable to a diverse range of protein functions. This Perspective briefly outlines the technical innovations in DNA manipulation that allow massively parallel protein biochemistry and then summarizes the methods currently available for library construction and the functional assays of protein variants. Areas in need of future innovation are highlighted with a particular focus on assay development and the use of computational analysis with machine learning to effectively traverse the sequence-function landscape. Finally, applications in the fundamentals of protein biochemistry, disease prediction, and protein engineering are presented.
Are special read alignment strategies necessary and cost-effective when handling sequencing reads from patient-derived tumor xenografts?

PubMed

Tso, Kai-Yuen; Lee, Sau Dan; Lo, Kwok-Wai; Yip, Kevin Y

2014-12-23

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subject to DNA sequencing, the samples could contain various amounts of mouse DNA. It has been unclear how the mouse reads would affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data. We found the "filtering" and "combined reference" strategies performed better than aligning reads directly to human reference in terms of alignment and variant calling accuracies. The combined reference strategy was particularly good at reducing false negative variants calls without significantly increasing the false positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but it was found crucial when false non-synonymous SNVs should be minimized, especially in exome sequencing. Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.
Multiplex picoliter-droplet digital PCR for quantitative assessment of DNA integrity in clinical samples.

PubMed

Didelot, Audrey; Kotsopoulos, Steve K; Lupo, Audrey; Pekin, Deniz; Li, Xinyu; Atochin, Ivan; Srinivasan, Preethi; Zhong, Qun; Olson, Jeff; Link, Darren R; Laurent-Puig, Pierre; Blons, Hélène; Hutchison, J Brian; Taly, Valerie

2013-05-01

Assessment of DNA integrity and quantity remains a bottleneck for high-throughput molecular genotyping technologies, including next-generation sequencing. In particular, DNA extracted from paraffin-embedded tissues, a major potential source of tumor DNA, varies widely in quality, leading to unpredictable sequencing data. We describe a picoliter droplet-based digital PCR method that enables simultaneous detection of DNA integrity and the quantity of amplifiable DNA. Using a multiplex assay, we detected 4 different target lengths (78, 159, 197, and 550 bp). Assays were validated with human genomic DNA fragmented to sizes of 170 bp to 3000 bp. The technique was validated with DNA quantities as low as 1 ng. We evaluated 12 DNA samples extracted from paraffin-embedded lung adenocarcinoma tissues. One sample contained no amplifiable DNA. The fractions of amplifiable DNA for the 11 other samples were between 0.05% and 10.1% for 78-bp fragments and ≤1% for longer fragments. Four samples were chosen for enrichment and next-generation sequencing. The quality of the sequencing data was in agreement with the results of the DNA-integrity test. Specifically, DNA with low integrity yielded sequencing results with lower levels of coverage and uniformity and had higher levels of false-positive variants. The development of DNA-quality assays will enable researchers to downselect samples or process more DNA to achieve reliable genome sequencing with the highest possible efficiency of cost and effort, as well as minimize the waste of precious samples. © 2013 American Association for Clinical Chemistry.
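The abstract does not give the quantification formula, but digital PCR is conventionally read out with a Poisson correction, because a positive droplet may hold more than one template. The Python sketch below shows that standard correction and a derived amplifiable fraction. It assumes the whole input is partitioned into the droplets counted, and the function names and arguments are illustrative rather than taken from the paper.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean template copies per droplet.

    With copies distributed randomly, the fraction of empty droplets is
    exp(-lam), so lam = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total; saturated runs are not quantifiable")
    return -math.log(1.0 - positive / total)


def amplifiable_fraction(positive, total, input_copies):
    """Measured amplifiable copies divided by copies expected from the input.

    input_copies is an assumption supplied by the user, for example
    estimated from the DNA mass loaded into the reaction.
    """
    measured = copies_per_droplet(positive, total) * total
    return measured / input_copies
```

Running the same calculation once per amplicon length (78, 159, 197 and 550 bp in the assay described) would give a per-length fraction, and a steep drop with length indicates fragmented template.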
RefCNV: Identification of Gene-Based Copy Number Variants Using Whole Exome Sequencing.

PubMed

Chang, Lun-Ching; Das, Biswajit; Lih, Chih-Jian; Si, Han; Camalier, Corinne E; McGregor, Paul M; Polley, Eric

2016-01-01

With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that mapped to a genomic region correlated with the DNA copy number variants (CNVs). We propose a method RefCNV that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.96-0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman's coefficient = 0.82) between RefCNV estimation and publicly available CNV data in Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
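The idea of comparing each exon's observed coverage with the coverage expected from a reference set can be sketched as follows. This is not the RefCNV statistic, which the abstract does not specify: the z-score, the threshold of 3 and the function name are assumptions chosen for illustration, and coverage values are assumed to be already normalized for sequencing depth.

```python
import statistics


def call_exons(sample, references, z_cut=3.0):
    """Label each exon as gain, loss, neutral or undetermined.

    sample:     normalized coverage per exon for the test sample
    references: one list per reference sample, in the same exon order
    z_cut:      hypothetical threshold; a real method would tune it to
                control the false-positive rate
    """
    calls = []
    for i, observed in enumerate(sample):
        ref = [r[i] for r in references]
        mean = statistics.mean(ref)
        sd = statistics.stdev(ref)
        if sd == 0:
            # No spread in the reference set, so no z-score can be formed.
            calls.append("undetermined")
            continue
        z = (observed - mean) / sd
        if z > z_cut:
            calls.append("gain")
        elif z < -z_cut:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls
```

A per-exon reference distribution absorbs capture-efficiency differences between exons, which is why coverage is compared exon by exon rather than against a single genome-wide mean.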
Characterization and mapping of the human rhodopsin kinase gene and screening of the gene for mutations in patients with retinitis pigmentosa

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khani, S.C.; Lin, D.; Magovcevic, I.

1994-09-01

Rhodopsin kinase (RK) is a cytosolic enzyme in rod photoreceptors that initiates the deactivation of the phototransduction cascade by phosphorylating photoactivated rhodopsin. Although the cDNA sequence of bovine RK has been determined previously, no human cDNA or genomic sequence has thus far been available for genetic studies. In order to investigate the possible role of this candidate gene in retinitis pigmentosa (RP) and allied diseases, we have isolated and characterized human cDNA and genomic clones derived from the RK locus. The coding sequence of the human gene is 1692 nucleotides in length and is split into seven exons. The human and the bovine sequence show 84% identity at the nucleotide level and 92% identity at the amino acid level. Thus far, the intronic sequences flanking each exon except for one have been determined. We have also mapped the human RK gene to chromosome 13q34 using fluorescence in situ hybridization. To our knowledge, no RP gene has as yet been linked to this region. However, since the substrate for RK (rhodopsin) and other members of the phototransduction cascade have been implicated in the pathogenesis of RP, it is conceivable that defects in RK can also cause some forms of this disease. We are evaluating this possibility by screening DNA from 173 patients with autosomal recessive RP and 190 patients with autosomal dominant RP. So far, we have found 11 patients with variant bands. In one patient with autosomal dominant RP we discovered the missense change Ser536Leu. Cosegregation studies and further sequencing of the variant bands are currently underway.
Integrated digital error suppression for improved detection of circulating tumor DNA

PubMed Central

Kurtz, David M.; Chabon, Jacob J.; Scherer, Florian; Stehr, Henning; Liu, Chih Long; Bratman, Scott V.; Say, Carmen; Zhou, Li; Carter, Justin N.; West, Robert B.; Sledge, George W.; Shrager, Joseph B.; Loo, Billy W.; Neal, Joel W.; Wakelee, Heather A.; Diehn, Maximilian; Alizadeh, Ash A.

2016-01-01

High-throughput sequencing of circulating tumor DNA (ctDNA) promises to facilitate personalized cancer therapy. However, low quantities of cell-free DNA (cfDNA) in the blood and sequencing artifacts currently limit analytical sensitivity. To overcome these limitations, we introduce an approach for integrated digital error suppression (iDES). Our method combines in silico elimination of highly stereotypical background artifacts with a molecular barcoding strategy for the efficient recovery of cfDNA molecules. Individually, these two methods each improve the sensitivity of cancer personalized profiling by deep sequencing (CAPP-Seq) by ~3 fold, and synergize when combined to yield ~15-fold improvements. As a result, iDES-enhanced CAPP-Seq facilitates noninvasive variant detection across hundreds of kilobases. Applied to clinical non-small cell lung cancer (NSCLC) samples, our method enabled biopsy-free profiling of EGFR kinase domain mutations with 92% sensitivity and 96% specificity and detection of ctDNA down to 4 in 10^5 cfDNA molecules. We anticipate that iDES will aid the noninvasive genotyping and detection of ctDNA in research and clinical settings. PMID:27018799
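Why low cfDNA input limits sensitivity can be seen from a simple sampling argument, sketched here in Python. It is a back-of-the-envelope model rather than the iDES method: it assumes molecules are sampled independently, ignores sequencing error, and the function names are illustrative.

```python
import math


def detection_probability(fraction, molecules):
    """Chance that at least one mutant molecule is among those sampled."""
    return 1.0 - (1.0 - fraction) ** molecules


def molecules_needed(fraction, target):
    """Smallest number of sampled molecules reaching the target probability."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - fraction))


# At a mutant fraction of 4 in 10^5, the detection limit quoted in the abstract:
n_95 = molecules_needed(4e-5, 0.95)
```

Under these assumptions, tens of thousands of distinct cfDNA molecules must be recovered and read before a single mutant copy at that fraction is likely to be present at all, which is consistent with the abstract's emphasis on efficient molecule recovery alongside error suppression.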
Rare mtDNA variants in Leber hereditary optic neuropathy families with recurrence of myoclonus.

PubMed

La Morgia, C; Achilli, A; Iommarini, L; Barboni, P; Pala, M; Olivieri, A; Zanna, C; Vidoni, S; Tonon, C; Lodi, R; Vetrugno, R; Mostacci, B; Liguori, R; Carroccia, R; Montagna, P; Rugolo, M; Torroni, A; Carelli, V

2008-03-04

To investigate the mechanisms underlying myoclonus in Leber hereditary optic neuropathy (LHON). Five patients and one unaffected carrier from two Italian families bearing the homoplasmic 11778/ND4 and 3460/ND1 mutations underwent a uniform investigation including neurophysiologic studies, muscle biopsy, serum lactic acid after exercise, and muscle ((31)P) and cerebral ((1)H) magnetic resonance spectroscopy (MRS). Biochemical investigations on fibroblasts and complete mitochondrial DNA (mtDNA) sequences of both families were also performed. All six individuals had myoclonus. In spite of a normal EEG background and the absence of giant SEPs and C reflex, EEG-EMG back-averaging showed a preceding jerk-locked EEG potential, consistent with a cortical generator of the myoclonus. Specific comorbidities in the 11778/ND4 family included muscular cramps and psychiatric disorders, whereas features common to both families were migraine and cardiologic abnormalities. Signs of mitochondrial proliferation were seen in muscle biopsies and lactic acid elevation was observed in four of six patients. (31)P-MRS was abnormal in five of six patients and (1)H-MRS showed ventricular accumulation of lactic acid in three of six patients. Fibroblast ATP depletion was evident at 48 hours incubation with galactose in LHON/myoclonus patients. Sequence analysis revealed haplogroup T2 (11778/ND4 family) and U4a (3460/ND1 family) mtDNAs. A functional role for the non-synonymous 4136A>G/ND1, 9139G>A/ATPase6, and 15773G>A/cyt b variants was supported by amino acid conservation analysis. Myoclonus and other comorbidities characterized our Leber hereditary optic neuropathy (LHON) families. Functional investigations disclosed a bioenergetic impairment in all individuals. Our sequence analysis suggests that the LHON plus phenotype in our cases may relate to the synergic role of mtDNA variants.
The dynamic DNA methylation landscape of the mutL homolog 1 shore is altered by MLH1-93G>A polymorphism in normal tissues and colorectal cancer.

PubMed

Savio, Andrea J; Mrkonjic, Miralem; Lemire, Mathieu; Gallinger, Steven; Knight, Julia A; Bapat, Bharat

2017-01-01

Colorectal cancers (CRCs) undergo distinct genetic and epigenetic alterations. Expression of mutL homolog 1 (MLH1), a mismatch repair gene that corrects DNA replication errors, is lost in up to 15% of sporadic tumours due to mutation or, more commonly, due to DNA methylation of its promoter CpG island. A single nucleotide polymorphism (SNP) in the CpG island of MLH1 (MLH1-93G>A or rs1800734) is associated with CpG island hypermethylation and decreased MLH1 expression in CRC tumours. Further, in peripheral blood mononuclear cell (PBMC) DNA of both CRC cases and non-cancer controls, the variant allele of rs1800734 is associated with hypomethylation at the MLH1 shore, a region upstream of its CpG island that is less dense in CpG sites. To determine whether this genotype-epigenotype association is present in other tissue types, including colorectal tumours, we assessed DNA methylation in matched normal colorectal tissue, tumour, and PBMC DNA from 349 population-based CRC cases recruited from the Ontario Familial Colorectal Cancer Registry. Using the semi-quantitative real-time PCR-based MethyLight assay, MLH1 shore methylation was significantly higher in tumour tissue than normal colon or PBMCs (P < 0.01). When shore methylation levels were stratified by SNP genotype, normal colorectal DNA and PBMC DNA were significantly hypomethylated in association with variant SNP genotype (P < 0.05). However, this association was lost in tumour DNA. Among distinct stages of CRC, metastatic stage IV CRC tumours incurred significant hypomethylation compared to stage I-III cases, irrespective of genotype status. Shore methylation of MLH1 was not associated with MSI status or promoter CpG island hypermethylation, regardless of genotype. To confirm these results, bisulfite sequencing was performed in matched tumour and normal colorectal specimens from six CRC cases, including two cases per genotype (wildtype, heterozygous, and homozygous variant).
Bisulfite sequencing results corroborated the methylation patterns found by MethyLight, with significant hypomethylation in normal colorectal tissue of variant SNP allele carriers. These results indicate that the normal tissue types tested (colorectum and PBMC) experience dynamic genotype-associated epigenetic alterations at the MLH1 shore, whereas tumour DNA incurs aberrant hypermethylation compared to normal DNA.
Heterozygous SSBP1 start loss mutation co-segregates with hearing loss and the m.1555A>G mtDNA variant in a large multigenerational family.

PubMed

Kullar, Peter J; Gomez-Duran, Aurora; Gammage, Payam A; Garone, Caterina; Minczuk, Michal; Golder, Zoe; Wilson, Janet; Montoya, Julio; Häkli, Sanna; Kärppä, Mikko; Horvath, Rita; Majamaa, Kari; Chinnery, Patrick F

2018-01-01

The m.1555A>G mtDNA variant causes maternally inherited deafness, but the reasons for the highly variable clinical penetrance are not known. Exome sequencing identified a heterozygous start loss mutation in SSBP1, encoding the single stranded binding protein 1 (SSBP1), segregating with hearing loss in a multi-generational family transmitting m.1555A>G, associated with mtDNA depletion and multiple deletions in skeletal muscle. The SSBP1 mutation reduced steady state SSBP1 levels leading to a perturbation of mtDNA metabolism, likely compounding the intra-mitochondrial translation defect due to m.1555A>G in a tissue-specific manner. This family demonstrates the importance of rare trans-acting genetic nuclear modifiers in the clinical expression of mtDNA disease. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain.
Analysis of selected genes associated with cardiomyopathy by next-generation sequencing.

PubMed

Szabadosova, Viktoria; Boronova, Iveta; Ferenc, Peter; Tothova, Iveta; Bernasovska, Jarmila; Zigova, Michaela; Kmec, Jan; Bernasovsky, Ivan

2018-02-01

As the leading cause of congestive heart failure, cardiomyopathy represents a heterogeneous group of heart muscle disorders. Despite considerable progress being made in the genetic diagnosis of cardiomyopathy by detection of the mutations in the most prevalent cardiomyopathy genes, the cause remains unsolved in many patients. High-throughput mutation screening in the disease genes for cardiomyopathy is now possible through target enrichment followed by next-generation sequencing. The aim of the study was to analyze a panel of genes associated with dilated or hypertrophic cardiomyopathy based on previously published results in order to identify the subjects at risk. Next-generation sequencing on the Illumina HiSeq 2500 platform was used to detect sequence variants in 16 individuals diagnosed with dilated or hypertrophic cardiomyopathy. Detected variants were filtered and the functional impact of amino acid changes was predicted by computational programs. DNA samples of the 16 patients were analyzed by whole exome sequencing. We identified six nonsynonymous variants that were predicted to be pathogenic by all of the prediction programs used: rs3744998 (EPG5), rs11551768 (MGME1), rs148374985 (MURC), rs78461695 (PLEC), rs17158558 (RET) and rs2295190 (SYNE1). Two of the analyzed sequence variants had a minor allele frequency (MAF) < 0.01: rs148374985 (MURC) and rs34580776 (MYBPC3). Our data support the potential role of the detected variants in pathogenesis of dilated or hypertrophic cardiomyopathy; however, the possibility that these variants might not be true disease-causing variants but are susceptibility alleles that require additional mutations or injury to cause the clinical phenotype of disease must be considered. © 2017 Wiley Periodicals, Inc.
Clinical whole-genome sequencing from routine formalin-fixed, paraffin-embedded specimens: pilot study for the 100,000 Genomes Project.

PubMed

Robbe, Pauline; Popitsch, Niko; Knight, Samantha J L; Antoniou, Pavlos; Becq, Jennifer; He, Miao; Kanapin, Alexander; Samsonova, Anastasia; Vavoulis, Dimitrios V; Ross, Mark T; Kingsbury, Zoya; Cabes, Maite; Ramos, Sara D C; Page, Suzanne; Dreau, Helene; Ridout, Kate; Jones, Louise J; Tuff-Lacey, Alice; Henderson, Shirley; Mason, Joanne; Buffa, Francesca M; Verrill, Clare; Maldonado-Perez, David; Roxanis, Ioannis; Collantes, Elena; Browning, Lisa; Dhar, Sunanda; Damato, Stephen; Davies, Susan; Caulfield, Mark; Bentley, David R; Taylor, Jenny C; Turnbull, Clare; Schuh, Anna

2018-02-01

Purpose Fresh-frozen (FF) tissue is the optimal source of DNA for whole-genome sequencing (WGS) of cancer patients. However, it is not always available, limiting the widespread application of WGS in clinical practice. We explored the viability of using formalin-fixed, paraffin-embedded (FFPE) tissues, available routinely for cancer patients, as a source of DNA for clinical WGS. Methods We conducted a prospective study using DNAs from matched FF, FFPE, and peripheral blood germ-line specimens collected from 52 cancer patients (156 samples) following routine diagnostic protocols. We compared somatic variants detected in FFPE and matching FF samples. Results We found the single-nucleotide variant agreement reached 71% across the genome and somatic copy-number alterations (CNAs) detection from FFPE samples was suboptimal (0.44 median correlation with FF) due to nonuniform coverage. CNA detection was improved significantly with lower reverse crosslinking temperature in FFPE DNA extraction (80 °C or 65 °C depending on the methods). Our final data showed somatic variant detection from FFPE for clinical decision making is possible. We detected 98% of clinically actionable variants (including 30/31 CNAs). Conclusion We present the first prospective WGS study of cancer patients using FFPE specimens collected in a routine clinical environment proving WGS can be applied in the clinic. GENETICS in MEDICINE advance online publication, 1 February 2018; doi:10.1038/gim.2017.241.

Utility of NIST Whole-Genome Reference Materials for the Technical Validation of a Multigene Next-Generation Sequencing Test.

PubMed

Shum, Bennett O V; Henner, Ilya; Belluoccio, Daniele; Hinchcliffe, Marcus J

2017-07-01

The sensitivity and specificity of next-generation sequencing laboratory developed tests (LDTs) are typically determined by an analyte-specific approach. Analyte-specific validations use disease-specific controls to assess an LDT's ability to detect known pathogenic variants. Alternatively, a methods-based approach can be used for LDT technical validations. Methods-focused validations do not use disease-specific controls but use benchmark reference DNA that contains known variants (benign, variants of unknown significance, and pathogenic) to assess variant calling accuracy of a next-generation sequencing workflow. Recently, four whole-genome reference materials (RMs) from the National Institute of Standards and Technology (NIST) were released to standardize methods-based validations of next-generation sequencing panels across laboratories. We provide a practical method for using NIST RMs to validate multigene panels. We analyzed the utility of RMs in validating a novel newborn screening test that targets 70 genes, called NEO1. Despite the NIST RM variant truth set originating from multiple sequencing platforms, replicates, and library types, we discovered a 5.2% false-negative variant detection rate in the RM truth set genes that were assessed in our validation. We developed a strategy using complementary non-RM controls to demonstrate 99.6% sensitivity of the NEO1 test in detecting variants. Our findings have implications for laboratories or proficiency testing organizations using whole-genome NIST RMs for testing. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
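The sensitivity and false-negative figures quoted in this abstract come from comparing a workflow's variant calls against a truth set. A minimal sketch of that comparison follows, assuming variants are represented as `(chrom, pos, ref, alt)` tuples; the function name and the example variants are hypothetical and are not taken from the NEO1 validation.

```python
from typing import Dict, Set, Tuple

# A variant is identified by chromosome, 1-based position, reference and alternate allele.
Variant = Tuple[str, int, str, str]


def compare_to_truth(truth: Set[Variant], called: Set[Variant]) -> Dict[str, float]:
    """Compare called variants with a truth set restricted to the same target regions."""
    true_pos = len(truth & called)    # in the truth set and called
    false_neg = len(truth - called)   # in the truth set but missed
    false_pos = len(called - truth)   # called but absent from the truth set
    if true_pos + false_neg == 0:
        raise ValueError("truth set is empty")
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "sensitivity": true_pos / (true_pos + false_neg),
        "false_negative_rate": false_neg / (true_pos + false_neg),
    }


# Hypothetical example: four truth variants, three recovered, one extra call.
truth_set = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
             ("chr2", 75, "G", "A"), ("chr3", 900, "T", "C")}
call_set = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
            ("chr2", 75, "G", "A"), ("chr5", 10, "A", "T")}
print(compare_to_truth(truth_set, call_set))
```

Specificity additionally needs a count of true-negative positions (reference-confident sites in the assessed regions), which this sketch deliberately leaves out.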
Whole Exome Sequencing of Pediatric Gastric Adenocarcinoma Reveals an Atypical Presentation of Li-Fraumeni Syndrome

PubMed Central

Chang, Vivian Y.; Federman, Noah; Martinez-Agosto, Julian; Tatishchev, Sergei F.; Nelson, Stanley F.

2014-01-01

Background Gastric adenocarcinoma is a rare diagnosis in childhood. A 14-year-old male patient presented with metastatic gastric adenocarcinoma, and a strong family history of colon cancer. Clinical sequencing of CDH1 and APC was negative. Whole exome sequencing was therefore applied to capture the majority of protein-coding regions for the identification of single-nucleotide variants, small insertion/deletions, and copy number abnormalities in the patient’s germline as well as primary tumor. Materials and Methods DNA was extracted from the patient’s blood, primary tumor, and the unaffected mother’s blood. DNA libraries were constructed and sequenced on Illumina HiSeq2000. Data were post-processed using Picard and Samtools, then analyzed with the Genome Analysis Toolkit. Variants were annotated using an in-house Ensembl-based program. Copy number was assessed using ExomeCNV. Results Each sample was sequenced to a mean depth of coverage of greater than 120×. A rare non-synonymous coding SNV in TP53 was identified in the germline. There were 10 somatic cancer protein-damaging variants that were not observed in the unaffected mother’s genome. ExomeCNV, comparing tumor to the patient’s germline, identified abnormal copy number spanning 6,946 genes. Conclusion We present an unusual case of Li-Fraumeni detected by whole exome sequencing. There were also likely driver somatic mutations in the gastric adenocarcinoma. These results highlight the need for more thorough and broad scale germline and cancer analyses to accurately inform patients of inherited risk to cancer and to identify somatic mutations. PMID:23015295
Base Excision Repair Variants in Cancer

PubMed Central

Marsden, Carolyn G.; Dragon, Julie A.; Wallace, Susan S.; Sweasy, Joann B.

2018-01-01

Base excision repair (BER) is a key genome maintenance pathway that removes endogenously damaged DNA bases that arise in cells at very high levels on a daily basis. Failure to remove these damaged DNA bases leads to increased levels of mutagenesis and chromosomal instability, which have the potential to drive carcinogenesis. Next-generation sequencing of the germline and tumor genomes of thousands of individuals has uncovered many rare mutations in BER genes. Given that BER is critical for genome maintenance, it is important to determine whether BER genomic variants have functional phenotypes. In this chapter we present our in silico methods for the identification and prioritization of BER variants for further study. We also provide detailed instructions and commentary on the initial cellular assays we employ to dissect potentially important phenotypes of human BER variants and highlight the strengths and weaknesses of our approaches. BER variants possessing interesting functional phenotypes can then be studied in more detail to provide important mechanistic insights regarding the role of aberrant BER in carcinogenesis. PMID:28645367
On the Sequence-Directed Nature of Human Gene Mutation: The Role of Genomic Architecture and the Local DNA Sequence Environment in Mediating Gene Mutations Underlying Human Inherited Disease

PubMed Central

Cooper, David N.; Bacolla, Albino; Férec, Claude; Vasquez, Karen M.; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min

2011-01-01

Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher-order features of the genomic architecture. The human genome is now recognized to contain ‘pervasive architectural flaws’ in that certain DNA sequences are inherently mutation-prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of non-canonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair, and may serve to increase mutation frequencies in generalized fashion (i.e. both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. PMID:21853507
Directed evolution of an RNA enzyme

NASA Technical Reports Server (NTRS)

Beaudry, Amber A.; Joyce, Gerald F.

1992-01-01

An in vitro evolution procedure was used to obtain RNA enzymes with a particular catalytic function. A population of 10^13 variants of the Tetrahymena ribozyme, a group I ribozyme that catalyzes sequence-specific cleavage of RNA via a phosphoester transfer mechanism, was generated. This enzyme has a limited ability to cleave DNA under conditions of high temperature or high MgCl2 concentration, or both. A selection constraint was imposed on the population of ribozyme variants such that only those individuals that carried out DNA cleavage under physiologic conditions were amplified to produce 'progeny' ribozymes. Mutations were introduced during amplification to maintain heterogeneity in the population. This process was repeated for ten successive generations, resulting in enhanced (100 times) DNA cleavage activity.
Codon optimization of antigen coding sequences improves the immune potential of DNA vaccines against avian influenza virus H5N1 in mice and chickens.

PubMed

Stachyra, Anna; Redkiewicz, Patrycja; Kosson, Piotr; Protasiuk, Anna; Góra-Sochacka, Anna; Kudla, Grzegorz; Sirko, Agnieszka

2016-08-26

Highly pathogenic avian influenza viruses are a serious threat to domestic poultry and can be a source of new human pandemic and annual influenza strains. Vaccination is the main strategy of protection against influenza, thus new generation vaccines, including DNA vaccines, are needed. One promising approach for enhancing the immunogenicity of a DNA vaccine is to maximize its expression in the immunized host. The immunogenicity of three variants of a DNA vaccine encoding hemagglutinin (HA) from the avian influenza virus A/swan/Poland/305-135V08/2006 (H5N1) was compared in two animal models, mice (BALB/c) and chickens (broilers and layers). One variant encoded the wild type HA while the other two encoded HA without the proteolytic site between the HA1 and HA2 subunits and differed in usage of synonymous codons. One of them was enriched for codons preferentially used in chicken genes, while in the other modified variant the third position of codons was occupied in almost 100% of cases by G or C nucleotides. The variant of the DNA vaccine with almost 100% GC content in the third position of codons stimulated the strongest immune response in both animal models, mice and chickens. These results indicate that such modification can improve not only gene expression but also immunogenicity of a DNA vaccine. Enhancement of the GC content in the third position of the codon might be a good strategy for development of a variant of a DNA vaccine against influenza that could be highly effective in distant hosts, such as birds and mammals, including humans.
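The quantity this abstract varies between vaccine constructs, the GC content at the third codon position (often called GC3), is simple to compute from a coding sequence. The sketch below is illustrative only: the function name and the two short synonymous sequences (both encoding Met-Leu-Ser-Ala) are made up for the example and are not the HA sequences from the study.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C."""
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 long")
    third_bases = seq[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Two hypothetical synonymous encodings of the peptide Met-Leu-Ser-Ala.
at_rich_third = "ATGCTTTCAGCA"   # ATG CTT TCA GCA
gc_rich_third = "ATGCTGTCCGCC"   # ATG CTG TCC GCC

print(gc3_content(at_rich_third))  # 0.25
print(gc3_content(gc_rich_third))  # 1.0
```

Because only synonymous codons are swapped, the encoded protein is unchanged while GC3 moves from 25% to 100% in this toy case.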
Misregulation effect of a novel allelic variant in the Z promoter region found in cis with the CYP21A2 p.P482S mutation: implications for 21-hydroxylase deficiency.

PubMed

Fernández, Cecilia S; Bruque, Carlos D; Taboas, Melisa; Buzzalino, Noemí D; Espeche, Lucia D; Pasqualini, Titania; Charreau, Eduardo H; Alba, Liliana G; Ghiringhelli, Pablo D; Dain, Liliana

2015-09-01

The aim of the current study was to search for the presence of genetic variants in the CYP21A2 Z promoter regulatory region in patients with congenital adrenal hyperplasia due to 21-hydroxylase deficiency. Screening of the 10 most frequent pseudogene-derived mutations was followed by direct sequencing of the entire coding sequence, the proximal promoter, and a distal regulatory region in DNA samples from patients with at least one non-determined allele. We report three non-classical patients that presented a novel genetic variant-g.15626A>G-within the Z promoter regulatory region. In all the patients, the novel variant was found in cis with the mild, less frequent, p.P482S mutation located in the exon 10 of the CYP21A2 gene. The putative pathogenic implication of the novel variant was assessed by in silico analyses and in vitro assays. Topological analyses showed differences in the curvature and bendability of the DNA region bearing the novel variant. By performing functional studies, a significantly decreased activity of a reporter gene placed downstream from the regulatory region was found by the G transition. Our results may suggest that the activity of an allele bearing the p.P482S mutation may be influenced by the misregulated CYP21A2 transcriptional activity exerted by the Z promoter A>G variation.
Characterization of hemoglobin Hotel Dieu in a Puerto Rican adolescent.

PubMed

Cadilla, C L; López, C R; García-Castiñeiras, S; Valencia, D; Renta, J Y; Rivera-Caragol, E; Barrios, N J; Santiago-Borrero, P J

1998-01-01

Hemoglobin Hotel Dieu (HbHD) is a high-oxygen affinity variant of HbA never before reported in a Hispanic patient. This Hb variant was first reported in 1981 by Blouquit et al. in a white person with erythrocytosis with a substitution in the beta 99 aspartic acid residue by glycine. A 13-year-old Puerto Rican boy had pain in his chest, headaches, easy fatigability, and high Hb (as high as 19.1 g/dl). Protein analysis was performed by cellulose acetate, citrate agar, and isoelectric focusing electrophoresis and high-pressure liquid chromatography (HPLC), polymerase chain reaction (PCR) amplification, and DNA sequencing of the second exon of the beta gene in samples obtained from the mother, father, and the patient, and DNA fingerprinting to determine paternity. The variant found in the patient migrated on cellulose acetate electrophoresis to a cathodic position relative to HbF, and a band cathodal to HbA and close to HbF on isoelectric focusing electrophoresis. The patient showed an abnormal well-resolved peak on HPLC with a retention time slightly shorter than that for HbS. DNA analysis by direct sequencing of the PCR product demonstrated heterozygosity for codon 99 (GAT-->GGT) in the patient but not in either parent. DNA fingerprinting by multiplex PCR amplification of three simple tandem repeat loci showed that the patient shared alleles in all three loci with both parents, ruling out nonpaternity. The protein and DNA analysis indicate that the erythrocytosis is caused by the presence of HbHD in this Hispanic adolescent.
Mass Spectrometric Determination of ILPR G-quadruplex Binding Sites in Insulin and IGF-2

PubMed Central

Xiao, JunFeng

2009-01-01

The insulin-linked polymorphic region (ILPR) of the human insulin gene promoter region forms G-quadruplex structures in vitro. Previous studies show that insulin and insulin-like growth factor-2 (IGF-2) exhibit high affinity binding in vitro to 2-repeat sequences of ILPR variants a and h, but negligible binding to variant i. Two-repeat sequences of variants a and h form intramolecular G-quadruplex structures that are not evidenced for variant i. Here we report on the use of protein digestion combined with affinity capture and MALDI-MS detection to pinpoint ILPR binding sites in insulin and IGF-2. Peptides captured by ILPR variants a and h were sequenced by MALDI-MS/MS, LC-MS and in silico digestion. On-bead digestion of insulin-ILPR variant a complexes supported the conclusions. The results indicate that the sequence VCG(N)RGF is generally present in the captured peptides and is likely involved in the affinity binding interactions of the proteins with the ILPR G-quadruplexes. The significance of arginine in the interactions was studied by comparing the affinities of synthesized peptides VCGERGF and VCGEAGF with ILPR variant a. Peptides from other regions of the proteins that are connected through disulfide linkages were also detected in some capture experiments. Identification of binding sites could facilitate design of DNA binding ligands for capture and detection of insulin and IGF-2. The interactions may have biological significance as well. PMID:19747845
High-Resolution Sequence-Function Mapping of Full-Length Proteins

PubMed Central

Kowalsky, Caitlin A.; Klesmith, Justin R.; Stapleton, James A.; Kelly, Vince; Reichkitzer, Nolan; Whitehead, Timothy A.

2015-01-01

Comprehensive sequence-function mapping involves detailing the fitness contribution of every possible single mutation to a gene by comparing the abundance of each library variant before and after selection for the phenotype of interest. Deep sequencing of library DNA allows frequency reconstruction for tens of thousands of variants in a single experiment, yet short read lengths of current sequencers makes it challenging to probe genes encoding full-length proteins. Here we extend the scope of sequence-function maps to entire protein sequences with a modular, universal sequence tiling method. We demonstrate the approach with both growth-based selections and FACS screening, offer parameters and best practices that simplify design of experiments, and present analytical solutions to normalize data across independent selections. Using this protocol, sequence-function maps covering full sequences can be obtained in four to six weeks. Best practices introduced in this manuscript are fully compatible with, and complementary to, other recently published sequence-function mapping protocols. PMID:25790064
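The core comparison described here, the abundance of each variant before versus after selection, is commonly summarized as a log2 enrichment ratio of read frequencies. The following is a generic sketch of that calculation under stated assumptions; it is not the normalization scheme from the paper, and the function name, the pseudocount, and the read counts are all hypothetical.

```python
import math
from typing import Dict


def log2_enrichment(before: Dict[str, int], after: Dict[str, int],
                    pseudocount: float = 0.5) -> Dict[str, float]:
    """log2 of (frequency after selection / frequency before selection) per variant.

    A small pseudocount keeps variants that drop to zero reads after
    selection from producing log2(0).
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    scores = {}
    for variant, n_before in before.items():
        n_after = after.get(variant, 0)
        freq_before = (n_before + pseudocount) / total_before
        freq_after = (n_after + pseudocount) / total_after
        scores[variant] = math.log2(freq_after / freq_before)
    return scores


# Hypothetical deep-sequencing read counts for three library variants.
unselected = {"WT": 1000, "A45G": 1000, "L12P": 1000}
selected = {"WT": 1000, "A45G": 2900, "L12P": 100}
for name, score in log2_enrichment(unselected, selected).items():
    print(f"{name}: {score:+.2f}")
```

Scores are often reported relative to the wild-type score so that values from independent selections sit on a common scale; that step is omitted here.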
Distribution of gene mutations in sporadic congenital cataract in a Han Chinese population

PubMed Central

Li, Dan; Wang, Siying; Ye, Hongfei; Tang, Yating; Qiu, Xiaodi; Fan, Qi; Rong, Xianfang; Liu, Xin; Chen, Yuhong; Yang, Jin

2016-01-01

Purpose This study aimed to investigate the genetic effects underlying non-familial sporadic congenital cataract (SCC). Methods We collected DNA samples from 74 patients with SCC and 20 patients with traumatic cataract (TC) in an age-matched group and performed genomic sequencing of 61 lens-related genes with target region capture and next-generation sequencing (NGS). The suspected SCC variants were validated with MassARRAY and Sanger sequencing. DNA samples from 103 healthy subjects were used as additional controls in the confirmation examination. Results By filtering against common variants in public databases and those associated with TC cases, we identified 23 SCC-specific variants in 17 genes from 19 patients, which were predicted to be functional. These mutations were further confirmed by examination of the 103 healthy controls. Among the mutated genes, CRYBB3 had the highest mutation frequency with mutations detected four times in four patients, followed by EPHA2, NHS, and WDR36, the mutation of which were detected two times in two patients. We observed that the four patients with CRYBB3 mutations had three different cataract phenotypes. Conclusions From this study, we concluded the clinical and genetic heterogeneity of SCC. This is the first study to report broad spectrum genotyping for patients with SCC. PMID:27307692
Asymmetric single-strand polymorphism: an accurate and cost-effective method to amplify and sequence allelic variants

USDA-ARS's Scientific Manuscript database

We needed to obtain an alternative to conventional cloning to generate high-quality DNA sequences from a variety of nuclear orthologs for phylogenetic studies in potato, to save time and money and to avoid problems typically encountered in cloning. We tested a variety of SSCP protocols to include pu...
Synthetic Spike-in Standards Improve Run-Specific Systematic Error Analysis for DNA and RNA Sequencing

PubMed Central

Zook, Justin M.; Samarov, Daniel; McDaniel, Jennifer; Sen, Shurjo K.; Salit, Marc

2012-01-01

While the importance of random sequencing errors decreases at higher DNA or RNA sequencing depths, systematic sequencing errors (SSEs) dominate at high sequencing depths and can be difficult to distinguish from biological variants. These SSEs can cause base quality scores to underestimate the probability of error at certain genomic positions, resulting in false positive variant calls, particularly in mixtures such as samples with RNA editing, tumors, circulating tumor cells, bacteria, mitochondrial heteroplasmy, or pooled DNA. Most algorithms proposed for correction of SSEs require a data set used to calculate association of SSEs with various features in the reads and sequence context. This data set is typically either from a part of the data set being “recalibrated” (Genome Analysis ToolKit, or GATK) or from a separate data set with special characteristics (SysCall). Here, we combine the advantages of these approaches by adding synthetic RNA spike-in standards to human RNA, and use GATK to recalibrate base quality scores with reads mapped to the spike-in standards. Compared to conventional GATK recalibration that uses reads mapped to the genome, spike-ins improve the accuracy of Illumina base quality scores by a mean of 5 Phred-scaled quality score units, and by as much as 13 units at CpG sites. In addition, since the spike-in data used for recalibration are independent of the genome being sequenced, our method allows run-specific recalibration even for the many species without a comprehensive and accurate SNP database. We also use GATK with the spike-in standards to demonstrate that the Illumina RNA sequencing runs overestimate quality scores for AC, CC, GC, GG, and TC dinucleotides, while SOLiD has less dinucleotide SSEs but more SSEs for certain cycles. We conclude that using these DNA and RNA spike-in standards with GATK improves base quality score recalibration. PMID:22859977
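The Phred-scaled units used in this abstract follow the standard definition Q = -10 log10(p), where p is the probability that a base call is wrong. The sketch below shows the conversion and how an empirical quality can be estimated from reads mapped to a known spike-in sequence, where every mismatch can be treated as an error. It illustrates the arithmetic only and is not GATK's recalibration model; the function names and counts are hypothetical.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    return -10 * math.log10(p)


def empirical_quality(mismatches: int, bases: int) -> float:
    """Empirical Phred quality from bases aligned to a known spike-in sequence.

    One pseudo-observation is added to each count (an illustrative smoothing
    choice) so that zero observed mismatches does not yield infinite quality.
    """
    return error_prob_to_phred((mismatches + 1) / (bases + 2))


# A reported Q30 bin with 99 mismatches in 9,998 spike-in bases is really about Q20.
print(phred_to_error_prob(30))         # reported error probability
print(empirical_quality(99, 9998))     # empirical quality from the spike-ins
# A 5-unit Phred shift corresponds to roughly a 3.2-fold change in error probability.
print(phred_to_error_prob(25) / phred_to_error_prob(30))
```

On this scale, the 13-unit improvement reported at CpG sites corresponds to about a 20-fold difference in estimated error probability.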
Identification of novel mutations and sequence variants in the SOX2 and CHX10 genes in patients with anophthalmia/microphthalmia

PubMed Central

Zhou, Jie; Kherani, Femida; Bardakjian, Tanya M.; Katowitz, James; Hughes, Nkecha; Schimmenti, Lisa A.; Schneider, Adele

2008-01-01

Purpose Mutations in the SOX2 and CHX10 genes have been reported in patients with anophthalmia and/or microphthalmia. In this study, we evaluated 34 anophthalmic/microphthalmic patient DNA samples (two sets of siblings included) for mutations and sequence variants in SOX2 and CHX10. Methods Conformational sensitive gel electrophoresis (CSGE) was used for the initial SOX2 and CHX10 screening of 34 affected individuals (two sets of siblings), five unaffected family members, and 80 healthy controls. Patient samples containing heteroduplexes were selected for sequence analysis. Base pair changes in SOX2 and CHX10 were confirmed by sequencing bidirectionally in patient samples. Results Two novel heterozygous mutations and two sequence variants (one known) in SOX2 were identified in this cohort. Mutation c.310 G>T (p. Glu104X), found in one patient, was in the region encoding the high mobility group (HMG) DNA-binding domain and resulted in a change from glutamic acid to a stop codon. The second mutation, noted in two affected siblings, was a single nucleotide deletion c.549delC (p. Pro184ArgfsX19) in the region encoding the activation domain, resulting in a frameshift and premature termination of the coding sequence. The shortened protein products may result in the loss of function. In addition, a novel nucleotide substitution c.*557G>A was identified in the 3′-untranslated region in one patient. The relationship between the nucleotide change and the protein function is indeterminate. A known single nucleotide polymorphism (c. *469 C>A, SNP rs11915160) was also detected in 2 of the 34 patients. Screening of CHX10 identified two synonymous sequence variants, c.471 C>T (p.Ser157Ser, rs35435463) and c.579 G>A (p. Gln193Gln, novel SNP), and one non-synonymous sequence variant, c.871 G>A (p. Asp291Asn, novel SNP). The non-synonymous polymorphism was also present in healthy controls, suggesting non-causality. 
Conclusions These results support the role of SOX2 in ocular development. Loss of SOX2 function results in severe eye malformation. CHX10 was not implicated in microphthalmia/anophthalmia in our patient cohort. PMID:18385794
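The coding-DNA positions and protein positions reported in this abstract can be cross-checked with simple arithmetic, because each codon spans three coding nucleotides. The sketch below is illustrative only and is not part of the study; the function name `codon_number` is hypothetical, and it assumes positions are 1-based and counted from the A of the start codon. It is applied only to the substitutions in the coding region, since the frameshift and 3′-UTR variants follow different naming rules.

```python
def codon_number(cdna_pos: int) -> int:
    """Return the 1-based codon index that contains a 1-based coding-DNA position."""
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions start at 1")
    # Positions 1-3 belong to codon 1, positions 4-6 to codon 2, and so on.
    return (cdna_pos - 1) // 3 + 1


# Coding-DNA position -> amino acid position, as reported in the abstract.
reported = {
    310: 104,  # SOX2 c.310G>T, p.Glu104X
    471: 157,  # CHX10 c.471C>T, p.Ser157Ser
    579: 193,  # CHX10 c.579G>A, p.Gln193Gln
    871: 291,  # CHX10 c.871G>A, p.Asp291Asn
}

for pos, residue in reported.items():
    status = "consistent" if codon_number(pos) == residue else "MISMATCH"
    print(f"c.{pos} -> codon {codon_number(pos)} (reported {residue}): {status}")
```

All four substitutions map to the reported residue numbers under this assumption.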
[Detection of pathogenic mutations in Marfan syndrome by targeted next-generation semiconductor sequencing].

PubMed

Lu, Chaoxia; Wu, Wei; Xiao, Jifang; Meng, Yan; Zhang, Shuyang; Zhang, Xue

2013-06-01

To detect pathogenic mutations in Marfan syndrome (MFS) using an Ion Torrent Personal Genome Machine (PGM) and to validate the result of targeted next-generation semiconductor sequencing for the diagnosis of genetic disorders. Peripheral blood samples were collected from three MFS patients and a normal control with informed consent. Genomic DNA was isolated by a standard method and then subjected to targeted sequencing using an Ion Ampliseq(TM) Inherited Disease Panel. Three multiplex PCR reactions were carried out to amplify the coding exons of 328 genes including FBN1, TGFBR1 and TGFBR2. DNA fragments from different samples were ligated with barcoded sequencing adaptors. Template preparation, emulsion PCR, and Ion Sphere Particle enrichment were carried out using an Ion One Touch system. The Ion Sphere Particles were sequenced on a 318 chip using the PGM platform. Data from the PGM runs were processed using Ion Torrent Suite 3.2 software to generate sequence reads. After sequence alignment and extraction of SNPs and indels, all the variants were filtered against dbSNP137. DNA sequences were visualized with the Integrative Genomics Viewer. The most likely disease-causing variants were analyzed by Sanger sequencing. PGM sequencing yielded an output of 855.80 Mb, with a median sequencing depth of >100× and coverage of >98% of the targeted regions in all four samples. After data analysis and database filtering, one known missense mutation (p.E1811K) and two novel premature termination mutations (p.E2264X and p.L871FfsX23) in the FBN1 gene were identified in the three MFS patients. All mutations were verified by conventional Sanger sequencing. Pathogenic FBN1 mutations were identified in all patients with MFS, indicating that targeted next-generation sequencing on PGM sequencers can be applied for accurate and high-throughput testing of genetic disorders.
Intratumoral heterogeneity identified at the epigenetic, genetic and transcriptional level in glioblastoma.

PubMed

Parker, Nicole R; Hudson, Amanda L; Khong, Peter; Parkinson, Jonathon F; Dwight, Trisha; Ikin, Rowan J; Zhu, Ying; Cheng, Zhangkai Jason; Vafaee, Fatemeh; Chen, Jason; Wheeler, Helen R; Howell, Viive M

2016-03-04

Heterogeneity is a hallmark of glioblastoma with intratumoral heterogeneity contributing to variability in responses and resistance to standard treatments. Promoter methylation status of the DNA repair enzyme O(6)-methylguanine DNA methyltransferase (MGMT) is the most important clinical biomarker in glioblastoma, predicting for therapeutic response. However, it does not always correlate with response. This may be due to intratumoral heterogeneity, with a single biopsy unlikely to represent the entire lesion. Aberrations in other DNA repair mechanisms may also contribute. This study investigated intratumoral heterogeneity in multiple glioblastoma tumors with a particular focus on the DNA repair pathways. Transcriptional intratumoral heterogeneity was identified in 40% of cases with variability in MGMT methylation status found in 14% of cases. As well as identifying intratumoral heterogeneity at the transcriptional and epigenetic levels, targeted next generation sequencing identified between 1 and 37 unique sequence variants per specimen. In-silico tools were then able to identify deleterious variants in both the base excision repair and the mismatch repair pathways that may contribute to therapeutic response. As these pathways have roles in temozolomide response, these findings may confound patient management and highlight the importance of assessing multiple tumor biopsies.
Hotspot mutations in cancer genes may be missed in routine diagnostics due to neighbouring sequence variants.

PubMed

Bartels, Stephan; Schipper, Elisa; Hasemeier, Britta; Kreipe, Hans; Lehmann, Ulrich

2018-05-27

The detection of hotspot mutations in key cancer genes is now an essential part of the diagnostic work-up in molecular pathology. Nearly all assays for mutation detection involve an amplification step. A second single nucleotide variant (SNV) on the same allele adjacent to a mutational hotspot can interfere with primer binding, leading to unnoticed allele-specific amplification of the wild type allele and thereby false-negative mutation testing. We present two diagnostic cases with false negative sequence results for JAK2 and SRSF2. In both cases mutations would have escaped detection if only one strand of DNA had been analysed. Because many commercially available diagnostic kits rely on the analysis of only one DNA strand they are prone to fail in cases like these. Detailed protocols and quality control measures to prevent corresponding pitfalls are presented. Copyright © 2017. Published by Elsevier Inc.
Mitochondrial DNA variants can mediate methylation status of inflammation, angiogenesis and signaling genes

PubMed Central

Atilano, Shari R.; Malik, Deepika; Chwa, Marilyn; Cáceres-Del-Carpio, Javier; Nesburn, Anthony B.; Boyer, David S.; Kuppermann, Baruch D.; Jazwinski, S. Michal; Miceli, Michael V.; Wallace, Douglas C.; Udar, Nitin; Kenney, M. Cristina

2015-01-01

Mitochondrial (mt) DNA can be classified into haplogroups representing different geographic and/or racial origins of populations. The H haplogroup is protective against age-related macular degeneration (AMD), while the J haplogroup is high risk for AMD. In the present study, we performed comparison analyses of human retinal cell cybrids, which possess identical nuclei, but mtDNA from subjects with either the H or J haplogroups, and demonstrate differences in total global methylation, and expression patterns for two genes related to acetylation and five genes related to methylation. Analyses revealed that untreated-H and -J cybrids have different expression levels for nuclear genes (CFH, EFEMP1, VEGFA and NFkB2). However, expression levels for these genes become equivalent after treatment with a methylation inhibitor, 5-aza-2′-deoxycytidine. Moreover, sequencing of the entire mtDNA suggests that differences in epigenetic status found in cybrids are likely due to single nucleotide polymorphisms (SNPs) within the haplogroup profiles rather than rare variants or private SNPs. In conclusion, our findings indicate that mtDNA variants can mediate methylation profiles and transcription for inflammation, angiogenesis and various signaling pathways, which are important in several common diseases. PMID:25964427
Mitochondrial DNA sequence context in the penetrance of mitochondrial t-RNA mutations: A study across multiple lineages with diagnostic implications

PubMed Central

Queen, Rachel A.; Steyn, Jannetta S.; Lord, Phillip

2017-01-01

Mitochondrial DNA (mtDNA) mutations are well recognized as an important cause of inherited disease. Diseases caused by mtDNA mutations exhibit a high degree of clinical heterogeneity with a complex genotype-phenotype relationship, with many such mutations exhibiting incomplete penetrance. There is evidence that the spectrum of mutations causing mitochondrial disease might differ between different mitochondrial lineages (haplogroups) seen in different global populations. This would point to the importance of sequence context in the expression of mutations. To explore this possibility, we looked for mutations which are known to cause disease in humans, in animals of other species unaffected by mtDNA disease. The mt-tRNA genes are the location of many pathogenic mutations, with the m.3243A>G mutation on the mt-tRNA-Leu(UUR) being the most frequently seen mutation in humans. This study looked for the presence of m.3243A>G in 2784 sequences from 33 species, as well as any of the other mutations reported in association with disease located on mt-tRNA-Leu(UUR). We report a number of disease associated variations found on mt-tRNA-Leu(UUR) in other chordates, as the major population variant, with m.3243A>G being seen in 6 species. In these, we also found a number of mutations which appear compensatory and which could prevent the pathogenicity associated with this change in humans. This work has important implications for the discovery and diagnosis of mtDNA mutations in non-European populations. In addition, it might provide a partial explanation for the conflicting results in the literature that examines the role of mtDNA variants in complex traits. PMID:29161289
Variability and genetics of spacer DNA sequences between the ribosomal-RNA genes of hexaploid wheat (Triticum aestivum).

PubMed

May, C E; Appels, R

1987-09-01

Using restriction enzyme digests of genomic DNA extracted from the leaves of 25 hexaploid wheat (Triticum aestivum L. em. Thell.) cultivars and their hybrids, restriction fragment length polymorphisms of the spacer DNA which separates the ribosomal-RNA genes have been examined. (From one to three thousand of these genes are borne on chromosomes 1B and 6B of hexaploid wheat). The data show that there are three distinct alleles of the 1B locus, designated Nor-B1a, Nor-B1b, and Nor-B1c, and at least five allelic variants of the 6B locus, designated Nor-B2a, Nor-B2b, Nor-B2c, Nor-B2d, and Nor-B2e. A further, previously reported allele on 6B has been named Nor-B2f. Chromosome 5D has only one allelic variant, Nor-D3. Whereas the major spacer variants of the 1B alleles apparently differ by the loss or gain of one or two of the 133 bp sub-repeat units within the spacer DNA, the 6B allelic variants show major differences in their compositions and lengths. This may be related to the greater number of rDNA repeat units at this locus. The practical implications of these differences and their application to wheat breeding are discussed.

Candidate Cancer Allele cDNA Collection | Office of Cancer Genomics

Cancer.gov

CTD2 researchers at the Broad Institute/DFCI have developed a collection of plasmids including mutant alleles found in sequencing studies of cancer. It includes somatic variants found in lung adenocarcinoma and across other cancer types. The clones enable researchers to characterize the function of the cancer variants in high-throughput experiments. These plasmids are collectively called the “Broad Target Accelerator Plasmid Collections”.
Whole-Genome Sequencing of the World’s Oldest People

PubMed Central

Gierman, Hinco J.; Fortney, Kristen; Roach, Jared C.; Coles, Natalie S.; Li, Hong; Glusman, Gustavo; Markov, Glenn J.; Smith, Justin D.; Hood, Leroy; Coles, L. Stephen; Kim, Stuart K.

2014-01-01

Supercentenarians (110 years or older) are the world’s oldest people. Seventy-four are alive worldwide, with twenty-two in the United States. We performed whole-genome sequencing on 17 supercentenarians to explore the genetic basis underlying extreme human longevity. We found no significant evidence of enrichment for a single rare protein-altering variant or for a gene harboring different rare protein-altering variants in supercentenarian compared to control genomes. We followed up on the gene most enriched for rare protein-altering variants in our cohort of supercentenarians, TSHZ3, by sequencing it in a second cohort of 99 long-lived individuals but did not find a significant enrichment. The genome of one supercentenarian had a pathogenic mutation in DSC2, known to predispose to arrhythmogenic right ventricular cardiomyopathy, which is recommended to be reported to this individual as an incidental finding according to a recent position statement by the American College of Medical Genetics and Genomics. Even with this pathogenic mutation, the proband lived to over 110 years. The entire list of rare protein-altering variants and DNA sequence of all 17 supercentenarian genomes is available as a resource to assist the discovery of the genetic basis of extreme longevity in future studies. PMID:25390934
Detection of the Canine Parvovirus 2c Subtype in Australian Dogs.

PubMed

Woolford, Lucy; Crocker, Paul; Bobrowski, Hannah; Baker, Trevor; Hemmatzadeh, Farhid

2017-06-01

Canine parvovirus (CPV-2) is an important cause of hemorrhagic enteritis in dogs. In Australia the disease has been associated with CPV-2a and CPV-2b variants. A third variant that emerged more recently overseas, CPV-2c, has not been detected in surveys of the Australian dog population. In this study, we report three cases of canine parvoviral enteritis associated with CPV-2c infection. Case 1 occurred in an 8-week-old puppy that died following acute hemorrhagic enteritis. Cases 2 and 3 were an 11-month-old female entire Saint Bernard and a 9-month-old male entire Siberian husky, respectively, both of which had completed vaccination schedules and presented with vomiting or mild diarrhea only. Full genomic sequencing of parvoviral DNA from cases 1, 2, and 3 revealed greater than 99% homology to known CPV-2c variants, and predicted protein sequences from the VP2 region of viral DNA from all three cases identified glutamic acid at amino acid residue 426, characteristic of the CPV-2c variant. Veterinary professionals should be aware that CPV-2c is now present in Australia, detected in a puppy and vaccinated young adult dogs in this study. Further characterization of CPV-2c-associated disease and its prevalence in Australian dogs requires additional research.
Dynamics of drug resistance-associated mutations in HIV-1 DNA reverse transcriptase sequence during effective ART.

PubMed

Nouchi, A; Nguyen, T; Valantin, M A; Simon, A; Sayon, S; Agher, R; Calvez, V; Katlama, C; Marcelin, A G; Soulie, C

2018-05-29

To investigate the dynamics of HIV-1 variants archived in cells harbouring drug resistance-associated mutations (DRAMs) to lamivudine/emtricitabine, etravirine and rilpivirine in patients under effective ART free from selective pressure on these DRAMs, in order to assess the possibility of recycling molecules with resistance history. We studied 25 patients with at least one DRAM to lamivudine/emtricitabine, etravirine and/or rilpivirine identified on an RNA sequence in their history and with virological control for at least 5 years under a regimen excluding all drugs from the resistant class. Longitudinal ultra-deep sequencing (UDS) and Sanger sequencing of the reverse transcriptase region were performed on cell-associated HIV-1 DNA samples taken over the 5 years of follow-up. Viral variants harbouring the analysed DRAMs were no longer detected by UDS over the 5 years in 72% of patients, with viruses susceptible to the molecules of interest found after 5 years in 80% of patients with UDS and in 88% of patients with Sanger. Residual viraemia with <50 copies/mL was detected in 52% of patients. The median HIV DNA level remained stable (2.4 at baseline versus 2.1 log10 copies/10^6 cells 5 years later). These results show a clear trend towards clearance of archived DRAMs to reverse transcriptase inhibitors in cell-associated HIV-1 DNA after a long period of virological control, free from therapeutic selective pressure on these DRAMs, reflecting probable residual replication in some reservoirs of the fittest viruses and leading to persistent evolution of the archived HIV-1 DNA resistance profile.
Evidence for recombination of mtDNA in the marine mussel Mytilus trossulus from the Baltic.

PubMed

Burzyński, Artur; Zbawicka, Małgorzata; Skibinski, David O F; Wenne, Roman

2003-03-01

A number of studies have claimed that recombination occurs in animal mtDNA, although this evidence is controversial. Ladoukakis and Zouros (2001) provided strong evidence for mtDNA recombination in the COIII gene in gonadal tissue in the marine mussel Mytilus galloprovincialis from the Black Sea. The recombinant molecules they reported had not however become established in the population from which experimental animals were sampled. In the present study, we provide further evidence of the generality of mtDNA recombination in Mytilus by reporting recombinant mtDNA molecules in a related mussel species, Mytilus trossulus, from the Baltic. The mtDNA region studied begins in the 16S rRNA gene and terminates in the cytochrome b gene and includes a major noncoding region that may be analogous to the D-loop region observed in other animals. Many bivalve species, including some Mytilus species, are unusual in that they have two mtDNA genomes, one of which is inherited maternally (F genome) the other inherited paternally (M genome). Two recombinant variants reported in the present study have population frequencies of 5% and 36% and appear to be mosaic for F-like and M-like sequences. However, both variants have the noncoding region from the M genome, and both are transmitted to sperm like the M genome. We speculate that acquisition of the noncoding region by the recombinant molecules has conferred a paternal role on mtDNA genomes that otherwise resemble the F genome in sequence.
Individualized Mutation Detection in Circulating Tumor DNA for Monitoring Colorectal Tumor Burden Using a Cancer-Associated Gene Sequencing Panel.

PubMed

Sato, Kei A; Hachiya, Tsuyoshi; Iwaya, Takeshi; Kume, Kohei; Matsuo, Teppei; Kawasaki, Keisuke; Abiko, Yukito; Akasaka, Risaburo; Matsumoto, Takayuki; Otsuka, Koki; Nishizuka, Satoshi S

2016-01-01

Circulating tumor DNA (ctDNA) carries information on tumor burden. However, the mutation spectrum is different among tumors. This study was designed to examine the utility of ctDNA for monitoring tumor burden based on an individual mutation profile. DNA was extracted from a total of 176 samples, including pre- and post-operative plasma, primary tumors, and peripheral blood mononuclear cells (PBMC), from 44 individuals who underwent curative resection of colorectal tumors, as well as nine healthy individuals. Using a panel of 50 cancer-associated genes, tumor-unique mutations were identified by comparing the single nucleotide variants (SNVs) from tumors and PBMCs with an Ion PGM sequencer. A group of the tumor-unique mutations from individual tumors were designated as individual marker mutations (MMs) to trace tumor burden by ctDNA using droplet digital PCR (ddPCR). From these experiments, three major objectives were assessed: (a) tumor-unique mutations; (b) mutation spectrum of a tumor; and (c) changes in allele frequency of the MMs in ctDNA after curative resection of the tumor. A total of 128 gene point mutations were identified in 27 colorectal tumors. Twenty-six genes were mutated in at least one sample, and 14 of these were mutated in only one sample. An average of 2.7 genes were mutated per tumor. Subsequently, 24 MMs were selected from SNVs for tumor burden monitoring. Among the MMs found by ddPCR with > 0.1% variant allele frequency in plasma DNA, 100% (8 out of 8) exhibited a decrease in post-operation ctDNA, whereas none of the 16 MMs found by ddPCR with < 0.1% variant allele frequency in plasma DNA showed a decrease. This panel of 50 cancer-associated genes appeared to be sufficient to identify individual, tumor-unique, mutated ctDNA markers in cancer patients. The MMs showed clinical utility in monitoring curatively-treated colorectal tumor burden if the allele frequency of MMs in plasma DNA is above 0.1%.
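The 0.1% variant allele frequency threshold above can be made concrete with a small calculation. The abstract does not state how allele frequency was computed from the ddPCR readout, so the following is only a sketch of one common approach: a Poisson correction converts the fraction of positive droplets into mean copies per droplet, and the frequency is the mutant share of total copies. The function names and droplet counts are hypothetical, and the mutant and wild-type channels are assumed to be counted independently over the same droplets.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean target copies per droplet.

    If a fraction p of droplets is positive, the mean copies per droplet is
    -ln(1 - p), which accounts for droplets holding more than one copy.
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def variant_allele_frequency(mut_positive: int, wt_positive: int, total: int) -> float:
    """Mutant copies as a fraction of mutant plus wild-type copies."""
    mut = copies_per_droplet(mut_positive, total)
    wt = copies_per_droplet(wt_positive, total)
    if mut + wt == 0:
        return 0.0
    return mut / (mut + wt)


# Hypothetical droplet counts, chosen only for illustration.
vaf = variant_allele_frequency(mut_positive=12, wt_positive=6000, total=15000)
print(f"VAF = {vaf:.4%}; above the 0.1% threshold: {vaf > 0.001}")
```

With these made-up counts the frequency falls above 0.1%, the range in which the study found marker mutations to be informative.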
An Automated Pipeline for Engineering Many-Enzyme Pathways: Computational Sequence Design, Pathway Expression-Flux Mapping, and Scalable Pathway Optimization.

PubMed

Halper, Sean M; Cetnar, Daniel P; Salis, Howard M

2018-01-01

Engineering many-enzyme metabolic pathways suffers from the design curse of dimensionality. There are an astronomical number of synonymous DNA sequence choices, though relatively few will express an evolutionary robust, maximally productive pathway without metabolic bottlenecks. To solve this challenge, we have developed an integrated, automated computational-experimental pipeline that identifies a pathway's optimal DNA sequence without high-throughput screening or many cycles of design-build-test. The first step applies our Operon Calculator algorithm to design a host-specific evolutionary robust bacterial operon sequence with maximally tunable enzyme expression levels. The second step applies our RBS Library Calculator algorithm to systematically vary enzyme expression levels with the smallest-sized library. After characterizing a small number of constructed pathway variants, measurements are supplied to our Pathway Map Calculator algorithm, which then parameterizes a kinetic metabolic model that ultimately predicts the pathway's optimal enzyme expression levels and DNA sequences. Altogether, our algorithms provide the ability to efficiently map the pathway's sequence-expression-activity space and predict DNA sequences with desired metabolic fluxes. Here, we provide a step-by-step guide to applying the Pathway Optimization Pipeline on a desired multi-enzyme pathway in a bacterial host.
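The "astronomical number of synonymous DNA sequence choices" can be illustrated by counting them. The sketch below is not the authors' Operon Calculator or any part of their pipeline. It simply multiplies the number of codons that encode each residue in the standard genetic code; the function name and the example peptide are hypothetical.

```python
from math import log10, prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def synonymous_sequence_count(protein: str) -> int:
    """Count the distinct coding sequences that translate to the given protein.

    Each residue can be encoded independently, so the total is the product
    of the per-residue codon counts. Stop codons are ignored.
    """
    return prod(CODON_DEGENERACY[aa] for aa in protein.upper())


# Hypothetical 10-residue peptide, used only for illustration.
peptide = "MKLSRAGVTE"
count = synonymous_sequence_count(peptide)
print(f"{peptide}: {count} synonymous coding sequences (about 10^{log10(count):.1f})")
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive screening of synonymous variants quickly becomes infeasible for a multi-enzyme pathway.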
A novel recurrent mutation in MITF predisposes to familial and sporadic melanoma

PubMed Central

Yokoyama, Satoru; Woods, Susan L.; Boyle, Glen M.; Aoude, Lauren G.; MacGregor, Stuart; Zismann, Victoria; Gartside, Michael; Cust, Anne E.; Haq, Rizwan; Harland, Mark; Taylor, John C.; Duffy, David L.; Holohan, Kelly; Dutton-Regester, Ken; Palmer, Jane M.; Bonazzi, Vanessa; Stark, Mitchell S.; Symmons, Judith; Law, Matthew H.; Schmidt, Christopher; Lanagan, Cathy; O’Connor, Linda; Holland, Elizabeth A.; Schmid, Helen; Maskiell, Judith A.; Jetann, Jodie; Ferguson, Megan; Jenkins, Mark A.; Kefford, Richard F.; Giles, Graham G.; Armstrong, Bruce K.; Aitken, Joanne F.; Hopper, John L.; Whiteman, David C.; Pharoah, Paul D.; Easton, Douglas F.; Dunning, Alison M.; Newton-Bishop, Julia A.; Montgomery, Grant W.; Martin, Nicholas G.; Mann, Graham J.; Bishop, D. Timothy; Tsao, Hensin; Trent, Jeffrey M.; Fisher, David E.; Hayward, Nicholas K.; Brown, Kevin M.

2012-01-01

So far, two familial melanoma genes have been identified, accounting for a minority of genetic risk in families. Mutations in CDKN2A account for approximately 40% of familial cases, and predisposing mutations in CDK4 have been reported in a very small number of melanoma kindreds. To identify other familial melanoma genes, here we conducted whole-genome sequencing of probands from several melanoma families, identifying one individual carrying a novel germline variant (coding DNA sequence c.G1075A; protein sequence p.E318K; rs149617956) in the melanoma-lineage-specific oncogene microphthalmia-associated transcription factor (MITF). Although the variant co-segregated with melanoma in some but not all cases in the family, linkage analysis of 31 families subsequently identified to carry the variant generated a log of odds (lod) score of 2.7 under a dominant model, indicating E318K as a possible intermediate risk variant. Consistent with this, the E318K variant was significantly associated with melanoma in a large Australian case–control sample. Likewise, it was similarly associated in an independent case–control sample from the United Kingdom. In the Australian sample, the variant allele was significantly over-represented in cases with a family history of melanoma, multiple primary melanomas, or both. The variant allele was also associated with increased naevus count and non-blue eye colour. Functional analysis of E318K showed that MITF encoded by the variant allele had impaired sumoylation and differentially regulated several MITF targets. These data indicate that MITF is a melanoma-predisposition gene and highlight the utility of whole-genome sequencing to identify novel rare variants associated with disease susceptibility. PMID:22080950
Semiconductor Whole Exome Sequencing for the Identification of Genetic Variants in Colombian Patients Clinically Diagnosed with Long QT Syndrome.

PubMed

Burgos, Mariana; Arenas, Alvaro; Cabrera, Rodrigo

2016-08-01

Inherited long QT syndrome (LQTS) is a cardiac channelopathy characterized by a prolongation of QT interval and the risk of syncope, cardiac arrest, and sudden cardiac death. Genetic diagnosis of LQTS is critical in medical practice as results can guide adequate management of patients and distinguish phenocopies such as catecholaminergic polymorphic ventricular tachycardia (CPVT). However, extensive screening of large genomic regions is required in order to reliably identify genetic causes. Semiconductor whole exome sequencing (WES) is a promising approach for the identification of variants in the coding regions of most human genes. DNA samples from 21 Colombian patients clinically diagnosed with LQTS were enriched for coding regions using multiplex polymerase chain reaction (PCR) and subjected to WES using a semiconductor sequencer. Semiconductor WES showed mean coverage of 93.6 % for all coding regions relevant to LQTS at >10× depth with high intra- and inter-assay depth heterogeneity. Fifteen variants were detected in 12 patients in genes associated with LQTS. Three variants were identified in three patients in genes associated with CPVT. Co-segregation analysis was performed when possible. All variants were analyzed with two pathogenicity prediction algorithms. The overall prevalence of LQTS and CPVT variants in our cohort was 71.4 %. All LQTS variants previously identified through commercial genetic testing were identified. Standardized WES assays can be easily implemented, often at a lower cost than sequencing panels. Our results show that WES can identify LQTS-causing mutations and permits differential diagnosis of related conditions in a real-world clinical setting. However, high heterogeneity in sequencing depth and low coverage in the most relevant genes is expected to be associated with reduced analytical sensitivity.
Ultrasensitive Genotypic Detection of Antiviral Resistance in Hepatitis B Virus Clinical Isolates

PubMed Central

Fang, Jie; Wichroski, Michael J.; Levine, Steven M.; Baldick, Carl J.; Mazzucco, Charles E.; Walsh, Ann W.; Kienzle, Bernadette K.; Rose, Ronald E.; Pokornowski, Kevin A.; Colonno, Richard J.; Tenney, Daniel J.

2009-01-01

Amino acid substitutions that confer reduced susceptibility to antivirals arise spontaneously through error-prone viral polymerases and are selected as a result of antiviral therapy. Resistance substitutions first emerge in a fraction of the circulating virus population, below the limit of detection by nucleotide sequencing of either the population or limited sets of cloned isolates. These variants can expand under drug pressure to dominate the circulating virus population. To enhance detection of these viruses in clinical samples, we established a highly sensitive quantitative, real-time allele-specific PCR assay for hepatitis B virus (HBV) DNA. Sensitivity was accomplished using a high-fidelity DNA polymerase and oligonucleotide primers containing locked nucleic acid bases. Quantitative measurement of resistant and wild-type variants was accomplished using sequence-matched standards. Detection methodology that was not reliant on hybridization probes, and assay modifications, minimized the effect of patient-specific sequence polymorphisms. The method was validated using samples from patients chronically infected with HBV through parallel sequencing of large numbers of cloned isolates. Viruses with resistance to lamivudine and other l-nucleoside analogs and entecavir, involving 17 different nucleotide substitutions, were reliably detected at levels at or below 0.1% of the total population. The method worked across HBV genotypes. Longitudinal analysis of patient samples showed earlier emergence of resistance on therapy than was seen with sequencing methodologies, including some cases of resistance that existed prior to treatment. In summary, we established and validated an ultrasensitive method for measuring resistant HBV variants in clinical specimens, which enabled earlier, quantitative measurement of resistance to therapy. PMID:19433559
Mitochondrial cytochrome c oxidase subunit 1 gene and nuclear rDNA regions of Enterobius vermicularis parasitic in captive chimpanzees with special reference to its relationship with pinworms in humans.

PubMed

Nakano, Tadao; Okamoto, Munehiro; Ikeda, Yatsukaho; Hasegawa, Hideo

2006-12-01

Sequences of the mitochondrial cytochrome c oxidase subunit 1 (CO1) gene, the nuclear internal transcribed spacer 2 (ITS2) region of ribosomal DNA (rDNA), and 5S rDNA of Enterobius vermicularis from captive chimpanzees in five zoos/institutions in Japan were analyzed and compared with those of pinworm eggs from humans in Japan. Three major types of variants appearing in both CO1 and ITS2 sequences, but showing no apparent connection, were observed among materials collected from the chimpanzees. Each one of them was also observed in pinworms in humans. Sequences of 5S rDNA were identical in the materials from chimpanzees and humans. Phylogenetic analysis of the CO1 gene revealed three clusters with high bootstrap values, suggesting that considerable divergence, presumably correlated with human evolution, has occurred in the human pinworms. The synonymy of E. gregorii with E. vermicularis is supported by the molecular evidence.
Next-generation DNA sequencing identifies novel gene variants and pathways involved in specific language impairment

PubMed Central

Chen, Xiaowei Sylvia; Reader, Rose H.; Hoischen, Alexander; Veltman, Joris A.; Simpson, Nuala H.; Francks, Clyde; Newbury, Dianne F.; Fisher, Simon E.

2017-01-01

A significant proportion of children have unexplained problems acquiring proficient linguistic skills despite adequate intelligence and opportunity. Developmental language disorders are highly heritable with substantial societal impact. Molecular studies have begun to identify candidate loci, but much of the underlying genetic architecture remains undetermined. We performed whole-exome sequencing of 43 unrelated probands affected by severe specific language impairment, followed by independent validations with Sanger sequencing, and analyses of segregation patterns in parents and siblings, to shed new light on aetiology. By first focusing on a pre-defined set of known candidates from the literature, we identified potentially pathogenic variants in genes already implicated in diverse language-related syndromes, including ERC1, GRIN2A, and SRPX2. Complementary analyses suggested novel putative candidates carrying validated variants which were predicted to have functional effects, such as OXR1, SCN9A and KMT2D. We also searched for potential “multiple-hit” cases; one proband carried a rare AUTS2 variant in combination with a rare inherited haplotype affecting STARD9, while another carried a novel nonsynonymous variant in SEMA6D together with a rare stop-gain in SYNPR. On broadening scope to all rare and novel variants throughout the exomes, we identified biological themes that were enriched for such variants, including microtubule transport and cytoskeletal regulation. PMID:28440294
Discovery, genotyping and characterization of structural variation and novel sequence at single nucleotide resolution from de novo genome assemblies on a population scale.

PubMed

Liu, Siyang; Huang, Shujia; Rao, Junhua; Ye, Weijian; Krogh, Anders; Wang, Jun

2015-01-01

Comprehensive recognition of genomic variation in one individual is important for understanding disease and developing personalized medication and treatment. Many tools based on DNA re-sequencing exist for identification of single nucleotide polymorphisms, small insertions and deletions (indels) as well as large deletions. However, these approaches consistently display a substantial bias against the recovery of complex structural variants and novel sequence in individual genomes and do not provide interpretation information such as the annotation of ancestral state and formation mechanism. We present a novel approach implemented in a single software package, AsmVar, to discover, genotype and characterize different forms of structural variation and novel sequence from population-scale de novo genome assemblies up to nucleotide resolution. Application of AsmVar to several human de novo genome assemblies captures a wide spectrum of structural variants and novel sequences present in the human population in high sensitivity and specificity. Our method provides a direct solution for investigating structural variants and novel sequences from de novo genome assemblies, facilitating the construction of population-scale pan-genomes. Our study also highlights the usefulness of the de novo assembly strategy for definition of genome structure.
Base-Calling Algorithm with Vocabulary (BCV) Method for Analyzing Population Sequencing Chromatograms

PubMed Central

Fantin, Yuri S.; Neverov, Alexey D.; Favorov, Alexander V.; Alvarez-Figueroa, Maria V.; Braslavskaya, Svetlana I.; Gordukova, Maria A.; Karandashova, Inga V.; Kuleshov, Konstantin V.; Myznikova, Anna I.; Polishchuk, Maya S.; Reshetov, Denis A.; Voiciehovskaya, Yana A.; Mironov, Andrei A.; Chulanov, Vladimir P.

2013-01-01

Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications including molecular diagnostics. However, sequencing mixtures of similar DNA of pathogens with this method is challenging. This is important because most clinical samples contain such mixtures, rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method that computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function on a test dataset of chromatograms without ambiguous positions, as well as one with 3–14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations mixed with wild-type bacteria in the pncA gene, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing. PMID:23382983
The organization and expression of the mdm2 gene.

PubMed

de Oca Luna, R M; Tabor, A D; Eberspaecher, H; Hulboy, D L; Worth, L L; Colman, M S; Finlay, C A; Lozano, G

1996-05-01

The mdm2 gene encodes a zinc finger protein that negatively regulates p53 function by binding and masking the p53 transcriptional activation domain. Two different promoters control expression of mdm2, one of which is also transactivated by p53. We cloned and characterized the mdm2 gene from a murine 129 library. It contained at least 12 exons and spanned approximately 25 kb of DNA. Sequencing of the mdm2 gene revealed three nucleotide differences that resulted in amino acid substitutions in the previously published mdm2 sequence. Sequencing of normal BalbC/J DNA and the original cosmid clone isolated from the 3T3DM cell line revealed that they are identical, suggesting that the published sequence is in error at these three positions. In addition, we analyzed the expression pattern of mdm2 and found ubiquitous low-level expression throughout embryo development and in adult tissues. Analysis of mRNA from numerous tissues for several mdm2 spliced variants that had been identified in the transformed 3T3DM cell line revealed that these variants could not be detected in the developing embryo or in adult tissues.
[Variability of nuclear 18S-25S rDNA of Gentiana lutea L. in nature and in tissue culture in vitro].

PubMed

Mel'nyk, V M; Spiridonova, K V; Andrieiev, I O; Strashniuk, N M; Kunakh, V A

2004-01-01

The 18S-25S rDNA sequence in the genomes of G. lutea plants from different natural populations and from tissue culture was studied by blot hybridization. The ribosomal repeats are represented by variants that differ in size and in the presence of an additional HindIII restriction site. The genome of an individual plant usually possesses several variants of the DNA repeats. Populations differ in the quantitative ratio of these variants and in the presence of some of them. In callus tissues, modifications of the range of rDNA repeats were observed relative to plants of the initial population, but they did not exceed intraspecific variability. The non-randomness of genome modifications during cell adaptation to in vitro conditions makes it possible, to some extent, to forecast these modifications in tissue culture.
Ubiquitous and gene-specific regulatory 5' sequences in a sea urchin histone DNA clone coding for histone protein variants.

PubMed Central

Busslinger, M; Portmann, R; Irminger, J C; Birnstiel, M L

1980-01-01

The DNA sequences of the entire structural H4, H3, H2A and H2B genes and of their 5' flanking regions have been determined in the histone DNA clone h19 of the sea urchin Psammechinus miliaris. In clone h19 the polarity of transcription and the relative arrangement of the histone genes are identical to those in clone h22 of the same species. The histone proteins encoded by h19 DNA differ in their primary structure from those encoded by clone h22 and have been compared to histone protein sequences of other sea urchin species as well as other eukaryotes. A comparative analysis of the 5' flanking DNA sequences of the structural histone genes in both clones revealed four ubiquitous sequence motifs: a pentameric element GATCC, followed at short distance by the Hogness box GTATAAATAG, a conserved sequence PyCATTCPu, in or near which the 5' ends of the mRNAs map in h22 DNA, and lastly a sequence A, containing the initiation codon. These sequences are also found, sometimes in modified version, in front of other eukaryotic genes transcribed by polymerase II. When prelude sequences of isocoding histone genes in clones h19 and h22 are compared, areas of homology are seen to extend beyond the ubiquitous sequence motifs towards the divergent AT-rich spacer and terminate between approximately 140 and 240 nucleotides away from the structural gene. These prelude regions contain quite large conservative sequence blocks which are specific for each type of histone gene. PMID:7443547
Assessing pathogenicity for novel mutation/sequence variants: the value of healthy older individuals.

PubMed

Zatz, Mayana; Pavanello, Rita de Cassia M; Lourenço, Naila Cristina V; Cerqueira, Antonia; Lazar, Monize; Vainzof, Mariz

2012-12-01

Improvement in DNA technology is increasingly revealing unexpected/unknown mutations in healthy persons and generating anxiety due to their still unknown health consequences. We report a healthy 44-year-old father and his 10-year-old daughter, who has bilateral coloboma and hearing loss but no muscle weakness and in whom whole-genome CGH revealed a deletion of exons 38-44 in the dystrophin gene. The mutation was inherited from her asymptomatic father, who was further evaluated clinically and molecularly for prognosis and genetic counseling (GC). We have never identified this deletion in 982 Duchenne/Becker patients. To assess whether the present case represents a rare case of non-penetrance, and aiming to obtain more information for prognosis and GC, we suggested that healthy older relatives submit their DNA for analysis, and several complied. Mutation analysis revealed that his mother, brother, and 56-year-old maternal uncle also carry the 38-44 deletion, suggesting that it is an unlikely cause of muscle weakness. Genome sequencing will disclose mutations and variants whose health impact is still unknown, raising important problems in interpreting results, defining prognosis, and discussing GC. We suggest that, in addition to family history, keeping the DNA of older relatives could be very informative, in particular for those interested in having their genome sequenced.

Targeted next-generation sequencing for differential diagnosis of neurofibromatosis type 2, schwannomatosis, and meningiomatosis.

PubMed

Louvrier, Camille; Pasmant, Eric; Briand-Suleau, Audrey; Cohen, Joëlle; Nitschké, Patrick; Nectoux, Juliette; Orhant, Lucie; Zordan, Cécile; Goizet, Cyril; Goutagny, Stéphane; Lallemand, Dominique; Vidaud, Michel; Vidaud, Dominique; Kalamarides, Michel; Parfait, Béatrice

2018-06-18

Clinical overlap between neurofibromatosis type 2 (NF2), schwannomatosis, and meningiomatosis can make clinical diagnosis difficult. Hence, molecular investigation of germline and tumor tissues may improve the diagnosis. We present the targeted next-generation sequencing (NGS) of NF2, SMARCB1, LZTR1, SMARCE1, and SUFU tumor suppressor genes, using an amplicon-based approach. We analyzed blood DNA from a cohort of 196 patients, including patients with NF2 (N = 79), schwannomatosis (N = 40), meningiomatosis (N = 12), and no clearly established diagnosis (N = 65). Matched tumor DNA was analyzed when available. Forty-seven NF2-/SMARCB1-negative schwannomatosis patients and 27 NF2-negative meningiomatosis patients were also evaluated. A NF2 variant was found in 41/79 (52%) NF2 patients. SMARCB1 or LZTR1 variants were identified in 5/40 (12.5%) and 13/40 (∼32%) patients in the schwannomatosis cohort. Potentially pathogenic variants were found in 12/65 (18.5%) patients with no clearly established diagnosis. A LZTR1 variant was identified in 16/47 (34%) NF2/SMARCB1-negative schwannomatosis patients. A SMARCE1 variant was found in 3/39 (∼8%) meningiomatosis patients. No SUFU variant was found in the cohort. NGS was an effective and sensitive method to detect mutant alleles in blood or tumor DNA of mosaic NF2 patients. Interestingly, we identified a 4-hit mechanism resulting in the complete NF2 loss-of-function combined with SMARCB1 and LZTR1 haploinsufficiency in two-thirds of tumors from NF2 patients. Simultaneous investigation of NF2, SMARCB1, LZTR1, and SMARCE1 is a key element in the differential diagnosis of NF2, schwannomatosis, and meningiomatosis. The targeted NGS strategy is suitable for the identification of NF2 mosaicism in blood and for the investigation of tumors from these patients.
Two Novel Variants Affecting CDKL5 Transcript Associated with Epileptic Encephalopathy.

PubMed

Neupauerová, Jana; Štěrbová, Katalin; Vlčková, Markéta; Sebroňová, Věra; Maříková, Tat'ána; Krůtová, Marcela; David, Staněk; Kršek, Pavel; Žaliová, Markéta; Seeman, Pavel; Laššuthová, Petra

2017-10-01

Variants in the human X-linked cyclin-dependent kinase-like 5 (CDKL5) gene have been reported as being etiologically associated with early infantile epileptic encephalopathy type 2 (EIEE2). We report on two patients, a boy and a girl, with EIEE2 who presented with early-onset epilepsy, hypotonia, severe intellectual disability, and poor eye contact. Massively parallel sequencing (MPS) of a custom-designed gene panel for epilepsy and epileptic encephalopathy containing 112 epilepsy-related genes was performed. Sanger sequencing was used to confirm the novel variants. To confirm the functional consequence of an intronic CDKL5 variant in patient 2, an RNA study was done. DNA sequencing revealed de novo variants in CDKL5: c.2578C>T (p.Gln860*), present in a hemizygous state in a 3-year-old boy, and a potential splice site variant, c.463+5G>A, in a heterozygous state in a 5-year-old girl. Multiple in silico splicing algorithms predicted a highly reduced splice site score for c.463+5G>A. A subsequent mRNA study confirmed an aberrant shorter transcript lacking exon 7. Our data confirmed that variants in CDKL5 are associated with EIEE2. There is credible evidence that the novel identified variants are pathogenic and, therefore, are likely the cause of the disease in the presented patients. In one of the patients a stop codon variant is predicted to produce a truncated protein, and in the other patient an intronic variant results in aberrant splicing.
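The protein-level notation reported for c.2578C>T can be checked with simple arithmetic, since a coding position maps to a codon number and a position within that codon. The sketch below is only illustrative: the function name is hypothetical and not taken from the study, and it relies solely on the standard genetic code, in which both glutamine codons (CAA, CAG) become stop codons (TAA, TAG) when their first base changes from C to T.

```python
def codon_of(cds_pos):
    """Map a 1-based coding (c.) position to (codon number, position in codon), both 1-based."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


# Standard genetic code facts used for the check.
GLN_CODONS = {"CAA", "CAG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

# c.2578 falls on the first base of codon 860.
codon_number, position_in_codon = codon_of(2578)

# Replace the first base (C) of each glutamine codon with T.
mutated_codons = {"T" + codon[1:] for codon in GLN_CODONS}

# Every resulting codon is a stop codon, consistent with p.Gln860*.
all_become_stop = mutated_codons <= STOP_CODONS
```

Because the change hits the first base of the codon, the outcome does not depend on which of the two glutamine codons is present in the transcript.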
Hemoglobin Wayne Trait with Incidental Polycythemia.

PubMed

Ambelil, Manju; Nguyen, Nghia; Dasgupta, Amitava; Risin, Semyon; Wahed, Amer

2017-01-01

Hemoglobinopathies, caused by mutations in the globin genes, are one of the most common inherited disorders. Many of the hemoglobin variants can be identified by hemoglobin analysis using conventional electrophoresis and high performance liquid chromatography; however hemoglobin DNA analysis may be necessary in other cases for confirmation. Here, we report a case of a rare alpha chain hemoglobin variant, hemoglobin Wayne, in a 47-year-old man who presented with secondary polycythemia. Capillary zone electrophoresis and high performance liquid chromatography revealed a significant amount of a hemoglobin variant, which was further confirmed by hemoglobin DNA sequencing as hemoglobin Wayne. Since the patient was not homozygous for hemoglobin Wayne, which is associated with secondary polycythemia, the laboratory diagnosis in this case was critical in ruling out hemoglobinopathy as the etiology of his polycythemia. © 2017 by the Association of Clinical Scientists, Inc.
The Application of Next-Generation Sequencing for Mutation Detection in Autosomal-Dominant Hereditary Hearing Impairment.

PubMed

Gürtler, Nicolas; Röthlisberger, Benno; Ludin, Katja; Schlegel, Christoph; Lalwani, Anil K

2017-07-01

Objective: Identification of the causative mutation using next-generation sequencing in autosomal-dominant hereditary hearing impairment, as mutation analysis in hereditary hearing impairment by classic genetic methods is hindered by the high heterogeneity of the disease. Patients: Two Swiss families with autosomal-dominant hereditary hearing impairment. Intervention: Amplified DNA libraries for next-generation sequencing were constructed from extracted genomic DNA, derived from peripheral blood, and enriched by a custom-made sequence capture library. Validated, pooled libraries were sequenced on an Illumina MiSeq instrument, 300 cycles and paired-end sequencing. Technical data analysis was performed with SeqMonk, variant analysis with GeneTalk or VariantStudio. The detection of mutations in genes related to hearing loss by next-generation sequencing was subsequently confirmed using specific polymerase-chain-reaction and Sanger sequencing. Main outcome measure: Mutation detection in hearing-loss-related genes. Results: The first family harbored the mutation c.5383+5delGTGA in the TECTA-gene. In the second family, a novel mutation c.2614-2625delCATGGCGCCGTG in the WFS1-gene and a second mutation TCOF1-c.1028G>A were identified. Conclusion: Next-generation sequencing successfully identified the causative mutation in families with autosomal-dominant hereditary hearing impairment. The results helped to clarify the pathogenic role of a known mutation and led to the detection of a novel one. NGS represents a feasible approach with great future potential in the diagnostics of hereditary hearing impairment, even in smaller labs.
Polymorphic variations in the FANCA gene in high-risk non-BRCA1/2 breast cancer individuals from the French Canadian population.

PubMed

Litim, Nadhir; Labrie, Yvan; Desjardins, Sylvie; Ouellette, Geneviève; Plourde, Karine; Belleau, Pascal; Durocher, Francine

2013-02-01

The majority of genes associated with breast cancer susceptibility, including BRCA1 and BRCA2 genes, are involved in DNA repair mechanisms. Moreover, among the genes recently associated with an increased susceptibility to breast cancer, four are Fanconi Anemia (FA) genes: FANCD1/BRCA2, FANCJ/BACH1/BRIP1, FANCN/PALB2 and FANCO/RAD51C. FANCA is implicated in DNA repair and has been shown to interact directly with BRCA1. It has been proposed that the formation of FANCA/G (dependent upon the phosphorylation of FANCA) and FANCB/L sub-complexes, together with FANCM, represents the initial step for DNA repair activation and subsequent formation of other sub-complexes leading to ubiquitination of FANCD2 and FANCI. As only approximately 25% of inherited breast cancers are attributable to BRCA1/2 mutations, FANCA therefore becomes an attractive candidate for breast cancer susceptibility. We thus analyzed the FANCA gene in 97 high-risk French Canadian non-BRCA1/2 breast cancer individuals by direct sequencing as well as in 95 healthy control individuals from the same population. Among a total of 85 sequence variants found in either or both series, 28 are coding variants and 19 of them are missense variations leading to an amino acid change. Three of the amino acid changes, namely Thr561Met, Cys625Ser and particularly Ser1088Phe, which has been previously reported to be associated with FA, are predicted to be damaging by the SIFT and PolyPhen software. cDNA amplification revealed significant expression of 4 alternative splicing events (insertion of an intronic portion of intron 10, and the skipping of exons 11, 30 and 31). In silico analyses of relevant genomic variants were performed to identify potential variations involved in the expression of these spliced transcripts. Sequence variants in FANCA could therefore be potential spoilers of the Fanconi-BRCA pathway and, as a result, could have an impact in non-BRCA1/2 breast cancer families.
Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Truncating variants in the majority of the cytoplasmic domain of PCDH15 are unlikely to cause Usher syndrome 1F.

PubMed

Perreault-Micale, Cynthia; Frieden, Alexander; Kennedy, Caleb J; Neitzel, Dana; Sullivan, Jessica; Faulkner, Nicole; Hallam, Stephanie; Greger, Valerie

2014-11-01

Loss of function variants in the PCDH15 gene can cause Usher syndrome type 1F, an autosomal recessive disease associated with profound congenital hearing loss, vestibular dysfunction, and retinitis pigmentosa. The Ashkenazi Jewish population has an increased incidence of Usher syndrome type 1F (founder variant p.Arg245X accounts for 75% of alleles), yet the variant spectrum in a panethnic population remains undetermined. We sequenced the coding region and intron-exon borders of PCDH15 using next-generation DNA sequencing technology in approximately 14,000 patients from fertility clinics. More than 600 unique PCDH15 variants (single nucleotide changes and small indels) were identified, including previously described pathogenic variants p.Arg3X, p.Arg245X (five patients), p.Arg643X, p.Arg929X, and p.Arg1106X. Novel truncating variants were also found, including one in the N-terminal extracellular domain (p.Leu877X), but all other novel truncating variants clustered in the exon 33 encoded C-terminal cytoplasmic domain (52 patients, 14 variants). One variant was observed predominantly in African Americans (carrier frequency of 2.3%). The high incidence of truncating exon 33 variants indicates that they are unlikely to cause Usher syndrome type 1F even though many remove a large portion of the gene. They may be tolerated because PCDH15 has several alternate cytoplasmic domain exons and differentially spliced isoforms may function redundantly. Effects of some PCDH15 truncating variants were addressed by deep sequencing of a panethnic population. Copyright © 2014 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Cloning of a human hepatocyte growth factor/scatter factor transcription variant from a gastric cancer cell line HSC-39.

PubMed

Yokozaki, H; Tahara, H; Oue, N; Tahara, E

2000-01-01

A new transcription variant of hepatocyte growth factor/scatter factor (HGF/SF) was cloned from human gastric cancer cell line HSC-39. Northern blot analysis of eight human gastric cancer cell lines (TMK-1, MKN-1, MKN-7, MKN-28, MKN-45, MKN-74, KATO-III and HSC-39) demonstrated that HSC-39 cells expressed a 1.3 kb abnormal HGF/SF transcript. Screening of 1 × 10^6 colonies of a cDNA library from HSC-39 constructed in the pAP3neo mammalian expression vector selected four positive clones containing HGF/SF transcript. Among them, two contained a 1.3 kbp insert detecting the identical transcript to that obtained with the HGF/SF probe by Northern blotting. Deoxynucleotide sequencing of the 1.3 kbp insert revealed that it was composed of a part of HGF/SF cDNA from exon 14 to exon 18, corresponding to the whole sequence of HGF/SF light chain, with 5' 75 nucleotides unrelated to any sequence involved in HGF/SF.
Molecular definition and the ubiquity of species in the genus Naegleria.

PubMed

De Jonckheere, Johan F

2004-03-01

To investigate the variability within species of the genus Naegleria, the ITS1, 5.8S and ITS2 rDNA were sequenced of several strains of N. lovaniensis and its Western Australian variants, N. australiensis, N. fowleri, N. andersoni, N. jamiesoni, N. tihangensis, N. pringsheimi, N. pagei, N. gruberi sensu lato and a Naegleria lineage that lost a group I intron from the SSUrDNA twintron. As a result, it is possible to define a molecular species within the Naegleria genus. In addition, one strain of each different allozyme cluster was sequenced to investigate whether they belong to described species or should be treated as distinct new species. This leads to the proposal of eleven new species. The sequencing results from those Naegleria spp. of which several strains are available indicate that these species are ubiquitous. The only exception might be the species represented by the WA variants. However, there are still many Naegleria spp. for which only one strain has been isolated; hence, it is important that the search for more isolates should be continued worldwide.
Molecular characterization of a Toxocara variant from cats in Kuala Lumpur, Malaysia.

PubMed

Zhu, X Q; Jacobs, D E; Chilton, N B; Sani, R A; Cheng, N A; Gasser, R B

1998-08-01

The ascaridoid nematode of cats from Kuala Lumpur, Malaysia, previously identified morphologically as Toxocara canis, was characterized using a molecular approach. The nuclear ribosomal DNA (rDNA) region spanning the first internal transcribed spacer (ITS-1), the 5.8S gene and the second internal transcribed spacer (ITS-2) was amplified and sequenced. The sequences for the parasite from Malaysian cats were compared with those for T. canis and T. cati. The sequence data showed that this taxon was genetically more similar to T. cati than to T. canis in the ITS-1, 5.8S and ITS-2. Differences in the ITS-1 and ITS-2 sequences between the taxa (9.4-26.1%) were markedly higher than variation between samples within T. canis and T. cati (0-2.9%). The sequence data demonstrate that the parasite from Malaysian cats is neither T. canis nor T. cati and indicate that it is a distinct species. Based on these data, PCR-linked restriction fragment length polymorphism (RFLP) and single-strand conformation polymorphism (SSCP) methods were employed for the unequivocal differentiation of the Toxocara variant from T. canis and T. cati. These methods should provide valuable tools for studying the life-cycle, transmission pattern(s) and zoonotic potential of this parasite.
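Percentage differences of the kind quoted above (9.4-26.1% between taxa, 0-2.9% within taxa) are typically obtained by comparing aligned sequences position by position. The following is a minimal sketch of such a calculation, not the authors' procedure: the function name is hypothetical, the sequences are toy strings rather than real ITS data, and skipping gapped columns is an assumption, since the abstract does not say how gaps were handled.

```python
def percent_difference(seq_a, seq_b):
    """Percentage of differing positions between two aligned sequences of equal length.

    Columns with a gap ('-') in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped columns
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


# Toy aligned sequences: 2 mismatches over 10 compared positions.
toy_difference = percent_difference("ACGTACGTAC", "ACGTTCGTAA")
```

Treating gaps as differences instead would raise the percentages, so the gap policy should be stated whenever such figures are compared across studies.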
Coval: Improving Alignment Quality and Variant Calling Accuracy for Next-Generation Sequencing Data

PubMed Central

Kosugi, Shunichi; Natsume, Satoshi; Yoshida, Kentaro; MacLean, Daniel; Cano, Liliana; Kamoun, Sophien; Terauchi, Ryohei

2013-01-01

Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/. PMID:24116042
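The filtering idea described above, discarding reads that still carry too many mismatches against the reference, can be illustrated with a short sketch. This is not Coval's code: the class, the function names and the threshold of two mismatches are hypothetical, reads are modelled as ungapped strings with a start coordinate rather than SAM/BAM records, and the realignment and error-correction steps are omitted.

```python
from typing import List, NamedTuple


class AlignedRead(NamedTuple):
    name: str
    ref_start: int  # 0-based start of the alignment on the reference
    sequence: str   # read bases, assumed aligned without gaps


def count_mismatches(read: AlignedRead, reference: str) -> int:
    """Count positions where the read differs from the reference segment it covers."""
    segment = reference[read.ref_start:read.ref_start + len(read.sequence)]
    return sum(1 for read_base, ref_base in zip(read.sequence, segment)
               if read_base != ref_base)


def filter_reads(reads: List[AlignedRead], reference: str,
                 max_mismatches: int = 2) -> List[AlignedRead]:
    """Keep only reads with at most max_mismatches mismatches (hypothetical threshold)."""
    return [read for read in reads
            if count_mismatches(read, reference) <= max_mismatches]


# Toy reference and reads, for illustration only.
reference = "ACGTACGTACGTACGT"
reads = [
    AlignedRead("r1", 0, "ACGTACGT"),  # perfect match
    AlignedRead("r2", 4, "ACGAACGT"),  # one mismatch
    AlignedRead("r3", 8, "TTTTACGT"),  # three mismatches, removed
]
kept_names = [read.name for read in filter_reads(reads, reference)]
```

A real pipeline would also weigh base quality and allele frequency at the mismatched positions, as the abstract notes; a fixed mismatch count is only the simplest form of the idea.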
MIG-seq: an effective PCR-based method for genome-wide single-nucleotide polymorphism genotyping using the next-generation sequencing platform

PubMed Central

Suyama, Yoshihisa; Matsuki, Yu

2015-01-01

Restriction-enzyme (RE)-based next-generation sequencing methods have revolutionized marker-assisted genetic studies; however, the use of REs has limited their widespread adoption, especially in field samples with low-quality DNA and/or small quantities of DNA. Here, we developed a PCR-based procedure to construct reduced representation libraries without RE digestion steps, representing de novo single-nucleotide polymorphism discovery, and its genotyping using next-generation sequencing. Using multiplexed inter-simple sequence repeat (ISSR) primers, thousands of genome-wide regions were amplified effectively from a wide variety of genomes, without prior genetic information. We demonstrated: 1) Mendelian gametic segregation of the discovered variants; 2) reproducibility of genotyping by checking its applicability for individual identification; and 3) applicability in a wide variety of species by checking standard population genetic analysis. This approach, called multiplexed ISSR genotyping by sequencing, should be applicable to many marker-assisted genetic studies with a wide range of DNA qualities and quantities. PMID:26593239
Increased Sensitivity of Diagnostic Mutation Detection by Re-analysis Incorporating Local Reassembly of Sequence Reads.

PubMed

Watson, Christopher M; Camm, Nick; Crinnion, Laura A; Clokie, Samuel; Robinson, Rachel L; Adlard, Julian; Charlton, Ruth; Markham, Alexander F; Carr, Ian M; Bonthron, David T

2017-12-01

Diagnostic genetic testing programmes based on next-generation DNA sequencing have resulted in the accrual of large datasets of targeted raw sequence data. Most diagnostic laboratories process these data through an automated variant-calling pipeline. Validation of the chosen analytical methods typically depends on confirming the detection of known sequence variants. Despite improvements in short-read alignment methods, current pipelines are known to be comparatively poor at detecting large insertion/deletion mutations. We performed clinical validation of a local reassembly tool, ABRA (assembly-based realigner), through retrospective reanalysis of a cohort of more than 2000 hereditary cancer cases. ABRA enabled detection of a 96-bp deletion, 4-bp insertion mutation in PMS2 that had been initially identified using a comparative read-depth approach. We applied an updated pipeline incorporating ABRA to the entire cohort of 2000 cases and identified one previously undetected pathogenic variant, a 23-bp duplication in PTEN. We demonstrate the effect of read length on the ability to detect insertion/deletion variants by comparing HiSeq2500 (2 × 101-bp) and NextSeq500 (2 × 151-bp) sequence data for a range of variants and thereby show that the limitations of shorter read lengths can be mitigated using appropriate informatics tools. This work highlights the need for ongoing development of diagnostic pipelines to maximize test sensitivity. We also draw attention to the large differences in computational infrastructure required to perform day-to-day versus large-scale reprocessing tasks.
A systematic approach to assessing the clinical significance of genetic variants.

PubMed

Duzkale, H; Shen, J; McLaughlin, H; Alfares, A; Kelly, M A; Pugh, T J; Funke, B H; Rehm, H L; Lebo, M S

2013-11-01

Molecular genetic testing informs diagnosis, prognosis, and risk assessment for patients and their family members. Recent advances in low-cost, high-throughput DNA sequencing and computing technologies have enabled the rapid expansion of genetic test content, resulting in dramatically increased numbers of DNA variants identified per test. To address this challenge, our laboratory has developed a systematic approach to thorough and efficient assessments of variants for pathogenicity determination. We first search for existing data in publications and databases including internal, collaborative and public resources. We then perform full evidence-based assessments through statistical analyses of observations in the general population and disease cohorts, evaluation of experimental data from in vivo or in vitro studies, and computational predictions of potential impacts of each variant. Finally, we weigh all evidence to reach an overall conclusion on the potential for each variant to be disease causing. In this report, we highlight the principles of variant assessment, address the caveats and pitfalls, and provide examples to illustrate the process. By sharing our experience and providing a framework for variant assessment, including access to a freely available customizable tool, we hope to help move towards standardized and consistent approaches to variant assessment. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Genetic analysis of human immunodeficiency virus type 1 envelope V3 region isolates from mothers and infants after perinatal transmission.

PubMed Central

Ahmad, N; Baroudy, B M; Baker, R C; Chappey, C

1995-01-01

The human immunodeficiency virus type 1 (HIV-1) sequences from variable region 3 (V3) of the envelope gene were analyzed from seven infected mother-infant pairs following perinatal transmission. The V3 region sequences directly derived from the DNA of the uncultured peripheral blood mononuclear cells from infected mothers displayed a heterogeneous population. In contrast, the infants' sequences were less diverse than those of their mothers. In addition, the sequences from the younger infants' peripheral blood mononuclear cell DNA were more homogeneous than the older infants' sequences. All infants' sequences were different but displayed patterns similar to those seen in their mothers. In the mother-infant pair sequences analyzed, a minor genotype or subtype found in the mothers predominated in their infants. The conserved N-linked glycosylation site proximal to the first cysteine of the V3 loop was absent only in one infant's sequence set and in some variants of two other infants' sequences. Furthermore, the HIV-1 sequences of the epidemiologically linked mother-infant pairs were closer than the sequences of epidemiologically unlinked individuals, suggesting that the sequence comparison of mother-infant pairs done in order to identify genetic variants transmitted from mother to infant could be performed even in older infants. There was no evidence for transmission of a major genotype or multiple genotypes from mother to infant. In conclusion, a minor genotype of maternal virus is transmitted to the infants, and this finding could be useful in developing strategies to prevent maternal transmission of HIV-1 by means of perinatal interventions. PMID:7815476
Molecular identification and functional expression of mu 3, a novel alternatively spliced variant of the human mu opiate receptor gene.

PubMed

Cadet, Patrick; Mantione, Kirk J; Stefano, George B

2003-05-15

Studies from our laboratory have revealed a novel mu opiate receptor, mu 3, which is expressed in both vascular tissues and leukocytes. The mu 3 receptor is selective for opiate alkaloids and is insensitive to opioid peptides. We now identify the mu 3 receptor at the molecular level using a 441-bp conserved region of the mu 1 receptor. Sequence analysis of the isolated cDNA suggests that it is a novel, alternatively spliced variant of the mu opiate receptor gene. To determine whether protein expressed from this cDNA exhibits the biochemical characteristics expected of the mu 3 receptor, the cDNA clone was expressed in a heterologous system. At the functional level, COS-1 cells transfected with the mu 3 receptor cDNA exhibited dose-dependent release of NO following treatment with morphine, but not opioid peptides (i.e., Met-enkephalin). Naloxone was able to block the effect of morphine on COS-1 transfected cells. Nontransfected COS-1 cells did not produce NO in the presence of morphine or the opioid peptides at similar concentrations. Receptor binding analysis with [(3)H]dihydromorphine further supports the opiate alkaloid selectivity and opioid peptide insensitivity of this receptor. These data suggest that this new mu opiate receptor cDNA encodes the mu 3 opiate receptor, since it exhibits biochemical characteristics known to be unique to this receptor (opiate alkaloid selective and opioid peptide insensitive). Furthermore, using Northern blot, RT-PCR, and sequence analysis, we have demonstrated the expression of this new mu variant in human vascular tissue, mononuclear cells, polymorphonuclear cells, and human neuroblastoma cells.
DNA Sequence Variants in PPARGC1A, a Gene Encoding a Coactivator of the ω-3 LCPUFA Sensing PPAR-RXR Transcription Complex, Are Associated with NV AMD and AMD-Associated Loci in Genes of Complement and VEGF Signaling Pathways

PubMed Central

SanGiovanni, John Paul; Chen, Jing; Sapieha, Przemyslaw; Aderman, Christopher M.; Stahl, Andreas; Clemons, Traci E.; Chew, Emily Y.; Smith, Lois E. H.

2013-01-01

Background: Increased intake of ω-3 long-chain polyunsaturated fatty acids (LCPUFAs) and use of peroxisome proliferator activator receptor (PPAR)-activating drugs are associated with attenuation of pathologic retinal angiogenesis. ω-3 LCPUFAs are endogenous agonists of PPARs. We postulated that DNA sequence variation in PPAR gamma (PPARG) co-activator 1 alpha (PPARGC1A), a gene encoding a co-activator of the LCPUFA-sensing PPARG-retinoid X receptor (RXR) transcription complex, may influence neovascularization (NV) in age-related macular degeneration (AMD). Methods: We applied exact testing methods to examine distributions of DNA sequence variants in PPARGC1A for association with NV AMD and interaction of AMD-associated loci in genes of complement, lipid metabolism, and VEGF signaling systems. Our sample contained 1858 people from 3 elderly cohorts of western European ancestry. We concurrently investigated retinal gene expression profiles in 17-day-old neonatal mice on a 2% LCPUFA feeding paradigm to identify LCPUFA-regulated genes both associated with pathologic retinal angiogenesis and known to interact with PPARs or PPARGC1A. Results: A DNA coding variant (rs3736265) and a 3'UTR-resident regulatory variant (rs3774923) in PPARGC1A were independently associated with NV AMD (exact P = 0.003, both SNPs). SNP-SNP interactions existed for NV AMD (P<0.005) with rs3736265 and an AMD-associated variant in complement factor B (CFB, rs512559). PPARGC1A influences activation of the AMD-associated complement component 3 (C3) promoter fragment and CFB influences activation and proteolysis of C3. We observed interaction (P≤0.003) of rs3736265 with a variant in vascular endothelial growth factor A (VEGFA, rs3025033), a key molecule in retinal angiogenesis. Another PPARGC1A coding variant (rs8192678) showed statistical interaction with a SNP in the VEGFA receptor fms-related tyrosine kinase 1 (FLT1, rs10507386; P≤0.003). C3 expression was down-regulated 2-fold in retinas of ω-3 LCPUFA-fed mice; these animals also showed a 70% reduction in retinal NV (P≤0.001). Conclusion: Ligands and co-activators of the ω-3 LCPUFA sensing PPAR-RXR axis may influence retinal angiogenesis in NV AMD via the complement and VEGF signaling systems. We have linked the co-activator of a lipid-sensing transcription factor (PPARG co-activator 1 alpha, PPARGC1A) to age-related macular degeneration (AMD) and AMD-associated genes. PMID:23335958
Structural Basis for the Altered PAM Recognition by Engineered CRISPR-Cpf1.

PubMed

Nishimasu, Hiroshi; Yamano, Takashi; Gao, Linyi; Zhang, Feng; Ishitani, Ryuichiro; Nureki, Osamu

2017-07-06

The RNA-guided Cpf1 nuclease cleaves double-stranded DNA targets complementary to the CRISPR RNA (crRNA), and it has been harnessed for genome editing technologies. Recently, Acidaminococcus sp. BV3L6 (AsCpf1) was engineered to recognize altered DNA sequences as the protospacer adjacent motif (PAM), thereby expanding the target range of Cpf1-mediated genome editing. Whereas wild-type AsCpf1 recognizes the TTTV PAM, the RVR (S542R/K548V/N552R) and RR (S542R/K607R) variants can efficiently recognize the TATV and TYCV PAMs, respectively. However, their PAM recognition mechanisms remained unknown. Here we present the 2.0 Å resolution crystal structures of the RVR and RR variants bound to a crRNA and its target DNA. The structures revealed that the RVR and RR variants primarily recognize the PAM-complementary nucleotides via the substituted residues. Our high-resolution structures delineated the altered PAM recognition mechanisms of the AsCpf1 variants, providing a basis for the further engineering of CRISPR-Cpf1. Copyright © 2017 Elsevier Inc. All rights reserved.
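The PAM names in this abstract use IUPAC degenerate nucleotide codes (V = A, C, or G; Y = C or T). As a minimal sketch, the stated preferences can be turned into a simple lookup that reports which AsCpf1 form lists a given 4-nt site as a PAM. The function and dictionary names are hypothetical, and the yes/no matching is a simplification: the abstract describes efficient recognition, not an absolute rule, and the sketch says nothing about activity on other sequences.

```python
# IUPAC degenerate nucleotide codes needed for the PAMs named in the abstract.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "V": "ACG",   # any base except T
    "Y": "CT",    # pyrimidine
}

# PAM preferences as stated in the abstract (labels are illustrative).
PAMS = {"WT": "TTTV", "RVR": "TATV", "RR": "TYCV"}


def matches_pam(site: str, pam: str) -> bool:
    """Return True if a concrete site (A/C/G/T only) fits a degenerate PAM."""
    site = site.upper()
    if len(site) != len(pam):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, pam))


def variants_recognizing(site: str) -> list[str]:
    """List the AsCpf1 forms whose stated PAM pattern matches the site."""
    return [name for name, pam in PAMS.items() if matches_pam(site, pam)]


if __name__ == "__main__":
    for site in ("TTTA", "TATC", "TCCG", "TTCA", "TTTT"):
        print(site, variants_recognizing(site))
```

Under these stated patterns the three PAM sets do not overlap, which is why each example site maps to at most one form.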
Identification of Five Novel Variants in Chinese Oculocutaneous Albinism by Targeted Next-Generation Sequencing.

PubMed

Qiu, Biyuan; Ma, Tao; Peng, Chunyan; Zheng, Xiaoqin; Yang, Jiyun

2018-04-01

The diagnosis of oculocutaneous albinism (OCA) is established using clinical signs and symptoms. OCA is, however, a highly genetically heterogeneous disease with mutations identified in at least nineteen unique genes, many of which produce overlapping phenotypic traits. Thus, differentiating genetic OCA subtypes for diagnosis and genetic counseling is challenging on the basis of clinical presentation alone and would benefit from a comprehensive molecular diagnostic. We aimed to develop and validate a more comprehensive, targeted, next-generation-sequencing-based diagnostic for the identification of OCA-causing variants. The genomic DNA samples from 28 OCA probands were analyzed by targeted next-generation sequencing (NGS), and the candidate variants were confirmed through Sanger sequencing. We observed mutations in the TYR, OCA2, and SLC45A2 genes in 25/28 (89%) patients with OCA. We identified 38 pathogenic variants among these three genes, including 5 novel variants: c.1970G>T (p.Gly657Val), c.1669A>C (p.Thr557Pro), c.2339-2A>C, and c.1349C>G (p.Thr450Arg) in OCA2; and c.459_470delTTTTGCTGCCGA (p.Ala155_Phe158del) in SLC45A2. Our findings expand the mutational spectrum of OCA in the Chinese population, and the assay we developed should be broadly useful as a molecular diagnostic and as an aid for genetic counseling for OCA patients.
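The SLC45A2 variant above, c.459_470delTTTTGCTGCCGA, is reported with a protein-level consequence of a four-residue deletion (p.Ala155_Phe158del). A quick consistency check is that the deleted length is a multiple of three. The sketch below parses only simple coding-DNA deletions of the form c.N_MdelSEQ; the function names and the regular expression are illustrative assumptions, not a full HGVS parser, and intronic or offset positions are deliberately not handled.

```python
import re

# Matches simple exonic deletions such as "c.459_470delTTTTGCTGCCGA" or "c.883_885del".
_SIMPLE_DEL = re.compile(r"^c\.(\d+)(?:_(\d+))?del([ACGT]*)$")


def deletion_length(hgvs_c: str) -> int:
    """Return the number of deleted bases for a simple c. deletion string."""
    match = _SIMPLE_DEL.match(hgvs_c)
    if match is None:
        raise ValueError(f"not a simple coding deletion: {hgvs_c!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    length = end - start + 1
    deleted_seq = match.group(3)
    # If the deleted bases are spelled out, they must agree with the coordinates.
    if deleted_seq and len(deleted_seq) != length:
        raise ValueError("deleted sequence length disagrees with coordinates")
    return length


def is_in_frame(length: int) -> bool:
    """A deletion preserves the reading frame when its length is a multiple of 3."""
    return length % 3 == 0


if __name__ == "__main__":
    n = deletion_length("c.459_470delTTTTGCTGCCGA")
    print(n, is_in_frame(n), n // 3)
```

For this variant the check gives 12 deleted bases, which is in frame and corresponds to four codons, in agreement with the four residues named in p.Ala155_Phe158del.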
Feline hypersomatotropism and acromegaly tumorigenesis: a potential role for the AIP gene.

PubMed

Scudder, C J; Niessen, S J; Catchpole, B; Fowkes, R C; Church, D B; Forcada, Y

2017-04-01

Acromegaly in humans is usually sporadic; however, up to 20% of familial isolated pituitary adenomas are caused by germline sequence variants of the aryl-hydrocarbon-receptor interacting protein (AIP) gene. Feline acromegaly has similarities to human acromegalic families with AIP mutations. The aim of this study was to sequence the feline AIP gene, identify sequence variants and compare the AIP gene sequence between acromegalic and control cats, and in acromegalic siblings. The feline AIP gene was amplified through PCR using whole blood genomic DNA from 10 acromegalic and 10 control cats, and 3 sibling pairs affected by acromegaly. PCR products were sequenced and compared with the published predicted feline AIP gene. A single nonsynonymous SNP was identified in exon 1 (AIP:c.9T>G) of two acromegalic cats and none of the control cats, as well as both members of one sibling pair. The region of this SNP is considered essential for the interaction of the AIP protein with its receptor. This sequence variant has not previously been reported in humans. Two additional synonymous sequence variants were identified (AIP:c.481C>T and AIP:c.826C>T). This is the first molecular study to investigate a potential genetic cause of feline acromegaly, and it identified a nonsynonymous AIP single nucleotide polymorphism in 20% of the acromegalic cat population evaluated, as well as in one of the sibling pairs evaluated. Copyright © 2016 Elsevier Inc. All rights reserved.
Novel pedigree analysis implicates DNA repair and chromatin remodeling in multiple myeloma risk

PubMed Central

Curtin, Karen; Rajamanickam, Venkatesh; Jayabalan, David; Atanackovic, Djordje; Rajkumar, S. Vincent; Kumar, Shaji; Slager, Susan; Galia, Perrine; Demangel, Delphine; Salama, Mohamed; Joseph, Vijai; Lipkin, Steven M.; Dumontet, Charles; Vachon, Celine M.

2018-01-01

The high-risk pedigree (HRP) design is an established strategy to discover rare, highly-penetrant, Mendelian-like causal variants. Its success, however, in complex traits has been modest, largely due to challenges of genetic heterogeneity and complex inheritance models. We describe a HRP strategy that addresses intra-familial heterogeneity, and identifies inherited segments important for mapping regulatory risk. We apply this new Shared Genomic Segment (SGS) method in 11 extended, Utah, multiple myeloma (MM) HRPs, and subsequent exome sequencing in SGS regions of interest in 1063 MM / MGUS (monoclonal gammopathy of undetermined significance–a precursor to MM) cases and 964 controls from a jointly-called collaborative resource, including cases from the initial 11 HRPs. One genome-wide significant 1.8 Mb shared segment was found at 6q16. Exome sequencing in this region revealed predicted deleterious variants in USP45 (p.Gln691* and p.Gln621Glu), a gene known to influence DNA repair through endonuclease regulation. Additionally, a 1.2 Mb segment at 1p36.11 is inherited in two Utah HRPs, with coding variants identified in ARID1A (p.Ser90Gly and p.Met890Val), a key gene in the SWI/SNF chromatin remodeling complex. Our results provide compelling statistical and genetic evidence for segregating risk variants for MM. In addition, we demonstrate a novel strategy to use large HRPs for risk-variant discovery more generally in complex traits. PMID:29389935

Novel pedigree analysis implicates DNA repair and chromatin remodeling in multiple myeloma risk.

PubMed

Waller, Rosalie G; Darlington, Todd M; Wei, Xiaomu; Madsen, Michael J; Thomas, Alun; Curtin, Karen; Coon, Hilary; Rajamanickam, Venkatesh; Musinsky, Justin; Jayabalan, David; Atanackovic, Djordje; Rajkumar, S Vincent; Kumar, Shaji; Slager, Susan; Middha, Mridu; Galia, Perrine; Demangel, Delphine; Salama, Mohamed; Joseph, Vijai; McKay, James; Offit, Kenneth; Klein, Robert J; Lipkin, Steven M; Dumontet, Charles; Vachon, Celine M; Camp, Nicola J

2018-02-01

The high-risk pedigree (HRP) design is an established strategy to discover rare, highly-penetrant, Mendelian-like causal variants. Its success, however, in complex traits has been modest, largely due to challenges of genetic heterogeneity and complex inheritance models. We describe a HRP strategy that addresses intra-familial heterogeneity, and identifies inherited segments important for mapping regulatory risk. We apply this new Shared Genomic Segment (SGS) method in 11 extended, Utah, multiple myeloma (MM) HRPs, and subsequent exome sequencing in SGS regions of interest in 1063 MM / MGUS (monoclonal gammopathy of undetermined significance-a precursor to MM) cases and 964 controls from a jointly-called collaborative resource, including cases from the initial 11 HRPs. One genome-wide significant 1.8 Mb shared segment was found at 6q16. Exome sequencing in this region revealed predicted deleterious variants in USP45 (p.Gln691* and p.Gln621Glu), a gene known to influence DNA repair through endonuclease regulation. Additionally, a 1.2 Mb segment at 1p36.11 is inherited in two Utah HRPs, with coding variants identified in ARID1A (p.Ser90Gly and p.Met890Val), a key gene in the SWI/SNF chromatin remodeling complex. Our results provide compelling statistical and genetic evidence for segregating risk variants for MM. In addition, we demonstrate a novel strategy to use large HRPs for risk-variant discovery more generally in complex traits.
BEST1 sequence variants in Italian patients with vitelliform macular dystrophy

PubMed Central

Sodi, Andrea; Passerini, Ilaria; Caputo, Roberto; Bacci, Giacomo Maria; Bodoj, Mirela; Torricelli, Francesca; Menchini, Ugo

2012-01-01

Purpose: To analyze the spectrum of sequence variants in the BEST1 gene in a group of Italian patients affected by Best vitelliform macular dystrophy (VMD). Methods: Thirty Italian patients with a diagnosis of VMD and 20 clinically healthy relatives were recruited. They belonged to 19 Italian families predominantly originating from central Italy. They received a standard ophthalmologic examination, OCT scan, and electrophysiological tests (ERG and EOG). Fluorescein and ICG angiographies and fundus autofluorescence imaging were performed in selected cases. DNA samples were analyzed for sequence variants of the BEST1 gene by direct sequencing techniques. Results: Nine missense variants and one deletion were found in the affected patients; each patient carried one mutation. Five variants [c.73C>T (p.Arg25Trp), c.652C>T (p.Arg218Cys), c.652C>G (p.Arg218Gly), c.728C>T (p.Ala243Val), c.893T>C (p.Phe298Ser)] have already been described in literature while another five variants [c.217A>C (p.Ile73Leu), c.239T>G (p.Phe80Cys), c.883_885del (p.Ile295del), c.907G>A (p.Asp303Asn), c.911A>G (p.Asp304Gly)] had not previously been reported. Affected patients, sometimes even from the same family, occasionally showed variable phenotypes. One heterozygous variant was also found in five clinically healthy relatives with normal fundus, visual acuity and ERG but with abnormal EOG. Conclusions: Ten variants in the BEST1 gene were detected in a group of individuals with clinically apparent VMD, and in some clinically normal individuals with an abnormal EOG. The high prevalence of novel variants and the frequent report of a specific variant (p.Arg25Trp) that has rarely been described in other ethnic groups suggests a distribution of BEST1 variants peculiar to Italian VMD patients. PMID:23213274
Ultraaccurate genome sequencing and haplotyping of single human cells.

PubMed

Chu, Wai Keung; Edge, Peter; Lee, Ho Suk; Bansal, Vikas; Bafna, Vineet; Huang, Xiaohua; Zhang, Kun

2017-11-21

Accurate detection of variants and long-range haplotypes in genomes of single human cells remains very challenging. Common approaches require extensive in vitro amplification of genomes of individual cells using DNA polymerases and high-throughput short-read DNA sequencing. These approaches have two notable drawbacks. First, polymerase replication errors could generate tens of thousands of false-positive calls per genome. Second, relatively short sequence reads contain little to no haplotype information. Here we report a method, which is dubbed SISSOR (single-stranded sequencing using microfluidic reactors), for accurate single-cell genome sequencing and haplotyping. A microfluidic processor is used to separate the Watson and Crick strands of the double-stranded chromosomal DNA in a single cell and to randomly partition megabase-size DNA strands into multiple nanoliter compartments for amplification and construction of barcoded libraries for sequencing. The separation and partitioning of large single-stranded DNA fragments of the homologous chromosome pairs allows for the independent sequencing of each of the complementary and homologous strands. This enables the assembly of long haplotypes and reduction of sequence errors by using the redundant sequence information and haplotype-based error removal. We demonstrated the ability to sequence single-cell genomes with error rates as low as 10⁻⁸ and average 500-kb-long DNA fragments that can be assembled into haplotype contigs with N50 greater than 7 Mb. The performance could be further improved with more uniform amplification and more accurate sequence alignment. The ability to obtain accurate genome sequences and haplotype information from single cells will enable applications of genome sequencing for diverse clinical needs. Copyright © 2017 the Author(s). Published by PNAS.
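The abstract summarizes haplotype contiguity with N50, the standard assembly statistic: the length L such that contigs of length at least L together cover at least half of the total assembled length. The sketch below shows that definition on a toy list of lengths; the function name and the example values are illustrative and are not data from the study.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a collection of contig (or haplotype block) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths in arbitrary units, not values from the paper.
    print(n50([2, 3, 4, 5, 6]))
```

Because N50 is weighted by length, a few very long contigs can raise it substantially even when many short contigs are present.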
The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

PubMed Central

2004-01-01

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
COLD-PCR: improving the sensitivity of molecular diagnostics assays

PubMed Central

Milbury, Coren A; Li, Jin; Liu, Pingfang; Makrigiorgos, G Mike

2011-01-01

The detection of low-abundance DNA variants or mutations is of particular interest to medical diagnostics, individualized patient treatment and cancer prognosis; however, detection sensitivity for low-abundance variants is a pronounced limitation of most currently available molecular assays. We have recently developed coamplification at lower denaturation temperature-PCR (COLD-PCR) to resolve this limitation. This novel form of PCR selectively amplifies low-abundance DNA variants from mixtures of wild-type and mutant-containing (or variant-containing) sequences, irrespective of the mutation type or position on the amplicon, by using a critical denaturation temperature. The use of a lower denaturation temperature in COLD-PCR results in selective denaturation of amplicons with mutation-containing molecules within wild-type mutant heteroduplexes or with a lower melting temperature. COLD-PCR can be used in lieu of conventional PCR in several molecular applications, thus enriching the mutant fraction and improving the sensitivity of downstream mutation detection by up to 100-fold. PMID:21405967
Novel transcripts of the estrogen receptor α gene in channel catfish

USGS Publications Warehouse

Patino, Reynaldo; Xia, Zhenfang; Gale, William L.; Wu, Chunfa; Maule, Alec G.; Chang, Xiaotian

2000-01-01

Complementary DNA libraries from liver and ovary of an immature female channel catfish were screened with a homologous ERα cDNA probe. The hepatic library yielded two new channel catfish ER cDNAs that encode N-terminal ERα variants of different sizes. Relative to the catfish ERα (medium size; 581 residues) previously reported, these new cDNAs encode Long-ERα (36 residues longer) and Short-ERα (389 residues shorter). The 5′-end of Long-ERα cDNA is identical to that of Medium-ERα but has an additional 503-bp segment with an upstream, in-frame translation-start codon. Recombinant Long-ERα binds estrogen with high affinity (Kd = 3.4 nM), similar to that previously reported for Medium-ERα but lower than reported for catfish ERβ. Short-ERα cDNA encodes a protein that lacks most of the receptor protein and does not bind estrogen. Northern hybridization confirmed the existence of multiple hepatic ERα RNAs that include the size range of the ERα cDNAs obtained from the libraries as well as additional sizes. Using primers for RT-PCR that target locations internal to the protein-coding sequence, we also established the presence of several ERα cDNA variants with in-frame insertions in the ligand-binding and DNA-binding domains and in-frame or out-of-frame deletions in the ligand-binding domain. These internal variants showed patterns of expression that differed between the ovary and liver. Further, the ovarian library yielded a full-length, ERα antisense cDNA containing a poly(A) signal and tail. A limited survey of histological preparations from juvenile catfish by in situ hybridization using directionally synthesized cRNA probes also suggested the expression of ERα antisense RNA in a tissue-specific manner. In conclusion, channel catfish seemingly have three broad classes of ERα mRNA variants: those encoding N-terminal truncated variants, those encoding internal variants (including C-terminal truncated variants), and antisense mRNA. The sense variants may encode functional ERα or related proteins that modulate ERα or ERβ activity. The existence of ER antisense mRNA is reported in this study for the first time. Its role may be to participate in the regulation of ER gene expression.
High-resolution characterization of sequence signatures due to non-random cleavage of cell-free DNA.

PubMed

Chandrananda, Dineika; Thorne, Natalie P; Bahlo, Melanie

2015-06-17

High-throughput sequencing of cell-free DNA fragments found in human plasma has been used to non-invasively detect fetal aneuploidy, monitor organ transplants and investigate tumor DNA. However, many biological properties of this extracellular genetic material remain unknown. Research that further characterizes circulating DNA could substantially increase its diagnostic value by allowing the application of more sophisticated bioinformatics tools that lead to an improved signal-to-noise ratio in the sequencing data. In this study, we investigate various features of cell-free DNA in plasma using deep-sequencing data from two pregnant women (>70X, >50X) and compare them with matched cellular DNA. We utilize a descriptive approach to examine how the biological cleavage of cell-free DNA affects different sequence signatures such as fragment lengths, sequence motifs at fragment ends and the distribution of cleavage sites along the genome. We show that the size distributions of these cell-free DNA molecules are dependent on their autosomal and mitochondrial origin as well as the genomic location within chromosomes. DNA mapping to particular microsatellites and alpha repeat elements displays unique size signatures. By correlating the mapping locations of cell-free fragments with ENCODE annotation of chromatin organization, we show that these fragments occur in clusters along the genome, localize to nucleosomal arrays and are preferentially cleaved at linker regions. Our work further demonstrates that cell-free autosomal DNA cleavage is sequence dependent. The region spanning up to 10 positions on either side of the DNA cleavage site shows a consistent pattern of preference for specific nucleotides. This sequence motif is present in cleavage sites localized to nucleosomal cores and linker regions but is absent in nucleosome-free mitochondrial DNA. These background signals in cell-free DNA sequencing data stem from the non-random biological cleavage of these fragments. This sequence structure can be harnessed to improve bioinformatics algorithms, in particular for CNV and structural variant detection. Descriptive measures for cell-free DNA features developed here could also be used in biomarker analysis to monitor the changes that occur during different pathological conditions.
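The cleavage-site motif described above (a nucleotide preference across roughly 10 positions on either side of the cut) is the kind of signal that a per-position base-frequency table makes visible. The following is a minimal sketch, assuming the reference sequence around each cleavage site has already been extracted as equal-length windows; the function name and the toy windows are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter


def position_frequencies(windows: list[str]) -> list[dict[str, float]]:
    """Per-position A/C/G/T frequencies across equal-length sequence windows."""
    if not windows:
        raise ValueError("at least one window is required")
    width = len(windows[0])
    if any(len(w) != width for w in windows):
        raise ValueError("all windows must have the same length")
    table = []
    for i in range(width):
        counts = Counter(w[i].upper() for w in windows)
        # Bases other than A/C/G/T (for example N) count toward the
        # denominator but are not reported as a column.
        table.append({base: counts.get(base, 0) / len(windows) for base in "ACGT"})
    return table


if __name__ == "__main__":
    # Toy 4-nt windows; a real analysis would use about 21-nt windows
    # (10 positions on either side of the cleavage site) from aligned reads.
    toy = ["ACGT", "ACGA", "TCGT", "ACCT"]
    for position, freqs in enumerate(position_frequencies(toy)):
        print(position, freqs)
```

A position with no preference would show frequencies close to the background base composition, so in practice the table would be compared against windows sampled at random genomic positions.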
Identification and cloning of a gamma 3 subunit splice variant of the human GABA(A) receptor.

PubMed

Poulsen, C F; Christjansen, K N; Hastrup, S; Hartvig, L

2000-05-31

cDNA sequences encoding two forms of the GABA(A) gamma 3 receptor subunit were cloned from human hippocampus. The nucleotide sequences differ by the absence (gamma 3S) or presence (gamma 3L) of 18 bp located in the presumed intracellular loop between transmembrane region (TM) III and IV. The extra 18 bp in the gamma 3L subunit generates a consensus site for phosphorylation by protein kinase C (PKC). Analysis of human genomic DNA encoding the gamma 3 subunit reveals that the 18 bp insert is contiguous with the upstream proximal exon.
Whole-exome sequencing for RH genotyping and alloimmunization risk in children with sickle cell anemia

PubMed Central

Flanagan, Jonathan M.; Vege, Sunitha; Luban, Naomi L. C.; Brown, R. Clark; Ware, Russell E.; Westhoff, Connie M.

2017-01-01

RH genes are highly polymorphic and encode the most complex of the 35 human blood group systems. This genetic diversity contributes to Rh alloimmunization in patients with sickle cell anemia (SCA) and is not avoided by serologic Rh-matched red cell transfusions. Standard serologic testing does not distinguish variant Rh antigens. Single nucleotide polymorphism (SNP)–based DNA arrays detect many RHD and RHCE variants, but the number of alleles tested is limited. We explored a next-generation sequencing (NGS) approach using whole-exome sequencing (WES) in 27 Rh alloimmunized and 27 matched non-alloimmunized patients with SCA who received chronic red cell transfusions and were enrolled in a multicenter study. We demonstrate that WES provides a comprehensive RH genotype, identifies SNPs not interrogated by DNA array, and accurately determines RHD zygosity. Among this multicenter cohort, we demonstrate an association between an altered RH genotype and Rh alloimmunization: 52% of Rh immunized vs 19% of non-immunized patients expressed variant Rh without co-expression of the conventional protein. Our findings suggest that RH allele variation in patients with SCA is clinically relevant, and NGS technology can offer a comprehensive alternative to targeted SNP-based testing. This is particularly relevant as NGS data becomes more widely available and could provide the means for reducing Rh alloimmunization in children with SCA. PMID:29296782
GM2 Gangliosidosis in Shiba Inu Dogs with an In-Frame Deletion in HEXB.

PubMed

Kolicheski, A; Johnson, G S; Villani, N A; O'Brien, D P; Mhlanga-Mutangadura, T; Wenger, D A; Mikoloski, K; Eagleson, J S; Taylor, J F; Schnabel, R D; Katz, M L

2017-09-01

Consistent with a tentative diagnosis of neuronal ceroid lipofuscinosis (NCL), autofluorescent cytoplasmic storage bodies were found in neurons from the brains of 2 related Shiba Inu dogs with a young-adult onset, progressive neurodegenerative disease. Unexpectedly, no potentially causal NCL-related variants were identified in a whole-genome sequence generated with DNA from 1 of the affected dogs. Instead, the whole-genome sequence contained a homozygous 3 base pair (bp) deletion in a coding region of HEXB. The other affected dog also was homozygous for this 3-bp deletion. Mutations in the human HEXB ortholog cause Sandhoff disease, a type of GM2 gangliosidosis. Thin-layer chromatography confirmed that GM2 ganglioside had accumulated in an affected Shiba Inu brain. Enzymatic analysis confirmed that the GM2 gangliosidosis resulted from a deficiency in the HEXB encoded protein and not from a deficiency in products from HEXA or GM2A, which are known alternative causes of GM2 gangliosidosis. We conclude that the homozygous 3-bp deletion in HEXB is the likely cause of the Shiba Inu neurodegenerative disease and that whole-genome sequencing can lead to the early identification of potentially disease-causing DNA variants thereby refocusing subsequent diagnostic analyses toward confirming or refuting candidate variant causality. Copyright © 2017 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
Pancreatic islet enhancer clusters enriched in type 2 diabetes risk-associated variants.

PubMed

Pasquali, Lorenzo; Gaulton, Kyle J; Rodríguez-Seguí, Santiago A; Mularoni, Loris; Miguel-Escalada, Irene; Akerman, İldem; Tena, Juan J; Morán, Ignasi; Gómez-Marín, Carlos; van de Bunt, Martijn; Ponsa-Cobas, Joan; Castro, Natalia; Nammo, Takao; Cebola, Inês; García-Hurtado, Javier; Maestro, Miguel Angel; Pattou, François; Piemonti, Lorenzo; Berney, Thierry; Gloyn, Anna L; Ravassard, Philippe; Skarmeta, José Luis Gómez; Müller, Ferenc; McCarthy, Mark I; Ferrer, Jorge

2014-02-01

Type 2 diabetes affects over 300 million people, causing severe complications and premature death, yet the underlying molecular mechanisms are largely unknown. Pancreatic islet dysfunction is central in type 2 diabetes pathogenesis, and understanding islet genome regulation could therefore provide valuable mechanistic insights. We have now mapped and examined the function of human islet cis-regulatory networks. We identify genomic sequences that are targeted by islet transcription factors to drive islet-specific gene activity and show that most such sequences reside in clusters of enhancers that form physical three-dimensional chromatin domains. We find that sequence variants associated with type 2 diabetes and fasting glycemia are enriched in these clustered islet enhancers and identify trait-associated variants that disrupt DNA binding and islet enhancer activity. Our studies illustrate how islet transcription factors interact functionally with the epigenome and provide systematic evidence that the dysregulation of islet enhancers is relevant to the mechanisms underlying type 2 diabetes.
Integrative Clinical Genomics of Metastatic Cancer

PubMed Central

Robinson, Dan R.; Wu, Yi-Mi; Lonigro, Robert J.; Vats, Pankaj; Cobain, Erin; Everett, Jessica; Cao, Xuhong; Rabban, Erica; Kumar-Sinha, Chandan; Raymond, Victoria; Schuetze, Scott; Alva, Ajjai; Siddiqui, Javed; Chugh, Rashmi; Worden, Francis; Zalupski, Mark M.; Innis, Jeffrey; Mody, Rajen J.; Tomlins, Scott A.; Lucas, David; Baker, Laurence H.; Ramnath, Nithya; Schott, Ann F.; Hayes, Daniel F.; Vijai, Joseph; Offit, Kenneth; Stoffel, Elena M.; Roberts, J. Scott; Smith, David C.; Kunju, Lakshmi P.; Talpaz, Moshe; Cieslik, Marcin; Chinnaiyan, Arul M.

2017-01-01

SUMMARY Metastasis is the primary cause of cancer-related deaths. While The Cancer Genome Atlas (TCGA) has sequenced primary tumor types obtained from surgical resections, much less comprehensive molecular analysis is available from clinically acquired metastatic cancers. Here, we perform whole exome and transcriptome sequencing of 500 adult patients with metastatic solid tumors of diverse lineage and biopsy site. The most prevalent genes somatically altered in metastatic cancer included TP53, CDKN2A, PTEN, PIK3CA, and RB1. Putative pathogenic germline variants were present in 12.2% of cases of which 75% were related to defects in DNA repair. RNA sequencing complemented DNA sequencing for the identification of gene fusions, pathway activation, and immune profiling. Integrative sequence analysis provides a clinically relevant, multi-dimensional view of the complex molecular landscape and microenvironment of metastatic cancers. PMID:28783718
Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

PubMed

Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

2016-11-01

Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which makes it possible to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to the decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.
Machine learning classifier for identification of damaging missense mutations exclusive to human mitochondrial DNA-encoded polypeptides.

PubMed

Martín-Navarro, Antonio; Gaudioso-Simón, Andrés; Álvarez-Jarreta, Jorge; Montoya, Julio; Mayordomo, Elvira; Ruiz-Pesini, Eduardo

2017-03-07

Several methods have been developed to predict the pathogenicity of missense mutations but none has been specifically designed for classification of variants in mtDNA-encoded polypeptides. Moreover, there is no curated dataset of neutral and damaging mtDNA missense variants available to test the accuracy of predictors. Because mtDNA sequencing of patients suffering from mitochondrial diseases is revealing many missense mutations, it is necessary to prioritize candidate substitutions for further confirmation. Predictors can be useful as screening tools but their performance must be improved. We have developed an SVM classifier (Mitoclass.1) specific for mtDNA missense variants. Training and validation of the model were performed with 2,835 mtDNA damaging and neutral amino acid substitutions, previously curated by a set of rigorous pathogenicity criteria with high specificity. Each instance is described by a set of three attributes based on evolutionary conservation in Eukaryota of wildtype and mutant amino acids as well as coevolution and a novel evolutionary analysis of specific substitutions belonging to the same domain of mitochondrial polypeptides. Our classifier has performed better than other web-available tested predictors. We checked performance of three broadly used predictors with the total mutations of our curated dataset. PolyPhen-2 showed the best results for a screening proposal with a good sensitivity. Nevertheless, the number of false positive predictions was too high. Our method has an improved sensitivity and better specificity in relation to PolyPhen-2. We also publish predictions for the complete set of 24,201 possible missense variants in the 13 human mtDNA-encoded polypeptides. Mitoclass.1 allows a better selection of candidate damaging missense variants from mtDNA.
A careful search of discriminatory attributes and a training step based on a curated dataset of amino acid substitutions belonging exclusively to human mtDNA genes allows an improved performance. Mitoclass.1 accuracy could be improved in the future when more mtDNA missense substitutions become available for updating the attributes and retraining the model.
The genome sequence of sweet cherry (Prunus avium) for use in genomics-assisted breeding.

PubMed

Shirasawa, Kenta; Isuzugawa, Kanji; Ikenaga, Mitsunobu; Saito, Yutaro; Yamamoto, Toshiya; Hirakawa, Hideki; Isobe, Sachiko

2017-10-01

We determined the genome sequence of sweet cherry (Prunus avium) using next-generation sequencing technology. The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the 352.9 Mb sweet cherry genome, as estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes. We predicted 43,349 complete and partial protein-encoding genes. A high-density consensus map with 2,382 loci was constructed using double-digest restriction site-associated DNA sequencing. Comparing the genetic maps of sweet cherry and peach revealed high synteny between the two genomes; thus the scaffolds were integrated into pseudomolecules using map- and synteny-based strategies. Whole-genome resequencing of six modern cultivars found 1,016,866 SNPs and 162,402 insertions/deletions, out of which 0.7% were deleterious. The sequence variants, as well as simple sequence repeats, can be used as DNA markers. The genomic information helps us to identify agronomically important genes and will accelerate genetic studies and breeding programs for sweet cherries. Further information on the genomic sequences and DNA markers is available in DBcherry (http://cherry.kazusa.or.jp (8 May 2017, date last accessed)). © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
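The N50 length quoted in the abstract above is a standard assembly contiguity statistic. As a minimal sketch (the function name and the toy scaffold lengths are illustrative assumptions, not data from the study), it can be computed like this:

```python
def n50(lengths):
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total = 300.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

For the toy list, the two longest scaffolds (80 + 70) already reach half of the total, so the N50 is the length of the second one.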
Mitochondrial Disease Sequence Data Resource (MSeqDR): a global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities.

PubMed

Falk, Marni J; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T; Stassen, Alphons P M; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G; Brilhante, Virginia; Ralph, David; DaRe, Jeana T; Shelton, Robert; Terry, Sharon F; Zhang, Zhe; Copeland, William C; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

2015-03-01

Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The "Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium" is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through the use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. 
MSeqDR is integrated with a diverse array of mtDNA data analysis tools that are both freestanding and incorporated into an online exome-level dataset curation and analysis resource (GEM.app) that is being optimized to support needs of the MSeqDR community. In addition, MSeqDR supports mitochondrial disease phenotyping and ontology tools, and provides variant pathogenicity assessment features that enable community review, feedback, and integration with the public ClinVar variant annotation resource. A centralized Web-based informed consent process is being developed, with implementation of a Global Unique Identifier (GUID) system to integrate data deposited on a given individual from different sources. Community-based data deposition into MSeqDR has already begun. Future efforts will enhance capabilities to incorporate phenotypic data that enhance genomic data analyses. MSeqDR will fill the existing void in bioinformatics tools and centralized knowledge that are necessary to enable efficient nuclear and mtDNA genomic data interpretation by a range of shareholders across both clinical diagnostic and research settings. Ultimately, MSeqDR is focused on empowering the global mitochondrial disease community to better define and explore mitochondrial diseases. Copyright © 2014 Elsevier Inc. All rights reserved.
Mitochondrial Disease Sequence Data Resource (MSeqDR): A global grass-roots consortium to facilitate deposition, curation, annotation, and integrated analysis of genomic data for the mitochondrial disease clinical and research communities

PubMed Central

Falk, Marni J.; Shen, Lishuang; Gonzalez, Michael; Leipzig, Jeremy; Lott, Marie T.; Stassen, Alphons P.M.; Diroma, Maria Angela; Navarro-Gomez, Daniel; Yeske, Philip; Bai, Renkui; Boles, Richard G.; Brilhante, Virginia; Ralph, David; DaRe, Jeana T.; Shelton, Robert; Terry, Sharon; Zhang, Zhe; Copeland, William C.; van Oven, Mannis; Prokisch, Holger; Wallace, Douglas C.; Attimonelli, Marcella; Krotoski, Danuta; Zuchner, Stephan; Gai, Xiaowu

2014-01-01

Success rates for genomic analyses of highly heterogeneous disorders can be greatly improved if a large cohort of patient data is assembled to enhance collective capabilities for accurate sequence variant annotation, analysis, and interpretation. Indeed, molecular diagnostics requires the establishment of robust data resources to enable data sharing that informs accurate understanding of genes, variants, and phenotypes. The “Mitochondrial Disease Sequence Data Resource (MSeqDR) Consortium” is a grass-roots effort facilitated by the United Mitochondrial Disease Foundation to identify and prioritize specific genomic data analysis needs of the global mitochondrial disease clinical and research community. A central Web portal (https://mseqdr.org) facilitates the coherent compilation, organization, annotation, and analysis of sequence data from both nuclear and mitochondrial genomes of individuals and families with suspected mitochondrial disease. This Web portal provides users with a flexible and expandable suite of resources to enable variant-, gene-, and exome-level sequence analysis in a secure, Web-based, and user-friendly fashion. Users can also elect to share data with other MSeqDR Consortium members, or even the general public, either by custom annotation tracks or through use of a convenient distributed annotation system (DAS) mechanism. A range of data visualization and analysis tools are provided to facilitate user interrogation and understanding of genomic, and ultimately phenotypic, data of relevance to mitochondrial biology and disease. Currently available tools for nuclear and mitochondrial gene analyses include an MSeqDR GBrowse instance that hosts optimized mitochondrial disease and mitochondrial DNA (mtDNA) specific annotation tracks, as well as an MSeqDR locus-specific database (LSDB) that curates variant data on more than 1,300 genes that have been implicated in mitochondrial disease and/or encode mitochondria-localized proteins. 
MSeqDR is integrated with a diverse array of mtDNA data analysis tools that are both freestanding and incorporated into an online exome-level dataset curation and analysis resource (GEM.app) that is being optimized to support needs of the MSeqDR community. In addition, MSeqDR supports mitochondrial disease phenotyping and ontology tools, and provides variant pathogenicity assessment features that enable community review, feedback, and integration with the public ClinVar variant annotation resource. A centralized Web-based informed consent process is being developed, with implementation of a Global Unique Identifier (GUID) system to integrate data deposited on a given individual from different sources. Community-based data deposition into MSeqDR has already begun. Future efforts will enhance capabilities to incorporate phenotypic data that enhance genomic data analyses. MSeqDR will fill the existing void in bioinformatics tools and centralized knowledge that are necessary to enable efficient nuclear and mtDNA genomic data interpretation by a range of shareholders across both clinical diagnostic and research settings. Ultimately, MSeqDR is focused on empowering the global mitochondrial disease community to better define and explore mitochondrial disease. PMID:25542617
Identification of Aspergillus fumigatus and Related Species by Nested PCR Targeting Ribosomal DNA Internal Transcribed Spacer Regions

PubMed Central

Zhao, Jun; Kong, Fanrong; Li, Ruoyu; Wang, Xiaohong; Wan, Zhe; Wang, Duanli

2001-01-01

Aspergillus fumigatus is the most common species that causes invasive aspergillosis. In order to identify A. fumigatus, partial ribosomal DNA (rDNA) from two to six strains of five different Aspergillus species was sequenced. By comparing sequence data from GenBank, we designed specific primer pairs targeting rDNA internal transcribed spacer (ITS) regions of A. fumigatus. A nested PCR method for identification of other A. fumigatus-related species was established by using the primers. To evaluate the specificities and sensitivities of those primers, 24 isolates of A. fumigatus and variants, 8 isolates of Aspergillus nidulans, 7 isolates of Aspergillus flavus and variants, 8 isolates of Aspergillus terreus, 9 isolates of Aspergillus niger, 1 isolate each of Aspergillus parasiticus, Aspergillus penicilloides, Aspergillus versicolor, Aspergillus wangduanlii, Aspergillus qizutongii, Aspergillus beijingensis, and Exophiala dermatitidis, 4 isolates of Candida, 4 isolates of bacteria, and human DNA were used. The nested PCR method specifically identified the A. fumigatus isolates and closely related species and showed a high degree of sensitivity. Additionally, four A. fumigatus strains that were recently isolated from our clinic were correctly identified by this method. Our results demonstrate that these primers are useful for the identification of A. fumigatus and closely related species in culture and suggest further studies for the identification of Aspergillus fumigatus species in clinical specimens. PMID:11376067
Association between sequence variants in panicle development genes and the number of spikelets per panicle in rice.

PubMed

Jang, Su; Lee, Yunjoo; Lee, Gileung; Seo, Jeonghwan; Lee, Dongryung; Yu, Yoye; Chin, Joong Hyoun; Koh, Hee-Jong

2018-01-15

Balancing panicle-related traits, such as panicle length and the numbers of primary and secondary branches per panicle, is key to improving the number of spikelets per panicle in rice. Identifying genetic information contributes to a broader understanding of the roles of genes and provides candidate alleles for use as DNA markers. Discovering relations between panicle-related traits and sequence variants creates opportunities for molecular applications in rice breeding to improve the number of spikelets per panicle. In total, 142 polymorphic sites, which constructed 58 haplotypes, were detected in coding regions of ten panicle development genes, and 35 sequence variants in six genes were significantly associated with panicle-related traits. Rice cultivars were clustered according to their sequence variant profiles. One of the four resultant clusters, which contained only indica and tong-il varieties, exhibited the largest average number of favorable alleles and highest average number of spikelets per panicle, suggesting that the favorable allele combination found in this cluster was beneficial in increasing the number of spikelets per panicle. Favorable alleles identified in this study can be used to develop functional markers for rice breeding programs. Furthermore, stacking several favorable alleles has the potential to substantially improve the number of spikelets per panicle in rice.
Probabilistic simple sticker systems

NASA Astrophysics Data System (ADS)

Selvarajoo, Mathuri; Heng, Fong Wan; Sarmin, Nor Haniza; Turaev, Sherzod

2017-04-01

A model for DNA computing using the recombination behavior of DNA molecules, known as a sticker system, was introduced by L. Kari, G. Paun, G. Rozenberg, A. Salomaa, and S. Yu in the paper entitled "DNA computing, sticker systems and universality", Acta Informatica, vol. 35, pp. 401-420, 1998. A sticker system uses the Watson-Crick complementary feature of DNA molecules: starting from incomplete double stranded sequences, sticking operations are applied iteratively until a complete double stranded sequence is obtained. It is known that sticker systems with finite sets of axioms and sticker rules generate only regular languages. Hence, different types of restrictions have been considered to increase the computational power of sticker systems. Recently, a variant of restricted sticker systems, called probabilistic sticker systems, has been introduced [4]. In this variant, the probabilities are initially associated with the axioms, and the probability of a generated string is computed by multiplying the probabilities of all occurrences of the initial strings in the computation of the string. Strings for the language are selected according to some probabilistic requirements. In this paper, we study fundamental properties of probabilistic simple sticker systems. We prove that the probabilistic enhancement increases the computational power of simple sticker systems.
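The two ideas in this abstract, Watson-Crick complementarity and the product rule for string probabilities, can be illustrated with a small sketch. The axiom names, their probabilities, and both helper functions below are hypothetical and chosen only for illustration; the paper's formal definitions are not reproduced here.

```python
from fractions import Fraction
from math import prod

# Watson-Crick pairing between the upper and lower strand.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_complete_double_strand(upper, lower):
    """True if both strands have equal length and every position pairs."""
    return len(upper) == len(lower) and all(
        COMPLEMENT[u] == l for u, l in zip(upper, lower)
    )


def derivation_probability(axiom_probs, axioms_used):
    """Multiply the probabilities of every axiom occurrence in a derivation."""
    return prod((axiom_probs[name] for name in axioms_used), start=Fraction(1))


# Hypothetical axioms and probabilities (assumed values).
axiom_probs = {"x1": Fraction(1, 2), "x2": Fraction(1, 3)}
# A derivation that uses x1 twice and x2 once: 1/2 * 1/3 * 1/2.
print(derivation_probability(axiom_probs, ["x1", "x2", "x1"]))
print(is_complete_double_strand("ACGT", "TGCA"))
```

Exact fractions are used so that a later threshold or cut-point test on the probability is not affected by floating-point rounding.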

Cloud-based adaptive exon prediction for DNA analysis.

PubMed

Putluri, Srinivasareddy; Zia Ur Rahman, Md; Fathima, Shaik Yasmeen

2018-02-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of such sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. True identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database.
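The three-base periodicity mentioned above can be made concrete with a simple spectral measure. The sketch below is not the authors' adaptive (normalised least mean square) predictor; it swaps in a plain discrete Fourier transform evaluated at frequency 1/3 over binary indicator sequences, and the function name and test sequences are assumptions for illustration.

```python
import cmath


def period3_power(seq):
    """Spectral power at frequency 1/3, summed over the four bases and
    normalised by sequence length. Higher values suggest codon-like,
    three-base periodic structure."""
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for base in "ACGT":
        # DFT coefficient of the indicator sequence for this base at f = 1/3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k / 3)
            for k, b in enumerate(seq)
            if b == base
        )
        total += abs(coeff) ** 2
    return total / n


periodic = "ATG" * 20   # perfectly three-base periodic toy sequence
uniform = "A" * 60      # no three-base structure
print(period3_power(periodic) > period3_power(uniform))
```

In practice such a measure is evaluated in a sliding window along the sequence, and peaks are taken as candidate exon regions; the window length is a tuning choice not specified in the abstract.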
Sequence variants in ESR1 and OXTR are associated with Mayer-Rokitansky-Küster-Hauser syndrome.

PubMed

Brucker, Sara Yvonne; Frank, Liliane; Eisenbeis, Simone; Henes, Melanie; Wallwiener, Diethelm; Riess, Olaf; van Eijck, Barbara; Schöller, Dorit; Bonin, Michael; Rall, Kristin Katharina

2017-11-01

Mayer-Rokitansky-Küster-Hauser syndrome (MRKHS) is characterized by congenital absence of the uterus and the upper two-thirds of the vagina in otherwise phenotypically normal females. It is found isolated or associated with renal, skeletal and other malformations. Despite ongoing research, the etiology is mainly unknown. For a long time, the hypothesis of deficient hormone receptors as the cause for MRKHS has existed, supported by previous findings of our group. The aim of the present study was to identify unknown genetic causes for MRKHS and to compare them with data banks including a review of the literature. DNA sequence analysis of the oxytocin receptor (OXTR) and estrogen receptor-1 gene (ESR1) was performed in a group of 93 clinically well-defined patients with uterovaginal aplasia (68 with the isolated form and 25 with associated malformations). In total, we detected three OXTR variants in 18 MRKHS patients with one leading to a missense mutation, and six ESR1 variants in 21 MRKHS patients, two of these causing amino acid changes and therefore potentially disease. The identified variants on DNA level might impair receptor function through different molecular mechanisms. Mutations of ESR1 and OXTR are associated with MRKHS. Thus, we consider these genes potential candidates associated with the manifestation of MRKHS. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology, Acta Obstetricia et Gynecologica Scandinavica.
Screening of SHOX gene sequence variants in Saudi Arabian children with idiopathic short stature.

PubMed

Alharthi, Abdulla A; El-Hallous, Ehab I; Talaat, Iman M; Alghamdi, Hamed A; Almalki, Matar I; Gaber, Ahmed

2017-10-01

Short stature affects approximately 2%-3% of children, representing one of the most frequent disorders for which clinical attention is sought during childhood. Despite assumed genetic heterogeneity, mutations or deletions in the short stature homeobox-containing gene (SHOX) are frequently detected in subjects with short stature. Idiopathic short stature (ISS) refers to patients with short stature for various unknown reasons. The goal of this study was to screen all the exons of SHOX to identify related mutations. We screened all the exons of SHOX for mutations in 105 children with ISS (57 girls and 48 boys) living in Taif governorate, KSA, using a direct DNA sequencing method. Height, arm span, and sitting height were recorded, and subischial leg length was calculated. A total of 30 of 105 ISS patients (28%) carried six polymorphic variants in exons 1, 2, 4, and 6. One mutation was found in the DNA domain binding region of exon 4. Three of these polymorphic variants were novel, while the others were reported previously. There were no significant differences in anthropometric measures in ISS patients with and without identifiable polymorphic variants in SHOX. In Saudi Arabian ISS patients, it is possible that new genes, rather than SHOX, are involved in longitudinal growth. Additional molecular analysis is required to diagnose and understand the etiology of this disease.
OVA: integrating molecular and physical phenotype data from multiple biomedical domain ontologies with variant filtering for enhanced variant prioritization.

PubMed

Antanaviciute, Agne; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Carr, Ian M

2015-12-01

Exome sequencing has become a de facto standard method for Mendelian disease gene discovery in recent years, yet identifying disease-causing mutations among thousands of candidate variants remains a non-trivial task. Here we describe a new variant prioritization tool, OVA (ontology variant analysis), in which user-provided phenotypic information is exploited to infer deeper biological context. OVA combines a knowledge-based approach with a variant-filtering framework. It reduces the number of candidate variants by considering genotype and predicted effect on protein sequence, and scores the remainder on biological relevance to the query phenotype. We take advantage of several ontologies in order to bridge knowledge across multiple biomedical domains and facilitate computational analysis of annotations pertaining to genes, diseases, phenotypes, tissues and pathways. In this way, OVA combines information regarding molecular and physical phenotypes and integrates both human and model organism data to effectively prioritize variants. By assessing performance on both known and novel disease mutations, we show that OVA performs biologically meaningful candidate variant prioritization and can be more accurate than another recently published candidate variant prioritization tool. OVA is freely accessible at http://dna2.leeds.ac.uk:8080/OVA/index.jsp. Supplementary data are available at Bioinformatics online. Contact: umaan@leeds.ac.uk. © The Author 2015. Published by Oxford University Press.
Analysis of a four generation family reveals the widespread sequence-dependent maintenance of allelic DNA methylation in somatic and germ cells

PubMed Central

Tang, Aifa; Huang, Yi; Li, Zesong; Wan, Shengqing; Mou, Lisha; Yin, Guangliang; Li, Ning; Xie, Jun; Xia, Yudong; Li, Xianxin; Luo, Liya; Zhang, Junwen; Chen, Shen; Wu, Song; Sun, Jihua; Sun, Xiaojuan; Jiang, Zhimao; Chen, Jing; Li, Yingrui; Wang, Jian; Wang, Jun; Cai, Zhiming; Gui, Yaoting

2016-01-01

Differential methylation of the homologous chromosomes, a well-known mechanism leading to genomic imprinting and X-chromosome inactivation, is widely reported at the non-imprinted regions on autosomes. To evaluate the transgenerational DNA methylation patterns in human, we analyzed the DNA methylomes of somatic and germ cells in a four-generation family. We found that allelic asymmetry of DNA methylation was pervasive at the non-imprinted loci and was likely regulated by cis-acting genetic variants. We also observed that the allelic methylation patterns for the vast majority of the cis-regulated loci were shared between the somatic and germ cells from the same individual. These results demonstrated the interaction between genetic and epigenetic variations and suggested the possibility of widespread sequence-dependent transmission of DNA methylation during spermatogenesis. PMID:26758766
Determination of a novel integron-located variant (blaOXA-320) of Class D β-lactamase in Proteus mirabilis.

PubMed

Cicek, Aysegul Copur; Duzgun, Azer Ozad; Saral, Aysegul; Sandalli, Cemal

2014-10-01

Proteus mirabilis (P. mirabilis) is one of the Gram-negative pathogens encountered in clinical specimens. A clinical isolate (TRP41) of P. mirabilis was isolated from a Turkish patient in Turkey. The isolate was identified using the API 32GN system and 16S rRNA gene sequencing and it was found resistant to ampicillin/sulbactam, piperacillin, tetracycline, and trimethoprim/sulfamethoxazole. This isolate harbored a Class 1 integron gene cassette and its DNA sequence analysis revealed a novel blaOXA variant exhibiting one amino acid substitution (Asn266Ile) from blaOXA-1. This new variant of OXA was located on a Class 1 integron together with the aadA1 gene encoding aminoglycoside-modifying enzymes. According to sequence records, the new variant was named blaOXA-320. The cassette array and size of the integron were found to be blaOXA-320-aadA1 and 2086 bp, respectively. The blaOXA-320 gene is not transferable according to the conjugation experiment. In this study, we report the first identification of the blaOXA-320-aadA1 gene cassette, a novel variant of Class D β-lactamase, in P. mirabilis from Turkey. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Identification of root rot fungi in nursery seedlings by nested multiplex PCR.

PubMed Central

Hamelin, R C; Bérubé, P; Gignac, M; Bourassa, M

1996-01-01

The internal transcribed spacer (ITS) of the ribosomal DNA (rDNA) subunit repeat was sequenced in 12 isolates of Cylindrocladium floridanum and 11 isolates of Cylindrocarpon destructans. Sequences were aligned and compared with ITS sequences of other fungi in GenBank. Some intraspecific variability was present within our collections of C. destructans but not in C. floridanum. Three ITS variants were identified within C. destructans, but there was no apparent association between ITS variants and host or geographic origin. Two internal primers were synthesized for the specific amplification of portions of the ITS for C. floridanum, and two primers were designed to amplify all three variants of C. destructans. The species-specific primers amplified PCR products of the expected length when tested with cultures of C. destructans and C. floridanum from white spruce, black spruce, Norway spruce, red spruce, jack pine, red pine, and black walnut from eight nurseries and three plantations in Quebec. No amplification resulted from PCR reactions on fungal DNA from 26 common contaminants of conifer roots. For amplifications directly from infected tissues, a nested primer PCR using two rounds of amplification was combined with a multiplex PCR approach, resulting in the amplification of two different species-specific PCR fragments in the same reaction. First, the entire ITS was amplified with one universal primer and a second primer specific to fungi; a second round of amplification was carried out with species-specific primers that amplified a 400-bp PCR product from C. destructans and a 328-bp product from C. floridanum. The species-specific fragments were amplified directly from infected roots from which one or the two fungi had been isolated. PMID:8899993
High-throughput multiplex cpDNA resequencing clarifies the genetic diversity and genetic relationships among Brassica napus, Brassica rapa and Brassica oleracea.

PubMed

Qiao, Jiangwei; Cai, Mengxian; Yan, Guixin; Wang, Nian; Li, Feng; Chen, Binyun; Gao, Guizhen; Xu, Kun; Li, Jun; Wu, Xiaoming

2016-01-01

Brassica napus (rapeseed) is a recent allotetraploid plant and the second most important oilseed crop worldwide. The origin of B. napus and the genetic relationships with its diploid ancestor species remain largely unresolved. Here, chloroplast DNA (cpDNA) from 488 B. napus accessions of global origin, 139 B. rapa accessions and 49 B. oleracea accessions was resequenced at the population level using Illumina Solexa sequencing technologies. The intraspecific cpDNA variants and their allelic frequencies were called genome-wide and further validated via EcoTILLING analyses of the rpo region. The cpDNA of the current global B. napus population comprises more than 400 variants (SNPs and short InDels) and maintains one predominant haplotype (Bncp1). Whole-genome resequencing of the cpDNA of the Bncp1 haplotype ruled out its direct inheritance from any accession of the B. rapa or B. oleracea species. The distribution of the polymorphism information content (PIC) values for each variant demonstrated that B. napus has much lower cpDNA diversity than B. rapa; however, a vast majority of the wild and cultivated B. oleracea specimens appeared to share a single distinct cpDNA haplotype, in contrast to its wild C-genome relatives. This finding suggests that the cpDNA of the three Brassica species is well differentiated. The predominant B. napus cpDNA haplotype may have originated from uninvestigated relatives or from interactions between cpDNA mutations and natural/artificial selection during speciation and evolution. These exhaustive data on variation in cpDNA would provide fundamental data for research on cpDNA and chloroplasts. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.
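The abstract above compares species by the polymorphism information content (PIC) of each variant but does not give the formula. A minimal sketch, assuming the commonly used Botstein-style definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; the function name and the allele frequencies are illustrative and not taken from the study:

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one variant site.

    freqs: allele frequencies at the site; they must sum to 1.
    Assumes the Botstein-style formula, which the abstract does not confirm.
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction term for partially informative pairings.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


if __name__ == "__main__":
    # Hypothetical frequencies: a balanced SNP versus a rare variant.
    print(pic([0.5, 0.5]))
    print(pic([0.95, 0.05]))
```

A site dominated by one allele scores close to zero, which is how a population carrying one predominant haplotype ends up with low PIC values overall.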
Separation and parallel sequencing of the genomes and transcriptomes of single cells using G&T-seq.

PubMed

Macaulay, Iain C; Teng, Mabel J; Haerty, Wilfried; Kumar, Parveen; Ponting, Chris P; Voet, Thierry

2016-11-01

Parallel sequencing of a single cell's genome and transcriptome provides a powerful tool for dissecting genetic variation and its relationship with gene expression. Here we present a detailed protocol for G&T-seq, a method for separation and parallel sequencing of genomic DNA and full-length polyA(+) mRNA from single cells. We provide step-by-step instructions for the isolation and lysis of single cells; the physical separation of polyA(+) mRNA from genomic DNA using a modified oligo-dT bead capture and the respective whole-transcriptome and whole-genome amplifications; and library preparation and sequence analyses of these amplification products. The method allows the detection of thousands of transcripts in parallel with the genetic variants captured by the DNA-seq data from the same single cell. G&T-seq differs from other currently available methods for parallel DNA and RNA sequencing from single cells, as it involves physical separation of the DNA and RNA and does not require bespoke microfluidics platforms. The process can be implemented manually or through automation. When performed manually, paired genome and transcriptome sequencing libraries from eight single cells can be produced in ∼3 d by researchers experienced in molecular laboratory work. For users with experience in the programming and operation of liquid-handling robots, paired DNA and RNA libraries from 96 single cells can be produced in the same time frame. Sequence analysis and integration of single-cell G&T-seq DNA and RNA data requires a high level of bioinformatics expertise and familiarity with a wide range of informatics tools.
Regulation of pathogenicity in hop stunt viroid-related group II citrus viroids.

PubMed

Reanwarakorn, K; Semancik, J S

1998-12-01

Nucleotide sequences were determined for two hop stunt viroid-related Group II citrus viroids characterized as either a cachexia disease non-pathogenic variant (CVd-IIa) or a pathogenic variant (CVd-IIb). Sequence identity between the two variants of 95.6% indicated a conserved genome with the principal region of nucleotide difference clustered in the variable (V) domain. Full-length viroid RT-PCR cDNA products were cloned into plasmid SP72. Viroid cDNA clones as well as derived RNA transcripts were transmissible to citron (Citrus medica L.) and Luffa aegyptiaca Mill. To determine the locus of cachexia pathogenicity as well as symptom expression in Luffa, chimeric viroid cDNA clones were constructed from segments of either the left terminal, pathogenic and conserved (T1-P-C) domains or the conserved, variable and right terminal (C-V-T2) domains of CVd-IIa or CVd-IIb in reciprocal exchanges. Symptoms induced by the various chimeric constructs on the two bioassay hosts reflected the differential response observed with CVd-IIa and -IIb. Constructs with the C-V-T2 domains region from clone-IIa induced severe symptoms on Luffa typical of CVd-IIa, but were non-symptomatic on mandarin as a bioassay host for the cachexia disease. Constructs with the same region (C-V-T2) from the clone-IIb genome induced only mild symptoms on Luffa, but produced a severe reaction on mandarin, as observed for CVd-IIb. Specific site-directed mutations were introduced into the V domain of the CVd-IIa clone to construct viroid cDNA clones with either partial or complete conversions to the CVd-IIb sequence. With the introduction of six site-specific changes into the V domain of the clone-IIa genome, cachexia pathogenicity was acquired as well as a moderation of severe symptoms on Luffa.
The landscape of actionable genomic alterations in cell-free circulating tumor DNA from 21,807 advanced cancer patients.

PubMed

Zill, Oliver A; Banks, Kimberly C; Fairclough, Stephen R; Mortimer, Stefanie; Vowles, James V; Mokhtari, Reza; Gandara, David R; Mack, Philip C; Odegaard, Justin I; Nagy, Rebecca J; Baca, Arthur M; Eltoukhy, Helmy; Chudova, Darya I; Lanman, Richard B; Talasaz, AmirAli

2018-05-18

Cell-free DNA (cfDNA) sequencing provides a non-invasive method for obtaining actionable genomic information to guide personalized cancer treatment, but the presence of multiple alterations in circulation related to treatment and tumor heterogeneity complicates the interpretation of the observed variants. We describe the somatic mutation landscape of 70 cancer genes from cfDNA deep-sequencing analysis of 21,807 patients with treated, late-stage cancers across >50 cancer types. To facilitate interpretation of the genomic complexity of circulating tumor DNA in advanced, treated cancer patients, we developed methods to identify cfDNA copy-number driver alterations and cfDNA clonality. Patterns and prevalence of cfDNA alterations in major driver genes for non-small cell lung, breast, and colorectal cancer largely recapitulated those from tumor tissue sequencing compendia (TCGA and COSMIC; r = 0.90-0.99), with the principal differences in alteration prevalence being due to patient treatment. This highly sensitive cfDNA sequencing assay revealed numerous subclonal tumor-derived alterations, expected as a result of clonal evolution, but leading to an apparent departure from mutual exclusivity in treatment-naïve tumors. Upon applying novel cfDNA clonality and copy-number driver identification methods, robust mutual exclusivity was observed among predicted truncal driver cfDNA alterations (FDR = 5 × 10^-7 for EGFR and ERBB2), in effect distinguishing tumor-initiating alterations from secondary alterations. Treatment-associated resistance, including both novel alterations and parallel evolution, was common in the cfDNA cohort and was enriched in patients with targetable driver alterations (>18.6% of patients). Together these retrospective analyses of a large cfDNA sequencing data set reveal subclonal structures and emerging resistance in advanced solid tumors. Copyright ©2018, American Association for Cancer Research.
β-Globin gene sequencing of hemoglobin Austin revises the historically reported electrophoretic migration pattern.

PubMed

Racsa, Lori D; Luu, Hung S; Park, Jason Y; Mitui, Midori; Timmons, Charles F

2014-06-01

Hemoglobin (Hb) Austin was defined in 1977, using amino acid sequencing of samples from 3 unrelated Mexican-Americans, as a substitution of serine for arginine at position 40 of the β-globin chain (Arg40Ser). Its electrophoretic migration on both cellulose acetate (pH 8.4) and citrate agar (pH 6.2) was reported between Hb F and Hb A, and this description persists in reference literature. The objective was to review the clinical features and redefine the diagnostic characteristics of Hb Austin. Eight samples from 6 unrelated individuals and 2 siblings, all with Hispanic surnames, were submitted for abnormal Hb identification between June 2010 and September 2011. High-performance liquid chromatography, isoelectric focusing (IEF), citrate agar electrophoresis, and bidirectional DNA sequencing of the entire β-globin gene were performed. DNA sequencing confirmed all 8 individuals to be heterozygous for Hb Austin (Arg40Ser). Retention time on high-performance liquid chromatography and migration on citrate agar electrophoresis were consistent with that identification. Migration on IEF, however, was not between Hb F and Hb A, as predicted from the report of cellulose acetate electrophoresis. By IEF, Hb Austin migrated anodal to ("faster than") Hb A. Hemoglobin Austin (Arg40Ser) appears on IEF as a "fast," anodally migrating Hb variant, just as would be expected from its amino acid substitution. The cited historic report is, at best, not applicable to IEF and is probably erroneous. Our observation of 8 cases in 16 months suggests that this variant may be relatively common in some Hispanic populations, making its recognition important. Furthermore, gene sequencing is proving itself a powerful and reliable tool for definitive identification of Hb variants.
Screening of Variations in CD22 Gene in Children with B-Precursor Acute Lymphoblastic Leukemia.

PubMed

Aslar Oner, Deniz; Akin, Dilara Fatma; Sipahi, Kadir; Mumcuoglu, Mine; Ezer, Ustun; Kürekci, A Emin; Akar, Nejat

2016-09-01

CD22 is expressed on the surface of B-cell lineage cells from the early progenitor stage of pro-B cell until terminal differentiation to mature B cells. It plays a role in signal transduction and as a regulator of B-cell receptor signaling in B-cell development. We aimed to screen exons 9-14 of the CD22 gene, which is a mutational hot spot region in B-precursor acute lymphoblastic leukemia (pre-B ALL) patients, to find possible genetic variants that could play a role in the pathogenesis of pre-B ALL in Turkish children. This study included 109 Turkish children with pre-B ALL who were diagnosed at Losante Hospital for Children with Leukemia. Genomic DNA was extracted from both peripheral blood and bone marrow leukocytes. Gene amplification was performed with PCR, and all samples were screened for the variants by single strand conformation polymorphism. Samples showing band shifts were sequenced on an automated sequencer. In our patient group a total of 9 variants were identified in the CD22 gene by sequencing: a novel variant in intron 10 (T2199G); a missense variant in exon 12; 5 intronic variants between exon 12 and intron 13; a novel intronic variant (C2424T); and a synonymous variant in exon 13. Thirteen of 109 children (11.9%) carried the T2199G novel intronic variant located in intron 10, and 17 of 109 children (15.6%) carried the C2424T novel intronic variant. Novel variants in the CD22 gene that are not present in the Human Gene Mutation Database or the NCBI SNP database were found in children with pre-B ALL in Turkey.
Plasmodium falciparum-like parasites infecting wild apes in southern Cameroon do not represent a recurrent source of human malaria

PubMed Central

Sundararaman, Sesh A.; Liu, Weimin; Keele, Brandon F.; Learn, Gerald H.; Bittinger, Kyle; Mouacha, Fatima; Ahuka-Mundeke, Steve; Manske, Magnus; Sherrill-Mix, Scott; Li, Yingying; Malenke, Jordan A.; Delaporte, Eric; Laurent, Christian; Mpoudi Ngole, Eitel; Kwiatkowski, Dominic P.; Shaw, George M.; Rayner, Julian C.; Peeters, Martine; Sharp, Paul M.; Bushman, Frederic D.; Hahn, Beatrice H.

2013-01-01

Wild-living chimpanzees and gorillas harbor a multitude of Plasmodium species, including six of the subgenus Laverania, one of which served as the progenitor of Plasmodium falciparum. Despite the magnitude of this reservoir, it is unknown whether apes represent a source of human infections. Here, we used Plasmodium species-specific PCR, single-genome amplification, and 454 sequencing to screen humans from remote areas of southern Cameroon for ape Laverania infections. Among 1,402 blood samples, we found 1,000 to be Plasmodium mitochondrial DNA (mtDNA) positive, all of which contained human parasites as determined by sequencing and/or restriction enzyme digestion. To exclude low-abundance infections, we subjected 514 of these samples to 454 sequencing, targeting a region of the mtDNA genome that distinguishes ape from human Laverania species. Using algorithms specifically developed to differentiate rare Plasmodium variants from 454-sequencing error, we identified single and mixed-species infections with P. falciparum, Plasmodium malariae, and/or Plasmodium ovale. However, none of the human samples contained ape Laverania parasites, including the gorilla precursor of P. falciparum. To characterize further the diversity of P. falciparum in Cameroon, we used single-genome amplification to amplify 3.4-kb mtDNA fragments from 229 infected humans. Phylogenetic analysis identified 62 new variants, all of which clustered with extant P. falciparum, providing further evidence that P. falciparum emerged following a single gorilla-to-human transmission. Thus, unlike Plasmodium knowlesi-infected macaques in southeast Asia, African apes harboring Laverania parasites do not seem to serve as a recurrent source of human malaria, a finding of import to ongoing control and eradication measures. PMID:23569255
Application of a mitochondrial DNA control region frequency database for UK domestic cats.

PubMed

Ottolini, Barbara; Lall, Gurdeep Matharu; Sacchini, Federico; Jobling, Mark A; Wetton, Jon H

2017-03-01

DNA variation in 402bp of the mitochondrial control region flanked by repeat sequences RS2 and RS3 was evaluated by Sanger sequencing in 152 English domestic cats, in order to determine the significance of matching DNA sequences between hairs found with a victim's body and the suspect's pet cat. Whilst 95% of English cats possessed one of the twelve globally widespread mitotypes, four new variants were observed, the most common of which (2% frequency) was shared with the evidential samples. No significant difference in mitotype frequency was seen between 32 individuals from the locality of the crime and 120 additional cats from the rest of England, suggesting a lack of local population structure. However, significant differences were observed in comparison with frequencies in other countries, including the closely neighbouring Netherlands, highlighting the importance of appropriate genetic databases when determining the evidential significance of mitochondrial DNA evidence. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Efficient generation of transgenic cattle using the DNA transposon and their analysis by next-generation sequencing

PubMed Central

Yum, Soo-Young; Lee, Song-Jeon; Kim, Hyun-Min; Choi, Woo-Jae; Park, Ji-Hyun; Lee, Won-Wu; Kim, Hee-Soo; Kim, Hyeong-Jong; Bae, Seong-Hun; Lee, Je-Hyeong; Moon, Joo-Yeong; Lee, Ji-Hyun; Lee, Choong-Il; Son, Bong-Jun; Song, Sang-Hoon; Ji, Su-Min; Kim, Seong-Jin; Jang, Goo

2016-01-01

Here, we efficiently generated transgenic cattle using two transposon systems (Sleeping Beauty and Piggybac) and analyzed their genomes by next-generation sequencing (NGS). Blastocysts derived from microinjection of DNA transposons were selected and transferred into recipient cows. Nine transgenic cattle have been generated and raised to date, all but two without any health issues. Some of them expressed strong fluorescence, and the transgene was detected by PCR and sequencing in oocytes from one superovulated animal. To investigate genomic variants caused by transgene transposition, whole genomic DNA was analyzed by NGS. The preferred transposon integration sites (TA or TTAA) were identified in their genomes. Even though multiple transgene copies (i.e., fifteen) were confirmed, there was no significant difference in genome instability. In conclusion, we demonstrated that transgenic cattle could be efficiently generated using the DNA transposon system, and all those animals could be a valuable resource for agriculture and veterinary science. PMID:27324781
X-Linked and Autosomal Recessive Alport Syndrome: Pathogenic Variant Features and Further Genotype-Phenotype Correlations

PubMed Central

Savige, Judith; Storey, Helen; Il Cheong, Hae; Gyung Kang, Hee; Park, Eujin; Hilbert, Pascale; Persikov, Anton; Torres-Fernandez, Carmen; Ars, Elisabet; Torra, Roser; Hertz, Jens Michael; Thomassen, Mads; Shagam, Lev; Wang, Dongmao; Wang, Yanyan; Flinter, Frances; Nagel, Mato

2016-01-01

Alport syndrome results from mutations in the COL4A5 (X-linked) or COL4A3/COL4A4 (recessive) genes. This study examined 754 previously unpublished variants in these genes from individuals referred for genetic testing in 12 accredited diagnostic laboratories worldwide, in addition to all published COL4A5, COL4A3 and COL4A4 variants in the LOVD databases. It also determined genotype-phenotype correlations for variants where clinical data were available. Individuals were referred for genetic testing where Alport syndrome was suspected clinically or on biopsy (renal failure, hearing loss, retinopathy, lamellated glomerular basement membrane), variant pathogenicity was assessed using currently-accepted criteria, and variants were examined for gene location, and age at renal failure onset. Results were compared using Fisher’s exact test (DNA Stata). Altogether 754 new DNA variants were identified, an increase of 25%, predominantly in people of European background. Of the 1168 COL4A5 variants, 504 (43%) were missense mutations, 273 (23%) splicing variants, 73 (6%) nonsense mutations, 169 (14%) short deletions and 76 (7%) complex or large deletions. Only 135 of the 432 Gly residues in the collagenous sequence were substituted (31%), which means that fewer than 10% of all possible variants have been identified. Both missense and nonsense mutations in COL4A5 were not randomly distributed but more common at the 70 CpG sequences (p < 10^-41 and p < 0.001 respectively). Gly>Ala substitutions were underrepresented in all three genes (p < 0.0001) probably because of an association with a milder phenotype. The average age at end-stage renal failure was the same for all mutations in COL4A5 (24.4 ± 7.8 years), COL4A3 (23.3 ± 9.3) and COL4A4 (25.4 ± 10.3) (COL4A5 and COL4A3, p = 0.45; COL4A5 and COL4A4, p = 0.55; COL4A3 and COL4A4, p = 0.41). For COL4A5, renal failure occurred sooner with non-missense than missense variants (p < 0.01). For the COL4A3 and COL4A4 genes, age at renal failure occurred sooner with two non-missense variants (p = 0.08, and p = 0.01 respectively). Thus DNA variant characteristics that predict age at renal failure appeared to be the same for all three Alport genes. Founder mutations (with the pathogenic variant in at least 5 apparently unrelated individuals) were not necessarily associated with a milder phenotype. This study illustrates the benefits when routine diagnostic laboratories share and analyse their data. PMID:27627812
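The abstract above states that results were compared with Fisher's exact test. The following is a minimal standard-library sketch of a two-sided test on a 2x2 table, included only to illustrate the calculation. The function name is an assumption, and the counts in the usage lines are hypothetical and are not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        # when all row and column totals are held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Hypothetical counts: variants at CpG sites versus elsewhere,
    # split by missense and non-missense.
    print(fisher_exact_two_sided(8, 2, 1, 5))
```

The two-sided rule used here (summing all tables at most as probable as the observed one) is one common convention; statistical packages may offer others.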
27nt-RNAs guide histone variant deposition via 'RNA-induced DNA replication interference' and thus transmit parental genome partitioning in Stylonychia.

PubMed

Postberg, Jan; Jönsson, Franziska; Weil, Patrick Philipp; Bulic, Aneta; Juranek, Stefan Andreas; Lipps, Hans-Joachim

2018-06-12

During sexual reproduction in the unicellular ciliate Stylonychia, somatic macronuclei differentiate from germline micronuclei. In this process, programmed sequence reduction takes place, leading to the elimination of >95% of germline sequences, which first adopt a heterochromatin structure via H3K27me3. Simultaneously, 27nt-ncRNAs are synthesized from parental transcripts and are bound by the Argonaute protein PIWI1. These 27nt-ncRNAs cover sequences destined for the developing macronucleus and are thought to protect them from degradation. We provide evidence and propose that RNA/DNA base-pairing guides PIWI1/27nt-RNA complexes to complementary macronucleus-destined DNA target sequences, hence transiently causing locally stalled replication during polytene chromosome formation. This spatiotemporal delay enables the selective deposition of temporarily available histone H3.4K27me3 nucleosomes at all other sequences being continuously replicated, thus dictating their prospective heterochromatin structure before becoming developmentally eliminated. Concomitantly, 27nt-RNA-covered sites remain protected. We introduce the concept of 'RNA-induced DNA replication interference' and explain how the parental functional genome partition could become transmitted to the progeny.
Molecular characterization of variant alpha-subunit of electron transfer flavoprotein in three patients with glutaric acidemia type II--and identification of glycine substitution for valine-157 in the sequence of the precursor, producing an unstable mature protein in a patient.

PubMed Central

Indo, Y; Glassberg, R; Yokota, I; Tanaka, K

1991-01-01

In our previous study of eight glutaric acidemia type II (GAII) fibroblast lines by using [35S]methionine labeling and immunoprecipitation, three of them had a defect in the synthesis of the alpha-subunit of electron transfer flavoprotein (alpha-ETF) (Ikeda et al. 1986). In one of them (YH1313) the labeling of the mature alpha-ETF was barely detectable, while that of the precursor (p) was stronger. In another (YH605) no synthesis of immunoreactive p alpha-ETF was detectable. In the third cell line (YH1391) the rate of variant p alpha-ETF synthesis was comparable to normal, but its electrophoretic mobility was slightly faster than normal. In the present study, the northern blot analysis revealed that all three mutant cell lines contained p alpha-ETF mRNA and that their size and amount were comparable to normal. In immunoblot analysis, both alpha- and beta-ETF bands were barely detectable in YH1313 and YH605 but were detectable in YH1391 in amounts comparable to normal. Sequencing of YH1313 p alpha-ETF cDNA via PCR identified a transversion of T-470 to G. We then devised a simple PCR method for the 119-bp section (T-443/G-561) for detecting this mutation. In the upstream primer, A-466 was artificially replaced with C, to introduce a BstNI site into the amplified copies in the presence of G-470 from the variant sequence. The genomic DNA analysis using this method demonstrated that YH1313 was homozygous for the T→G-470 transversion. It was not detected either in two other alpha-ETF-deficient GAII or in seven control cell lines. The alpha-ETF cDNA sequence in YH605 was identical to normal. PMID:1882842
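The detection method in the abstract above relies on a primer mismatch that creates a BstNI site only when the variant base is present. A minimal sketch of that logic, using the BstNI recognition sequence CCWGG (W = A or T); the function name and the two short sequences are hypothetical stand-ins, because the abstract does not give the bases flanking positions 466-470:

```python
import re

# BstNI recognition sequence: CCWGG, where W is A or T.
BSTNI_SITE = re.compile(r"CC[AT]GG")


def bstni_sites(seq):
    """Return 0-based start positions of BstNI sites in a DNA string."""
    return [match.start() for match in BSTNI_SITE.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical amplicon fragments, both carrying the primer-introduced C.
    # Only the last base of the five-base window differs (T normal, G variant).
    normal_amplicon = "GATCCAGTTAC"
    variant_amplicon = "GATCCAGGTAC"
    print(bstni_sites(normal_amplicon))   # no site expected
    print(bstni_sites(variant_amplicon))  # one site expected
```

In this design the enzyme cuts only amplicons from the variant allele, so a homozygous variant sample would be fully digested while control samples stay uncut.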

The Genome of the Netherlands: design, and project goals.

PubMed

Boomsma, Dorret I; Wijmenga, Cisca; Slagboom, Eline P; Swertz, Morris A; Karssen, Lennart C; Abdellaoui, Abdel; Ye, Kai; Guryev, Victor; Vermaat, Martijn; van Dijk, Freerk; Francioli, Laurent C; Hottenga, Jouke Jan; Laros, Jeroen F J; Li, Qibin; Li, Yingrui; Cao, Hongzhi; Chen, Ruoyan; Du, Yuanping; Li, Ning; Cao, Sujie; van Setten, Jessica; Menelaou, Androniki; Pulit, Sara L; Hehir-Kwa, Jayne Y; Beekman, Marian; Elbers, Clara C; Byelas, Heorhiy; de Craen, Anton J M; Deelen, Patrick; Dijkstra, Martijn; den Dunnen, Johan T; de Knijff, Peter; Houwing-Duistermaat, Jeanine; Koval, Vyacheslav; Estrada, Karol; Hofman, Albert; Kanterakis, Alexandros; Enckevort, David van; Mai, Hailiang; Kattenberg, Mathijs; van Leeuwen, Elisabeth M; Neerincx, Pieter B T; Oostra, Ben; Rivadeneira, Fernando; Suchiman, Eka H D; Uitterlinden, Andre G; Willemsen, Gonneke; Wolffenbuttel, Bruce H; Wang, Jun; de Bakker, Paul I W; van Ommen, Gert-Jan; van Duijn, Cornelia M

2014-02-01

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent-offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910-1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14-15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project.
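The trio design described above is what makes it possible to estimate the rate of de novo mutational events. A minimal sketch of the underlying idea, a Mendelian consistency check at one autosomal site, is given below. The function names and genotypes are hypothetical, and the project's actual pipeline is not described in the abstract; real callers work with genotype likelihoods and sequencing depth rather than hard calls.

```python
def is_mendelian_consistent(child, mother, father):
    """True if the child's two alleles can come one from each parent.

    Each genotype is a 2-tuple of alleles at one autosomal site,
    for example ("A", "G").
    """
    first, second = child
    return (first in mother and second in father) or (
        second in mother and first in father
    )


def is_de_novo_candidate(child, mother, father):
    """Flag sites where the child carries an allele absent from both parents."""
    return not is_mendelian_consistent(child, mother, father)


if __name__ == "__main__":
    # Hypothetical genotypes at a single site.
    print(is_de_novo_candidate(("A", "G"), ("A", "A"), ("A", "A")))
    print(is_de_novo_candidate(("A", "G"), ("A", "A"), ("G", "G")))
```

With coverage of 14-15x, an apparent inconsistency can also come from a missed heterozygous call in a parent, so flagged sites are candidates only.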
Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

PubMed

Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

2016-05-01

The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement of data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established nomenclature for CE-based STR analysis will remain unchanged in the future, the nomenclature of sequence-based STR genotypes will need to follow updated rules and be generated by expert systems that translate MPS sequences to match CE conventions in order to guarantee compatibility between the different generations of STR data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
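The abstract above distinguishes the stored sequence string from the nomenclature derived from it. A minimal sketch of that translation, assuming a simple tetranucleotide repeat region with no partial repeats or flanking variants, is shown below. The function name, the bracketed output style, and the example sequence are illustrative and are not the format prescribed by the ISFG.

```python
def bracket_repeats(seq, motif_len=4):
    """Collapse a repeat-region string into bracketed motif runs.

    Returns (description, total_units). The total number of repeat units
    is the figure that corresponds to a length-based (CE-style) allele call.
    Assumes the region length is an exact multiple of motif_len.
    """
    seq = seq.upper()
    if len(seq) % motif_len != 0:
        raise ValueError("sequence length is not a multiple of the motif length")
    units = [seq[i:i + motif_len] for i in range(0, len(seq), motif_len)]
    runs = []
    for unit in units:
        if runs and runs[-1][0] == unit:
            runs[-1][1] += 1  # extend the current run of identical motifs
        else:
            runs.append([unit, 1])  # start a new run
    description = " ".join(f"[{motif}]{count}" for motif, count in runs)
    return description, len(units)


if __name__ == "__main__":
    # Hypothetical repeat region made of two motif types.
    print(bracket_repeats("TCTATCTATCTATCTGTCTG"))
```

Two alleles with the same total unit count but different internal motif arrangements look identical by length alone yet differ in the bracketed description, which is the extra polymorphism the abstract refers to.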
Molecular characterization of canine parvovirus variants (CPV-2a, CPV-2b, and CPV-2c) based on the VP2 gene in affected domestic dogs in Ecuador.

PubMed

la Torre, David De; Mafla, Eulalia; Puga, Byron; Erazo, Linda; Astolfi-Ferreira, Claudete; Ferreira, Antonio Piantino

2018-04-01

The objective of this study was to determine the presence of the variants of canine parvovirus (CPV)-2 in the city of Quito, Ecuador, due to the high domestic and street-type canine population, and to identify possible mutations at a genetic level that could be causing structural changes in the virus with a consequent influence on the immune response of the hosts. Thirty-five stool samples from different puppies with characteristic signs of the disease and positive for CPV through immunochromatography kits were collected from different veterinary clinics of the city. Polymerase chain reaction and DNA sequencing were used to determine the mutations in residue 426 of the VP2 gene, which determines the variants of CPV-2; in addition, four samples were chosen for complete sequencing of the VP2 gene to identify all possible mutations in the circulating strains in this region of the country. The results revealed the presence of the three variants of CPV-2 with a prevalence of 57.1% (20/35) for CPV-2a, 8.5% (3/35) for CPV-2b, and 34.3% (12/35) for CPV-2c. In addition, complete sequencing of the VP2 gene showed amino acid substitutions in residues 87, 101, 139, 219, 297, 300, 305, 322, 324, 375, 386, 426, 440, and 514 of the three Ecuadorian variants when compared with the original CPV-2 sequence. This study describes the detection of CPV variants in the city of Quito, Ecuador. Variants of CPV-2 (2a, 2b, and 2c) have been reported in South America, and there are cases in Ecuador where CPV-2 is affecting even vaccinated puppies.
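The prevalence figures in this record follow directly from the reported counts. The short sketch below recomputes them; the variable names are illustrative. Computed this way, 3/35 rounds to 8.6% rather than the 8.5% printed in the abstract.

```python
# Counts of each CPV-2 variant among the 35 samples, as reported in the abstract.
counts = {"CPV-2a": 20, "CPV-2b": 3, "CPV-2c": 12}
total = sum(counts.values())

# Prevalence as a percentage of all typed samples.
prevalence = {variant: 100 * n / total for variant, n in counts.items()}

for variant, pct in prevalence.items():
    print(f"{variant}: {counts[variant]}/{total} = {pct:.1f}%")
```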
Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

PubMed

Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

2016-10-01

Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Read clouds uncover variation in complex regions of the human genome

PubMed Central

Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E.; West, Robert; Sidow, Arend; Batzoglou, Serafim

2015-01-01

Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. PMID:26286554
Cracking the Code of Human Diseases Using Next-Generation Sequencing: Applications, Challenges, and Perspectives

PubMed Central

Precone, Vincenza; Del Monaco, Valentina; Esposito, Maria Valeria; De Palma, Fatima Domenica Elisa; Ruocco, Anna; D'Argenio, Valeria

2015-01-01

Next-generation sequencing (NGS) technologies have greatly impacted on every field of molecular research mainly because they reduce costs and increase throughput of DNA sequencing. These features, together with the technology's flexibility, have opened the way to a variety of applications including the study of the molecular basis of human diseases. Several analytical approaches have been developed to selectively enrich regions of interest from the whole genome in order to identify germinal and/or somatic sequence variants and to study DNA methylation. These approaches are now widely used in research, and they are already being used in routine molecular diagnostics. However, some issues are still controversial, namely, standardization of methods, data analysis and storage, and ethical aspects. Besides providing an overview of the NGS-based approaches most frequently used to study the molecular basis of human diseases at DNA level, we discuss the principal challenges and applications of NGS in the field of human genomics. PMID:26665001
Sequence variant classification and reporting: recommendations for improving the interpretation of cancer susceptibility genetic test results

PubMed Central

Plon, Sharon E.; Eccles, Diana M.; Easton, Douglas; Foulkes, William D.; Genuardi, Maurizio; Greenblatt, Marc S.; Hogervorst, Frans B.L.; Hoogerbrugge, Nicoline; Spurdle, Amanda B.; Tavtigian, Sean

2011-01-01

Genetic testing of cancer susceptibility genes is now widely applied in clinical practice to predict risk of developing cancer. In general, sequence-based testing of germline DNA is used to determine whether an individual carries a change that is clearly likely to disrupt normal gene function. Genetic testing may detect changes that are clearly pathogenic, clearly neutral or variants of unclear clinical significance. Such variants present a considerable challenge to the diagnostic laboratory and the receiving clinician in terms of interpretation and clear presentation of the implications of the result to the patient. There does not appear to be a consistent approach to interpreting and reporting the clinical significance of variants either among genes or among laboratories. The potential for confusion among clinicians and patients is considerable and misinterpretation may lead to inappropriate clinical consequences. In this article we review the current state of sequence-based genetic testing, describe other standardized reporting systems used in oncology and propose a standardized classification system for application to sequence based results for cancer predisposition genes. We suggest a system of five classes of variants based on the degree of likelihood of pathogenicity. Each class is associated with specific recommendations for clinical management of at-risk relatives that will depend on the syndrome. We propose that panels of experts on each cancer predisposition syndrome facilitate the classification scheme and designate appropriate surveillance and cancer management guidelines. The international adoption of a standardized reporting system should improve the clinical utility of sequence-based genetic tests to predict cancer risk. PMID:18951446
Incorporation of native antibodies and Fc-fusion proteins on DNA nanostructures via a modular conjugation strategy. Electronic supplementary information (ESI) available: Experimental methods, DNA origami design, DNA sequences, and additional experimental data. See DOI: 10.1039/c7cc04178k

PubMed Central

Rosier, Bas J. H. M.; Cremers, Glenn A. O.; Engelen, Wouter; Merkx, Maarten; Brunsveld, Luc

2017-01-01

A photocrosslinkable protein G variant was used as an adapter protein to covalently and site-specifically conjugate an antibody and an Fc-fusion protein to an oligonucleotide. This modular approach enables straightforward decoration of DNA nanostructures with complex native proteins while retaining their innate binding affinity, allowing precise control over the nanoscale spatial organization of such proteins for in vitro and in vivo biomedical applications. PMID:28617516
Molecular structure and chromosome distribution of three repetitive DNA families in Anemone hortensis L. (Ranunculaceae).

PubMed

Mlinarec, Jelena; Chester, Mike; Siljak-Yakovlev, Sonja; Papes, Drazena; Leitch, Andrew R; Besendorfer, Visnja

2009-01-01

The structure, abundance and location of repetitive DNA sequences on chromosomes can characterize the nature of higher plant genomes. Here we report on three new repeat DNA families isolated from Anemone hortensis L.: (i) AhTR1, a family of satellite DNA (stDNA) composed of a 554-561 bp long EcoRV monomer; (ii) AhTR2, a stDNA family composed of a 743 bp long HindIII monomer; and (iii) AhDR, a repeat family composed of a 945 bp long HindIII fragment that exhibits some sequence similarity to Ty3/gypsy-like retroelements. Fluorescence in-situ hybridization (FISH) to metaphase chromosomes of A. hortensis (2n = 16) revealed that both AhTR1 and AhTR2 sequences co-localized with DAPI-positive AT-rich heterochromatic regions. AhTR1 sequences occur at intercalary DAPI bands while AhTR2 sequences occur at 8-10 terminally located heterochromatic blocks. In contrast, AhDR sequences are dispersed over all chromosomes, as expected of a Ty3/gypsy-like element. AhTR2 and AhTR1 repeat families include polyA- and polyT-tracks, AT/TA-motifs and a pentanucleotide sequence (CAAAA) that may have consequences for chromatin packing and sequence homogeneity. AhTR2 repeats also contain TTTAGGG motifs and degenerate variants. We suggest that they arose by interspersion of telomeric repeats with subtelomeric repeats, before hybrid unit(s) amplified through the heterochromatic domain. The three repetitive DNA families together occupy approximately 10% of the A. hortensis genome. Comparative analyses of eight Anemone species revealed that the divergence of the A. hortensis genome was accompanied by considerable modification and/or amplification of repeats.
Targeted enrichment of ancient pathogens yielding the pPCP1 plasmid of Yersinia pestis from victims of the Black Death.

PubMed

Schuenemann, Verena J; Bos, Kirsten; DeWitte, Sharon; Schmedes, Sarah; Jamieson, Joslyn; Mittnik, Alissa; Forrest, Stephen; Coombes, Brian K; Wood, James W; Earn, David J D; White, William; Krause, Johannes; Poinar, Hendrik N

2011-09-20

Although investigations of medieval plague victims have identified Yersinia pestis as the putative etiologic agent of the pandemic, methodological limitations have prevented large-scale genomic investigations to evaluate changes in the pathogen's virulence over time. We screened over 100 skeletal remains from Black Death victims of the East Smithfield mass burial site (1348-1350, London, England). Recent methods of DNA enrichment coupled with high-throughput DNA sequencing subsequently permitted reconstruction of ten full human mitochondrial genomes (16 kb each) and the full pPCP1 (9.6 kb) virulence-associated plasmid at high coverage. Comparisons of molecular damage profiles between endogenous human and Y. pestis DNA confirmed its authenticity as an ancient pathogen, thus representing the longest contiguous genomic sequence for an ancient pathogen to date. Comparison of our reconstructed plasmid against modern Y. pestis shows identity with several isolates matching the Medievalis biovar; however, our chromosomal sequences indicate the victims were infected with a Y. pestis variant that has not been previously reported. Our data reveal that the Black Death in medieval Europe was caused by a variant of Y. pestis that may no longer exist, and genetic data carried on its pPCP1 plasmid were not responsible for the purported epidemiological differences between ancient and modern forms of Y. pestis infections.
An Engineered Kinetic Amplification Mechanism for Single Nucleotide Variant Discrimination by DNA Hybridization Probes.

PubMed

Chen, Sherry Xi; Seelig, Georg

2016-04-20

Even a single-nucleotide difference between the sequences of two otherwise identical biological nucleic acids can have dramatic functional consequences. Here, we use model-guided reaction pathway engineering to quantitatively improve the performance of selective hybridization probes in recognizing single nucleotide variants (SNVs). Specifically, we build a detection system that combines discrimination by competition with DNA strand displacement-based catalytic amplification. We show, both mathematically and experimentally, that the single nucleotide selectivity of such a system in binding to single-stranded DNA and RNA is quadratically better than discrimination due to competitive hybridization alone. As an additional benefit, the integrated circuit inherits the property of amplification and provides at least 10-fold better sensitivity than standard hybridization probes. Moreover, we demonstrate how the detection mechanism can be tuned such that the detection reaction is agnostic to the position of the SNV within the target sequence. In contrast, prior strand displacement-based probes designed for kinetic discrimination are highly sensitive to position effects. We apply our system to reliably discriminate between different members of the let-7 microRNA family that differ in only a single base position. Our results demonstrate the power of systematic reaction network design to quantitatively improve biotechnology.
Sequencing small genomic targets with high efficiency and extreme accuracy

PubMed Central

Schmitt, Michael W.; Fox, Edward J.; Prindle, Marc J.; Reid-Bayliss, Kate S.; True, Lawrence D.; Radich, Jerald P.; Loeb, Lawrence A.

2015-01-01

The detection of minority variants in mixed samples demands methods for enrichment and accurate sequencing of small genomic intervals. We describe an efficient approach based on sequential rounds of hybridization with biotinylated oligonucleotides, enabling more than one-million fold enrichment of genomic regions of interest. In conjunction with error correcting double-stranded molecular tags, our approach enables the quantification of mutations in individual DNA molecules. PMID:25849638
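To make the idea of error correction with double-stranded molecular tags more concrete, here is a generic, heavily simplified sketch: reads sharing a tag are collapsed into a per-strand majority consensus, and a base is kept only when both strand consensuses agree. This is an assumption-laden illustration, not the authors' pipeline; the function names, the tuple layout, and the toy reads are all hypothetical, and reads are assumed to be equal-length and already oriented to the same reference strand.

```python
from collections import Counter, defaultdict

def consensus(reads):
    """Majority base at each position across equal-length reads."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

def duplex_consensus(tagged_reads):
    """Collapse (tag, strand, sequence) tuples into one sequence per tag.

    Positions where the '+' and '-' strand consensuses disagree become 'N'.
    Tags seen on only one strand are dropped.
    """
    families = defaultdict(lambda: defaultdict(list))
    for tag, strand, seq in tagged_reads:
        families[tag][strand].append(seq)

    result = {}
    for tag, strands in families.items():
        if "+" not in strands or "-" not in strands:
            continue  # no complementary-strand support for this molecule
        plus = consensus(strands["+"])
        minus = consensus(strands["-"])
        result[tag] = "".join(p if p == m else "N" for p, m in zip(plus, minus))
    return result

# Toy data: tag, strand of origin, read sequence (all invented).
reads = [
    ("AAT", "+", "ACGT"), ("AAT", "+", "ACGT"), ("AAT", "+", "ACTT"),
    ("AAT", "-", "ACGT"), ("AAT", "-", "ACGT"),
    ("GGC", "+", "ACGT"), ("GGC", "-", "ACTT"),
    ("TTT", "+", "ACGT"),
]
print(duplex_consensus(reads))
```

Requiring agreement between the two strands is what removes errors that arise on only one strand, such as damage or early PCR errors.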
cyvcf2: fast, flexible variant analysis with Python.

PubMed

Pedersen, Brent S; Quinlan, Aaron R

2017-06-15

Variant call format (VCF) files document the genetic variation observed after DNA sequencing, alignment and variant calling of a sample cohort. Given the complexity of the VCF format as well as the diverse variant annotations and genotype metadata, there is a need for fast, flexible methods enabling intuitive analysis of the variant data within VCF and BCF files. We introduce cyvcf2, a Python library and software package for fast parsing and querying of VCF and BCF files and illustrate its speed, simplicity and utility. Contact: bpederse@gmail.com or aaronquinlan@gmail.com. cyvcf2 is available from https://github.com/brentp/cyvcf2 under the MIT license and from common python package managers. Detailed documentation is available at http://brentp.github.io/cyvcf2/. © The Author 2017. Published by Oxford University Press.
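For readers unfamiliar with the format, the sketch below splits one VCF data line into its fixed fields using only the Python standard library. It deliberately does not use or imitate the cyvcf2 API; the function name, the dictionary keys, and the example record (including its identifier) are made up for illustration, and the many edge cases that a real parser handles are ignored.

```python
def parse_vcf_line(line: str) -> dict:
    """Parse the eight fixed columns of a single tab-separated VCF data line."""
    chrom, pos, vid, ref, alt, qual, flt, info = line.rstrip("\n").split("\t")[:8]

    info_dict = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        # Keys without a value (flags) are stored as True.
        info_dict[key] = value if value else True

    return {
        "chrom": chrom,
        "pos": int(pos),          # 1-based position
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),    # one entry per alternate allele
        "qual": qual,
        "filter": flt,
        "info": info_dict,
    }

# Invented example record.
record = parse_vcf_line("1\t12345\trs0000\tA\tG,T\t50\tPASS\tAF=0.01;DB")
print(record["chrom"], record["pos"], record["ref"], record["alt"], record["info"])
```

A dedicated library is preferable for real data, since this sketch skips header parsing, genotype columns, BCF input, and typed INFO values.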
A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer.

PubMed

Zheng, Yu; Wang, Hai-Lin; Li, Jian-Kang; Xu, Li; Tellier, Laurent; Li, Xiao-Lin; Huang, Xiao-Yan; Li, Wei; Niu, Tong-Tong; Yang, Huan-Ming; Zhang, Jian-Guo; Liu, Dong-Ning

2018-01-01

To study the genes responsible for retinitis pigmentosa. A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer's instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5) was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one sporadic case. Sanger sequencing was performed upon probands, as well as upon other family members. This novel, pathogenic genotype co-segregated with retinitis pigmentosa phenotype in these two families. ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of the retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research.
Analysis of Duck Hepatitis B Virus Reverse Transcription Indicates a Common Mechanism for the Two Template Switches during Plus-Strand DNA Synthesis

PubMed Central

Havert, Michael B.; Ji, Lin; Loeb, Daniel D.

2002-01-01

The synthesis of the hepadnavirus relaxed circular DNA genome requires two template switches, primer translocation and circularization, during plus-strand DNA synthesis. Repeated sequences serve as donor and acceptor templates for these template switches, with direct repeat 1 (DR1) and DR2 for primer translocation and 5′r and 3′r for circularization. These donor and acceptor sequences are at, or near, the ends of the minus-strand DNA. Analysis of plus-strand DNA synthesis of duck hepatitis B virus (DHBV) has indicated that there are at least three other cis-acting sequences that make contributions during the synthesis of relaxed circular DNA. These sequences, 5E, M, and 3E, are located near the 5′ end, the middle, and the 3′ end of minus-strand DNA, respectively. The mechanism by which these sequences contribute to the synthesis of plus-strand DNA was unclear. Our aim was to better understand the mechanism by which 5E and M act. We localized the DHBV 5E element to a short sequence of approximately 30 nucleotides that is 100 nucleotides 3′ of DR2 on minus-strand DNA. We found that the new 5E mutants were partially defective for primer translocation/utilization at DR2. They were also invariably defective for circularization. In addition, examination of several new DHBV M variants indicated that they too were defective for primer translocation/utilization and circularization. Thus, this analysis indicated that 5E and M play roles in both primer translocation/utilization and circularization. In conjunction with earlier findings that 3E functions in both template switches, our findings indicate that the processes of primer translocation and circularization share a common underlying mechanism. PMID:11861843
Exome Sequencing Analysis Reveals Variants in Primary Immunodeficiency Genes in Patients With Very Early Onset Inflammatory Bowel Disease

PubMed Central

Kelsen, Judith R.; Dawany, Noor; Moran, Christopher J.; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S.; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F.; Daly, Mark; Sullivan, Kathleen E.; Baldassano, Robert N.; Devoto, Marcella

2016-01-01

Background & Aims: Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed ≤5 y of age, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Methods: Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (ages 3 weeks to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by post-processing and variant calling. Following functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency <0.1%, and scaled combined annotation dependent depletion scores ≤10. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n=45) or adult-onset Crohn's disease (n=20) and healthy individuals (controls, n=145) were obtained from the University of Kiel, Germany and used as control groups. Results: Four-hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling > 1 Mbp of coding sequence, were selected from the whole exome data. Our analysis revealed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. Conclusions: In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis.
Our analysis could lead to the identification of previously unidentified IBD-associated variants. PMID:26193622
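The candidate-variant selection described in this record can be sketched as a simple filter. The example below is a minimal sketch, assuming each variant carries a gene name, a minor allele frequency, and a flag for a predicted protein change; the class, field names, and toy values are hypothetical. Only the gene-panel, protein-change, and frequency criteria are applied here, and the annotation-score criterion is left out because its cutoff should be taken from the full methods rather than guessed.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    gene: str
    maf: float             # minor allele frequency as a fraction (0.001 == 0.1%)
    alters_protein: bool   # True if the variant is predicted to change the protein

def candidate_variants(variants, panel_genes, max_maf=0.001):
    """Keep protein-altering variants in panel genes with MAF below `max_maf`."""
    return [
        v for v in variants
        if v.gene in panel_genes and v.alters_protein and v.maf < max_maf
    ]

# Gene names come from the abstract; every other value is invented.
panel = {"IL10RA", "MSH5", "CD19"}
calls = [
    Variant("IL10RA", 0.0004, True),
    Variant("MSH5", 0.02, True),       # too common
    Variant("CD19", 0.0001, False),    # no predicted protein change
    Variant("GENE_X", 0.0001, True),   # hypothetical gene outside the panel
]
print([v.gene for v in candidate_variants(calls, panel)])
```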
Pooled Sequencing of 531 Genes in Inflammatory Bowel Disease Identifies an Associated Rare Variant in BTNL2 and Implicates Other Immune Related Genes

PubMed Central

Prescott, Natalie J.; Lehne, Benjamin; Stone, Kristina; Lee, James C.; Taylor, Kirstin; Knight, Jo; Papouli, Efterpi; Mirza, Muddassar M.; Simpson, Michael A.; Spain, Sarah L.; Lu, Grace; Fraternali, Franca; Bumpstead, Suzannah J.; Gray, Emma; Amar, Ariella; Bye, Hannah; Green, Peter; Chung-Faye, Guy; Hayee, Bu’Hussain; Pollok, Richard; Satsangi, Jack; Parkes, Miles; Barrett, Jeffrey C.; Mansfield, John C.; Sanderson, Jeremy; Lewis, Cathryn M.; Weale, Michael E.; Schlitt, Thomas; Mathew, Christopher G.

2015-01-01

The contribution of rare coding sequence variants to genetic susceptibility in complex disorders is an important but unresolved question. Most studies thus far have investigated a limited number of genes from regions which contain common disease associated variants. Here we investigate this in inflammatory bowel disease by sequencing the exons and proximal promoters of 531 genes selected from both genome-wide association studies and pathway analysis in pooled DNA panels from 474 cases of Crohn’s disease and 480 controls. 80 variants with evidence of association in the sequencing experiment or with potential functional significance were selected for follow up genotyping in 6,507 IBD cases and 3,064 population controls. The top 5 disease associated variants were genotyped in an extension panel of 3,662 IBD cases and 3,639 controls, and tested for association in a combined analysis of 10,147 IBD cases and 7,008 controls. A rare coding variant p.G454C in the BTNL2 gene within the major histocompatibility complex was significantly associated with increased risk for IBD (p = 9.65 × 10^-10, OR = 2.3 [95% CI = 1.75–3.04]), but was independent of the known common associated CD and UC variants at this locus. Rare (<1%) and low frequency (1–5%) variants in 3 additional genes showed suggestive association (p<0.005) with either an increased risk (ARIH2 c.338-6C>T) or decreased risk (IL12B p.V298F, and NICN p.H191R) of IBD. These results provide additional insights into the involvement of the inhibition of T cell activation in the development of both sub-phenotypes of inflammatory bowel disease. We suggest that although rare coding variants may make a modest overall contribution to complex disease susceptibility, they can inform our understanding of the molecular pathways that contribute to pathogenesis. PMID:25671699
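For readers who want to see how an odds ratio and its 95% confidence interval are typically obtained from a case-control table, the sketch below uses the standard Wald interval on the log odds ratio. The counts are invented purely for illustration and are not the study's data, and the published estimate may well come from a different statistical model.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2x2 table.

    a: carriers among cases      b: non-carriers among cases
    c: carriers among controls   d: non-carriers among controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, not taken from the paper.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {or_:.2f} [95% CI = {lo:.2f}-{hi:.2f}]")
```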
Whole-Exome Sequencing to Identify Novel Biological Pathways Associated With Infertility After Pelvic Inflammatory Disease.

PubMed

Taylor, Brandie D; Zheng, Xiaojing; Darville, Toni; Zhong, Wujuan; Konganti, Kranti; Abiodun-Ojo, Olayinka; Ness, Roberta B; O'Connell, Catherine M; Haggerty, Catherine L

2017-01-01

Ideal management of sexually transmitted infections (STI) may require risk markers for pathology or vaccine development. Previously, we identified common genetic variants associated with chlamydial pelvic inflammatory disease (PID) and reduced fecundity. As this explains only a proportion of the long-term morbidity risk, we used whole-exome sequencing to identify biological pathways that may be associated with STI-related infertility. We obtained stored DNA from 43 non-Hispanic black women with PID from the PID Evaluation and Clinical Health Study. Infertility was assessed at a mean of 84 months. Principal component analysis revealed no population stratification. Potential covariates did not significantly differ between groups. Sequencing kernel association test was used to examine associations between aggregates of variants on a single gene and infertility. The results from the sequencing kernel association test were used to choose "focus genes" (P < 0.01; n = 150) for subsequent Ingenuity Pathway Analysis to identify "gene sets" that are enriched in biologically relevant pathways. Pathway analysis revealed that focus genes were enriched in canonical pathways including IL-1 signaling, P2Y purinergic receptor signaling, and bone morphogenic protein signaling. Focus genes were enriched in pathways that impact innate and adaptive immunity, protein kinase A activity, cellular growth, and DNA repair. These may alter host resistance or immunopathology after infection. Targeted sequencing of biological pathways identified in this study may provide insight into STI-related infertility.
Albumin Redhill (-1 Arg, 320 Ala yields Thr): A glycoprotein variant of human serum albumin whose precursor has an aberrant signal peptidase cleavage site

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brennan, S.O.; Myles, T.; Peach, R.J.

1990-01-01

Albumin Redhill is an electrophoretically slow genetic variant of human serum albumin that does not bind ⁶³Ni²⁺ and has a molecular mass 2.5 kDa higher than normal albumin. Its inability to bind Ni²⁺ was explained by the finding of an additional residue of Arg at position -1. This did not explain the molecular basis of the genetic variation or the increase in apparent molecular mass. Fractionation of tryptic digests on concanavalin A-Sepharose followed by peptide mapping of the bound and unbound fractions and sequence analysis of the glycopeptides identified a mutation of 320 Ala → Thr. This introduces an Asn-Tyr-Thr oligosaccharide attachment sequence centered on Asn-318 and explains the increase in molecular mass. This, however, did not satisfactorily explain the presence of the additional Arg residue at position -1. DNA sequencing of polymerase chain reaction-amplified genomic DNA encoding the prepro sequence of albumin indicated an additional mutation of -2 Arg → Cys. The authors propose that the new Phe-Cys-Arg sequence in the propeptide is an aberrant signal peptidase cleavage site and that the signal peptidase cleaves the propeptide of albumin Redhill in the lumen of the endoplasmic reticulum before it reaches the Golgi vesicles, the site of the diarginyl-specific proalbumin convertase.
Designing oligo libraries taking alternative splicing into account

NASA Astrophysics Data System (ADS)

Shoshan, Avi; Grebinskiy, Vladimir; Magen, Avner; Scolnicov, Ariel; Fink, Eyal; Lehavi, David; Wasserman, Alon

2001-06-01

We have designed sequences for DNA microarrays and oligo libraries, taking alternative splicing into account. Alternative splicing is a common phenomenon, occurring in more than 25% of the human genes. In many cases, different splice variants have different functions, are expressed in different tissues or may indicate different stages of disease. When designing sequences for DNA microarrays or oligo libraries, it is very important to take into account the sequence information of all the mRNA transcripts. Therefore, when a gene has more than one transcript (as a result of alternative splicing, alternative promoter sites or alternative poly-adenylation sites), it is very important to take all of them into account in the design. We have used the LEADS transcriptome prediction system to cluster and assemble the human sequences in GenBank and design optimal oligonucleotides for all the human genes with a known mRNA sequence based on the LEADS predictions.
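One way to picture why all transcripts of a gene matter for oligo design is to compare the k-mers that occur in every splice variant with those unique to a single variant. The sketch below does this with plain Python sets; it is not the LEADS system, and the function names, toy exons, and choice of k are all hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_and_specific(transcripts: dict, k: int):
    """Return k-mers common to all transcripts and k-mers unique to each one."""
    sets = {name: kmers(seq, k) for name, seq in transcripts.items()}
    shared = set.intersection(*sets.values())
    specific = {}
    for name, own in sets.items():
        others = set().union(*(s for other, s in sets.items() if other != name))
        specific[name] = own - others
    return shared, specific

# Toy exons (invented): variant_b skips exon 2.
exon1, exon2, exon3 = "ATGGCC", "TTTAAA", "GGGCCC"
transcripts = {
    "variant_a": exon1 + exon2 + exon3,
    "variant_b": exon1 + exon3,
}
shared, specific = shared_and_specific(transcripts, k=4)
print(sorted(shared))
print({name: sorted(s) for name, s in specific.items()})
```

Probes drawn from the shared set would report the gene as a whole, while probes drawn from a variant-specific set (for example, across an exon-exon junction) would distinguish one splice form from another.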

Bioenergetics in human evolution and disease: implications for the origins of biological complexity and the missing genetic variation of common diseases.

PubMed

Wallace, Douglas C

2013-07-19

Two major inconsistencies exist in the current neo-Darwinian evolutionary theory that random chromosomal mutations acted on by natural selection generate new species. First, natural selection does not require the evolution of ever increasing complexity, yet this is the hallmark of biology. Second, human chromosomal DNA sequence variation is predominantly either neutral or deleterious and is insufficient to provide the variation required for speciation or for predilection to common diseases. Complexity is explained by the continuous flow of energy through the biosphere that drives the accumulation of nucleic acids and information. Information then encodes complex forms. In animals, energy flow is primarily mediated by mitochondria whose maternally inherited mitochondrial DNA (mtDNA) codes for key genes for energy metabolism. In mammals, the mtDNA has a very high mutation rate, but the deleterious mutations are removed by an ovarian selection system. Hence, new mutations that subtly alter energy metabolism are continuously introduced into the species, permitting adaptation to regional differences in energy environments. Therefore, the most phenotypically significant gene variants arise in the mtDNA, are regional, and permit animals to occupy peripheral energy environments where rarer nuclear DNA (nDNA) variants can accumulate, leading to speciation. The neutralist-selectionist debate is then a consequence of mammals having two different evolutionary strategies: a fast mtDNA strategy for intra-specific radiation and a slow nDNA strategy for speciation. Furthermore, the missing genetic variation for common human diseases is primarily mtDNA variation plus regional nDNA variants, both of which have been missed by large, inter-population association studies.
Variant translocation partners of the anaplastic lymphoma kinase (ALK) gene in two cases of anaplastic large cell lymphoma, identified by inverse cDNA polymerase chain reaction.

PubMed

Takeoka, Kayo; Okumura, Atsuko; Honjo, Gen; Ohno, Hitoshi

2014-01-01

In anaplastic large cell lymphoma (ALCL), the anaplastic lymphoma kinase (ALK) gene is rearranged with diverse partners due to variant translocations/inversions. Case 1 was a 39-year-old man who developed multiple tumors in the mediastinum, psoas muscle, lung, and lymph nodes. A biopsy specimen of the inguinal node was effaced by large tumor cells expressing CD30, epithelial membrane antigen, and cytoplasmic ALK, which led to a diagnosis of ALK(+) ALCL. Case 2 was a 51-year-old man who was initially diagnosed with undifferentiated carcinoma. He developed multiple skin tumors eight years after his initial presentation, and was finally diagnosed with ALK(+) ALCL. He died of therapy-related acute myeloid leukemia. G-banding and fluorescence in situ hybridization using an ALK break-apart probe revealed the rearrangement of ALK and suggested variant translocation in both cases. We applied an inverse cDNA polymerase chain reaction (PCR) strategy to identify the partner of ALK. Nucleotide sequencing of the PCR products and a database search revealed that the sequences of ATIC in case 1 and TRAF1 in case 2 appeared to follow those of ALK. We subsequently confirmed ATIC-ALK and TRAF1-ALK fusions by reverse transcriptase PCR and nucleotide sequencing. We successfully determined the partner gene of ALK in two cases of ALK(+) ALCL. ATIC is the second most common partner of variant ALK rearrangements, while the TRAF1-ALK fusion gene was first reported in 2013, and this is the second reported case of ALK(+) ALCL carrying TRAF1-ALK.
The Centromere: Chromatin Foundation for the Kinetochore Machinery

PubMed Central

Fukagawa, Tatsuo; Earnshaw, William C.

2014-01-01

Since discovery of the centromere-specific histone H3 variant CENP-A, centromeres have come to be defined as chromatin structures that establish the assembly site for the complex kinetochore machinery. In most organisms, centromere activity is defined epigenetically, rather than by specific DNA sequences. In this review, we describe selected classic work and recent progress in studies of centromeric chromatin with a focus on vertebrates. We consider possible roles for repetitive DNA sequences found at most centromeres, chromatin factors and modifications that assemble and activate CENP-A chromatin for kinetochore assembly, plus the use of artificial chromosomes and kinetochores to study centromere function. PMID:25203206
Uncovering the molecular organization of unusual highly scattered 5S rDNA: The case of Chariesterus armatus (Heteroptera).

PubMed

Bardella, Vanessa Bellini; Cabral-de-Mello, Diogo Cavalcanti

2018-03-10

One cluster of 5S rDNA per haploid genome is the most common pattern among Heteroptera. However, in Chariesterus armatus, highly scattered signals were noticed. We isolated and characterized the entire 5S rDNA unit of C. armatus, aiming to gain a deeper knowledge of the molecular organization of 5S rDNA among Heteroptera and to understand possible causes and consequences of 5S rDNA chromosomal spreading. For a comparative analysis, we applied the same approach to Holymenia histrio, in which 5S rDNA is restricted to one bivalent. Multiple 5S rDNA variants were observed in both species, though they were more variable in C. armatus, with some of the variants corresponding to pseudogenes. These pseudogenes suggest a birth-and-death mechanism, though homogenization (concerted evolution) was also observed, indicating evolution through a mixed model. Association between transposable elements and 5S rDNA was not observed, suggesting spreading of 5S rDNA through other mechanisms, such as ectopic recombination. Scattered organization is rare for 5S rDNA, and such organization in the C. armatus genome could have led to the high diversification of sequences, favoring their pseudogenization. Copyright © 2017. Published by Elsevier B.V.
Role of DNA conformation & energetic insights in Msx-1-DNA recognition as revealed by molecular dynamics studies on specific and nonspecific complexes.

PubMed

Kachhap, Sangita; Singh, Balvinder

2015-01-01

In most homeodomain-DNA complexes, glutamine or lysine is present at the 50th position and interacts with the 5th and 6th nucleotides of the core recognition region. Molecular dynamics simulations were carried out on the Msx-1-DNA complex (Q50-TG) and its variant complexes: the specific complex (Q50K-CC) and the nonspecific complexes carrying a mutation in the DNA (Q50-CC) or in the protein (Q50K-TG). Analysis of protein-DNA interactions and of DNA structure in the specific and nonspecific complexes shows that amino acid residues use the sequence-dependent shape of DNA to interact. The binding free energies of all four complexes were analysed to define the role of the amino acid residue at the 50th position in binding strength, considering the effect of variation in DNA on the stability of protein-DNA complexes. The order of stability of the protein-DNA complexes shows that specific complexes are more stable than nonspecific ones. Decomposition analysis shows that N-terminal amino acid residues contribute maximally to the binding free energy of protein-DNA complexes. Among the specific protein-DNA complexes, K50 contributes more than Q50 to the binding free energy in the respective complexes. The sequence dependence of the local conformation of DNA enables Q50/Q50K to form hydrogen bonds with nucleotide(s) of DNA. The changes in the amino acid sequence of the protein are accommodated and stabilized around the TAAT core region of DNA having variation in nucleotides.
Investigating DNA-, RNA-, and protein-based features as a means to discriminate pathogenic synonymous variants.

PubMed

Livingstone, Mark; Folkman, Lukas; Yang, Yuedong; Zhang, Ping; Mort, Matthew; Cooper, David N; Liu, Yunlong; Stantic, Bela; Zhou, Yaoqi

2017-10-01

Synonymous single-nucleotide variants (SNVs), although they do not alter the encoded protein sequences, have been implicated in many genetic diseases. Experimental studies indicate that synonymous SNVs can lead to changes in the secondary and tertiary structures of DNA and RNA, thereby affecting translational efficiency, cotranslational protein folding as well as the binding of DNA-/RNA-binding proteins. However, the importance of these various features in disease phenotypes is not clearly understood. Here, we have built a support vector machine (SVM) model (termed DDIG-SN) as a means to discriminate disease-causing synonymous variants. The model was trained and evaluated on nearly 900 disease-causing variants. The method achieves robust performance with the area under the receiver operating characteristic curve of 0.84 and 0.85 for protein-stratified 10-fold cross-validation and independent testing, respectively. We were able to show that the disease-causing effects in the immediate proximity to exon-intron junctions (1-3 bp) are driven by the loss of splicing motif strength, whereas the gain of splicing motif strength is the primary cause in regions further away from the splice site (4-69 bp). The method is available as a part of the DDIG server at http://sparks-lab.org/ddig. © 2017 Wiley Periodicals, Inc.
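The area under the ROC curve quoted in this abstract can be read as the probability that a randomly chosen disease-causing variant receives a higher score than a randomly chosen benign one. The following is a minimal sketch of that rank-based definition only; the function name and the toy scores are illustrative assumptions and are not taken from DDIG-SN.

```python
def roc_auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive scores higher; ties count as half a win."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
pathogenic = [0.9, 0.8, 0.4]
benign = [0.7, 0.3, 0.2]
auc = roc_auc(pathogenic, benign)
```

The pairwise loop is quadratic in the number of variants, which is acceptable for a sketch; a sort-based computation would be preferable for large sets.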
Repair of DNA damage caused by cytosine deamination in mitochondrial DNA of forensic case samples.

PubMed

Gorden, Erin M; Sturk-Andreaggi, Kimberly; Marshall, Charla

2018-05-01

DNA sequence damage from cytosine deamination is well documented in degraded samples, such as those from ancient and forensic contexts. This study examined the effect of a DNA repair treatment on mitochondrial DNA (mtDNA) from aged and degraded skeletal samples. DNA extracts from 21 non-probative, degraded skeletal samples (aged 50-70 years) were utilized for the analysis. A portion of each sample extract was subjected to DNA repair using a commercial repair kit, the New England BioLabs' NEBNext FFPE DNA Repair Kit (Ipswich, MA). MtDNA was enriched using PCR and targeted capture in a side-by-side experiment of untreated and repaired DNA. Sequencing was performed using both traditional (Sanger-type; STS) and next-generation sequencing (NGS) methods. Although cytosine deamination was evident in the mtDNA sequence data, the observed level of damaged bases varied by sequencing method as well as by enrichment type. The STS PCR amplicon data did not show evidence of cytosine deamination that could be distinguished from background signal in either the untreated or repaired sample set. However, the same PCR amplicons showed 850 C → T/G → A substitutions consistent with cytosine deamination with variant frequencies (VFs) of up to 25% when sequenced using NGS methods. The occurrence of base misincorporation due to cytosine deamination was reduced by 98% (to 10) in the NGS amplicon data after repair. The NGS capture data indicated low levels (1-2%) of cytosine deamination in mtDNA fragments that was effectively mitigated by DNA repair. The observed difference in the level of cytosine deamination between the PCR and capture enrichment methods can be attributed to the greater propensity for stochastic effects from the PCR enrichment technique employed (e.g., low template input, increased PCR cycles). Altogether these results indicate that DNA repair may be required when sequencing PCR-amplified DNA from degraded forensic case samples with NGS methods.
Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
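As a small illustration of the bookkeeping behind the figures in this abstract, the sketch below flags substitutions that match the C → T / G → A pattern of cytosine deamination and computes the relative reduction after repair. The function names are hypothetical and this is not the authors' pipeline; only the counts 850 and 10 come from the abstract.

```python
# Substitution types consistent with cytosine deamination
# (C>T on one strand appears as G>A on the other).
DEAMINATION_LIKE = {("C", "T"), ("G", "A")}


def is_deamination_like(ref, alt):
    """True if a reference/alternate base pair matches the deamination pattern."""
    return (ref.upper(), alt.upper()) in DEAMINATION_LIKE


def percent_reduction(before, after):
    """Relative reduction, in percent, from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return 100.0 * (before - after) / before


# Counts reported in the abstract: 850 substitutions untreated, 10 after repair.
reduction = percent_reduction(850, 10)  # about 98.8, reported as 98%
```

The unrounded value is slightly higher than the 98% in the abstract, which suggests the reported figure was truncated rather than rounded.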
Whole mitochondrial genome screening in maternally inherited non-syndromic hearing impairment using a microarray resequencing mitochondrial DNA chip.

PubMed

Lévêque, Marianne; Marlin, Sandrine; Jonard, Laurence; Procaccio, Vincent; Reynier, Pascal; Amati-Bonneau, Patrizia; Baulande, Sylvain; Pierron, Denis; Lacombe, Didier; Duriez, Françoise; Francannet, Christine; Mom, Thierry; Journel, Hubert; Catros, Hélène; Drouin-Garraud, Valérie; Obstoy, Marie-Françoise; Dollfus, Hélène; Eliot, Marie-Madeleine; Faivre, Laurence; Duvillard, Christian; Couderc, Remy; Garabedian, Eréa-Noël; Petit, Christine; Feldmann, Delphine; Denoyelle, Françoise

2007-11-01

Mitochondrial DNA (mtDNA) mutations have been implicated in non-syndromic hearing loss either as primary or as predisposing factors. As only a part of the mitochondrial genome is usually explored in deafness, the prevalence of these mutations is probably underestimated. Among 1350 families with non-syndromic sensorineural hearing loss collected through a French collaborative network, we selected 29 large families with a clear maternal lineage and screened them for known mtDNA mutations in the 12S rRNA, tRNASer(UCN) and tRNALeu(UUR) genes. When no mutation could be identified, whole mitochondrial genome screening was performed using a microarray resequencing chip: the MitoChip version 2.0 developed by Affymetrix Inc. Known mtDNA mutations were found in nine of the 29 families, which are described in the article: five with A1555G, two with T7511C, one with 7472insC and one with the A3243G mutation. In the remaining 20 families, the resequencing MitoChip detected 258 mitochondrial homoplasmic variants and 107 potentially heteroplasmic variants. Controls were performed by direct sequencing of selected fragments and showed a high sensitivity of the MitoChip but a low specificity, especially for heteroplasmic variations. An original analysis based on species conservation, frequency and phylogenetic investigation was performed to select the most probably pathogenic variants. The entire genome analysis allowed us to identify five additional families with a putatively pathogenic mitochondrial variant: T669C, C1537T, G8078A, G12236A and G15077A. These results indicate that the new MitoChip platform is a rapid and valuable tool for identification of new mtDNA mutations in deafness.
Evidence of birth-and-death evolution of 5S rRNA gene in Channa species (Teleostei, Perciformes).

PubMed

Barman, Anindya Sundar; Singh, Mamta; Singh, Rajeev Kumar; Lal, Kuldeep Kumar

2016-12-01

In higher eukaryotes, the minor rDNA family codes for 5S rRNA; it is arranged in tandem arrays and comprises a highly conserved 120 bp coding sequence with a variable non-transcribed spacer (NTS). Initially, the 5S rDNA repeats were considered to have evolved by the process of concerted evolution. However, some recent reports, including those on teleost fishes, suggest that the evolution of 5S rDNA repeats does not fit the concerted evolution model and may instead be explained by a birth-and-death evolution model. In order to study the mode of evolution of 5S rDNA repeats in Perciformes fish species, the nucleotide sequence and molecular organization of 5S rDNA in five species of the genus Channa were analyzed in the present study. Molecular analyses revealed several variants of 5S rDNA repeats (four types of NTS), and networks created by a neighbor-net algorithm for each type of sequence (I, II, III and IV) did not show clear clustering in a species-specific manner. The stable secondary structure was predicted, and upstream and downstream conserved regulatory elements were characterized. Sequence analyses also showed the presence of two putative pseudogenes in Channa marulius. The present study supports the view that 5S rDNA repeats in the genus Channa evolved under a birth-and-death process.
Sexually-Transmitted/Founder HIV-1 Cannot Be Directly Predicted from Plasma or PBMC-Derived Viral Quasispecies in the Transmitting Partner

PubMed Central

Frange, Pierre; Meyer, Laurence; Jung, Matthieu; Goujard, Cecile; Zucman, David; Abel, Sylvie; Hochedez, Patrick; Gousset, Marine; Gascuel, Olivier; Rouzioux, Christine; Chaix, Marie-Laure

2013-01-01

Objective: Characterization of HIV-1 sequences in newly infected individuals is important for elucidating the mechanisms of viral sexual transmission. We report the identification of transmitted/founder viruses in eight pairs of HIV-1 sexually-infected patients enrolled at the time of primary infection (“recipients”) and their transmitting partners (“donors”). Methods: Using a single genome-amplification approach, we compared quasispecies in donors and recipients on the basis of 316 and 376 C2V5 env sequences amplified from plasma viral RNA and PBMC-associated DNA, respectively. Results: Both DNA and RNA sequences indicated very homogeneous viral populations in all recipients, suggesting transmission of a single variant, even in cases of recent sexually transmitted infections (STIs) in donors (n = 2) or recipients (n = 3). In all pairs, the transmitted/founder virus was derived from an infrequent variant population within the blood of the donor. The donor variant sequences most closely related to the recipient sequences were found in plasma samples in 3/8 cases and/or in PBMC samples in 6/8 cases. Although donors were exclusively (n = 4) or predominantly (n = 4) infected by CCR5-tropic (R5) strains, two recipients were infected with highly homogeneous CXCR4/dual-mixed-tropic (X4/DM) viral populations, identified in both DNA and RNA. The proportion of X4/DM quasispecies in donors was higher in cases of X4/DM than R5 HIV transmission (16.7–22.0% versus 0–2.6%), suggesting that X4/DM transmission may be associated with a threshold population of X4/DM circulating quasispecies in donors. Conclusions: These results suggest that a severe genetic bottleneck occurs during subtype B HIV-1 heterosexual and homosexual transmission. Sexually-transmitted/founder virus cannot be directly predicted by analysis of the donor’s quasispecies in plasma and/or PBMC.
Additional studies are required to fully understand the traits that confer the capacity to transmit and establish infection, and determine the role of concomitant STIs in mitigating the genetic bottleneck in mucosal HIV transmission. PMID:23874894
Kangaroo IGF-II is structurally and functionally similar to the human [Ser29]-IGF-II variant.

PubMed

Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z

1999-06-01

Kangaroo IGF-II has been purified from western grey kangaroo (Macropus fuliginosus) serum and characterised in a number of in vitro assays. In addition, the complete cDNA sequence of mature IGF-II has been obtained by reverse-transcription polymerase chain reaction. Comparison of the kangaroo IGF-II cDNA sequence with known IGF-II sequences from other species revealed that it is very similar to the human variant, [Ser29]-hIGF-II. Both the variant and kangaroo IGF-II contain an insert of nine nucleotides that encode the amino acids Leu-Pro-Gly at the junction of the B and C domains of the mature protein. The deduced kangaroo IGF-II protein sequence also contains three other amino acid changes that are not observed in human IGF-II. These amino acid differences share similarities with the changes described in many of the IGF-IIs reported for non-mammalian species. Characterisation of human IGF-II, kangaroo IGF-II, chicken IGF-II and [Ser29]-hIGF-II in a number of in vitro assays revealed that all four proteins are functionally very similar. No significant differences were observed in the ability of the IGF-IIs to bind to the bovine IGF-II/cation-independent mannose 6-phosphate receptor or to stimulate protein synthesis in rat L6 myoblasts. However, differences were observed in their abilities to bind to IGF-binding proteins (IGFBPs) present in human serum. Kangaroo, chicken and [Ser29]-hIGF-II had lower apparent affinities for human IGFBPs than did human IGF-II. Thus, it appears that the major circulating form of IGF-II in the kangaroo and a minor form of IGF-II found in human serum are structurally and functionally very similar. This suggests that the splice site that generates both the variant and major form of human IGF-II must have evolved after the divergence of marsupials from placental mammals.
Phylogenetic relationships in three species of canine Demodex mite based on partial sequences of mitochondrial 16S rDNA.

PubMed

Sastre, Natalia; Ravera, Ivan; Villanueva, Sergio; Altet, Laura; Bardagí, Mar; Sánchez, Armand; Francino, Olga; Ferrer, Lluís

2012-12-01

The historical classification of Demodex mites has been based on their hosts and morphological features. Genome sequencing has proved to be a very effective taxonomic tool in phylogenetic studies and has been applied in the classification of Demodex. Mitochondrial 16S rDNA has been demonstrated to be an especially useful marker to establish phylogenetic relationships. The objective was to amplify and sequence a segment of the mitochondrial 16S rDNA from Demodex canis and Demodex injai, as well as from the short-bodied mite called, unofficially, D. cornei, and to determine their genetic proximity. Demodex mites were examined microscopically and classified as Demodex folliculorum (one sample), D. canis (four samples), D. injai (two samples) or the short-bodied species D. cornei (three samples). DNA was extracted, and a 338 bp fragment of the 16S rDNA was amplified and sequenced. The sequences of the four D. canis mites were identical and shared 99.6 and 97.3% identity with two D. canis sequences available at GenBank. The sequences of the D. cornei isolates were identical and showed 97.8, 98.2 and 99.6% identity with the D. canis isolates. The sequences of the two D. injai isolates were also identical and showed 76.6% identity with the D. canis sequence. Demodex canis and D. injai are two different species, with a genetic distance of 23.3%. It would seem that the short-bodied Demodex mite D. cornei is a morphological variant of D. canis. © 2012 The Authors. Veterinary Dermatology © 2012 ESVD and ACVD.
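The percent-identity figures in this abstract come from comparing aligned sequences position by position. The sketch below shows one simple way such a value can be computed for two pre-aligned sequences, skipping gapped columns; the function name and toy sequences are assumptions for illustration, and the study's own alignment and distance method may differ.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned, ungapped columns of two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not Demodex data).
identity = percent_identity("ACGTACGT", "ACGTACGA")  # 87.5
p_distance = 100.0 - identity                        # uncorrected distance
```

The complement of identity is an uncorrected p-distance. A model-corrected distance, or different handling of gaps, can yield a slightly different number, which may explain why a reported distance does not always equal 100 minus the reported identity.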
Dissecting enzyme function with microfluidic-based deep mutational scanning.

PubMed

Romero, Philip A; Tran, Tuan M; Abate, Adam R

2015-06-09

Natural enzymes are incredibly proficient catalysts, but engineering them to have new or improved functions is challenging due to the complexity of how an enzyme's sequence relates to its biochemical properties. Here, we present an ultrahigh-throughput method for mapping enzyme sequence-function relationships that combines droplet microfluidic screening with next-generation DNA sequencing. We apply our method to map the activity of millions of glycosidase sequence variants. Microfluidic-based deep mutational scanning provides a comprehensive and unbiased view of the enzyme function landscape. The mapping displays expected patterns of mutational tolerance and a strong correspondence to sequence variation within the enzyme family, but also reveals previously unreported sites that are crucial for glycosidase function. We modified the screening protocol to include a high-temperature incubation step, and the resulting thermotolerance landscape allowed the discovery of mutations that enhance enzyme thermostability. Droplet microfluidics provides a general platform for enzyme screening that, when combined with DNA-sequencing technologies, enables high-throughput mapping of enzyme sequence space.
Detection of a single nucleotide polymorphism in the human alpha-lactalbumin gene: implications for human milk proteins.

PubMed

Chowanadisai, Winyoo; Kelleher, Shannon L; Nemeth, Jennifer F; Yachetti, Stephen; Kuhlman, Charles F; Jackson, Joan G; Davis, Anne M; Lien, Eric L; Lönnerdal, Bo

2005-05-01

Variability in the protein composition of breast milk has been observed in many women and is believed to be due to natural variation of the human population. Single nucleotide polymorphisms (SNPs) are present throughout the entire human genome, but the impact of this variation on human milk composition and biological activity and infant nutrition and health is unclear. The goals of this study were to characterize a variant of human alpha-lactalbumin observed in milk from a Filipino population by determining the location of the polymorphism in the amino acid and genomic sequences of alpha-lactalbumin. Milk and blood samples were collected from 20 Filipino women, and milk samples were collected from an additional 450 women from nine different countries. alpha-Lactalbumin concentration was measured by high-performance liquid chromatography (HPLC), and milk samples containing the variant form of the protein were identified with both HPLC and mass spectrometry (MS). The molecular weight of the variant form was measured by MS, and the location of the polymorphism was narrowed down by protein reduction, alkylation and trypsin digestion. Genomic DNA was isolated from whole blood, and the polymorphism location and subject genotype were determined by amplifying the entire coding sequence of human alpha-lactalbumin by PCR, followed by DNA sequencing. A variant form of alpha-lactalbumin was observed in HPLC chromatograms, and the difference in molecular weight was determined by MS (wild type=14,070 Da, variant=14,056 Da). Protein reduction and digestion narrowed the location of the polymorphism to between the 33rd and 77th amino acids of the protein. The genetic polymorphism was identified as adenine to guanine, which translates to a substitution from isoleucine to valine at amino acid 46. The frequency of variation was higher in milk from China, Japan and the Philippines, which suggests that this polymorphism is most prevalent in Asia.
There are SNPs in the genome for human milk proteins and their implications for protein bioactivity and infant nutrition need to be considered.
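The two measurements in this abstract are mutually consistent: an A → G change at the first codon position converts isoleucine codons to valine codons, and valine is one CH2 group (about 14 Da) lighter than isoleucine, matching the observed 14,070 → 14,056 Da shift. The sketch below checks both points using standard monoisotopic residue masses and the standard genetic code; the helper names are illustrative, and the abstract does not state which isoleucine codon is involved.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {"Ile": 113.08406, "Val": 99.06841}

# Standard genetic code: isoleucine and valine codons (DNA alphabet).
ILE_CODONS = ("ATT", "ATC", "ATA")
VAL_CODONS = ("GTT", "GTC", "GTA", "GTG")


def first_base_a_to_g(codon):
    """Apply an A>G substitution at the first codon position."""
    if codon[0] != "A":
        raise ValueError("codon does not start with A")
    return "G" + codon[1:]


def mass_shift(ref_residue, alt_residue):
    """Mass change (Da) when ref_residue is replaced by alt_residue."""
    return RESIDUE_MASS[alt_residue] - RESIDUE_MASS[ref_residue]


# Every isoleucine codon becomes a valine codon under this substitution.
all_become_val = all(first_base_a_to_g(c) in VAL_CODONS for c in ILE_CODONS)

expected_shift = mass_shift("Ile", "Val")  # about -14.016 Da
observed_shift = 14056 - 14070             # -14 Da, from the abstract
```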
Expressed sequence tag analysis of adult human lens for the NEIBank Project: over 2000 non-redundant transcripts, novel genes and splice variants.

PubMed

Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine

2002-06-15

To explore the expression profile of the human lens and to provide a resource for microarray studies, expressed sequence tag (EST) analysis has been performed on cDNA libraries from adult lenses. A cDNA library was constructed from two adult (40-year-old) human lenses. Over two thousand clones were sequenced from the unamplified, un-normalized library. The library was then normalized and a further 2200 sequences were obtained. All the data were analyzed using GRIST (GRouping and Identification of Sequence Tags), a procedure for gene identification and clustering. The lens library (by) contains a low percentage of non-mRNA contaminants and a high fraction (over 75%) of apparently full length cDNA clones. Approximately 2000 reads from the unamplified library yield 810 clusters, potentially representing individual genes expressed in the lens. After normalization, the content of crystallins and other abundant cDNAs is markedly reduced and a similar number of reads from this library (fs) yields 1455 unique groups, of which only two thirds correspond to named genes in GenBank. Among the most abundant cDNAs is one for a novel gene related to glutamine synthetase, which was designated "lengsin" (LGS). Analyses of ESTs also reveal examples of alternative transcripts, including a major alternative splice form for the lens specific membrane protein MP19. Variant forms for other transcripts, including those encoding the apoptosis inhibitor Livin and the armadillo repeat protein ARVCF, are also described. The lens cDNA libraries are a resource for gene discovery, full length cDNAs for functional studies and microarrays. The discovery of an abundant, novel transcript, lengsin, and a major novel splice form of MP19 reflects the utility of unamplified libraries constructed from dissected tissue. Many novel transcripts and splice forms are represented, some of which may be candidates for genetic diseases.
Performance comparison of two commercial human whole-exome capture systems on formalin-fixed paraffin-embedded lung adenocarcinoma samples.

PubMed

Bonfiglio, Silvia; Vanni, Irene; Rossella, Valeria; Truini, Anna; Lazarevic, Dejan; Dal Bello, Maria Giovanna; Alama, Angela; Mora, Marco; Rijavec, Erika; Genova, Carlo; Cittaro, Davide; Grossi, Francesco; Coco, Simona

2016-08-30

Next Generation Sequencing (NGS) has become a valuable tool for molecular landscape characterization of cancer genomes, leading to a better understanding of tumor onset and progression, and opening new avenues in translational oncology. Formalin-fixed paraffin-embedded (FFPE) tissue is the method of choice for storage of clinical samples; however, the low quality of FFPE genomic DNA (gDNA) can limit its use for downstream applications. To investigate the suitability of FFPE specimens for NGS analysis and to establish the performance of two solution-based exome capture technologies, we compared the whole-exome sequencing (WES) data of gDNA extracted from 5 fresh frozen (FF) and 5 matched FFPE lung adenocarcinoma tissues using SeqCap EZ Human Exome v.3.0 (Roche NimbleGen) and SureSelect XT Human All Exon v.5 (Agilent Technologies). Sequencing metrics on Illumina HiSeq were optimal for both exome systems and comparable between FFPE and FF samples, with a slight increase of PCR duplicates in FFPE, mainly in Roche NimbleGen libraries. Comparison of single nucleotide variants (SNVs) between FFPE-FF pairs reached overlapping values >90% in both systems. Both WES systems showed high concordance with target re-sequencing data by Ion PGM™ in 22 lung-cancer genes, regardless of the source of samples. Exon coverage of 623 cancer-related genes revealed high coverage efficiency of both kits, supporting WES as a valid alternative to target re-sequencing. High-quality and reliable data can be successfully obtained from WES of FFPE samples starting from a relatively low amount of input gDNA, suggesting the inclusion of NGS-based tests in the clinical context. In conclusion, our analysis suggests that the WES approach could be extended to a translational research context as well as to the clinic (e.g. to study rare malignancies), where the simultaneous analysis of the whole coding region of the genome may help in the detection of cancer-linked variants.
Complex Genetics and the Etiology of Human Congenital Heart Disease

PubMed Central

Gelb, Bruce D.; Chung, Wendy K.

2014-01-01

Congenital heart disease (CHD) is the most common birth defect. Despite considerable advances in care, CHD remains a major contributor to newborn mortality and is associated with substantial morbidities and premature death. Genetic abnormalities appear to be the primary cause of CHD, but identifying precise defects has proven challenging, principally because CHD is a complex genetic trait. Mainly because of recent advances in genomic technology such as next-generation DNA sequencing, scientists have begun to identify the genetic variants underlying CHD. In this article, the roles of modifier genes, de novo mutations, copy number variants, common variants, and noncoding mutations in the pathogenesis of CHD are reviewed. PMID:24985128
Phylogenetic relationships among morphotypes of Caesalpinia echinata Lam. (Caesalpinioideae: Leguminosae) evidenced by trnL intron sequences

NASA Astrophysics Data System (ADS)

Juchum, Fabrício Sacramento; Costa, Marco Antônio; Amorim, André Márcio; Corrêa, Ronan Xavier

2008-11-01

Caesalpinia echinata (brazilwood or Pernambuco wood) comprises a complex of three morphological leaf variants, characterized by differences in the number and size of the pinnae and leaflets, and occurring in allopatric and sympatric populations. The present study evaluates the utility of the chloroplast DNA trnL intron in a phylogenetic analysis of the three leaf variants along with other species of Caesalpinia and generic relatives. Our study supports the hypothesis that the name C. echinata designates a species complex and provides evidence that one of the forms, the highly divergent C. echinata large-leafleted variant, represents a distinct taxon.
From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

PubMed

Kwok, Hin; Chiang, Alan Kwok Shing

2016-02-24

Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and the discovery of potentially pathogenic variants in specific diseases.
Next-generation transcriptome sequencing, SNP discovery and validation in four market classes of peanut, Arachis hypogaea L.

PubMed

Chopra, Ratan; Burow, Gloria; Farmer, Andrew; Mudge, Joann; Simpson, Charles E; Wilkins, Thea A; Baring, Michael R; Puppala, Naveen; Chamberlin, Kelly D; Burow, Mark D

2015-06-01

Single-nucleotide polymorphisms, which can be identified in the thousands or millions from comparisons of transcriptome or genome sequences, are ideally suited for making high-resolution genetic maps, investigating population evolutionary history, and discovering marker-trait linkages. Despite significant results from their use in human genetics, progress in identification and use in plants, and particularly polyploid plants, has lagged. As part of a long-term project to identify and use SNPs suitable for these purposes in cultivated peanut, which is tetraploid, we generated transcriptome sequences of four peanut cultivars, namely OLin, New Mexico Valencia C, Tamrun OL07 and Jupiter, which represent the four major market classes of peanut grown in the world, and which are important economically to the US southwest peanut growing region. Complementary DNA (cDNA) libraries of each genotype were used to generate 2 × 54 paired-end reads using an Illumina GAIIx sequencer. Raw reads were mapped to a custom reference consisting of Tifrunner 454 sequences plus peanut ESTs in GenBank, comprising 43,108 contigs; 263,840 SNP and indel variants were identified among four genotypes compared to the reference. A subset of 6 variants was assayed across 24 genotypes representing four market types using KASP chemistry to assess the criteria for SNP selection. Results demonstrated that transcriptome sequencing can identify SNPs usable as selectable DNA-based markers in complex polyploid species such as peanut. Criteria for effective use of SNPs as markers are discussed in this context.

Validation and Implementation of BRCA1/2 Variant Screening in Ovarian Tumor Tissue.

PubMed

de Jonge, Marthe M; Ruano, Dina; van Eijk, Ronald; van der Stoep, Nienke; Nielsen, Maartje; Wijnen, Juul T; Ter Haar, Natalja T; Baalbergen, Astrid; Bos, Monique E M M; Kagie, Marjolein J; Vreeswijk, Maaike P G; Gaarenstroom, Katja N; Kroep, Judith R; Smit, Vincent T H B M; Bosse, Tjalling; van Wezel, Tom; van Asperen, Christi J

2018-06-21

BRCA1/2 variant analysis in tumor tissue could streamline the referral of patients with epithelial ovarian, fallopian tube, or primary peritoneal cancer to genetic counselors and select patients who benefit most from targeted treatment. We investigated the sensitivity of BRCA1/2 variant analysis in formalin-fixed, paraffin-embedded tumor tissue using a combination of next-generation sequencing and copy number variant multiplex ligation-dependent probe amplification. After optimization using a training cohort of known BRCA1/2 mutation carriers, validation was performed in a prospective cohort (Clinical implementation Of BRCA1/2 screening in ovarian tumor tissue: COBRA-cohort) in which screening of BRCA1/2 tumor DNA and leukocyte germline DNA was performed in parallel. BRCA1 promoter hypermethylation and pedigree analysis were also performed. In the training cohort 45 of 46 germline BRCA1/2 variants were detected (sensitivity 98%). In the COBRA cohort (n=62), all six germline variants were identified (sensitivity 100%), together with five somatic BRCA1/2 variants and eight cases with BRCA1 promoter hypermethylation. In four BRCA1/2 variant-negative patients, surveillance or prophylactic management options were offered based on positive family histories. We conclude that BRCA1/2 formalin-fixed, paraffin-embedded tumor tissue analysis reliably detects BRCA1/2 variants. When taking family history of BRCA1/2 variant-negative patients into account, tumor BRCA1/2 variant screening allows more efficient selection of epithelial ovarian cancer patients for genetic counselling and simultaneously selects patients who benefit most from targeted treatment. Copyright © 2018. Published by Elsevier Inc.
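The sensitivity values in this abstract follow directly from the reported counts: detected germline variants divided by all known germline variants. A minimal sketch of that arithmetic, with a hypothetical helper name, is:

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity (recall) as a percentage: TP / (TP + FN) * 100."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return 100.0 * true_positives / total


# Counts from the abstract.
training = sensitivity(45, 1)  # 45 of 46 detected, about 97.8 (reported as 98%)
cobra = sensitivity(6, 0)      # 6 of 6 detected, 100.0
```

With only six germline variants in the prospective cohort, the 100% figure carries wide uncertainty, so the training-cohort estimate is the more informative of the two.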
Genetic Analyses of the NF1 Gene in Turkish Neurofibromatosis Type I Patients and Definition of three Novel Variants

PubMed Central

Ulusal, SD; Gürkan, H; Atlı, E; Özal, SA; Çiftdemir, M; Tozkır, H; Karal, Y; Güçlü, H; Eker, D; Görker, I

2017-01-01

Neurofibromatosis Type I (NF1) is a multisystemic autosomal dominant neurocutaneous disorder that predisposes patients to benign and/or malignant lesions, predominantly of the skin, nervous system and bone. Loss-of-function mutations or deletions of the NF1 gene are responsible for NF1 disease. The variety of pathogenic variants, the size of the gene and the presence of pseudogenes make the gene difficult to analyze. We report the results of 2 years of multiplex ligation-dependent probe amplification (MLPA) and next generation sequencing (NGS) applied for genetic diagnosis of NF1 at our genetic diagnosis center. MLPA, semiconductor sequencing and Sanger sequencing were performed on genomic DNA samples from 24 unrelated patients, and their affected family members, referred to our center with suspected NF1. In total, three novel and 12 known pathogenic variants and a whole gene deletion were identified. We suggest that next generation sequencing is a practical tool for genetic analysis of NF1. Deletion/duplication analysis with MLPA may also be helpful for patients who are clinically diagnosed with NF1 but have no mutation detectable by NGS. PMID:28924536
Association analysis of rare variants near the APOE region with CSF and neuroimaging biomarkers of Alzheimer's disease.

PubMed

Nho, Kwangsik; Kim, Sungeun; Horgusluoglu, Emrin; Risacher, Shannon L; Shen, Li; Kim, Dokyoon; Lee, Seunggeun; Foroud, Tatiana; Shaw, Leslie M; Trojanowski, John Q; Aisen, Paul S; Petersen, Ronald C; Jack, Clifford R; Weiner, Michael W; Green, Robert C; Toga, Arthur W; Saykin, Andrew J

2017-05-24

The APOE ε4 allele is the most significant common genetic risk factor for late-onset Alzheimer's disease (LOAD). The region surrounding APOE on chromosome 19 has also shown consistent association with LOAD. However, no common variants in the region remain significant after adjusting for APOE genotype. We report a rare variant association analysis of genes in the vicinity of APOE with cerebrospinal fluid (CSF) and neuroimaging biomarkers of LOAD. Whole genome sequencing (WGS) was performed on 817 blood DNA samples from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Sequence data from 757 non-Hispanic Caucasian participants was used in the present analysis. We extracted all rare variants (minor allele frequency (MAF) < 0.05) within a 312 kb window in APOE's vicinity encompassing 12 genes. We assessed CSF and neuroimaging (MRI and PET) biomarkers as LOAD-related quantitative endophenotypes. Gene-based analyses of rare variants were performed using the optimal Sequence Kernel Association Test (SKAT-O). A total of 3,334 rare variants (MAF < 0.05) were found within the APOE region. Among them, 72 rare non-synonymous variants were observed. Eight genes spanning the APOE region were significantly associated with CSF Aβ 1-42 (p < 1.0 × 10⁻³). After controlling for APOE genotype and adjusting for multiple comparisons, 4 genes (CBLC, BCAM, APOE, and RELB) remained significant. Whole-brain surface-based analysis identified highly significant clusters associated with rare variants of CBLC in the temporal lobe region including the entorhinal cortex, as well as frontal lobe regions. Whole-brain voxel-wise analysis of amyloid PET identified significant clusters in the bilateral frontal and parietal lobes showing associations of rare variants of RELB with cortical amyloid burden. Rare variants within genes spanning the APOE region are significantly associated with LOAD-related CSF Aβ 1-42 and neuroimaging biomarkers after adjusting for APOE genotype. These findings warrant further investigation and illustrate the role of next generation sequencing and quantitative endophenotypes in assessing rare variants which may help explain missing heritability in AD and other complex diseases.
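The variant-selection step described in this abstract (keep variants with MAF < 0.05 inside a fixed genomic window) can be sketched as below. This is an illustration under stated assumptions, not the study's pipeline: the `Variant` record, the helper `rare_in_window`, the window coordinates and the allele counts are all hypothetical, and the only values taken from the abstract are the 0.05 cutoff, the 312 kb window width and the 757 participants (1,514 alleles).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record (not a real VCF parser)."""
    chrom: str
    pos: int
    alt_allele_count: int
    total_alleles: int

    @property
    def maf(self) -> float:
        # Minor allele frequency: the smaller of the two allele frequencies.
        freq = self.alt_allele_count / self.total_alleles
        return min(freq, 1.0 - freq)


def rare_in_window(variants, chrom, start, end, maf_cutoff=0.05):
    """Keep polymorphic variants with MAF below the cutoff inside [start, end]."""
    return [
        v for v in variants
        if v.chrom == chrom
        and start <= v.pos <= end
        and 0.0 < v.maf < maf_cutoff
    ]


# Toy data: 757 diploid participants give 1,514 alleles per site.
# The coordinates are placeholders, NOT the real APOE region.
toy_variants = [
    Variant("19", 1_050_000, 10, 1514),    # rare, inside window: kept
    Variant("19", 1_200_000, 400, 1514),   # common: dropped
    Variant("19", 2_000_000, 5, 1514),     # rare but outside window: dropped
    Variant("19", 1_300_000, 1500, 1514),  # reference allele is the minor one: kept
    Variant("19", 1_100_000, 0, 1514),     # monomorphic: dropped
]

rare = rare_in_window(toy_variants, "19", 1_000_000, 1_312_000)
for v in rare:
    print(v.pos, round(v.maf, 4))
```

Taking the minimum of the two allele frequencies matters for sites where the reference allele is the rarer one; filtering on the alternate-allele frequency alone would miss them.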
Human papillomavirus variants among Inuit women in northern Quebec, Canada.

PubMed

Gauthier, Barbara; Coutlée, Francois; Franco, Eduardo L; Brassard, Paul

2015-01-01

Inuit communities in northern Quebec have high rates of human papillomavirus (HPV) infection, cervical cancer and cervical cancer-related mortality as compared to the Canadian population. HPV types can be further classified as intratypic variants based on the extent of homology in their nucleotide sequences. There is limited information on the distribution of intratypic variants in circumpolar areas. Our goal was to describe the HPV intratypic variants and associated baseline characteristics. We collected cervical cell samples in 2002-2006 from 676 Inuit women between the ages of 15 and 69 years in Nunavik. DNA isolates from high-risk HPVs were sequenced to determine the intratypic variant. There were 149 women that were positive for HPVs 16, 18, 31, 33, 35, 45, 52, 56 or 58 during follow-up. There were 5 different HPV16 variants, all of European lineage, among the 57 women positive for this type. There were 8 different variants of HPV18 present and all were of European lineage (n=21). The majority of samples of HPV31 (n=52) were of lineage B. The number of isolates and diversity of the other HPV types was low. Age was the only covariate associated with HPV16 variant category. These frequencies are similar to what was seen in another circumpolar region of Canada, although there appears to be less diversity as only European variants were detected. This study shows that most variants were clustered in one lineage for each HPV type.
HPV16 variant lineage, clinical stage, and survival in women with invasive cervical cancer

PubMed Central

2011-01-01

Background: HPV16 variants are associated with different risks for development of CIN3 and invasive cancer, although all are carcinogenic. The relationship of HPV16 variants to cancer survival has not been studied. Methods: 155 HPV16-positive cervical cancers were categorized according to European and non-European variant patterns by DNA sequencing of the E6 open reading frame. Clinico-pathologic parameters and clinical outcome were collected by chart review and death registry data. Results: Of the 155 women (mean age 44.7 years; median follow-up 26.7 months), 85.2% harbored European variants while 14.8% had non-European sequences. HPV16 variants differed by histologic cell type (p = 0.03) and stage (1 vs. 2+; p = 0.03). Overall, 107 women (69.0%) were alive with no evidence of cancer, 42 (27.1%) died from cervical cancer, 2 (1.3%) were alive with cervical cancer, and 4 (2.6%) died of other causes. Death due to cervical cancer was associated with European variant status (p < 0.01). While 31% of women harboring tumors with European variants died from cervical cancer during follow-up, only 1 of 23 (4.4%) non-European cases died of cancer. The better survival for non-European cases was partly mediated by lower stage at diagnosis. Conclusions: Overall, invasive cervical cancers with non-European variants showed a less aggressive behavior than those with European variants. These findings should be replicated in a population with more non-European cases. PMID:22035468
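The counts in this abstract imply a 2×2 table of variant group against death from cervical cancer: 23 non-European cases with 1 death, and therefore 132 European cases with 41 of the 42 deaths (41/132 is about 31%, matching the text). The sketch below runs a two-sided Fisher exact test on that reconstructed table. Both the reconstruction and the choice of test are assumptions made here for illustration; the abstract does not say which test produced the reported p < 0.01.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Rows: European, non-European. Columns: died of cervical cancer, did not.
# Reconstructed from the abstract's counts (an inference, see above).
p_value = fisher_exact_two_sided(41, 91, 1, 22)
print(f"two-sided Fisher exact p = {p_value:.4f}")
```

Because the abstract also notes that the survival difference was partly mediated by stage at diagnosis, an unadjusted test like this one should be read only as a check on the raw counts.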
A Sequence-Independent Strategy for Detection and Cloning of Circular DNA Virus Genomes by Using Multiply Primed Rolling-Circle Amplification

PubMed Central

Rector, Annabel; Tachezy, Ruth; Van Ranst, Marc

2004-01-01

The discovery of novel viruses has often been accomplished by using hybridization-based methods that necessitate the availability of a previously characterized virus genome probe or knowledge of the viral nucleotide sequence to construct consensus or degenerate PCR primers. In their natural replication cycle, certain viruses employ a rolling-circle mechanism to propagate their circular genomes, and multiply primed rolling-circle amplification (RCA) with φ29 DNA polymerase has recently been applied in the amplification of circular plasmid vectors used in cloning. We employed an isothermal RCA protocol that uses random hexamer primers to amplify the complete genomes of papillomaviruses without the need for prior knowledge of their DNA sequences. We optimized this RCA technique with extracted human papillomavirus type 16 (HPV-16) DNA from W12 cells, using a real-time quantitative PCR assay to determine amplification efficiency, and obtained a 2.4 × 10⁴-fold increase in HPV-16 DNA concentration. We were able to clone the complete HPV-16 genome from this multiply primed RCA product. The optimized protocol was subsequently applied to a bovine fibropapillomatous wart tissue sample. Whereas no papillomavirus DNA could be detected by restriction enzyme digestion of the original sample, multiply primed RCA enabled us to obtain a sufficient amount of papillomavirus DNA for restriction enzyme analysis, cloning, and subsequent sequencing of a novel variant of bovine papillomavirus type 1. The multiply primed RCA method allows the discovery of previously unknown papillomaviruses, and possibly also other circular DNA viruses, without a priori sequence information. PMID:15113879
Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis.

PubMed

Asgari, Samira; McLaren, Paul J; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R; Abarca, Katia; Gelderman, Kyra A; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J

2016-01-01

One out of three pediatric sepsis deaths in high income countries occur in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtering, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing allows the identification of rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs.
Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis

PubMed Central

Asgari, Samira; McLaren, Paul J.; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R.; Abarca, Katia; Gelderman, Kyra A.; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J.; Posfay-Barbe, Klara

2016-01-01

One out of three pediatric sepsis deaths in high income countries occur in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtering, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing allows the identification of rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs. PMID:27703454
Constitutional sequence variation in the Fanconi anaemia group C (FANCC) gene in childhood acute myeloid leukaemia.

PubMed

Barber, Lisa M; McGrath, Helen E N; Meyer, Stefan; Will, Andrew M; Birch, Jillian M; Eden, Osborn B; Taylor, G Malcolm

2003-04-01

The extent to which genetic susceptibility contributes to the causation of childhood acute myeloid leukaemia (AML) is not known. The inherited bone marrow failure disorder Fanconi anaemia (FA) carries a substantially increased risk of AML, raising the possibility that constitutional variation in the FA (FANC) genes is involved in the aetiology of childhood AML. We have screened genomic DNA extracted from remission blood samples of 97 children with sporadic AML and 91 children with sporadic acute lymphoblastic leukaemia (ALL), together with 104 cord blood DNA samples from newborn children, for variations in the Fanconi anaemia group C (FANCC) gene. We found no evidence of known FANCC pathogenic mutations in children with AML, ALL or in the cord blood samples. However, we detected 12 different FANCC sequence variants, of which five were novel to this study. Among six FANCC variants leading to amino-acid substitutions, one (S26F) was present at a fourfold greater frequency in children with AML than in the cord blood samples (odds ratio: 4.09, P = 0.047; 95% confidence interval 1.08-15.54). Our results thus do not exclude the possibility that this polymorphic variant contributes to the risk of a small proportion of childhood AML.
X-Linked Glomerulopathy Due to COL4A5 Founder Variant.

PubMed

Barua, Moumita; John, Rohan; Stella, Lorenzo; Li, Weili; Roslin, Nicole M; Sharif, Bedra; Hack, Saidah; Lajoie-Starkell, Ginette; Schwaderer, Andrew L; Becknell, Brian; Wuttke, Matthias; Köttgen, Anna; Cattran, Daniel; Paterson, Andrew D; Pei, York

2018-03-01

Alport syndrome is a rare hereditary disorder caused by rare variants in 1 of 3 genes encoding type IV collagen. Rare variants in COL4A5 on chromosome Xq22 cause X-linked Alport syndrome, which accounts for ∼80% of the cases. Alport syndrome has a variable clinical presentation, including progressive kidney failure, hearing loss, and ocular defects. Exome sequencing performed in 2 affected related males with an undefined X-linked glomerulopathy characterized by global and segmental glomerulosclerosis, mesangial hypercellularity, and vague basement membrane immune complex deposition revealed a COL4A5 sequence variant, a substitution of a thymine by a guanine at nucleotide 665 (c.T665G; rs281874761) of the coding DNA predicted to lead to a phenylalanine to cysteine substitution at amino acid 222, which was not seen in databases cataloguing natural human genetic variation, including dbSNP138, 1000 Genomes Project release version 01-11-2004, Exome Sequencing Project 21-06-2014, or ExAC 01-11-2014. Review of the literature identified 2 additional families with the same COL4A5 variant leading to similar atypical histopathologic features, suggesting a unique pathologic mechanism initiated by this specific rare variant. Homology modeling suggests that the substitution alters the structural and dynamic properties of the type IV collagen trimer. Genetic analysis comparing members of the 3 families indicated a distant relationship with a shared haplotype, implying a founder effect. Crown Copyright © 2017. Published by Elsevier Inc. All rights reserved.
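The link between coding-DNA position 665 and amino acid 222 in this abstract is plain codon arithmetic, sketched below. The helper names are illustrative, and the reference codon is an assumption: the abstract does not give it, so both standard phenylalanine codons are tried. Under the standard genetic code, a T-to-G change at the second base of either one yields a cysteine codon.

```python
# Only the four codons needed for this illustration, from the standard genetic code.
CODON_TO_AA = {"TTT": "Phe", "TTC": "Phe", "TGT": "Cys", "TGC": "Cys"}
PHE_CODONS = ("TTT", "TTC")


def codon_position(cdna_pos: int) -> tuple:
    """Map a 1-based coding-DNA position to (codon number, 1-based base in codon)."""
    if cdna_pos < 1:
        raise ValueError("coding-DNA positions are 1-based")
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon: str, base_in_codon: int, new_base: str) -> str:
    """Return the codon with the given 1-based base replaced."""
    idx = base_in_codon - 1
    return codon[:idx] + new_base + codon[idx + 1:]


codon_number, base_in_codon = codon_position(665)
print(f"c.665 is base {base_in_codon} of codon {codon_number}")

# The actual reference codon is not stated in the abstract, so try both
# phenylalanine codons (an assumption made for illustration).
mutated = {ref: substitute(ref, base_in_codon, "G") for ref in PHE_CODONS}
for ref, alt in mutated.items():
    print(f"{ref} ({CODON_TO_AA[ref]}) -> {alt} ({CODON_TO_AA[alt]})")
```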
Novel variants in PAX6 gene caused congenital aniridia in two Chinese families.

PubMed

Zhang, R; Linpeng, S; Wei, X; Li, H; Huang, Y; Guo, J; Wu, Q; Liang, D; Wu, L

2017-06-01

Purpose: To reveal the underlying genetic defect in two four-generation Chinese families with aniridia and explore the pathologic mechanism. Methods: Full ophthalmic examinations were performed in two families with aniridia. The PAX6 gene was directly sequenced in patients of the two families, and the detected variants were screened in unaffected family members and two hundred unrelated healthy controls. Real-time quantitative PCR was used to explore the pathologic mechanisms of the two variants. Results: Aniridia, cataract, and oscillatory nystagmus were observed in patients of the two families. In addition, we observed corneal opacity and microphthalmus in family 1, and strabismus, left ectopia lentis, microphthalmus, and microcornea in family 2. Sanger sequencing detected a novel 1-bp duplication (c.50dupA) in family 1 and a novel 2-bp splice site deletion (c.765+1_765+2delGT) in family 2. Sequencing of cDNA indicated skipping of exon 9 caused by the splice site deletion, which, like the duplication, is predicted to cause a premature stop codon. PAX6 mRNA was significantly lower in patients with aniridia than in unaffected family members in both families, suggesting that the duplication and the splice site deletion caused nonsense-mediated mRNA decay. Conclusions: Our study identified two novel PAX6 variants in two families with aniridia and revealed the pathogenicity of the variants; this would expand the variant spectrum of PAX6 and help us better understand the molecular basis of aniridia, thus facilitating genetic counseling.
Direct uptake and degradation of DNA by lysosomes

PubMed Central

Fujiwara, Yuuki; Kikuchi, Hisae; Aizawa, Shu; Furuta, Akiko; Hatanaka, Yusuke; Konya, Chiho; Uchida, Kenko; Wada, Keiji; Kabuta, Tomohiro

2013-01-01

Lysosomes contain various hydrolases that can degrade proteins, lipids, nucleic acids and carbohydrates. We recently discovered “RNautophagy,” an autophagic pathway in which RNA is directly taken up by lysosomes and degraded. A lysosomal membrane protein, LAMP2C, a splice variant of LAMP2, binds to RNA and acts as a receptor for this pathway. In the present study, we show that DNA is also directly taken up by lysosomes and degraded. Like RNautophagy, this autophagic pathway, which we term “DNautophagy,” is dependent on ATP. The cytosolic sequence of LAMP2C also directly interacts with DNA, and LAMP2C functions as a receptor for DNautophagy, in addition to RNautophagy. Similarly to RNA, DNA binds to the cytosolic sequences of fly and nematode LAMP orthologs. Together with the findings of our previous study, our present findings suggest that RNautophagy and DNautophagy are evolutionarily conserved systems in Metazoa. PMID:23839276
Identifying sites of replication initiation in yeast chromosomes: looking for origins in all the right places.

PubMed

van Brabant, A J; Hunt, S Y; Fangman, W L; Brewer, B J

1998-06-01

DNA fragments that contain an active origin of replication generate bubble-shaped replication intermediates with diverging forks. We describe two methods that use two-dimensional (2-D) agarose gel electrophoresis along with DNA sequence information to identify replication origins in natural and artificial Saccharomyces cerevisiae chromosomes. The first method uses 2-D gels of overlapping DNA fragments to locate an active chromosomal replication origin within a region known to confer autonomous replication on a plasmid. A variant form of 2-D gels can be used to determine the direction of fork movement, and the second method uses this technique to find restriction fragments that are replicated by diverging forks, indicating that a bidirectional replication origin is located between the two fragments. Either of these two methods can be applied to the analysis of any genomic region for which there is DNA sequence information or an adequate restriction map.
Combined mismatch repair and POLE/POLD1 defects explain unresolved suspected Lynch syndrome cancers

PubMed Central

Jansen, Anne ML; van Wezel, Tom; van den Akker, Brendy EWM; Ventayol Garcia, Marina; Ruano, Dina; Tops, Carli MJ; Wagner, Anja; Letteboer, Tom GW; Gómez-García, Encarna B; Devilee, Peter; Wijnen, Juul T; Hes, Frederik J; Morreau, Hans

2016-01-01

Many suspected Lynch Syndrome (sLS) patients who lack mismatch repair (MMR) germline gene variants and MLH1 or MSH2 hypermethylation are currently explained by somatic MMR gene variants or, occasionally, by germline POLE variants. To further investigate unexplained sLS patients, we analyzed leukocyte and tumor DNA of 62 sLS patients using gene panel sequencing including the POLE, POLD1 and MMR genes. Forty tumors showed either one, two or more somatic MMR variants predicted to affect function. Nine sLS tumors showed a likely ultramutated phenotype and were found to carry germline (n=2) or somatic variants (n=7) in the POLE/POLD1 exonuclease domain (EDM). Six of these POLE/POLD1-EDM mutated tumors also carried somatic MMR variants. Our findings suggest that faulty proofreading may result in loss of MMR and thereby in microsatellite instability. PMID:26648449
Variants in the PRPF8 Gene are Associated with Glaucoma.

PubMed

Micheal, Shazia; Hogewind, Barend F; Khan, Muhammad Imran; Siddiqui, Sorath Noorani; Zafar, Saemah Nuzhat; Akhtar, Farah; Qamar, Raheel; Hoyng, Carel B; den Hollander, Anneke I

2018-05-01

Glaucoma is a leading cause of irreversible blindness worldwide. Mutations in six genes have been associated with juvenile- and adult-onset familial primary open angle glaucoma (POAG) prior to this report, but they explain only a small proportion of the genetic load. The aim of this study was to identify novel genetic causes of POAG in families with adult-onset glaucoma. Whole exome sequencing (WES) was performed on DNA of two affected individuals, and predicted pathogenic variants were evaluated for segregation in four affected and three unaffected Dutch family members by Sanger sequencing. We identified a pathogenic variant (p.Val956Gly) in the PRPF8 gene, which segregates with the disease in the Dutch family. Targeted Sanger sequencing of PRPF8 in a panel of 40 POAG families (18 Pakistani and 22 Dutch) revealed two additional nonsynonymous variants (p.Pro13Leu and p.Met25Thr), which segregate with the disease in two other Pakistani families. Both variants were then analyzed in a case-control cohort consisting of 320 Pakistani POAG cases and 250 matched controls. The p.Pro13Leu and p.Met25Thr variants were identified in 14 and 20 cases, respectively, while they were not detected in controls (p values 0.0004 and 0.0001, respectively). Previously, PRPF8 mutations have been associated with autosomal dominant retinitis pigmentosa (RP). The PRPF8 variants associated with POAG are located at the N-terminus, while all RP-associated mutations cluster at the C-terminus, indicating a clear genotype-phenotype correlation.
Ataxia telangiectasia presenting as dopa-responsive cervical dystonia

PubMed Central

Mohire, Mahavir D.; Schneider, Susanne A.; Stamelou, Maria; Wood, Nicholas W.; Bhatia, Kailash P.

2013-01-01

Objective: To identify the cause of cervical dopa-responsive dystonia (DRD) in a Muslim Indian family inherited in an apparently autosomal recessive fashion, as previously described in this journal. Methods: Previous testing for mutations in the genes known to cause DRD (GCH1, TH, and SPR) had been negative. Whole exome sequencing was performed on all 3 affected individuals for whom DNA was available to identify potentially pathogenic shared variants. Genotyping data obtained for all 3 affected individuals using the OmniExpress single nucleotide polymorphism chip (Illumina, San Diego, CA) were used to perform linkage analysis, autozygosity mapping, and copy number variation analysis. Sanger sequencing was used to confirm all variants. Results: After filtering of the variants, exome sequencing revealed 2 genes harboring potentially pathogenic compound heterozygous variants (ATM and LRRC16A). Of these, the variants in ATM segregated perfectly with the cervical DRD. Both mutations detected in ATM have been shown to be pathogenic, and α-fetoprotein, a marker of ataxia telangiectasia, was increased in all affected individuals. Conclusion: Biallelic mutations in ATM can cause DRD, and mutations in this gene should be considered in the differential diagnosis of unexplained DRD, particularly if the dystonia is cervical and if there is a recessive family history. ATM has previously been reported to cause isolated cervical dystonia, but never, to our knowledge, DRD. Individuals with dystonia related to ataxia telangiectasia may benefit from a trial of levodopa. PMID:23946315
Analysis of the entire genomes of torque teno midi virus variants in chimpanzees: infrequent cross-species infection between humans and chimpanzees.

PubMed

Ninomiya, Masashi; Takahashi, Masaharu; Hoshino, Yu; Ichiyama, Koji; Simmonds, Peter; Okamoto, Hiroaki

2009-02-01

Humans are frequently infected with three anelloviruses which have circular DNA genomes of 3.6-3.9 kb [Torque teno virus (TTV)], 2.8-2.9 kb [Torque teno mini virus (TTMV)] and 3.2 kb [a recently discovered anellovirus named Torque teno midi virus (TTMDV)]. Unexpectedly, human TTMDV DNA was not detectable in any of 74 chimpanzees tested, although all but one tested positive for both human TTV and TTMV DNA. Using universal primers for anelloviruses, novel variants of TTMDV that are phylogenetically clearly separate from human TTMDV were identified from chimpanzees, and over the entire genome, three chimpanzee TTMDV variants differed by 17.9-20.3 % from each other and by 40.4-43.6 % from all 18 reported human TTMDVs. A newly developed PCR assay that uses chimpanzee TTMDV-specific primers revealed the high prevalence of chimpanzee TTMDV in chimpanzees (63/74, 85 %) but low prevalence in humans (1/100). While variants of TTV and TTMV from chimpanzees and humans were phylogenetically interspersed, those of TTMDV were monophyletic for each species, with sequence diversity of <33 and <20 % within the 18 human and three chimpanzee TTMDV variants, respectively. Maximum within-group divergence values for TTV and TTMV were 51 and 57 %, respectively; both of these values were substantially greater than the maximum divergence among TTMDV variants (44 %), consistent with a later evolutionary emergence of TTMDV. However, substantiation of this hypothesis will require further analysis of genetic diversity using an expanded dataset of TTMDV variants in humans and chimpanzees. Similarly, the underlying mechanism of observed infrequent cross-species infection of TTMDV between humans and chimpanzees deserves further analysis.
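Percent differences between genomes, such as those quoted in this abstract, are commonly expressed as an uncorrected pairwise distance: the fraction of aligned sites at which two sequences differ. The sketch below shows that calculation on toy sequences. It is an assumption that a measure of this kind underlies the reported figures, since the abstract does not describe the method, and the function name `p_distance` is illustrative.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Alignment columns with a gap ('-') in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy aligned sequences (not real TTMDV data): 2 differences over 10 sites.
distance = p_distance("ACGTACGTAC", "ACGTTCGTAA")
print(f"divergence: {distance:.1%}")
```

At divergence levels of 40% or more, multiple substitutions at the same site become common, so model-corrected distances would normally be larger than this raw proportion.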
Population Structure of Two Rabies Hosts Relative to the Known Distribution of Rabies Virus Variants in Alaska

PubMed Central

Goldsmith, Elizabeth W.; Renshaw, Benjamin; Clement, Christopher J.; Himschoot, Elizabeth A.; Hundertmark, Kris J.; Hueffer, Karsten

2015-01-01

For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We test the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (V. vulpes), in order to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found 2 groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising 2 regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. PMID:26661691
Population structure of two rabies hosts relative to the known distribution of rabies virus variants in Alaska.

PubMed

Goldsmith, Elizabeth W; Renshaw, Benjamin; Clement, Christopher J; Himschoot, Elizabeth A; Hundertmark, Kris J; Hueffer, Karsten

2016-02-01

For pathogens that infect multiple species, the distinction between reservoir hosts and spillover hosts is often difficult. In Alaska, three variants of the arctic rabies virus exist with distinct spatial distributions. We tested the hypothesis that rabies virus variant distribution corresponds to the population structure of the primary rabies hosts in Alaska, arctic foxes (Vulpes lagopus) and red foxes (Vulpes vulpes) to possibly distinguish reservoir and spillover hosts. We used mitochondrial DNA (mtDNA) sequence and nine microsatellites to assess population structure in those two species. mtDNA structure did not correspond to rabies virus variant structure in either species. Microsatellite analyses gave varying results. Bayesian clustering found two groups of arctic foxes in the coastal tundra region, but for red foxes it identified tundra and boreal types. Spatial Bayesian clustering and spatial principal components analysis identified 3 and 4 groups of arctic foxes, respectively, closely matching the distribution of rabies virus variants in the state. Red foxes, conversely, showed eight clusters comprising two regions (boreal and tundra) with much admixture. These results run contrary to previous beliefs that arctic fox show no fine-scale spatial population structure. While we cannot rule out that the red fox is part of the maintenance host community for rabies in Alaska, the distribution of virus variants appears to be driven primarily by the arctic fox. Therefore, we show that host population genetics can be utilized to distinguish between maintenance and spillover hosts when used in conjunction with other approaches. © 2015 John Wiley & Sons Ltd.
Post-mortem testing; germline BRCA1/2 variant detection using archival FFPE non-tumor tissue. A new paradigm in genetic counseling.

PubMed

Petersen, Annabeth Høgh; Aagaard, Mads Malik; Nielsen, Henriette Roed; Steffensen, Karina Dahl; Waldstrøm, Marianne; Bojesen, Anders

2016-08-01

Accurate estimation of cancer risk in HBOC families often requires BRCA1/2 testing, but this may be impossible in deceased family members. Previously, testing archival formalin-fixed, paraffin-embedded (FFPE) tissue for germline BRCA1/2 variants was unsuccessful, except for the Jewish founder mutations. A high-throughput method to systematically test for variants in all coding regions of BRCA1/2 in archival FFPE samples of non-tumor tissue is described, using HaloPlex target enrichment and next-generation sequencing. In a validation study, correct identification of variants or wild-type was possible in 25 out of 30 (83%) FFPE samples (age range 1-14 years), with a known variant status in BRCA1/2. No false positive was found. Unsuccessful identification was due to highly degraded DNA or presence of large intragenic deletions. In clinical use, a total of 201 FFPE samples (aged 0-43 years) were processed. Thirty-six samples were rejected because of highly degraded DNA or failed library preparation. Fifteen samples were investigated to search for a known variant. In the remaining 150 samples (aged 0-38 years), three variants known to affect function and one variant likely to affect function in BRCA1, six variants known to affect function and one variant likely to affect function in BRCA2, as well as four variants of unknown significance (VUS) in BRCA1 and three VUS in BRCA2 were discovered. It is now possible to test for germline BRCA1/2 variants in deceased persons, using archival FFPE samples from non-tumor tissue. Accurate genetic counseling is achievable in families where variant testing would otherwise be impossible.

Post-mortem testing; germline BRCA1/2 variant detection using archival FFPE non-tumor tissue. A new paradigm in genetic counseling

PubMed Central

Petersen, Annabeth Høgh; Aagaard, Mads Malik; Nielsen, Henriette Roed; Steffensen, Karina Dahl; Waldstrøm, Marianne; Bojesen, Anders

2016-01-01

Accurate estimation of cancer risk in HBOC families often requires BRCA1/2 testing, but this may be impossible in deceased family members. Previously, testing archival formalin-fixed, paraffin-embedded (FFPE) tissue for germline BRCA1/2 variants was unsuccessful, except for the Jewish founder mutations. A high-throughput method to systematically test for variants in all coding regions of BRCA1/2 in archival FFPE samples of non-tumor tissue is described, using HaloPlex target enrichment and next-generation sequencing. In a validation study, correct identification of variants or wild-type was possible in 25 out of 30 (83%) FFPE samples (age range 1–14 years), with a known variant status in BRCA1/2. No false positives were found. Unsuccessful identification was due to highly degraded DNA or presence of large intragenic deletions. In clinical use, a total of 201 FFPE samples (aged 0–43 years) were processed. Thirty-six samples were rejected because of highly degraded DNA or failed library preparation. Fifteen samples were investigated to search for a known variant. In the remaining 150 samples (aged 0–38 years), three variants known to affect function and one variant likely to affect function in BRCA1, six variants known to affect function and one variant likely to affect function in BRCA2, as well as four variants of unknown significance (VUS) in BRCA1 and three VUS in BRCA2 were discovered. It is now possible to test for germline BRCA1/2 variants in deceased persons, using archival FFPE samples from non-tumor tissue. Accurate genetic counseling is achievable in families where variant testing would otherwise be impossible. PMID:26733283
An integrated map of structural variation in 2,504 human genomes.

PubMed

Sudmant, Peter H; Rausch, Tobias; Gardner, Eugene J; Handsaker, Robert E; Abyzov, Alexej; Huddleston, John; Zhang, Yan; Ye, Kai; Jun, Goo; Fritz, Markus Hsi-Yang; Konkel, Miriam K; Malhotra, Ankit; Stütz, Adrian M; Shi, Xinghua; Casale, Francesco Paolo; Chen, Jieming; Hormozdiari, Fereydoun; Dayama, Gargi; Chen, Ken; Malig, Maika; Chaisson, Mark J P; Walter, Klaudia; Meiers, Sascha; Kashin, Seva; Garrison, Erik; Auton, Adam; Lam, Hugo Y K; Mu, Xinmeng Jasmine; Alkan, Can; Antaki, Danny; Bae, Taejeong; Cerveira, Eliza; Chines, Peter; Chong, Zechen; Clarke, Laura; Dal, Elif; Ding, Li; Emery, Sarah; Fan, Xian; Gujral, Madhusudan; Kahveci, Fatma; Kidd, Jeffrey M; Kong, Yu; Lameijer, Eric-Wubbo; McCarthy, Shane; Flicek, Paul; Gibbs, Richard A; Marth, Gabor; Mason, Christopher E; Menelaou, Androniki; Muzny, Donna M; Nelson, Bradley J; Noor, Amina; Parrish, Nicholas F; Pendleton, Matthew; Quitadamo, Andrew; Raeder, Benjamin; Schadt, Eric E; Romanovitch, Mallory; Schlattl, Andreas; Sebra, Robert; Shabalin, Andrey A; Untergasser, Andreas; Walker, Jerilyn A; Wang, Min; Yu, Fuli; Zhang, Chengsheng; Zhang, Jing; Zheng-Bradley, Xiangqun; Zhou, Wanding; Zichner, Thomas; Sebat, Jonathan; Batzer, Mark A; McCarroll, Steven A; Mills, Ryan E; Gerstein, Mark B; Bashir, Ali; Stegle, Oliver; Devine, Scott E; Lee, Charles; Eichler, Evan E; Korbel, Jan O

2015-10-01

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
Predictive genomics DNA profiling for athletic performance.

PubMed

Kambouris, Marios; Ntalouka, Foteini; Ziogas, Georgios; Maffulli, Nicola

2012-12-01

Genes control biological processes such as muscle, cartilage and bone formation, muscle energy production and metabolism (mitochondriogenesis, lactic acid removal), blood and tissue oxygenation (erythropoiesis, angiogenesis, vasodilatation), all essential in sport and athletic performance. DNA sequence variations in such genes confer genetic advantages that can be exploited, or genetic 'barriers' that could be overcome to achieve optimal athletic performance. Predictive Genomic DNA Profiling for athletic performance reveals genetic variations that may be associated with better suitability for endurance, strength and speed sports, vulnerability to sports-related injuries and individualized nutritional requirements. Knowledge of genetic 'suitability' in respect to endurance capacity or strength and speed would lead to appropriate sport and athletic activity selection. Knowledge of genetic advantages and barriers would 'direct' an individualized training program, nutritional plan and nutritional supplementation to achieving optimal performance, overcoming 'barriers' that result from intense exercise and pressure under competition with minimum waste of time and energy and avoidance of health risks (hypertension, cardiovascular disease, inflammation, and musculoskeletal injuries) related to exercise, training and competition. Predictive Genomics DNA profiling for Athletics and Sports performance is developing into a tool for athletic activity and sport selection and for the formulation of individualized and personalized training and nutritional programs to optimize health and performance for the athlete. Human DNA sequences are patentable in some countries, while in others DNA testing methodologies [unless proprietary] are non-patentable. On the other hand, gene and variant selection, genotype interpretation and the risk and suitability assigning algorithms based on the specific Genomic variants used are amenable to patent protection.
Inherited mitochondrial DNA variants can affect complement, inflammation and apoptosis pathways: insights into mitochondrial–nuclear interactions

PubMed Central

Cristina Kenney, M.; Chwa, Marilyn; Atilano, Shari R.; Falatoonzadeh, Payam; Ramirez, Claudio; Malik, Deepika; Tarek, Mohamed; Cáceres-del-Carpio, Javier; Nesburn, Anthony B.; Boyer, David S.; Kuppermann, Baruch D.; Vawter, Marquis; Michal Jazwinski, S.; Miceli, Michael; Wallace, Douglas C.; Udar, Nitin

2014-01-01

Age-related macular degeneration (AMD) is the leading cause of vision loss in developed countries. While linked to genetic polymorphisms in the complement pathway, there are many individuals with high risk alleles that do not develop AMD, suggesting that other ‘modifiers’ may be involved. Mitochondrial (mt) haplogroups, defined by accumulations of specific mtDNA single nucleotide polymorphisms (SNPs) which represent population origins, may be one such modifier. J haplogroup has been associated with high risk for AMD while the H haplogroup is protective. It has been difficult to assign biological consequences for haplogroups so we created human ARPE-19 cybrids (cytoplasmic hybrids), which have identical nuclei but mitochondria of either J or H haplogroups, to investigate their effects upon bioenergetics and molecular pathways. J cybrids have altered bioenergetic profiles compared with H cybrids. Q-PCR analyses show significantly lower expression levels for seven respiratory complex genes encoded by mtDNA. J and H cybrids have significantly altered expression of eight nuclear genes of the alternative complement, inflammation and apoptosis pathways. Sequencing of the entire mtDNA was carried out for all the cybrids to identify haplogroup and non-haplogroup defining SNPs. mtDNA can mediate cellular bioenergetics and expression levels of nuclear genes related to complement, inflammation and apoptosis. Sequencing data suggest that observed effects are not due to rare mtDNA variants but rather the combination of SNPs representing the J versus H haplogroups. These findings represent a paradigm shift in our concepts of mt–nuclear interactions. PMID:24584571
Cloud-based adaptive exon prediction for DNA analysis

PubMed Central

Putluri, Srinivasareddy; Fathima, Shaik Yasmeen

2018-01-01

Cloud computing offers significant research and economic benefits to healthcare organisations. Cloud services provide a safe place for storing and managing large amounts of sensitive data. Under the conventional flow of gene information, gene sequence laboratories send out raw and inferred information via the Internet to several sequence libraries. DNA sequencing storage costs will be minimised by use of a cloud service. In this study, the authors put forward a novel genomic informatics system using Amazon Cloud Services, where genomic sequence information is stored and accessed for processing. Correct identification of exon regions in a DNA sequence is a key task in bioinformatics, which helps in disease identification and drug design. The three-base periodicity property of exons forms the basis of all exon identification techniques. Adaptive signal processing techniques are found to be promising in comparison with several other methods. Several adaptive exon predictors (AEPs) are developed using variable normalised least mean square and its maximum normalised variants to reduce computational complexity. Finally, performance evaluation of the various AEPs is done based on measures such as sensitivity, specificity and precision using various standard genomic datasets taken from the National Center for Biotechnology Information genomic sequence database. PMID:29515813
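The three-base periodicity mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' adaptive exon predictor (no least-mean-square filtering is involved); it is only a plain Fourier measure of period-3 content, one common way to expose the property. The function names `period3_power` and `sliding_period3`, and the window and step sizes, are assumptions made for illustration.

```python
import cmath


def period3_power(seq: str) -> float:
    """Sum over A, C, G, T of |DFT of the base-indicator sequence|^2 at frequency 1/3."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # Indicator sequence: 1 where this base occurs, 0 elsewhere,
        # projected onto a complex exponential with period 3.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total


def sliding_period3(seq: str, window: int = 90, step: int = 3):
    """Return (start, power) pairs for windows along the sequence.

    The window and step values are illustrative defaults, not values
    taken from the abstract.
    """
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy input: a period-2 stretch followed by a period-3 stretch.
    toy = "AT" * 45 + "ATG" * 30
    for start, power in sliding_period3(toy)[::10]:
        print(start, round(power, 1))
```

Windows with a high period-3 score are candidate coding regions in this simplified picture; an adaptive predictor would replace the fixed-window transform with a filter whose weights are updated along the sequence.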
Incidental germline variants in 1000 advanced cancers on a prospective somatic genomic profiling protocol.

PubMed

Meric-Bernstam, F; Brusco, L; Daniels, M; Wathoo, C; Bailey, A M; Strong, L; Shaw, K; Lu, K; Qi, Y; Zhao, H; Lara-Guerra, H; Litton, J; Arun, B; Eterovic, A K; Aytac, U; Routbort, M; Subbiah, V; Janku, F; Davies, M A; Kopetz, S; Mendelsohn, J; Mills, G B; Chen, K

2016-05-01

Next-generation sequencing in cancer research may reveal germline variants of clinical significance. We report patient preferences for return of results and the prevalence of incidental pathogenic germline variants (PGVs). Targeted exome sequencing of 202 genes was carried out in 1000 advanced cancers using tumor and normal DNA in a research laboratory. Pathogenic variants in 18 genes, recommended for return by The American College of Medical Genetics and Genomics, as well as PALB2, were considered actionable. Patient preferences for return of incidental germline results were collected. Return of results was initiated with genetic counseling and repeat CLIA testing. Of the 1000 patients who underwent sequencing, 43 had likely PGVs: APC (1), BRCA1 (11), BRCA2 (10), TP53 (10), MSH2 (1), MSH6 (4), PALB2 (2), PTEN (2), TSC2 (1), and RB1 (1). Twenty (47%) of 43 variants were previously known based on clinical genetic testing. Of the 1167 patients who consented for a germline testing protocol, 1157 (99%) desired to be informed of incidental results. Twenty-three previously unrecognized mutations identified in the research environment were confirmed with an orthogonal CLIA platform. All patients approached decided to proceed with formal genetic counseling; in all cases where formal genetic testing was carried out, the germline variant of concern was validated by clinical genetic testing. In this series, 2.3% of patients had previously unrecognized pathogenic germline mutations in 19 cancer-related genes. Thus, genomic sequencing must be accompanied by a plan for return of germline results, in partnership with genetic counseling. © The Author 2016. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved.
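The counts reported in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative.

```python
# Per-gene counts of likely pathogenic germline variants, as listed in the abstract.
pgv_counts = {
    "APC": 1, "BRCA1": 11, "BRCA2": 10, "TP53": 10, "MSH2": 1,
    "MSH6": 4, "PALB2": 2, "PTEN": 2, "TSC2": 1, "RB1": 1,
}

sequenced = 1000          # patients who underwent sequencing
previously_known = 20     # variants already known from clinical testing
consented = 1167          # patients who consented to germline testing
wanted_results = 1157     # of those, patients who wanted incidental results

total_pgv = sum(pgv_counts.values())
newly_found = total_pgv - previously_known

pct_known = 100 * previously_known / total_pgv
pct_new = 100 * newly_found / sequenced
pct_wanted = 100 * wanted_results / consented

print(total_pgv, newly_found)
print(round(pct_known), round(pct_new, 1), round(pct_wanted))
```

The per-gene counts sum to the 43 likely PGVs, and the derived percentages round to the 47%, 2.3% and 99% quoted in the text.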
p53 Specifically Binds Triplex DNA In Vitro and in Cells

PubMed Central

Brázdová, Marie; Tichý, Vlastimil; Helma, Robert; Bažantová, Pavla; Polášková, Alena; Krejčí, Aneta; Petr, Marek; Navrátilová, Lucie; Tichá, Olga; Nejedlý, Karel; Bennink, Martin L.; Subramaniam, Vinod; Bábková, Zuzana; Martínek, Tomáš; Lexa, Matej; Adámik, Matej

2016-01-01

Triplex DNA is implicated in a wide range of biological activities, including regulation of gene expression and genomic instability leading to cancer. The tumor suppressor p53 is a central regulator of cell fate in response to different types of insults. Sequence and structure specific modes of DNA recognition are core attributes of the p53 protein. The focus of this work is the structure-specific binding of p53 to DNA containing triplex-forming sequences in vitro and in cells and the effect on p53-driven transcription. This is the first DNA binding study of full-length p53 and its deletion variants to both intermolecular and intramolecular T.A.T triplexes. We demonstrate that the interaction of p53 with intermolecular T.A.T triplex is comparable to the recognition of CTG-hairpin non-B DNA structure. Using deletion mutants we determined the C-terminal DNA binding domain of p53 to be crucial for triplex recognition. Furthermore, strong p53 recognition of intramolecular T.A.T triplexes (H-DNA), stabilized by negative superhelicity in plasmid DNA, was detected by competition and immunoprecipitation experiments, and visualized by AFM. Moreover, chromatin immunoprecipitation revealed p53 binding to a T.A.T-forming sequence in vivo. Enhanced reporter transactivation by p53 on insertion of a triplex-forming sequence into a plasmid with a p53 consensus sequence was observed by luciferase reporter assays. In-silico scan of human regulatory regions for the simultaneous presence of both consensus sequence and T.A.T motifs identified a set of candidate p53 target genes and p53-dependent activation of several of them (ABCG5, ENOX1, INSR, MCC, NFAT5) was confirmed by RT-qPCR. Our results show that T.A.T triplex comprises a new class of p53 binding sites targeted by p53 in a DNA structure-dependent mode in vitro and in cells. The contribution of p53 DNA structure-dependent binding to the regulation of transcription is discussed. PMID:27907175
Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection.

PubMed

Perina, Alejandra; Seoane, David; González-Tizón, Ana M; Rodríguez-Fariña, Fernanda; Martínez-Lage, Andrés

2011-10-17

In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection.
Molecular organization and phylogenetic analysis of 5S rDNA in crustaceans of the genus Pollicipes reveal birth-and-death evolution and strong purifying selection

PubMed Central

2011-01-01

Background: In higher eukaryotes, the 5S ribosomal DNA (5S rDNA) is organized in tandem arrays with repeat units that consist of a transcribing region (5S) and a variable nontranscribed spacer (NTS). Until recently the 5S rDNA was thought to be subject to concerted evolution; however, in several taxa, sequence divergence levels between the 5S and the NTS were found to be higher than expected under this model. Accordingly, many studies have shown that birth-and-death processes and selection can drive the evolution of 5S rDNA. Analyses of 5S rDNA evolution have found several 5S rDNA types in the genome, with low levels of nucleotide variation in the 5S region and a highly divergent spacer region. Molecular organization and nucleotide sequence of the 5S ribosomal DNA multigene family (5S rDNA) were investigated in three Pollicipes species in an evolutionary context. Results: The nucleotide sequence variation revealed that several 5S rDNA variants occur in Pollicipes genomes. They are clustered in up to seven different types based on differences in their nontranscribed spacers (NTS). Five different units of 5S rDNA were characterized in P. pollicipes and two different units in P. elegans and P. polymerus. Analysis of these sequences showed that identical types were shared among species and that two pseudogenes were present. We predicted the secondary structure and characterized the upstream and downstream conserved elements. Phylogenetic analysis showed an among-species clustering pattern of 5S rDNA types. Conclusions: These results suggest that the evolution of Pollicipes 5S rDNA is driven by birth-and-death processes with strong purifying selection. PMID:22004418
The Genome of the Netherlands: design, and project goals

PubMed Central

Boomsma, Dorret I; Wijmenga, Cisca; Slagboom, Eline P; Swertz, Morris A; Karssen, Lennart C; Abdellaoui, Abdel; Ye, Kai; Guryev, Victor; Vermaat, Martijn; van Dijk, Freerk; Francioli, Laurent C; Hottenga, Jouke Jan; Laros, Jeroen F J; Li, Qibin; Li, Yingrui; Cao, Hongzhi; Chen, Ruoyan; Du, Yuanping; Li, Ning; Cao, Sujie; van Setten, Jessica; Menelaou, Androniki; Pulit, Sara L; Hehir-Kwa, Jayne Y; Beekman, Marian; Elbers, Clara C; Byelas, Heorhiy; de Craen, Anton J M; Deelen, Patrick; Dijkstra, Martijn; den Dunnen, Johan T; de Knijff, Peter; Houwing-Duistermaat, Jeanine; Koval, Vyacheslav; Estrada, Karol; Hofman, Albert; Kanterakis, Alexandros; Enckevort, David van; Mai, Hailiang; Kattenberg, Mathijs; van Leeuwen, Elisabeth M; Neerincx, Pieter B T; Oostra, Ben; Rivadeneira, Fernanodo; Suchiman, Eka H D; Uitterlinden, Andre G; Willemsen, Gonneke; Wolffenbuttel, Bruce H; Wang, Jun; de Bakker, Paul I W; van Ommen, Gert-Jan; van Duijn, Cornelia M

2014-01-01

Within the Netherlands a national network of biobanks has been established (Biobanking and Biomolecular Research Infrastructure-Netherlands (BBMRI-NL)) as a national node of the European BBMRI. One of the aims of BBMRI-NL is to enrich biobanks with different types of molecular and phenotype data. Here, we describe the Genome of the Netherlands (GoNL), one of the projects within BBMRI-NL. GoNL is a whole-genome-sequencing project in a representative sample consisting of 250 trio-families from all provinces in the Netherlands, which aims to characterize DNA sequence variation in the Dutch population. The parent–offspring trios include adult individuals ranging in age from 19 to 87 years (mean=53 years; SD=16 years) from birth cohorts 1910–1994. Sequencing was done on blood-derived DNA from uncultured cells and accomplished coverage was 14–15x. The family-based design represents a unique resource to assess the frequency of regional variants, accurately reconstruct haplotypes by family-based phasing, characterize short indels and complex structural variants, and establish the rate of de novo mutational events. GoNL will also serve as a reference panel for imputation in the available genome-wide association studies in Dutch and other cohorts to refine association signals and uncover population-specific variants. GoNL will create a catalog of human genetic variation in this sample that is uniquely characterized with respect to micro-geographic location and a wide range of phenotypes. The resource will be made available to the research and medical community to guide the interpretation of sequencing projects. The present paper summarizes the global characteristics of the project. PMID:23714750
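The trio design described above makes it possible to flag candidate de novo events as Mendelian inconsistencies: a child genotype that cannot be assembled from one paternal and one maternal allele. The sketch below is a minimal illustration of that check, not the GoNL pipeline; the function names and the toy genotype calls are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(child, father, mother):
    """True if the child's two alleles can be drawn as one allele from each parent."""
    return any(
        sorted((p, m)) == sorted(child)
        for p, m in product(father, mother)
    )


def de_novo_candidates(trio_calls):
    """Return the sites whose child genotype is not explained by the parents.

    trio_calls maps a site label to (child, father, mother) genotypes,
    each genotype being a pair of alleles.
    """
    return [
        site
        for site, (child, father, mother) in trio_calls.items()
        if not is_mendelian_consistent(child, father, mother)
    ]


if __name__ == "__main__":
    # Hypothetical toy calls, for illustration only.
    calls = {
        "site1": (("A", "G"), ("A", "A"), ("A", "G")),  # inherited G
        "site2": (("A", "G"), ("A", "A"), ("A", "A")),  # G absent in both parents
        "site3": (("C", "C"), ("C", "T"), ("C", "T")),  # inherited
    }
    print(de_novo_candidates(calls))
```

At moderate coverage such as the 14–15x reported here, many inconsistencies would be expected to reflect genotyping error rather than true mutation, so a real analysis would add read-depth and genotype-quality filters before counting events.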
Complete cDNA sequence of SAP-like pentraxin from Limulus polyphemus: implications for pentraxin evolution.

PubMed

Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J

2002-02-22

The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
Read clouds uncover variation in complex regions of the human genome.

PubMed

Bishara, Alex; Liu, Yuling; Weng, Ziming; Kashef-Haghighi, Dorna; Newburger, Daniel E; West, Robert; Sidow, Arend; Batzoglou, Serafim

2015-10-01

Although an increasing amount of human genetic variation is being identified and recorded, determining variants within repeated sequences of the human genome remains a challenge. Most population and genome-wide association studies have therefore been unable to consider variation in these regions. Core to the problem is the lack of a sequencing technology that produces reads with sufficient length and accuracy to enable unique mapping. Here, we present a novel methodology of using read clouds, obtained by accurate short-read sequencing of DNA derived from long fragment libraries, to confidently align short reads within repeat regions and enable accurate variant discovery. Our novel algorithm, Random Field Aligner (RFA), captures the relationships among the short reads governed by the long read process via a Markov Random Field. We utilized a modified version of the Illumina TruSeq synthetic long-read protocol, which yielded shallow-sequenced read clouds. We test RFA through extensive simulations and apply it to discover variants on the NA12878 human sample, for which shallow TruSeq read cloud sequencing data are available, and on an invasive breast carcinoma genome that we sequenced using the same method. We demonstrate that RFA facilitates accurate recovery of variation in 155 Mb of the human genome, including 94% of 67 Mb of segmental duplication sequence and 96% of 11 Mb of transcribed sequence, that are currently hidden from short-read technologies. © 2015 Bishara et al.; Published by Cold Spring Harbor Laboratory Press.
A TATA binding protein mutant with increased affinity for DNA directs transcription from a reversed TATA sequence in vivo.

PubMed

Spencer, J Vaughn; Arndt, Karen M

2002-12-01

The TATA-binding protein (TBP) nucleates the assembly and determines the position of the preinitiation complex at RNA polymerase II-transcribed genes. We investigated the importance of two conserved residues on the DNA binding surface of Saccharomyces cerevisiae TBP to DNA binding and sequence discrimination. Because they define a significant break in the twofold symmetry of the TBP-TATA interface, Ala100 and Pro191 have been proposed to be key determinants of TBP binding orientation and transcription directionality. In contrast to previous predictions, we found that substitution of an alanine for Pro191 did not allow recognition of a reversed TATA box in vivo; however, the reciprocal change, Ala100 to proline, resulted in efficient utilization of this and other variant TATA sequences. In vitro assays demonstrated that TBP mutants with the A100P and P191A substitutions have increased and decreased affinity for DNA, respectively. The TATA binding defect of TBP with the P191A mutation could be intragenically suppressed by the A100P substitution. Our results suggest that Ala100 and Pro191 are important for DNA binding and sequence recognition by TBP, that the naturally occurring asymmetry of Ala100 and Pro191 is not essential for function, and that a single amino acid change in TBP can lead to elevated DNA binding affinity and recognition of a reversed TATA sequence.
Integrating evolutionary and regulatory information with a multispecies approach implicates genes and pathways in obsessive-compulsive disorder.

PubMed

Noh, Hyun Ji; Tang, Ruqi; Flannick, Jason; O'Dushlaine, Colm; Swofford, Ross; Howrigan, Daniel; Genereux, Diane P; Johnson, Jeremy; van Grootheest, Gerard; Grünblatt, Edna; Andersson, Erik; Djurfeldt, Diana R; Patel, Paresh D; Koltookian, Michele; M Hultman, Christina; Pato, Michele T; Pato, Carlos N; Rasmussen, Steven A; Jenike, Michael A; Hanna, Gregory L; Stewart, S Evelyn; Knowles, James A; Ruhrmann, Stephan; Grabe, Hans-Jörgen; Wagner, Michael; Rück, Christian; Mathews, Carol A; Walitza, Susanne; Cath, Daniëlle C; Feng, Guoping; Karlsson, Elinor K; Lindblad-Toh, Kerstin

2017-10-17

Obsessive-compulsive disorder is a severe psychiatric disorder linked to abnormalities in glutamate signaling and the cortico-striatal circuit. We sequenced coding and regulatory elements for 608 genes potentially involved in obsessive-compulsive disorder in human, dog, and mouse. Using a new method that prioritizes likely functional variants, we compared 592 cases to 560 controls and found four strongly associated genes, validated in a larger cohort. NRXN1 and HTR2A are enriched for coding variants altering postsynaptic protein-binding domains. CTTNBP2 (synapse maintenance) and REEP3 (vesicle trafficking) are enriched for regulatory variants, of which at least six (35%) alter transcription factor-DNA binding in neuroblastoma cells. NRXN1 achieves genome-wide significance (p = 6.37 × 10^-11) when we include 33,370 population-matched controls. Our findings suggest synaptic adhesion as a key component in compulsive behaviors, and show that targeted sequencing plus functional annotation can identify potentially causative variants, even when genomic data are limited. Obsessive-compulsive disorder (OCD) is a neuropsychiatric disorder with symptoms including intrusive thoughts and time-consuming repetitive behaviors. Here Noh and colleagues identify genes enriched for functional variants associated with increased risk of OCD.
Targeted-bisulfite sequence analysis of the methylation of CpG islands in genes encoding PNPLA3, SAMM50, and PARVB of patients with non-alcoholic fatty liver disease.

PubMed

Kitamoto, Takuya; Kitamoto, Aya; Ogawa, Yuji; Honda, Yasushi; Imajo, Kento; Saito, Satoru; Yoneda, Masato; Nakamura, Takahiro; Nakajima, Atsushi; Hotta, Kikuko

2015-08-01

The pathogenesis of non-alcoholic fatty liver disease (NAFLD) is affected by epigenetic factors as well as by genetic variation. We performed targeted-bisulfite sequencing to determine the levels of DNA methylation of 4 CpG islands (CpG99, CpG71, CpG26, and CpG101) in the regulatory regions of PNPLA3, SAMM50, PARVB variant 1, and PARVB variant 2, respectively. We compared the levels of methylation of DNA in the livers of the first and second sets of patients with mild (fibrosis stages 0 and 1) or advanced (fibrosis stages 2 to 4) NAFLD and in those of patients with mild (F0 to F2) or advanced (F3 and F4) chronic hepatitis C infection. The hepatic mRNA levels of PNPLA3, SAMM50, and PARVB were measured using qPCR. CpG26, which resides in the regulatory region of PARVB variant 1, was markedly hypomethylated in the livers of patients with advanced NAFLD. Conversely, CpG99 in the regulatory region of PNPLA3 was substantially hypermethylated in these patients. These differences in DNA methylation were replicated in a second set of patients with NAFLD or chronic hepatitis C. PNPLA3 mRNA levels in the liver of the same section of a biopsy specimen used for genomic DNA preparation were lower in patients with advanced NAFLD compared with those with mild NAFLD and correlated inversely with CpG99 methylation in liver DNA. Moreover, the levels of CpG99 methylation and PNPLA3 mRNA were affected by the rs738409 genotype. Hypomethylation of CpG26 and hypermethylation of CpG99 may contribute to the severity of fibrosis in patients with NAFLD or chronic hepatitis C infection. Copyright © 2015 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
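For readers unfamiliar with how bisulfite sequencing yields a methylation level: bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are still read as C, so the level at a CpG site is commonly estimated as C reads divided by C plus T reads. The sketch below shows that calculation under this standard assumption; it is not the authors' analysis code, and the function names and read counts are hypothetical.

```python
def methylation_fraction(c_reads: int, t_reads: int):
    """Fraction of reads supporting methylation at one CpG site.

    Returns None when the site has no coverage.
    """
    depth = c_reads + t_reads
    if depth == 0:
        return None
    return c_reads / depth


def island_mean_methylation(site_counts):
    """Mean methylation fraction over the covered CpG sites of one island.

    site_counts is a list of (c_reads, t_reads) pairs, one per CpG site.
    """
    fractions = [methylation_fraction(c, t) for c, t in site_counts]
    covered = [f for f in fractions if f is not None]
    if not covered:
        return None
    return sum(covered) / len(covered)


if __name__ == "__main__":
    # Hypothetical read counts for three CpG sites in one island.
    toy_island = [(30, 10), (10, 30), (0, 0)]
    print(island_mean_methylation(toy_island))
```

Comparing such per-island means between patient groups is the kind of contrast the abstract describes as hypomethylation or hypermethylation.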
Immunogenicity of DNA Vaccine against H5N1 Containing Extended Kappa B Site: In Vivo Study in Mice and Chickens

PubMed Central

Redkiewicz, Patrycja; Stachyra, Anna; Sawicka, Róża; Bocian, Katarzyna; Góra-Sochacka, Anna; Kosson, Piotr; Sirko, Agnieszka

2017-01-01

Influenza is one of the most important illnesses in the modern world, causing great public health losses each year due to the lack of medication and broadly protective, long-lasting vaccines. The development of highly immunogenic and safe vaccines is currently one of the major problems encountered in efficient influenza prevention. DNA vaccines represent a novel and powerful alternative to the conventional vaccine approaches. To improve the efficacy of the DNA vaccine against influenza H5N1, we inserted three repeated kappa B (κB) motifs, separated by a 5-bp nucleotide spacer, upstream of the cytomegalovirus promoter and downstream of the SV40 late polyadenylation signal. The κB motif is a specific DNA element (10 bp long) recognized by one of the most important transcription factors, NFκB. NFκB is present in almost all animal cell types; upon cell stimulation under a variety of pathogenic conditions, it is released from IκB, translocates to the nucleus and binds to κB sites, thereby leading to enhanced transcription and expression of downstream genes. We tested the variants of DNA vaccine with κB sites flanking the antigen expression cassette and without such sites in two animal models: chickens (broilers and layers) and mice (BALB/c). In chickens, the variant with κB sites stimulated a stronger humoral response against the target antigen. In mice, the differences in humoral response were less apparent. Instead, it was possible to spot several gene expression differences in the spleens isolated from mice immunized with both variants. The results of our study indicate that modification of the sequence outside of the sequence encoding the antigen might enhance the immune response to the target, but understanding the mechanisms responsible for this process requires further analysis. PMID:28883819
A novel mutation in PRPF31, causative of autosomal dominant retinitis pigmentosa, using the BGISEQ-500 sequencer

PubMed Central

Zheng, Yu; Wang, Hai-Lin; Li, Jian-Kang; Xu, Li; Tellier, Laurent; Li, Xiao-Lin; Huang, Xiao-Yan; Li, Wei; Niu, Tong-Tong; Yang, Huan-Ming; Zhang, Jian-Guo; Liu, Dong-Ning

2018-01-01

AIM: To study the genes responsible for retinitis pigmentosa. METHODS: A total of 15 Chinese families with retinitis pigmentosa, containing 94 sporadically afflicted cases, were recruited. The targeted sequences were captured using the Target_Eye_365_V3 chip and sequenced using the BGISEQ-500 sequencer, according to the manufacturer's instructions. Data were aligned to UCSC Genome Browser build hg19, using the Burrows-Wheeler Aligner MEM algorithm. Local realignment was performed with the Genome Analysis Toolkit (GATK v.3.3.0) IndelRealigner, and variants were called with the Genome Analysis Toolkit HaplotypeCaller, without any use of imputation. Variants were filtered against a panel derived from 1000 Genomes Project, 1000G_ASN, ESP6500, ExAC and dbSNP138. In all members of Family ONE and Family TWO with available DNA samples, the genetic variant was validated using Sanger sequencing. RESULTS: A novel, pathogenic variant of retinitis pigmentosa, c.357_358delAA (p.Ser119SerfsX5) was identified in PRPF31 in 2 of 15 autosomal-dominant retinitis pigmentosa (ADRP) families, as well as in one, sporadic case. Sanger sequencing was performed upon probands, as well as upon other family members. This novel, pathogenic genotype co-segregated with retinitis pigmentosa phenotype in these two families. CONCLUSION: ADRP is a subtype of retinitis pigmentosa, defined by its genotype, which accounts for 20%-40% of the retinitis pigmentosa patients. Our study thus expands the spectrum of PRPF31 mutations known to occur in ADRP, and provides further demonstration of the applicability of the BGISEQ-500 sequencer for genomics research. PMID:29375987
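The database-filtering step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the allele frequencies, the two example SNP names, and the 0.1% cut-off (`max_af`) are all hypothetical, and a variant absent from a database is simply left out of its dictionary.

```python
from typing import Dict, List

# Hypothetical records: variant name -> allele frequency per population database.
variants: Dict[str, Dict[str, float]] = {
    "PRPF31:c.357_358delAA": {},                        # absent from every database
    "example_common_snp": {"1000G": 0.12, "ExAC": 0.10},
    "example_rare_snp": {"ESP6500": 0.0004},
}


def filter_rare(records: Dict[str, Dict[str, float]], max_af: float = 0.001) -> List[str]:
    """Keep variants whose highest reported allele frequency is below max_af."""
    return [
        name
        for name, freqs in records.items()
        if max(freqs.values(), default=0.0) < max_af
    ]


print(filter_rare(variants))
```

Taking the maximum across databases is a conservative choice: a variant that is common in any one reference panel is discarded.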
Genomic profiling of pelvic genital type leiomyosarcoma in a woman with a germline CHEK2:c.1100delC mutation and a concomitant diagnosis of metastatic invasive ductal breast carcinoma

PubMed Central

Reisle, Caralyn; Martin, Lee Ann; Alwelaie, Yazeed; Mungall, Karen L.; Ch'ng, Carolyn; Thomas, Ruth; Ng, Tony; Yip, Stephen; J. Lim, Howard; Sun, Sophie; Young, Sean S.; Karsan, Aly; Zhao, Yongjun; Mungall, Andrew J.; Moore, Richard A.; J. Renouf, Daniel; Gelmon, Karen; Ma, Yussanne P.; Hayes, Malcolm; Laskin, Janessa; Marra, Marco A.; Schrader, Kasmintan A.; Jones, Steven J. M.

2017-01-01

We describe a woman with the known pathogenic germline variant CHEK2:c.1100delC and synchronous diagnoses of both pelvic genital type leiomyosarcoma (LMS) and metastatic invasive ductal breast carcinoma. CHEK2 (checkpoint kinase 2) is a tumor-suppressor gene encoding a serine/threonine-protein kinase (CHEK2) involved in double-strand DNA break repair and cell cycle arrest. The CHEK2:c.1100delC variant is a moderate penetrance allele resulting in an approximately twofold increase in breast cancer risk. Whole-genome and whole-transcriptome sequencing were performed on the leiomyosarcoma and matched blood-derived DNA. Despite the presence of several genomic hits within the double-strand DNA damage pathway (CHEK2 germline variant and multiple RAD51B somatic structural variants), tumor profiling did not show an obvious DNA repair deficiency signature. However, even though the LMS displayed clear malignant features, its genomic profiling revealed several characteristics classically associated with leiomyomas including a translocation, t(12;14), with one breakpoint disrupting RAD51B and the other breakpoint upstream of HMGA2 with very high expression of HMGA2 and PLAG1. This is the first report of LMS genomic profiling in a patient with the germline CHEK2:c.1100delC variant and an additional diagnosis of metastatic invasive ductal breast carcinoma. We also describe a possible mechanistic relationship between leiomyoma and LMS based on genomic and transcriptome data. Our findings suggest that RAD51B translocation and HMGA2 overexpression may play an important role in LMS oncogenesis. PMID:28514723
Genomic profiling of pelvic genital type leiomyosarcoma in a woman with a germline CHEK2:c.1100delC mutation and a concomitant diagnosis of metastatic invasive ductal breast carcinoma.

PubMed

Thibodeau, My Linh; Reisle, Caralyn; Zhao, Eric; Martin, Lee Ann; Alwelaie, Yazeed; Mungall, Karen L; Ch'ng, Carolyn; Thomas, Ruth; Ng, Tony; Yip, Stephen; J Lim, Howard; Sun, Sophie; Young, Sean S; Karsan, Aly; Zhao, Yongjun; Mungall, Andrew J; Moore, Richard A; J Renouf, Daniel; Gelmon, Karen; Ma, Yussanne P; Hayes, Malcolm; Laskin, Janessa; Marra, Marco A; Schrader, Kasmintan A; Jones, Steven J M

2017-09-01

We describe a woman with the known pathogenic germline variant CHEK2:c.1100delC and synchronous diagnoses of both pelvic genital type leiomyosarcoma (LMS) and metastatic invasive ductal breast carcinoma. CHEK2 (checkpoint kinase 2) is a tumor-suppressor gene encoding a serine/threonine-protein kinase (CHEK2) involved in double-strand DNA break repair and cell cycle arrest. The CHEK2:c.1100delC variant is a moderate penetrance allele resulting in an approximately twofold increase in breast cancer risk. Whole-genome and whole-transcriptome sequencing were performed on the leiomyosarcoma and matched blood-derived DNA. Despite the presence of several genomic hits within the double-strand DNA damage pathway (CHEK2 germline variant and multiple RAD51B somatic structural variants), tumor profiling did not show an obvious DNA repair deficiency signature. However, even though the LMS displayed clear malignant features, its genomic profiling revealed several characteristics classically associated with leiomyomas including a translocation, t(12;14), with one breakpoint disrupting RAD51B and the other breakpoint upstream of HMGA2 with very high expression of HMGA2 and PLAG1. This is the first report of LMS genomic profiling in a patient with the germline CHEK2:c.1100delC variant and an additional diagnosis of metastatic invasive ductal breast carcinoma. We also describe a possible mechanistic relationship between leiomyoma and LMS based on genomic and transcriptome data. Our findings suggest that RAD51B translocation and HMGA2 overexpression may play an important role in LMS oncogenesis. © 2017 Thibodeau et al.; Published by Cold Spring Harbor Laboratory Press.
Preferential cleavage sites for Sau3A restriction endonuclease in human ribosomal DNA.

PubMed

Kupriyanova, N S; Kirilenko, P M; Netchvolodov, K K; Ryskov, A P

2000-07-21

Previous studies of cloned ribosomal DNA (rDNA) variants isolated from the cosmid library of human chromosome 13 have revealed some disproportion in representativity of different rDNA regions (N. S. Kupriyanova, K. K. Netchvolodov, P. M. Kirilenko, B. I. Kapanadze, N. K. Yankovsky, and A. P. Ryskov, Mol. Biol. 30, 51-60, 1996). Here we show nonrandom cleavage of human rDNA with Sau3A or its isoschizomer MboI under mild hydrolysis conditions. The hypersensitive cleavage sites were found to be located in the ribosomal intergenic spacer (rIGS), especially in the regions of about 5-5.5 and 11 kb upstream of the rRNA transcription start point. This finding is based on sequencing mapping of the rDNA insert ends in randomly selected cosmid clones of human chromosome 13 and on the data of digestion kinetics of cloned and noncloned human genomic rDNA with Sau3A and MboI. The results show that the methylation status and superhelicity state of the rIGS have no effect on cleavage site sensitivity. It is interesting that all primary cleavage sites are adjacent to or entering into Alu or Psi cdc 27 retroposons of the rIGS, suggesting a possible role of neighboring sequences in nuclease accessibility. The results explain the unequal representation of rDNA sequences in the human genomic DNA library used for this study. Copyright 2000 Academic Press.

Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting

PubMed Central

Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero-Miliani, Laura; Dahl, Morten; Weeke, Peter Ejvin; LuCamp; Ottesen, Gyda Lolk; Frank-Hansen, Rune; Bundgaard, Henning; Morling, Niels

2016-01-01

In forensic medicine, one-third of the sudden deaths remain unexplained after medico-legal autopsy. A major proportion of these sudden unexplained deaths (SUD) are considered to be caused by inherited cardiac diseases. Sudden cardiac death (SCD) may be the first manifestation of these diseases. The purpose of this study was to explore the yield of next-generation sequencing of genes associated with SCD in a cohort of SUD victims. We investigated 100 genes associated with cardiac diseases in 61 young (1–50 years) SUD cases. DNA was captured with the Haloplex target enrichment system and sequenced using an Illumina MiSeq. The identified genetic variants were evaluated and classified as likely, unknown or unlikely to have a functional effect. The criteria for this classification were based on the literature, databases, conservation and prediction of the effect of the variant. We found that 21 (34%) individuals carried variants with a likely functional effect. Ten (40%) of these variants were located in genes associated with cardiomyopathies and 15 (60%) of the variants in genes associated with cardiac channelopathies. Nineteen individuals carried variants with unknown functional effect. Our findings indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies. PMID:27650965
Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting.

PubMed

Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero-Miliani, Laura; Dahl, Morten; Weeke, Peter Ejvin; LuCamp; Ottesen, Gyda Lolk; Frank-Hansen, Rune; Bundgaard, Henning; Morling, Niels

2016-12-01

In forensic medicine, one-third of the sudden deaths remain unexplained after medico-legal autopsy. A major proportion of these sudden unexplained deaths (SUD) are considered to be caused by inherited cardiac diseases. Sudden cardiac death (SCD) may be the first manifestation of these diseases. The purpose of this study was to explore the yield of next-generation sequencing of genes associated with SCD in a cohort of SUD victims. We investigated 100 genes associated with cardiac diseases in 61 young (1-50 years) SUD cases. DNA was captured with the Haloplex target enrichment system and sequenced using an Illumina MiSeq. The identified genetic variants were evaluated and classified as likely, unknown or unlikely to have a functional effect. The criteria for this classification were based on the literature, databases, conservation and prediction of the effect of the variant. We found that 21 (34%) individuals carried variants with a likely functional effect. Ten (40%) of these variants were located in genes associated with cardiomyopathies and 15 (60%) of the variants in genes associated with cardiac channelopathies. Nineteen individuals carried variants with unknown functional effect. Our findings indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies.
Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands

USGS Publications Warehouse

Jarvi, S.I.; Farias, M.E.; Lapointe, D.A.; Belcaid, M.; Atkinson, C.T.

2013-01-01

Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency was higher in low elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.
Next-generation sequencing reveals cryptic mtDNA diversity of Plasmodium relictum in the Hawaiian Islands.

PubMed

Jarvi, S I; Farias, M E; Lapointe, D A; Belcaid, M; Atkinson, C T

2013-12-01

Next-generation 454 sequencing techniques were used to re-examine diversity of mitochondrial cytochrome b lineages of avian malaria (Plasmodium relictum) in Hawaii. We document a minimum of 23 variant lineages of the parasite based on single nucleotide transitional changes, in addition to the previously reported single lineage (GRW4). A new, publicly available portal (Integroomer) was developed for initial parsing of 454 datasets. Mean variant prevalence and frequency was higher in low elevation Hawaii Amakihi (Hemignathus virens) with Avipoxvirus-like lesions (P = 0·001), suggesting that the variants may be biologically distinct. By contrast, variant prevalence and frequency did not differ significantly among mid-elevation Apapane (Himatione sanguinea) with or without lesions (P = 0·691). The low frequency and the lack of detection of variants independent of GRW4 suggest that multiple independent introductions of P. relictum to Hawaii are unlikely. Multiple variants may have been introduced in heteroplasmy with GRW4 or exist within the tandem repeat structure of the mitochondrial genome. The discovery of multiple mitochondrial lineages of P. relictum in Hawaii provides a measure of genetic diversity within a geographically isolated population of this parasite and suggests the origins and evolution of parasite diversity may be more complicated than previously recognized.
Exome analysis of a family with Wolff-Parkinson-White syndrome identifies a novel disease locus.

PubMed

Bowles, Neil E; Jou, Chuanchau J; Arrington, Cammon B; Kennedy, Brett J; Earl, Aubree; Matsunami, Norisada; Meyers, Lindsay L; Etheridge, Susan P; Saarel, Elizabeth V; Bleyl, Steven B; Yost, H Joseph; Yandell, Mark; Leppert, Mark F; Tristani-Firouzi, Martin; Gruber, Peter J

2015-12-01

Wolff-Parkinson-White (WPW) syndrome is a common cause of supraventricular tachycardia that carries a risk of sudden cardiac death. To date, mutations in only one gene, PRKAG2, which encodes the 5'-AMP-activated protein kinase subunit γ-2, have been identified as causative for WPW. DNA samples from five members of a family with WPW were analyzed by exome sequencing. We applied recently designed prioritization strategies (VAAST/pedigree VAAST) coupled with an ontology-based algorithm (Phevor) that reduced the number of potentially damaging variants to 10: a variant in KCNE2 previously associated with Long QT syndrome was also identified. Of these 11 variants, only MYH6 p.E1885K segregated with the WPW phenotype in all affected individuals and was absent in 10 unaffected family members. This variant was predicted to be damaging by in silico methods and is not present in the 1,000 genome and NHLBI exome sequencing project databases. Screening of a replication cohort of 47 unrelated WPW patients did not identify other likely causative variants in PRKAG2 or MYH6. MYH6 variants have been identified in patients with atrial septal defects, cardiomyopathies, and sick sinus syndrome. Our data highlight the pleiotropic nature of phenotypes associated with defects in this gene. © 2015 Wiley Periodicals, Inc.
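The segregation criterion used in this abstract (present in all affected individuals, absent in unaffected relatives) can be sketched as a simple set test. The pedigree below is a toy example: the individual IDs, carrier sets, and the two non-MYH6 variant labels are invented for illustration and do not come from the study.

```python
from typing import Dict, List, Set


def segregating_variants(
    carriers: Dict[str, Set[str]], affected: Set[str], unaffected: Set[str]
) -> List[str]:
    """Return variants carried by every affected and by no unaffected member."""
    return [
        variant
        for variant, who in carriers.items()
        if affected <= who and not (who & unaffected)
    ]


# Hypothetical toy pedigree.
affected = {"A1", "A2", "A3"}
unaffected = {"U1", "U2"}
carriers = {
    "MYH6:p.E1885K": {"A1", "A2", "A3"},   # in all affected, no unaffected
    "KCNE2:example": {"A1", "U1"},         # also seen in an unaffected relative
    "other:example": {"A1", "A2"},         # missing from one affected member
}

print(segregating_variants(carriers, affected, unaffected))
```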
Exome Analysis of a Family with Wolff–Parkinson–White Syndrome Identifies a Novel Disease Locus

PubMed Central

Bowles, Neil E.; Jou, Chuanchau J.; Arrington, Cammon B.; Kennedy, Brett J.; Earl, Aubree; Matsunami, Norisada; Meyers, Lindsay L.; Etheridge, Susan P.; Saarel, Elizabeth V.; Bleyl, Steven B.; Yost, H. Joseph; Yandell, Mark; Leppert, Mark F.; Tristani-Firouzi, Martin; Gruber, Peter J.

2016-01-01

Wolff–Parkinson–White (WPW) syndrome is a common cause of supraventricular tachycardia that carries a risk of sudden cardiac death. To date, mutations in only one gene, PRKAG2, which encodes the 5′-AMP-activated protein kinase subunit γ-2, have been identified as causative for WPW. DNA samples from five members of a family with WPW were analyzed by exome sequencing. We applied recently designed prioritization strategies (VAAST/pedigree VAAST) coupled with an ontology-based algorithm (Phevor) that reduced the number of potentially damaging variants to 10: a variant in KCNE2 previously associated with Long QT syndrome was also identified. Of these 11 variants, only MYH6 p.E1885K segregated with the WPW phenotype in all affected individuals and was absent in 10 unaffected family members. This variant was predicted to be damaging by in silico methods and is not present in the 1,000 genome and NHLBI exome sequencing project databases. Screening of a replication cohort of 47 unrelated WPW patients did not identify other likely causative variants in PRKAG2 or MYH6. MYH6 variants have been identified in patients with atrial septal defects, cardiomyopathies, and sick sinus syndrome. Our data highlight the pleiotropic nature of phenotypes associated with defects in this gene. PMID:26284702
Highly sensitive detection of mutations in CHO cell recombinant DNA using multi-parallel single molecule real-time DNA sequencing.

PubMed

Cartwright, Joseph F; Anderson, Karin; Longworth, Joseph; Lobb, Philip; James, David C

2018-06-01

High-fidelity replication of biologic-encoding recombinant DNA sequences by engineered mammalian cell cultures is an essential pre-requisite for the development of stable cell lines for the production of biotherapeutics. However, immortalized mammalian cells characteristically exhibit an increased point mutation frequency compared to mammalian cells in vivo, both across their genomes and at specific loci (hotspots). Thus unforeseen mutations in recombinant DNA sequences can arise and be maintained within producer cell populations. These may affect both the stability of recombinant gene expression and give rise to protein sequence variants with variable bioactivity and immunogenicity. Rigorous quantitative assessment of recombinant DNA integrity should therefore form part of the cell line development process and be an essential quality assurance metric for instances where synthetic/multi-component assemblies are utilized to engineer mammalian cells, such as the assessment of recombinant DNA fidelity or the mutability of single-site integration target loci. Based on Pacific Biosciences (Menlo Park, CA) single molecule real-time (SMRT™) circular consensus sequencing (CCS) technology we developed a rDNA sequence analysis tool to process the multi-parallel sequencing of ∼40,000 single recombinant DNA molecules. After statistical filtering of raw sequencing data, we show that this analytical method is capable of detecting single point mutations in rDNA to a minimum single mutation frequency of 0.0042% (<1/24,000 bases). Using a stable CHO transfectant pool harboring a randomly integrated 5 kb plasmid construct encoding GFP we found that 28% of recombinant plasmid copies contained at least one low frequency (<0.3%) point mutation. These mutations were predominantly found in GC base pairs (85%) and there was no positional bias in mutation across the plasmid sequence. There was no discernable difference between the mutation frequencies of coding and non-coding DNA. The putative ratio of non-synonymous and synonymous changes within the open reading frames (ORFs) in the plasmid sequence indicates that natural selection does not impact upon the prevalence of these mutations. Here we have demonstrated the abundance of mutations that fall outside of the reported range of detection of next generation sequencing (NGS) and second generation sequencing (SGS) platforms, providing a methodology capable of being utilized in cell line development platforms to identify the fidelity of recombinant genes throughout the production process. © 2018 Wiley Periodicals, Inc.
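The detection limit quoted in this abstract can be checked with a line of arithmetic: one mutation in 24,000 bases corresponds to roughly 0.0042%. The helper name below is illustrative only.

```python
def per_base_frequency_percent(mutations: int, bases: int) -> float:
    """Express a mutation count per number of bases as a percentage."""
    return 100.0 * mutations / bases


limit = per_base_frequency_percent(1, 24_000)
print(f"{limit:.4f}%")  # 0.0042%
```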
Structure and specificity of the RNA-guided endonuclease Cas9 during DNA interrogation, target binding and cleavage

PubMed Central

Josephs, Eric A.; Kocak, D. Dewran; Fitzgibbon, Christopher J.; McMenemy, Joshua; Gersbach, Charles A.; Marszalek, Piotr E.

2015-01-01

CRISPR-associated endonuclease Cas9 cuts DNA at variable target sites designated by a Cas9-bound RNA molecule. Cas9's ability to be directed by single ‘guide RNA’ molecules to target nearly any sequence has been recently exploited for a number of emerging biological and medical applications. Therefore, understanding the nature of Cas9's off-target activity is of paramount importance for its practical use. Using atomic force microscopy (AFM), we directly resolve individual Cas9 and nuclease-inactive dCas9 proteins as they bind along engineered DNA substrates. High-resolution imaging allows us to determine their relative propensities to bind with different guide RNA variants to targeted or off-target sequences. Mapping the structural properties of Cas9 and dCas9 to their respective binding sites reveals a progressive conformational transformation at DNA sites with increasing sequence similarity to its target. With kinetic Monte Carlo (KMC) simulations, these results provide evidence of a ‘conformational gating’ mechanism driven by the interactions between the guide RNA and the 14th–17th nucleotide region of the targeted DNA, the stabilities of which we find correlate significantly with reported off-target cleavage rates. KMC simulations also reveal potential methodologies to engineer guide RNA sequences with improved specificity by considering the invasion of guide RNAs into targeted DNA duplex. PMID:26384421
Myopathic mtDNA Depletion Syndrome Due to Mutation in TK2 Gene.

PubMed

Martín-Hernández, Elena; García-Silva, María Teresa; Quijada-Fraile, Pilar; Rodríguez-García, María Elena; Hernández-Laín, Aurelio; Coca-Robinot, David; Rivera, Henry; Fernández-Toral, Joaquín; Arenas, Joaquín; Martín, Miguel Ángel; Martínez-Azorín, Francisco

2016-02-29

Whole-exome sequencing (WES) was used to identify the disease gene(s) in a Spanish girl with failure to thrive, muscle weakness, mild facial weakness, elevated creatine kinase (CK), deficiency of mitochondrial complex III and depletion of mtDNA. With WES data, it was possible to get the whole mtDNA sequencing and discard any pathogenic variant in this genome. The analysis of whole exome uncovered a homozygous pathogenic mutation in Thymidine kinase 2 gene (TK2; NM_004614.4:c.323C>T, p.T108M). TK2 mutations have been identified mainly in patients with the myopathic form of mtDNA depletion syndromes (MDS). This patient presents an atypical TK2 related-myopathic form of MDS, because despite having a very low content of mtDNA (<20%), she presents a slower and less severe evolution of the disease. In conclusion, our data confirm the role of TK2 gene in MDS and expanded the phenotypic spectrum.
TDP-43 Is Not a Common Cause of Sporadic Amyotrophic Lateral Sclerosis

PubMed Central

Guerreiro, Rita J.; Schymick, Jennifer C.; Crews, Cynthia; Singleton, Andrew; Hardy, John; Traynor, Bryan J.

2008-01-01

Background: TAR DNA binding protein, encoded by TARDBP, was shown to be a central component of ubiquitin-positive, tau-negative inclusions in frontotemporal lobar degeneration (FTLD-U) and amyotrophic lateral sclerosis (ALS). Recently, mutations in TARDBP have been linked to familial and sporadic ALS. Methodology/Principal Findings: To further examine the frequency of mutations in TARDBP in sporadic ALS, 279 ALS cases and 806 neurologically normal control individuals of European descent were screened for sequence variants, copy number variants, genetic and haplotype association with disease. An additional 173 African samples from the Human Gene Diversity Panel were sequenced as this population had the highest likelihood of finding changes. No mutations were found in the ALS cases. Several genetic variants were identified in controls, which were considered as non-pathogenic changes. Furthermore, pathogenic structural variants were not observed in the cases and there was no genetic or haplotype association with disease status across the TARDBP locus. Conclusions: Our data indicate that genetic variation in TARDBP is not a common cause of sporadic ALS in North Americans. PMID:18545701
Duplication polymorphisms in exon 4 of κ-casein gene in yak breeds/populations.

PubMed

Pingcuo, S; Gao, J; Jiang, Z R; Jin, S Y; Fu, C Y; Liu, X; Huang, L; Zheng, Y C

2015-08-28

The objective of this study was to compare 12 bp-duplication polymorphisms in exon 4 of the κ-casein gene among 3 breeds/populations of yak (Bos grunniens). Genomic DNA was extracted from yak blood or muscle samples (N = 211) and a partial sequence of exon 4 of κ-casein gene was amplified by polymerase chain reaction. A polyacrylamide gel electrophoresis assay of the products (169 bp) revealed 2 variants. These variants differed in a 12-bp duplication of the nucleotide sequence corresponding to amino acids 147-150 (Glu-Ala-Ser-Pro) or 148-151 (Ala-Ser-Pro-Glu). The genotype frequency and gene frequency of the 2 κ-casein variants differed among the 3 yak breeds/populations. The long form of the κ-casein gene was the predominant allele, and the Jiulong yak showed the highest frequency of the short form variant of the κ-casein gene. In addition, 2 nucleotide differences resulting in amino acid substitutions were also identified in yaks. These results are significant for designing a breeding strategy to improve the genetic makeup of yak herds.
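For readers unfamiliar with the distinction between genotype frequency and gene (allele) frequency drawn in this abstract, the sketch below derives allele frequencies from genotype counts for a two-allele locus (long form L, short form S). The counts are hypothetical and are not taken from the study.

```python
from typing import Tuple


def allele_frequencies(n_ll: int, n_ls: int, n_ss: int) -> Tuple[float, float]:
    """Return (long, short) allele frequencies from diploid genotype counts."""
    total_alleles = 2 * (n_ll + n_ls + n_ss)
    long_freq = (2 * n_ll + n_ls) / total_alleles
    return long_freq, 1.0 - long_freq


# Hypothetical genotype counts: 60 LL, 30 LS, 10 SS animals.
long_freq, short_freq = allele_frequencies(60, 30, 10)
print(long_freq, short_freq)  # 0.75 0.25
```

Each homozygote contributes two copies of its allele and each heterozygote one copy of each, which is why the denominator is twice the number of animals.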
Development of a genotyping microarray for Usher syndrome.

PubMed

Cremers, Frans P M; Kimberling, William J; Külm, Maigi; de Brouwer, Arjan P; van Wijk, Erwin; te Brinke, Heleen; Cremers, Cor W R J; Hoefsloot, Lies H; Banfi, Sandro; Simonelli, Francesca; Fleischhauer, Johannes C; Berger, Wolfgang; Kelley, Phil M; Haralambous, Elene; Bitner-Glindzicz, Maria; Webster, Andrew R; Saihan, Zubin; De Baere, Elfride; Leroy, Bart P; Silvestri, Giuliana; McKay, Gareth J; Koenekoop, Robert K; Millan, Jose M; Rosenberg, Thomas; Joensuu, Tarja; Sankila, Eeva-Marja; Weil, Dominique; Weston, Mike D; Wissinger, Bernd; Kremer, Hannie

2007-02-01

Usher syndrome, a combination of retinitis pigmentosa (RP) and sensorineural hearing loss with or without vestibular dysfunction, displays a high degree of clinical and genetic heterogeneity. Three clinical subtypes can be distinguished, based on the age of onset and severity of the hearing impairment, and the presence or absence of vestibular abnormalities. Thus far, eight genes have been implicated in the syndrome, together comprising 347 protein-coding exons. To improve DNA diagnostics for patients with Usher syndrome, we developed a genotyping microarray based on the arrayed primer extension (APEX) method. Allele-specific oligonucleotides corresponding to all 298 Usher syndrome-associated sequence variants known to date, 76 of which are novel, were arrayed. Approximately half of these variants were validated using original patient DNAs, which yielded an accuracy of >98%. The efficiency of the Usher genotyping microarray was tested using DNAs from 370 unrelated European and American patients with Usher syndrome. Sequence variants were identified in 64/140 (46%) patients with Usher syndrome type I, 45/189 (24%) patients with Usher syndrome type II, 6/21 (29%) patients with Usher syndrome type III and 6/20 (30%) patients with atypical Usher syndrome. The chip also identified two novel sequence variants, c.400C>T (p.R134X) in PCDH15 and c.1606T>C (p.C536S) in USH2A. The Usher genotyping microarray is a versatile and affordable screening tool for Usher syndrome. Its efficiency will improve with the addition of novel sequence variants with minimal extra costs, making it a very useful first-pass screening tool.
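The per-subtype yields reported in this abstract can be reproduced directly from the stated counts. The overall figure printed at the end is derived here from those counts and is not quoted in the abstract.

```python
from typing import Dict, Tuple

# (patients with a sequence variant identified, patients tested), from the abstract
results: Dict[str, Tuple[int, int]] = {
    "type I": (64, 140),
    "type II": (45, 189),
    "type III": (6, 21),
    "atypical": (6, 20),
}


def detection_rates(counts: Dict[str, Tuple[int, int]]) -> Dict[str, int]:
    """Percentage of patients with an identified variant, rounded to whole numbers."""
    return {name: round(100 * hits / total) for name, (hits, total) in counts.items()}


print(detection_rates(results))

total_hits = sum(hits for hits, _ in results.values())
total_patients = sum(total for _, total in results.values())
print(f"overall: {total_hits}/{total_patients} = {100 * total_hits / total_patients:.0f}%")
```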
Functional PMS2 Hybrid Alleles Containing a Pseudogene-Specific Missense Variant Trace Back to a Single Ancient Intrachromosomal Recombination Event

PubMed Central

Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina

2012-01-01

Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5′-and the 3′-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14–60% of hybrid alleles carry PMS2CL-specific sequences in exons 13–15, the remainder only in exon 15. We show that exons 13–15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. PMID:20186689
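Sensitivity and specificity, as quoted for the PCR assay in this abstract, are computed from a confusion matrix as sketched below. The counts are hypothetical, chosen only so that the result matches the reported 100% and 99%; the abstract does not give the underlying numbers.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 50 true carriers all detected, 1 false positive among 100 non-carriers.
sens, spec = sensitivity_specificity(tp=50, fn=0, tn=99, fp=1)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}")
```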
Functional PMS2 hybrid alleles containing a pseudogene-specific missense variant trace back to a single ancient intrachromosomal recombination event.

PubMed

Ganster, Christina; Wernstedt, Annekatrin; Kehrer-Sawatzki, Hildegard; Messiaen, Ludwine; Schmidt, Konrad; Rahner, Nils; Heinimann, Karl; Fonatsch, Christa; Zschocke, Johannes; Wimmer, Katharina

2010-05-01

Sequence exchange between PMS2 and its pseudogene PMS2CL, embedded in an inverted duplication on chromosome 7p22, has been reported to be an ongoing process that leads to functional PMS2 hybrid alleles containing PMS2- and PMS2CL-specific sequence variants at the 5'-and the 3'-end, respectively. The frequency of PMS2 hybrid alleles, their biological significance, and the mechanisms underlying their formation are largely unknown. Here we show that overall hybrid alleles account for one-third of 384 PMS2 alleles analyzed in individuals of different ethnic backgrounds. Depending on the population, 14-60% of hybrid alleles carry PMS2CL-specific sequences in exons 13-15, the remainder only in exon 15. We show that exons 13-15 hybrid alleles, named H1 hybrid alleles, constitute different haplotypes but trace back to a single ancient intrachromosomal recombination event with crossover. Taking advantage of an ancestral sequence variant specific for all H1 alleles we developed a simple gDNA-based polymerase chain reaction (PCR) assay that can be used to identify H1-allele carriers with high sensitivity and specificity (100 and 99%, respectively). Because H1 hybrid alleles harbor missense variant p.N775S of so far unknown functional significance, we assessed the H1-carrier frequency in 164 colorectal cancer patients. So far, we found no indication that the variant plays a major role with regard to cancer susceptibility. (c) 2010 Wiley-Liss, Inc.
Development of a genotyping microarray for Usher syndrome

PubMed Central

Cremers, Frans P M; Kimberling, William J; Külm, Maigi; de Brouwer, Arjan P; van Wijk, Erwin; te Brinke, Heleen; Cremers, Cor W R J; Hoefsloot, Lies H; Banfi, Sandro; Simonelli, Francesca; Fleischhauer, Johannes C; Berger, Wolfgang; Kelley, Phil M; Haralambous, Elene; Bitner‐Glindzicz, Maria; Webster, Andrew R; Saihan, Zubin; De Baere, Elfride; Leroy, Bart P; Silvestri, Giuliana; McKay, Gareth J; Koenekoop, Robert K; Millan, Jose M; Rosenberg, Thomas; Joensuu, Tarja; Sankila, Eeva‐Marja; Weil, Dominique; Weston, Mike D; Wissinger, Bernd; Kremer, Hannie

2007-01-01

Background: Usher syndrome, a combination of retinitis pigmentosa (RP) and sensorineural hearing loss with or without vestibular dysfunction, displays a high degree of clinical and genetic heterogeneity. Three clinical subtypes can be distinguished, based on the age of onset and severity of the hearing impairment, and the presence or absence of vestibular abnormalities. Thus far, eight genes have been implicated in the syndrome, together comprising 347 protein‐coding exons. Methods: To improve DNA diagnostics for patients with Usher syndrome, we developed a genotyping microarray based on the arrayed primer extension (APEX) method. Allele‐specific oligonucleotides corresponding to all 298 Usher syndrome‐associated sequence variants known to date, 76 of which are novel, were arrayed. Results: Approximately half of these variants were validated using original patient DNAs, which yielded an accuracy of >98%. The efficiency of the Usher genotyping microarray was tested using DNAs from 370 unrelated European and American patients with Usher syndrome. Sequence variants were identified in 64/140 (46%) patients with Usher syndrome type I, 45/189 (24%) patients with Usher syndrome type II, 6/21 (29%) patients with Usher syndrome type III and 6/20 (30%) patients with atypical Usher syndrome. The chip also identified two novel sequence variants, c.400C>T (p.R134X) in PCDH15 and c.1606T>C (p.C536S) in USH2A. Conclusion: The Usher genotyping microarray is a versatile and affordable screening tool for Usher syndrome. Its efficiency will improve with the addition of novel sequence variants with minimal extra costs, making it a very useful first‐pass screening tool. PMID:16963483
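The per-subtype detection rates in the Usher microarray record can be rechecked from the counts the abstract reports. The short Python sketch below does that arithmetic; the dictionary layout and function name are illustrative choices, and the pooled total is a derived figure that the abstract itself does not state.

```python
# Counts as reported in the abstract: (patients with a variant, patients tested).
detected = {
    "type I": (64, 140),
    "type II": (45, 189),
    "type III": (6, 21),
    "atypical": (6, 20),
}

def detection_rates(counts):
    """Map each subtype to its detection rate as a whole-number percentage."""
    return {k: round(100 * hits / n) for k, (hits, n) in counts.items()}

rates = detection_rates(detected)
total_hits = sum(h for h, _ in detected.values())  # pooled across subtypes
total_n = sum(n for _, n in detected.values())     # should equal the 370 patients
print(rates, f"overall {total_hits}/{total_n}")
```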
Functional Analyses of a Novel Splice Variant in the CHD7 Gene, Found by Next Generation Sequencing, Confirm Its Pathogenicity in a Spanish Patient and Diagnose Him with CHARGE Syndrome.

PubMed

Villate, Olatz; Ibarluzea, Nekane; Fraile-Bethencourt, Eugenia; Valenzuela, Alberto; Velasco, Eladio A; Grozeva, Detelina; Raymond, F L; Botella, María P; Tejada, María-Isabel

2018-01-01

Mutations in CHD7 have been shown to be a major cause of CHARGE syndrome, which presents many symptoms and features common to other syndromes, making its diagnosis difficult. Next generation sequencing (NGS) of a panel of intellectual disability related genes was performed in an adult patient without molecular diagnosis. A splice donor variant in CHD7 (c.5665+1G>T) was identified. To study its potential pathogenicity, exons and flanking intronic sequences were amplified from patient DNA and cloned into the pSAD® splicing vector. HeLa cells were transfected with this construct and a wild-type minigene, and functional analyses were performed. The construct with the c.5665+1G>T variant produced an aberrant transcript with an insert of 63 nucleotides of intron 28 creating a premature termination codon (TAG) 25 nucleotides downstream. This would lead to the insertion of 8 new amino acids and therefore a truncated 1896 amino acid protein. As a result of this, the patient was diagnosed with CHARGE syndrome. Functional analyses underline their usefulness for studying the pathogenicity of variants found by NGS and therefore their application to accurately diagnose patients.
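The reading-frame logic behind a retained intronic insert that introduces a premature termination codon can be sketched in Python. The sequence below is a toy example (hypothetical, not the CHD7 intron 28 sequence); it only mirrors the pattern of eight new sense codons followed by TAG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop(seq, frame=0):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None

# Toy retained-intron sequence (assumption): 8 sense codons, then TAG.
toy = "GTA" "AGT" "CCT" "GGA" "ACC" "TTG" "CAC" "GAT" "TAG" "GCA"
idx = first_stop(toy)
new_residues = idx // 3  # codons translated before the stop is reached
print(idx, new_residues)
```

Scanning in steps of three from the chosen frame is what distinguishes a true premature stop from a TAG, TAA or TGA triplet that merely occurs out of frame.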
Compartmentalization of HIV-1 within the female genital tract is due to monotypic and low-diversity variants not distinct viral populations.

PubMed

Bull, Marta; Learn, Gerald; Genowati, Indira; McKernan, Jennifer; Hitti, Jane; Lockhart, David; Tapia, Kenneth; Holte, Sarah; Dragavon, Joan; Coombs, Robert; Mullins, James; Frenkel, Lisa

2009-09-22

Compartmentalization of HIV-1 between the genital tract and blood was noted in half of 57 women included in 12 studies primarily using cell-free virus. To further understand differences between genital tract and blood viruses of women with chronic HIV-1 infection cell-free and cell-associated virus populations were sequenced from these tissues, reasoning that integrated viral DNA includes variants archived from earlier in infection, and provides a greater array of genotypes for comparisons. Multiple sequences from single-genome-amplification of HIV-1 RNA and DNA from the genital tract and blood of each woman were compared in a cross-sectional study. Maximum likelihood phylogenies were evaluated for evidence of compartmentalization using four statistical tests. Genital tract and blood HIV-1 appears compartmentalized in 7/13 women by ≥2 statistical analyses. These subjects' phylograms were characterized by low diversity genital-specific viral clades interspersed between clades containing both genital and blood sequences. Many of the genital-specific clades contained monotypic HIV-1 sequences. In 2/7 women, HIV-1 populations were significantly compartmentalized across all four statistical tests; both had low diversity genital tract-only clades. Collapsing monotypic variants into a single sequence diminished the prevalence and extent of compartmentalization. Viral sequences did not demonstrate tissue-specific signature amino acid residues, differential immune selection, or co-receptor usage. In women with chronic HIV-1 infection multiple identical sequences suggest proliferation of HIV-1-infected cells, and low diversity tissue-specific phylogenetic clades are consistent with bursts of viral replication. These monotypic and tissue-specific viruses provide statistical support for compartmentalization of HIV-1 between the female genital tract and blood. However, the intermingling of these clades with clades comprised of both genital and blood sequences and the absence of tissue-specific genetic features suggests compartmentalization between blood and genital tract may be due to viral replication and proliferation of infected cells, and questions whether HIV-1 in the female genital tract is distinct from blood.
Evaluation of point mutations in dystrophin gene in Iranian Duchenne and Becker muscular dystrophy patients: introducing three novel variants.

PubMed

Haghshenas, Maryam; Akbari, Mohammad Taghi; Karizi, Shohreh Zare; Deilamani, Faravareh Khordadpoor; Nafissi, Shahriar; Salehi, Zivar

2016-06-01

Duchenne and Becker muscular dystrophies (DMD and BMD) are X-linked neuromuscular diseases characterized by progressive muscular weakness and degeneration of skeletal muscles. Approximately two-thirds of the patients have large deletions or duplications in the dystrophin gene and the remaining one-third have point mutations. This study was performed to evaluate point mutations in Iranian DMD/BMD male patients. A total of 29 DNA samples from patients who did not show any large deletion/duplication mutations following multiplex polymerase chain reaction (PCR) and multiplex ligation-dependent probe amplification (MLPA) screening were sequenced for detection of point mutations in exons 50-79. Also exon 44 was sequenced in one sample in which a false positive deletion was detected by MLPA method. Cycle sequencing revealed four nonsense, one frameshift and two splice site mutations as well as two missense variants.
Molecular characterization and phylogenetic analysis of a yak (Bos grunniens) κ-casein cDNA from lactating mammary gland.

PubMed

Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H

2011-04-01

κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA were synthesized by RT-PCR technique, then cloned and sequenced. The obtained cDNA of 660-bp contained an ORF sufficient to encode the entire amino acid sequence of κ-casein precursor protein consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in the genetic engineering researches for yak κ-casein.
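The identity percentages reported for the yak κ-casein comparisons are computed over aligned sequences. A minimal Python sketch of one common definition follows (matches divided by ungapped aligned columns); alignment tools differ in how they treat gaps, so this is an assumption rather than the authors' exact method, and the input fragments are toy data. The last line checks that the reported ORF size fits the 660-bp cDNA.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy aligned fragments (hypothetical, not real kappa-casein sequences).
pid = percent_identity("ATGATGAAGA", "ATGATGAAGT")

# ORF length implied by the abstract: 190 codons plus one stop codon.
orf_nt = (190 + 1) * 3
print(pid, orf_nt)
```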
Molecular identification and phylogenetic analysis of Wuchereria bancrofti from human blood samples in Egypt.

PubMed

Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A

2017-03-01

Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents in endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Collected blood samples from residents in filariasis endemic areas in five governorates were subjected to semi-nested PCR targeting repeated DNA sequence, for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed PCR results identifying only W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. Study results demonstrated that the semi-nested PCR proved to be an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable for samples collected in the daytime as well as the night time. Sequencing of PCR products and phylogenetic analysis revealed three different nucleotide sequence variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.

Evolutionary history of Wolbachia infections in the fire ant Solenopsis invicta

PubMed Central

Ahrens, Michael E; Shoemaker, Dewayne

2005-01-01

Background: Wolbachia are endosymbiotic bacteria that commonly infect numerous arthropods. Despite their broad taxonomic distribution, the transmission patterns of these bacteria within and among host species are not well understood. We sequenced a portion of the wsp gene from the Wolbachia genome infecting 138 individuals from eleven geographically distributed native populations of the fire ant Solenopsis invicta. We then compared these wsp sequence data to patterns of mitochondrial DNA (mtDNA) variation of both infected and uninfected host individuals to infer the transmission patterns of Wolbachia in S. invicta. Results: Three different Wolbachia (wsp) variants occur within S. invicta, all of which are identical to previously described strains in fire ants. A comparison of the distribution of Wolbachia variants within S. invicta to a phylogeny of mtDNA haplotypes suggests S. invicta has acquired Wolbachia infections on at least three independent occasions. One common Wolbachia variant in S. invicta (wSinvictaB) is associated with two divergent mtDNA haplotype clades. Further, within each of these clades, Wolbachia-infected and uninfected individuals possess virtually identical subsets of mtDNA haplotypes, including both putative derived and ancestral mtDNA haplotypes. The same pattern also holds for wSinvictaA, where at least one and as many as three invasions into S. invicta have occurred. These data suggest that the initial invasions of Wolbachia into host ant populations may be relatively ancient and have been followed by multiple secondary losses of Wolbachia in different infected lineages over time. Finally, our data also provide additional insights into the factors responsible for previously reported variation in Wolbachia prevalence among S. invicta populations. Conclusion: The history of Wolbachia infections in S. invicta is rather complex and involves multiple invasions or horizontal transmission events of Wolbachia into this species. Although these Wolbachia infections apparently have been present for relatively long time periods, these data clearly indicate that Wolbachia infections frequently have been secondarily lost within different lineages. Importantly, the uncoupled transmission of the Wolbachia and mtDNA genomes suggests that the presumed effects of Wolbachia on mtDNA evolution within S. invicta are less severe than originally predicted. Thus, the common concern that use of mtDNA markers for studying the evolutionary history of insects is confounded by maternally inherited endosymbionts such as Wolbachia may be somewhat unwarranted in the case of S. invicta. PMID:15927071
Identification of a novel 16S rRNA gene variant of Actinomyces funkei from six patients with purulent infections.

PubMed

Hinić, V; Straub, C; Schultheiss, E; Kaempfer, P; Frei, R; Goldenberger, D

2013-07-01

Little is known about the clinical significance and laboratory diagnosis of Actinomyces funkei. In this report we describe six clinical cases where A. funkei was isolated from purulent, polymicrobial infections. Conventional identification procedures were compared with molecular methods including matrix-assisted laser desorption/ionization time-of-flight mass spectrometry technique. Analysis of the full 16S rRNA gene sequence of the six investigated strains revealed differences from the A. funkei type strain. DNA-DNA hybridization showed that the clinical strains represent a novel 16S rRNA gene variant within the species of A. funkei. © 2013 The Authors Clinical Microbiology and Infection © 2013 European Society of Clinical Microbiology and Infectious Diseases.
FY*A silencing by the GATA-motif variant FY*A(-69C) in a Caucasian family.

PubMed

Písačka, Martin; Marinov, Iuri; Králová, Miroslava; Králová, Jana; Kořánová, Michaela; Bohoněk, Miloš; Sood, Chhavi; Ochoa-Garay, Gorka

2015-11-01

The c.1-67C variant polymorphism in a GATA motif of the FY promoter is known to result in erythroid-specific FY silencing, that is, in Fy(a-) and Fy(b-) phenotypes. A Caucasian donor presented with the very rare Fy(a-b-) phenotype and was further investigated. Genomic DNA was analyzed by sequencing to identify the cause of the Fy(a-b-) phenotype. Samples were collected from some of his relatives to establish a correlation between the serology and genotyping results. Red blood cells were analyzed by gel column agglutination and flow cytometry. Genomic DNA was analyzed on genotyping microarrays, by DNA sequencing and by allele-specific PCR. In the donor, a single-nucleotide polymorphism T>C within the GATA motif was found at Position c.1-69 of the FY promoter and shown to occur in the FY*A allele. His genotype was found to be FY*A(-69C), FY*BW.01. In six FY*A/FY*B heterozygous members of the family, a perfect correlation was found between the presence vs. absence of the FY*A(-69C) variant allele and a Fy(a-) vs. Fy(a+) phenotype. The location of the c.1-69C polymorphism in a GATA motif whose disruption is known to result in a Fy null phenotype, together with the perfect correlation between the presence of the FY*A(-69C) allele and the Fy(a-) phenotype support a cause-effect relationship between the two. © 2015 AABB.
FaStore - a space-saving solution for raw sequencing data.

PubMed

Roguski, Lukasz; Ochoa, Idoia; Hernaez, Mikel; Deorowicz, Sebastian

2018-03-29

The affordability of DNA sequencing has led to the generation of unprecedented volumes of raw sequencing data. These data must be stored, processed, and transmitted, which poses significant challenges. To facilitate this effort, we introduce FaStore, a specialized compressor for FASTQ files. FaStore does not use any reference sequences for compression, and permits the user to choose from several lossy modes to improve the overall compression ratio, depending on the specific needs. FaStore in the lossless mode achieves a significant improvement in compression ratio with respect to previously proposed algorithms. We perform an analysis on the effect that the different lossy modes have on variant calling, the most widely used application for clinical decision making, especially important in the era of precision medicine. We show that lossy compression can offer significant compression gains, while preserving the essential genomic information and without affecting the variant calling performance. Availability: FaStore can be downloaded from https://github.com/refresh-bio/FaStore. Contact: sebastian.deorowicz@polsl.pl. Supplementary data are available at Bioinformatics online.
Gene conversion events and variable degree of homogenization of rDNA loci in cultivars of Brassica napus

PubMed Central

Sochorová, Jana; Coriton, Olivier; Kuderová, Alena; Lunerová, Jana; Chèvre, Anne-Marie; Kovařík, Aleš

2017-01-01

Background and aims: Brassica napus (AACC, 2n = 38, oilseed rape) is a relatively recent allotetraploid species derived from the putative progenitor diploid species Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18). To determine the influence of intensive breeding conditions on the evolution of its genome, we analysed structure and copy number of rDNA in 21 cultivars of B. napus, representative of genetic diversity. Methods: We used next-generation sequencing genomic approaches, Southern blot hybridization, expression analysis and fluorescence in situ hybridization (FISH). Subgenome-specific sequences derived from rDNA intergenic spacers (IGS) were used as probes for identification of loci composition on chromosomes. Key Results: Most B. napus cultivars (18/21, 86 %) had more A-genome than C-genome rDNA copies. Three cultivars analysed by FISH (‘Darmor’, ‘Yudal’ and ‘Asparagus kale’) harboured the same number (12 per diploid set) of loci. In B. napus ‘Darmor’, the A-genome-specific rDNA probe hybridized to all 12 rDNA loci (eight on the A-genome and four on the C-genome) while the C-genome-specific probe showed weak signals on the C-genome loci only. Deep sequencing revealed high homogeneity of arrays suggesting that the C-genome genes were largely overwritten by the A-genome variants in B. napus ‘Darmor’. In contrast, B. napus ‘Yudal’ showed a lack of gene conversion evidenced by additive inheritance of progenitor rDNA variants and highly localized hybridization signals of subgenome-specific probes on chromosomes. Brassica napus ‘Asparagus kale’ showed an intermediate pattern to ‘Darmor’ and ‘Yudal’. At the expression level, most cultivars (95 %) exhibited stable A-genome nucleolar dominance while one cultivar (‘Norin 9’) showed co-dominance. Conclusions: The B. napus cultivars differ in the degree and direction of rDNA homogenization. The prevalent direction of gene conversion (towards the A-genome) correlates with the direction of expression dominance indicating that gene activity may be needed for interlocus gene conversion. PMID:27707747
Loss of syntaxin 3 causes variant microvillus inclusion disease.

PubMed

Wiegerinck, Caroline L; Janecke, Andreas R; Schneeberger, Kerstin; Vogel, Georg F; van Haaften-Visser, Désirée Y; Escher, Johanna C; Adam, Rüdiger; Thöni, Cornelia E; Pfaller, Kristian; Jordan, Alexander J; Weis, Cleo-Aron; Nijman, Isaac J; Monroe, Glen R; van Hasselt, Peter M; Cutz, Ernest; Klumperman, Judith; Clevers, Hans; Nieuwenhuis, Edward E S; Houwen, Roderick H J; van Haaften, Gijs; Hess, Michael W; Huber, Lukas A; Stapelbroek, Janneke M; Müller, Thomas; Middendorp, Sabine

2014-07-01

Microvillus inclusion disease (MVID) is a disorder of intestinal epithelial differentiation characterized by life-threatening intractable diarrhea. MVID can be diagnosed based on loss of microvilli, microvillus inclusions, and accumulation of subapical vesicles. Most patients with MVID have mutations in myosin Vb that cause defects in recycling of apical vesicles. Whole-exome sequencing of DNA from patients with variant MVID showed homozygous truncating mutations in syntaxin 3 (STX3). STX3 is an apical receptor involved in membrane fusion of apical vesicles in enterocytes. Patient-derived organoid cultures and overexpression of truncated STX3 in Caco-2 cells recapitulated most characteristics of variant MVID. We conclude that loss of STX3 function causes variant MVID. Copyright © 2014 AGA Institute. Published by Elsevier Inc. All rights reserved.
Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study

PubMed Central

Hou, Lin; Sun, Ning; Mane, Shrikant; Sayward, Fred; Rajeevan, Nallakkandi; Cheung, Kei-Hoi; Cho, Kelly; Pyarajan, Saiju; Aslan, Mihaela; Miller, Perry; Harvey, Philip D.; Gaziano, J. Michael; Concato, John; Zhao, Hongyu

2017-01-01

A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant’s DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/). PMID:28019059
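The finding that rare-variant power depends more on calling specificity than on sensitivity can be illustrated with a back-of-the-envelope calculation. The Python sketch below is not the GWAS.PC package or the authors' method: it uses a collapsed carrier-rate comparison as a simple stand-in for a burden test, assumes nondifferential errors, and all carrier rates and sample sizes are hypothetical.

```python
from math import sqrt

def observed_rate(p, sens, spec):
    """Carrier rate after calling errors: detected carriers plus false positives."""
    return sens * p + (1.0 - spec) * (1.0 - p)

def z_stat(p_case, p_ctrl, n, sens=1.0, spec=1.0):
    """Two-proportion z statistic with n cases and n controls."""
    q1 = observed_rate(p_case, sens, spec)
    q0 = observed_rate(p_ctrl, sens, spec)
    pooled = (q1 + q0) / 2.0
    return (q1 - q0) / sqrt(pooled * (1.0 - pooled) * 2.0 / n)

# Hypothetical setting: 2% carriers in cases, 1% in controls, 1000 per group.
z_perfect = z_stat(0.02, 0.01, 1000)
z_low_sens = z_stat(0.02, 0.01, 1000, sens=0.9)   # 10% of carriers missed
z_low_spec = z_stat(0.02, 0.01, 1000, spec=0.9)   # 10% false-positive carriers
print(z_perfect, z_low_sens, z_low_spec)
```

Under these assumptions, losing sensitivity shrinks both groups' carrier rates proportionally and the statistic drops only slightly, while losing specificity adds a large shared background of false carriers that inflates the variance and reduces the statistic far more.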
Novel variant in the TP63 gene associated to ankyloblepharon-ectodermal dysplasia-cleft lip/palate (AEC) syndrome.

PubMed

Gonzalez, Francisco; Loidi, Lourdes; Abalo-Lojo, Jose M

2017-01-01

Ankyloblepharon-ectodermal dysplasia-cleft lip/palate (AEC) syndrome is a disorder resulting from anomalous embryonic development of ectodermal tissues. There is evidence that AEC syndrome is caused by mutations in the TP63 gene, which encodes the p63 protein. This is an important regulatory protein involved in epidermal proliferation and differentiation. Genome sequencing was performed in DNA from peripheral blood leukocytes of a newborn with AEC syndrome and her parents. Variants were searched in all coding exons and intron-exon boundaries of the TP63 gene. A heterozygous missense variant (NM_003722.4:c.1063G>C (p.Asp355His)) was found in the newborn patient. No variants were found in either of the parents. We identified a previously unreported variant in the TP63 gene which seems to be involved in the somatic malformations found in AEC syndrome. The absence of this variant in both parents suggests that the variant appeared de novo.
The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions | Office of Cancer Genomics

Cancer.gov

We present the molecular landscape of pediatric acute myeloid leukemia (AML) and characterize nearly 1,000 participants in Children’s Oncology Group (COG) AML trials. The COG–National Cancer Institute (NCI) TARGET AML initiative assessed cases by whole-genome, targeted DNA, mRNA and microRNA sequencing and CpG methylation profiling. Validated DNA variants corresponded to diverse, infrequent mutations, with fewer than 40 genes mutated in >2% of cases.
Digital PCR methods improve detection sensitivity and measurement precision of low abundance mtDNA deletions.

PubMed

Belmonte, Frances R; Martin, James L; Frescura, Kristin; Damas, Joana; Pereira, Filipe; Tarnopolsky, Mark A; Kaufman, Brett A

2016-04-28

Mitochondrial DNA (mtDNA) mutations are a common cause of primary mitochondrial disorders, and have also been implicated in a broad collection of conditions, including aging, neurodegeneration, and cancer. Prevalent among these pathogenic variants are mtDNA deletions, which show a strong bias for the loss of sequence in the major arc between, but not including, the heavy and light strand origins of replication. Because individual mtDNA deletions can accumulate focally, occur with multiple mixed breakpoints, and in the presence of normal mtDNA sequences, methods that detect broad-spectrum mutations with enhanced sensitivity and limited costs have both research and clinical applications. In this study, we evaluated semi-quantitative and digital PCR-based methods of mtDNA deletion detection using double-stranded reference templates or biological samples. Our aim was to describe key experimental assay parameters that will enable the analysis of low levels or small differences in mtDNA deletion load during disease progression, with limited false-positive detection. We determined that the digital PCR method significantly improved mtDNA deletion detection sensitivity through absolute quantitation, improved precision and reduced assay standard error.
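Absolute quantitation in digital PCR conventionally rests on a Poisson correction of the fraction of positive partitions. The Python sketch below shows that standard calculation and a derived deletion load; the two-assay design (one assay for deleted molecules, one for all mtDNA), the function names and the partition counts are assumptions for illustration, not details taken from the study.

```python
from math import log

def copies_per_partition(positive, total):
    """Poisson-corrected mean copies per partition from the positive fraction."""
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -log(1.0 - positive / total)

def deletion_load(del_pos, total_pos, partitions):
    """Deleted mtDNA as a fraction of all mtDNA copies (same partition count)."""
    deleted = copies_per_partition(del_pos, partitions)
    total = copies_per_partition(total_pos, partitions)
    return deleted / total

# Hypothetical partition counts (assumption): not data from the study.
load = deletion_load(del_pos=20, total_pos=10000, partitions=20000)
print(f"deletion load ~ {load:.4%}")
```

The correction matters most for the abundant target: when half the partitions are positive, many of them hold more than one template, so counting positives directly would underestimate the total copy number.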
Digital PCR methods improve detection sensitivity and measurement precision of low abundance mtDNA deletions

PubMed Central

Belmonte, Frances R.; Martin, James L.; Frescura, Kristin; Damas, Joana; Pereira, Filipe; Tarnopolsky, Mark A.; Kaufman, Brett A.

2016-01-01

Mitochondrial DNA (mtDNA) mutations are a common cause of primary mitochondrial disorders, and have also been implicated in a broad collection of conditions, including aging, neurodegeneration, and cancer. Prevalent among these pathogenic variants are mtDNA deletions, which show a strong bias for the loss of sequence in the major arc between, but not including, the heavy and light strand origins of replication. Because individual mtDNA deletions can accumulate focally, occur with multiple mixed breakpoints, and in the presence of normal mtDNA sequences, methods that detect broad-spectrum mutations with enhanced sensitivity and limited costs have both research and clinical applications. In this study, we evaluated semi-quantitative and digital PCR-based methods of mtDNA deletion detection using double-stranded reference templates or biological samples. Our aim was to describe key experimental assay parameters that will enable the analysis of low levels or small differences in mtDNA deletion load during disease progression, with limited false-positive detection. We determined that the digital PCR method significantly improved mtDNA deletion detection sensitivity through absolute quantitation, improved precision and reduced assay standard error. PMID:27122135
Epigenome-wide inheritance of cytosine methylation variants in a recombinant inbred population

PubMed Central

Schmitz, Robert J.; He, Yupeng; Valdés-López, Oswaldo; Khan, Saad M.; Joshi, Trupti; Urich, Mark A.; Nery, Joseph R.; Diers, Brian; Xu, Dong; Stacey, Gary; Ecker, Joseph R.

2013-01-01

Cytosine DNA methylation is one avenue for passing information through cell divisions. Here, we present epigenomic analyses of soybean recombinant inbred lines (RILs) and their parents. Identification of differentially methylated regions (DMRs) revealed that DMRs mostly cosegregated with the genotype from which they were derived, but examples of the uncoupling of genotype and epigenotype were identified. Linkage mapping of methylation states assessed from whole-genome bisulfite sequencing of 83 RILs uncovered widespread evidence for local methylQTL. This epigenomics approach provides a comprehensive study of the patterns and heritability of methylation variants in a complex genetic population over multiple generations, paving the way for understanding how methylation variants contribute to phenotypic variation. PMID:23739894
Epigenome-wide inheritance of cytosine methylation variants in a recombinant inbred population.

PubMed

Schmitz, Robert J; He, Yupeng; Valdés-López, Oswaldo; Khan, Saad M; Joshi, Trupti; Urich, Mark A; Nery, Joseph R; Diers, Brian; Xu, Dong; Stacey, Gary; Ecker, Joseph R

2013-10-01

Cytosine DNA methylation is one avenue for passing information through cell divisions. Here, we present epigenomic analyses of soybean recombinant inbred lines (RILs) and their parents. Identification of differentially methylated regions (DMRs) revealed that DMRs mostly cosegregated with the genotype from which they were derived, but examples of the uncoupling of genotype and epigenotype were identified. Linkage mapping of methylation states assessed from whole-genome bisulfite sequencing of 83 RILs uncovered widespread evidence for local methylQTL. This epigenomics approach provides a comprehensive study of the patterns and heritability of methylation variants in a complex genetic population over multiple generations, paving the way for understanding how methylation variants contribute to phenotypic variation.
Characterization of an Equine α-S2-Casein Variant Due to a 1.3 kb Deletion Spanning Two Coding Exons

PubMed Central

Brinkmann, Julia; Koudelka, Tomas; Keppler, Julia K.; Tholey, Andreas; Schwarz, Karin; Thaller, Georg; Tetens, Jens

2015-01-01

The production and consumption of mare’s milk in Europe has gained importance, mainly based on positive health effects and a lower allergenic potential as compared to cows’ milk. The allergenicity of milk is to a certain extent affected by different genetic variants. In classical dairy species, much research has been conducted into the genetic variability of milk proteins, but the knowledge in horses is scarce. Here, we characterize two major forms of equine αS2-casein arising from a genomic 1.3 kb in-frame deletion involving two coding exons, one of which represents an equid-specific duplication. Findings at the DNA level have been verified by cDNA sequencing from horse milk of mares with different genotypes. At the protein level, we were able to show by SDS-PAGE and in-gel digestion with subsequent LC-MS analysis that both proteins are actually expressed. The comparison with published sequences of other equids revealed that the deletion has probably occurred before the ancestor of present-day asses and zebras diverged from the horse lineage. PMID:26444874
High-throughput engineering of a mammalian genome reveals building principles of methylation states at CG rich regions.

PubMed

Krebs, Arnaud R; Dessus-Babus, Sophie; Burger, Lukas; Schübeler, Dirk

2014-09-26

The majority of mammalian promoters are CpG islands; regions of high CG density that require protection from DNA methylation to be functional. Importantly, how sequence architecture mediates this unmethylated state remains unclear. To address this question in a comprehensive manner, we developed a method to interrogate methylation states of hundreds of sequence variants inserted at the same genomic site in mouse embryonic stem cells. Using this assay, we were able to quantify the contribution of various sequence motifs towards the resulting DNA methylation state. Modeling of this comprehensive dataset revealed that CG density alone is a minor determinant of their unmethylated state. Instead, these data argue for a principal role for transcription factor binding sites, a prediction confirmed by testing synthetic mutant libraries. Taken together, these findings establish the hierarchy between the two cis-encoded mechanisms that define the DNA methylation state and thus the transcriptional competence of CpG islands.
Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1 × ) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
3' rapid amplification of cDNA ends (RACE) walking for rapid structural analysis of large transcripts.

PubMed

Ozawa, Tatsuhiko; Kondo, Masato; Isobe, Masaharu

2004-01-01

The 3' rapid amplification of cDNA ends (3' RACE) is widely used to isolate the cDNA of unknown 3' flanking sequences. However, the conventional 3' RACE often fails to amplify cDNA from a large transcript if there is a long distance between the 5' gene-specific primer and poly(A) stretch, since the conventional 3' RACE utilizes 3' oligo-dT-containing primer complementary to the poly(A) tail of mRNA at the first strand cDNA synthesis. To overcome this problem, we have developed an improved 3' RACE method suitable for the isolation of cDNA derived from very large transcripts. By using the oligonucleotide-containing random 9mer together with the GC-rich sequence for the suppression PCR technology at the first strand of cDNA synthesis, we have been able to amplify the cDNA from a very large transcript, such as the microtubule-actin crosslinking factor 1 (MACF1) gene, which codes a transcript of 20 kb in size. When there is no splicing variant, our highly specific amplification allows us to perform the direct sequencing of 3' RACE products without requiring cloning in bacterial hosts. Thus, this stepwise 3' RACE walking will help rapid characterization of the 3' structure of a gene, even when it encodes a very large transcript.
Genomic Approach to Understand the Association of DNA Repair with Longevity and Healthy Aging Using Genomic Databases of Oldest-Old Population

PubMed Central

Kim, Hyun Soo

2018-01-01

The aged population is increasing worldwide as a result of the inevitable aging process. Accordingly, longevity and healthy aging have come into the spotlight as ways to promote the social contribution of the aged population. Many studies in the past few decades have examined the processes of aging and longevity, emphasizing the importance of maintaining genomic stability in exceptionally long-lived populations. The underlying basis of longevity remains unclear because of its complexity, which involves multiple factors. With advances in sequencing technology and human genome-associated approaches, population-based genomic studies are increasing. In this review, we summarize recent longevity and healthy aging studies of human populations, focusing on DNA repair as a major factor in maintaining genome integrity. To keep pace with recent growth in genomic research, aging- and longevity-associated genomic databases are also briefly introduced. To suggest novel approaches for investigating longevity-associated genetic variants related to DNA repair using genomic databases, gene set analysis was conducted, focusing on DNA repair- and longevity-associated genes. Their biological networks were additionally analyzed to identify major factors containing genetic variants of human longevity and healthy aging in DNA repair mechanisms. In summary, this review emphasizes DNA repair activity in human longevity and suggests an approach for conducting DNA repair-associated genomic studies of healthy human aging.
Molecular characterization of canine parvovirus variants (CPV-2a, CPV-2b, and CPV-2c) based on the VP2 gene in affected domestic dogs in Ecuador

PubMed Central

la Torre, David De; Mafla, Eulalia; Puga, Byron; Erazo, Linda; Astolfi-Ferreira, Claudete; Ferreira, Antonio Piantino

2018-01-01

Aim: The objective of this study was to determine the presence of the variants of canine parvovirus (CPV)-2 in the city of Quito, Ecuador, due to the high domestic and street-type canine population, and to identify possible mutations at a genetic level that could be causing structural changes in the virus with a consequent influence on the immune response of the hosts. Materials and Methods: Thirty-five stool samples from different puppies with characteristic signs of the disease and positive for CPV through immunochromatography kits were collected from different veterinary clinics of the city. Polymerase chain reaction and DNA sequencing were used to determine the mutations in residue 426 of the VP2 gene, which determines the variants of CPV-2; in addition, four samples were chosen for complete sequencing of the VP2 gene to identify all possible mutations in the circulating strains in this region of the country. Results: The results revealed the presence of the three variants of CPV-2 with a prevalence of 57.1% (20/35) for CPV-2a, 8.6% (3/35) for CPV-2b, and 34.3% (12/35) for CPV-2c. In addition, complete sequencing of the VP2 gene showed amino acid substitutions in residues 87, 101, 139, 219, 297, 300, 305, 322, 324, 375, 386, 426, 440, and 514 of the three Ecuadorian variants when compared with the original CPV-2 sequence. Conclusion: This study describes the detection of CPV variants in the city of Quito, Ecuador. Variants of CPV-2 (2a, 2b, and 2c) have been reported in South America, and there are cases in Ecuador where CPV-2 is affecting even vaccinated puppies. PMID:29805214

Preferential Targeting of Conserved Gag Regions after Vaccination with a Heterologous DNA Prime-Modified Vaccinia Virus Ankara Boost HIV-1 Vaccine Regimen.

PubMed

Bauer, Asli; Podola, Lilli; Mann, Philipp; Missanga, Marco; Haule, Antelmo; Sudi, Lwitiho; Nilsson, Charlotta; Kaluwa, Bahati; Lueer, Cornelia; Mwakatima, Maria; Munseri, Patricia J; Maboko, Leonard; Robb, Merlin L; Tovanabutra, Sodsai; Kijak, Gustavo; Marovich, Mary; McCormack, Sheena; Joseph, Sarah; Lyamuya, Eligius; Wahren, Britta; Sandström, Eric; Biberfeld, Gunnel; Hoelscher, Michael; Bakari, Muhammad; Kroidl, Arne; Geldmacher, Christof

2017-09-15

Prime-boost vaccination strategies against HIV-1 often include multiple variants for a given immunogen for better coverage of the extensive viral diversity. To study the immunologic effects of this approach, we characterized breadth, phenotype, function, and specificity of Gag-specific T cells induced by a DNA-prime modified vaccinia virus Ankara (MVA)-boost vaccination strategy, which uses mismatched Gag immunogens in the TamoVac 01 phase IIa trial. Healthy Tanzanian volunteers received three injections of the DNA-SMI vaccine encoding a subtype B and AB-recombinant Gag p37 and two vaccinations with MVA-CMDR encoding subtype A Gag p55. Gag-specific T-cell responses were studied in 42 vaccinees using fresh peripheral blood mononuclear cells. After the first MVA-CMDR boost, vaccine-induced gamma interferon-positive (IFN-γ+) Gag-specific T-cell responses were dominated by CD4+ T cells (P < 0.001 compared to CD8+ T cells) that coexpressed interleukin-2 (IL-2) (66.4%) and/or tumor necrosis factor alpha (TNF-α) (63.7%). A median of 3 antigenic regions were targeted with a higher-magnitude median response to Gag p24 regions, more conserved between prime and boost, compared to those of regions within Gag p15 (not primed) and Gag p17 (less conserved; P < 0.0001 for both). Four regions within Gag p24 each were targeted by 45% to 74% of vaccinees upon restimulation with DNA-SMI-Gag matched peptides. The response rate to individual antigenic regions correlated with the sequence homology between the MVA- and DNA Gag-encoded immunogens (P = 0.04, r² = 0.47). In summary, after the first MVA-CMDR boost, the sequence-mismatched DNA-prime MVA-boost vaccine strategy induced a Gag-specific T-cell response that was dominated by polyfunctional CD4+ T cells and that targeted multiple antigenic regions within the conserved Gag p24 protein. IMPORTANCE: Genetic diversity is a major challenge for the design of vaccines against variable viruses. While including multiple variants for a given immunogen in prime-boost vaccination strategies is one approach that aims to improve coverage for global virus variants, the immunologic consequences of this strategy have been poorly defined so far. It is unclear whether inclusion of multiple variants in prime-boost vaccination strategies improves recognition of variant viruses by T cells and by which mechanisms this would be achieved, either by improved cross-recognition of multiple variants for a given antigenic region or through preferential targeting of antigenic regions more conserved between prime and boost. Engineering vaccines to induce adaptive immune responses that preferentially target conserved antigenic regions of viral vulnerability might facilitate better immune control after preventive and therapeutic vaccination for HIV and for other variable viruses. Copyright © 2017 American Society for Microbiology.
Variant Profiling of Candidate Genes in Pancreatic Ductal Adenocarcinoma.

PubMed

Huang, Jiaqi; Löhr, Johannes-Matthias; Nilsson, Magnus; Segersvärd, Ralf; Matsson, Hans; Verbeke, Caroline; Heuchel, Rainer; Kere, Juha; Iafrate, A John; Zheng, Zongli; Ye, Weimin

2015-11-01

Pancreatic ductal adenocarcinoma (PDAC) has a poor prognosis. Variant profiling is crucial for developing personalized treatment and elucidating the etiology of this disease. Patients with PDAC undergoing surgery from 2007 to 2012 (n = 73) were followed from diagnosis until death or the end of the study. We applied an anchored multiplex PCR (AMP)-based next-generation sequencing (NGS) method to a panel of 65 selected genes and assessed analytical performance by sequencing a quantitative multiplex DNA reference standard. In clinical PDAC samples, detection of low-level KRAS (Kirsten rat sarcoma viral oncogene homolog) mutations was validated by allele-specific PCR and digital PCR. We compared overall survival of patients according to KRAS mutation status by log-rank test and applied logistic regression to evaluate the association between smoking and tumor variant types. The AMP-based NGS method could detect variants with allele frequencies as low as 1% given sufficient sequencing depth (>1500×). Low-frequency KRAS G12 mutations (allele frequency 1%-5%) were all confirmed by allele-specific PCR and digital PCR. The most prevalent genetic alterations were in KRAS (78% of patients), TP53 (tumor protein p53) (25%), and SMAD4 (SMAD family member 4) (8%). Overall survival in T3-stage PDAC patients differed among KRAS mutation subtypes (P = 0.019). Transversion variants were more common in ever-smokers than in never-smokers (odds ratio 5.7; 95% CI 1.2-27.8). The AMP-based NGS method is applicable for profiling tumor variants. Using this approach, we demonstrated that in PDAC patients, KRAS mutant subtype G12V is associated with poorer survival, and that transversion variants are more common among smokers. © 2015 American Association for Clinical Chemistry.
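The link between sequencing depth and the ability to see a 1% variant can be illustrated with a simple binomial model. The sketch below is not the authors' analysis: it assumes independent reads, no sequencing error, and a hypothetical calling rule that requires at least `min_reads` variant-supporting reads.

```python
from math import comb


def prob_at_least(min_reads, depth, allele_fraction):
    """P(X >= min_reads) for X ~ Binomial(depth, allele_fraction).

    Intended for small min_reads; each term is a binomial probability
    of seeing exactly i variant-supporting reads.
    """
    p_fewer = sum(
        comb(depth, i)
        * allele_fraction ** i
        * (1 - allele_fraction) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


def expected_variant_reads(depth, allele_fraction):
    """Mean number of variant-supporting reads at a given depth."""
    return depth * allele_fraction


if __name__ == "__main__":
    # Hypothetical rule: call a variant when >= 5 reads support it.
    for depth in (100, 500, 1500):
        p = prob_at_least(5, depth, 0.01)
        print(f"depth {depth}: P(>=5 variant reads at 1% AF) = {p:.4f}")
```

Under these assumptions, a 1% variant yields about 15 supporting reads on average at 1500× but only about one at 100×, which is why a low allele-frequency detection limit is tied to high depth. Real detection limits also depend on error rates and library complexity, which this model ignores.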
Discovery of a novel HLA-B*51 variant, B*51:112, in a Taiwanese bone marrow donor and identification of the plausible HLA haplotype in association with B*51:112.

PubMed

Yang, K L; Lee, S K; Lin, P Y

2012-10-01

The sequence of B*51:112 is identical to the sequence of B*51:01:01 in exons 2, 3 and 4, except the nucleotides at positions 206 (C→A) and 213 (C→G). The nucleotide replacement caused one amino acid substitution at residue 45 (T→K). The plausible HLA-A, -B and -DRB1 haplotype in association with B*51:112 may be deduced as HLA-A*02-B*51:112-DRB1*12. The generation of B*51:112 was probably the result of a DNA recombination event where B*40:01:01 acted as a sequence donor donating a segment of the DNA sequence to the recipient sequence B*51:01:01. The donor carrying B*51:112 was a Minna Taiwanese whose ancestor came to Taiwan from the southern region of China. © 2012 Blackwell Publishing Ltd.
DNA methylation and targeted sequencing of methyltransferases family genes in canine acute myeloid leukaemia, modelling human myeloid leukaemia.

PubMed

Bronzini, I; Aresu, L; Paganin, M; Marchioretto, L; Comazzi, S; Cian, F; Riondato, F; Marconato, L; Martini, V; Te Kronnie, G

2017-09-01

Tumours show aberrant DNA methylation patterns, being hypermethylated or hypomethylated compared with normal tissues. In human acute myeloid leukaemia (hAML), mutations in DNA methyltransferase (DNMT3A) are associated with more aggressive tumour behaviour. As AML is lethal in dogs, we defined global DNA methylation content, and screened the C-terminal domain of the DNMT3 family of genes for sequence variants in 39 canine acute myeloid leukaemia (cAML) cases. A heterogeneous pattern of DNA methylation was found among cAML samples, with subsets of cases being hypermethylated or hypomethylated compared with healthy controls; four recurrent single nucleotide variations (SNVs) were found in the DNMT3L gene. Although SNVs were not directly correlated with whole-genome DNA methylation levels, all hypomethylated cAML cases were homozygous for the deleterious mutation at p.Arg222Trp. This study contributes to understanding the genetic modifications of cAML, leading up to studies that will elucidate the role of methylome alterations in the pathogenesis of AML in dogs. © 2016 John Wiley & Sons Ltd.
Transforming single DNA molecules into fluorescent magnetic particles for detection and enumeration of genetic variations

PubMed Central

Dressman, Devin; Yan, Hai; Traverso, Giovanni; Kinzler, Kenneth W.; Vogelstein, Bert

2003-01-01

Many areas of biomedical research depend on the analysis of uncommon variations in individual genes or transcripts. Here we describe a method that can quantify such variation at a scale and ease heretofore unattainable. Each DNA molecule in a collection of such molecules is converted into a single magnetic particle to which thousands of copies of DNA identical in sequence to the original are bound. This population of beads then corresponds to a one-to-one representation of the starting DNA molecules. Variation within the original population of DNA molecules can then be simply assessed by counting fluorescently labeled particles via flow cytometry. This approach is called BEAMing on the basis of four of its principal components (beads, emulsion, amplification, and magnetics). Millions of individual DNA molecules can be assessed in this fashion with standard laboratory equipment. Moreover, specific variants can be isolated by flow sorting and used for further experimentation. BEAMing can be used for the identification and quantification of rare mutations as well as to study variations in gene sequences or transcripts in specific populations or tissues. PMID:12857956
Exome sequencing analysis reveals variants in primary immunodeficiency genes in patients with very early onset inflammatory bowel disease.

PubMed

Kelsen, Judith R; Dawany, Noor; Moran, Christopher J; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F; Daly, Mark; Sullivan, Kathleen E; Baldassano, Robert N; Devoto, Marcella

2015-11-01

Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed at 5 years of age or younger, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (age, 3 wk to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by postprocessing and variant calling. After functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency less than 0.1%, and scaled combined annotation-dependent depletion scores of 10 or less. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n = 45) or adult-onset Crohn's disease (n = 20) and healthy individuals (controls, n = 145) were obtained from the University of Kiel, Germany, and used as control groups. Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling more than 1 Mbp of coding sequence, were selected from the whole-exome data. Our analysis showed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. Our analysis could lead to the identification of previously unidentified IBD-associated variants. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
A powerful tool for genome analysis in maize: development and evaluation of the high density 600 k SNP genotyping array.

PubMed

Unterseer, Sandra; Bauer, Eva; Haberer, Georg; Seidel, Michael; Knaak, Carsten; Ouzunova, Milena; Meitinger, Thomas; Strom, Tim M; Fries, Ruedi; Pausch, Hubert; Bertani, Christofer; Davassi, Alessandro; Mayer, Klaus Fx; Schön, Chris-Carolin

2014-09-29

High density genotyping data are indispensable for genomic analyses of complex traits in animal and crop species. Maize is one of the most important crop plants worldwide; however, a high density SNP genotyping array for analysis of its large and highly dynamic genome was not available so far. We developed a high density maize SNP array composed of 616,201 variants (SNPs and small indels). Initially, 57 M variants were discovered by sequencing 30 representative temperate maize lines and then stringently filtered for sequence quality scores and predicted conversion performance on the array resulting in the selection of 1.2 M polymorphic variants assayed on two screening arrays. To identify high-confidence variants, 285 DNA samples from a broad genetic diversity panel of worldwide maize lines including the samples used for sequencing, important founder lines for European maize breeding, hybrids, and proprietary samples with European, US, semi-tropical, and tropical origin were used for experimental validation. We selected 616 k variants according to their performance during validation, support of genotype calls through sequencing data, and physical distribution for further analysis and for the design of the commercially available Affymetrix® Axiom® Maize Genotyping Array. This array is composed of 609,442 SNPs and 6,759 indels. Among these are 116,224 variants in coding regions and 45,655 SNPs of the Illumina® MaizeSNP50 BeadChip for study comparison. In a subset of 45,974 variants, apart from the target SNP additional off-target variants are detected, which show only a minor bias towards intermediate allele frequencies. We performed principal coordinate and admixture analyses to determine the ability of the array to detect and resolve population structure and investigated the extent of LD within a worldwide validation panel. The high density Affymetrix® Axiom® Maize Genotyping Array is optimized for European and American temperate maize and was developed based on a diverse sample panel by applying stringent quality filter criteria to ensure its suitability for a broad range of applications. With 600 k variants it is the largest currently publicly available genotyping array in crop species.
UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

PubMed

Robasky, Kimberly; Bulyk, Martha L

2011-01-01

The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
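The k-mer and position weight matrix (PWM) representations mentioned in this record can be sketched in a few lines. The example below is only an illustration: the binding sites are invented toy data, the pseudocount and uniform background are assumptions, and none of it reproduces UniPROBE's own scoring or file formats.

```python
from itertools import product
from math import log2

ALPHABET = "ACGT"


def all_kmers(k):
    """Enumerate every nucleotide 'word' of length k (4**k in total)."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: log2((counts[base] / total) / background) for base in ALPHABET}
        )
    return pwm


def score(pwm, kmer):
    """Sum the per-position log-odds scores for one k-mer."""
    return sum(column[base] for column, base in zip(pwm, kmer))


if __name__ == "__main__":
    toy_sites = ["TGACA", "TGACA", "TGATA", "TGACG"]  # hypothetical sites
    pwm = build_pwm(toy_sites)
    ranked = sorted(all_kmers(5), key=lambda w: score(pwm, w), reverse=True)
    print("top-scoring 5-mers:", ranked[:3])
```

Ranking every k-mer by its PWM score gives a rough feel for how a compact matrix summarizes a full table of per-k-mer preferences, although measured k-mer data can capture dependencies between positions that a PWM cannot.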
The importance of proper bioinformatics analysis and clinical interpretation of tumor genomic profiling: a case study of undifferentiated sarcoma and a constitutional pathogenic BRCA2 mutation and an MLH1 variant of uncertain significance.

PubMed

Varga, Elizabeth; Chao, Elizabeth C; Yeager, Nicholas D

2015-09-01

Next-generation sequencing (NGS) technology is increasingly utilized to identify therapeutic targets for patients with malignancy. This technology also has the capability to reveal the presence of constitutional genetic alterations, which may have significant implications for patients and their family members. Here we present the case of a 23-year-old Caucasian patient with recurrent undifferentiated sarcoma who had NGS-based tumor analysis using an assay which simultaneously analyzed the entire coding sequence of 236 cancer-related genes (3769 exons) plus 47 introns from 19 genes often rearranged or altered in cancer. Pathogenic alterations were reported in tumor as the predicted protein alterations, BRCA2 "R645fs*15" and MLH1 "E694*". Because constitutional BRCA2 and MLH1 gene mutations are associated with Hereditary Breast Ovarian Cancer Syndrome (HBOCS) and Lynch syndrome respectively, sequence analysis of DNA isolated from peripheral blood was performed. The presence of the alterations, BRCA2 c.1929delG and MLH1 c.2080G>T, corresponding to the previously reported predicted protein alterations, was confirmed by Sanger sequencing in the constitutional DNA. An additional DNA finding was reported in this analysis, MLH1 c.2081A>C at the neighboring nucleotide. Further evaluation of the family revealed that all alterations were paternally inherited and the two MLH1 substitutions were in cis, more appropriately referred to as MLH1 c.2080_2081delGAinsTC, which is classified as a variant of uncertain significance. This case illustrates important considerations related to appropriate interpretation of NGS tumor results and follow-up of patients with potentially deleterious constitutional alterations.
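The step from two adjacent substitutions in cis (c.2080G>T and c.2081A>C) to a single deletion-insertion description can be sketched as a small string operation. The helper names below are hypothetical, the parser handles only simple coding-DNA substitutions, and the output follows the style used in this record (which lists the deleted bases); it is not a general nomenclature tool.

```python
import re

# Matches simple coding-DNA substitutions such as "c.2080G>T".
SUBSTITUTION = re.compile(r"^c\.(\d+)([ACGT])>([ACGT])$")


def parse_substitution(description):
    """Return (position, reference base, alternate base)."""
    match = SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def merge_cis_substitutions(first, second):
    """Combine two adjacent substitutions on the same allele into one delins."""
    (pos1, ref1, alt1), (pos2, ref2, alt2) = sorted(
        [parse_substitution(first), parse_substitution(second)]
    )
    if pos2 != pos1 + 1:
        raise ValueError("substitutions are not at neighboring nucleotides")
    return f"c.{pos1}_{pos2}del{ref1}{ref2}ins{alt1}{alt2}"


if __name__ == "__main__":
    print(merge_cis_substitutions("c.2080G>T", "c.2081A>C"))
```

The merge is only meaningful once phase is known, which is why the family evaluation described above matters: the same two calls in trans would remain two separate variants.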
SUGAR: graphical user interface-based data refiner for high-throughput DNA sequencing.

PubMed

Sato, Yukuto; Kojima, Kaname; Nariai, Naoki; Yamaguchi-Kabata, Yumi; Kawai, Yosuke; Takahashi, Mamoru; Mimori, Takahiro; Nagasaki, Masao

2014-08-08

Next-generation sequencers (NGSs) have become one of the main tools of current biology. To obtain useful insights from NGS data, it is essential to control low-quality portions of the data affected by technical errors such as air bubbles in sequencing fluidics. We developed SUGAR (subtile-based GUI-assisted refiner), software that can handle ultra-high-throughput data with a user-friendly graphical user interface (GUI) and interactive analysis capability. SUGAR generates high-resolution quality heatmaps of the flowcell, enabling users to find possible signals of technical errors during sequencing. The sequencing data generated from the error-affected regions of a flowcell can be selectively removed by automated analysis or GUI-assisted operations implemented in SUGAR. The automated data-cleaning function based on sequence read quality (Phred) scores was applied to public human whole-genome sequencing data, and we showed that the overall mapping quality was improved. The detailed data evaluation and cleaning enabled by SUGAR would reduce technical problems in sequence read mapping, improving subsequent variant analyses that require high-quality sequence data and mapping results. Therefore, the software will be especially useful for controlling the quality of variant calls for low-abundance cell populations, e.g., cancer cells, in samples affected by technical errors in sequencing procedures.
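Cleaning based on Phred scores rests on the relation Q = -10 log10(p), where p is the estimated base-call error probability. The sketch below is not SUGAR's algorithm, which works on flowcell subtiles; it assumes Phred+33 encoded quality strings and a hypothetical per-read threshold on the mean error probability.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def decode_phred33(quality_string):
    """Decode a Phred+33 (Sanger-style FASTQ) quality string to scores."""
    return [ord(char) - 33 for char in quality_string]


def mean_error_prob(quality_string):
    """Average per-base error probability across one read."""
    scores = decode_phred33(quality_string)
    return sum(phred_to_error_prob(q) for q in scores) / len(scores)


def passes_filter(quality_string, max_mean_error=0.01):
    """Keep a read when its mean error probability is at or below the threshold.

    The 0.01 default (roughly Q20 on average) is an illustrative assumption.
    """
    return mean_error_prob(quality_string) <= max_mean_error


if __name__ == "__main__":
    for qual in ("IIIIIIII", "II##II##", "!!!!!!!!"):
        print(qual, round(mean_error_prob(qual), 4), passes_filter(qual))
```

Averaging error probabilities rather than raw Q values keeps a few very poor bases from being masked by many good ones, which is one common reason to filter on this scale.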
Infections with multiple Cryptosporidium species and new genetic variants in young dairy calves on a farm located within a drinking water catchment area in New Zealand.

PubMed

Shrestha, Rima D; Grinberg, Alex; Dukkipati, Venkata S R; Pleydell, Eve J; Prattley, Deborah J; French, Nigel P

2014-05-28

Several Cryptosporidium species are known to infect cattle. However, the occurrence of mixed infections with more than one species and the impact of this phenomenon on animal and human health are poorly understood. Therefore, to detect the presence of mixed Cryptosporidium infections, 15 immunofluorescence-positive specimens obtained from 6-week-old calves' faeces (n=60) on one dairy farm were subjected to PCR-sequencing at multiple loci. DNA sequences of three Cryptosporidium species: C. parvum (15/15), C. bovis (3/15) and C. andersoni (1/15), and two new genetic variants were identified. There was evidence of mixed infections in five specimens. C. parvum, C. bovis and C. andersoni sequences were detected together in one specimen, C. parvum and C. bovis in two specimens, and C. parvum and C. parvum-like variants in the remaining two specimens. Sequencing of gp60 amplicons identified the IIaA19G4R1 (8/15) and IIaA18G3R1 (4/15) C. parvum subgenotypes. This study provides evidence of endemic mixed infections with the three main Cryptosporidium species of cattle and new genetic variants, in calves at the transition age of six weeks. The results add to the body of evidence describing Cryptosporidium isolates as genetically heterogeneous populations, and highlight the need for iterative genotyping to explore their genetic makeup. Copyright © 2014 Elsevier B.V. All rights reserved.
Whole genome sequencing and integrative genomic analysis approach on two 22q11.2 deletion syndrome family trios for genotype to phenotype correlations

PubMed Central

Chung, Jonathan H.; Cai, Jinlu; Suskin, Barrie G.; Zhang, Zhengdong; Coleman, Karlene

2015-01-01

The 22q11.2 deletion syndrome (22q11DS) affects 1:4000 live births and presents with highly variable phenotype expressivity. In this study, we developed an analytical approach utilizing whole genome sequencing and integrative analysis to discover genetic modifiers. Our pipeline combined available tools in order to prioritize rare, predicted deleterious, coding and non-coding single nucleotide variants (SNVs) and insertion/deletions (INDELs) from whole genome sequencing (WGS). We sequenced two unrelated probands with 22q11DS, with contrasting clinical findings, and their unaffected parents. Proband P1 had cognitive impairment, psychotic episodes, anxiety, and tetralogy of Fallot (TOF); while proband P2 had juvenile rheumatoid arthritis but no other major clinical findings. In P1, we identified common variants in COMT and PRODH on 22q11.2 as well as rare potentially deleterious DNA variants in other behavioral/neurocognitive genes. We also identified a de novo SNV in ADNP2 (NM_014913.3:c.2243G>C), encoding a neuroprotective protein that may be involved in behavioral disorders. In P2, we identified a novel non-synonymous SNV in ZFPM2 (NM_012082.3:c.1576C>T), a known causative gene for TOF, which may act as a protective variant downstream of TBX1, haploinsufficiency of which is responsible for congenital heart disease in individuals with 22q11DS. PMID:25981510
Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California

NASA Astrophysics Data System (ADS)

Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.

2016-02-01

Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, i.e. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.
Molecular characterization of allelic variants of (GATA)n microsatellite loci in parthenogenetic lizards Darevskia unisexualis (Lacertidae).

PubMed

Korchagin, V I; Badaeva, T N; Tokarskaya, O N; Martirosyan, I A; Darevsky, I S; Ryskov, A P

2007-05-01

Populations of parthenogenetic lizards of the genus Darevskia consist of genetically identical animals, and represent a unique model for studying the molecular mechanisms underlying the variability and evolution of hypervariable DNA repeats. As unisexual lineages, parthenogenetic lizards are characterized by some level of genetic diversity at microsatellite loci. We cloned and sequenced a number of (GATA)n microsatellite loci of Darevskia unisexualis. PCR products from these loci were also sequenced and the degree of intraspecific polymorphism was assessed. Among the five (GATA)n loci analysed, two (Du215 and Du281) were polymorphic. Cross-species analysis of Du215 and Du281 indicate that the priming sites at the D. unisexualis loci are conserved in the bisexual parental species, D. raddei and D. valentini. Sequencing the PCR products amplified from Du215 and Du281 and from monomorphic Du323 showed that allelic differences at the polymorphic loci are caused by microsatellite mutations and by point mutations in the flanking regions. The haplotypes identified among the allelic variants of Du281 and among its orthologues in the parental species provide new evidence of the cross-species origin of D. unisexualis. To our knowledge, these data are the first to characterize the nucleotide sequences of allelic variants at microsatellite loci within parthenogenetic vertebrate animals.
Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls

PubMed Central

Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.

2013-01-01

As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method – Tango’s statistic – to genomic sequence data. An advantage of Tango’s method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango’s statistic, which we call the “Kernel Distance” statistic, took approximately half as long to compute as the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff’s scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950
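To make the idea of a Tango-style clustering statistic concrete, the sketch below computes a quadratic form (r - p)' A (r - p), where r and p are the proportions of variant observations at each position in cases and controls, and A is a distance-based kernel. This is only an illustration of the general form: the exponential kernel, the `scale_bp` parameter, and the function name are assumptions, and the published "Kernel Distance" statistic may be defined differently.

```python
import math
from typing import Sequence


def kernel_distance_statistic(positions: Sequence[float],
                              case_counts: Sequence[int],
                              control_counts: Sequence[int],
                              scale_bp: float = 1000.0) -> float:
    """Tango-style quadratic form over variant positions.

    positions      -- genomic coordinate of each variant site
    case_counts    -- number of variant observations per site in cases
    control_counts -- number of variant observations per site in controls
    scale_bp       -- kernel bandwidth in base pairs (assumed value)
    """
    n_case = sum(case_counts)
    n_ctrl = sum(control_counts)
    if n_case == 0 or n_ctrl == 0:
        raise ValueError("both groups need at least one variant observation")

    # Difference between the case and control distributions over sites.
    diff = [c / n_case - k / n_ctrl for c, k in zip(case_counts, control_counts)]

    # Nearby sites get weights close to 1, distant sites weights close to 0,
    # so excesses at neighbouring positions reinforce each other.
    stat = 0.0
    for i, xi in enumerate(positions):
        for j, xj in enumerate(positions):
            weight = math.exp(-abs(xi - xj) / scale_bp)
            stat += diff[i] * weight * diff[j]
    return stat


if __name__ == "__main__":
    sites = [0, 10, 10000, 10010]
    clustered = kernel_distance_statistic(sites, [5, 5, 0, 0], [0, 0, 5, 5])
    scattered = kernel_distance_statistic(sites, [5, 0, 5, 0], [0, 5, 0, 5])
    print(clustered > scattered)
```

In practice a p-value would come from permuting case/control labels or from the scaled chi-square approximation mentioned in the abstract; neither is implemented here.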
Identification of Two Novel Mycobacterium avium Allelic Variants in Pig and Human Isolates from Brazil by PCR-Restriction Enzyme Analysis

PubMed Central

Leão, Sylvia Cardoso; Briones, Marcelo R. S.; Sircili, Marcelo Palma; Balian, Simone Carvalho; Mores, Nelson; Ferreira-Neto, José Soares

1999-01-01

Mycobacterium avium complex (MAC) is composed of environmental mycobacteria found widely in soil, water, and aerosols that can cause disease in animals and humans, especially disseminated infections in AIDS patients. MAC consists of two closely related species, M. avium and M. intracellulare, and may also include other, less-defined groups. The precise differentiation of MAC species is a fundamental step in epidemiological studies and for the evaluation of possible reservoirs for MAC infection in humans and animals. In this study, which included 111 pig and 26 clinical MAC isolates, two novel allelic M. avium PCR-restriction enzyme analysis (PRA) variants were identified, differing from the M. avium PRA prototype in the HaeIII digestion pattern. Mutations in HaeIII sites were confirmed by DNA sequencing. Identification of these isolates as M. avium was confirmed by PCR with DT1-DT6 and IS1245 primers, nucleic acid hybridization with the AccuProbe system, 16S ribosomal DNA sequencing, and biochemical tests. The characterization of M. avium PRA variants can be useful in the elucidation of factors involved in mycobacterial virulence and routes of infection and also has diagnostic significance, since they can be misidentified as M. simiae II and M. kansasii I if the PRA method is used in the clinical laboratory for identification of mycobacteria. PMID:10405407
Evolutionary Analyses of Entire Genomes Do Not Support the Association of mtDNA Mutations with Ras/MAPK Pathway Syndromes

PubMed Central

Cerezo, María; Balboa, Emilia; Heredia, Claudia; Castro-Feijóo, Lidia; Rica, Itxaso; Barreiro, Jesús; Eirís, Jesús; Cabanas, Paloma; Martínez-Soto, Isabel; Fernández-Toral, Joaquín; Castro-Gago, Manuel; Pombo, Manuel; Carracedo, Ángel; Barros, Francisco

2011-01-01

Background There are several known autosomal genes responsible for Ras/MAPK pathway syndromes, including Noonan syndrome (NS) and related disorders (such as LEOPARD, neurofibromatosis type 1), although mutations of these genes do not explain all cases. Due to the important role played by the mitochondrion in the energetic metabolism of cardiac muscle, it was recently proposed that variation in the mitochondrial DNA (mtDNA) genome could be a risk factor in the Noonan phenotype and in hypertrophic cardiomyopathy (HCM), which is a common clinical feature in Ras/MAPK pathway syndromes. In order to test these hypotheses, we sequenced entire mtDNA genomes in the largest series of patients suffering from Ras/MAPK pathway syndromes analyzed to date (n = 45), most of them classified as NS patients (n = 42). Methods/Principal Findings The results indicate that the observed mtDNA lineages were mostly of European ancestry, reproducing in a nutshell the expected haplogroup (hg) patterns of a typical Iberian dataset (including hgs H, T, J, and U). Three new branches of the mtDNA phylogeny (H1j1, U5b1e, and L2a5) are described for the first time, but none of these are likely to be related to NS or Ras/MAPK pathway syndromes when observed under an evolutionary perspective. Patterns of variation in tRNA and protein genes, as well as redundant, private and heteroplasmic variants, in the mtDNA genomes of patients were as expected when compared with the patterns inferred from a worldwide mtDNA phylogeny based on more than 8700 entire genomes. Moreover, most of the mtDNA variants found in patients had already been reported in healthy individuals and constitute common polymorphisms in human population groups. Conclusions/Significance As a whole, the observed mtDNA genome variation in the NS patients was difficult to reconcile with previous findings that indicated a pathogenic role of mtDNA variants in NS. PMID:21526175
De novo truncating variants in the AHDC1 gene encoding the AT-hook DNA-binding motif-containing protein 1 are associated with intellectual disability and developmental delay.

PubMed

Yang, Hui; Douglas, Ganka; Monaghan, Kristin G; Retterer, Kyle; Cho, Megan T; Escobar, Luis F; Tucker, Megan E; Stoler, Joan; Rodan, Lance H; Stein, Diane; Marks, Warren; Enns, Gregory M; Platt, Julia; Cox, Rachel; Wheeler, Patricia G; Crain, Carrie; Calhoun, Amy; Tryon, Rebecca; Richard, Gabriele; Vitazka, Patrik; Chung, Wendy K

2015-10-01

Whole-exome sequencing (WES) represents a significant breakthrough in clinical genetics, and identifies a genetic etiology in up to 30% of cases of intellectual disability (ID). Using WES, we identified seven unrelated patients with a similar clinical phenotype of severe intellectual disability or neurodevelopmental delay who were all heterozygous for de novo truncating variants in the AT-hook DNA-binding motif-containing protein 1 (AHDC1). The patients were all minimally verbal or nonverbal and had variable neurological problems including spastic quadriplegia, ataxia, nystagmus, seizures, autism, and self-injurious behaviors. Additional common clinical features include dysmorphic facial features and feeding difficulties associated with failure to thrive and short stature. The AHDC1 gene has only one coding exon, and the protein contains conserved regions including AT-hook motifs and a PDZ binding domain. We postulate that all seven variants detected in these patients result in a truncated protein missing critical functional domains, disrupting interactions with other proteins important for brain development. Our study demonstrates that truncating variants in AHDC1 are associated with ID and are primarily associated with a neurodevelopmental phenotype.
Engineered Cpf1 variants with altered PAM specificities.

PubMed

Gao, Linyi; Cox, David B T; Yan, Winston X; Manteiga, John C; Schneider, Martin W; Yamano, Takashi; Nishimasu, Hiroshi; Nureki, Osamu; Crosetto, Nicola; Zhang, Feng

2017-08-01

The RNA-guided endonuclease Cpf1 is a promising tool for genome editing in eukaryotic cells. However, the utility of the commonly used Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) and Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1) is limited by their requirement of a TTTV protospacer adjacent motif (PAM) in the DNA substrate. To address this limitation, we performed a structure-guided mutagenesis screen to increase the targeting range of Cpf1. We engineered two AsCpf1 variants carrying the mutations S542R/K607R and S542R/K548V/N552R, which recognize TYCV and TATV PAMs, respectively, with enhanced activities in vitro and in human cells. Genome-wide assessment of off-target activity using BLISS indicated that these variants retain high DNA-targeting specificity, which we further improved by introducing an additional non-PAM-interacting mutation. Introducing the identified PAM-interacting mutations at their corresponding positions in LbCpf1 similarly altered its PAM specificity. Together, these variants increase the targeting range of Cpf1 by approximately threefold in human coding sequences to one cleavage site per ∼11 bp.
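The PAM motifs named in this abstract (TTTV, TYCV, TATV) use IUPAC degenerate codes, where V stands for A, C, or G and Y for C or T. A small sketch of how such motifs can be located on both strands of a sequence is given below. The function names and the toy sequence are illustrative assumptions, and the sketch does not model guide selection or cleavage.

```python
from typing import List, Tuple

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(pam: str, site: str) -> bool:
    """True when every base of `site` is allowed by the degenerate `pam`."""
    return len(pam) == len(site) and all(b in IUPAC[p] for p, b in zip(pam, site))


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_pam_sites(seq: str, pam: str) -> List[Tuple[str, int]]:
    """Return (strand, start) pairs; `start` is the 0-based PAM offset
    measured 5'->3' along the strand on which the PAM was found."""
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - len(pam) + 1):
            if matches(pam, s[i:i + len(pam)]):
                hits.append((strand, i))
    return hits


if __name__ == "__main__":
    toy = "GGTTTACCTCCAGG"  # made-up sequence for illustration
    for pam in ("TTTV", "TYCV", "TATV"):
        print(pam, find_pam_sites(toy, pam))
```

Counting hits for several motifs over the same sequence gives a rough sense of how additional PAM specificities widen the set of addressable sites.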
Engineered Cpf1 variants with altered PAM specificities increase genome targeting range

PubMed Central

Gao, Linyi; Cox, David B.T.; Yan, Winston X.; Manteiga, John C.; Schneider, Martin W.; Yamano, Takashi; Nishimasu, Hiroshi; Nureki, Osamu; Crosetto, Nicola; Zhang, Feng

2017-01-01

The RNA-guided endonuclease Cpf1 is a promising tool for genome editing in eukaryotic cells. However, the utility of the commonly used Acidaminococcus sp. BV3L6 Cpf1 (AsCpf1) and Lachnospiraceae bacterium ND2006 Cpf1 (LbCpf1) is limited by their requirement of a TTTV protospacer adjacent motif (PAM) in the DNA substrate. To address this limitation, we performed a structure-guided mutagenesis screen to increase the targeting range of Cpf1. We engineered two AsCpf1 variants carrying the mutations S542R/K607R and S542R/K548V/N552R, which recognize TYCV and TATV PAMs, respectively, with enhanced activities in vitro and in human cells. Genome-wide assessment of off-target activity using the BLISS assay indicated that these variants retain high DNA targeting specificity, which we further improved by introducing an additional non-PAM-interacting mutation. Introducing the identified mutations at their corresponding positions in LbCpf1 similarly altered its PAM specificity. Together, these variants increase the targeting range of Cpf1 by approximately three-fold in human coding sequences to one cleavage site per ~11 bp. PMID:28581492

Adeno-Associated Virus Type 2 Wild-Type and Vector-Mediated Genomic Integration Profiles of Human Diploid Fibroblasts Analyzed by Third-Generation PacBio DNA Sequencing

PubMed Central

Hüser, Daniela; Gogol-Döring, Andreas; Chen, Wei

2014-01-01

ABSTRACT Genome-wide analysis of adeno-associated virus (AAV) type 2 integration in HeLa cells has shown that wild-type AAV integrates at numerous genomic sites, including AAVS1 on chromosome 19q13.42. Multiple GAGY/C repeats, resembling consensus AAV Rep-binding sites, are preferred, whereas rep-deficient AAV vectors (rAAV) regularly show a random integration profile. This study is the first to analyze wild-type AAV integration in diploid human fibroblasts. Applying high-throughput third-generation PacBio-based DNA sequencing, integration profiles of wild-type AAV and rAAV are compared side by side. Bioinformatic analysis reveals that both wild-type AAV and rAAV prefer open chromatin regions. Although genomic features of AAV integration largely reproduce previous findings, the pattern of integration hot spots differs from that previously described in HeLa cells. DNase-Seq data for human fibroblasts and for HeLa cells reveal variant chromatin accessibility at preferred AAV integration hot spots that correlates with variant hot spot preferences. DNase-Seq patterns of these sites in human tissues, including liver, muscle, heart, brain, skin, and embryonic stem cells, further underline variant chromatin accessibility. In summary, AAV integration is dependent on cell-type-specific, variant chromatin accessibility leading to random integration profiles for rAAV, whereas wild-type AAV integration sites cluster near GAGY/C repeats. IMPORTANCE Adeno-associated virus type 2 (AAV) is assumed to establish latency by chromosomal integration of its DNA. This is the first genome-wide analysis of wild-type AAV2 integration in diploid human cells and the first to compare wild-type to recombinant AAV vector integration side by side under identical experimental conditions. Major determinants of wild-type AAV integration represent open chromatin regions with accessible consensus AAV Rep-binding sites. The variant chromatin accessibility of different human tissues or cell types will have an impact on vector targeting that should be considered during gene therapy. PMID:25031342
Molecular Darwinism: The Contingency of Spontaneous Genetic Variation

PubMed Central

Arber, Werner

2011-01-01

The availability of spontaneously occurring genetic variants is an important driving force of biological evolution. Largely thanks to experimental investigations by microbial geneticists, we know today that several different molecular mechanisms contribute to the overall genetic variations. These mechanisms can be assigned to three natural strategies to generate genetic variants: 1) local sequence changes, 2) intragenomic reshuffling of DNA segments, and 3) acquisition of a segment of foreign DNA. In these processes, specific gene products are involved in cooperation with different nongenetic elements. Some genetic variations occur fully at random along the DNA filaments, others rather with a statistical reproducibility, although at many possible sites. We have to be aware that evolution in natural ecosystems is of higher complexity than under most laboratory conditions, not least in view of symbiotic associations and the occurrence of horizontal gene transfer. The encountered contingency of genetic variation can possibly best ensure a long-term persistence of life under steadily changing living conditions. PMID:21979160
Interactions between inner membrane proteins in donor and recipient cells limit conjugal DNA transfer.

PubMed

Marrero, Joeli; Waldor, Matthew K

2005-06-01

Conjugation enables horizontal transmission of DNA among bacteria, thereby facilitating the rapid spread of genes such as those conferring resistance to antibiotics. Cell-cell contact is required for conjugative DNA transfer but does not ensure its success. The presence of certain plasmids in potential recipient cells inhibits redundant transfer of these plasmids from competent donors despite contact between donor and recipient cells. Here, we used two closely related integrating conjugative elements (ICEs), SXT and R391, to identify genes that inhibit redundant conjugative transfer. Cells containing SXT exclude transfer of a second copy of SXT but not R391 and vice versa. The specific exclusion of SXT and R391 is dependent upon variants of TraG and Eex, ICE-encoded inner membrane proteins in donor and recipient cells, respectively. We identified short sequences within each variant that determine the exquisite specificity of self-recognition; these data suggest that direct interactions between TraG and Eex mediate exclusion.
Molecular Darwinism: the contingency of spontaneous genetic variation.

PubMed

Arber, Werner

2011-01-01

The availability of spontaneously occurring genetic variants is an important driving force of biological evolution. Largely thanks to experimental investigations by microbial geneticists, we know today that several different molecular mechanisms contribute to the overall genetic variations. These mechanisms can be assigned to three natural strategies to generate genetic variants: 1) local sequence changes, 2) intragenomic reshuffling of DNA segments, and 3) acquisition of a segment of foreign DNA. In these processes, specific gene products are involved in cooperation with different nongenetic elements. Some genetic variations occur fully at random along the DNA filaments, others rather with a statistical reproducibility, although at many possible sites. We have to be aware that evolution in natural ecosystems is of higher complexity than under most laboratory conditions, not least in view of symbiotic associations and the occurrence of horizontal gene transfer. The encountered contingency of genetic variation can possibly best ensure a long-term persistence of life under steadily changing living conditions.
Modeling of DNA local parameters predicts encrypted architectural motifs in Xenopus laevis ribosomal gene promoter.

PubMed

Roux-Rouquie, M; Marilley, M

2000-09-15

We have modeled local DNA sequence parameters to search for DNA architectural motifs involved in transcription regulation and promotion within the Xenopus laevis ribosomal gene promoter and the intergenic spacer (IGS) sequences. The IGS was found to be shaped into distinct topological domains. First, intrinsic bends split the IGS into domains of common but different helical features. Local parameters at inter-domain junctions exhibit a high variability with respect to intrinsic curvature, bendability and thermal stability. Secondly, the repeated sequence blocks of the IGS exhibit right-handed supercoiled structures which could be related to their enhancer properties. Thirdly, the gene promoter presents both inherent curvature and minor groove narrowing which may be viewed as motifs of a structural code for protein recognition and binding. Such pre-existing deformations could simply be remodeled during the binding of the transcription complex. Alternatively, these deformations could pre-shape the promoter in such a way that further remodeling is facilitated. Mutations shown to abolish promoter curvature as well as intrinsic minor groove narrowing, in a variant which maintained full transcriptional activity, bring circumstantial evidence for structurally-preorganized motifs in relation to transcription regulation and promotion. Using well documented X. laevis rDNA regulatory sequences we showed that computer modeling may be of invaluable assistance in assessing encrypted architectural motifs. The evidence of these DNA topological motifs with respect to the concept of structural code is discussed.
Single-cell paired-end genome sequencing reveals structural variation per cell cycle

PubMed Central

Voet, Thierry; Kumar, Parveen; Van Loo, Peter; Cooke, Susanna L.; Marshall, John; Lin, Meng-Lay; Zamani Esteki, Masoud; Van der Aa, Niels; Mateiu, Ligia; McBride, David J.; Bignell, Graham R.; McLaren, Stuart; Teague, Jon; Butler, Adam; Raine, Keiran; Stebbings, Lucy A.; Quail, Michael A.; D’Hooghe, Thomas; Moreau, Yves; Futreal, P. Andrew; Stratton, Michael R.; Vermeesch, Joris R.; Campbell, Peter J.

2013-01-01

The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis. PMID:23630320
Continuous in vitro evolution of bacteriophage RNA polymerase promoters

NASA Technical Reports Server (NTRS)

Breaker, R. R.; Banerji, A.; Joyce, G. F.

1994-01-01

Rapid in vitro evolution of bacteriophage T7, T3, and SP6 RNA polymerase promoters was achieved by a method that allows continuous enrichment of DNAs that contain functional promoter elements. This method exploits the ability of a special class of nucleic acid molecules to replicate continuously in the presence of both a reverse transcriptase and a DNA-dependent RNA polymerase. Replication involves the synthesis of both RNA and cDNA intermediates. The cDNA strand contains an embedded promoter sequence, which becomes converted to a functional double-stranded promoter element, leading to the production of RNA transcripts. Synthetic cDNAs, including those that contain randomized promoter sequences, can be used to initiate the amplification cycle. However, only those cDNAs that contain functional promoter sequences are able to produce RNA transcripts. Furthermore, each RNA transcript encodes the RNA polymerase promoter sequence that was responsible for initiation of its own transcription. Thus, the population of amplifying molecules quickly becomes enriched for those templates that encode functional promoters. Optimal promoter sequences for phage T7, T3, and SP6 RNA polymerase were identified after a 2-h amplification reaction, initiated in each case with a pool of synthetic cDNAs encoding greater than 10^10 promoter sequence variants.
The evolving genetic risk for sporadic ALS.

PubMed

Gibson, Summer B; Downie, Jonathan M; Tsetsou, Spyridoula; Feusier, Julie E; Figueroa, Karla P; Bromberg, Mark B; Jorde, Lynn B; Pulst, Stefan M

2017-07-18

To estimate the genetic risk conferred by known amyotrophic lateral sclerosis (ALS)-associated genes to the pathogenesis of sporadic ALS (SALS) using variant allele frequencies combined with predicted variant pathogenicity. Whole exome sequencing and repeat expansion PCR of C9orf72 and ATXN2 were performed on 87 patients of European ancestry with SALS seen at the University of Utah. DNA variants that change the protein coding sequence of 31 ALS-associated genes were annotated to determine which were rare and deleterious as predicted by MetaSVM. The percentage of patients with SALS with a rare and deleterious variant or repeat expansion in an ALS-associated gene was calculated. An odds ratio analysis was performed comparing the burden of ALS-associated genes in patients with SALS vs 324 normal controls. Nineteen rare nonsynonymous variants in an ALS-associated gene, 2 of which were found in 2 different individuals, were identified in 21 patients with SALS. Further, 5 deleterious C9orf72 and 2 ATXN2 repeat expansions were identified. A total of 17.2% of patients with SALS had a rare and deleterious variant or repeat expansion in an ALS-associated gene. The genetic burden of ALS-associated genes in patients with SALS as predicted by MetaSVM was significantly higher than in normal controls. Previous analyses have identified SALS-predisposing variants only in terms of their rarity in normal control populations. By incorporating variant pathogenicity as well as variant frequency, we demonstrated that the genetic risk contributed by these genes for SALS is substantially lower than previous estimates. © 2017 American Academy of Neurology.
Unusual presentation of hepatitis B serological markers in an Amerindian community of Venezuela with a majority of occult cases

PubMed Central

2011-01-01

Background Occult hepatitis B infection (OBI) is characterized by the presence of hepatitis B virus (HBV) DNA in the absence of HBsAg in the serum of patients. The aim of this study was to characterize HBV infection among a Piaroa community, an Amerindian group which exhibits significant evidence of exposure to HBV but relatively low presence of HBsAg, and to explore the presence of OBI in this population. Results Of 150 sera, with 17% anti-HBc and 1.3% HBsAg prevalence, 70 were tested for the presence of HBV DNA. Of these, 25 (36%) were found positive for HBV DNA by PCR in the core region. Two of these 25 sera were HBsAg positive, indicating an overt infection. Of the remaining 68 sera tested, 23 exhibited OBI: 13 of 25 anti-HBc-positive sera (52%) and 10 of 43 anti-HBc-negative sera (23%) were HBV DNA positive, a statistically significant difference (p = 0.03). Viral DNA and HBsAg were present intermittently in follow-up sera of 13 individuals. Sequence analysis in the core region of the amplified DNA products showed that all the strains belonged to HBV genotype F3. The OBI isolates displayed 96-100% nucleotide identity between them. One isolate exhibited the co-circulation of a wild-type variant with a variant with a premature stop codon in the core protein, and a variant exhibiting a deletion of 28 amino acids. Conclusions The frequency of OBI found in this Amerindian group warrants further studies in other communities exhibiting different degrees of HBV exposure. PMID:22152023
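The comparison of OBI frequency between anti-HBc-positive sera (13 of 25) and anti-HBc-negative sera (10 of 43) is a 2x2 contingency problem. The abstract does not say which test produced p = 0.03, so the sketch below assumes a Yates-corrected chi-square test purely for illustration. Under that assumption the p-value comes out close to the reported figure. The function name is hypothetical.

```python
import math
from typing import Tuple


def yates_chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Yates-corrected chi-square for the table [[a, b], [c, d]].

    Returns (chi2, p) with p taken from the chi-square distribution
    with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df equals erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


if __name__ == "__main__":
    # Rows: anti-HBc positive / negative; columns: HBV DNA positive / negative.
    chi2, p = yates_chi_square_2x2(13, 12, 10, 33)
    print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

With counts this small, an exact test would be a reasonable alternative; the choice here is only meant to show how the reported proportions map onto a test statistic.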
Characterization of mussel H2A.Z.2: a new H2A.Z variant preferentially expressed in germinal tissues from Mytilus.

PubMed

Rivera-Casas, Ciro; González-Romero, Rodrigo; Vizoso-Vazquez, Ángel; Cheema, Manjinder S; Cerdán, M Esperanza; Méndez, Josefina; Ausió, Juan; Eirin-Lopez, Jose M

2016-10-01

Histones are the fundamental constituents of the eukaryotic chromatin, facilitating the physical organization of DNA in chromosomes and participating in the regulation of its metabolism. The H2A family displays the largest number of variants among core histones, including the renowned H2A.X, macroH2A, H2A.B (Bbd), and H2A.Z. This latter variant is especially interesting because of its regulatory role and its differentiation into 2 functionally divergent variants (H2A.Z.1 and H2A.Z.2), further specializing the structure and function of vertebrate chromatin. In the present work we describe, for the first time, the presence of a second H2A.Z variant (H2A.Z.2) in the genome of a non-vertebrate animal, the mussel Mytilus. The molecular and evolutionary characterization of mussel H2A.Z.1 and H2A.Z.2 histones is consistent with their functional specialization, supported on sequence divergence at promoter and coding regions as well as on varying gene expression patterns. More precisely, the expression of H2A.Z.2 transcripts in gonadal tissue and its potential upregulation in response to genotoxic stress might be mirroring the specialization of this variant in DNA repair. Overall, the findings presented in this work complement recent reports describing the widespread presence of other histone variants across eukaryotes, supporting an ancestral origin and conserved role for histone variants in chromatin.
Spectrum of mutations in leiomyosarcomas identified by clinical targeted next-generation sequencing.

PubMed

Lee, Paul J; Yoo, Naomi S; Hagemann, Ian S; Pfeifer, John D; Cottrell, Catherine E; Abel, Haley J; Duncavage, Eric J

2017-02-01

Recurrent genomic mutations in uterine and non-uterine leiomyosarcomas have not been well established. Using a next generation sequencing (NGS) panel of common cancer-associated genes, 25 leiomyosarcomas arising from multiple sites were examined to explore genetic alterations, including single nucleotide variants (SNV), small insertions/deletions (indels), and copy number alterations (CNA). Sequencing showed 86 non-synonymous, coding region somatic variants within 151 gene targets in 21 cases, with a mean of 4.1 variants per case; 4 cases had no putative mutations in the panel of genes assayed. The most frequently altered genes were TP53 (36%), ATM and ATRX (16%), and EGFR and RB1 (12%). CNA were identified in 85% of cases, with the most frequent copy number losses observed in chromosomes 10 and 13 including PTEN and RB1; the most frequent gains were seen in chromosomes 7 and 17. Our data show that deletions in canonical cancer-related genes are common in leiomyosarcomas. Further, the spectrum of gene mutations observed shows that defects in DNA repair and chromosomal maintenance are central to the biology of leiomyosarcomas, and that activating mutations observed in other common cancer types are rare in leiomyosarcomas. Copyright © 2017 Elsevier Inc. All rights reserved.
Analysis of the neuroligin 4Y gene in patients with autism.

PubMed

Yan, Jin; Feng, Jinong; Schroer, Richard; Li, Wenyan; Skinner, Cindy; Schwartz, Charles E; Cook, Edwin H; Sommer, Steve S

2008-08-01

Frameshift and missense mutations in the X-linked neuroligin 4 (NLGN4, MIM# 300427) and neuroligin 3 (NLGN3, MIM# 300336) genes have been identified in patients with autism, Asperger syndrome and mental retardation. We hypothesize that sequence variants in NLGN4Y are associated with autism or mental retardation. The coding sequences and splice junctions of the NLGN4Y gene were analyzed in 335 male samples (290 with autism and 45 with mental retardation). A total of 1.1 Mb of genomic DNA was sequenced. One missense variant, p.I679V, was identified in a patient with autism, as well as his father with learning disabilities. The I679 residue is highly conserved in three members of the neuroligin family. The absence of p.I679V in 2986 control Y chromosomes and the high similarity of NLGN4 and NLGN4Y are consistent with the hypothesis that p.I679V contributes to the etiology of autism. The presence of only one structural variant in our population of 335 males with autism/mental retardation, the unavailability of significant family cosegregation and an absence of functional assays are, however, important limitations of this study.
Cardiovascular genetics: technological advancements and applicability for dilated cardiomyopathy.

PubMed

Kummeling, G J M; Baas, A F; Harakalova, M; van der Smagt, J J; Asselbergs, F W

2015-07-01

Genetics plays an important role in the pathophysiology of cardiovascular diseases, and is increasingly being integrated into clinical practice. Since 2008, both the capacity and the cost-efficiency of mutation screening of DNA have increased dramatically owing to the technological advances of next-generation sequencing. Hence, the discovery rate of genetic defects in cardiovascular genetics has grown rapidly and the financial threshold for gene diagnostics has been lowered, making large-scale DNA sequencing broadly accessible. In this review, genetic variants, mutations and inheritance models are briefly introduced, after which an overview is provided of current clinical and technological applications in gene diagnostics and research for cardiovascular disease and, in particular, dilated cardiomyopathy. Finally, a reflection on future perspectives in cardiogenetics is given.
HPV frequency in penile carcinoma of Mexican patients: important contribution of HPV16 European variant.

PubMed

López-Romero, Ricardo; Iglesias-Chiesa, Candela; Alatorre, Brenda; Vázquez, Karla; Piña-Sánchez, Patricia; Alvarado, Isabel; Lazos, Minerva; Peralta, Raúl; González-Yebra, Beatriz; Romero, Anae; Salcedo, Mauricio

2013-01-01

A role for human papillomavirus (HPV) infection in penile carcinoma (PeC) has been reported, and about half of PeC cases are associated with HPV16 and 18. We used a PCR-based strategy with HPV general primers to analyze 86 paraffin-embedded penile carcinoma tissues. Clinical data, histological subtype, growth pattern, and degree of differentiation were also collected. The amplified fragments were then sequenced to confirm the HPV type and to identify HPV16/18 variants. DNA samples were also subjected to relative real-time PCR for hTERC gene copy number. Global HPV frequency was 77.9%. Relative contributions were HPV16 (85%), 31 (4.4%), 11 (4.4%), and 58, 33, 18, and 59 (1.4% each). Sequence analysis of HPV16 identified European variants and Asian-American (AAb-c) variants in 92% and 8% of the samples, respectively. Furthermore, hTERC gene amplification was observed in only 17% of the cases. Our results suggest that some members of the HPV A9 group (represented by HPV16, 58, and 31) are the most frequent among the PeC patients studied, with an important contribution from the HPV16 European variant. The hTERC gene amplification could be poorly related to penile epithelial tissue.
HPV frequency in penile carcinoma of Mexican patients: important contribution of HPV16 European variant

PubMed Central

López-Romero, Ricardo; Iglesias-Chiesa, Candela; Alatorre, Brenda; Vázquez, Karla; Piña-Sánchez, Patricia; Alvarado, Isabel; Lazos, Minerva; Peralta, Raúl; González-Yebra, Beatriz; Romero, AnaE; Salcedo, Mauricio

2013-01-01

A role for human papillomavirus (HPV) infection in penile carcinoma (PeC) has been reported, and about half of PeC cases are associated with HPV16 and 18. We used a PCR-based strategy with HPV general primers to analyze 86 paraffin-embedded penile carcinoma tissues. Clinical data, histological subtype, growth pattern, and degree of differentiation were also collected. The amplified fragments were then sequenced to confirm the HPV type and to identify HPV16/18 variants. DNA samples were also subjected to relative real-time PCR for hTERC gene copy number. Global HPV frequency was 77.9%. Relative contributions were HPV16 (85%), 31 (4.4%), 11 (4.4%), and 58, 33, 18, and 59 (1.4% each). Sequence analysis of HPV16 identified European variants and Asian-American (AAb-c) variants in 92% and 8% of the samples, respectively. Furthermore, hTERC gene amplification was observed in only 17% of the cases. Our results suggest that some members of the HPV A9 group (represented by HPV16, 58, and 31) are the most frequent among the PeC patients studied, with an important contribution from the HPV16 European variant. The hTERC gene amplification could be poorly related to penile epithelial tissue. PMID:23826423
Comprehensive genetic testing for female and male infertility using next-generation sequencing.

PubMed

Patel, Bonny; Parets, Sasha; Akana, Matthew; Kellogg, Gregory; Jansen, Michael; Chang, Chihyu; Cai, Ying; Fox, Rebecca; Niknazar, Mohammad; Shraga, Roman; Hunter, Colby; Pollock, Andrew; Wisotzkey, Robert; Jaremko, Malgorzata; Bisignano, Alex; Puig, Oscar

2018-05-19

The objective was to develop a comprehensive genetic test for female and male infertility in support of medical decisions during assisted reproductive technology (ART) protocols. We developed a next-generation sequencing (NGS) gene panel consisting of 87 genes including promoters, 5' and 3' untranslated regions, exons, and selected introns. In addition, sex chromosome aneuploidies and Y chromosome microdeletions were analyzed concomitantly using the same panel. The NGS panel was analytically validated by retrospective analysis of 118 genomic DNA samples with known variants in loci representative of female and male infertility. Our results showed analytical accuracy of > 99%, with > 98% sensitivity for single-nucleotide variants (SNVs) and > 91% sensitivity for insertions/deletions (indels). Clinical sensitivity was assessed with samples containing variants representative of male and female infertility, and it was 100% for SNVs/indels, CFTR IVS8-5T variants, sex chromosome aneuploidies, and copy number variants (CNVs) and > 93% for Y chromosome microdeletions. Cost analysis shows potential savings when comparing this single NGS assay with the standard approach, which includes multiple assays. A single, comprehensive NGS panel can simplify the ordering process for healthcare providers, reduce turnaround time, and lower the overall cost of testing for genetic assessment of infertility in females and males, while maintaining accuracy.
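The validation figures above rest on the standard definitions of sensitivity and accuracy. The following is a minimal sketch of those definitions only; the counts are hypothetical placeholders chosen for illustration and are not taken from the study, and the function names are likewise assumptions.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of known variants that the assay detects."""
    return true_pos / (true_pos + false_neg)


def accuracy(true_pos, true_neg, false_pos, false_neg):
    """Fraction of all calls (variant or reference) that are correct."""
    total = true_pos + true_neg + false_pos + false_neg
    return (true_pos + true_neg) / total


# Hypothetical counts for illustration only (NOT the study's data).
tp, fn, tn, fp = 92, 8, 895, 5

indel_sensitivity = sensitivity(tp, fn)
overall_accuracy = accuracy(tp, tn, fp, fn)
print(f"sensitivity = {indel_sensitivity:.3f}")
print(f"accuracy    = {overall_accuracy:.3f}")
```

Because accuracy counts correctly called reference positions as well, it can stay high even when sensitivity for a harder variant class such as indels is lower, which is one reason the two figures are reported separately.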
Sherpas share genetic variations with Tibetans for high-altitude adaptation.

PubMed

Bhandari, Sushil; Zhang, Xiaoming; Cui, Chaoying; Yangla; Liu, Lan; Ouzhuluobu; Baimakangzhuo; Gonggalanzi; Bai, Caijuan; Bianba; Peng, Yi; Zhang, Hui; Xiang, Kun; Shi, Hong; Liu, Shiming; Gengdeng; Wu, Tianyi; Qi, Xuebin; Su, Bing

2017-01-01

Sherpas, a highlander population living in the Khumbu region of Nepal, are well known for their superior climbing ability in the Himalayas. However, the genetic basis of their adaptation to high-altitude environments remains elusive. We collected DNA samples of 582 Sherpas from Nepal and the Tibetan Autonomous Region of China, and we measured their hemoglobin levels and degrees of blood oxygen saturation. We genotyped 29 EPAS1 SNPs, two EGLN1 SNPs and the TED polymorphism (3.4 kb deletion) in Sherpas. We also performed genetic association analysis between these sequence variants and the phenotypic data. We found similar allele frequencies for the 32 tested variants of these genes in Sherpas and Tibetans. Sherpa individuals carrying the derived alleles of EPAS1 (rs113305133, rs116611511 and rs12467821), EGLN1 (rs186996510 and rs12097901) and TED have lower hemoglobin levels than carriers of the wild-type alleles. Most of the EPAS1 variants showing significant association with hemoglobin levels in Tibetans were replicated in Sherpas. The shared sequence variants and hemoglobin trait between Sherpas and Tibetans indicate a shared genetic basis for high-altitude adaptation, consistent with the proposal that Sherpas are in fact a population recently derived from Tibetans and that they inherited adaptive variants for high-altitude adaptation from their Tibetan ancestors.
A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor.

PubMed

Gadd, Samantha; Huff, Vicki; Walz, Amy L; Ooms, Ariadne H A G; Armstrong, Amy E; Gerhard, Daniela S; Smith, Malcolm A; Auvil, Jaime M Guidry; Meerzaman, Daoud; Chen, Qing-Rong; Hsu, Chih Hao; Yan, Chunhua; Nguyen, Cu; Hu, Ying; Hermida, Leandro C; Davidsen, Tanja; Gesuwan, Patee; Ma, Yussanne; Zong, Zusheng; Mungall, Andrew J; Moore, Richard A; Marra, Marco A; Dome, Jeffrey S; Mullighan, Charles G; Ma, Jing; Wheeler, David A; Hampton, Oliver A; Ross, Nicole; Gastier-Foster, Julie M; Arold, Stefan T; Perlman, Elizabeth J

2017-10-01

We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A. DNA copy number changes resulted in recurrent 1q gain, MYCN amplification, LIN28B gain, and MIRLET7A loss. Unexpected germline variants involved PALB2 and CHEK2. Integrated analyses support two major classes of genetic changes that preserve the progenitor state and/or interrupt normal development.
Whole Transcriptome Sequencing Enables Discovery and Analysis of Viruses in Archived Primary Central Nervous System Lymphomas

PubMed Central

DeBoever, Christopher; Reid, Erin G.; Smith, Erin N.; Wang, Xiaoyun; Dumaop, Wilmar; Harismendy, Olivier; Carson, Dennis; Richman, Douglas; Masliah, Eliezer; Frazer, Kelly A.

2013-01-01

Primary central nervous system lymphomas (PCNSL) have a dramatically increased prevalence among persons living with AIDS and are known to be associated with Epstein-Barr virus (EBV) infection. Previous work suggests that in some cases, co-infection with other viruses may be important for PCNSL pathogenesis. Viral transcription in tumor samples can be measured using next generation transcriptome sequencing. We demonstrate the ability of transcriptome sequencing to identify viruses, characterize viral expression, and identify viral variants by sequencing four archived AIDS-related PCNSL tissue samples and analyzing raw sequencing reads. EBV was detected in all four PCNSL samples, and cytomegalovirus (CMV), JC polyomavirus (JCV), and HIV were also discovered, consistent with clinical diagnoses. CMV was found to express three long non-coding RNAs recently reported as expressed during active infection. Single nucleotide variants were observed in each of the viruses detected, and three indels were found in CMV. No viruses were found in several control tumor types, including 32 diffuse large B-cell lymphoma samples. This study demonstrates the ability of next generation transcriptome sequencing to accurately identify viruses, including DNA viruses, in solid human cancer tissue samples. PMID:24023918
Analysis of Sequence Data Under Multivariate Trait-Dependent Sampling.

PubMed

Tao, Ran; Zeng, Donglin; Franceschini, Nora; North, Kari E; Boerwinkle, Eric; Lin, Dan-Yu

2015-06-01

High-throughput DNA sequencing allows for the genotyping of common and rare variants for genetic association studies. At the present time and for the foreseeable future, it is not economically feasible to sequence all individuals in a large cohort. A cost-effective strategy is to sequence those individuals with extreme values of a quantitative trait. We consider the design under which the sampling depends on multiple quantitative traits. Under such trait-dependent sampling, standard linear regression analysis can result in bias of parameter estimation, inflation of type I error, and loss of power. We construct a likelihood function that properly reflects the sampling mechanism and utilizes all available data. We implement a computationally efficient EM algorithm and establish the theoretical properties of the resulting maximum likelihood estimators. Our methods can be used to perform separate inference on each trait or simultaneous inference on multiple traits. We pay special attention to gene-level association tests for rare variants. We demonstrate the superiority of the proposed methods over standard linear regression through extensive simulation studies. We provide applications to the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study and the National Heart, Lung, and Blood Institute Exome Sequencing Project.
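The bias described above can be seen in a small simulation. The sketch below is a simplified single-trait illustration: it simulates a cohort, keeps only individuals in the two tails of the trait, and compares the ordinary least-squares slope in the full cohort with the slope in the selected sample. All parameter values and function names are assumptions chosen for illustration; it does not reproduce the authors' multivariate design, likelihood, or EM algorithm.

```python
import random


def simulate_cohort(n, maf, beta, seed):
    """Return (genotype, trait) pairs: additive genotype 0/1/2, unit-variance noise."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        genotype = int(rng.random() < maf) + int(rng.random() < maf)
        trait = beta * genotype + rng.gauss(0.0, 1.0)
        cohort.append((genotype, trait))
    return cohort


def ols_slope(pairs):
    """Least-squares slope of trait on genotype."""
    n = len(pairs)
    mean_g = sum(g for g, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in pairs)
    var = sum((g - mean_g) ** 2 for g, _ in pairs)
    return cov / var


def extreme_sample(pairs, tail_fraction):
    """Keep the lowest and highest tail_fraction of individuals by trait value."""
    ranked = sorted(pairs, key=lambda pair: pair[1])
    k = int(len(ranked) * tail_fraction)
    return ranked[:k] + ranked[-k:]


# Illustrative settings (assumptions): true effect 0.3, allele frequency 0.3.
cohort = simulate_cohort(n=20000, maf=0.3, beta=0.3, seed=1)
full_slope = ols_slope(cohort)
tails_slope = ols_slope(extreme_sample(cohort, tail_fraction=0.10))

print(f"full cohort slope:   {full_slope:.3f}")
print(f"extreme-tails slope: {tails_slope:.3f}")
```

In this setup the full-cohort slope stays close to the simulated effect, while the slope estimated from the selected tails is substantially inflated, which is the kind of distortion a likelihood that models the sampling mechanism is meant to correct.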
