Science.gov

Sample records for acid sequence identified

  1. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  2. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  3. Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

    PubMed Central

    Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
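
    The reciprocal pairing idea behind this record can be sketched in a few lines. This is a simplified illustration, not the published RSD implementation: it assumes pairwise evolutionary distances have already been estimated, and the gene names and distance values are hypothetical. A pair is kept only when each gene is the other's closest match across the two genomes.

```python
# Simplified sketch of reciprocal-smallest-distance ortholog pairing.
# Assumption: distances between genes of genome A and genome B are already
# computed; the gene names and values below are hypothetical toy data.

def reciprocal_smallest_distance(dist):
    """dist maps (gene_a, gene_b) -> evolutionary distance.

    Returns a sorted list of (gene_a, gene_b, distance) for pairs in which
    each gene is the other's nearest neighbour in the opposite genome.
    """
    best_for_a = {}  # gene_a -> (closest gene_b, distance)
    best_for_b = {}  # gene_b -> (closest gene_a, distance)
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    pairs = []
    for a, (b, d) in best_for_a.items():
        if best_for_b[b][0] == a:  # the match must hold in both directions
            pairs.append((a, b, d))
    return sorted(pairs)

toy = {
    ("chineseA1", "indianB1"): 0.02,
    ("chineseA1", "indianB2"): 0.40,
    ("chineseA2", "indianB1"): 0.35,
    ("chineseA2", "indianB2"): 0.10,
    ("chineseA3", "indianB2"): 0.25,
}
orthologs = reciprocal_smallest_distance(toy)
# Rank by divergence, as the abstract does when shortlisting candidates.
most_divergent = max(orthologs, key=lambda p: p[2])
```

    In this toy input, chineseA3 is dropped because its closest match (indianB2) is closer to chineseA2, so the relationship is not reciprocal.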

  4. Evolutionary distance of amino acid sequence orthologs across macaque subspecies: identifying candidate genes for SIV resistance in Chinese rhesus macaques.

    PubMed

    Ross, Cody T; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674

  5. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
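
    The core inference in this abstract, reading a sequence from the temporal order of base additions, can be illustrated with a toy example. The event format (timestamp, base label) is an assumption made for illustration only; the patent does not specify a data format. The sketch assumes DNA bases and that every incorporation event was detected.

```python
# Toy illustration: deduce the template from time-ordered incorporation events.
# Assumption: each event is (time_in_seconds, base_added_to_growing_strand).

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_sequence(events):
    """Return (growing_strand, template), both written 5'->3'."""
    ordered = sorted(events, key=lambda e: e[0])  # temporal order of additions
    growing = "".join(base for _, base in ordered)
    # The strands pair antiparallel, so the template read 5'->3' is the
    # reverse complement of the growing strand.
    template = "".join(COMPLEMENT[b] for b in reversed(growing))
    return growing, template

events = [(0.8, "T"), (0.1, "A"), (1.5, "G"), (2.2, "C")]
growing, template = read_sequence(events)
```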

  6. Incorporating substrate sequence motifs and spatial amino acid composition to identify kinase-specific phosphorylation sites on protein three-dimensional structures

    PubMed Central

    2013-01-01

    Background: Protein phosphorylation catalyzed by kinases plays crucial regulatory roles in cellular processes. High-throughput mass spectrometry-based experiments have created a need to annotate the catalytic kinases for in vivo phosphorylation sites. Thus, a variety of computational methods have been developed for performing large-scale prediction of kinase-specific phosphorylation sites. However, most of the proposed methods rely solely on the local amino acid sequences surrounding the phosphorylation sites. An increasing number of three-dimensional structures makes it possible to physically investigate the structural environment of phosphorylation sites. Results: In this work, all of the experimental phosphorylation sites are mapped to the protein entries of the Protein Data Bank by sequence identity. This resulted in a total of 4508 phosphorylation sites with protein three-dimensional (3D) structures. To identify phosphorylation sites on protein 3D structures, this work combines support vector machines (SVMs) with the information of linear motifs and spatial amino acid composition, which is determined for each kinase group by calculating the relative frequencies of the 20 amino acid types within a specific radial distance from the central phosphorylated amino acid residue. In cross-validation, most of the kinase-specific models trained with structural information outperform the models considering only sequence information. Furthermore, an independent testing set, which is not included in the training set, demonstrates that the proposed method could provide performance comparable to other popular tools. Conclusion: The proposed method is shown to be capable of predicting kinase-specific phosphorylation sites on 3D structures and has been implemented as a web server which is freely accessible at http://csb.cse.yzu.edu.tw/PhosK3D/. Due to the difficulty of identifying the kinase-specific phosphorylation
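
    The spatial amino acid composition feature described in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the PhosK3D code: residues are represented by a single hypothetical coordinate each, the central residue is excluded from the counts, and the 10-unit radius is arbitrary.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def spatial_composition(residues, center_index, radius):
    """Relative frequency of each amino acid type near a central residue.

    residues: list of (one_letter_code, (x, y, z)); one point per residue
    is an assumption of this sketch. Returns a dict with 20 entries.
    """
    center = residues[center_index][1]
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for i, (aa, xyz) in enumerate(residues):
        if i == center_index or aa not in counts:
            continue  # skip the phosphorylated residue and non-standard codes
        if math.dist(center, xyz) <= radius:
            counts[aa] += 1
            total += 1
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: n / total for aa, n in counts.items()}

# Hypothetical coordinates: a central serine with three neighbours in range.
toy = [
    ("S", (0.0, 0.0, 0.0)),
    ("R", (3.0, 0.0, 0.0)),
    ("R", (0.0, 4.0, 0.0)),
    ("E", (0.0, 0.0, 6.0)),
    ("K", (20.0, 0.0, 0.0)),  # outside the radius
]
features = spatial_composition(toy, center_index=0, radius=10.0)
```

    The resulting 20-value vector is the kind of input that could be passed to a classifier alongside sequence-motif features.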

  7. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  8. De novo Sequencing and Transcriptome Analysis of Pinellia ternata Identify the Candidate Genes Involved in the Biosynthesis of Benzoic Acid and Ephedrine

    PubMed Central

    Zhang, Guang-hui; Jiang, Ni-hao; Song, Wan-ling; Ma, Chun-hua; Yang, Sheng-chao; Chen, Jun-wen

    2016-01-01

    Background: The medicinal herb, Pinellia ternata, is purported to be an anti-emetic with analgesic and sedative effects. Alkaloids are the main biologically active compounds in P. ternata, especially ephedrine, a phenylpropylamino alkaloid specifically produced by Ephedra and Catha edulis. However, how ephedrine is synthesized in plants is uncertain. Only the phenylalanine ammonia lyase (PAL) and relevant genes in this pathway have been characterized. Genomic information for P. ternata is also unavailable. Results: We analyzed the transcriptome of the tuber of P. ternata with the Illumina HiSeq™ 2000 sequencing platform. A total of 66,813,052 high-quality reads were generated, and these reads were assembled de novo into 89,068 unigenes. Most known genes involved in benzoic acid biosynthesis were identified in the unigene dataset of P. ternata, and the expression patterns of some ephedrine biosynthesis-related genes were analyzed by reverse transcription quantitative real-time PCR (RT-qPCR). Also, 14,468 simple sequence repeats (SSRs) were identified from 12,000 unigenes. Twenty primer pairs for SSRs were randomly selected to validate their amplification. Conclusion: RNA-seq data were used for the first time to provide comprehensive gene information on P. ternata at the transcriptional level. These data will advance molecular genetics in this valuable medicinal plant. PMID:27579029
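
    As an illustration of what identifying simple sequence repeats involves, the sketch below finds perfect tandem repeats with a regular expression. The abstract does not say which tool or thresholds were used, so the motif lengths (2 to 6 nt) and the minimum of five repeats are assumptions, and the test sequence is made up.

```python
import re

def find_ssrs(seq, min_repeats=5, max_motif=6):
    """Return sorted (start, motif, repeat_count) for perfect tandem repeats.

    Thresholds are illustrative assumptions, not those of the study.
    """
    hits = []
    for k in range(2, max_motif + 1):
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT"),
            # since those are already reported at the shorter motif length.
            if any(motif == motif[:j] * (k // j)
                   for j in range(1, k) if k % j == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

toy_unigene = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "T"
ssrs = find_ssrs(toy_unigene)
```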

  9. RNA sequencing identifies upregulated kyphoscoliosis peptidase and phosphatidic acid signaling pathways in muscle hypertrophy generated by transgenic expression of myostatin propeptide.

    PubMed

    Miao, Yuanxin; Yang, Jinzeng; Xu, Zhong; Jing, Lu; Zhao, Shuhong; Li, Xinyun

    2015-01-01

    Myostatin (MSTN), a member of the transforming growth factor-β superfamily, plays a crucial negative role in muscle growth. MSTN mutations or inhibitions can dramatically increase muscle mass in most mammal species. Previously, we generated a transgenic mouse model of muscle hypertrophy via the transgenic expression of the MSTN N-terminal propeptide cDNA under the control of the skeletal muscle-specific MLC1 promoter. Here, we compare the mRNA profiles between transgenic mice and wild-type littermate controls with a high-throughput RNA sequencing method. The results show that 132 genes were significantly differentially expressed between transgenic mice and wild-type control mice; 97 of these genes were up-regulated, and 35 genes were down-regulated in the skeletal muscle. Several genes that had not been reported to be involved in muscle hypertrophy were identified, including up-regulated myosin binding protein H (mybph), and zinc metallopeptidase STE24 (Zmpste24). In addition, kyphoscoliosis peptidase (Ky), which plays a vital role in muscle growth, was also up-regulated in the transgenic mice. Interestingly, a pathway analysis based on grouping the differentially expressed genes uncovered that cardiomyopathy-related pathways and phosphatidic acid (PA) pathways (Dgki, Dgkz, Plcd4) were up-regulated. Increased PA signaling may increase mTOR signaling, resulting in skeletal muscle growth. The findings of the RNA sequencing analysis help to understand the molecular mechanisms of muscle hypertrophy caused by MSTN inhibition. PMID:25860951

  10. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  11. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  12. Transcriptome sequencing revealed the transcriptional organization at ribosome-mediated attenuation sites in Corynebacterium glutamicum and identified a novel attenuator involved in aromatic amino acid biosynthesis.

    PubMed

    Neshat, Armin; Mentz, Almut; Rückert, Christian; Kalinowski, Jörn

    2014-11-20

    The Gram-positive bacterium Corynebacterium glutamicum belongs to the order Corynebacteriales and is used as a producer of amino acids at industrial scales. Due to its economic importance, gene expression and particularly the regulation of amino acid biosynthesis has been investigated extensively. Applying the high-resolution technique of transcriptome sequencing (RNA-seq), recently a vast amount of data has been generated that was used to comprehensively analyze the C. glutamicum transcriptome. By analyzing RNA-seq data from a small RNA cDNA library of C. glutamicum, short transcripts in the known transcriptional attenuator sites of the trp operon, the ilvBNC operon and the leuA gene were verified. Furthermore, whole transcriptome RNA-seq data were used to elucidate the transcriptional organization of these three amino acid biosynthesis operons. In addition, we discovered and analyzed the novel attenuator aroR, located upstream of the aroF gene (cg1129). The DAHP synthase encoded by aroF catalyzes the first step in aromatic amino acid synthesis. The AroR leader peptide contains the amino acid sequence motif F-Y-F, indicating a regulatory effect by phenylalanine and tyrosine. Analysis by real-time RT-PCR suggests that the attenuator regulates the transcription of aroF depending on the cellular amount of tRNA loaded with phenylalanine when comparing a phenylalanine-auxotrophic C. glutamicum mutant fed with limiting and excess amounts of a phenylalanine-containing dipeptide. Additionally, all analyzed attenuators were found to be leaderless transcripts. PMID:24910972

  13. Identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2005-02-08

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  14. Understanding and identifying amino acid repeats.

    PubMed

    Luo, Hong; Nijveen, Harm

    2014-07-01

    Amino acid repeats (AARs) are abundant in protein sequences. They have particular roles in protein function and evolution. Simple repeat patterns generated by DNA slippage tend to introduce length variations and point mutations in repeat regions. Loss of normal and gain of abnormal function owing to their variable length are potential risks leading to diseases. Repeats with complex patterns mostly refer to the functional domain repeats, such as the well-known leucine-rich repeat and WD repeat, which are frequently involved in protein–protein interaction. They are mainly derived from internal gene duplication events and stabilized by ‘gate-keeper’ residues, which play crucial roles in preventing inter-domain aggregation. AARs are widely distributed in different proteomes across a variety of taxonomic ranges, and especially abundant in eukaryotic proteins. However, their specific evolutionary and functional scenarios are still poorly understood. Identifying AARs in protein sequences is the first step for the further investigation of their biological function and evolutionary mechanism. In principle, this is an NP-hard problem, as most of the repeat fragments are shaped by a series of sophisticated evolutionary events and become latent periodical patterns. It is not possible to define a uniform criterion for detecting and verifying various repeat patterns. Instead, different algorithms based on different strategies have been developed to cope with different repeat patterns. In this review, we attempt to describe the amino acid repeat-detection algorithms currently available and compare their strategies based on an in-depth analysis of the biological significance of protein repeats. PMID:23418055
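
    The simplest case this review covers, a run of one repeated residue, can be detected with a short scan. The sketch below handles only perfect single-residue runs; the latent periodic patterns and domain repeats discussed in the abstract need the dedicated algorithms the review compares. The minimum run length and the toy sequence are assumptions.

```python
from itertools import groupby

def single_residue_runs(protein, min_length=5):
    """Return (start, residue, length) for each run of one amino acid.

    min_length is an illustrative threshold, not a standard value.
    """
    runs = []
    pos = 0
    for residue, group in groupby(protein):
        length = len(list(group))
        if length >= min_length:
            runs.append((pos, residue, length))
        pos += length
    return runs

# Hypothetical sequence with a poly-Q tract and a poly-S tract.
toy_protein = "MKT" + "Q" * 8 + "LPA" + "G" * 3 + "S" * 5
runs = single_residue_runs(toy_protein)
```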

  15. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  16. A mitochondrial DNA variant, identified in Leber hereditary optic neuropathy patients, which extends the amino acid sequence of cytochrome c oxidase subunit I.

    PubMed Central

    Brown, M D; Yang, C C; Trounce, I; Torroni, A; Lott, M T; Wallace, D C

    1992-01-01

    A G-to-A transition at nucleotide pair (np) 7444 in the mtDNA was found to correlate with Leber hereditary optic neuropathy (LHON). The mutation eliminates the termination codon of the cytochrome c oxidase subunit I (COI) gene, extending the COI polypeptide by three amino acids. The mutation was discovered as an XbaI restriction-endonuclease-site loss present in 2 (9.1%) of 22 LHON patients who lacked the np 11778 LHON mutation and in 6 (1.1%) of 545 unaffected controls. The mutant polypeptide has an altered mobility on SDS-PAGE, suggesting a structural alteration, and the cytochrome c oxidase enzyme activity of patient lymphocytes is reduced approximately 40% relative to that in controls. These data suggest that the np 7444 mutation results in partial respiratory deficiency and thus contributes to the onset of LHON. PMID:1322638

  17. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  18. Deep Sequencing to Identify the Causes of Viral Encephalitis

    PubMed Central

    Chan, Benjamin K.; Wilson, Theodore; Fischer, Kael F.; Kriesel, John D.

    2014-01-01

    Deep sequencing allows for a rapid, accurate characterization of microbial DNA and RNA sequences in many types of samples. Deep sequencing (also called next generation sequencing or NGS) is being developed to assist with the diagnosis of a wide variety of infectious diseases. In this study, seven frozen brain samples from deceased subjects with recent encephalitis were investigated. RNA from each sample was extracted, randomly reverse transcribed and sequenced. The sequence analysis was performed in a blinded fashion and confirmed with pathogen-specific PCR. This analysis successfully identified measles virus sequences in two brain samples and herpes simplex virus type-1 sequences in three brain samples. No pathogen was identified in the other two brain specimens. These results were concordant with pathogen-specific PCR and partially concordant with prior neuropathological examinations, demonstrating that deep sequencing can accurately identify viral infections in frozen brain tissue. PMID:24699691

  19. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  20. RNA Sequencing Identifies Novel Translational Biomarkers of Kidney Fibrosis.

    PubMed

    Craciun, Florin L; Bijol, Vanesa; Ajay, Amrendra K; Rao, Poornima; Kumar, Ramya K; Hutchinson, John; Hofmann, Oliver; Joshi, Nikita; Luyendyk, James P; Kusebauch, Ulrike; Moss, Christopher L; Srivastava, Anand; Himmelfarb, Jonathan; Waikar, Sushrut S; Moritz, Robert L; Vaidya, Vishal S

    2016-06-01

    CKD is the gradual, asymptomatic loss of kidney function, but current tests only identify CKD when significant loss has already happened. Several potential biomarkers of CKD have been reported, but none have been approved for preclinical or clinical use. Using RNA sequencing in a mouse model of folic acid-induced nephropathy, we identified ten genes that track kidney fibrosis development, the common pathologic finding in patients with CKD. The gene expression of all ten candidates was confirmed to be significantly higher (approximately ten- to 150-fold) in three well established, mechanistically distinct mouse models of kidney fibrosis than in models of nonfibrotic AKI. Protein expression of these genes was also high in the folic acid model and in patients with biopsy-proven kidney fibrosis. mRNA expression of the ten genes increased with increasing severity of kidney fibrosis, decreased in response to therapeutic intervention, and increased only modestly (approximately two- to five-fold) with liver fibrosis in mice and humans, demonstrating specificity for kidney fibrosis. Using targeted selected reaction monitoring mass spectrometry, we detected three of the ten candidates in human urine: cadherin 11 (CDH11), macrophage mannose receptor C1 (MRC1), and phospholipid transfer protein (PLTP). Furthermore, urinary levels of each of these three proteins distinguished patients with CKD (n=53) from healthy individuals (n=53; P<0.05). In summary, we report the identification of urinary CDH11, MRC1, and PLTP as novel noninvasive biomarkers of CKD. PMID:26449608

  1. Brain-specific genes have identifier sequences in their introns.

    PubMed Central

    Milner, R J; Bloom, F E; Lai, C; Lerner, R A; Sutcliffe, J G

    1984-01-01

    The 82-nucleotide identifier (ID) sequence is present in the rat genome in 1-1.5 × 10^5 copies and in cDNA clones of precursors of brain-specific mRNAs. One brain-specific gene contains more than one ID sequence in its introns. There is an excess of ID sequences to brain genes, and some ID sequences appear to have been inserted as mobile elements into other genetic locations. Therefore, brain genes contain ID sequences in their introns, but not all ID sequences are located in brain gene introns. A brain ID consensus sequence has been obtained by comparing 8 ID nucleotide sequences. PMID:6583673

  2. Unconventional P-35S sequence identified in genetically modified maize.

    PubMed

    Al-Hmoud, Nisreen; Al-Husseini, Nawar; Ibrahim-Alobaide, Mohammed A; Kübler, Eric; Farfoura, Mahmoud; Alobydi, Hytham; Al-Rousan, Hiyam

    2014-01-01

    The Cauliflower Mosaic Virus 35S promoter sequence, CaMV P-35S, is one of several commonly used genetic targets to detect genetically modified maize and is found in most GMOs. In this research we report the finding of an alternative P-35S sequence and its incidence in GM maize marketed in Jordan. The primer pair normally used to amplify a 123 bp DNA fragment of the CaMV P-35S promoter in GMOs also amplified a previously undetected alternative sequence of CaMV P-35S in GM maize samples, which we term V3. The amplified V3 sequence comprises 386 base pairs and was not found in the standard wild-type maize or in MON810 and MON 863 GM maize. The identified GM maize samples carrying the V3 sequence were found to be free of CaMV when compared with a CaMV-infected brown mustard sample. Sequence alignment analysis of the V3 genetic element showed 90% similarity with the matching P-35S sequence of the cauliflower mosaic virus isolate CabbB-JI and 99% similarity with matching P-35S sequences found in several binary plant vectors, of which the binary vector locus JQ693018 is one example. The current study showed an increase of 44% in the incidence of the identified 386 bp sequence in GM maize sold in Jordan's markets between 2009 and 2012. PMID:24495911
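
    Percentages such as the 90% and 99% figures in this record come from comparing aligned sequences position by position. The sketch below computes percent identity over pre-aligned sequences; it is an illustration only, since the abstract does not state how its similarity values were calculated. Skipping gap columns is an assumption (some tools count gaps as mismatches), and the sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions between two pre-aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped; this
    convention is an assumption of the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")  # one mismatch in ten
gapped = percent_identity("ACG-TA", "ACGTTA")            # gap column skipped
```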

  3. Probe kit for identifying a base in a nucleic acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  4. Method of Identifying a Base in a Nucleic Acid

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    1999-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  5. Complete genome sequence of southern tomato virus identified from China using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Complete genome sequence of a double-stranded RNA (dsRNA) virus, southern tomato virus (STV), on tomatoes in China, was elucidated using small RNAs deep sequencing. The identified STV_CN12 shares 99% sequence identity to other isolates from Mexico, France, Spain, and U.S. This is the first report ...

  6. Exome sequencing identified new mutations in a Marfan syndrome family

    PubMed Central

    2014-01-01

    Marfan syndrome is a common autosomal dominant hereditary connective tissue disorder. There is currently no cure for Marfan syndrome. Next-generation sequencing (NGS) technology is efficient at identifying genetic lesions at the exome level. Here we carried out exome sequencing of two Marfan syndrome patients. Sanger sequencing validation in five other members of the same family was also performed to confirm new variants which may contribute to the pathogenesis of the disease. Two new variants which may be related to the phenotype of the patients were identified: one nonsense SNP in the Marfan syndrome gene FBN1 and one missense mutation in exon 15 of LRP1. The exome sequencing analysis provides new insight into the molecular events governing pathogenesis of Marfan syndrome. Virtual slide http://www.diagnosticpathology.diagnomx.eu/vs/1229110069114125. PMID:24484584

  7. Promoter sequences and algorithmical methods for identifying them.

    PubMed

    Vanet, A; Marsan, L; Sagot, M F

    1999-01-01

    This paper presents a survey of currently available mathematical models and algorithmical methods for trying to identify promoter sequences. The methods concern both searching in a genome for a previously defined consensus and extracting a consensus from a set of sequences. Such methods were often tailored for either eukaryotes or prokaryotes although this does not preclude use of the same method for both types of organisms. The survey therefore covers all methods; however, emphasis is placed on prokaryotic promoter sequence identification. Illustrative applications of the main extracting algorithms are given for three bacteria. PMID:10673015
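
    The first family of methods mentioned in the abstract above, searching a genome for a previously defined consensus, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the survey: the two boxes are the textbook E. coli sigma-70 -35 and -10 hexamers, the 15-19 bp spacer range and the one-mismatch-per-box tolerance are arbitrary choices, and the function names and toy sequence are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_like_sites(genome, box35="TTGACA", box10="TATAAT",
                             spacers=range(15, 20), max_mismatch=1):
    """Return (start_of_-35_box, spacer_length, start_of_-10_box) tuples.

    A site is reported when both boxes match their consensus with at most
    max_mismatch mismatches each and are separated by an allowed spacer.
    """
    hits = []
    n35, n10 = len(box35), len(box10)
    for i in range(len(genome) - n35 + 1):
        if hamming(genome[i:i + n35], box35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + n35 + gap
            if j + n10 > len(genome):
                break  # longer spacers would run past the end of the sequence
            if hamming(genome[j:j + n10], box10) <= max_mismatch:
                hits.append((i, gap, j))
    return hits


# Toy sequence (hypothetical): GC-rich background, an exact -35 box,
# a 17 bp spacer, and a -10 box carrying one mismatch (TATACT).
toy_genome = "GC" * 10 + "TTGACA" + "GC" * 8 + "G" + "TATACT" + "GC" * 10
hits = find_promoter_like_sites(toy_genome)
```

    Exact-or-near-exact consensus scanning of this kind is only the simplest option; weight matrices or probabilistic models would replace the mismatch count with a score.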

  8. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  9. Amino-Acid Sequence of Porcine Pepsin

    PubMed Central

    Tang, J.; Sepulveda, P.; Marciniszyn, J.; Chen, K. C. S.; Huang, W-Y.; Tao, N.; Liu, D.; Lanier, J. P.

    1973-01-01

    As the culmination of several years of experiments, we propose a complete amino-acid sequence for porcine pepsin, an enzyme containing 327 amino-acid residues in a single polypeptide chain. In the sequence determination, the enzyme was treated with cyanogen bromide. Five resulting fragments were purified. The amino-acid sequence of four of the fragments accounted for 290 residues. Because the structure of a 37-residue carboxyl-terminal fragment was already known, it was not studied. The alignment of these fragments was determined from the sequence of methionyl-peptides we had previously reported. We also discovered the locations of active-site aspartyl residues, as well as the pairing of the three disulfide bridges. A minor component of commercial crystalline pepsin was found to contain two extra amino-acid residues, Ala-Leu-, at the amino-terminus of the molecule. This minor component was apparently derived from a different site of cleavage during the activation of porcine pepsinogen. PMID:4587252

  10. Identifying features in biological sequences: Sixth workshop report

    SciTech Connect

    Burks, C.; Myers, E.; Pearson, W.R.

    1995-12-31

    This report covers the sixth of an annual series of workshops held at the Aspen Center for Physics concentrating particularly on the identification of features in DNA sequence, and more broadly on related topics in computational molecular biology. The workshop series originally focused primarily on discussion of current needs and future strategies for identifying and predicting the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians.

  11. A homozygous mutation in PEX16 identified by whole-exome sequencing ending a diagnostic odyssey

    PubMed Central

    Bacino, Carlos A.; Chao, Yu-Hsin; Seto, Elaine; Lotze, Tim; Xia, Fan; Jones, Richard O.; Moser, Ann; Wangler, Michael F.

    2015-01-01

    We present a patient with a unique neurological phenotype with a progressive neurodegenerative course. An 18-year diagnostic odyssey for the patient ended when exome sequencing identified a homozygous PEX16 mutation suggesting an atypical peroxisomal biogenesis disorder (PBD). Interestingly, the patient's peroxisomal biochemical abnormalities were subtle, such that plasma very-long-chain fatty acids initially failed to provide a diagnosis. This case suggests that next-generation sequencing may be diagnostic in some atypical peroxisomal biogenesis disorders. PMID:26644994

  12. Whole exome sequencing to identify genetic causes of short stature

    PubMed Central

    Guo, Michael H.; Shen, Yiping; Walvoord, Emily C.; Miller, Timothy C.; Moon, Jennifer E.; Hirschhorn, Joel N; Dauber, Andrew

    2014-01-01

    Background/Aims: Short stature is a common reason for presentation to pediatric endocrinology clinics. However, for most patients, no cause for the short stature can be identified. As genetics plays a strong role in height, we sought to identify known and novel genetic causes of short stature. Methods: We recruited 14 children with severe short stature of unknown etiology. We conducted whole exome sequencing of the patients and their family members. We used an analysis pipeline to identify rare nonsynonymous genetic variants that cause the short stature. Results: We identified a genetic cause of short stature in 5 of the 14 patients. This included cases of Floating Harbor syndrome, Kenny-Caffey syndrome, the progeroid form of Ehlers-Danlos syndrome, as well as two cases of the 3-M syndrome. For the remaining patients, we have generated lists of candidate variants. Conclusions: Whole exome sequencing can help identify genetic causes of short stature in the context of defined genetic syndromes, but may be less effective in identifying novel genetic causes of short stature in individual families. Utilized in the clinic, whole exome sequencing can provide clinically relevant diagnoses for these patients. Rare syndromic causes of short stature may be under-recognized and under-diagnosed in pediatric endocrinology clinics. PMID:24970356

  13. Identifying novel sequence variants of RNA 3D motifs

    PubMed Central

    Zirbel, Craig L.; Roll, James; Sweeney, Blake A.; Petrov, Anton I.; Pirrung, Meg; Leontis, Neocles B.

    2015-01-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson–Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  14. Identifying novel sequence variants of RNA 3D motifs.

    PubMed

    Zirbel, Craig L; Roll, James; Sweeney, Blake A; Petrov, Anton I; Pirrung, Meg; Leontis, Neocles B

    2015-09-01

    Predicting RNA 3D structure from sequence is a major challenge in biophysics. An important sub-goal is accurately identifying recurrent 3D motifs from RNA internal and hairpin loop sequences extracted from secondary structure (2D) diagrams. We have developed and validated new probabilistic models for 3D motif sequences based on hybrid Stochastic Context-Free Grammars and Markov Random Fields (SCFG/MRF). The SCFG/MRF models are constructed using atomic-resolution RNA 3D structures. To parameterize each model, we use all instances of each motif found in the RNA 3D Motif Atlas and annotations of pairwise nucleotide interactions generated by the FR3D software. Isostericity relations between non-Watson-Crick basepairs are used in scoring sequence variants. SCFG techniques model nested pairs and insertions, while MRF ideas handle crossing interactions and base triples. We use test sets of randomly-generated sequences to set acceptance and rejection thresholds for each motif group and thus control the false positive rate. Validation was carried out by comparing results for four motif groups to RMDetect. The software developed for sequence scoring (JAR3D) is structured to automatically incorporate new motifs as they accumulate in the RNA 3D Motif Atlas when new structures are solved and is available free for download. PMID:26130723

  15. Myeloid neoplasm demonstrating a STAT5B-RARA rearrangement and genetic alterations associated with all-trans retinoic acid resistance identified by a custom next-generation sequencing assay.

    PubMed

    Kluk, Michael J; Abo, Ryan P; Brown, Ronald D; Kuo, Frank C; Dal Cin, Paola; Pozdnyakova, Olga; Morgan, Elizabeth A; Lindeman, Neal I; DeAngelo, Daniel J; Aster, Jon C

    2015-10-01

    We describe the case of a patient presenting with several weeks of symptoms related to pancytopenia associated with a maturation arrest at the late promyelocyte/early myelocyte stage of granulocyte differentiation. A diagnosis of acute promyelocytic leukemia was considered, but the morphologic features were atypical for this entity and conventional tests for the presence of a PML-RARA fusion gene were negative. Additional analysis using a custom next-generation sequencing assay revealed a rearrangement producing a STAT5B-RARA fusion gene, which was confirmed by reverse transcription polymerase chain reaction (RT-PCR) and supplementary cytogenetic studies, allowing the diagnosis of a morphologically atypical form of acute promyelocytic leukemia to be made. Analysis of the sequencing data permitted characterization of both chromosomal breakpoints and revealed two additional alterations, a small deletion in RARA exon 9 and a RARA R276W substitution, that have been linked to resistance to all-trans retinoic acid. This case highlights how next-generation sequencing can augment currently standard testing to establish diagnoses in difficult cases, and in doing so help guide selection of therapy. PMID:27148563

  16. Myeloid neoplasm demonstrating a STAT5B-RARA rearrangement and genetic alterations associated with all-trans retinoic acid resistance identified by a custom next-generation sequencing assay

    PubMed Central

    Kluk, Michael J.; Abo, Ryan P.; Brown, Ronald D.; Kuo, Frank C.; Dal Cin, Paola; Pozdnyakova, Olga; Morgan, Elizabeth A.; Lindeman, Neal I.; DeAngelo, Daniel J.; Aster, Jon C.

    2015-01-01

    We describe the case of a patient presenting with several weeks of symptoms related to pancytopenia associated with a maturation arrest at the late promyelocyte/early myelocyte stage of granulocyte differentiation. A diagnosis of acute promyelocytic leukemia was considered, but the morphologic features were atypical for this entity and conventional tests for the presence of a PML-RARA fusion gene were negative. Additional analysis using a custom next-generation sequencing assay revealed a rearrangement producing a STAT5B-RARA fusion gene, which was confirmed by reverse transcription polymerase chain reaction (RT-PCR) and supplementary cytogenetic studies, allowing the diagnosis of a morphologically atypical form of acute promyelocytic leukemia to be made. Analysis of the sequencing data permitted characterization of both chromosomal breakpoints and revealed two additional alterations, a small deletion in RARA exon 9 and a RARA R276W substitution, that have been linked to resistance to all-trans retinoic acid. This case highlights how next-generation sequencing can augment currently standard testing to establish diagnoses in difficult cases, and in doing so help guide selection of therapy. PMID:27148563

  17. Protective variant for hippocampal atrophy identified by whole exome sequencing.

    PubMed

    Nho, Kwangsik; Kim, Sungeun; Risacher, Shannon L; Shen, Li; Corneveaux, Jason J; Swaminathan, Shanker; Lin, Hai; Ramanan, Vijay K; Liu, Yunlong; Foroud, Tatiana M; Inlow, Mark H; Siniard, Ashley L; Reiman, Rebecca A; Aisen, Paul S; Petersen, Ronald C; Green, Robert C; Jack, Clifford R; Weiner, Michael W; Baldwin, Clinton T; Lunetta, Kathryn L; Farrer, Lindsay A; Furney, Simon J; Lovestone, Simon; Simmons, Andrew; Mecocci, Patrizia; Vellas, Bruno; Tsolaki, Magda; Kloszewska, Iwona; Soininen, Hilkka; McDonald, Brenna C; Farlow, Martin R; Ghetti, Bernardino; Huentelman, Matthew J; Saykin, Andrew J

    2015-03-01

    We used whole-exome sequencing to identify variants other than APOE associated with the rate of hippocampal atrophy in amnestic mild cognitive impairment. An in-silico predicted missense variant in REST (rs3796529) was found exclusively in subjects with slow hippocampal volume loss and validated using unbiased whole-brain analysis and meta-analysis across 5 independent cohorts. REST is a master regulator of neurogenesis and neuronal differentiation that has not been previously implicated in Alzheimer's disease. These findings nominate REST and its functional pathways as protective and illustrate the potential of combining next-generation sequencing with neuroimaging to discover novel disease mechanisms and potential therapeutic targets. PMID:25559091

  18. Regulated expression of repetitive sequences including the identifier sequence during myotube formation in culture.

    PubMed Central

    Herget, T; Reich, M; Stüber, K; Starzinski-Powitz, A

    1986-01-01

    We have isolated and characterized a cDNA of 1183 bp, pL6-411, from rat L6 muscle cells. This cDNA contains repetitive sequences - including two inverted copies of the previously described identifier sequence - as shown by sequence analysis. Repetitive sequences from pL6-411 characterize a family of RNAs which is specifically induced during L6 myotube formation. Another part of the pL6-411 sequence, existing at low-copy number per haploid rat genome, hybridized to two RNAs of 5 kb and 2 kb from L6 myoblasts as well as from L6 myotubes. A third pL6-411-related RNA of 150 bases was detected which hybridized with the repetitive sequence but did not hybridize with the low-copy number part of pL6-411. It appears that the 'identifier' sequence in this population of small RNAs is complementary to one of the 'identifier' copies in the pL6-411-related RNA. Finally, we identified on cDNA pL6-411 the recognition site for the TGGCA-binding protein and in both orientations a total of four putative promoters for RNA polymerase III. PMID:2423328

  19. Correlation approach to identify coding regions in DNA sequences

    NASA Technical Reports Server (NTRS)

    Ossadnik, S. M.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Mantegna, R. N.; Peng, C. K.; Simons, M.; Stanley, H. E.

    1994-01-01

    Recently, it was observed that noncoding regions of DNA sequences possess long-range power-law correlations, whereas coding regions typically display only short-range correlations. We develop an algorithm based on this finding that enables investigators to perform a statistical analysis on long DNA sequences to locate possible coding regions. The algorithm is particularly successful in predicting the location of lengthy coding regions. For example, for the complete genome of yeast chromosome III (315,344 nucleotides), at least 82% of the predictions correspond to putative coding regions; the algorithm correctly identified all coding regions larger than 3000 nucleotides, 92% of coding regions between 2000 and 3000 nucleotides long, and 79% of coding regions between 1000 and 2000 nucleotides. The predictive ability of this new algorithm supports the claim that there is a fundamental difference in the correlation property between coding and noncoding sequences. This algorithm, which is not species-dependent, can be implemented with other techniques for rapidly and accurately locating relatively long coding regions in genomic sequences.
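
    The observation underlying the abstract above (different correlation behaviour in coding and noncoding DNA) can be made concrete with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a purine/pyrimidine "DNA walk" and estimates a scaling exponent from the root-mean-square fluctuation of the walk over a few window sizes; the function names, the window sizes, and both synthetic sequences are hypothetical.

```python
import math
import random

PURINES = {"A", "G"}


def dna_walk(seq):
    """Cumulative walk: +1 for a purine, -1 for a pyrimidine."""
    y, total = [0], 0
    for base in seq:
        total += 1 if base in PURINES else -1
        y.append(total)
    return y


def fluctuation(y, l):
    """Root-mean-square fluctuation of the walk displacement over distance l."""
    deltas = [y[i + l] - y[i] for i in range(len(y) - l)]
    mean = sum(deltas) / len(deltas)
    var = sum((d - mean) ** 2 for d in deltas) / len(deltas)
    return math.sqrt(var)


def scaling_exponent(seq, scales=(4, 8, 16, 32, 64)):
    """Least-squares slope of log F(l) against log l."""
    y = dna_walk(seq)
    xs = [math.log(l) for l in scales]
    ys = [math.log(fluctuation(y, l)) for l in scales]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


rng = random.Random(0)
# Independent random bases: no correlations beyond a single site.
uncorrelated = "".join(rng.choice("ACGT") for _ in range(20000))
# Long runs of one base type: a crude stand-in for a strongly correlated sequence.
patchy = "".join(rng.choice(["A", "C"]) * 200 for _ in range(100))

alpha_uncorrelated = scaling_exponent(uncorrelated)
alpha_patchy = scaling_exponent(patchy)
```

    An exponent near 0.5 indicates an uncorrelated walk, while values clearly above 0.5 indicate persistent correlations over the scales examined. A scanning tool in the spirit of the abstract would evaluate such a statistic in a sliding window along the genome, which this sketch does not attempt.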

  20. Phenotype Sequencing: Identifying the Genes That Cause a Phenotype Directly from Pooled Sequencing of Independent Mutants

    PubMed Central

    Harper, Marc A.; Chen, Zugen; Toy, Traci; Machado, Iara M. P.; Nelson, Stanley F.; Liao, James C.; Lee, Christopher J.

    2011-01-01

    Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50–100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a “Phenotype Sequencing” approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that 1. causal genes can be identified with high probability from even a modest number of mutant genomes; 2. costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110–$340. PMID:21364744
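
    The core idea in the abstract above, that a causal gene should be hit in more independent mutants than chance would predict, can be sketched as follows. This is a simplified illustration rather than the authors' statistical method: it assumes mutations fall uniformly along the genome, scores each gene with a Poisson tail probability, and uses hypothetical gene names, gene lengths, and strain labels.

```python
import math
from collections import defaultdict


def strains_per_gene(mutations):
    """Map each gene to the set of independent strains mutated in it."""
    hits = defaultdict(set)
    for strain, gene in mutations:
        hits[gene].add(strain)
    return hits


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - below


def rank_genes(mutations, gene_lengths, genome_length):
    """Sort genes by how surprising their hit count is under uniform mutagenesis."""
    hits = strains_per_gene(mutations)
    total = len(mutations)
    scored = []
    for gene, strains in hits.items():
        expected = total * gene_lengths[gene] / genome_length
        scored.append((poisson_tail(len(strains), expected), gene, len(strains)))
    return sorted(scored)


# Hypothetical data: (strain, gene) pairs from five independent mutants.
toy_mutations = [
    ("s1", "geneX"), ("s2", "geneX"), ("s3", "geneX"), ("s4", "geneX"),
    ("s1", "geneY"), ("s2", "geneZ"), ("s5", "geneW"), ("s3", "geneV"),
]
toy_lengths = {g: 1000 for g in ("geneX", "geneY", "geneZ", "geneW", "geneV")}
ranking = rank_genes(toy_mutations, toy_lengths, genome_length=4_000_000)
```

    In pooled designs such as those the abstract describes, strain identity within a pool is not observed directly, so the counting step would operate on pools or tags instead of individual strains.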

  1. Identifying rare variants associated with complex traits via sequencing

    PubMed Central

    Li, Bingshan; Liu, Dajiang J.; Leal, Suzanne M.

    2013-01-01

    Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. Due to the low frequency of rare variants, coupled with allelic heterogeneity, however, the association analysis for rare variants is challenging and traditional methods are ineffective. Recently a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region, or a group of variants in the genome. In this Unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods and discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits. PMID:23853079
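
    Aggregating rare variants across a gene or region, as described in the abstract above, can be illustrated with a collapsing-style sketch. It is a generic example under stated assumptions, not one of the specific methods the Unit surveys: each individual is reduced to a carrier/non-carrier flag for the region, and carriers are compared between cases and controls with a one-sided Fisher exact test. The genotype coding (rare-allele counts per site) and all data are hypothetical.

```python
from math import comb


def carrier_flags(genotypes, rare_sites):
    """1 if an individual carries at least one rare allele at the given sites."""
    return [int(any(g[s] > 0 for s in rare_sites)) for g in genotypes]


def fisher_one_sided(carr_cases, n_cases, carr_controls, n_controls):
    """P(at least carr_cases carriers among cases) under the hypergeometric null."""
    total = n_cases + n_controls
    carriers = carr_cases + carr_controls
    upper = min(carriers, n_cases)
    favourable = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(carr_cases, upper + 1)
    )
    return favourable / comb(total, carriers)


# Hypothetical genotypes: rows are individuals, columns are three rare sites.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 2, 0], [0, 0, 0]]
controls = [[0, 0, 0] for _ in range(6)]
sites = [0, 1, 2]

case_carriers = sum(carrier_flags(cases, sites))
control_carriers = sum(carrier_flags(controls, sites))
p_value = fisher_one_sided(case_carriers, len(cases), control_carriers, len(controls))
```

    No single site in the toy data is carried by more than two cases, yet the collapsed region separates cases from controls, which is the motivation for aggregation when individual variants are too rare to test alone.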

  2. Using whole exome sequencing to identify inherited causes of autism

    PubMed Central

    Yu, T.W.; Chahrour, M.H.; Coulter, M.E.; Jiralerspong, S.; Okamura-Ikeda, K.; Ataman, B.; Schmitz-Abe, K.; Harmin, D.A.; Adli, M.; Malik, A.N.; D’Gama, A.M.; Lim, E.T.; Sanders, S.J.; Mochida, G.H.; Partlow, J.N.; Sunu, C.M.; Felie, J.M.; Rodriguez, J.; Nasir, R.H.; Ware, J.; Joseph, R.M.; Hill, R.S.; Kwan, B.Y.; Al-Saffar, M.; Mukaddes, N.M.; Hashmi, A.; Balkhy, S.; Gascon, G.G.; Hisama, F.M.; LeClair, E.; Poduri, A.; Oner, O.; Al-Saad, S.; Al-Awadi, S.A.; Bastaki, L.; Ben-Omran, T.; Teebi, A.; Al-Gazali, L.; Eapen, V.; Stevens, C.R.; Rappaport, L.; Gabriel, S.B.; Markianos, K.; State, M.W.; Greenberg, M.E.; Taniguchi, H.; Braverman, N.E.; Morrow, E.M.; Walsh, C.A.

    2013-01-01

    Summary: Despite significant heritability of autism spectrum disorders (ASDs), their extreme genetic heterogeneity has proven challenging for gene discovery. Studies of primarily simplex families have implicated de novo copy number changes and point mutations, but are not optimally designed to identify inherited risk alleles. We apply whole exome sequencing (WES) to ASD families enriched for inherited causes due to consanguinity and find familial ASD associated with biallelic mutations in disease genes (AMT, PEX7, SYNE1, VPS13B, PAH, POMGNT1), some implicated for the first time in ASD. At least some of these genes show biallelic mutations in nonconsanguineous families as well. These mutations are often only partially disabling or present atypically, with patients lacking diagnostic features of the Mendelian disorders with which these genes are classically associated. Our study shows the utility of WES for identifying specific genetic conditions not clinically suspected and the importance of partial loss of gene function in ASDs. PMID:23352163

  3. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  4. Exome sequencing of contralateral breast cancer identifies metastatic disease.

    PubMed

    Klevebring, Daniel; Lindberg, Johan; Rockberg, Julia; Hilliges, Camilla; Hall, Per; Sandberg, Maria; Czene, Kamila

    2015-06-01

    Women with contralateral breast cancer (CBC) have significantly worse prognosis compared to women with unilateral cancer. A possible explanation of the poor prognosis of patients with CBC is that in a subset of patients, the second cancer is not a new primary tumor but a metastasis of the first cancer that has potentially obtained aggressive characteristics through selection of treatment. Exome and whole-genome sequencing of solid tumors has previously been used to investigate the clonal relationship between primary tumors and metastases in several diseases. In order to assess the relationship between the first and the second cancer, we performed exome sequencing to identify somatic mutations in both first and second cancers, and compared paired normal tissue of 25 patients with metachronous CBC. For three patients, we identified shared somatic mutations indicating a common clonal origin, thereby demonstrating that the second tumor is a metastasis of the first cancer rather than a new primary cancer. Accordingly, these patients all developed distant metastasis within 3 years of the second diagnosis, compared with 7 out of 22 patients with non-shared somatic profiles. Genomic profiling of both tumors helps clinicians distinguish between true CBCs and subsequent metastases. PMID:25922084

  5. Exome sequencing identifies PDE4D mutations in acrodysostosis.

    PubMed

    Lee, Hane; Graham, John M; Rimoin, David L; Lachman, Ralph S; Krejci, Pavel; Tompson, Stuart W; Nelson, Stanley F; Krakow, Deborah; Cohn, Daniel H

    2012-04-01

    Acrodysostosis is a dominantly-inherited, multisystem disorder characterized by skeletal, endocrine, and neurological abnormalities. To identify the molecular basis of acrodysostosis, we performed exome sequencing on five genetically independent cases. Three different missense mutations in PDE4D, which encodes cyclic AMP (cAMP)-specific phosphodiesterase 4D, were found to be heterozygous in three of the cases. Two of the mutations were demonstrated to have occurred de novo, providing strong genetic evidence of causation. Two additional cases were heterozygous for de novo missense mutations in PRKAR1A, which encodes the cAMP-dependent regulatory subunit of protein kinase A and which has been recently reported to be the cause of a form of acrodysostosis resistant to multiple hormones. These findings demonstrate that acrodysostosis is genetically heterogeneous and underscore the exquisite sensitivity of many tissues to alterations in cAMP homeostasis. PMID:22464252

  6. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  7. Exome sequencing identifies recurrent somatic RAC1 mutations in melanoma

    SciTech Connect

    Krauthammer, Michael; Kong, Yong; Ha, Byung Hak; Evans, Perry; Bacchiocchi, Antonella; McCusker, James P.; Cheng, Elaine; Davis, Matthew J.; Goh, Gerald; Choi, Murim; Ariyan, Stephan; Narayan, Deepak; Dutton-Regester, Ken; Capatana, Ana; Holman, Edna C.; Bosenberg, Marcus; Sznol, Mario; Kluger, Harriet M.; Brash, Douglas E.; Stern, David F.; Materin, Miguel A.; Lo, Roger S.; Mane, Shrikant; Ma, Shuangge; Kidd, Kenneth K.; Hayward, Nicholas K.; Lifton, Richard P.; Schlessinger, Joseph; Boggon, Titus J.; Halaban, Ruth

    2012-10-11

    We characterized the mutational landscape of melanoma, the form of skin cancer with the highest mortality rate, by sequencing the exomes of 147 melanomas. Sun-exposed melanomas had markedly more ultraviolet (UV)-like C>T somatic mutations compared to sun-shielded acral, mucosal and uveal melanomas. Among the newly identified cancer genes was PPP6C, encoding a serine/threonine phosphatase, which harbored mutations that clustered in the active site in 12% of sun-exposed melanomas, exclusively in tumors with mutations in BRAF or NRAS. Notably, we identified a recurrent UV-signature, an activating mutation in RAC1 in 9.2% of sun-exposed melanomas. This activating mutation, the third most frequent in our cohort of sun-exposed melanoma after those of BRAF and NRAS, changes Pro29 to serine (RAC1^P29S) in the highly conserved switch I domain. Crystal structures, and biochemical and functional studies of RAC1^P29S showed that the alteration releases the conformational restraint conferred by the conserved proline, causes an increased binding of the protein to downstream effectors, and promotes melanocyte proliferation and migration. These findings raise the possibility that pharmacological inhibition of downstream effectors of RAC1 signaling could be of therapeutic benefit.

  8. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  9. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
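
    The abstract above attributes part of the weak performance on short disordered segments to averaging predictor outputs over 61-residue windows. A minimal sketch shows why such smoothing suppresses short segments. The 61-residue window comes from the abstract; the 0.5 threshold, the truncated windows at the chain ends, the function names, and the synthetic score profile are assumptions made for illustration.

```python
def window_average(scores, window=61):
    """Centered moving average; windows are truncated at the sequence ends."""
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def disordered_segments(scores, threshold=0.5):
    """Half-open (start, end) index pairs where the score is >= threshold."""
    segments, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


# Synthetic per-residue disorder scores: one 10-residue and one 100-residue segment.
raw = [0.0] * 100 + [1.0] * 10 + [0.0] * 100 + [1.0] * 100 + [0.0] * 100

raw_segments = disordered_segments(raw)
smoothed_segments = disordered_segments(window_average(raw))
```

    After smoothing, the 10-residue segment peaks at 10/61 and falls below the threshold, while the 100-residue segment survives, which mirrors the short-versus-long disparity the abstract reports.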

  10. A novel exogenous retrovirus sequence identified in humans.

    PubMed Central

    Griffiths, D J; Venables, P J; Weiss, R A; Boyd, M T

    1997-01-01

    A 932-bp retrovirus sequence was cloned by reverse transcriptase PCR from salivary gland tissue of a patient with Sjögren's syndrome. The sequence is related to that of type B and type D retroviruses and was present in a sucrose density gradient fraction corresponding to that of an enveloped retrovirus particle. Sequences amplified from tissues of eight individuals with or without Sjögren's syndrome had over 90% similarity and were present at a level of less than one copy per 10^3 cells. The sequence was not detectable in human genomic DNA by PCR or by Southern hybridization. These data indicate that the sequence represents an infectiously acquired genome, provisionally called human retrovirus 5. PMID:9060643

  11. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  12. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  13. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  14. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background: Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results: Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion: We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity, etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  15. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  16. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  17. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  18. Partial amino acid sequence of human factor D: homology with serine proteases.

    PubMed Central

    Volanakis, J E; Bhown, A; Bennett, J C; Mole, J E

    1980-01-01

    Human factor D purified to homogeneity by a modified procedure was subjected to NH2-terminal amino acid sequence analysis by using a modified automated Beckman sequencer. We identified 48 of the first 57 NH2-terminal amino acids in a single sequencer run, using microgram quantities of factor D. The deduced amino acid sequence represents approximately 25% of the primary structure of factor D. This extended NH2-terminal amino acid sequence of factor D was compared to that of other trypsin-related serine proteases. By visual inspection, strong homologies (33-50% identity) were observed with all the serine proteases included in the comparison. Interestingly, factor D showed a higher degree of homology to serine proteases of pancreatic origin than to those of serum origin. PMID:6987665

  19. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  20. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  1. Simple sequence repeat markers that identify Claviceps species and strains

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Claviceps purpurea is a pathogen that infects most members of the Pooideae subfamily and causes ergot, a floral disease in which the ovary is replaced with a sclerotium. This study was initiated to develop Simple Sequence Repeat (SSR) markers for rapid identification of C. purpurea. SSRs were desi...

  2. Use of HLA peptidomics and whole exome sequencing to identify human immunogenic neo-antigens

    PubMed Central

    Kalaora, Shelly; Qutob, Nouar; Teer, Jamie K.; Shimony, Nilly; Schachter, Jacob; Rosenberg, Steven A.; Samuels, Yardena

    2016-01-01

    The antigenicity of cells is demarcated by the peptides bound by their Human Leucocyte Antigen (HLA) molecules. Through this antigen presentation, T cell specificity response is controlled. As a fraction of the expressed mutated peptides is presented on the HLA, these neo-epitopes could be immunogenic. Such neoantigens have recently been identified through screening for predicted mutated peptides, using synthetic peptides or ones expressed from minigenes, combined with screening of patient tumor-infiltrating lymphocytes (TILs). Here we present a time and cost-effective method that combines whole-exome sequencing analysis with HLA peptidome mass spectrometry, to identify neo-antigens in a melanoma patient. Of the 1,019 amino acid changes identified through exome sequencing, two were confirmed by mass spectrometry to be presented by the cells. We then synthesized peptides and evaluated the two mutated neo-antigens for reactivity with autologous bulk TILs, and found that one yielded mutant-specific T-cell response. Our results demonstrate that this method can be used for immune response prediction and promise to provide an alternative approach for identifying immunogenic neo-epitopes in cancer. PMID:26819371

  3. Use of HLA peptidomics and whole exome sequencing to identify human immunogenic neo-antigens.

    PubMed

    Kalaora, Shelly; Barnea, Eilon; Merhavi-Shoham, Efrat; Qutob, Nouar; Teer, Jamie K; Shimony, Nilly; Schachter, Jacob; Rosenberg, Steven A; Besser, Michal J; Admon, Arie; Samuels, Yardena

    2016-02-01

    The antigenicity of cells is demarcated by the peptides bound by their Human Leucocyte Antigen (HLA) molecules. Through this antigen presentation, T cell specificity response is controlled. As a fraction of the expressed mutated peptides is presented on the HLA, these neo-epitopes could be immunogenic. Such neo-antigens have recently been identified through screening for predicted mutated peptides, using synthetic peptides or ones expressed from minigenes, combined with screening of patient tumor-infiltrating lymphocytes (TILs). Here we present a time and cost-effective method that combines whole-exome sequencing analysis with HLA peptidome mass spectrometry, to identify neo-antigens in a melanoma patient. Of the 1,019 amino acid changes identified through exome sequencing, two were confirmed by mass spectrometry to be presented by the cells. We then synthesized peptides and evaluated the two mutated neo-antigens for reactivity with autologous bulk TILs, and found that one yielded mutant-specific T-cell response. Our results demonstrate that this method can be used for immune response prediction and promise to provide an alternative approach for identifying immunogenic neo-epitopes in cancer. PMID:26819371

  4. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  5. Novel mutation in SUCLA2 identified on sequencing analysis.

    PubMed

    Güngör, Olcay; Özkaya, Ahmet Kağan; Güngör, Gülay; Karaer, Kadri; Dilber, Cengiz; Aydin, Kürşad

    2016-07-01

    Succinate-CoA ligase, ADP-forming, beta subunit (SUCLA2)-related mitochondrial DNA depletion syndrome is caused by mutations affecting the ADP-using isoform of the beta subunit in succinyl-CoA synthase, which is involved in the Krebs cycle. The SUCLA2 protein is found mostly in heart, skeletal muscle, and brain tissues. SUCLA2 mutations result in a mitochondrial disorder that manifests as deafness, lesions in the basal ganglia, and encephalomyopathy accompanied by dystonia. Such mutations are generally associated with mildly increased plasma methylmalonic acid, increased plasma lactate, elevated plasma carnitine esters, and the presence of methylmalonic acid in urine. In this case report, we describe a new mutation in a patient with a succinyl-CoA synthase deficiency caused by an SUCLA2 defect. PMID:26952923

  6. Detecting frame shifts by amino acid sequence comparison.

    PubMed

    Claverie, J M

    1993-12-20

    Various amino acid substitution scoring matrices are used in conjunction with local alignment programs to detect regions of similarity and infer potential common ancestry between proteins. The usual scoring schemes derive from the implicit hypothesis that related proteins evolve from a common ancestor by the accumulation of point mutations and that amino acids tend to be progressively substituted by others with similar properties. However, other frequent single mutation events, like nucleotide insertion or deletion and gene inversion, change the translation reading frame and cause previously encoded amino acid sequences to become unrecognizable at once. Here, I derive five new types of scoring matrix, each capable of detecting a specific frame shift (deletion, insertion and inversion in 3 frames) and use them with a regular local alignment program to detect amino acid sequences that may have derived from alternative reading frames of the same nucleotide sequence. Frame shifts are inferred from the sole comparison of the protein sequences. The five scoring matrices were used with the BLASTP program to compare all the protein sequences in the Swissprot database. Surprisingly, the searches revealed hundreds of highly significant frame shift matches, of which many are likely to represent sequencing errors. Others provide some evidence that frame shift mutations might be used in protein evolution as a way to create new amino acid sequences from pre-existing coding regions. PMID:7903399
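
    To illustrate why a shifted reading frame makes the encoded protein unrecognizable, the following minimal sketch translates one nucleotide sequence in its three forward frames using the standard genetic code. It is not the author's scoring-matrix method; the function name and the toy DNA string are hypothetical and serve only as an example.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate dna starting at offset frame (0, 1 or 2).

    Incomplete trailing codons are dropped; codons containing
    characters other than A, C, G, T are reported as 'X'.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(frame, len(dna) - 2, 3)
    )


# Hypothetical toy sequence, for illustration only.
toy_dna = "ATGGCCAAATGA"
frames = [translate(toy_dna, f) for f in range(3)]
```

    The three translations of the toy sequence share no residues position by position, which is the situation that conventional substitution matrices cannot score and that the frame-shift matrices described above are designed to detect.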

  7. Exome sequencing identifies a new mutation in SERAC1 in a patient with 3-methylglutaconic aciduria.

    PubMed

    Tort, Frederic; García-Silva, María Teresa; Ferrer-Cortès, Xènia; Navarro-Sastre, Aleix; Garcia-Villoria, Judith; Coll, Maria Josep; Vidal, Enrique; Jiménez-Almazán, Jorge; Dopazo, Joaquín; Briones, Paz; Elpeleg, Orly; Ribes, Antonia

    2013-01-01

    3-Methylglutaconic aciduria (3-MGA-uria) is a heterogeneous group of syndromes characterized by an increased excretion of 3-methylglutaconic and 3-methylglutaric acids. Five types of 3-MGA-uria (I to V) with different clinical presentations have been described. Causative mutations in TAZ, OPA3, DNAJC19, ATP12, ATP5E, and TMEM70 have been identified. After excluding the known genetic causes of 3-MGA-uria we used exome sequencing to investigate a patient with Leigh syndrome and 3-MGA-uria. We identified a homozygous variant in SERAC1 (c.202C>T; p.Arg68*), that generates a premature stop codon at position 68 of SERAC1 protein. Western blot analysis in patient's fibroblasts showed a complete absence of SERAC1 that was consistent with the prediction of a truncated protein and supports the pathogenic role of the mutation. During the course of this project a parallel study identified mutations in SERAC1 as the genetic cause of the disease in 15 patients with MEGDEL syndrome, which was compatible with the clinical and biochemical phenotypes of the patient described here. In addition, our patient developed microcephaly and optic atrophy, two features not previously reported in MEGDEL syndrome. We highlight the usefulness of exome sequencing to reveal the genetic bases of human rare diseases even if only one affected individual is available. PMID:23707711

  8. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  9. Novel alpha-conotoxins identified by gene sequencing from cone snails native to Hainan, and their sequence diversity.

    PubMed

    Luo, Sulan; Zhangsun, Dongting; Zhang, Ben; Quan, Yaru; Wu, Yong

    2006-11-01

    Conotoxins (CTX) from the venom of marine cone snails (genus Conus) represent large families of proteins, which show a similar precursor organization with a surprisingly conserved signal sequence of the precursor peptides, but highly diverse pharmacological activities. By using the conserved sequences found within the genes that encode the alpha-conotoxin precursors, a technique based on RT-PCR was used to identify, respectively, two novel peptides (LiC22, LeD2) from the two worm-hunting Conus species Conus lividus and Conus litteratus, and one novel peptide (TeA21) from the snail-hunting Conus species Conus textile, all native to Hainan in China. The three peptides share the cysteine pattern common to alpha-conotoxins of the alpha4/7 subfamily (CCX(4)CX(7)C, two disulfide bonds), which are competitive antagonists of nicotinic acetylcholine receptors (nAChRs). The cDNA of LiC22N encodes a precursor of 40 residues, including a propeptide of 19 residues and a mature peptide of 21 residues. The cDNA of LeD2N encodes a precursor of 41 residues, including a propeptide of 21 residues and a mature peptide of 16 residues with three additional Gly residues. The cDNA of TeA21N encodes a precursor of 38 residues, including a propeptide of 20 residues and a mature peptide of 17 residues with an additional residue Gly. The additional residue Gly of LeD2N and TeA21N is a prerequisite for the amidation of the preceding C-terminal Cys. All three sequences are processed at the common signal site -X-Arg- immediately before the mature peptide sequences. The properties of the alpha4/7 conotoxins known so far were discussed in detail. Phylogenetic analysis of the new conotoxins in the present study and the published homologue of alpha4/7 conotoxins from the other Conus species was performed systematically. Patterns of sequence divergence for the three regions of signal, proregion, and mature peptides, both nucleotide acids and residue substitutions in DNA and peptide levels, as well as Cys codon
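
    The cysteine pattern CCX(4)CX(7)C quoted in the abstract above can be expressed as a regular expression. The sketch below is a minimal illustration, assuming that X stands for any residue other than cysteine (the abstract does not define X); the peptide strings are hypothetical and are not the sequences reported in the study.

```python
import re

# CC, four non-Cys residues, C, seven non-Cys residues, C.
# Assumption: X excludes cysteine; the abstract does not say so.
ALPHA_4_7_FRAMEWORK = re.compile(r"CC[^C]{4}C[^C]{7}C")


def find_framework(peptide):
    """Return the (start, end) span of the first match, or None."""
    match = ALPHA_4_7_FRAMEWORK.search(peptide.upper())
    return match.span() if match else None


# Hypothetical peptides, for illustration only.
with_pattern = "GCCSNPACAGKNQHICG"    # loops of 4 and 7 residues
without_pattern = "GCCSNPCAGKNQHICG"  # first loop has only 3 residues

span_found = find_framework(with_pattern)
span_missing = find_framework(without_pattern)
```

    A search of this kind only tests loop spacing between cysteines; it says nothing about disulfide connectivity or receptor activity.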

  10. Genome sequence of a novel mitovirus identified in the phytopathogenic fungus Alternaria arborescens.

    PubMed

    Komatsu, Ken; Katayama, Yukie; Omatsu, Tsutomu; Mizutani, Tetsuya; Fukuhara, Toshiyuki; Kodama, Motoichiro; Arie, Tsutomu; Teraoka, Tohru; Moriyama, Hiromitsu

    2016-09-01

    The phytopathogenic fungus Alternaria spp. contains a variety of double-stranded RNA (dsRNA) elements of different sizes. Detailed analysis of next-generation sequencing data obtained using dsRNA purified from Alternaria arborescens, from which we had previously found Alternaria arborescens victorivirus 1, revealed the presence of another mycoviral-like dsRNA of approximately 2.5 kbp in length. When using the fungal mitochondrial genetic code, this dsRNA has a single open reading frame that potentially encodes an RNA-dependent RNA polymerase (RdRp) with significant sequence similarity to those of viruses of the genus Mitovirus. Moreover, both the 5'- and 3'-untranslated regions have the potential to fold into stable stem-loop structures, which is characteristic of mitoviruses. Pairwise comparisons and phylogenetic analysis of the deduced amino acid sequences of RdRp indicated that the virus we identified in A. arborescens is a distinct member of the genus Mitovirus in the family Narnaviridae, designated as "Alternaria arborescens mitovirus 1" (AaMV1). PMID:27368994

  11. Newly identified essential amino acid residues affecting Δ8-sphingolipid desaturase activity revealed by site-directed mutagenesis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In order to identify amino acid residues crucial for the enzymatic activity of Δ8-sphingolipid desaturases, a sequence comparison was performed among Δ8-sphingolipid desaturases and Δ6-fatty acid desaturase from various plants. In addition to the known conserved cytb5 (cytochrome b5) HPGG motif and...

  12. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  13. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  14. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  15. N-terminal sequence of amino acids and some properties of an acid-stable alpha-amylase from citric acid-koji (Aspergillus usamii var.).

    PubMed

    Suganuma, T; Tahara, N; Kitahara, K; Nagahama, T; Inuzuka, K

    1996-01-01

    An acid-stable alpha-amylase (AA) was purified from an acidic extract of citric acid-koji (A. usamii var.). The N-terminal sequence of the first 20 amino acids of the enzyme was identical with that of AA from A. niger, but the two enzymes differed in molecular weight. HPLC analysis for identifying the anomers of products indicated that the AA hydrolyzed maltopentaose (G5) at the third glycoside bond predominantly, which differed from Taka-amylase A and the neutral alpha-amylase (NA) from the citric acid-koji. PMID:8824843

  16. A method to find palindromes in nucleic acid sequences.

    PubMed

    Anjana, Ramnath; Shankar, Mani; Vaishnavi, Marthandan Kirti; Sekar, Kanagaraj

    2013-01-01

    Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these sequences, palindromic nucleic acid sequences are one such type that have been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered as a palindrome, its complementary strand must read the same in the opposite direction. For example, both the strands, i.e., the strand going from 5' to 3' and its complementary strand from 3' to 5', must be complementary. A typical nucleotide palindromic sequence would be TATA (5' to 3') and its complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby it increases the overall speed and efficiency. The proposed method has been tested using the bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb) and the computation time for finding the palindromic sequences is in milliseconds. PMID:23515654
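
    The palindrome definition in the abstract above (a sequence that equals its own reverse complement, such as TATA) can be checked with a minimal sketch. This is a brute-force illustration and not the dynamic-programming method the authors describe; the function names and the length limits are assumptions chosen for the example.

```python
# Watson-Crick complements for upper-case DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read in the 5' to 3' direction."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True if seq reads the same as its reverse complement."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_palindromes(seq, min_len=4, max_len=8):
    """Brute-force scan returning (start, subsequence) pairs.

    Results are ordered by length, then by start position.
    """
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if is_palindrome(window):
                hits.append((start, window))
    return hits


hits = find_palindromes("GGTATACC")
```

    The scan examines every window of every allowed length, so it is far slower than the approach reported in the abstract and is meant only to make the definition concrete.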

  17. Fatal Psychrobacter sp. infection in a pediatric patient with meningitis identified by metagenomic next-generation sequencing in cerebrospinal fluid.

    PubMed

    Ortiz-Alcántara, Joanna María; Segura-Candelas, José Miguel; Garcés-Ayala, Fabiola; Gonzalez-Durán, Elizabeth; Rodríguez-Castillo, Araceli; Alcántara-Pérez, Patricia; Wong-Arámbula, Claudia; González-Villa, Maribel; León-Ávila, Gloria; García-Chéquer, Adda Jeanette; Diaz-Quiñonez, José Alberto; Méndez-Tenorio, Alfonso; Ramírez-González, José Ernesto

    2016-03-01

    The genus Psychrobacter contains environmental, psychrophilic and halotolerant gram-negative bacteria considered rare opportunistic pathogens in humans. Metagenomics was performed on the cerebrospinal fluid (CSF) of a pediatric patient with meningitis. Nucleic acids were extracted, randomly amplified, and sequenced with the 454 GS FLX Titanium next-generation sequencing (NGS) system. Sequencing reads were assembled, and potential virulence genes were predicted. Phylogenomic and phylogenetic studies were performed. Psychrobacter sp. 310 was identified, and several virulence genes characteristic of pathogenic bacteria were found. The phylogenomic study and 16S rRNA gene phylogenetic analysis showed that the closest relative of Psychrobacter sp. 310 was Psychrobacter sanguinis. To our knowledge, this is the first report of a meningitis case associated with Psychrobacter sp. identified by NGS metagenomics in CSF from a pediatric patient. The metagenomic strategy based on NGS was a powerful tool to identify a rare unknown pathogen in a clinical case. PMID:26546315

  18. Identifiability of PBPK models with applications to dimethylarsinic acid exposure.

    PubMed

    Garcia, Ramon I; Ibrahim, Joseph G; Wambaugh, John F; Kenyon, Elaina M; Setzer, R Woodrow

    2015-12-01

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss different types of identifiability that occur in PBPK models and give reasons why they occur. We particularly focus on how the mathematical structure of a PBPK model and lack of appropriate data can lead to statistical models in which it is impossible to estimate at least some parameters precisely. Methods are reviewed which can determine whether a purely linear PBPK model is globally identifiable. We propose a theorem which determines when identifiability at a set of finite and specific values of the mathematical PBPK model (global discrete identifiability) implies identifiability of the statistical model. However, we are unable to establish conditions that imply global discrete identifiability, and conclude that the only safe approach to analysis of PBPK models involves Bayesian analysis with truncated priors. Finally, computational issues regarding posterior simulations of PBPK models are discussed. The methodology is very general and can be applied to numerous PBPK models which can be expressed as linear time-invariant systems. A real data set of a PBPK model for exposure to dimethylarsinic acid (DMA(V)) is presented to illustrate the proposed methodology. PMID:26194069

  19. Whole exome sequencing identifies a mutation for a novel form of corneal intraepithelial dyskeratosis

    PubMed Central

    Soler, Vincent José; Tran-Viet, Khanh-Nhat; Galiacy, Stéphane D; Limviphuvadh, Vachiranee; Klemm, Thomas Patrick; St Germain, Elizabeth; Fournié, Pierre R; Guillaud, Céline; Maurer-Stroh, Sebastian; Hawthorne, Felicia; Suarez, Cyrielle; Kantelip, Bernadette; Afshari, Natalie A; Creveaux, Isabelle; Luo, Xiaoyan; Meng, Weihua; Calvas, Patrick; Cassagne, Myriam; Arné, Jean-Louis; Rozen, Steven G; Malecaze, François; Young, Terri L

    2014-01-01

    Background Corneal intraepithelial dyskeratosis is an extremely rare condition. The classical form, affecting Native American Haliwa-Saponi tribe members, is called hereditary benign intraepithelial dyskeratosis (HBID). Herein, we present a new form of corneal intraepithelial dyskeratosis for which we identified the causative gene by using deep sequencing technology. Methods and results A seven-member Caucasian French family with two corneal intraepithelial dyskeratosis affected individuals (6-year-old proband and his mother) was ascertained. The proband presented with bilateral complete corneal opacification and dyskeratosis. Palmoplantar hyperkeratosis and laryngeal dyskeratosis were associated with the phenotype. Histopathology studies of cornea and vocal cord biopsies showed dyskeratotic keratinisation. Quantitative PCR ruled out 4q35 duplication, classically described in HBID cases. Next generation sequencing with mean coverage of 50× using the Illumina HiSeq and whole exome capture processing was performed. Sequence reads were aligned, and screened for single nucleotide variants and insertion/deletion calls. In-house pipeline filtering analyses and comparisons with available databases were performed. A novel missense mutation M77T was discovered for the gene NLRP1 which maps to chromosome 17p13.2. This was a de novo mutation in the proband’s mother, following segregation in the family, and not found in 738 control DNA samples. NLRP1 expression was determined in adult corneal epithelium. The amino acid change was found to significantly destabilise the protein structure. Conclusions We describe a new corneal intraepithelial dyskeratosis and how we identified its causative gene. The NLRP1 gene product is implicated in inflammation, autoimmune disorders, and caspase mediated apoptosis. NLRP1 polymorphisms are associated with various diseases. PMID:23349227

  20. First Genome Sequences of Porcine Parvovirus 5 Strains Identified in Polish Pigs

    PubMed Central

    Fan, Jinghui; Cui, Jin; Gerber, Priscilla F.; Biernacka, Kinga; Stadejek, Tomasz

    2016-01-01

    Porcine parvovirus type 5 (PPV5) has been recently identified. Here, we report the genome sequences of five PPV5 strains identified in serum samples from Polish pigs, which represent the first PPV5 sequences recovered from European pigs. The PPV5 strains isolated in Poland are most related to the Chinese strain HN01. PMID:27587805

  1. First Genome Sequences of Porcine Parvovirus 5 Strains Identified in Polish Pigs.

    PubMed

    Fan, Jinghui; Cui, Jin; Gerber, Priscilla F; Biernacka, Kinga; Stadejek, Tomasz; Opriessnig, Tanja

    2016-01-01

    Porcine parvovirus type 5 (PPV5) has been recently identified. Here, we report the genome sequences of five PPV5 strains identified in serum samples from Polish pigs, which represent the first PPV5 sequences recovered from European pigs. The PPV5 strains isolated in Poland are most related to the Chinese strain HN01. PMID:27587805

  2. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  3. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  4. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
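
    The comparison of observed and expected triplet frequencies can be sketched as follows. This is only an illustration: the abstract does not state how expected frequencies were computed, so the sketch assumes a simple independence model (product of single-residue frequencies), and the function name `triplet_ratios` is hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_ratios(proteins):
    """Observed/expected count ratio for every amino acid triplet.

    Assumption (not from the paper): the expected count of a triplet is
    the product of its three single-residue frequencies times the total
    number of triplet windows in the data set.
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for protein in proteins:
        residue_counts.update(protein)
        triplet_counts.update(
            protein[i:i + 3] for i in range(len(protein) - 2)
        )
    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    ratios = {}
    for a, b, c in product(freq, repeat=3):
        expected = freq[a] * freq[b] * freq[c] * total_triplets
        ratios[a + b + c] = triplet_counts[a + b + c] / expected
    return ratios


if __name__ == "__main__":
    toy_proteome = ["MKVLAAGIV", "MKKLLAAGV"]  # made-up sequences
    ratios = triplet_ratios(toy_proteome)
    lowest = sorted(ratios, key=ratios.get)[:5]
    print(lowest)
```

    Triplets with a ratio far below 1 would be candidates for underrepresentation under this model; a real analysis would also need a significance test, which is omitted here.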

  5. Whole Exome Sequencing Identifies RAI1 Mutation in a Morbidly Obese Child Diagnosed With ROHHAD Syndrome

    PubMed Central

    Esteves, Kristyn M.; Towne, Meghan C.; Brownstein, Catherine A.; James, Philip M.; Crowley, Laura; Hirschhorn, Joel N.; Elsea, Sarah H.; Beggs, Alan H.; Picker, Jonathan

    2015-01-01

    Context: The current obesity epidemic is attributed to complex interactions between genetic and environmental factors. However, a limited number of cases, especially those with early-onset severe obesity, are linked to single gene defects. Rapid-onset obesity with hypothalamic dysfunction, hypoventilation and autonomic dysregulation (ROHHAD) is one of the syndromes that presents with abrupt-onset extreme weight gain with an unknown genetic basis. Objective: To identify the underlying genetic etiology in a child with morbid early-onset obesity, hypoventilation, and autonomic and behavioral disturbances who was clinically diagnosed with ROHHAD syndrome. Design/Setting/Intervention: The index patient was evaluated at an academic medical center. Whole-exome sequencing was performed on the proband and his parents. Genetic variants were validated by Sanger sequencing. Results: We identified a novel de novo nonsense mutation, c.3265 C>T (p.R1089X), in the retinoic acid-induced 1 (RAI1) gene in the proband. Mutations in the RAI1 gene are known to cause Smith-Magenis syndrome (SMS). On further evaluation, his clinical features were not typical of either SMS or ROHHAD syndrome. Conclusions: This study identifies a de novo RAI1 mutation in a child with morbid obesity and a clinical diagnosis of ROHHAD syndrome. Although extreme early-onset obesity, autonomic disturbances, and hypoventilation are present in ROHHAD, several of the clinical findings are consistent with SMS. This case highlights the challenges in the diagnosis of ROHHAD syndrome and its potential overlap with SMS. We also propose RAI1 as a candidate gene for children with morbid obesity. PMID:25781356

  6. On Quantum Algorithm for Multiple Alignment of Amino Acid Sequences

    NASA Astrophysics Data System (ADS)

    Iriyama, Satoshi; Ohya, Masanori

    2009-02-01

    The alignment of genome sequences or amino acid sequences is one of the fundamental operations for the study of life. The usual computational complexity for the multiple alignment of N sequences with common length L by dynamic programming is O(L^N). This alignment is considered as one of the NP problems, so that it is desirable to find a nice algorithm of the multiple alignment. Thus in this paper we propose the quantum algorithm for the multiple alignment based on the works [12, 1, 2] in which the NP complete problem was shown to be the P problem by means of quantum algorithm and chaos information dynamics.
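
    To make the cost of classical dynamic programming concrete, the following Python sketch scores a global alignment of two sequences (the N = 2 case) and counts the table cells a full N-sequence table would need, which grows as (L + 1)^N. It is a standard Needleman-Wunsch-style recurrence, not the quantum algorithm proposed in the paper; the scoring values and function names are illustrative assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment of two sequences.

    Only two rows of the (len(a)+1) x (len(b)+1) table are kept in
    memory. The scoring values are arbitrary illustrative defaults.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def dp_cells(length, n_sequences):
    """Cells in a full DP table for n_sequences of the given length."""
    return (length + 1) ** n_sequences


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    for n in (2, 3, 4):
        print(n, dp_cells(100, n))
```

    The cell count shows why exact multiple alignment quickly becomes impractical as the number of sequences grows, even for short sequences.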

  7. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  8. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  9. Anchored pseudo-de novo assembly of human genomes identifies extensive sequence variation from unmapped sequence reads.

    PubMed

    Faber-Hammond, Joshua J; Brown, Kim H

    2016-07-01

    The human genome reference (HGR) completion marked the genomics era beginning, yet despite its utility universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads failing to map within the HGR. Sequences failing to map generally represent 2-5 % of total reads, which may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. 30,879 contigs are represented in multiple individuals with ~40 % showing high sequence complexity. Genomic coordinates were generated for 99.9 %, with 52.5 % exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly our data highlight that with this method low coverage (~10-20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine. PMID:27061184

  10. Amino acid sequence of Salmonella typhimurium branched-chain amino acid aminotransferase.

    PubMed

    Feild, M J; Nguyen, D C; Armstrong, F B

    1989-06-13

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase (transaminase B, EC 2.6.1.42) of Salmonella typhimurium was determined. An Escherichia coli recombinant containing the ilvGEDAY gene cluster of Salmonella was used as the source of the hexameric enzyme. The peptide fragments used for sequencing were generated by treatment with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. The enzyme subunit contains 308 residues and has a molecular weight of 33,920. To determine the coenzyme-binding site, the pyridoxal 5'-phosphate-containing enzyme was treated with tritiated sodium borohydride prior to trypsin digestion. Peptide map comparisons with an apoenzyme tryptic digest and monitoring radioactivity incorporation allowed identification of the pyridoxylated peptide, which was then isolated and sequenced. The coenzyme-binding site is the lysyl residue at position 159. The amino acid sequence of Salmonella transaminase B is 97.4% identical with that of Escherichia coli, differing in only eight amino acid positions. Sequence comparisons of transaminase B to other known aminotransferase sequences revealed limited sequence similarity (24-33%) when conserved amino acid substitutions were allowed and alignments were forced to occur on the coenzyme-binding site. PMID:2669973
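
    The 97.4% figure follows directly from the numbers in the abstract: eight differences over 308 residues leave 300 matches, and 300/308 is about 97.4%. A minimal Python sketch of the calculation is given below. It assumes an ungapped, equal-length alignment, which the abstract does not state, and the placeholder sequences are synthetic, not the real enzyme sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions with identical residues.

    Assumes the two sequences are already aligned without gaps and
    have equal length (an assumption made for this illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Synthetic stand-ins: 308 positions, 8 of which differ.
    salmonella_like = "A" * 308
    ecoli_like = "A" * 300 + "G" * 8
    print(round(percent_identity(salmonella_like, ecoli_like), 1))
```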

  11. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  12. The complete genomic sequence of a tentative new polerovirus identified in barley in South Korea.

    PubMed

    Zhao, Fumei; Lim, Seungmo; Yoo, Ran Hee; Igori, Davaajargal; Kim, Sang-Min; Kwak, Do Yeon; Kim, Sun Lim; Lee, Bong Choon; Moon, Jae Sun

    2016-07-01

    The complete nucleotide sequence of a new barley polerovirus, tentatively named barley virus G (BVG), which was isolated in Gimje, South Korea, has been determined using an RNA sequencing technique combined with polymerase chain reaction methods. The viral genomic RNA of BVG is 5,620 nucleotides long and contains six typical open reading frames commonly observed in other poleroviruses. Sequence comparisons revealed that BVG is most closely related to maize yellow dwarf virus-RMV, with the highest amino acid identities being less than 90 % for all of the corresponding proteins. These results suggested that BVG is a member of a new species in the genus Polerovirus. PMID:27146139

  13. Transcriptome Sequencing of Chemically Induced Aquilaria sinensis to Identify Genes Related to Agarwood Formation

    PubMed Central

    Ye, Wei; Wu, Hongqing; He, Xin; Wang, Lei; Zhang, Weimin; Li, Haohua; Fan, Yunfei; Tan, Guohui; Liu, Taomei; Gao, Xiaoxia

    2016-01-01

    Background Agarwood is a traditional Chinese medicine used as a clinical sedative, carminative, and antiemetic drug. Agarwood is formed in Aquilaria sinensis when A. sinensis trees are threatened by external physical, chemical injury or endophytic fungal irritation. However, the mechanism of agarwood formation via chemical induction remains unclear. In this study, we characterized the transcriptome of different parts of a chemically induced A. sinensis trunk sample with agarwood. The Illumina sequencing platform was used to identify the genes involved in agarwood formation. Methodology/Principal Findings A five-year-old Aquilaria sinensis treated by formic acid was selected. The white wood part (B1 sample), the transition part between agarwood and white wood (W2 sample), the agarwood part (J3 sample), and the rotten wood part (F5 sample) were collected for transcriptome sequencing. Accordingly, 54,685,634 clean reads, which were assembled into 83,467 unigenes, were obtained with a Q20 value of 97.5%. A total of 50,565 unigenes were annotated using the Nr, Nt, SWISS-PROT, KEGG, COG, and GO databases. In particular, 171,331,352 unigenes were annotated by various pathways, including the sesquiterpenoid (ko00909) and plant–pathogen interaction (ko03040) pathways. These pathways were related to sesquiterpenoid biosynthesis and defensive responses to chemical stimulation. Conclusions/Significance The transcriptome data of the different parts of the chemically induced A. sinensis trunk provide a rich source of materials for discovering and identifying the genes involved in sesquiterpenoid production and in defensive responses to chemical stimulation. This study is the first to use de novo sequencing and transcriptome assembly for different parts of chemically induced A. sinensis. Results demonstrate that the sesquiterpenoid biosynthesis pathway and WRKY transcription factor play important roles in agarwood formation via chemical induction. The comparative analysis of

  14. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  15. DNA affinity labeling of adenovirus type 2 upstream promoter sequence-binding factors identifies two distinct proteins

    SciTech Connect

    Safer, B.; Cohen, R.B.; Garfinkel, S.; Thompson, J.A.

    1988-01-01

    A rapid affinity labeling procedure with enhanced specificity was developed to identify DNA-binding proteins. ³²P was first introduced at unique phosphodiester bonds within the DNA recognition sequence. UV light-dependent cross-linking of pyrimidines to amino acid residues in direct contact at the binding site, followed by micrococcal nuclease digestion, resulted in the transfer of ³²P to only those specific protein(s) which recognized the binding sequence. This method was applied to the detection and characterization of proteins that bound to the upstream promoter sequence (-50 to -66) of the human adenovirus type 2 major late promoter. We detected two distinct proteins with molecular weights of 45,000 and 116,000 that interacted with this promoter element. The two proteins differed significantly in their chromatographic and cross-linking behaviors.

  16. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B; Bairoch, A

    1993-01-01

    301 glycosyl hydrolases and related enzymes corresponding to 39 EC entries of the I.U.B. classification system have been classified into 35 families on the basis of amino-acid-sequence similarities [Henrissat (1991) Biochem. J. 280, 309-316]. Approximately half of the families were found to be monospecific (containing only one EC number), whereas the other half were found to be polyspecific (containing at least two EC numbers). A > 60% increase in sequence data for glycosyl hydrolases (181 additional enzymes or enzyme domains sequences have since become available) allowed us to update the classification not only by the addition of more members to already identified families, but also by the finding of ten new families. On the basis of a comparison of 482 sequences corresponding to 52 EC entries, 45 families, out of which 22 are polyspecific, can now be defined. This classification has been implemented in the SWISS-PROT protein sequence data bank. PMID:8352747

  17. MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island.

    PubMed

    Ashton, Philip M; Nair, Satheesh; Dallman, Tim; Rubino, Salvatore; Rabsch, Wolfgang; Mwaigwisya, Solomon; Wain, John; O'Grady, Justin

    2015-03-01

    Short-read, high-throughput sequencing technology cannot identify the chromosomal position of repetitive insertion sequences that typically flank horizontally acquired genes such as bacterial virulence genes and antibiotic resistance genes. The MinION nanopore sequencer can produce long sequencing reads on a device similar in size to a USB memory stick. Here we apply a MinION sequencer to resolve the structure and chromosomal insertion site of a composite antibiotic resistance island in Salmonella Typhi Haplotype 58. Nanopore sequencing data from a single 18-h run was used to create a scaffold for an assembly generated from short-read Illumina data. Our results demonstrate the potential of the MinION device in clinical laboratories to fully characterize the epidemic spread of bacterial pathogens. PMID:25485618

  18. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  19. Amino acid sequence of the Amur tiger prion protein.

    PubMed

    Wu, Changde; Pang, Wanyong; Zhao, Deming

    2006-10-01

    Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that the polymorphisms within the open reading frame (ORF) of PrP are associated with the susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank. PMID:16780982

  20. New monoclonal antibodies to the Ebola virus glycoprotein: Identification and analysis of the amino acid sequence of the variable domains.

    PubMed

    Panina, A A; Aliev, T K; Shemchukova, O B; Dement'yeva, I G; Varlamov, N E; Pozdnyakova, L P; Bokov, M N; Dolgikh, D A; Sveshnikov, P G; Kirpichnikov, M P

    2016-03-01

    We determined the nucleotide and amino acid sequences of variable domains of three new monoclonal antibodies to the glycoprotein of Ebola virus capsid. The framework and hypervariable regions of immunoglobulin heavy and light chains were identified. The primary structures were confirmed using mass spectrometry analysis. Immunoglobulin database search showed the uniqueness of the sequences obtained. PMID:27193713

  1. Exome sequencing identifies a novel SMCHD1 mutation in facioscapulohumeral muscular dystrophy 2

    PubMed Central

    Mitsuhashi, Satomi; Boyden, Steven E; Estrella, Elicia A; Jones, Takako I; Rahimov, Fedik; Yu, Timothy W; Darras, Basil T; Amato, Anthony A; Folkerth, Rebecca D; Jones, Peter L; Kunkel, Louis M; Kang, Peter B

    2013-01-01

    FSHD2 is a rare form of facioscapulohumeral muscular dystrophy (FSHD) characterized by the absence of a contraction in the D4Z4 macrosatellite repeat region on chromosome 4q35 that is the hallmark of FSHD1. However, hypomethylation of this region is common to both subtypes. Recently, mutations in SMCHD1 combined with a permissive 4q35 allele were reported to cause FSHD2. We identified a novel p.Lys275del SMCHD1 mutation in a family affected with FSHD2 using whole-exome sequencing and linkage analysis. This mutation alters a highly conserved amino acid in the ATPase domain of SMCHD1. Subject III-11 is a male who developed asymmetrical muscle weakness characteristic of FSHD at 13 years. Physical examination revealed marked bilateral atrophy at biceps brachii, bilateral scapular winging, some asymmetrical weakness at tibialis anterior and peroneal muscles, and mild lower facial weakness. Biopsy of biceps brachii in subject II-5, the father of III-11, demonstrated lobulated fibers and dystrophic changes. Endomysial and perivascular inflammation was found, which has been reported in FSHD1 but not FSHD2. Given the previous report of SMCHD1 mutations in FSHD2 and the clinical presentations consistent with the FSHD phenotype, we conclude that the SMCHD1 mutation is the likely cause of the disease in this family. PMID:24128691

  2. Functional Brain Activation Differences in Stuttering Identified with a Rapid fMRI Sequence

    ERIC Educational Resources Information Center

    Loucks, Torrey; Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

    2011-01-01

    The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech…

  3. Transitive Homology-Guided Structural Studies Lead to Discovery of Cro Proteins With 40% Sequence Identity But Different Folds

    SciTech Connect

    Roessler, C.G.; Hall, B.M.; Anderson, W.J.; Ingram, W.M.; Roberts, S.A.; Montfort, W.R.; Cordes, M.H.J.

    2009-05-27

    Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well chosen subset. Here, we employ a 'stepping-stone' method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ~25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization.
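
    A figure such as "40% sequence identity" is usually computed from a pairwise alignment. The sketch below is illustrative only: the function name, the toy sequences, and the choice to count only gap-free columns in the denominator are assumptions, not details taken from the record above.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the two strings are already aligned (equal length, '-' for gaps).
    Other conventions divide by the full alignment length or the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real Cro sequences): 4 of 5 columns match.
print(percent_identity("MKVLA", "MKILA"))
```

    Because the denominator convention changes the result, identity values from different tools are only comparable when the convention is stated.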

  4. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
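
    The k-mer sequence features mentioned above can be pictured as counts of every length-k substring in a region. The sketch below is a minimal illustration, not the kmer-SVM implementation: the function names are hypothetical, and collapsing each k-mer with its reverse complement is an assumption made here so that both DNA strands map to one feature. The resulting counts would then be passed to a classifier such as an SVM, which is not shown.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int, collapse_strands: bool = True) -> Counter:
    """Count k-mers in seq, skipping windows with non-ACGT characters.

    With collapse_strands=True (an assumption of this sketch), a k-mer and its
    reverse complement are counted under whichever sorts first alphabetically.
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) - set("ACGT"):
            continue  # ambiguous base such as N: skip this window
        if collapse_strands:
            kmer = min(kmer, reverse_complement(kmer))
        counts[kmer] += 1
    return counts


# Hypothetical toy sequence, k = 2.
print(dict(kmer_counts("ACGTAC", 2)))
```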

  5. Intravenous phage display identifies peptide sequences that target the burn-injured intestine

    PubMed Central

    Costantini, Todd W.; Eliceiri, Brian P.; Putnam, James G.; Bansal, Vishal; Baird, Andrew; Coimbra, Raul

    2015-01-01

    The injured intestine is responsible for significant morbidity and mortality after severe trauma and burn; however, targeting the intestine with therapeutics aimed at decreasing injury has proven difficult. We hypothesized that we could use intravenous phage display technology to identify peptide sequences that target the injured intestinal mucosa in a murine model, and then confirm the cross-reactivity of this peptide sequence with ex vivo human gut. Four hours following 30% TBSA burn we performed an in vivo, intravenous systemic administration of phage library containing 10^12 phage in BALB/c mice to biopan for gut-targeting peptides. In vivo assessment of the candidate peptide sequences identified after 4 rounds of internalization was performed by injecting 1 × 10^12 copies of each selected phage clone into sham or burned animals. Internalization into the gut was assessed using quantitative polymerase chain reaction. We then incubated this gut-targeting peptide sequence with human intestine and visualized fluorescence using confocal microscopy. We identified 3 gut-targeting peptide sequences which caused collapse of the phage library (4–1: SGHQLLLNKMP, 4–5: ILANDLTAPGPR, 4–11: SFKPSGLPAQSL). Sequence 4–5 was internalized into the intestinal mucosa of burned animals 9.3-fold higher than sham animals injected with the same sequence (2.9 × 10^5 vs. 3.1 × 10^4 particles per mg tissue). Sequences 4–1 and 4–11 were both internalized into the gut, but did not demonstrate specificity for the injured mucosa. Phage sequence 4–11 demonstrated cross-reactivity with human intestine. In the future, this gut-targeting peptide sequence could serve as a platform for the delivery of biotherapeutics. PMID:22960048

  6. Plant RNA virus sequences identified in kimchi by microbial metatranscriptome analysis.

    PubMed

    Kim, Dong Seon; Jung, Ji Young; Wang, Yao; Oh, Hye Ji; Choi, Dongjin; Jeon, Che Ok; Hahn, Yoonsoo

    2014-07-01

    Plant pathogenic RNA viruses are present in a variety of plant-based foods. When ingested by humans, these viruses can survive the passage through the digestive tract, and are frequently detected in human feces. Kimchi is a traditional fermented Korean food made from cabbage or vegetables, with a variety of other plant-based ingredients, including ground red pepper and garlic paste. We analyzed microbial metatranscriptome data from kimchi at five fermentation stages to identify plant RNA virus-derived sequences. We successfully identified a substantial amount of plant RNA virus sequences, especially during the early stages of fermentation: 23.47% and 16.45% of total clean reads on days 7 and 13, respectively. The most abundant plant RNA virus sequences were from pepper mild mottle virus, a major pathogen of red peppers; this constituted 95% of the total RNA virus sequences identified throughout the fermentation period. We observed distinct sequencing read-depth distributions for plant RNA virus genomes, possibly implying intrinsic and/or technical biases during the metatranscriptome generation procedure. We also identified RNA virus sequences in publicly available microbial metatranscriptome data sets. We propose that metatranscriptome data may serve as a valuable resource for RNA virus detection, and a systematic screening of the ingredients may help prevent the use of virus-infected low-quality materials for food production. PMID:24836186

  7. Close sequence comparisons are sufficient to identify human cis-regulatory elements.

    PubMed

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M; Couronne, Olivier; Pennacchio, Len A

    2006-07-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons. To address this problem, we identified evolutionarily conserved noncoding regions in primate, mammalian, and more distant comparisons using a uniform approach (Gumby) that facilitates unbiased assessment of the impact of evolutionary distance on predictive power. We benchmarked computational predictions against previously identified cis-regulatory elements at diverse genomic loci and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using an in vivo enhancer assay in transgenic mice. Human regulatory elements were identified with acceptable sensitivity (53%-80%) and true-positive rate (27%-67%) by comparison with one to five other eutherian mammals or six other simian primates. More distant comparisons (marsupial, avian, amphibian, and fish) failed to identify many of the empirically defined functional noncoding elements. Our results highlight the practical utility of close sequence comparisons, and the loss of sensitivity entailed by more distant comparisons. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole-genome comparative analysis that explains most of the observations from empirical benchmarking. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for in vivo testing at embryonic time points. PMID:16769978

  8. Targeted Next Generation Sequencing Identifies Clinically Actionable Mutations in Patients with Melanoma

    PubMed Central

    Jeck, William R.; Parker, Joel; Carson, Craig C.; Shields, Janiel M.; Sambade, Maria J.; Peters, Eldon C.; Burd, Christin E.; Thomas, Nancy E.; Chiang, Derek Y.; Liu, Wenjin; Eberhard, David A.; Ollila, David; Grilley-Olson, Juneko; Moschos, Stergios; Hayes, D. Neil; Sharpless, Norman E.

    2014-01-01

    Somatic sequencing of cancers has produced new insight into tumorigenesis, tumor heterogeneity, and disease progression, but the vast majority of genetic events identified are of indeterminate clinical significance. Here we describe a NextGen sequencing approach to fully analyze 248 genes, including all those of known clinical significance in melanoma. This strategy features solution capture of DNA followed by multiplexed, high-throughput sequencing, and was evaluated in 31 melanoma cell lines and 18 tumor tissues from patients with metastatic melanoma. Mutations in melanoma cell lines correlated with their sensitivity to corresponding small molecule inhibitors, confirming, for example, lapatinib sensitivity in ERBB4 mutant lines and identifying a novel activating mutation of BRAF. The latter event would not have been identified by clinical sequencing and was associated with responsiveness to a BRAF kinase inhibitor. This approach identified focal copy number changes of PTEN not found by standard methods, such as comparative genomic hybridization (CGH). Actionable mutations were found in 89% of the tumor tissues analyzed, 56% of which would not be identified by standard-of-care approaches. This work shows that targeted sequencing is an attractive approach for clinical use in melanoma. PMID:24628946

  9. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species of which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in the intron 1. In 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was identically observed only in ovine, and one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. Especially the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences. PMID:18397498
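
    Locating a short signal such as ATTAAA or TTTTTAT while tolerating one or two mismatches can be sketched as a sliding-window comparison. The code below is an illustration only: the function name and the toy sequence are hypothetical, and it is not the procedure used in the study above.

```python
def find_motif(seq: str, motif: str, max_mismatches: int = 0) -> list:
    """Return (position, mismatches) for each window within max_mismatches of motif.

    Positions are 0-based offsets into seq; only substitutions are counted
    (no insertions or deletions), which is an assumption of this sketch.
    """
    seq, motif = seq.upper(), motif.upper()
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(1 for a, b in zip(window, motif) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


# Hypothetical toy 3'-UTR fragment (not a real PRNP sequence).
utr = "GGATTAAACCTTTTCATGG"
print(find_motif(utr, "ATTAAA"))                      # exact matches only
print(find_motif(utr, "TTTTTAT", max_mismatches=1))   # allow one substitution
```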

  10. Identifying New Drug Targets for Potent Phospholipase D Inhibitors: Combining Sequence Alignment, Molecular Docking, and Enzyme Activity/Binding Assays.

    PubMed

    Djakpa, Helene; Kulkarni, Aditya; Barrows-Murphy, Scheneque; Miller, Greg; Zhou, Weihong; Cho, Hyejin; Török, Béla; Stieglitz, Kimberly

    2016-05-01

    Phospholipase D enzymes cleave phospholipid substrates generating choline and phosphatidic acid. Phospholipase D from Streptomyces chromofuscus is a non-HKD (histidine, lysine, and aspartic acid) phospholipase D as the enzyme is more similar to members of the diverse family of metallo-phosphodiesterase/phosphatase enzymes than phospholipase D enzymes with active site HKD repeats. A highly efficient library of phospholipase D inhibitors based on 1,3-disubstituted-4-amino-pyrazolopyrimidine core structure was utilized to evaluate the inhibition of purified S. chromofuscus phospholipase D. The molecules exhibited inhibition of phospholipase D activity (IC50) in the nanomolar range with monomeric substrate diC4PC and micromolar range with phospholipid micelles and vesicles. Binding studies with vesicle substrate and phospholipase D strongly indicate that these inhibitors directly block enzyme vesicle binding. Following these compelling results as a starting point, sequence searches and alignments with S. chromofuscus phospholipase D have identified potential new drug targets. Using AutoDock, inhibitors were docked into the enzymes selected from sequence searches and alignments (when 3D co-ordinates were available) and results analyzed to develop next-generation inhibitors for new targets. In vitro enzyme activity assays with several human phosphatases demonstrated that the predictive protocol was accurate. The strategy of combining sequence comparison, docking, and high-throughput screening assays has helped to identify new drug targets and provided some insight into how to make potential inhibitors more specific to desired targets. PMID:26691755

  11. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  12. An integrated system for identifying the hidden assassins in traditional medicines containing aristolochic acids

    NASA Astrophysics Data System (ADS)

    Wu, Lan; Sun, Wei; Wang, Bo; Zhao, Haiyu; Li, Yaoli; Cai, Shaoqing; Xiang, Li; Zhu, Yingjie; Yao, Hui; Song, Jingyuan; Cheng, Yung-Chi; Chen, Shilin

    2015-08-01

    Traditional herbal medicines adulterated and contaminated with plant materials from the Aristolochiaceae family, which contain aristolochic acids (AAs), cause aristolochic acid nephropathy. Approximately 256 traditional Chinese patent medicines, containing Aristolochiaceous materials, are still being sold in Chinese markets today. In order to protect consumers from health risks due to AAs, the hidden assassins, efficient methods to differentiate Aristolochiaceous herbs from their putative substitutes need to be established. In this study, 158 Aristolochiaceous samples representing 46 species and four genera as well as 131 non-Aristolochiaceous samples representing 33 species, 20 genera and 12 families were analyzed using DNA barcodes based on the ITS2 and psbA-trnH sequences. Aristolochiaceous materials and their non-Aristolochiaceous substitutes were successfully identified using BLAST1, the nearest distance method and the neighbor-joining (NJ) tree. In addition, based on sequence information of ITS2, we developed a Real-Time PCR assay which successfully identified herbal material from the Aristolochiaceae family. Using Ultra High Performance Liquid Chromatography-Mass Spectrometer (UHPLC-HR-MS), we demonstrated that most representatives from the Aristolochiaceae family contain toxic AAs. Therefore, integrated DNA barcodes, Real-Time PCR assays using TaqMan probes and UHPLC-HR-MS system provides an efficient and reliable authentication system to protect consumers from health risks due to the hidden assassins (AAs).

  13. An integrated system for identifying the hidden assassins in traditional medicines containing aristolochic acids

    PubMed Central

    Wu, Lan; Sun, Wei; Wang, Bo; Zhao, Haiyu; Li, Yaoli; Cai, Shaoqing; Xiang, Li; Zhu, Yingjie; Yao, Hui; Song, Jingyuan; Cheng, Yung-Chi; Chen, Shilin

    2015-01-01

    Traditional herbal medicines adulterated and contaminated with plant materials from the Aristolochiaceae family, which contain aristolochic acids (AAs), cause aristolochic acid nephropathy. Approximately 256 traditional Chinese patent medicines, containing Aristolochiaceous materials, are still being sold in Chinese markets today. In order to protect consumers from health risks due to AAs, the hidden assassins, efficient methods to differentiate Aristolochiaceous herbs from their putative substitutes need to be established. In this study, 158 Aristolochiaceous samples representing 46 species and four genera as well as 131 non-Aristolochiaceous samples representing 33 species, 20 genera and 12 families were analyzed using DNA barcodes based on the ITS2 and psbA-trnH sequences. Aristolochiaceous materials and their non-Aristolochiaceous substitutes were successfully identified using BLAST1, the nearest distance method and the neighbor-joining (NJ) tree. In addition, based on sequence information of ITS2, we developed a Real-Time PCR assay which successfully identified herbal material from the Aristolochiaceae family. Using Ultra High Performance Liquid Chromatography-Mass Spectrometer (UHPLC-HR-MS), we demonstrated that most representatives from the Aristolochiaceae family contain toxic AAs. Therefore, integrated DNA barcodes, Real-Time PCR assays using TaqMan probes and UHPLC-HR-MS system provides an efficient and reliable authentication system to protect consumers from health risks due to the hidden assassins (AAs). PMID:26270958

  14. A machine learning strategy to identify candidate binding sites in human protein-coding sequence

    PubMed Central

    Down, Thomas; Leong, Bernard; Hubbard, Tim JP

    2006-01-01

    Background: The splicing of RNA transcripts is thought to be partly promoted and regulated by sequences embedded within exons. Known sequences include binding sites for SR proteins, which are thought to mediate interactions between splicing factors bound to the 5' and 3' splice sites. It would be useful to identify further candidate sequences; however, identifying them computationally is hard since exon sequences are also constrained by their functional role in coding for proteins. Results: This strategy identified a collection of motifs including several previously reported splice enhancer elements. Although only trained on coding exons, the model discriminates both coding and non-coding exons from intragenic sequence. Conclusion: We have trained a computational model able to detect signals in coding exons which seem to be orthogonal to the sequences' primary function of coding for proteins. We believe that many of the motifs detected here represent binding sites for both previously unrecognized proteins which influence RNA splicing as well as other regulatory elements. PMID:17002805

  15. Correlation between fibroin amino acid sequence and physical silk properties.

    PubMed

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

    2003-09-12

    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet. PMID:12816957

  16. Amino acid sequence of the nonsecretory ribonuclease of human urine.

    PubMed

    Beintema, J J; Hofsteenge, J; Iwama, M; Morita, T; Ohgi, K; Irie, M; Sugiyama, R H; Schieven, G L; Dekker, C A; Glitz, D G

    1988-06-14

    The amino acid sequence of a nonsecretory ribonuclease isolated from human urine was determined except for the identity of the residue at position 7. Sequence information indicates that the ribonucleases of human liver and spleen and an eosinophil-derived neurotoxin are identical or very closely related gene products. The sequence is identical at about 30% of the amino acid positions with those of all of the secreted mammalian ribonucleases for which information is available. Identical residues include active-site residues histidine-12, histidine-119, and lysine-41, other residues known to be important for substrate binding and catalytic activity, and all eight half-cystine residues common to these enzymes. Major differences include a deletion of six residues in the (so-called) S-peptide loop, insertions of two, and nine residues, respectively, in three other external loops of the molecule, and an addition of three residues at the amino terminus. The sequence shows the human nonsecretory ribonuclease to belong to the same ribonuclease superfamily as the mammalian secretory ribonucleases, turtle pancreatic ribonuclease, and human angiogenin. Sequence data suggest that a gene duplication occurred in an ancient vertebrate ancestor; one branch led to the nonsecretory ribonuclease, while the other branch led to a second duplication, with one line leading to the secretory ribonucleases (in mammals) and the second line leading to pancreatic ribonuclease in turtle and an angiogenic factor in mammals (human angiogenin). The nonsecretory ribonuclease has five short carbohydrate chains attached via asparagine residues at the surface of the molecule; these chains may have been shortened by exoglycosidase action.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3166997

  17. Close Sequence Comparisons are Sufficient to Identify Human cis-Regulatory Elements

    SciTech Connect

    Prabhakar, Shyam; Poulin, Francis; Shoukry, Malak; Afzal, Veena; Rubin, Edward M.; Couronne, Olivier; Pennacchio, Len A.

    2005-12-01

    Cross-species DNA sequence comparison is the primary method used to identify functional noncoding elements in human and other large genomes. However, little is known about the relative merits of evolutionarily close and distant sequence comparisons, due to the lack of a universal metric for sequence conservation, and also the paucity of empirically defined benchmark sets of cis-regulatory elements. To address this problem, we developed a general-purpose algorithm (Gumby) that detects slowly-evolving regions in primate, mammalian and more distant comparisons without requiring adjustment of parameters, and ranks conserved elements by P-value using Karlin-Altschul statistics. We benchmarked Gumby predictions against previously identified cis-regulatory elements at diverse genomic loci, and also tested numerous extremely conserved human-rodent sequences for transcriptional enhancer activity using reporter-gene assays in transgenic mice. Human regulatory elements were identified with acceptable sensitivity and specificity by comparison with 1-5 other eutherian mammals or 6 other simian primates. More distant comparisons (marsupial, avian, amphibian and fish) failed to identify many of the empirically defined functional noncoding elements. We derived an intuitive relationship between ancient and recent noncoding sequence conservation from whole genome comparative analysis, which explains some of these findings. Lastly, we determined that, in addition to strength of conservation, genomic location and/or density of surrounding conserved elements must also be considered in selecting candidate enhancers for testing at embryonic time points.

  18. Use of a structural alphabet to find compatible folds for amino acid sequences

    PubMed Central

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  19. Use of a structural alphabet to find compatible folds for amino acid sequences.

    PubMed

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as "Protein Blocks" (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales-up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  20. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-05-15

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  1. PRIMARY PEPTIDE SEQUENCES FROM SQUID MUSCLE AND OPTIC LOBE MYOSIN IIs: A STRATEGY TO IDENTIFY AN ORGANELLE MYOSIN

    PubMed Central

    MEDEIROS, NELSON A.; REESE, THOMAS S.; JAFFE, HOWARD; DEGIORGIS, JOSEPH A.; BEARER, ELAINE L.

    2013-01-01

    The squid giant axon provides an excellent model system for the study of actin-based organelle transport likely to be mediated by myosins, but the identification of these motors has proven to be difficult. Here the authors purified and obtained primary peptide sequence of squid muscle myosin as a first step in a strategy designed to identify myosins in the squid nervous system. Limited digestion yielded fourteen peptides derived from the muscle myosin which possess high amino acid sequence identities to myosin II from scallop (60–95%) and chick pectoralis muscle (31–83%). Antibodies generated to this purified muscle myosin were used to isolate a potential myosin from squid optic lobe which yielded 11 peptide fragments. Sequences from six of these fragments identified this protein as a myosin II. The other five sequences matched myosin II (50–60% identities), and some also matched unconventional myosins (33–50%). A single band that has a molecular weight similar to the myosin purified from optic lobe copurifies with axoplasmic organelles, and, like the optic lobe myosin, this band is also recognized by the antibodies raised against squid muscle myosin II. Hence, this strategy provides an approach to the identification of a myosin associated with motile axoplasmic organelles. PMID:9878103

  2. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca²⁺- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B₄. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the ³²P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A₄ hydrolase or Ca²⁺-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  3. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  4. The amino acid sequence of rabbit muscle triose phosphate isomerase.

    PubMed Central

    Corran, P H; Waley, S G

    1975-01-01

    The amino acid sequence of rabbit muscle triose phosphate isomerase was deduced by characterizing peptides that overlap the tryptic peptides. Thiol groups were modified by oxidation, carboxymethylation or aminoethylation. About 50 peptides that provided information about overlaps were isolated; the peptides were mostly characterized by their compositions and N-terminal residues. The peptide chains contain 248 amino acid residues, and no evidence for dissimilarity of the two subunits that comprise the native enzyme was found. The sequence of the rabbit muscle enzyme may be compared with that of the coelacanth enzyme (Kolb et al., 1974): 84% of the residues are in identical positions. Similarly, comparison of the sequence with that inferred for the chicken enzyme (Furth et al., 1974) shows that 87% of the residues are in identical positions. Limited though these comparisons are, they suggest that triose phosphate isomerase has one of the lowest rates of evolutionary change. An extended version of the present paper has been deposited as Supplementary Publication SUP 50040 (42 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1171682
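
    The "residues in identical positions" figures quoted above are percent-identity values over aligned sequences. A minimal Python sketch of that calculation follows; the function name is hypothetical, and skipping gap columns is an assumption, since the abstract does not say how insertions or deletions were treated.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of aligned columns in which both sequences carry the same residue.

    Columns where either sequence has a gap are skipped (an assumption of
    this sketch, not something stated in the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared
```

    For two pre-aligned ten-residue toy strings that differ at one position, the function returns 90.0.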

  5. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  6. The amino acid sequence of chymopapain from Carica papaya.

    PubMed

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-02-15

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  7. Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    PubMed Central

    O'Donnell, Kerry; Sutton, Deanna A.; Rinaldi, Michael G.; Sarver, Brice A. J.; Balajee, S. Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C.; Robert, Vincent A. R. G.; Crous, Pedro W.; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M.

    2010-01-01

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the

  8. Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.

    PubMed

    O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M

    2010-10-01

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the

  9. Identifying wrong assemblies in de novo short read primary sequence assembly contigs.

    PubMed

    Chawla, Vandna; Kumar, Rajnish; Shankar, Ravi

    2016-09-01

    With the advent of short-reads-based genome sequencing approaches, a large number of organisms are being sequenced all over the world. Most of these assemblies are done using de novo short read assemblers and other related approaches. However, the contigs produced this way are prone to wrong assembly. So far, there is a conspicuous dearth of reliable tools to identify mis-assembled contigs. Mis-assemblies could result from incorrectly deleted or wrongly arranged genomic sequences. In the present work various factors related to sequence, sequencing and assembling have been assessed for their role in causing mis-assembly by using different genome sequencing data. Finally, some mis-assembly detecting tools have been evaluated for their ability to detect the wrongly assembled primary contigs, suggesting a lot of scope for improvement in this area. The present work also proposes a simple, novel unsupervised learning-based approach to identify mis-assemblies in the contigs, which was found to perform reasonably well when compared with existing tools that report mis-assembled contigs. It was observed that the proposed methodology may work as a complementary system to the existing tools to enhance their accuracy. PMID:27581937

  10. RADcap: sequence capture of dual-digest RADseq libraries with identifiable duplicates and reduced missing data.

    PubMed

    Hoffberg, Sandra L; Kieran, Troy J; Catchen, Julian M; Devault, Alison; Faircloth, Brant C; Mauricio, Rodney; Glenn, Travis C

    2016-09-01

    Molecular ecologists seek to genotype hundreds to thousands of loci from hundreds to thousands of individuals at minimal cost per sample. Current methods, such as restriction-site-associated DNA sequencing (RADseq) and sequence capture, are constrained by costs associated with inefficient use of sequencing data and sample preparation. Here, we introduce RADcap, an approach that combines the major benefits of RADseq (low cost with specific start positions) with those of sequence capture (repeatable sequencing of specific loci) to significantly increase efficiency and reduce costs relative to current approaches. RADcap uses a new version of dual-digest RADseq (3RAD) to identify candidate SNP loci for capture bait design and subsequently uses custom sequence capture baits to consistently enrich candidate SNP loci across many individuals. We combined this approach with a new library preparation method for identifying and removing PCR duplicates from 3RAD libraries, which allows researchers to process RADseq data using traditional pipelines, and we tested the RADcap method by genotyping sets of 96-384 Wisteria plants. Our results demonstrate that our RADcap method: (i) methodologically reduces (to <5%) and allows computational removal of PCR duplicate reads from data, (ii) achieves 80-90% reads on target in 11 of 12 enrichments, (iii) returns consistent coverage (≥4×) across >90% of individuals at up to 99.8% of the targeted loci, (iv) produces consistently high occupancy matrices of genotypes across hundreds of individuals and (v) costs significantly less than current approaches. PMID:27416967
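
    The computational removal of PCR duplicates mentioned above can be sketched in Python under a simplifying assumption: each read carries a random molecular tag added before amplification, so reads from the same sample and locus that share a tag are copies of one original molecule. The `Read` record, its field names, and both functions are hypothetical and are not taken from the RADcap pipeline.

```python
from collections import namedtuple

# Hypothetical read record; field names are illustrative only.
Read = namedtuple("Read", ["sample", "locus", "tag", "sequence"])


def remove_pcr_duplicates(reads):
    """Keep the first read seen for each (sample, locus, tag) combination."""
    seen = set()
    unique = []
    for read in reads:
        key = (read.sample, read.locus, read.tag)
        if key in seen:
            continue  # same molecule amplified again: treat as a PCR duplicate
        seen.add(key)
        unique.append(read)
    return unique


def duplicate_rate(reads):
    """Fraction of reads discarded as duplicates (0.0 for an empty input)."""
    if not reads:
        return 0.0
    return 1.0 - len(remove_pcr_duplicates(reads)) / len(reads)
```

    After this step the remaining reads can be passed to a conventional genotyping pipeline, which is the benefit the abstract describes.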

  11. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data

    PubMed Central

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E.; Greenwood, Alex D.

    2015-01-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed, including a recombinant ERV that contained one LTR belonging to each group, indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest of ERVs identified in mammals. PMID:26610552
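
    Age estimates from proviral LTR divergence usually rest on the fact that the two LTRs of a provirus are identical at integration and then diverge independently, so age is approximately K / (2r), where K is the divergence between the 5' and 3' LTRs and r is the host substitution rate. The Python sketch below assumes a Jukes-Cantor correction and takes the rate as a parameter; neither the correction nor any rate value is stated in the abstract, and the function names are hypothetical.

```python
import math


def jukes_cantor_distance(ltr5, ltr3):
    """Jukes-Cantor corrected divergence between two aligned, gap-free LTRs."""
    if len(ltr5) != len(ltr3) or not ltr5:
        raise ValueError("LTRs must be aligned, non-empty and of equal length")
    diffs = sum(1 for a, b in zip(ltr5, ltr3) if a != b)
    if diffs == 0:
        return 0.0
    p = diffs / len(ltr5)
    if p >= 0.75:
        raise ValueError("divergence too high for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def provirus_age_years(ltr5, ltr3, subs_per_site_per_year):
    """Age = K / (2r): both LTRs accumulate substitutions since integration."""
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    return jukes_cantor_distance(ltr5, ltr3) / (2.0 * subs_per_site_per_year)
```

    The result scales inversely with the assumed rate, so any age obtained this way is only as reliable as the rate chosen for the host lineage.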

  12. An Evolutionarily Young Polar Bear (Ursus maritimus) Endogenous Retrovirus Identified from Next Generation Sequence Data.

    PubMed

    Tsangaras, Kyriakos; Mayer, Jens; Alquezar-Planas, David E; Greenwood, Alex D

    2015-11-01

    Transcriptome analysis of polar bear (Ursus maritimus) tissues identified sequences with similarity to Porcine Endogenous Retroviruses (PERV). Based on these sequences, four proviral copies and 15 solo long terminal repeats (LTRs) of a newly described endogenous retrovirus were characterized from the polar bear draft genome sequence. Closely related sequences were identified by PCR analysis of brown bear (Ursus arctos) and black bear (Ursus americanus) but were absent in non-Ursinae bear species. The virus was therefore designated UrsusERV. Two distinct groups of LTRs were observed, including a recombinant ERV that contained one LTR belonging to each group, indicating that genomic invasions by at least two UrsusERV variants have recently occurred. Age estimates based on proviral LTR divergence and conservation of integration sites among ursids suggest the viral group is only a few million years old. The youngest provirus was polar bear specific, had intact open reading frames (ORFs) and could potentially encode functional proteins. Phylogenetic analyses of UrsusERV consensus protein sequences suggest that it is part of a pig, gibbon and koala retrovirus clade. The young age estimates and lineage specificity of the virus suggest UrsusERV is a recent cross-species transmission from an unknown reservoir and place the viral group among the youngest of ERVs identified in mammals. PMID:26610552

  13. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions.

    PubMed

    Wang, Yu; Li, Wei; Xia, Yingying; Wang, Chongzhi; Tang, Y Tom; Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2014-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information. PMID:25919136

  14. Identifying Human Genome-Wide CNV, LOH and UPD by Targeted Sequencing of Selected Regions

    PubMed Central

    Wang, Yu; Li, Wei; Xia, Yingying; Wang, Chongzhi; Tang, Y. Tom; Guo, Wenying; Li, Jinliang; Zhao, Xia; Sun, Yepeng; Hu, Juan; Zhen, Hefu; Zhang, Xiandong; Chen, Chao; Shi, Yujian; Li, Lin; Cao, Hongzhi; Du, Hongli; Li, Jian

    2015-01-01

    Copy-number variations (CNV), loss of heterozygosity (LOH), and uniparental disomy (UPD) are large genomic aberrations leading to many common inherited diseases, cancers, and other complex diseases. An integrated tool to identify these aberrations is essential in understanding diseases and in designing clinical interventions. Previous discovery methods based on whole-genome sequencing (WGS) require very high depth of coverage on the whole genome scale, and are cost-wise inefficient. Another approach, whole exome genome sequencing (WEGS), is limited to discovering variations within exons. Thus, we are lacking efficient methods to detect genomic aberrations on the whole genome scale using next-generation sequencing technology. Here we present a method to identify genome-wide CNV, LOH and UPD for the human genome via selectively sequencing a small portion of genome termed Selected Target Regions (SeTRs). In our experiments, the SeTRs are covered by 99.73%~99.95% with sufficient depth. Our developed bioinformatics pipeline calls genome-wide CNVs with high confidence, revealing 8 credible events of LOH and 3 UPD events larger than 5M from 15 individual samples. We demonstrate that genome-wide CNV, LOH and UPD can be detected using a cost-effective SeTRs sequencing approach, and that LOH and UPD can be identified using just a sample grouping technique, without using a matched sample or familial information. PMID:25919136

  15. Amino acid sequence prerequisites for the formation of cn ions.

    PubMed

    Downard, K M; Biemann, K

    1993-11-01

    Amino acid sequence prerequisites are described for the formation of cn ions observed in high-energy collision-induced decomposition spectra of peptides. It is shown that the formation of cn ions is promoted by the nature of the amino acid C-terminal to the cleavage site. A propensity for cn cleavage preceding threonine, and to a lesser extent tryptophan, lysine, and serine, is demonstrated where fragmentation is directed N-terminally at these residues. In addition, the nature of the residue N-terminal to the cleavage site is shown to have little effect on cn ion formation. A mechanism for cn ion formation is proposed and its applicability to the results observed is discussed. PMID:24227531

  16. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  17. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  18. Identifying the Critical Time Period for Information Extraction when Recognizing Sequences of Play

    ERIC Educational Resources Information Center

    North, Jamie S.; Williams, A. Mark

    2008-01-01

    The authors attempted to determine the critical time period for information extraction when recognizing play sequences in soccer. Although efforts have been made to identify the perceptual information underpinning such decisions, no researchers have attempted to determine "when" this information may be extracted from the display. The authors…

  19. Tsukamurella pulmonis Bloodstream Infection Identified by secA1 Gene Sequencing

    PubMed Central

    Cano, María E.; García de la Fuente, Celia; Martínez-Martínez, Luis; López, Mónica; Fernández-Mazarrasa, Carlos

    2014-01-01

    Recurrent bloodstream infections caused by a Gram-positive bacterium affected an immunocompromised child. Tsukamurella pulmonis was the microorganism identified by secA1 gene sequencing. Antibiotic treatment in combination with removal of the subcutaneous port healed the patient. PMID:25520439

  20. Multi-locus DNA sequencing of Toxoplasma gondii isolated from Brazilian pigs identifies genetically divergent strains

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Five Toxoplasma gondii isolates (TgPgBr1-5) were isolated from hearts and brains of pigs freshly purchased at the market of Campos dos Goytacazes, Northern Rio de Janeiro State, Brazil. Four of the five isolates were highly pathogenic in mice. Four genotypes were identified. Multi-locus DNA sequenci...

  1. SoftSearch: Integration of Multiple Sequence Features to Identify Breakpoints of Structural Variations

    PubMed Central

    Hart, Steven N.; Sarangi, Vivekananda; Moore, Raymond; Baheti, Saurabh; Bhavsar, Jaysheel D.; Couch, Fergus J.; Kocher, Jean-Pierre A.

    2013-01-01

    Background: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual’s genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch’s key features are 1) not requiring secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance. PMID:24358278
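
    The soft-clipping signal described above can be illustrated with a short Python sketch that reads SAM-style CIGAR strings and reports the reference position where a soft clip abuts aligned bases. It is not SoftSearch's code: the function names, the exact-position clustering, and the `min_support` threshold are assumptions, and the discordant read-pair evidence that SoftSearch also uses is omitted.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar):
    """Return 1-based reference positions where a soft clip meets aligned bases.

    pos is the 1-based leftmost aligned position, as in the SAM POS field.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    core = [item for item in ops if item[1] != "H"]  # hard clips carry no bases
    breakpoints = []
    if len(core) > 1 and core[0][1] == "S":
        breakpoints.append(pos)  # first aligned base after a leading clip
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)
    if len(core) > 1 and core[-1][1] == "S":
        breakpoints.append(pos + ref_len - 1)  # last aligned base before the clip
    return breakpoints


def candidate_breakpoints(alignments, min_support=2):
    """Collect positions supported by at least min_support soft-clipped reads.

    alignments is an iterable of (chrom, pos, cigar) tuples.
    """
    support = Counter()
    for chrom, pos, cigar in alignments:
        for bp in soft_clip_breakpoints(pos, cigar):
            support[(chrom, bp)] += 1
    return sorted(key for key, n in support.items() if n >= min_support)
```

    A fuller caller would cluster nearby positions rather than require exact agreement and would combine this evidence with discordant pairs, as the abstract describes.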

  2. RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

    PubMed

    Brule, C E; Dean, K M; Grayhack, E J

    2016-01-01

    The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. PMID:27241757

  3. Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

    PubMed Central

    Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter

    2014-01-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  4. Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

    PubMed

    Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J

    2014-02-01

    Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775

  5. New aldehyde tag sequences identified by screening formylglycine generating enzymes in vitro and in vivo.

    PubMed

    Rush, Jason S; Bertozzi, Carolyn R

    2008-09-17

    Formylglycine generating enzyme (FGE) performs a critical posttranslational modification of type I sulfatases, converting cysteine within the motif CxPxR to the aldehyde-bearing residue formylglycine (FGly). This concise motif can be installed within heterologous proteins as a genetically encoded "aldehyde tag" for site-specific labeling with aminooxy- or hydrazide-functionalized probes. In this report, we screened FGEs from M. tuberculosis and S. coelicolor against synthetic peptide libraries and identified new substrate sequences that diverge from the canonical motif. We found that E. coli's FGE-like activity is similarly promiscuous, enabling the use of novel aldehyde tag sequences for in vivo modification of recombinant proteins. PMID:18722427
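
    As an illustration of the canonical motif described above, the following minimal Python sketch scans a protein sequence for CxPxR (cysteine, any residue, proline, any residue, arginine). The function name and the example sequence are hypothetical and are not taken from the report; the newly identified divergent substrate sequences are not listed in the abstract and are therefore not modeled here.

```python
import re

# C-x-P-x-R: cysteine, any residue, proline, any residue, arginine.
# The lookahead allows overlapping occurrences to be reported.
ALDEHYDE_TAG = re.compile(r"(?=(C.P.R))")


def find_cxpxr(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched 5-mer) for each CxPxR hit."""
    return [(m.start() + 1, m.group(1))
            for m in ALDEHYDE_TAG.finditer(protein.upper())]


# Hypothetical sequence carrying a single tag (residues 5-9, "CTPSR").
hits = find_cxpxr("MGSLCTPSRGSAAK")
print(hits)
```

    The cysteine at the reported position is the residue that would be converted to formylglycine.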

  6. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    PubMed

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds. PMID:26729471

  7. BLAT2DOLite: An Online System for Identifying Significant Relationships between Genetic Sequences and Diseases

    PubMed Central

    Cheng, Liang; Zhang, Shuo; Hu, Yang

    2016-01-01

    The significantly related diseases of sequences could play an important role in understanding the functions of these sequences. In this paper, we introduced BLAT2DOLite, an online system for annotating human genes and diseases and identifying the significant relationships between sequences and diseases. Currently, BLAT2DOLite integrates Entrez Gene database and Disease Ontology Lite (DOLite), which contain loci of gene and relationships between genes and diseases. It utilizes hypergeometric test to calculate P-values between genes and diseases of DOLite. The system can be accessed from: http://123.59.132.21:8080/BLAT2DOLite. The corresponding web service is described in: http://123.59.132.21:8080/BLAT2DOLite/BLAT2DOLiteIDMappingPort?wsdl. PMID:27315278
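
    The hypergeometric test mentioned above can be sketched as follows. This is a generic over-representation calculation, not the BLAT2DOLite implementation: the abstract does not state how the test is parameterized, so the four arguments (background gene count, genes annotated to a disease, genes hit by the query sequences, and their overlap) are assumptions.

```python
from math import comb


def hypergeom_pvalue(population: int, annotated: int,
                     drawn: int, overlap: int) -> float:
    """Upper-tail probability P(X >= overlap) for a hypergeometric draw.

    population: total genes in the background (assumed)
    annotated:  genes annotated to the disease term
    drawn:      genes matched by the query sequences
    overlap:    matched genes that carry the disease annotation
    """
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, min(annotated, drawn) + 1)
    )
    return tail / total


# Toy numbers: 10 genes, 4 annotated, 3 drawn, at least 2 annotated among them.
p = hypergeom_pvalue(10, 4, 3, 2)
print(p)
```

    A small P-value indicates that the overlap is larger than expected if the matched genes were drawn at random from the background.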

  8. New Hosts of Simplicimonas similis and Trichomitus batrachorum Identified by 18S Ribosomal RNA Gene Sequences

    PubMed Central

    Dimasuay, Kris Genelyn B.; Lavilla, Orlie John Y.; Rivera, Windell L.

    2013-01-01

    Trichomonads are obligate anaerobes generally found in the digestive and genitourinary tract of domestic animals. In this study, four trichomonad isolates were obtained from carabao, dog, and pig hosts using rectal swab. Genomic DNA was extracted using Chelex method and the 18S rRNA gene was successfully amplified through novel sets of primers and undergone DNA sequencing. Aligned isolate sequences together with retrieved 18S rRNA gene sequences of known trichomonads were utilized to generate phylogenetic trees using maximum likelihood and neighbor-joining analyses. Two isolates from carabao were identified as Simplicimonas similis while each isolate from dog and pig was identified as Pentatrichomonas hominis and Trichomitus batrachorum, respectively. This is the first report of S. similis in carabao and the identification of T. batrachorum in pig using 18S rRNA gene sequence analysis. The generated phylogenetic tree yielded three distinct groups mostly with relatively moderate to high bootstrap support and in agreement with the most recent classification. Pathogenic potential of the trichomonads in these hosts still needs further investigation. PMID:23936631

  9. SERPINA1 Full-Gene Sequencing Identifies Rare Mutations Not Detected in Targeted Mutation Analysis.

    PubMed

    Graham, Rondell P; Dina, Michelle A; Howe, Sarah C; Butz, Malinda L; Willkomm, Kurt S; Murray, David L; Snyder, Melissa R; Rumilla, Kandelaria M; Halling, Kevin C; Highsmith, W Edward

    2015-11-01

    Genetic α-1 antitrypsin (AAT) deficiency is characterized by low serum AAT levels and the identification of causal mutations or an abnormal protein. It needs to be distinguished from deficiency because of nongenetic causes, and diagnostic delay may contribute to worse patient outcome. Current routine clinical testing assesses for only the most common mutations. We wanted to determine the proportion of unexplained cases of AAT deficiency that harbor causal mutations not identified through current standard allele-specific genotyping and isoelectric focusing (IEF). All prospective cases from December 1, 2013, to October 1, 2014, with a low serum AAT level not explained by allele-specific genotyping and IEF were assessed through full-gene sequencing with a direct sequencing method for pathogenic mutations. We reviewed the results using American Council of Medical Genetics criteria. Of 3523 cases, 42 (1.2%) met study inclusion criteria. Pathogenic or likely pathogenic mutations not identified through clinical testing were detected through full-gene sequencing in 16 (38%) of the 42 cases. Rare mutations not detected with current allele-specific testing and IEF underlie a substantial proportion of genetic AAT deficiency. Full-gene sequencing, therefore, has the ability to improve accuracy in the diagnosis of AAT deficiency. PMID:26321041

  10. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  11. Using an intervening sequence of Faecalibacterium 16S rDNA to identify poultry feces.

    PubMed

    Shen, Zhenyu; Duan, Chuanren; Zhang, Chao; Carson, Andrew; Xu, Dong; Zheng, Guolu

    2013-10-15

    This study was designed to identify poultry feces-specific marker(s) within sequences of Faecalibacterium 16S rDNA for detecting poultry fecal pollution in water. Bioinformatics tools were used in the comparative analysis of 7,458 sequences of Faecalibacterium 16S rDNA, reportedly associated with various poultry (chicken and turkey) and animal species. One intervening sequence (IVS) located between the hypervariable region 1 and the conserved region 2, designated as IVS-p, was found to be unique to poultry feces. Based on this sequence, a PCR assay (PCR-p) was developed. The PCR-p produced an amplicon of 132 bp only in the test when fecal or wastewater samples from poultry were used, but not when using fecal or wastewater samples from other sources. The non-poultry sources included feces of beef or dairy cattle, dog, horse, human, domestic or wild geese, seagull, sheep, swine, and wild turkey. These data indicate that IVS-p may prove to be a useful genetic marker for the specific identification of poultry fecal pollution in environmental waterways. Furthermore, results of data mining and PCR assay indicate that the IVS-p may have a broad geographic distribution. This report represents initial evidence of the potential utility of ribosomal intervening sequences as genetic markers for tracking host sources of fecal pollution in waterways. PMID:24011842

  12. Novel expressed sequences identified in a model of androgen independent prostate cancer

    PubMed Central

    Quayle, Steven N; Hare, Heidi; Delaney, Allen D; Hirst, Martin; Hwang, Dorothy; Schein, Jacqueline E; Jones, Steven JM; Marra, Marco A; Sadar, Marianne D

    2007-01-01

    Background: Prostate cancer is the most frequently diagnosed cancer in American men, and few effective treatment options are available to patients who develop hormone-refractory prostate cancer. The molecular changes that occur to allow prostate cells to proliferate in the absence of androgens are not fully understood. Results: Subtractive hybridization experiments performed with samples from an in vivo model of hormonal progression identified 25 expressed sequences representing novel human transcripts. Intriguingly, these 25 sequences have small open-reading frames and are not highly conserved through evolution, suggesting many of these novel expressed sequences may be derived from untranslated regions of novel transcripts or from non-coding transcripts. Examination of a large metalibrary of human Serial Analysis of Gene Expression (SAGE) tags demonstrated that only three of these novel sequences had been previously detected. RT-PCR experiments confirmed that the 6 sequences tested were expressed in specific human tissues, as well as in clinical samples of prostate cancer. Further RT-PCR experiments for five of these fragments indicated they originated from large untranslated regions of unannotated transcripts. Conclusion: This study underlines the value of using complementary techniques in the annotation of the human genome. The tissue-specific expression of 4 of the 6 clones tested indicates the expression of these novel transcripts is tightly regulated, and future work will determine the possible role(s) these novel transcripts may play in the progression of prostate cancer. PMID:17257419

  13. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways

    PubMed Central

    Cirulli, Elizabeth T.; Lasseigne, Brittany N.; Petrovski, Slavé; Sapp, Peter C.; Dion, Patrick A.; Leblond, Claire S.; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J.; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E.; Boone, Braden E.; Wimbish, Jack R.; Waite, Lindsay L.; Jones, Angela L.; Carulli, John P.; Day-Williams, Aaron G.; Staropoli, John F.; Xin, Winnie W.; Chesi, Alessandra; Raphael, Alya R.; McKenna-Yasek, Diane; Cady, Janet; de Jong, J.M.B. Vianney; Kenna, Kevin P.; Smith, Bradley N.; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H.; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E.; Baloh, Robert H.; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M.; Gibson, Summer; Trojanowski, John Q.; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A.; Chung, Wendy K.; Ravits, John M.; Glass, Jonathan D.; Sims, Katherine B.; Van Deerlin, Vivianna M.; Maniatis, Tom; Hayes, Sebastian D.; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S.; Bedlack, Richard S.; Harper, J. Wade; Gitler, Aaron D.; Rouleau, Guy A.; Brown, Robert; Harms, Matthew B.; Cooper, Gregory M.; Harris, Tim; Myers, Richard M.; Goldstein, David B.

    2015-01-01

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. Here we report the results of a moderate-scale sequencing study aimed at identifying new genes contributing to predisposition for ALS. We performed whole exome sequencing of 2,874 ALS patients and compared them to 6,405 controls. Several known ALS genes were found to be associated, and the non-canonical IκB kinase family TANK-Binding Kinase 1 (TBK1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. PMID:25700176

  14. Exome Sequencing Identifies SLCO2A1 Mutations as a Cause of Primary Hypertrophic Osteoarthropathy

    PubMed Central

    Zhang, Zhenlin; Xia, Weibo; He, Jinwei; Zhang, Zeng; Ke, Yaohua; Yue, Hua; Wang, Chun; Zhang, Hao; Gu, Jiemei; Hu, Weiwei; Fu, Wenzhen; Hu, Yunqiu; Li, Miao; Liu, Yujuan

    2012-01-01

    By using whole-exome sequencing, we identified a homozygous guanine-to-adenine transition at the invariant −1 position of the acceptor site of intron 1 (c.97−1G>A) in solute carrier organic anion transporter family member 2A1 (SLCO2A1), which encodes a prostaglandin transporter protein, as the causative mutation in a single individual with primary hypertrophic osteoarthropathy (PHO) from a consanguineous family. In two other affected individuals with PHO from two unrelated nonconsanguineous families, we identified two different compound heterozygous mutations by using Sanger sequencing. These findings confirm that SLCO2A1 mutations inactivate prostaglandin E2 (PGE2) transport, and they indicate that mutations in SLCO2A1 are the pathogenic cause of PHO. Moreover, this study might also help to explain the cause of secondary hypertrophic osteoarthropathy. PMID:22197487

  15. Exome sequencing identifies NBEAL2 as the causative gene for Gray Platelet Syndrome

    PubMed Central

    Albers, Cornelis A; Cvejic, Ana; Favier, Rémi; Bouwmans, Evelien E; Alessi, Marie-Christine; Bertone, Paul; Jordan, Gregory; Kettleborough, Ross NW; Kiddle, Graham; Kostadima, Myrto; Read, Randy J; Sipos, Botond; Sivapalaratnam, Suthesh; Smethurst, Peter A; Stephens, Jonathan; Voss, Katrin; Nurden, Alan; Rendon, Augusto; Nurden, Paquita; Ouwehand, Willem H

    2012-01-01

    Gray platelet syndrome (GPS) is a predominantly recessive platelet disorder characterized by a mild thrombocytopenia with large platelets and a paucity of α-granules; these abnormalities cause mostly moderate but in rare cases severe bleeding. We sequenced the exomes of four unrelated cases and identified as the causative gene NBEAL2, a gene with previously unknown function but a member of a gene family involved in granule development. Silencing of nbeal2 in zebrafish abrogated thrombocyte formation. PMID:21765411

  16. Using VAAST to Identify Disease-Associated Variants in Next-Generation Sequencing Data

    PubMed Central

    Kennedy, Brett; Kronenberg, Zev; Hu, Hao; Moore, Barry; Flygare, Steven; Reese, Martin G.; Jorde, Lynn B.; Yandell, Mark; Huff, Chad

    2014-01-01

    The VAAST pipeline is specifically designed to identify disease-associated alleles in next-generation sequencing data. In the protocols presented in this paper, we outline the best practices for variant prioritization using VAAST. Examples and test data are provided for case-control, small pedigree, and large pedigree analyses. These protocols will teach users the fundamentals of VAAST, VAAST 2.0, and pVAAST analyses. PMID:24763993

  17. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  18. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  19. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  20. Identifiability of PBPK Models with Applications to Dimethylarsinic Acid Exposure

    EPA Science Inventory

    Any statistical model should be identifiable in order for estimates and tests using it to be meaningful. We consider statistical analysis of physiologically-based pharmacokinetic (PBPK) models in which parameters cannot be estimated precisely from available data, and discuss diff...

  1. Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila.

    PubMed

    Kontur, Cassandra; Kumar, Santosh; Lan, Xun; Pritchard, Jonathan K; Turkewitz, Aaron P

    2016-01-01

    Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded to a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies, in part, on ancestral lysosomal sorting machinery, but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis, and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation, a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wild-type copy of MMA1, and disrupting MMA1 in an otherwise wild-type strain phenocopies UC620. The product of MMA1, characterized as a CFP-tagged copy, encodes a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation. PMID:27317773

  2. Whole Genome Sequencing Identifies a Novel Factor Required for Secretory Granule Maturation in Tetrahymena thermophila

    PubMed Central

    Kontur, Cassandra; Kumar, Santosh; Lan, Xun; Pritchard, Jonathan K.; Turkewitz, Aaron P.

    2016-01-01

    Unbiased genetic approaches have a unique ability to identify novel genes associated with specific biological pathways. Thanks to next generation sequencing, forward genetic strategies can be expanded to a wider range of model organisms. The formation of secretory granules, called mucocysts, in the ciliate Tetrahymena thermophila relies, in part, on ancestral lysosomal sorting machinery, but is also likely to involve novel factors. In prior work, multiple strains with defects in mucocyst biogenesis were generated by nitrosoguanidine mutagenesis, and characterized using genetic and cell biological approaches, but the genetic lesions themselves were unknown. Here, we show that analyzing one such mutant by whole genome sequencing reveals a novel factor in mucocyst formation. Strain UC620 has both morphological and biochemical defects in mucocyst maturation—a process analogous to dense core granule maturation in animals. Illumina sequencing of a pool of UC620 F2 clones identified a missense mutation in a novel gene called MMA1 (Mucocyst maturation). The defects in UC620 were rescued by expression of a wild-type copy of MMA1, and disrupting MMA1 in an otherwise wild-type strain phenocopies UC620. The product of MMA1, characterized as a CFP-tagged copy, encodes a large soluble cytosolic protein. A small fraction of Mma1p-CFP is pelletable, which may reflect association with endosomes. The gene has no identifiable homologs except in other Tetrahymena species, and therefore represents an evolutionarily recent innovation that is required for granule maturation. PMID:27317773

  3. Using complete genome comparisons to identify sequences whose presence accurately predicts clinically important phenotypes.

    PubMed

    Hall, Barry G; Cardenas, Heliodoro; Barlow, Miriam

    2013-01-01

    In clinical settings it is often important to know not just the identity of a microorganism, but also the danger posed by that particular strain. For instance, Escherichia coli can range from being a harmless commensal to being a very dangerous enterohemorrhagic (EHEC) strain. Determining pathogenic phenotypes can be both time consuming and expensive. Here we propose a simple, rapid, and inexpensive method of predicting pathogenic phenotypes on the basis of the presence or absence of short homologous DNA segments in an isolate. Our method compares completely sequenced genomes without the necessity of genome alignments in order to identify the presence or absence of the segments to produce an automatic alignment of the binary string that describes each genome. Analysis of the segment alignment allows identification of those segments whose presence strongly predicts a phenotype. Clinical application of the method requires nothing more than PCR amplification of each of the set of predictive segments. Here we apply the method to identifying EHEC strains of E. coli and to distinguishing E. coli from Shigella. We show in silico that with as few as 8 predictive sequences, if even three of those predictive sequences are amplified the probability of being EHEC or Shigella is >0.99. The method is thus very robust to the occasional amplification failure for spurious reasons. Experimentally, we apply the method to screening a set of 98 isolates to distinguish E. coli from Shigella, and EHEC from non-EHEC E. coli strains, and show that all isolates are correctly identified. PMID:23935901
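
    The presence/absence idea described above can be illustrated with a small Python sketch. It is not the authors' algorithm: the selection rule (keep segments whose carriers almost always show the phenotype), the data layout, the function names, and the toy genomes are all assumptions made for illustration. The default of three amplified segments merely echoes the figure quoted in the abstract.

```python
def predictive_segments(presence: dict[str, set[str]],
                        phenotype: dict[str, bool],
                        min_ppv: float = 0.99) -> list[str]:
    """Select segments whose presence predicts the phenotype.

    presence:  genome id -> set of segment ids found in that genome
    phenotype: genome id -> True if the genome shows the phenotype
    min_ppv:   assumed cut-off on the fraction of carriers that are positive
    """
    segments = set().union(*presence.values())
    chosen = []
    for seg in sorted(segments):
        carriers = [g for g, segs in presence.items() if seg in segs]
        positives = sum(phenotype[g] for g in carriers)
        if carriers and positives / len(carriers) >= min_ppv:
            chosen.append(seg)
    return chosen


def call_phenotype(amplified: set[str], predictive: list[str],
                   min_hits: int = 3) -> bool:
    """Call the phenotype when enough predictive segments amplify."""
    return len(set(amplified) & set(predictive)) >= min_hits


# Toy data: g1 and g2 are phenotype-positive, g3 and g4 are not.
presence = {
    "g1": {"s1", "s2", "s3"},
    "g2": {"s1", "s2", "s3", "s4"},
    "g3": {"s4"},
    "g4": {"s4", "s5"},
}
phenotype = {"g1": True, "g2": True, "g3": False, "g4": False}
markers = predictive_segments(presence, phenotype)
print(markers, call_phenotype({"s1", "s2", "s3"}, markers))
```

    Requiring several hits rather than all of them is what makes such a rule tolerant of an occasional failed amplification.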

  4. Failure to Identify Somatic Mutations in Monozygotic Twins Discordant for Schizophrenia by Whole Exome Sequencing

    PubMed Central

    Lyu, Nan; Guan, Li-Li; Ma, Hong; Wang, Xi-Jin; Wu, Bao-Ming; Shang, Fan-Hong; Wang, Dan; Wen, Hong; Yu, Xin

    2016-01-01

    Background: Schizophrenia (SCZ) is a severe, debilitating, and complex psychiatric disorder with multiple causative factors. An increasing number of studies have determined that rare variations play an important role in its etiology. A somatic mutation is a rare form of genetic variation that occurs at an early stage of embryonic development and is thought to contribute substantially to the development of SCZ. The aim of the study was to explore the novel pathogenic somatic single nucleotide variations (SNVs) and somatic insertions and deletions (indels) of SCZ. Methods: One Chinese family with a monozygotic (MZ) twin pair discordant for SCZ was included. Whole exome sequencing was performed in the co-twin and their parents. Rigorous filtering processes were conducted to prioritize pathogenic somatic variations, and all identified SNVs and indels were further confirmed by Sanger sequencing. Results: One somatic SNV and two somatic indels were identified after rigorous selection processes. However, none was validated by Sanger sequencing. Conclusions: This study is not alone in the failure to identify pathogenic somatic variations in MZ twins, suggesting that exonic somatic variations are extremely rare. Further efforts are warranted to explore the potential genetic mechanism of SCZ. PMID:26960372

  5. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases
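    Classifying alignment columns as invariant or single variant, as this record describes, might be sketched as follows. The definition used here is an assumption: "single variant" is taken to mean that exactly two residue types occur in a column, and columns containing gaps are set aside. The function name and the toy alignment are invented for illustration.

```python
from typing import List


def classify_columns(alignment: List[str], gap: str = "-") -> List[str]:
    """Label each column of a multiple alignment.

    Labels: 'invariant' (one residue type), 'single-variant' (two types,
    an assumed reading of the abstract's term), 'variable' (three or more),
    or 'gapped' (any gap character present).
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    labels = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        if gap in column:
            labels.append("gapped")
            continue
        n_types = len(set(column))
        if n_types == 1:
            labels.append("invariant")
        elif n_types == 2:
            labels.append("single-variant")
        else:
            labels.append("variable")
    return labels


# Toy alignment (invented), four sequences of five columns.
toy_alignment = ["CAHGK", "CSHG-", "CAHAK", "CAHVK"]
print(classify_columns(toy_alignment))
```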

  6. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data.

    PubMed

    Lea, Amanda J; Tung, Jenny; Zhou, Xiang

    2015-11-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596
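    A binomial mixed model of the kind described here can be written generically as below. This is a sketch of a standard formulation with a logit link, not a transcription of the MACAU paper: all symbols are notation assumed for illustration, and the authors' exact parameterization may differ. For individual i at a given site, y_i is the methylated read count, r_i the total read count, π_i the underlying methylation proportion, w_i a vector of covariates, x_i the predictor of interest (for example age), K a relatedness matrix, and e_i an independent term absorbing over-dispersion.

```latex
\begin{aligned}
y_i &\sim \mathrm{Binomial}(r_i, \pi_i), \\
\log\frac{\pi_i}{1-\pi_i} &= \mathbf{w}_i^{\top}\boldsymbol{\alpha} + x_i\beta + g_i + e_i, \\
\mathbf{g} &\sim \mathcal{N}\!\left(\mathbf{0}, \sigma_g^{2}\mathbf{K}\right), \qquad
\mathbf{e} \sim \mathcal{N}\!\left(\mathbf{0}, \sigma_e^{2}\mathbf{I}\right).
\end{aligned}
```

    In this form, testing for differential methylation amounts to testing β = 0, while g accounts for genetic relatedness and e for extra-binomial variation in the counts.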

  7. A Flexible, Efficient Binomial Mixed Model for Identifying Differential DNA Methylation in Bisulfite Sequencing Data

    PubMed Central

    Lea, Amanda J.

    2015-01-01

    Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596

  8. Exome sequencing identifies potential novel candidate genes in patients with unexplained colorectal adenomatous polyposis.

    PubMed

    Spier, Isabel; Kerick, Martin; Drichel, Dmitriy; Horpaopan, Sukanya; Altmüller, Janine; Laner, Andreas; Holzapfel, Stefanie; Peters, Sophia; Adam, Ronja; Zhao, Bixiao; Becker, Tim; Lifton, Richard P; Holinski-Feder, Elke; Perner, Sven; Thiele, Holger; Nöthen, Markus M; Hoffmann, Per; Timmermann, Bernd; Schweiger, Michal R; Aretz, Stefan

    2016-04-01

    In up to 30% of patients with colorectal adenomatous polyposis, no germline mutation in the known genes APC, causing familial adenomatous polyposis, MUTYH, causing MUTYH-associated polyposis, and POLE or POLD1, causing Polymerase-Proofreading-associated polyposis can be identified, although a hereditary etiology is likely. To uncover new causative genes, exome sequencing was performed using DNA from leukocytes and a total of 12 colorectal adenomas from seven unrelated patients with unexplained sporadic adenomatous polyposis. For data analysis and variant filtering, an established bioinformatics pipeline including in-house tools was applied. Variants were filtered for rare truncating point mutations and copy-number variants assuming a dominant, recessive, or tumor suppressor model of inheritance. Subsequently, targeted sequence analysis of the most promising candidate genes was performed in a validation cohort of 191 unrelated patients. All relevant variants were validated by Sanger sequencing. The analysis of exome sequencing data resulted in the identification of rare loss-of-function germline mutations in three promising candidate genes (DSC2, PIEZO1, ZSWIM7). In the validation cohort, further variants predicted to be pathogenic were identified in DSC2 and PIEZO1. According to the somatic mutation spectra, the adenomas in this patient cohort follow the classical pathways of colorectal tumorigenesis. The present study identified three candidate genes which might represent rare causes for a predisposition to colorectal adenoma formation. Especially PIEZO1 (FAM38A) and ZSWIM7 (SWS1) warrant further exploration. To evaluate the clinical relevance of these genes, investigation of larger patient cohorts and functional studies are required. PMID:26780541

  9. Using exome sequencing to identify the cause of myocardial hypertrophy in a Chinese family.

    PubMed

    Pu, Tian; Guo, Qianqian; Cao, Ruixue; Xu, Rang; Sun, Kun; Chen, Sun

    2015-09-01

    Myocardial hypertrophy is a common feature of numerous diseases. It is important to distinguish between these diseases in order to enable accurate diagnosis and the administration of appropriate therapy. Using whole‑exome sequencing, the present study aimed to identify a pathogenic mutation in a Chinese family, which may lead to cardiac hypertrophy and Wolff‑Parkinson‑White syndrome. The proband from the Chinese family exhibited left ventricular hypertrophy and pre-excitation with a short PR interval. DNA was extracted from peripheral blood obtained from the subject family, and exome sequencing was performed in the proband. Polymerase chain reaction and direct sequencing were used to confirm the presence of a mutation, and confirmed that the pathogenic mutation was 5'-AMP‑activated protein kinase subunit γ2 (PRKAG2) (p.R302Q), which has been previously reported in a family with an inherited form of WPW. A stop‑gain mutation in urotensin II receptor (UTS2R) (p.S241X), which is associated with congestive heart failure, was identified in the proband and in one other affected family member. It is important to identify the causes of myocardial hypertrophy, in order to provide a theoretical basis with which to improve clinical diagnosis and the assessment of prognosis. The results of the present study suggest that if a patient has myocardial hypertrophy with a short PR interval on electrocardiogram, a mutation in the PRKAG2 gene should be considered. In conclusion, exome sequencing methods may assist with the identification of causative genes in myocardial hypertrophy, as well as genes that are associated with an increased risk of sudden cardiac death. PMID:25997934

  10. Peptide sequencing by using a combination of partial acid hydrolysis and fast-atom-bombardment mass spectrometry.

    PubMed Central

    De Angelis, F; Botta, M; Ceccarelli, S; Nicoletti, R

    1986-01-01

    To overcome the limit of the intensity of ions carrying sequence information in structural determinations of peptides by fast-atom-bombardment m.s., we have developed a method that consists in taking spectra of the peptide acid hydrolysates at different hydrolysis times. Peaks correspond to the oligomers arising from the peptide partial hydrolysis. The sequence can then be identified from the structurally overlapping fragments. PMID:2428356
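    Reading a sequence off structurally overlapping fragments can be sketched with a greedy overlap merge. This is an illustration only, under strong assumptions: each hydrolysis fragment has already been assigned a residue sequence (here in one-letter codes), no residue run repeats within the peptide, and the fragments cover the peptide without breaks. The function names and the toy peptide are invented; the original work interpreted mass spectra rather than strings.

```python
from typing import List


def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments: List[str]) -> str:
    """Greedily merge fragments by their largest suffix/prefix overlap."""
    # Drop duplicates and fragments fully contained in another fragment.
    frags = sorted(
        f for f in set(fragments)
        if not any(f != g and f in g for g in fragments)
    )
    while len(frags) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i == j:
                    continue
                k = overlap(a, b)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            raise ValueError("fragments do not overlap; sequence is ambiguous")
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags[0]


# Toy hydrolysate (invented): overlapping tripeptides from one hexapeptide.
toy_fragments = ["VFS", "GAV", "FSK", "AVF"]
print(assemble(toy_fragments))
```

    A greedy merge can pick the wrong join when a peptide contains repeated motifs, so a real analysis would need to check every candidate sequence against all observed fragments.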

  11. Microfluidic screening and whole-genome sequencing identifies mutations associated with improved protein secretion by yeast

    PubMed Central

    Huang, Mingtao; Bai, Yunpeng; Sjostrom, Staffan L.; Hallström, Björn M.; Liu, Zihe; Petranovic, Dina; Uhlén, Mathias; Joensson, Haakan N.; Andersson-Svahn, Helene; Nielsen, Jens

    2015-01-01

    There is an increasing demand for biotech-based production of recombinant proteins for use as pharmaceuticals in the food and feed industry and in industrial applications. Yeast Saccharomyces cerevisiae is among preferred cell factories for recombinant protein production, and there is increasing interest in improving its protein secretion capacity. Due to the complexity of the secretory machinery in eukaryotic cells, it is difficult to apply rational engineering for construction of improved strains. Here we used high-throughput microfluidics for the screening of yeast libraries, generated by UV mutagenesis. Several screening and sorting rounds resulted in the selection of eight yeast clones with significantly improved secretion of recombinant α-amylase. Efficient secretion was genetically stable in the selected clones. We performed whole-genome sequencing of the eight clones and identified 330 mutations in total. Gene ontology analysis of mutated genes revealed many biological processes, including some that have not been identified before in the context of protein secretion. Mutated genes identified in this study can be potentially used for reverse metabolic engineering, with the objective to construct efficient cell factories for protein secretion. The combined use of microfluidics screening and whole-genome sequencing to map the mutations associated with the improved phenotype can easily be adapted for other products and cell types to identify novel engineering targets, and this approach could broadly facilitate design of novel cell factories. PMID:26261321

  12. HUGE: a database for human large proteins identified in the Kazusa cDNA sequencing project.

    PubMed

    Kikuno, R; Nagase, T; Suyama, M; Waki, M; Hirosawa, M; Ohara, O

    2000-01-01

    HUGE is a database for human large proteins newly identified in the Kazusa cDNA project, the aim of which is to predict the primary structure of proteins from the sequences of human large cDNAs (>4 kb). In particular, cDNA clones capable of coding for large proteins (>50 kDa) are the current targets of the project. HUGE contains >1100 cDNA sequences and detailed information obtained through analysis of the sequences of cDNAs and the predicted proteins. Besides an increase in the number of cDNA entries, the amount of experimental data for expression profiling has been largely increased and data on chromosomal locations have been newly added. All of the protein-coding regions were examined by GeneMark analysis, and the results of a motif/domain search of each predicted protein sequence against the Pfam database have been newly added. HUGE is available through the WWW at http://www.kazusa.or.jp/huge PMID:10592264

  13. Genomic islands of divergence in hybridizing Heliconius butterflies identified by large-scale targeted sequencing

    PubMed Central

    Nadeau, Nicola J.; Whibley, Annabel; Jones, Robert T.; Davey, John W.; Dasmahapatra, Kanchon K.; Baxter, Simon W.; Quail, Michael A.; Joron, Mathieu; ffrench-Constant, Richard H.; Blaxter, Mark L.; Mallet, James; Jiggins, Chris D.

    2012-01-01

    Heliconius butterflies represent a recent radiation of species, in which wing pattern divergence has been implicated in speciation. Several loci that control wing pattern phenotypes have been mapped and two were identified through sequencing. These same gene regions play a role in adaptation across the whole Heliconius radiation. Previous studies of population genetic patterns at these regions have sequenced small amplicons. Here, we use targeted next-generation sequence capture to survey patterns of divergence across these entire regions in divergent geographical races and species of Heliconius. This technique was successful both within and between species for obtaining high coverage of almost all coding regions and sufficient coverage of non-coding regions to perform population genetic analyses. We find major peaks of elevated population differentiation between races across hybrid zones, which indicate regions under strong divergent selection. These ‘islands’ of divergence appear to be more extensive between closely related species, but there is less clear evidence for such islands between more distantly related species at two further points along the ‘speciation continuum’. We also sequence fosmid clones across these regions in different Heliconius melpomene races. We find no major structural rearrangements but many relatively large (greater than 1 kb) insertion/deletion events (including gain/loss of transposable elements) that are variable between races. PMID:22201164

  14. Genetic Mapping and Exome Sequencing Identify Variants Associated with Five Novel Diseases

    PubMed Central

    Puffenberger, Erik G.; Jinks, Robert N.; Sougnez, Carrie; Cibulskis, Kristian; Willert, Rebecca A.; Achilly, Nathan P.; Cassidy, Ryan P.; Fiorentini, Christopher J.; Heiken, Kory F.; Lawrence, Johnny J.; Mahoney, Molly H.; Miller, Christopher J.; Nair, Devika T.; Politi, Kristin A.; Worcester, Kimberly N.; Setton, Roni A.; DiPiazza, Rosa; Sherman, Eric A.; Eastman, James T.; Francklyn, Christopher; Robey-Bond, Susan; Rider, Nicholas L.; Gabriel, Stacey; Morton, D. Holmes; Strauss, Kevin A.

    2012-01-01

    The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data. PMID:22279524

  15. Genetic profile for suspected dysferlinopathy identified by targeted next-generation sequencing

    PubMed Central

    Izumi, Rumiko; Niihori, Tetsuya; Takahashi, Toshiaki; Suzuki, Naoki; Tateyama, Maki; Watanabe, Chigusa; Sugie, Kazuma; Nakanishi, Hirotaka; Sobue, Gen; Kato, Masaaki; Warita, Hitoshi; Aoki, Yoko

    2015-01-01

    Objective: To investigate the genetic causes of suspected dysferlinopathy and to reveal the genetic profile for myopathies with dysferlin deficiency. Methods: Using next-generation sequencing, we analyzed 42 myopathy-associated genes, including DYSF, in 64 patients who were clinically or pathologically suspected of having dysferlinopathy. Putative pathogenic mutations were confirmed by Sanger sequencing. In addition, copy-number variations in DYSF were investigated using multiplex ligation-dependent probe amplification. We also analyzed the genetic profile for 90 patients with myopathy with dysferlin deficiency, as indicated by muscle specimen immunohistochemistry, including patients from a previous cohort. Results: We identified putative pathogenic mutations in 38 patients (59% of all investigated patients). Twenty-three patients had DYSF mutations, including 6 novel mutations. The remaining 16 patients, including a single patient who also carried the DYSF mutation, harbored putative pathogenic mutations in other genes. The genetic profile for 90 patients with dysferlin deficiency revealed that 70% had DYSF mutations (n = 63), 10% had CAPN3 mutations (n = 9), 2% had CAV3 mutations (n = 2), 3% had mutations in other genes (in single patients), and 16% did not have any identified mutations (n = 14). Conclusions: This study clarified the heterogeneous genetic profile for myopathies with dysferlin deficiency. Our results demonstrate the importance of a comprehensive analysis of related genes in improving the genetic diagnosis of dysferlinopathy as one of the most common subtypes of limb-girdle muscular dystrophy. Unresolved diagnoses should be investigated using whole-genome or whole-exome sequencing. PMID:27066573

  16. Multiplex targeted sequencing identifies recurrently mutated genes in autism spectrum disorders.

    PubMed

    O'Roak, Brian J; Vives, Laura; Fu, Wenqing; Egertson, Jarrett D; Stanaway, Ian B; Phelps, Ian G; Carvill, Gemma; Kumar, Akash; Lee, Choli; Ankenman, Katy; Munson, Jeff; Hiatt, Joseph B; Turner, Emily H; Levy, Roie; O'Day, Diana R; Krumm, Niklas; Coe, Bradley P; Martin, Beth K; Borenstein, Elhanan; Nickerson, Deborah A; Mefford, Heather C; Doherty, Dan; Akey, Joshua M; Bernier, Raphael; Eichler, Evan E; Shendure, Jay

    2012-12-21

    Exome sequencing studies of autism spectrum disorders (ASDs) have identified many de novo mutations but few recurrently disrupted genes. We therefore developed a modified molecular inversion probe method enabling ultra-low-cost candidate gene resequencing in very large cohorts. To demonstrate the power of this approach, we captured and sequenced 44 candidate genes in 2446 ASD probands. We discovered 27 de novo events in 16 genes, 59% of which are predicted to truncate proteins or disrupt splicing. We estimate that recurrent disruptive mutations in six genes (CHD8, DYRK1A, GRIN2B, TBR1, PTEN, and TBL1XR1) may contribute to 1% of sporadic ASDs. Our data support associations between specific genes and reciprocal subphenotypes (CHD8-macrocephaly and DYRK1A-microcephaly) and replicate the importance of a β-catenin-chromatin-remodeling network to ASD etiology. PMID:23160955

  17. Multi-locus DNA sequencing of Toxoplasma gondii isolated from Brazilian pigs identifies genetically divergent strains

    PubMed Central

    Frazão-Teixeira, E.; Sundar, N.; Dubey, J. P.; Grigg, M. E.; de Oliveira, F. C. R.

    2010-01-01

    Five Toxoplasma gondii isolates (TgPgBr1–5) were isolated from hearts and brains of pigs freshly purchased at the market of Campos dos Goytacazes, Northern Rio de Janeiro State, Brazil. Four of the five isolates were highly pathogenic in mice. Four genotypes were identified. Multi-locus PCR-DNA sequencing showed that each strain possessed a unique combination of archetypal and novel alleles not previously described in South America. The data suggest that different strains circulate in pigs destined for human consumption from those previously isolated from cats and chickens in Brazil. Further, multi-locus PCR-RFLP analyses failed to accurately genotype the Brazilian isolates due to the high presence of atypical alleles. This is the first report of multi-locus DNA sequencing of T. gondii isolates in pigs from Brazil. PMID:21051148

  18. Exome sequencing in amyotrophic lateral sclerosis identifies risk genes and pathways.

    PubMed

    Cirulli, Elizabeth T; Lasseigne, Brittany N; Petrovski, Slavé; Sapp, Peter C; Dion, Patrick A; Leblond, Claire S; Couthouis, Julien; Lu, Yi-Fan; Wang, Quanli; Krueger, Brian J; Ren, Zhong; Keebler, Jonathan; Han, Yujun; Levy, Shawn E; Boone, Braden E; Wimbish, Jack R; Waite, Lindsay L; Jones, Angela L; Carulli, John P; Day-Williams, Aaron G; Staropoli, John F; Xin, Winnie W; Chesi, Alessandra; Raphael, Alya R; McKenna-Yasek, Diane; Cady, Janet; Vianney de Jong, J M B; Kenna, Kevin P; Smith, Bradley N; Topp, Simon; Miller, Jack; Gkazi, Athina; Al-Chalabi, Ammar; van den Berg, Leonard H; Veldink, Jan; Silani, Vincenzo; Ticozzi, Nicola; Shaw, Christopher E; Baloh, Robert H; Appel, Stanley; Simpson, Ericka; Lagier-Tourenne, Clotilde; Pulst, Stefan M; Gibson, Summer; Trojanowski, John Q; Elman, Lauren; McCluskey, Leo; Grossman, Murray; Shneider, Neil A; Chung, Wendy K; Ravits, John M; Glass, Jonathan D; Sims, Katherine B; Van Deerlin, Vivianna M; Maniatis, Tom; Hayes, Sebastian D; Ordureau, Alban; Swarup, Sharan; Landers, John; Baas, Frank; Allen, Andrew S; Bedlack, Richard S; Harper, J Wade; Gitler, Aaron D; Rouleau, Guy A; Brown, Robert; Harms, Matthew B; Cooper, Gregory M; Harris, Tim; Myers, Richard M; Goldstein, David B

    2015-03-27

    Amyotrophic lateral sclerosis (ALS) is a devastating neurological disease with no effective treatment. We report the results of a moderate-scale sequencing study aimed at increasing the number of genes known to contribute to predisposition for ALS. We performed whole-exome sequencing of 2869 ALS patients and 6405 controls. Several known ALS genes were found to be associated, and TBK1 (the gene encoding TANK-binding kinase 1) was identified as an ALS gene. TBK1 is known to bind to and phosphorylate a number of proteins involved in innate immunity and autophagy, including optineurin (OPTN) and p62 (SQSTM1/sequestosome), both of which have also been implicated in ALS. These observations reveal a key role of the autophagic pathway in ALS and suggest specific targets for therapeutic intervention. PMID:25700176

  19. Exome Sequencing of Only Seven Qataris Identifies Potentially Deleterious Variants in the Qatari Population

    PubMed Central

    Rodriguez-Flores, Juan L.; Fuller, Jennifer; Hackett, Neil R.; Salit, Jacqueline; Malek, Joel A.; Al-Dous, Eman; Chouchane, Lotfi; Zirie, Mahmoud; Jayoussi, Amin; Mahmoud, Mai A.; Crystal, Ronald G.; Mezey, Jason G.

    2012-01-01

    The Qatari population, located at the Arabian migration crossroads of Africa and Eurasia, is comprised of Bedouin, Persian and African genetic subgroups. By deep exome sequencing of only 7 Qataris, including individuals in each subgroup, we identified 2,750 nonsynonymous SNPs predicted to be deleterious, many of which are linked to human health, or are in genes linked to human health. Many of these SNPs were at significantly elevated deleterious allele frequency in Qataris compared to other populations worldwide. Despite the small sample size, SNP allele frequency was highly correlated with a larger Qatari sample. Together, the data demonstrate that exome sequencing of only a small number of individuals can reveal genetic variations with potential health consequences in understudied populations. PMID:23139751

  20. Genomic Sequencing Identifies ELF3 as a Driver of Ampullary Carcinoma.

    PubMed

    Yachida, Shinichi; Wood, Laura D; Suzuki, Masami; Takai, Erina; Totoki, Yasushi; Kato, Mamoru; Luchini, Claudio; Arai, Yasuhito; Nakamura, Hiromi; Hama, Natsuko; Elzawahry, Asmaa; Hosoda, Fumie; Shirota, Tomoki; Morimoto, Nobuhiko; Hori, Kunio; Funazaki, Jun; Tanaka, Hikaru; Morizane, Chigusa; Okusaka, Takuji; Nara, Satoshi; Shimada, Kazuaki; Hiraoka, Nobuyoshi; Taniguchi, Hirokazu; Higuchi, Ryota; Oshima, Minoru; Okano, Keiichi; Hirono, Seiko; Mizuma, Masamichi; Arihiro, Koji; Yamamoto, Masakazu; Unno, Michiaki; Yamaue, Hiroki; Weiss, Matthew J; Wolfgang, Christopher L; Furukawa, Toru; Nakagama, Hitoshi; Vogelstein, Bert; Kiyono, Tohru; Hruban, Ralph H; Shibata, Tatsuhiro

    2016-02-01

    Ampullary carcinomas are highly malignant neoplasms that can have either intestinal or pancreatobiliary differentiation. To characterize somatic alterations in ampullary carcinomas, we performed whole-exome sequencing and DNA copy-number analysis on 60 ampullary carcinomas resected from clinically well-characterized Japanese and American patients. We next selected 92 genes and performed targeted sequencing to validate significantly mutated genes in an additional 112 cancers. The prevalence of driver gene mutations in carcinomas with the intestinal phenotype is different from those with the pancreatobiliary phenotype. We identified a characteristic significantly mutated driver gene (ELF3) as well as previously known driver genes (TP53, KRAS, APC, and others). Functional studies demonstrated that ELF3 silencing in normal human epithelial cells enhances their motility and invasion. PMID:26806338

  1. Complete genome sequence of Lactobacillus plantarum ZS2058, a probiotic strain with high conjugated linoleic acid production ability.

    PubMed

    Yang, Bo; Chen, Haiqin; Tian, Fengwei; Zhao, Jianxin; Gu, Zhennan; Zhang, Hao; Chen, Yong Q; Chen, Wei

    2015-11-20

    Lactobacillus plantarum ZS2058 was isolated from sauerkraut and identified to synthesize the beneficial metabolite conjugated linoleic acid. The genome contains a 3,197,363-bp chromosome and three plasmids. The sequence will facilitate identification and characterization of the genetic determinants for its putative biological benefits. PMID:26439428

  2. Identifying bottlenecks in transient and stable production of recombinant monoclonal-antibody sequence variants in Chinese hamster ovary cells

    PubMed Central

    Mason, Megan; Sweeney, Bernadette; Cain, Katharine; Stephens, Paul; Sharfstein, Susan T.

    2012-01-01

    The increasing demand for antibody-based therapeutics has emphasized the need for technologies to improve recombinant antibody titers from mammalian cell lines. Moreover, as antibody therapeutics address an increasing spectrum of indications, interest has increased in antibody engineering to improve affinity and biological activity. However, the cellular mechanisms that dictate expression and the relationships between antibody sequence and expression level remain poorly understood. Fundamental understanding of how mammalian cells handle high levels of transgene expression and of the relationship between sequence and expression are vital to the development of new antibodies and for increasing recombinant antibody titers. In this work, we analyzed a pair of mutants that vary by a single amino acid at Kabat position 49 (heavy chain framework), resulting in differential transient and stable titers with no apparent loss of antigen affinity. Through analysis of mRNA, gene copy number, intracellular antibody content, and secreted antibody, we found that while translational/post-translational mechanisms are limiting in transient systems, it appears that the amount of available transgenic mRNA becomes the limiting event upon stable integration of the recombinant genes. We also show that amino acid substitution at residue 49 results in production of a non-secreted HC variant and postulate that stable antibody expression is maintained at a level which prevents toxic accumulation of this HC-related protein. This study highlights the need for proper sequence engineering strategies when developing therapeutic antibodies and alludes to the early analysis of transient expression systems to identify the potential for aberrant stable expression behavior. PMID:22467228

  3. Exome sequencing identified null mutations in LOXL3 associated with early-onset high myopia

    PubMed Central

    Li, Jiali; Gao, Bei; Xiao, Xueshan; Li, Shiqiang; Jia, Xiaoyun; Sun, Wenmin; Guo, Xiangming

    2016-01-01

    Purpose To identify null mutations in novel genes associated with early-onset high myopia using whole exome sequencing. Methods Null mutations, including homozygous and compound heterozygous truncations, were selected from whole exome sequencing data for 298 probands with early-onset high myopia. These data were compared with those of 507 probands with other forms of eye diseases. Null mutations specific to early-onset high myopia were considered potential candidates. Candidate mutations were confirmed with Sanger sequencing and were subsequently evaluated in available family members and 480 healthy controls. Results A homozygous frameshift mutation (c.39dup; p.L14Afs*21) and a compound heterozygous frameshift mutation (c.39dup; p.L14Afs*21 and c.594delG; p.Q199Kfs*35) in LOXL3 were separately identified in two of the 298 probands with early-onset high myopia. These mutations were confirmed with Sanger sequencing and were not detected in 1,974 alleles of the controls from the same region (507 individuals with other conditions and 480 healthy control individuals). These two probands were singleton cases, and their parents had only heterozygous mutations. A homozygous missense mutation in LOXL3 was recently reported in a consanguineous family with Stickler syndrome. Conclusions Our results suggest that null mutations in LOXL3 are likely associated with autosomal recessive early-onset high myopia. LOXL3 is a potential candidate gene for high myopia, but this possibility should be confirmed in additional studies. LOXL3 null mutations in human beings are not lethal, providing a phenotype contrary to that in mice. PMID:26957899

  4. Structural gene and complete amino acid sequence of Pseudomonas aeruginosa IFO 3455 elastase.

    PubMed Central

    Fukushima, J; Yamamoto, S; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Okuda, K

    1989-01-01

    The DNA encoding the elastase of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited high levels of both elastase activity and elastase antigens. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature elastase consisted of 301 amino acids with a relative molecular mass of 32,926 daltons. The amino acid composition predicted from the DNA sequence was quite similar to the chemically determined composition of purified elastase reported previously. We also observed nucleotide sequence encoding a signal peptide and "pro" sequence consisting of 197 amino acids upstream from the mature elastase protein gene. The amino acid sequence analysis revealed that both the N-terminal sequence of the purified elastase and the N-terminal side sequences of the C-terminal tryptic peptide as well as the internal lysyl peptide fragment were completely identical to the deduced amino acid sequences. The pattern of identity of amino acid sequences was quite evident in the regions that include structurally and functionally important residues of Bacillus subtilis thermolysin. PMID:2493453

  5. Sequencing-based approach identified three new susceptibility loci for psoriasis.

    PubMed

    Sheng, Yujun; Jin, Xin; Xu, Jinhua; Gao, Jinping; Du, Xiaoqing; Duan, Dawei; Li, Bing; Zhao, Jinhua; Zhan, Wenying; Tang, Huayang; Tang, Xianfa; Li, Yang; Cheng, Hui; Zuo, Xianbo; Mei, Junpu; Zhou, Fusheng; Liang, Bo; Chen, Gang; Shen, Changbing; Cui, Hongzhou; Zhang, Xiaoguang; Zhang, Change; Wang, Wenjun; Zheng, Xiaodong; Fan, Xing; Wang, Zaixing; Xiao, Fengli; Cui, Yong; Li, Yingrui; Wang, Jun; Yang, Sen; Xu, Lei; Sun, Liangdan; Zhang, Xuejun

    2014-01-01

    In a previous large-scale exome sequencing analysis for psoriasis, we discovered seven common and low-frequency missense variants within six genes with genome-wide significance. Here we describe an in-depth analysis of noncoding variants based on sequencing data (10,727 cases and 10,582 controls) with replication in an independent cohort of Han Chinese individuals consisting of 4,480 cases and 6,521 controls to identify additional psoriasis susceptibility loci. We confirmed four known psoriasis susceptibility loci (IL12B, IFIH1, ERAP1 and RNF114; 2.30 × 10(-20)≤P≤2.41 × 10(-7)) and identified three new susceptibility loci: 4q24 (NFKB1) at rs1020760 (P=2.19 × 10(-8)), 12p13.3 (CD27-LAG3) at rs758739 (P=4.08 × 10(-8)) and 17q12 (IKZF3) at rs10852936 (P=1.96 × 10(-8)). Two suggestive loci, 3p21.31 and 17q25, are also identified with P<1.00 × 10(-6). The results of this study increase the number of confirmed psoriasis risk loci and provide novel insight into the pathogenesis of psoriasis. PMID:25006012

  6. Exome and whole-genome sequencing of esophageal adenocarcinoma identifies recurrent driver events and mutational complexity.

    PubMed

    Dulak, Austin M; Stojanov, Petar; Peng, Shouyong; Lawrence, Michael S; Fox, Cameron; Stewart, Chip; Bandla, Santhoshi; Imamura, Yu; Schumacher, Steven E; Shefler, Erica; McKenna, Aaron; Carter, Scott L; Cibulskis, Kristian; Sivachenko, Andrey; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Auclair, Daniel; Thompson, Kristin; Sougnez, Carrie; Onofrio, Robert C; Guiducci, Candace; Beroukhim, Rameen; Zhou, Zhongren; Lin, Lin; Lin, Jules; Reddy, Rishindra; Chang, Andrew; Landrenau, Rodney; Pennathur, Arjun; Ogino, Shuji; Luketich, James D; Golub, Todd R; Gabriel, Stacey B; Lander, Eric S; Beer, David G; Godfrey, Tony E; Getz, Gad; Bass, Adam J

    2013-05-01

    The incidence of esophageal adenocarcinoma (EAC) has risen 600% over the last 30 years. With a 5-year survival rate of ~15%, the identification of new therapeutic targets for EAC is greatly important. We analyze the mutation spectra from whole-exome sequencing of 149 EAC tumor-normal pairs, 15 of which have also been subjected to whole-genome sequencing. We identify a mutational signature defined by a high prevalence of A>C transversions at AA dinucleotides. Statistical analysis of exome data identified 26 significantly mutated genes. Of these genes, five (TP53, CDKN2A, SMAD4, ARID1A and PIK3CA) have previously been implicated in EAC. The new significantly mutated genes include chromatin-modifying factors and candidate contributors SPG20, TLR4, ELMO1 and DOCK2. Functional analyses of EAC-derived mutations in ELMO1 identify increased cellular invasion. Therefore, we suggest the potential activation of the RAC1 pathway as a contributor to EAC tumorigenesis. PMID:23525077

  7. Deep sequencing identifies genetic heterogeneity and recurrent convergent evolution in chronic lymphocytic leukemia

    PubMed Central

    Ojha, Juhi; Ayres, Jackline; Secreto, Charla; Tschumper, Renee; Rabe, Kari; Van Dyke, Daniel; Slager, Susan; Shanafelt, Tait; Fonseca, Rafael; Kay, Neil E.

    2015-01-01

    Recent high-throughput sequencing and microarray studies have characterized the genetic landscape and clonal complexity of chronic lymphocytic leukemia (CLL). Here, we performed a longitudinal study in a homogeneously treated cohort of 12 patients, with sequential samples obtained at comparable stages of disease. We identified clonal competition between 2 or more genetic subclones in 70% of the patients with relapse, and stable clonal dynamics in the remaining 30%. By deep sequencing, we identified a high reservoir of genetic heterogeneity in the form of several driver genes mutated in small subclones underlying the disease course. Furthermore, in 2 patients, we identified convergent evolution, characterized by the combination of genetic lesions affecting the same genes or copy number abnormality in different subclones. The phenomenon affects multiple CLL putative driver abnormalities, including mutations in NOTCH1, SF3B1, DDX3X, and del(11q23). This is the first report documenting convergent evolution as a recurrent event in the CLL genome. Furthermore, this finding suggests the selective advantage of specific combinations of genetic lesions for CLL pathogenesis in a subset of patients. PMID:25377784

  8. Utility of next-generation RNA-sequencing in identifying chimeric transcription involving human endogenous retroviruses.

    PubMed

    Sokol, Martin; Jessen, Karen Margrethe; Pedersen, Finn Skou

    2016-01-01

    Several studies have shown that human endogenous retroviruses and endogenous retrovirus-like repeats (here collectively HERVs) impose direct regulation on human genes through enhancer and promoter motifs present in their long terminal repeats (LTRs). Although chimeric transcription in which novel gene isoforms containing retroviral and human sequence are transcribed from viral promoters is commonly associated with disease, regulation by HERVs is beneficial in other settings; for example, in human testis chimeric isoforms of TP63 induced by an ERV9 LTR protect the male germ line upon DNA damage by inducing apoptosis, whereas in the human globin locus the γ- and β-globin switch during normal hematopoiesis is mediated by complex interactions of an ERV9 LTR and surrounding human sequence. The advent of deep sequencing or next-generation sequencing (NGS) has revolutionized the way researchers solve important scientific questions and develop novel hypotheses in relation to human genome regulation. We recently applied next-generation paired-end RNA-sequencing (RNA-seq) together with chromatin immunoprecipitation with sequencing (ChIP-seq) to examine ERV9 chimeric transcription in human reference cell lines from Encyclopedia of DNA Elements (ENCODE). This led to the discovery of advanced regulation mechanisms by ERV9s and other HERVs across numerous human loci including transcription of large gene-unannotated genomic regions, as well as cooperative regulation by multiple HERVs and non-LTR repeats such as Alu elements. In this article, well-established examples of human gene regulation by HERVs are reviewed followed by a description of paired-end RNA-seq, and its application in identifying chimeric transcription genome-wide. Based on integrative analyses of RNA-seq and ChIP-seq data, we then present novel examples of regulation by ERV9s of tumor suppressor genes CADM2 and SEMA3A, as well as transcription of an unannotated region. Taken together, this article highlights

  9. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  10. Phylogenetic analysis of rbcL sequences identifies Acorus calamus as the primal extant monocotyledon.

    PubMed Central

    Duvall, M R; Learn, G H; Eguiarte, L E; Clegg, M T

    1993-01-01

    The identity of the oldest lineage of monocotyledons is a subject of debate. Alternative interpretations of morphological homologies are variously consistent with proposals that species of Alismatanae, Dioscoreales, or Melanthiales were the earliest descendants of the first monocotyledons. We present phylogenetic analyses based on DNA sequences of the plastid locus rbcL in which Acorus calamus, an herb with unspecialized floral features and of uncertain affinities, is supported as a member of the oldest extant lineage of monocotyledons. This conclusion is consistent with a substantial body of morphological, anatomical, and embryological evidence and offers an explanation for the failure to identify any close relationship between Acorus and other genera. PMID:8506310

  11. Identifying and Mitigating Bias in Next-Generation Sequencing Methods for Chromatin Biology

    PubMed Central

    Meyer, Clifford A.; Liu, X. Shirley

    2015-01-01

    Next generation sequencing (NGS) technologies have been used in diverse ways to investigate facets of chromatin biology by identifying genomic loci that are bound by transcription factors, occupied by nucleosomes, accessible to nuclease cleavage, or physically interact with remote genomic loci. Reaching sound biological conclusions from such NGS enrichment profiles, however, requires that many potential biases be taken into account. In this Review we discuss common ways in which bias may be introduced into NGS chromatin profiling data, ways in which these biases can be diagnosed, and analytical techniques to mitigate their effect. PMID:25223782

  12. A novel PCCB mutation in a Thai patient with propionic acidemia identified by exome sequencing.

    PubMed

    Porntaveetus, Thantrira; Srichomthong, Chalurmpon; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk

    2015-01-01

    Propionic acidemia (PA) is an inborn error of metabolism, caused by mutations in either the PCCA or PCCB gene, leading to mitochondrial accumulation of propionyl-CoA and its by-products. Here we report a 6-year-old Thai boy with PA who was born to consanguineous parents. Exome sequencing identified a novel homozygous frameshift insertion (c.379_380insA; p.T127NfsX160) in the PCCB gene, expanding its mutational spectrum. PMID:27081542

  13. A novel PCCB mutation in a Thai patient with propionic acidemia identified by exome sequencing

    PubMed Central

    Porntaveetus, Thantrira; Srichomthong, Chalurmpon; Suphapeetiporn, Kanya; Shotelersuk, Vorasuk

    2015-01-01

    Propionic acidemia (PA) is an inborn error of metabolism, caused by mutations in either the PCCA or PCCB gene, leading to mitochondrial accumulation of propionyl-CoA and its by-products. Here we report a 6-year-old Thai boy with PA who was born to consanguineous parents. Exome sequencing identified a novel homozygous frameshift insertion (c.379_380insA; p.T127NfsX160) in the PCCB gene, expanding its mutational spectrum. PMID:27081542

  14. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  15. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets

    SciTech Connect

    Schulze, Kornelius; Imbeaud, Sandrine; Letouzé, Eric; Alexandrov, Ludmil B.; Calderaro, Julien; Rebouissou, Sandra; Couchy, Gabrielle; Meiller, Clément; Shinde, Jayendra; Soysouvanh, Frederic; Calatayud, Anna-Line; Pinyol, Roser; Pelletier, Laura; Balabaud, Charles; Laurent, Alexis; Blanc, Jean-Frederic; Mazzaferro, Vincenzo; Calvo, Fabien; Villanueva, Augusto; Nault, Jean-Charles; Bioulac-Sage, Paulette; Stratton, Michael R.; Llovet, Josep M.; Zucman-Rossi, Jessica

    2015-03-30

    Our genomic analyses promise to improve tumor characterization to optimize personalized treatment for patients with hepatocellular carcinoma (HCC). Exome sequencing analysis of 243 liver tumors identified mutational signatures associated with specific risk factors, mainly combined alcohol and tobacco consumption and exposure to aflatoxin B1. We identified 161 putative driver genes associated with 11 recurrently altered pathways. Associations of mutations defined 3 groups of genes related to risk factors and centered on CTNNB1 (alcohol), TP53 (hepatitis B virus, HBV) and AXIN1. These analyses according to tumor stage progression identified TERT promoter mutation as an early event, whereas FGF3, FGF4, FGF19 or CCND1 amplification and TP53 and CDKN2A alterations appeared at more advanced stages in aggressive tumors. In 28% of the tumors, we identified genetic alterations potentially targetable by US Food and Drug Administration (FDA)–approved drugs. Finally, we identified risk factor–specific mutational signatures and defined the extensive landscape of altered genes and pathways in HCC, which will be useful to design clinical trials for targeted therapy.

  16. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets

    DOE PAGESBeta

    Schulze, Kornelius; Imbeaud, Sandrine; Letouzé, Eric; Alexandrov, Ludmil B.; Calderaro, Julien; Rebouissou, Sandra; Couchy, Gabrielle; Meiller, Clément; Shinde, Jayendra; Soysouvanh, Frederic; et al

    2015-03-30

    Our genomic analyses promise to improve tumor characterization to optimize personalized treatment for patients with hepatocellular carcinoma (HCC). Exome sequencing analysis of 243 liver tumors identified mutational signatures associated with specific risk factors, mainly combined alcohol and tobacco consumption and exposure to aflatoxin B1. We identified 161 putative driver genes associated with 11 recurrently altered pathways. Associations of mutations defined 3 groups of genes related to risk factors and centered on CTNNB1 (alcohol), TP53 (hepatitis B virus, HBV) and AXIN1. These analyses according to tumor stage progression identified TERT promoter mutation as an early event, whereas FGF3, FGF4, FGF19 or CCND1 amplification and TP53 and CDKN2A alterations appeared at more advanced stages in aggressive tumors. In 28% of the tumors, we identified genetic alterations potentially targetable by US Food and Drug Administration (FDA)–approved drugs. Finally, we identified risk factor–specific mutational signatures and defined the extensive landscape of altered genes and pathways in HCC, which will be useful to design clinical trials for targeted therapy.

  17. Next-generation sequencing identifies novel CACNA1A gene mutations in episodic ataxia type 2.

    PubMed

    Maksemous, Neven; Roy, Bishakha; Smith, Robert A; Griffiths, Lyn R

    2016-03-01

    Episodic ataxia type 2 (EA2) is a rare autosomal dominantly inherited neurological disorder characterized by recurrent disabling imbalance, vertigo, and episodes of ataxia lasting minutes to hours. EA2 is caused most often by loss-of-function mutations of the calcium channel gene CACNA1A. In addition to EA2, mutations in CACNA1A are responsible for two other allelic disorders: familial hemiplegic migraine type 1 (FHM1) and spinocerebellar ataxia type 6 (SCA6). Herein, we have utilized next-generation sequencing (NGS) to screen the coding sequence, exon-intron boundaries, and untranslated regions (UTRs) of five genes where mutation is known to produce symptoms related to EA2, including CACNA1A. We performed this screening in a group of 31 unrelated patients with EA2 symptoms. Both novel and known mutations were detected through NGS technology, and confirmed through Sanger sequencing. Genetic testing showed in total 15 mutation-bearing patients (48%), of which nine carried novel mutations (6 missense and 3 small frameshift deletion mutations) and six carried known mutations (4 missense and 2 nonsense). These results demonstrate the efficiency of our NGS panel for detecting known and novel mutations for EA2 in the CACNA1A gene, also identifying a novel missense mutation in ATP1A2, which is not a normal target for EA2 screening. PMID:27066515

  18. Characterization of functional methylomes by next-generation capture sequencing identifies novel disease-associated variants

    PubMed Central

    Allum, Fiona; Shao, Xiaojian; Guénard, Frédéric; Simon, Marie-Michelle; Busche, Stephan; Caron, Maxime; Lambourne, John; Lessard, Julie; Tandre, Karolina; Hedman, Åsa K.; Kwan, Tony; Ge, Bing; Rönnblom, Lars; McCarthy, Mark I.; Deloukas, Panos; Richmond, Todd; Burgess, Daniel; Spector, Timothy D.; Tchernof, André; Marceau, Simon; Lathrop, Mark; Vohl, Marie-Claude; Pastinen, Tomi; Grundberg, Elin; Ahmadi, Kourosh R.; Ainali, Chrysanthi; Barrett, Amy; Bataille, Veronique; Bell, Jordana T.; Buil, Alfonso; Dermitzakis, Emmanouil T.; Dimas, Antigone S.; Durbin, Richard; Glass, Daniel; Hassanali, Neelam; Ingle, Catherine; Knowles, David; Krestyaninova, Maria; Lindgren, Cecilia M.; Lowe, Christopher E.; Meduri, Eshwar; di Meglio, Paola; Min, Josine L.; Montgomery, Stephen B.; Nestle, Frank O.; Nica, Alexandra C.; Nisbet, James; O'Rahilly, Stephen; Parts, Leopold; Potter, Simon; Sandling, Johanna; Sekowska, Magdalena; Shin, So-Youn; Small, Kerrin S.; Soranzo, Nicole; Surdulescu, Gabriela; Travers, Mary E.; Tsaprouni, Loukia; Tsoka, Sophia; Wilk, Alicja; Yang, Tsun-Po; Zondervan, Krina T.

    2015-01-01

    Most genome-wide methylation studies (EWAS) of multifactorial disease traits use targeted arrays or enrichment methodologies preferentially covering CpG-dense regions, to characterize sufficiently large samples. To overcome this limitation, we present here a new customizable, cost-effective approach, methylC-capture sequencing (MCC-Seq), for sequencing functional methylomes, while simultaneously providing genetic variation information. To illustrate MCC-Seq, we use whole-genome bisulfite sequencing on adipose tissue (AT) samples and public databases to design AT-specific panels. We establish its efficiency for high-density interrogation of methylome variability by systematic comparisons with other approaches and demonstrate its applicability by identifying novel methylation variation within enhancers strongly correlated to plasma triglyceride and HDL-cholesterol, including at CD36. Our more comprehensive AT panel assesses tissue methylation and genotypes in parallel at ∼4 and ∼3 M sites, respectively. Our study demonstrates that MCC-Seq provides comparable accuracy to alternative approaches but enables more efficient cataloguing of functional and disease-relevant epigenetic and genetic variants for large-scale EWAS. PMID:26021296

  19. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Casual mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe at present in nature. The question of which is the dominant force of protein evolution still lacks an unambiguous answer. Casual mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable to have a well-tuned amino acid sequence indistinguishable from a random one. In order to study the possibility of discriminating between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109

  20. Sequencing and molecular modeling identifies candidate members of Caliciviridae family in bats.

    PubMed

    Kemenesi, Gábor; Gellért, Ákos; Dallos, Bianka; Görföl, Tamás; Boldogh, Sándor; Estók, Péter; Marton, Szilvia; Oldal, Miklós; Martella, Vito; Bányai, Krisztián; Jakab, Ferenc

    2016-07-01

    Emerging viral diseases represent an ongoing challenge for a globalized world, and bats constitute an immense, partially explored reservoir of potentially zoonotic viruses. Caliciviruses are important human and animal pathogens and, as observed for human noroviruses, they may impact human health on a global scale. By screening fecal samples of bats in Hungary, calicivirus RNA was identified in the samples of Myotis daubentonii and Eptesicus serotinus bats. To characterize the bat caliciviruses in more detail, large portions of the genome sequence of the viruses were determined. Phylogenetic analyses and molecular modeling firmly identified the two viruses as candidate members within the Caliciviridae family, with one calicivirus strain resembling members of the Sapovirus genus and the other bat calicivirus being more related to porcine caliciviruses of the proposed genus Valovirus. These data serve the effort of detecting reservoir hosts for potential emerging viruses and recognizing important evolutionary relationships. PMID:27085289

  1. Genomic Aberrations in Crizotinib Resistant Lung Adenocarcinoma Samples Identified by Transcriptome Sequencing

    PubMed Central

    Saber, Ali; van der Wekken, Anthonie J.; Kok, Klaas; Terpstra, M. Martijn; Bosman, Lisette J.; Mastik, Mirjam F.; Timens, Wim; Schuuring, Ed; Hiltermann, T. Jeroen N.; Groen, Harry J. M.; van den Berg, Anke

    2016-01-01

    ALK-break positive non-small cell lung cancer (NSCLC) patients initially respond to crizotinib, but resistance occurs inevitably. In this study we aimed to identify fusion genes in crizotinib resistant tumor samples. Re-biopsies of three patients were subjected to paired-end RNA sequencing to identify fusion genes using deFuse and EricScript. The IGV browser was used to determine presence of known resistance-associated mutations. Sanger sequencing was used to validate fusion genes and digital droplet PCR to validate mutations. ALK fusion genes were detected in all three patients with EML4 being the fusion partner. One patient had no additional fusion genes. Another patient had one additional fusion gene, but without a predicted open reading frame (ORF). The third patient had three additional fusion genes, of which two were derived from the same chromosomal region as the EML4-ALK. A predicted ORF was identified only in the CLIP4-VSNL1 fusion product. The fusion genes validated in the post-treatment sample were also present in the biopsy before crizotinib. ALK mutations (p.C1156Y and p.G1269A) detected in the re-biopsies of two patients, were not detected in pre-treatment biopsies. In conclusion, fusion genes identified in our study are unlikely to be involved in crizotinib resistance based on presence in pre-treatment biopsies. The detection of ALK mutations in post-treatment tumor samples of two patients underlines their role in crizotinib resistance. PMID:27045755

  2. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture.

    PubMed

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C P G M; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Amin, Najaf; van Duijn, Cornelia M; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce B J; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia M T; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2015-10-01

    The extent to which low-frequency (minor allele frequency (MAF) between 1-5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is mainly unknown. Bone mineral density (BMD) is highly heritable, a major predictor of osteoporotic fractures, and has been previously associated with common genetic variants, as well as rare, population-specific, coding variants. Here we identify novel non-coding genetic variants with large effects on BMD (ntotal = 53,236) and fracture (ntotal = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K (ref. 10); a population-based genome sequencing consortium), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000 Genomes reference panel (n = 26,534), and de novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size fourfold larger than the mean of previously reported common variants for lumbar spine BMD (rs11692564(T), MAF = 1.6%, replication effect size = +0.20 s.d., Pmeta = 2 × 10(-14)), which was also associated with a decreased risk of fracture (odds ratio = 0.85; P = 2 × 10(-11); ncases = 98,742 and ncontrols = 409,511). Using an En1(cre/flox) mouse model, we observed that conditional loss of En1 results in low bone mass, probably as a consequence of high bone turnover. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817(T), MAF = 1.2%, replication effect size = +0.41 s.d., Pmeta = 1 × 10(-11)). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture

  3. Genome sequence of a novel victorivirus identified in the phytopathogenic fungus Alternaria arborescens.

    PubMed

    Komatsu, Ken; Katayama, Yukie; Omatsu, Tsutomu; Mizutani, Tetsuya; Fukuhara, Toshiyuki; Kodama, Motoichiro; Arie, Tsutomu; Teraoka, Tohru; Moriyama, Hiromitsu

    2016-06-01

    Strains of the phytopathogenic fungus Alternaria spp. have been found to contain a variety of double-stranded RNA (dsRNA) elements indicative of mycovirus infection. Here, we report the molecular characterization of a novel dsRNA mycovirus, Alternaria arborescens victorivirus 1 (AaVV1), from A. arborescens, the tomato pathotype of A. alternata. Using next-generation sequencing of dsRNA purified from an A. arborescens strain from the United States of America, we found that the AaVV1 genome is 5203 bp in length and contains two open reading frames (ORF1 and 2) that overlap at the tetranucleotide AUGA. Proteins encoded by ORF1 and ORF2 showed significant similarities to the coat protein (CP) and the RNA-dependent RNA polymerase (RdRp), respectively, of dsRNA mycoviruses of the genus Victorivirus. Pairwise comparisons and phylogenetic analysis of the deduced amino acid sequences of both CP and RdRp indicated that AaVV1 is a member of a distinct species of the genus Victorivirus in the family Totiviridae. PMID:26923927

  4. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton

    PubMed Central

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10–23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing for addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  5. High-throughput single-cell sequencing identifies photoheterotrophs and chemoautotrophs in freshwater bacterioplankton.

    PubMed

    Martinez-Garcia, Manuel; Swan, Brandon K; Poulton, Nicole J; Gomez, Monica Lluesma; Masland, Dashiell; Sieracki, Michael E; Stepanauskas, Ramunas

    2012-01-01

    Recent discoveries suggest that photoheterotrophs (rhodopsin-containing bacteria (RBs) and aerobic anoxygenic phototrophs (AAPs)) and chemoautotrophs may be significant for marine and freshwater ecosystem productivity. However, their abundance and taxonomic identities remain largely unknown. We used a combination of single-cell and metagenomic DNA sequencing to study the predominant photoheterotrophs and chemoautotrophs inhabiting the euphotic zone of temperate, physicochemically diverse freshwater lakes. Multi-locus sequencing of 712 single amplified genomes, generated by fluorescence-activated cell sorting and whole genome multiple displacement amplification, showed that most of the cosmopolitan freshwater clusters contain photoheterotrophs. These comprised at least 10-23% of bacterioplankton, and RBs were the dominant fraction. Our data demonstrate that Actinobacteria, including clusters acI, Luna and acSTL, are the predominant freshwater RBs. We significantly broaden the known taxonomic range of freshwater RBs, to include Alpha-, Beta-, Gamma- and Deltaproteobacteria, Verrucomicrobia and Sphingobacteria. By sequencing single cells, we found evidence for inter-phyla horizontal gene transfer and recombination of rhodopsin genes and identified specific taxonomic groups involved in these evolutionary processes. Our data suggest that members of the ubiquitous betaproteobacteria Polynucleobacter spp. are the dominant AAPs in temperate freshwater lakes. Furthermore, the RuBisCO (ribulose 1,5-bisphosphate carboxylase/oxygenase) gene was found in several single cells of Betaproteobacteria, Bacteroidetes and Gammaproteobacteria, suggesting that chemoautotrophs may be more prevalent among aerobic bacterioplankton than previously thought. This study demonstrates the power of single-cell DNA sequencing for addressing previously unresolved questions about the metabolic potential and evolutionary histories of uncultured microorganisms, which dominate most natural environments.

  6. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies

    PubMed Central

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-01-01

    The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence time between the two domains based on synonymous changes was around 5 million yr, similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets of selection, such as PaPRR3 and PaGI. PMID:27172202
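
    To make the two per-SNP quantities mentioned above concrete (FST and the allele frequency difference between two domains), here is a minimal sketch. It uses the textbook heterozygosity-based definition FST = (HT - HS) / HT for one biallelic SNP in two equally weighted populations. This is an illustrative assumption; the abstract does not state which FST estimator or pooled-sequencing correction the authors used, and the function names are hypothetical.

```python
def fst_two_populations(p1: float, p2: float) -> float:
    """Heterozygosity-based FST for one biallelic SNP in two populations.

    p1 and p2 are the frequencies of the same allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled (total) population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # The SNP is monomorphic overall, so there is no differentiation.
        return 0.0
    return (h_t - h_s) / h_t


def allele_frequency_difference(p1: float, p2: float) -> float:
    """Absolute difference in allele frequency between the two populations."""
    return abs(p1 - p2)


# Example with made-up frequencies, not data from the study.
print(fst_two_populations(0.2, 0.8), allele_frequency_difference(0.2, 0.8))
```

    Looking at both values together is useful because FST depends on the overall allele frequency as well as on the difference between populations.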

  7. Challenges in identifying cancer genes by analysis of exome sequencing data.

    PubMed

    Hofree, Matan; Carter, Hannah; Kreisberg, Jason F; Bandyopadhyay, Sourav; Mischel, Paul S; Friend, Stephen; Ideker, Trey

    2016-01-01

    Massively parallel sequencing has permitted an unprecedented examination of the cancer exome, leading to predictions that all genes important to cancer will soon be identified by genetic analysis of tumours. To examine this potential, here we evaluate the ability of state-of-the-art sequence analysis methods to specifically recover known cancer genes. While some cancer genes are identified by analysis of recurrence, spatial clustering or predicted impact of somatic mutations, many remain undetected due to lack of power to discriminate driver mutations from the background mutational load (13-60% recall of cancer genes impacted by somatic single-nucleotide variants, depending on the method). Cancer genes not detected by mutation recurrence also tend to be missed by all types of exome analysis. Nonetheless, these genes are implicated by other experiments such as functional genetic screens and expression profiling. These challenges are only partially addressed by increasing sample size and will likely hold even as greater numbers of tumours are analysed. PMID:27417679

  8. Whole-genome sequencing identifies a recurrent functional synonymous mutation in melanoma

    PubMed Central

    Gartner, Jared J.; Parker, Stephen C. J.; Prickett, Todd D.; Dutton-Regester, Ken; Stitzel, Michael L.; Lin, Jimmy C.; Davis, Sean; Simhadri, Vijaya L.; Jha, Sujata; Katagiri, Nobuko; Gotea, Valer; Teer, Jamie K.; Morken, Mario A.; Bhanot, Umesh K.; Chen, Guo; Elnitski, Laura L.; Davies, Michael A.; Gershenwald, Jeffrey E.; Carter, Hannah; Karchin, Rachel; Robinson, William; Robinson, Steven; Rosenberg, Steven A.; Collins, Francis S.; Parmigiani, Giovanni; Komar, Anton A.; Kimchi-Sarfaty, Chava; Hayward, Nicholas K.; Margulies, Elliott H.; Samuels, Yardena

    2013-01-01

    Synonymous mutations, which do not alter the protein sequence, have been shown to affect protein function [Sauna ZE, Kimchi-Sarfaty C (2011) Nat Rev Genet 12(10):683–691]. However, synonymous mutations are rarely investigated in the cancer genomics field. We used whole-genome and -exome sequencing to identify somatic mutations in 29 melanoma samples. Validation of one synonymous somatic mutation in BCL2L12 in 285 samples identified 12 cases that harbored the recurrent F17F mutation. This mutation led to increased BCL2L12 mRNA and protein levels because of differential targeting of WT and mutant BCL2L12 by hsa-miR-671-5p. Protein made from mutant BCL2L12 transcript bound p53, inhibited UV-induced apoptosis more efficiently than WT BCL2L12, and reduced endogenous p53 target gene transcription. This report shows selection of a recurrent somatic synonymous mutation in cancer. Our data indicate that silent alterations have a role to play in human cancer, emphasizing the importance of their investigation in future cancer genome studies. PMID:23901115

  9. Challenges in identifying cancer genes by analysis of exome sequencing data

    PubMed Central

    Hofree, Matan; Carter, Hannah; Kreisberg, Jason F.; Bandyopadhyay, Sourav; Mischel, Paul S.; Friend, Stephen; Ideker, Trey

    2016-01-01

    Massively parallel sequencing has permitted an unprecedented examination of the cancer exome, leading to predictions that all genes important to cancer will soon be identified by genetic analysis of tumours. To examine this potential, here we evaluate the ability of state-of-the-art sequence analysis methods to specifically recover known cancer genes. While some cancer genes are identified by analysis of recurrence, spatial clustering or predicted impact of somatic mutations, many remain undetected due to lack of power to discriminate driver mutations from the background mutational load (13–60% recall of cancer genes impacted by somatic single-nucleotide variants, depending on the method). Cancer genes not detected by mutation recurrence also tend to be missed by all types of exome analysis. Nonetheless, these genes are implicated by other experiments such as functional genetic screens and expression profiling. These challenges are only partially addressed by increasing sample size and will likely hold even as greater numbers of tumours are analysed. PMID:27417679

  10. Next-generation sequencing identifies major DNA methylation changes during progression of Ph+ chronic myeloid leukemia.

    PubMed

    Heller, G; Topakian, T; Altenberger, C; Cerny-Reiterer, S; Herndlhofer, S; Ziegler, B; Datlinger, P; Byrgazov, K; Bock, C; Mannhalter, C; Hörmann, G; Sperr, W R; Lion, T; Zielinski, C C; Valent, P; Zöchbauer-Müller, S

    2016-09-01

    Little is known about the impact of DNA methylation on the evolution/progression of Ph+ chronic myeloid leukemia (CML). We investigated the methylome of CML patients in chronic phase (CP-CML), accelerated phase (AP-CML) and blast crisis (BC-CML) as well as in controls by reduced representation bisulfite sequencing. Although only ~600 differentially methylated CpG sites were identified in samples obtained from CP-CML patients compared with controls, ~6500 differentially methylated CpG sites were found in samples from BC-CML patients. In the majority of affected CpG sites, methylation was increased. In CP-CML patients who progressed to AP-CML/BC-CML, we identified up to 897 genes that were methylated at the time of progression but not at the time of diagnosis. Using RNA-sequencing, we observed downregulated expression of many of these genes in BC-CML compared with CP-CML samples. Several of them are well-known tumor-suppressor genes or regulators of cell proliferation, and gene re-expression was observed with the use of epigenetically active drugs. Together, our results demonstrate that CpG site methylation clearly increases during CML progression and that it may provide a useful basis for revealing new targets of therapy in advanced CML. PMID:27211271

  11. Genomic sequencing of meningiomas identifies oncogenic SMO and AKT1 mutations

    PubMed Central

    Brastianos, Priscilla K.; Horowitz, Peleg M.; Santagata, Sandro; Jones, Robert T.; McKenna, Aaron; Getz, Gad; Ligon, Keith L.; Palescandolo, Emanuele; Van Hummelen, Paul; Ducar, Matthew D.; Raza, Alina; Sunkavalli, Ashwini; MacConaill, Laura E.; Stemmer-Rachamimov, Anat O.; Louis, David N.; Hahn, William C.; Dunn, Ian F.; Beroukhim, Rameen

    2013-01-01

    Meningiomas are the most common primary nervous system tumor. The tumor suppressor NF2 is disrupted in approximately half of meningiomas (ref. 1), but the complete spectrum of genetic changes remains undefined. We performed whole-genome or whole-exome sequencing on 17 meningiomas and focused sequencing on an additional 48 tumors to identify and validate somatic genetic alterations. Most meningiomas exhibited simple genomes, with fewer mutations, rearrangements, and copy-number alterations than reported in other adult tumors. However, several meningiomas harbored more complex patterns of copy-number changes and rearrangements including one tumor with chromothripsis. We confirmed focal NF2 inactivation in 43% of tumors and found alterations in epigenetic modifiers among an additional 8% of tumors. A subset of meningiomas lacking NF2 alterations harbored recurrent oncogenic mutations in AKT1 (E17K) and SMO (W535L) and exhibited immunohistochemical evidence of activation of their pathways. These mutations were present in therapeutically challenging tumors of the skull base and higher grade. These results begin to define the spectrum of genetic alterations in meningiomas and identify potential therapeutic targets. PMID:23334667

  12. Pathogenic mutations in two families with congenital cataract identified with whole-exome sequencing

    PubMed Central

    Kondo, Yukiko; Saitsu, Hirotomo; Miyamoto, Toshinobu; Lee, Byung Joo; Nishiyama, Kiyomi; Nakashima, Mitsuko; Tsurusaki, Yoshinori; Doi, Hiroshi; Miyake, Noriko; Kim, Jeong Hun; Yu, Young Suk

    2013-01-01

    Purpose: Congenital cataract is one of the most frequent causes of visual impairment and childhood blindness. Approximately one quarter to one third of congenital cataract cases may have a genetic cause. However, phenotypic variability and genetic heterogeneity hamper correct genetic diagnosis. In this study, we used whole-exome sequencing (WES) to identify pathogenic mutations in two Korean families with congenital cataract. Methods: Two affected members from each family were pooled and processed for WES. The detected variants were confirmed with direct sequencing. Results: WES readily identified a CRYAA mutation in family A and a CRYGC mutation in family B. The c.61C>T (p.R21W) mutation in CRYAA has been previously reported in a family with congenital cataract and microcornea. The novel mutation, c.124delT, in CRYGC may lead to a premature stop codon (p.C42Afs*60). Conclusions: This study clearly shows the efficacy of WES for rapid genetic diagnosis of congenital cataract with an unknown cause. WES will be the first choice for clinical services in the near future, providing useful information for genetic counseling and family planning. PMID:23441109

  13. Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

    PubMed

    Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

    2016-01-01

    The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence time between the two domains based on synonymous changes was around 5 million yr, similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets of selection, such as PaPRR3 and PaGI. PMID:27172202

  14. Efficiently identifying genome-wide changes with next-generation sequencing data.

    PubMed

    Huang, Weichun; Umbach, David M; Vincent Jordan, Nicole; Abell, Amy N; Johnson, Gary L; Li, Leping

    2011-10-01

    We propose a new and effective statistical framework for identifying genome-wide differential changes in epigenetic marks with ChIP-seq data or gene expression with mRNA-seq data, and we develop a new software tool, EpiCenter, that can efficiently perform data analysis. The key features of our framework are: (i) providing multiple normalization methods to achieve appropriate normalization under different scenarios, (ii) using a sequence of three statistical tests to eliminate background regions and to account for different sources of variation and (iii) allowing adjustment for multiple testing to control false discovery rate (FDR) or family-wise type I error. Our software EpiCenter can perform multiple analytic tasks including: (i) identifying genome-wide epigenetic changes or differentially expressed genes, (ii) finding transcription factor binding sites and (iii) converting multiple-sample sequencing data into a single read-count data matrix. By simulation, we show that our framework achieves a low FDR consistently over a broad range of read coverage and biological variation. Through two real examples, we demonstrate the effectiveness of our framework and the usage of our tool. In particular, we show that our novel and robust 'parsimony' normalization method is superior to the widely-used 'tagRatio' method. Our software EpiCenter is freely available to the public. PMID:21803788
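
    The multiple-testing adjustment mentioned in feature (iii) can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure, one standard way to control the FDR. The abstract does not say which FDR procedure EpiCenter implements, so this is an assumed stand-in rather than the tool's own method. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    This is the standard step-up procedure; it is NOT taken from EpiCenter,
    whose exact adjustment method is not given in the abstract.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four candidate regions.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
significant = [q <= 0.05 for q in qvals]
print(qvals, significant)
```

    Regions whose adjusted value falls at or below the chosen FDR level (0.05 in this example) would be reported as changed.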

  15. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    PubMed Central

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution (refs. 1, 2). Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes (refs. 3, 4). The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  16. Exome sequencing identified FGF12 as a novel candidate gene for Kashin-Beck disease.

    PubMed

    Zhang, Feng; Dai, Lanlan; Lin, Weimin; Wang, Wenyu; Liu, Xuanzhu; Zhang, Jianguo; Yang, Tielin; Liu, Xiaogang; Shen, Hui; Chen, Xiangding; Tan, Lijun; Tian, Qing; Deng, Hong-Wen; Xu, Xun; Guo, Xiong

    2016-01-01

    The objective of this study was to identify novel causal genes involved in the pathogenesis of Kashin-Beck disease (KBD). A representative grade III KBD sib pair with serious skeletal growth and development failure was subjected to exome sequencing using the Illumina Hiseq2000 platform. The detected gene mutations were then filtered against data from the 1000 Genomes Project, the dbSNP database, and the BGI in-house database, and replicated by a genome-wide association study (GWAS) of KBD. Ninety grade II or III KBD patients with extreme KBD phenotypes and 1627 healthy controls were enrolled in the GWAS. Affymetrix Genome-Wide Human SNP Array 6.0 was applied for genotyping. PLINK software was used for association analysis. We identified a novel 106T>C mutation in the 3'UTR of the FGF12 gene, which has not been reported previously. Sequence alignment showed high conservation at the mutated 3'UTR+106T>C locus across various vertebrates. In the GWAS of KBD, we detected nine SNPs of the FGF12 gene showing association evidence (P value < 0.05) with KBD. The most significant association signal was observed at rs1847340 (P value = 1.90 × 10^-5). This study suggests that FGF12 is a susceptibility gene for KBD. Our results provide novel clues for revealing the pathogenesis of KBD and the biological function of FGF12. PMID:26290467

  17. Next-generation sequencing identifies the natural killer cell microRNA transcriptome

    PubMed Central

    Fehniger, Todd A.; Wylie, Todd; Germino, Elizabeth; Leong, Jeffrey W.; Magrini, Vincent J.; Koul, Sunita; Keppel, Catherine R.; Schneider, Stephanie E.; Koboldt, Daniel C.; Sullivan, Ryan P.; Heinz, Michael E.; Crosby, Seth D.; Nagarajan, Rakesh; Ramsingh, Giridharan; Link, Daniel C.; Ley, Timothy J.; Mardis, Elaine R.

    2010-01-01

    Natural killer (NK) cells are innate lymphocytes important for early host defense against infectious pathogens and surveillance against malignant transformation. Resting murine NK cells regulate the translation of effector molecule mRNAs (e.g., granzyme B, GzmB) through unclear molecular mechanisms. MicroRNAs (miRNAs) are small noncoding RNAs that post-transcriptionally regulate the translation of their mRNA targets, and are therefore candidates for mediating this control process. While the expression and importance of miRNAs in T and B lymphocytes have been established, little is known about miRNAs in NK cells. Here, we used two next-generation sequencing (NGS) platforms to define the miRNA transcriptomes of resting and cytokine-activated primary murine NK cells, with confirmation by quantitative real-time PCR (qRT-PCR) and microarrays. We delineate a bioinformatics analysis pipeline that identified 302 known and 21 novel mature miRNAs from sequences obtained from NK cell small RNA libraries. These miRNAs are expressed over a broad range and exhibit isomiR complexity, and a subset is differentially expressed following cytokine activation. Using these miRNA NGS data, miR-223 was identified as a mature miRNA present in resting NK cells with decreased expression following cytokine activation. Furthermore, we demonstrate that miR-223 specifically targets the 3′ untranslated region of murine GzmB in vitro, indicating that this miRNA may contribute to control of GzmB translation in resting NK cells. Thus, the sequenced NK cell miRNA transcriptome provides a valuable framework for further elucidation of miRNA expression and function in NK cell biology. PMID:20935160

  18. Novel pathogenic variants and genes for myopathies identified by whole exome sequencing

    PubMed Central

    Hunter, Jesse M; Ahearn, Mary Ellen; Balak, Christopher D; Liang, Winnie S; Kurdoglu, Ahmet; Corneveaux, Jason J; Russell, Megan; Huentelman, Matthew J; Craig, David W; Carpten, John; Coons, Stephen W; DeMello, Daphne E; Hall, Judith G; Bernes, Saunder M; Baumbach-Reardon, Lisa

    2015-01-01

    Neuromuscular diseases (NMD) account for a significant proportion of infant and childhood mortality and devastating chronic disease. Determining the specific diagnosis of NMD is challenging due to thousands of unique or rare genetic variants that result in overlapping phenotypes. We present four unique childhood myopathy cases characterized by relatively mild muscle weakness, slowly progressing course, mildly elevated creatine phosphokinase (CPK), and contractures. We also present two additional cases characterized by severe prenatal/neonatal myopathy. Prior extensive genetic testing and histology of these cases did not reveal the genetic etiology of disease. Here, we applied whole exome sequencing (WES) and bioinformatics to identify likely causal pathogenic variants in each pedigree. In two cases, we identified novel pathogenic variants in COL6A3. In a third case, we identified novel likely pathogenic variants in COL6A6 and COL6A3. We identified a novel splice variant in EMD in a fourth case. Finally, we classify two cases as calcium channelopathies with identification of novel pathogenic variants in RYR1 and CACNA1S. These are the first cases of myopathies reported to be caused by variants in COL6A6 and CACNA1S. Our results demonstrate the utility and genetic diagnostic value of WES in the broad class of NMD phenotypes. PMID:26247046

  19. Novel pathogenic variants and genes for myopathies identified by whole exome sequencing.

    PubMed

    Hunter, Jesse M; Ahearn, Mary Ellen; Balak, Christopher D; Liang, Winnie S; Kurdoglu, Ahmet; Corneveaux, Jason J; Russell, Megan; Huentelman, Matthew J; Craig, David W; Carpten, John; Coons, Stephen W; DeMello, Daphne E; Hall, Judith G; Bernes, Saunder M; Baumbach-Reardon, Lisa

    2015-07-01

    Neuromuscular diseases (NMD) account for a significant proportion of infant and childhood mortality and devastating chronic disease. Determining the specific diagnosis of NMD is challenging due to thousands of unique or rare genetic variants that result in overlapping phenotypes. We present four unique childhood myopathy cases characterized by relatively mild muscle weakness, slowly progressing course, mildly elevated creatine phosphokinase (CPK), and contractures. We also present two additional cases characterized by severe prenatal/neonatal myopathy. Prior extensive genetic testing and histology of these cases did not reveal the genetic etiology of disease. Here, we applied whole exome sequencing (WES) and bioinformatics to identify likely causal pathogenic variants in each pedigree. In two cases, we identified novel pathogenic variants in COL6A3. In a third case, we identified novel likely pathogenic variants in COL6A6 and COL6A3. We identified a novel splice variant in EMD in a fourth case. Finally, we classify two cases as calcium channelopathies with identification of novel pathogenic variants in RYR1 and CACNA1S. These are the first cases of myopathies reported to be caused by variants in COL6A6 and CACNA1S. Our results demonstrate the utility and genetic diagnostic value of WES in the broad class of NMD phenotypes. PMID:26247046

  20. A novel method to identify nucleic acid binding sites in proteins by scanning mutagenesis: application to iron regulatory protein.

    PubMed Central

    Neupert, B; Menotti, E; Kühn, L C

    1995-01-01

    We describe a new procedure to identify RNA or DNA binding sites in proteins, based on a combination of UV cross-linking and single-hit chemical peptide cleavage. Site-directed mutagenesis is used to create a series of mutants with single Asn-Gly sequences in the protein to be analysed. Recombinant mutant proteins are incubated with their radiolabelled target sequence and UV irradiated. Covalently linked RNA- or DNA-protein complexes are digested with hydroxylamine and labelled peptides identified by SDS-PAGE and autoradiography. The analysis requires only small amounts of protein and is achieved within a relatively short time. Using this method we mapped the site at which human iron regulatory protein (IRP) is UV cross-linked to iron responsive element RNA to amino acid residues 116-151. PMID:7544459

  1. A sex-associated sequence identified by RAPD screening in gynogenetic individuals of turbot (Scophthalmus maximus).

    PubMed

    Vale, Luis; Dieguez, Rebeca; Sánchez, Laura; Martínez, Paulino; Viñas, Ana

    2014-03-01

    Understanding the genetic basis of sex determination mechanisms is essential for improving the productivity of farmed aquaculture fish species like turbot (Scophthalmus maximus). In culture conditions turbot males grow slower than females starting from eight months post-hatch, and this differential growth rate is maintained until sexual maturation is reached, with mature females being almost twice as big as males of the same age. The goal of this study was to identify sex-specific DNA markers in turbot using comparative random amplified polymorphic DNA (RAPD) profiles in males and females to gain new insights into the genetic architecture related to sex determination. To do this, we analyzed 540 commercial 10-mer RAPD primers in male and female pools of a gynogenetic family, chosen because its higher inbreeding facilitates the detection of associations across the genome. Two sex-linked RAPD markers were identified in the female pool and one in the male pool. After the analysis of the three markers on individual samples of each pool and also in unrelated individuals, only one RAPD showed significant association with females. This marker was isolated, cloned and sequenced; it contained two sequences, a microsatellite (SEX01) and a minisatellite (SEX02), which were mapped in the turbot reference map. From this map position, through a comparative mapping approach, we identified Foxl2, a relevant gene related to initial steps of sex differentiation, and Wnt4, a gene related to ovarian development, close to the microsatellite and minisatellite markers, respectively. The position of Foxl2 and Wnt4 was confirmed by linkage mapping in the reference turbot map. PMID:24415295

  2. Exome Sequencing Identifies Three Novel Candidate Genes Implicated in Intellectual Disability

    PubMed Central

    Azam, Maleeha; Ayub, Humaira; Vissers, Lisenka E. L. M.; Gilissen, Christian; Ali, Syeda Hafiza Benish; Riaz, Moeen; Veltman, Joris A.; Pfundt, Rolph; van Bokhoven, Hans; Qamar, Raheel

    2014-01-01

    Intellectual disability (ID) is a major health problem, mostly of unknown etiology. Recently, exome sequencing of individuals with ID has identified novel genes implicated in the disease. The purpose of the present study was therefore to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. The whole exome of three ID probands was sequenced. Missense variations were identified in two plausible novel genes implicated in autosomal recessive ID, lysine (K)-specific methyltransferase 2B (KMT2B) and zinc finger protein 589 (ZNF589), as well as a de novo mutation in hedgehog acyltransferase (HHAT) with an autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of a recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and one dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that a large number of ID genes converge on a limited number of common networks: ZNF589 belongs to the KRAB-domain zinc-finger proteins previously implicated in ID; HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID; and KMT2B, associated with syndromic ID, fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes with ID in three different Pakistani families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm their involvement in the pathogenesis of ID. PMID:25405613

  3. Linkage study and exome sequencing identify a BDP1 mutation associated with hereditary hearing loss.

    PubMed

    Girotto, Giorgia; Abdulhadi, Khalid; Buniello, Annalisa; Vozzi, Diego; Licastro, Danilo; d'Eustacchio, Angela; Vuckovic, Dragana; Alkowari, Moza Khalifa; Steel, Karen P; Badii, Ramin; Gasparini, Paolo

    2013-01-01

    Nonsyndromic Hereditary Hearing Loss is a common disorder accounting for at least 60% of prelingual deafness. GJB2 gene mutations, GJB6 deletion, and the A1555G mitochondrial mutation play a major role worldwide in causing deafness, but there is a high degree of genetic heterogeneity and many genes involved in deafness have not yet been identified. Therefore, there remains a need to search for new causative mutations. In this study, a combined strategy using both linkage analysis and sequencing identified a new mutation causing hearing loss. Linkage analysis identified a region of 40 Mb on chromosome 5q13 (LOD score 3.8) for which exome sequencing data revealed a mutation (c.7873 T>G leading to p.*2625Gluext*11) in the BDP1 gene (B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB) in patients from a consanguineous Qatari family of second degree, showing bilateral, post-lingual, sensorineural moderate to severe hearing impairment. The mutation disrupts the termination codon of the transcript resulting in an elongation of 11 residues of the BDP1 protein. This elongation does not contain any known motif and is not conserved across species. Immunohistochemistry studies carried out in the mouse inner ear showed Bdp1 expression within the endothelial cells in the stria vascularis, as well as in mesenchyme-derived cells surrounding the cochlear duct. The identification of the BDP1 mutation increases our knowledge of the molecular bases of Nonsyndromic Hereditary Hearing Loss and provides new opportunities for the diagnosis and treatment of this disease in the Qatari population. PMID:24312468
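
    The variant notation in this abstract (c.7873 T>G leading to p.*2625Gluext*11) can be checked with simple codon arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the abstract does not say which stop codon BDP1 actually uses, so the code merely lists which stop codons would be consistent with a T>G change producing glutamate.

```python
# Standard stop codons and the two glutamate codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLU_CODONS = {"GAA", "GAG"}


def codon_of(cdna_pos):
    """Map a 1-based coding-DNA position to (codon number, 0-based offset in codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3


def stop_codons_giving_glu(offset, ref="T", alt="G"):
    """Return (stop codon, mutated codon) pairs where ref>alt at `offset` yields Glu."""
    hits = []
    for codon in sorted(STOP_CODONS):
        if codon[offset] != ref:
            continue  # this stop codon does not carry the reference base here
        mutated = codon[:offset] + alt + codon[offset + 1:]
        if mutated in GLU_CODONS:
            hits.append((codon, mutated))
    return hits


codon_number, offset = codon_of(7873)
print(codon_number, offset)            # 2625 0
print(stop_codons_giving_glu(offset))  # [('TAA', 'GAA'), ('TAG', 'GAG')]
```

    Position 7873 is the first base of codon 2625, which agrees with the protein-level description; a T>G change there turns TAA or TAG (but not TGA) into a glutamate codon, after which translation would continue until the next in-frame stop.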

  4. Linkage Study and Exome Sequencing Identify a BDP1 Mutation Associated with Hereditary Hearing Loss

    PubMed Central

    Girotto, Giorgia; Abdulhadi, Khalid; Buniello, Annalisa; Vozzi, Diego; Licastro, Danilo; d'Eustacchio, Angela; Vuckovic, Dragana; Alkowari, Moza Khalifa; Steel, Karen P.; Badii, Ramin; Gasparini, Paolo

    2013-01-01

    Nonsyndromic Hereditary Hearing Loss is a common disorder accounting for at least 60% of prelingual deafness. GJB2 gene mutations, GJB6 deletion, and the A1555G mitochondrial mutation play a major role worldwide in causing deafness, but there is a high degree of genetic heterogeneity and many genes involved in deafness have not yet been identified. Therefore, there remains a need to search for new causative mutations. In this study, a combined strategy using both linkage analysis and sequencing identified a new mutation causing hearing loss. Linkage analysis identified a region of 40 Mb on chromosome 5q13 (LOD score 3.8) for which exome sequencing data revealed a mutation (c.7873 T>G leading to p.*2625Gluext*11) in the BDP1 gene (B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB) in patients from a consanguineous Qatari family of second degree, showing bilateral, post-lingual, sensorineural moderate to severe hearing impairment. The mutation disrupts the termination codon of the transcript resulting in an elongation of 11 residues of the BDP1 protein. This elongation does not contain any known motif and is not conserved across species. Immunohistochemistry studies carried out in the mouse inner ear showed Bdp1 expression within the endothelial cells in the stria vascularis, as well as in mesenchyme-derived cells surrounding the cochlear duct. The identification of the BDP1 mutation increases our knowledge of the molecular bases of Nonsyndromic Hereditary Hearing Loss and provides new opportunities for the diagnosis and treatment of this disease in the Qatari population. PMID:24312468

  5. Sequencing of SCN5A identifies rare and common variants associated with cardiac conduction

    PubMed Central

    Magnani, Jared W.; Brody, Jennifer A.; Prins, Bram P.; Arking, Dan E.; Lin, Honghuang; Yin, Xiaoyan; Liu, Ching-Ti; Morrison, Alanna C.; Zhang, Feng; Spector, Tim D.; Alonso, Alvaro; Bis, Joshua C.; Heckbert, Susan R.; Lumley, Thomas; Sitlani, Colleen M.; Cupples, L. Adrienne; Lubitz, Steven A.; Soliman, Elsayed Z.; Pulit, Sara L.; Newton-Cheh, Christopher; O'Donnell, Christopher J.; Ellinor, Patrick T.; Benjamin, Emelia J.; Muzny, Donna M.; Gibbs, Richard A.; Santibanez, Jireh; Taylor, Herman A.; Rotter, Jerome I.; Lange, Leslie A.; Psaty, Bruce M.; Jackson, Rebecca; Rich, Stephen S.; Boerwinkle, Eric; Jamshidi, Yalda; Sotoodehnia, Nona

    2014-01-01

    Background: The cardiac sodium channel SCN5A regulates atrioventricular and ventricular conduction. Genetic variants in this gene are associated with PR and QRS intervals. We sought to further characterize the contribution of rare and common coding variation in SCN5A to cardiac conduction. Methods and Results: In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study (CHARGE), we performed targeted exonic sequencing of SCN5A (n=3699, European-ancestry individuals) and identified 4 common (minor allele frequency >1%) and 157 rare variants. Common and rare SCN5A coding variants were examined for association with PR and QRS intervals through meta-analysis of European-ancestry participants from CHARGE, NHLBI’s Exome Sequencing Project (ESP, n=607) and the UK10K (n=1275) and by examining ESP African-ancestry participants (n=972). Rare coding SCN5A variants in aggregate were associated with PR interval in European and African-ancestry participants (P=1.3×10^-3). Three common variants were associated with PR and/or QRS interval duration among European-ancestry participants and one among African-ancestry participants. These included two well-known missense variants: rs1805124 (H558R) was associated with PR and QRS shortening in European-ancestry participants (P=6.25×10^-4 and P=5.2×10^-3 respectively) and rs7626962 (S1102Y) was associated with PR shortening in those of African ancestry (P=2.82×10^-3). Among European-ancestry participants, two novel synonymous variants, rs1805126 and rs6599230, were associated with cardiac conduction. Our top signal, rs1805126, was associated with PR and QRS lengthening (P=3.35×10^-7 and P=2.69×10^-4 respectively), and rs6599230 was associated with PR shortening (P=2.67×10^-5). Conclusions: By sequencing SCN5A, we identified novel common and rare coding variants associated with cardiac conduction. PMID:24951663

  6. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxyterminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. PMID:3754588

  7. Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

    PubMed Central

    Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; 
Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

    2016-01-01

    SUMMARY: The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants [1–8] and rare, population-specific coding variants [9]. Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de-novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD [8] (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2×10^-14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10^-11; n_cases = 98,742 and n_controls = 409,511). Using an En1^Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1×10^-11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of
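
    The sample sizes quoted in this abstract can be cross-checked by simple addition. The short sketch below only restates numbers given in the record; the dictionary keys are descriptive labels chosen here, not names used by the authors.

```python
# Per-design sample sizes for the BMD analysis, as quoted in the abstract.
bmd_cohorts = {
    "whole-genome sequencing (UK10K)": 2882,
    "whole-exome sequencing": 3549,
    "deep imputation": 26534,
    "de-novo replication genotyping": 20271,
}

# Fracture analysis: cases and controls, as quoted in the abstract.
fracture_cases = 98742
fracture_controls = 409511

bmd_total = sum(bmd_cohorts.values())
fracture_total = fracture_cases + fracture_controls

print(bmd_total)       # 53236
print(fracture_total)  # 508253
```

    Both sums match the totals stated in the abstract for BMD and fracture.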

  8. Whole-Genome Sequencing of Individuals from a Founder Population Identifies Candidate Genes for Asthma

    PubMed Central

    Campbell, Catarina D.; Mohajeri, Kiana; Malig, Maika; Hormozdiari, Fereydoun; Nelson, Benjamin; Du, Gaixin; Patterson, Kristen M.; Eng, Celeste; Torgerson, Dara G.; Hu, Donglei; Herman, Catherine; Chong, Jessica X.; Ko, Arthur; O'Roak, Brian J.; Krumm, Niklas; Vives, Laura; Lee, Choli; Roth, Lindsey A.; Rodriguez-Cintron, William; Rodriguez-Santana, Jose; Brigino-Buenaventura, Emerita; Davis, Adam; Meade, Kelley; LeNoir, Michael A.; Thyne, Shannon; Jackson, Daniel J.; Gern, James E.; Lemanske, Robert F.; Shendure, Jay; Abney, Mark; Burchard, Esteban G.; Ober, Carole; Eichler, Evan E.

    2014-01-01

    Asthma is a complex genetic disease caused by a combination of genetic and environmental risk factors. We sought to test classes of genetic variants largely missed by genome-wide association studies (GWAS), including copy number variants (CNVs) and low-frequency variants, by performing whole-genome sequencing (WGS) on 16 individuals from asthma-enriched and asthma-depleted families. The samples were obtained from an extended 13-generation Hutterite pedigree with reduced genetic heterogeneity due to a small founding gene pool and reduced environmental heterogeneity as a result of a communal lifestyle. We sequenced each individual to an average depth of 13-fold, generated a comprehensive catalog of genetic variants, and tested the most severe mutations for association with asthma. We identified and validated 1960 CNVs, 19 nonsense or splice-site single nucleotide variants (SNVs), and 18 insertions or deletions that were out of frame. As follow-up, we performed targeted sequencing of 16 genes in 837 cases and 540 controls of Puerto Rican ancestry and found that controls carry a significantly higher burden of mutations in IL27RA (2.0% of controls; 0.23% of cases; nominal p = 0.004; Bonferroni p = 0.21). We also genotyped 593 CNVs in 1199 Hutterite individuals. We identified a nominally significant association (p = 0.03; Odds ratio (OR) = 3.13) between a 6 kbp deletion in an intron of NEDD4L and increased risk of asthma. We genotyped this deletion in an additional 4787 non-Hutterite individuals (nominal p = 0.056; OR = 1.69). NEDD4L is expressed in bronchial epithelial cells, and conditional knockout of this gene in the lung in mice leads to severe inflammation and mucus accumulation. Our study represents one of the early instances of applying WGS to complex disease with a large environmental component and demonstrates how WGS can identify risk variants, including CNVs and low-frequency variants, largely untested in GWAS. PMID:25116239

  9. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 µg/kg) but without effect upon mice at 15,000 µg/kg (i.p. injection). The reduced, ³H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
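
    The sequence printed in this abstract can be checked against the residue positions and the approximate mass it mentions. The sketch below is an illustration under stated assumptions: it uses standard average residue masses, adds one water for the free termini, and ignores disulfide bonds and any modifications; the helper names are hypothetical.

```python
# Sh-NI sequence exactly as printed in the abstract.
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Standard average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02  # added once for the free N- and C-termini


def average_mass(seq):
    """Approximate average mass of a linear, unmodified peptide."""
    return sum(AVG_RESIDUE_MASS[res] for res in seq) + WATER


# Conserved residues named in the abstract, as (residue, 1-based position).
conserved = [("G", 9), ("P", 10), ("R", 13), ("G", 19), ("G", 29), ("W", 30)]

print(len(SH_NI))                                         # 48
print(SH_NI.count("C"))                                   # 6
print(all(SH_NI[pos - 1] == res for res, pos in conserved))  # True
print(round(average_mass(SH_NI)))                         # roughly 5.1 kDa
```

    The printed sequence has 48 residues with six cysteines, the named positions match, and the estimated mass is in line with the 5000 dalton figure quoted for the toxin.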

  10. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  11. Exome sequencing of hepatocellular carcinomas identifies new mutational signatures and potential therapeutic targets

    PubMed Central

    Alexandrov, Ludmil B; Calderaro, Julien; Rebouissou, Sandra; Couchy, Gabrielle; Meiller, Clément; Shinde, Jayendra; Soysouvanh, Frederic; Calatayud, Anna-Line; Pinyol, Roser; Pelletier, Laura; Balabaud, Charles; Laurent, Alexis; Blanc, Jean-Frederic; Mazzaferro, Vincenzo; Calvo, Fabien; Villanueva, Augusto; Nault, Jean-Charles; Bioulac-Sage, Paulette; Stratton, Michael R; Llovet, Josep M; Zucman-Rossi, Jessica

    2015-01-01

    Genomic analyses promise to improve tumor characterization in order to optimize personalized treatment for patients with hepatocellular carcinoma (HCC). Exome sequencing analysis of 243 liver tumors revealed mutational signatures associated with specific risk factors, mainly combined alcohol/tobacco consumption, and aflatoxin B1. We identified 161 putative driver genes associated with 11 recurrent pathways. Associations of mutations defined 3 groups of genes related to risk factors and centered on CTNNB1 (alcohol), TP53 (HBV), and AXIN1. Analyses according to tumor stage progression revealed TERT promoter mutation as an early event whereas FGF3, FGF4, FGF19/CCND1 amplification, TP53 and CDKN2A alterations, appeared at more advanced stages in aggressive tumors. In 28% of the tumors we identified genetic alterations potentially targetable by FDA-approved drugs. In conclusion, we identified risk factor-specific mutational signatures and defined the extensive landscape of altered genes and pathways in HCC which will be useful to design clinical trials for targeted therapy. PMID:25822088

  12. Targeted next-generation sequencing of 22 mismatch repair genes identifies Lynch syndrome families.

    PubMed

    Talseth-Palmer, Bente A; Bauer, Denis C; Sjursen, Wenche; Evans, Tiffany J; McPhillips, Mary; Proietto, Anthony; Otton, Geoffrey; Spigelman, Allan D; Scott, Rodney J

    2016-05-01

    Causative germline mutations in mismatch repair (MMR) genes can only be identified in ~50% of families with a clinical diagnosis of the inherited colorectal cancer (CRC) syndrome hereditary nonpolyposis colorectal cancer (HNPCC)/Lynch syndrome (LS). Identification of these patients is critical as they are at substantially increased risk of developing multiple primary tumors, mainly colorectal and endometrial cancer (EC), occurring at a young age. This demonstrates the need to develop new and/or more thorough mutation detection approaches. Next-generation sequencing (NGS) was used to screen 22 genes involved in the DNA MMR pathway in constitutional DNA from 14 HNPCC and 12 sporadic EC patients, plus 2 positive controls. Several software packages were used for analysis and functional annotation. We identified 5 exonic indel variants, 42 exonic nonsynonymous single-nucleotide variants (SNVs) and 1 intronic variant of significance. Three of these variants were class 5 (pathogenic) or class 4 (likely pathogenic), 5 were class 3 (uncertain clinical relevance) and 40 were classified as variants of unknown clinical significance. In conclusion, we have identified two LS families from the sporadic EC patients, one without a family history of cancer, supporting the case for universal MMR screening of EC patients. In addition, we have detected three novel class 3 variants in EC cases. We have also discovered a polygenic interaction which is the most likely cause of cancer development in an HNPCC patient and could explain previous inconsistent results reported on an intronic EXO1 variant. PMID:26811195

  13. WHITE-DWARF-MAIN-SEQUENCE BINARIES IDENTIFIED FROM THE LAMOST PILOT SURVEY

    SciTech Connect

    Ren Juanjuan; Luo Ali; Li Yinbi; Wei Peng; Zhao Jingkun; Zhao Yongheng; Song Yihan; Zhao Gang

    2013-10-01

    We present a set of white-dwarf-main-sequence (WDMS) binaries identified spectroscopically from the Large sky Area Multi-Object fiber Spectroscopic Telescope (LAMOST, also called the Guo Shou Jing Telescope) pilot survey. We develop color selection criteria based on what is so far the largest and most complete Sloan Digital Sky Survey (SDSS) DR7 WDMS binary catalog and identify 28 WDMS binaries within the LAMOST pilot survey. The primaries in our binary sample are mostly DA white dwarfs except for one DB white dwarf. We derive the stellar atmospheric parameters, masses, and radii for the two components of 10 of our binaries. We also provide cooling ages for the white dwarf primaries as well as the spectral types for the companion stars of these 10 WDMS binaries. These binaries tend to contain hot white dwarfs and early-type companions. Through cross-identification, we note that nine binaries in our sample have been published in the SDSS DR7 WDMS binary catalog. Nineteen spectroscopic WDMS binaries identified by the LAMOST pilot survey are new. Using the 3σ radial velocity variation as a criterion, we find two post-common-envelope binary candidates from our WDMS binary sample.

  14. Huntington's disease biomarker progression profile identified by transcriptome sequencing in peripheral blood.

    PubMed

    Mastrokolias, Anastasios; Ariyurek, Yavuz; Goeman, Jelle J; van Duijn, Erik; Roos, Raymund A C; van der Mast, Roos C; van Ommen, GertJan B; den Dunnen, Johan T; 't Hoen, Peter A C; van Roon-Mom, Willeke M C

    2015-10-01

    With several therapeutic approaches in development for Huntington's disease, there is a need for easily accessible biomarkers to monitor disease progression and therapy response. We performed next-generation sequencing-based transcriptome analysis of total RNA from peripheral blood of 91 mutation carriers (27 presymptomatic and 64 symptomatic) and 33 controls. Transcriptome analysis by DeepSAGE identified 167 genes significantly associated with clinical total motor score in Huntington's disease patients. Relative to previous studies, this yielded novel genes and confirmed previously identified genes, such as H2AFY, an overlap in results that has proven difficult to achieve in the past. Pathway analysis showed enrichment of genes of the immune system and target genes of miRNAs, which are downregulated in Huntington's disease models. Using a highly parallelized microfluidics array chip (Fluidigm), we validated 12 of the top 20 significant genes in our discovery cohort and 7 in a second independent cohort. The five genes (PROK2, ZNF238, AQP9, CYSTM1 and ANXA3) that were validated independently in both cohorts present a candidate biomarker panel for stage determination and therapeutic readout in Huntington's disease. Finally, we suggest a first empirical formula predicting total motor score from the expression levels of our biomarker panel. Our data support the view that peripheral blood is a useful source to identify biomarkers for Huntington's disease and monitor disease progression in future clinical trials. PMID:25626709

  15. Wildlife sequences of islet amyloid polypeptide (IAPP) identify critical species variants for fibrillization.

    PubMed

    Fortin, Jessica S; Benoit-Biancamano, Marie-Odile

    2015-01-01

    Amyloid can be detected in the islets of Langerhans in a majority of type 2 diabetic patients. These deposits have been associated with β-cell death, thereby furthering diabetes progression. Islet amyloid polypeptide (IAPP) amyloidogenicity is quite variable among animal species, and studying this variability could further our understanding of the mechanisms involved in the aggregation process. Thus, the general aim of this study was to identify IAPP isoforms in different animal species and characterize their propensity to form fibrillar aggregates. A library of 23 peptides (fragment 8-32) was designed to study amyloid formation using in silico analysis and in vitro assays. Amyloid formation was impeded when the NFLVH motif found in segment 8-20 was substituted by DFLGR or KFLIR segments. The 29P, 14K and 18R substitutions were often present in non-amyloidogenic sequences. Non-amyloidogenic sequences were obtained from Leontopithecus rosalia, Tursiops truncatus and Vicugna pacos. Fragment peptides from 34 species were amyloidogenic. To conclude, this project advances our knowledge of the comparative pathogenesis of amyloidosis in type 2 diabetes. It is conceivable that the additional information gained may help point towards new therapeutic strategies for diabetes patients. PMID:26300107

  16. Whole-exome sequencing identifies rare, functional CFH variants in families with macular degeneration

    PubMed Central

    Yu, Yi; Triebwasser, Michael P.; Wong, Edwin K. S.; Schramm, Elizabeth C.; Thomas, Brett; Reynolds, Robyn; Mardis, Elaine R.; Atkinson, John P.; Daly, Mark; Raychaudhuri, Soumya; Kavanagh, David; Seddon, Johanna M.

    2014-01-01

    We sequenced the whole exome of 35 cases and 7 controls from 9 age-related macular degeneration (AMD) families in whom known common genetic risk alleles could not explain their high disease burden and/or their early-onset advanced disease. Two families harbored novel rare mutations in CFH (R53C and D90G). R53C segregates perfectly with AMD in 11 cases (heterozygous) and 1 elderly control (reference allele) (LOD = 5.07, P = 6.7 × 10^-7). In an independent cohort, 4 out of 1676 cases but none of the 745 examined controls or 4300 NHLBI Exome Sequencing Project (ESP) samples carried the R53C mutation (P = 0.0039). In another family of six siblings, D90G similarly segregated with AMD in five cases and one control (LOD = 1.22, P = 0.009). No other sample in our large cohort or the ESP had this mutation. Functional studies demonstrated that R53C decreased the ability of FH to perform decay accelerating activity. D90G exhibited a decrease in cofactor-mediated inactivation. Both of these changes would lead to a loss of regulatory activity, resulting in excessive alternative pathway activation. This study represents an initial application of the whole-exome strategy to families with early-onset AMD. It successfully identified high impact alleles leading to clearer functional insight into AMD etiopathogenesis. PMID:24847005
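
    The P value reported for the independent cohort (4 carriers among 1676 cases, none among the controls and ESP samples) is the kind of figure a Fisher's exact test on a 2×2 table produces. The abstract does not name the test, so the sketch below is an assumption: it pools the 745 controls with the 4300 ESP samples into one non-case group and computes the exact hypergeometric probability with the standard library only. The function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, carriers, cases, total):
    """P(exactly k of `carriers` mutation carriers are cases) under random assignment."""
    return comb(cases, k) * comb(total - cases, carriers - k) / comb(total, carriers)


def fisher_two_sided(k_obs, carriers, cases, total):
    """Two-sided Fisher's exact P: sum of all outcomes no more likely than the observed one."""
    p_obs = hypergeom_pmf(k_obs, carriers, cases, total)
    k_min = max(0, carriers - (total - cases))
    k_max = min(carriers, cases)
    p_value = 0.0
    for k in range(k_min, k_max + 1):
        p_k = hypergeom_pmf(k, carriers, cases, total)
        if p_k <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
            p_value += p_k
    return p_value


cases = 1676
non_cases = 745 + 4300        # assumption: controls and ESP samples pooled
carriers_in_cases = 4
carriers_total = 4            # no carriers among the non-cases

p = fisher_two_sided(carriers_in_cases, carriers_total, cases, cases + non_cases)
print(f"{p:.4f}")             # close to the reported P = 0.0039
```

    Under these assumptions the exact test gives a value that rounds to the P = 0.0039 quoted in the abstract, which suggests, but does not prove, that the authors used a comparable calculation.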

  17. Whole-exome sequencing identifies rare, functional CFH variants in families with macular degeneration.

    PubMed

    Yu, Yi; Triebwasser, Michael P; Wong, Edwin K S; Schramm, Elizabeth C; Thomas, Brett; Reynolds, Robyn; Mardis, Elaine R; Atkinson, John P; Daly, Mark; Raychaudhuri, Soumya; Kavanagh, David; Seddon, Johanna M

    2014-10-01

    We sequenced the whole exome of 35 cases and 7 controls from 9 age-related macular degeneration (AMD) families in whom known common genetic risk alleles could not explain their high disease burden and/or their early-onset advanced disease. Two families harbored novel rare mutations in CFH (R53C and D90G). R53C segregates perfectly with AMD in 11 cases (heterozygous) and 1 elderly control (reference allele) (LOD = 5.07, P = 6.7 × 10^-7). In an independent cohort, 4 out of 1676 cases but none of the 745 examined controls or 4300 NHLBI Exome Sequencing Project (ESP) samples carried the R53C mutation (P = 0.0039). In another family of six siblings, D90G similarly segregated with AMD in five cases and one control (LOD = 1.22, P = 0.009). No other sample in our large cohort or the ESP had this mutation. Functional studies demonstrated that R53C decreased the ability of FH to perform decay accelerating activity. D90G exhibited a decrease in cofactor-mediated inactivation. Both of these changes would lead to a loss of regulatory activity, resulting in excessive alternative pathway activation. This study represents an initial application of the whole-exome strategy to families with early-onset AMD. It successfully identified high impact alleles leading to clearer functional insight into AMD etiopathogenesis. PMID:24847005

  18. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples

    PubMed Central

    Dong, Chun-nan; Yang, Ya-dong; Li, Shu-jin; Yang, Ya-ran; Zhang, Xiao-jing; Fang, Xiang-dong; Yan, Jiang-wei; Cong, Bin

    2016-01-01

    In the case of mass disasters, missing persons and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these “nucleosome protected STRs” (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  19. Whole genome nucleosome sequencing identifies novel types of forensic markers in degraded DNA samples.

    PubMed

    Dong, Chun-Nan; Yang, Ya-Dong; Li, Shu-Jin; Yang, Ya-Ran; Zhang, Xiao-Jing; Fang, Xiang-Dong; Yan, Jiang-Wei; Cong, Bin

    2016-01-01

    In the case of mass disasters, missing persons and forensic casework, highly degraded biological samples are often encountered. It can be a challenge to analyze and interpret the DNA profiles from these samples. Here we provide a new strategy to solve the problem by taking advantage of the intrinsic structural properties of DNA. We have assessed the in vivo positions of more than 35 million putative nucleosome cores in human leukocytes using high-throughput whole genome sequencing, and identified 2,462 single nucleotide variations (SNVs) and 128 insertion-deletion polymorphisms (indels). After comparing the sequence reads with 44 STR loci commonly used in forensics, five STRs (TH01, TPOX, D18S51, DYS391, and D10S1248) were matched. We compared these "nucleosome protected STRs" (NPSTRs) with five other non-NPSTRs using mini-STR primer design, real-time PCR, and capillary gel electrophoresis on artificially degraded DNA. Moreover, genotyping performance of the five NPSTRs and five non-NPSTRs was also tested with real casework samples. All results show that loci located in nucleosomes are more likely to be successfully genotyped in degraded samples. In conclusion, after further strict validation, these markers could be incorporated into future forensic and paleontology identification kits, resulting in higher discriminatory power for certain degraded sample types. PMID:27189082

  20. Lactobacillus casei, Lactobacillus rhamnosus, and Lactobacillus zeae isolates identified by sequence signature and immunoblot phenotype.

    PubMed

    Dobson, C Melissa; Chaban, Bonnie; Deneer, Harry; Ziola, Barry

    2004-07-01

    Species taxonomy within the Lactobacillus casei group of bacteria has been unsettled. With the goal of helping clarify the taxonomy of these bacteria, we investigated the first 3 variable regions of the 16S rRNA gene, the 16S-23S rRNA interspacer region, and one third of the chaperonin 60 gene for Lactobacillus isolates originally designated as L. casei, L. paracasei, L. rhamnosus, and L. zeae. For each genetic region, a phylogenetic tree was created and signature sequence analysis was done. As well, phenotypic analysis of the various strains was performed by immunoblotting. Both sequence signature analysis and immunoblotting gave immediate identification of L. casei, L. rhamnosus, and L. zeae isolates. These results corroborate and extend previous findings concerning these lactobacilli; therefore, we strongly endorse recent proposals for revised nomenclature. Specifically, isolate ATCC 393 is appropriately rejected as the L. casei type strain because of grouping with isolates identified as L. zeae. As well, because all other L. casei isolates, including the proposed neotype isolate ATCC 334, grouped together with isolates designated L. paracasei, we support the use of the single species L. casei and rejection of the name L. paracasei. PMID:15381972

  1. Exome sequencing identifies somatic mutations of DDX3X in natural killer/T-cell lymphoma.

    PubMed

    Jiang, Lu; Gu, Zhao-Hui; Yan, Zi-Xun; Zhao, Xia; Xie, Yin-Yin; Zhang, Zi-Guan; Pan, Chun-Ming; Hu, Yuan; Cai, Chang-Ping; Dong, Ying; Huang, Jin-Yan; Wang, Li; Shen, Yang; Meng, Guoyu; Zhou, Jian-Feng; Hu, Jian-Da; Wang, Jin-Fen; Liu, Yuan-Hua; Yang, Lin-Hua; Zhang, Feng; Wang, Jian-Min; Wang, Zhao; Peng, Zhi-Gang; Chen, Fang-Yuan; Sun, Zi-Min; Ding, Hao; Shi, Ju-Mei; Hou, Jian; Yan, Jin-Song; Shi, Jing-Yi; Xu, Lan; Li, Yang; Lu, Jing; Zheng, Zhong; Xue, Wen; Zhao, Wei-Li; Chen, Zhu; Chen, Sai-Juan

    2015-09-01

    Natural killer/T-cell lymphoma (NKTCL) is a malignant proliferation of CD56(+) and cytoCD3(+) lymphocytes with aggressive clinical course, which is prevalent in Asian and South American populations. The molecular pathogenesis of NKTCL has largely remained elusive. We identified somatic gene mutations in 25 people with NKTCL by whole-exome sequencing and confirmed them in an extended validation group of 80 people by targeted sequencing. Recurrent mutations were most frequently located in the RNA helicase gene DDX3X (21/105 subjects, 20.0%), tumor suppressors (TP53 and MGA), JAK-STAT-pathway molecules (STAT3 and STAT5B) and epigenetic modifiers (MLL2, ARID1A, EP300 and ASXL3). As compared to wild-type protein, DDX3X mutants exhibited decreased RNA-unwinding activity, loss of suppressive effects on cell-cycle progression in NK cells and transcriptional activation of NF-κB and MAPK pathways. Clinically, patients with DDX3X mutations presented a poor prognosis. Our work thus contributes to the understanding of the disease mechanism of NKTCL. PMID:26192917

  2. A computational approach to identify genes for functional RNAs in genomic sequences

    PubMed Central

    Carter, Richard J.; Dubchak, Inna; Holbrook, Stephen R.

    2001-01-01

    Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80–90% accurate in jackknife testing experiments for bacteria and 90–99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions is available via the web. PMID:11574674

  3. iACP: a sequence-based tool for identifying anticancer peptides

    PubMed Central

    Chen, Wei; Ding, Hui; Feng, Pengmian; Lin, Hao; Chou, Kuo-Chen

    2016-01-01

    Cancer remains a major killer worldwide. Traditional methods of cancer treatment are expensive and have some deleterious side effects on normal cells. Fortunately, the discovery of anticancer peptides (ACPs) has paved a new way for cancer treatment. With the explosive growth of peptide sequences generated in the post genomic age, it is highly desired to develop computational methods for rapidly and effectively identifying ACPs, so as to speed up their application in treating cancer. Here we report a sequence-based predictor called iACP developed by the approach of optimizing the g-gap dipeptide components. It was demonstrated by rigorous cross-validations that the new predictor remarkably outperformed the existing predictors for the same purpose in both overall accuracy and stability. For the convenience of most experimental scientists, a publicly accessible web-server for iACP has been established at http://lin.uestc.edu.cn/server/iACP, by which users can easily obtain their desired results. PMID:26942877
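
    The abstract names g-gap dipeptide components as the feature behind iACP but does not define them. The sketch below assumes the common definition: for a gap g, count every pair of residues separated by g intervening positions and normalize by the number of pairs, which gives a 400-dimensional vector. The function name and the normalization are illustrative assumptions, not the authors' code.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def g_gap_dipeptide_composition(peptide, g=0):
    """Return the 400 g-gap dipeptide frequencies for one peptide.

    g = 0 gives ordinary adjacent dipeptides; g = 1 pairs residue i
    with residue i + 2, and so on. (Assumed definition, see lead-in.)
    """
    peptide = peptide.upper()
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    total = 0
    for i in range(len(peptide) - g - 1):
        pair = peptide[i] + peptide[i + g + 1]
        if pair in counts:  # ignore non-standard residue codes
            counts[pair] += 1
            total += 1
    if total == 0:  # peptide too short for this gap
        return {pair: 0.0 for pair in pairs}
    return {pair: n / total for pair, n in counts.items()}
```

    A vector like this would typically be fed to a classifier; which gap value and which classifier iACP uses is not stated in the abstract.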

  4. Novel method for PIK3CA mutation analysis: locked nucleic acid--PCR sequencing.

    PubMed

    Ang, Daphne; O'Gara, Rebecca; Schilling, Amy; Beadling, Carol; Warrick, Andrea; Troxell, Megan L; Corless, Christopher L

    2013-05-01

    Somatic mutations in PIK3CA are commonly seen in invasive breast cancer and several other carcinomas, occurring in three hotspots: codons 542 and 545 of exon 9 and in codon 1047 of exon 20. We designed a locked nucleic acid (LNA)-PCR sequencing assay to detect low levels of mutant PIK3CA DNA with attention to avoiding amplification of a pseudogene on chromosome 22 that has >95% homology to exon 9 of PIK3CA. We tested 60 FFPE breast DNA samples with known PIK3CA mutation status (48 cases had one or more PIK3CA mutations, and 12 were wild type) as identified by PCR-mass spectrometry. PIK3CA exons 9 and 20 were amplified in the presence or absence of LNA-oligonucleotides designed to bind to the wild-type sequences for codons 542, 545, and 1047, and partially suppress their amplification. LNA-PCR sequencing confirmed all 51 PIK3CA mutations; however, the mutation detection rate by standard Sanger sequencing was only 69% (35 of 51). Of the 12 PIK3CA wild-type cases, LNA-PCR sequencing detected three additional H1047R mutations in "normal" breast tissue and one E545K in usual ductal hyperplasia. Histopathological review of these three normal breast specimens showed columnar cell change in two (both with known H1047R mutations) and apocrine metaplasia in one. The novel LNA-PCR shows higher sensitivity than standard Sanger sequencing and did not amplify the known pseudogene. PMID:23541593

  5. Deep sequencing reveals the complete genome and evidence for transcriptional activity of the first virus-like sequences identified in Aristotelia chilensis (Maqui Berry).

    PubMed

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F; Alzate, Juan F; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-04-01

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%-73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant. PMID:25855242

  6. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry)

    PubMed Central

    Villacreses, Javier; Rojas-Herrera, Marcelo; Sánchez, Carolina; Hewstone, Nicole; Undurraga, Soledad F.; Alzate, Juan F.; Manque, Patricio; Maracaja-Coutinho, Vinicius; Polanco, Victor

    2015-01-01

    Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1). High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs): ORFs 1 and 2 share 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV), Petuvirus genus. ORF1 encodes a movement protein (MP); ORF2 a Reverse Transcriptase (RT) and a Ribonuclease H (RNase H) domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs), AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq). Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggest that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant. PMID:25855242

  7. Coinfections of Zika and Chikungunya Viruses in Bahia, Brazil, Identified by Metagenomic Next-Generation Sequencing.

    PubMed

    Sardi, Silvia I; Somasekar, Sneha; Naccache, Samia N; Bandeira, Antonio C; Tauro, Laura B; Campos, Gubio S; Chiu, Charles Y

    2016-09-01

    Metagenomic next-generation sequencing (mNGS) of samples from 15 patients with documented Zika virus (ZIKV) infection in Bahia, Brazil, from April 2015 to January 2016 identified coinfections with chikungunya virus (CHIKV) in 2 of 15 ZIKV-positive cases by PCR (13.3%). While generally nonspecific, the clinical presentation corresponding to these two CHIKV/ZIKV coinfections reflected infection by the virus present at a higher titer. Aside from CHIKV and ZIKV, coinfections of other viral pathogens were not detected. The mNGS approach is promising for differential diagnosis of acute febrile illness and identification of coinfections, although targeted arbovirus screening may be sufficient in the current ZIKV outbreak setting. PMID:27413190

  8. Transcriptome Sequencing of Lima Bean (Phaseolus lunatus) to Identify Putative Positive Selection in Phaseolus and Legumes

    PubMed Central

    Li, Fengqi; Cao, Depan; Liu, Yang; Yang, Ting; Wang, Guirong

    2015-01-01

    The identification of genes under positive selection is a central goal of evolutionary biology. Many legume species, including Phaseolus vulgaris (common bean) and Phaseolus lunatus (lima bean), have important ecological and economic value. In this study, we sequenced and assembled the transcriptome of one Phaseolus species, lima bean. A comparison with the genomes of six other legume species, including the common bean, Medicago, lotus, soybean, chickpea, and pigeonpea, revealed 15 and 4 orthologous groups with signatures of positive selection among the two Phaseolus species and among the seven legume species, respectively. Characterization of these positively selected genes using Non redundant (nr) annotation, gene ontology (GO) classification, GO term enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses revealed that these genes are mostly involved in thylakoids, photosynthesis and metabolism. This study identified genes that may be related to the divergence of the Phaseolus and legume species. These detected genes are particularly good candidates for subsequent functional studies. PMID:26151849

  9. BMPER Mutation in Diaphanospondylodysostosis Identified by Ancestral Autozygosity Mapping and Targeted High-Throughput Sequencing

    PubMed Central

    Funari, Vincent A.; Krakow, Deborah; Nevarez, Lisette; Chen, Zugen; Funari, Tara L.; Vatanavicharn, Nithiwat; Wilcox, William R.; Rimoin, David L.; Nelson, Stanley F.; Cohn, Daniel H.

    2010-01-01

    Diaphanospondylodysostosis (DSD) is a rare, recessively inherited, perinatal lethal skeletal disorder. The low frequency and perinatal lethality of DSD makes assembling a large set of families for traditional linkage-based genetic approaches challenging. By searching for evidence of unknown ancestral consanguinity, we identified two autozygous intervals, comprising 34 Mbps, unique to a single case of DSD. Empirically testing for ancestral consanguinity was effective in localizing the causative variant, thereby reducing the genomic space within which the mutation resides. High-throughput sequence analysis of exons captured from these intervals demonstrated that the affected individual was homozygous for a null mutation in BMPER, which encodes the bone morphogenetic protein-binding endothelial cell precursor-derived regulator. Mutations in BMPER were subsequently found in three additional DSD cases, confirming that defects in BMPER produce DSD. Phenotypic similarities between DSD and Bmper null mice indicate that BMPER-mediated signaling plays an essential role in vertebral segmentation early in human development. PMID:20869035

  10. SSR_pipeline: a bioinformatic infrastructure for identifying microsatellites from paired-end Illumina high-throughput DNA sequencing data

    USGS Publications Warehouse

    Miller, Mark P.; Knaus, Brian J.; Mullins, Thomas D.; Haig, Susan M.

    2013-01-01

    SSR_pipeline is a flexible set of programs designed to efficiently identify simple sequence repeats (e.g., microsatellites) from paired-end high-throughput Illumina DNA sequencing data. The program suite contains 3 analysis modules along with a fourth control module that can automate analyses of large volumes of data. The modules are used to 1) identify the subset of paired-end sequences that pass Illumina quality standards, 2) align paired-end reads into a single composite DNA sequence, and 3) identify sequences that possess microsatellites (both simple and compound) conforming to user-specified parameters. The microsatellite search algorithm is extremely efficient, and we have used it to identify repeats with motifs from 2 to 25bp in length. Each of the 3 analysis modules can also be used independently to provide greater flexibility or to work with FASTQ or FASTA files generated from other sequencing platforms (Roche 454, Ion Torrent, etc.). We demonstrate use of the program with data from the brine fly Ephydra packardi (Diptera: Ephydridae) and provide empirical timing benchmarks to illustrate program performance on a common desktop computer environment. We further show that the Illumina platform is capable of identifying large numbers of microsatellites, even when using unenriched sample libraries and a very small percentage of the sequencing capacity from a single DNA sequencing run. All modules from SSR_pipeline are implemented in the Python programming language and can therefore be used from nearly any computer operating system (Linux, Macintosh, and Windows).
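
    The third module's task, finding sequences that contain microsatellites matching user-specified parameters, can be illustrated with a regular-expression search. This is a minimal sketch and not SSR_pipeline's own algorithm, which the abstract does not describe; the function name, parameter names, and defaults are hypothetical, and only perfect simple repeats are handled (no compound repeats).

```python
import re


def find_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    A lazy motif group followed by a back-reference picks the shortest
    motif that repeats at least `min_repeats` times in a row.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer runs such as AAAA are not reported
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Example: one dinucleotide repeat, (AC)5, starting at index 2.
print(find_microsatellites("TTACACACACACGG"))
```

    Raising `max_motif` toward the 25 bp motifs mentioned above works with the same pattern, although a backtracking regex is unlikely to match the efficiency the authors report for their search algorithm.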

  11. Comparison of phenotypic and molecular tests to identify lactic acid bacteria

    PubMed Central

    Moraes, Paula Mendonça; Perin, Luana Martins; Júnior, Abelardo Silva; Nero, Luís Augusto

    2013-01-01

    Twenty-nine lactic acid bacteria (LAB) isolates were submitted for identification using Biolog, API50CHL, 16S rDNA sequencing, and species-specific PCR reactions. The identification results were compared, and it was concluded that a polyphasic approach is necessary for proper LAB identification, with the molecular analyses being the most reliable. PMID:24159291

  12. Exome Sequencing Identifies Mitochondrial Alanyl-tRNA Synthetase Mutations in Infantile Mitochondrial Cardiomyopathy

    PubMed Central

    Götz, Alexandra; Tyynismaa, Henna; Euro, Liliya; Ellonen, Pekka; Hyötyläinen, Tuulia; Ojala, Tiina; Hämäläinen, Riikka H.; Tommiska, Johanna; Raivio, Taneli; Oresic, Matej; Karikoski, Riitta; Tammela, Outi; Simola, Kalle O.J.; Paetau, Anders; Tyni, Tiina; Suomalainen, Anu

    2011-01-01

    Infantile cardiomyopathies are devastating fatal disorders of the neonatal period or the first year of life. Mitochondrial dysfunction is a common cause of this group of diseases, but the underlying gene defects have been characterized in only a minority of cases, because tissue specificity of the manifestation hampers functional cloning and the heterogeneity of causative factors hinders collection of informative family materials. We sequenced the exome of a patient who died at the age of 10 months of hypertrophic mitochondrial cardiomyopathy with combined cardiac respiratory chain complex I and IV deficiency. Rigorous data analysis allowed us to identify a homozygous missense mutation in AARS2, which we showed to encode the mitochondrial alanyl-tRNA synthetase (mtAlaRS). Two siblings from another family, both of whom died perinatally of hypertrophic cardiomyopathy, had the same mutation, compound heterozygous with another missense mutation. Protein structure modeling of mtAlaRS suggested that one of the mutations affected a unique tRNA recognition site in the editing domain, leading to incorrect tRNA aminoacylation, whereas the second mutation severely disturbed the catalytic function, preventing tRNA aminoacylation. We show here that mutations in AARS2 cause perinatal or infantile cardiomyopathy with near-total combined mitochondrial respiratory chain deficiency in the heart. Our results indicate that exome sequencing is a powerful tool for identifying mutations in single patients and allows recognition of the genetic background in single-gene disorders of variable clinical manifestation and tissue-specific disease. Furthermore, we show that mitochondrial disorders extend to prenatal life and are an important cause of early infantile cardiac failure. PMID:21549344

  13. De Novo Transcriptome Sequencing of Oryza officinalis Wall ex Watt to Identify Disease-Resistance Genes

    PubMed Central

    He, Bin; Gu, Yinghong; Tao, Xiang; Cheng, Xiaojie; Wei, Changhe; Fu, Jian; Cheng, Zaiquan; Zhang, Yizheng

    2015-01-01

    Oryza officinalis Wall ex Watt is one of the most important wild relatives of cultivated rice and exhibits high resistance to many diseases. It has been used as a source of genes for introgression into cultivated rice. However, there are limited genomic resources and little genetic information publicly reported for this species. To better understand the pathways and factors involved in disease resistance and to accelerate the process of rice breeding, we carried out de novo transcriptome sequencing of O. officinalis. In this research, 137,229 contigs were obtained ranging from 200 to 19,214 bp with an N50 of 2331 bp through de novo assembly of leaves, stems and roots in O. officinalis using an Illumina HiSeq 2000 platform. Based on sequence similarity searches against a non-redundant protein database, a total of 88,249 contigs were annotated with gene descriptions and 75,589 transcripts were further assigned to GO terms. Candidate genes for plant–pathogen interaction and plant hormone regulation pathways involved in disease resistance were identified. Further analyses of gene expression profiles showed that the majority of genes related to disease resistance were expressed in all three tissues. In addition, there are two kinds of rice bacterial blight-resistance genes in O. officinalis, including two Xa1 genes and three Xa26 genes. Both Xa1 genes showed their highest expression level in stem, whereas one Xa26 gene was expressed predominantly in leaf and the other two Xa26 genes displayed low expression levels in all three tissues. This transcriptomic database provides an opportunity for identifying the genes involved in disease resistance and will provide a basis for studying functional genomics of O. officinalis and genetic improvement of cultivated rice in the future. PMID:26690414
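
    The N50 quoted above is a standard assembly statistic: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length. A minimal sketch of the calculation follows; the function name is illustrative and the lengths in the example are made up, not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


# Toy example (invented lengths): total is 32, and the two 8 bp
# contigs together reach the halfway mark of 16.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```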

  14. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product.

    PubMed

    Brundage, Elizabeth A; Biesiadecki, Brandon J; Reiser, Peter J

    2015-10-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, "masticatory", isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (a jaw-closing muscle), tibialis (predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression ("St"-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3.12 isoforms. We

  15. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product

    PubMed Central

    Reiser, Peter J.

    2016-01-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, “masticatory”, isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (also a jaw-closing muscle), tibialis (with predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression (“St”-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3

  16. Exome sequencing identifies rare LDLR and APOA5 alleles conferring risk for myocardial infarction.

    PubMed

    Do, Ron; Stitziel, Nathan O; Won, Hong-Hee; Jørgensen, Anders Berg; Duga, Stefano; Angelica Merlini, Pier; Kiezun, Adam; Farrall, Martin; Goel, Anuj; Zuk, Or; Guella, Illaria; Asselta, Rosanna; Lange, Leslie A; Peloso, Gina M; Auer, Paul L; Girelli, Domenico; Martinelli, Nicola; Farlow, Deborah N; DePristo, Mark A; Roberts, Robert; Stewart, Alexander F R; Saleheen, Danish; Danesh, John; Epstein, Stephen E; Sivapalaratnam, Suthesh; Hovingh, G Kees; Kastelein, John J; Samani, Nilesh J; Schunkert, Heribert; Erdmann, Jeanette; Shah, Svati H; Kraus, William E; Davies, Robert; Nikpay, Majid; Johansen, Christopher T; Wang, Jian; Hegele, Robert A; Hechter, Eliana; Marz, Winfried; Kleber, Marcus E; Huang, Jie; Johnson, Andrew D; Li, Mingyao; Burke, Greg L; Gross, Myron; Liu, Yongmei; Assimes, Themistocles L; Heiss, Gerardo; Lange, Ethan M; Folsom, Aaron R; Taylor, Herman A; Olivieri, Oliviero; Hamsten, Anders; Clarke, Robert; Reilly, Dermot F; Yin, Wu; Rivas, Manuel A; Donnelly, Peter; Rossouw, Jacques E; Psaty, Bruce M; Herrington, David M; Wilson, James G; Rich, Stephen S; Bamshad, Michael J; Tracy, Russell P; Cupples, L Adrienne; Rader, Daniel J; Reilly, Muredach P; Spertus, John A; Cresci, Sharon; Hartiala, Jaana; Tang, W H Wilson; Hazen, Stanley L; Allayee, Hooman; Reiner, Alex P; Carlson, Christopher S; Kooperberg, Charles; Jackson, Rebecca D; Boerwinkle, Eric; Lander, Eric S; Schwartz, Stephen M; Siscovick, David S; McPherson, Ruth; Tybjaerg-Hansen, Anne; Abecasis, Goncalo R; Watkins, Hugh; Nickerson, Deborah A; Ardissino, Diego; Sunyaev, Shamil R; O'Donnell, Christopher J; Altshuler, David; Gabriel, Stacey; Kathiresan, Sekar

    2015-02-01

    Myocardial infarction (MI), a leading cause of death around the world, displays a complex pattern of inheritance. When MI occurs early in life, genetic inheritance is a major component to risk. Previously, rare mutations in low-density lipoprotein (LDL) genes have been shown to contribute to MI risk in individual families, whereas common variants at more than 45 loci have been associated with MI risk in the population. Here we evaluate how rare mutations contribute to early-onset MI risk in the population. We sequenced the protein-coding regions of 9,793 genomes from patients with MI at an early age (≤50 years in males and ≤60 years in females) along with MI-free controls. We identified two genes in which rare coding-sequence mutations were more frequent in MI cases versus controls at exome-wide significance. At low-density lipoprotein receptor (LDLR), carriers of rare non-synonymous mutations were at 4.2-fold increased risk for MI; carriers of null alleles at LDLR were at even higher risk (13-fold difference). Approximately 2% of early MI cases harbour a rare, damaging mutation in LDLR; this estimate is similar to one made more than 40 years ago using an analysis of total cholesterol. Among controls, about 1 in 217 carried an LDLR coding-sequence mutation and had plasma LDL cholesterol > 190 mg dl(-1). At apolipoprotein A-V (APOA5), carriers of rare non-synonymous mutations were at 2.2-fold increased risk for MI. When compared with non-carriers, LDLR mutation carriers had higher plasma LDL cholesterol, whereas APOA5 mutation carriers had higher plasma triglycerides. Recent evidence has connected MI risk with coding-sequence mutations at two genes functionally related to APOA5, namely lipoprotein lipase and apolipoprotein C-III (refs 18, 19). Combined, these observations suggest that, as well as LDL cholesterol, disordered metabolism of triglyceride-rich lipoproteins contributes to MI risk. PMID:25487149

  17. FN-Identify: Novel Restriction Enzymes-Based Method for Bacterial Identification in Absence of Genome Sequencing

    PubMed Central

    Ouda, Osama; El-Refy, Ali; El-Feky, Fawzy A.; Mosa, Kareem A.

    2015-01-01

    Sequencing and restriction analysis of genes like 16S rRNA and HSP60 are intensively used for molecular identification in the microbial communities. With aid of the rapid progress in bioinformatics, genome sequencing became the method of choice for bacterial identification. However, the genome sequencing technology is still out of reach in the developing countries. In this paper, we propose FN-Identify, a sequencing-free method for bacterial identification. FN-Identify exploits the gene sequences data available in GenBank and other databases and the two algorithms that we developed, CreateScheme and GeneIdentify, to create a restriction enzyme-based identification scheme. FN-Identify was tested using three different and diverse bacterial populations (members of Lactobacillus, Pseudomonas, and Mycobacterium groups) in an in silico analysis using restriction enzymes and sequences of 16S rRNA gene. The analysis of the restriction maps of the members of three groups using the fragment numbers information only or along with fragments sizes successfully identified all of the members of the three groups using a minimum of four and maximum of eight restriction enzymes. Our results demonstrate the utility and accuracy of FN-Identify method and its two algorithms as an alternative method that uses the standard microbiology laboratories techniques when the genome sequencing is not available. PMID:26880910
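
    The idea of identifying an isolate from fragment numbers alone can be sketched as an in silico digest: count how many fragments each enzyme would produce from a reference sequence, then look for references whose profile matches the observed one. The code below is not CreateScheme or GeneIdentify, whose details the abstract does not give. The function names and the four-enzyme panel are illustrative assumptions (the recognition sites themselves are the standard ones for these enzymes), and the sequence is treated as a linear molecule.

```python
import re

# Illustrative enzyme panel; the recognition sites are the standard ones.
ENZYMES = {"EcoRI": "GAATTC", "HaeIII": "GGCC", "AluI": "AGCT", "TaqI": "TCGA"}


def fragment_counts(seq, enzymes=ENZYMES):
    """Return {enzyme: number of fragments} for a linear sequence."""
    seq = seq.upper()
    profile = {}
    for name, site in enzymes.items():
        # A lookahead counts every occurrence, including overlapping ones.
        cuts = len(re.findall("(?=%s)" % site, seq))
        profile[name] = cuts + 1  # n cuts in a linear molecule give n + 1 fragments
    return profile


def identify(observed_profile, reference_seqs, enzymes=ENZYMES):
    """Return names of references whose fragment-number profile matches."""
    return [
        name
        for name, seq in reference_seqs.items()
        if fragment_counts(seq, enzymes) == observed_profile
    ]
```

    When two references share a profile, `identify` returns both; this is the situation in which the abstract's alternative of adding fragment sizes, or further enzymes, would be needed to separate them.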

  18. A comparative analysis of the ubiquitination kinetics of multiple degrons to identify an ideal targeting sequence for a proteasome reporter.

    PubMed

    Melvin, Adam T; Woss, Gregery S; Park, Jessica H; Dumberger, Lukas D; Waters, Marcey L; Allbritton, Nancy L

    2013-01-01

    The ubiquitin proteasome system (UPS) is the primary pathway responsible for the recognition and degradation of misfolded, damaged, or tightly regulated proteins. The conjugation of a polyubiquitin chain, or polyubiquitination, to a target protein requires an increasingly diverse cascade of enzymes culminating with the E3 ubiquitin ligases. Protein recognition by an E3 ligase occurs through a specific sequence of amino acids, termed a degradation sequence or degron. Recently, degrons have been incorporated into novel reporters to monitor proteasome activity; however, only a few degrons have successfully been incorporated into such reporters. The goal of this work was to evaluate the ubiquitination kinetics of a small library of portable degrons that could eventually be incorporated into novel single cell reporters to assess proteasome activity. After an intensive literature search, eight degrons were identified from proteins recognized by a variety of E3 ubiquitin ligases and incorporated into a four component degron-based substrate to comparatively calculate ubiquitination kinetics. The mechanism of placement of multiple ubiquitins on the different degron-based substrates was assessed by comparing the data to computational models incorporating first order reaction kinetics using either multi-monoubiquitination or polyubiquitination of the degron-based substrates. A subset of three degrons was further characterized to determine the importance of the location and proximity of the ubiquitination site lysine with respect to the degron. Ultimately, this work identified three candidate portable degrons that exhibit a higher rate of ubiquitination compared to peptidase-dependent degradation, a desired trait for a proteasomal targeting motif. PMID:24205101
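
    The two first-order schemes named in this record can be contrasted with a small sketch. The abstract does not give the authors' model equations, so everything below is an assumption made for illustration: polyubiquitination is treated as a chain of sequential first-order additions sharing one rate constant k, multi-monoubiquitination is treated as m independent lysine sites each modified with rate k, and the function names and parameter values are hypothetical.

```python
from math import comb, exp, factorial


def sequential_chain(k, t, n):
    """Fractions of substrate carrying 0..n ubiquitins at time t.

    Assumes each addition is a first-order step with the same rate k and
    that the n-ubiquitin species is the end of the chain.
    """
    probs = [(k * t) ** i * exp(-k * t) / factorial(i) for i in range(n)]
    probs.append(1.0 - sum(probs))  # everything that reached n ubiquitins
    return probs


def independent_sites(k, t, m):
    """Fractions of substrate with 0..m modified sites at time t.

    Assumes m equivalent sites, each modified independently at rate k.
    """
    p = 1.0 - exp(-k * t)  # probability that a given site is modified
    return [comb(m, i) * p ** i * (1.0 - p) ** (m - i) for i in range(m + 1)]


if __name__ == "__main__":
    k, t = 0.5, 2.0  # hypothetical rate constant and time
    print([round(x, 3) for x in sequential_chain(k, t, 4)])
    print([round(x, 3) for x in independent_sites(k, t, 4)])
```

    Under these assumptions the unmodified fraction decays as exp(-kt) in the sequential scheme but as exp(-mkt) in the independent-site scheme, which is one way time-course data could separate the two mechanisms.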

  19. Economic evidence on identifying clinically actionable findings with whole-genome sequencing: a scoping review.

    PubMed

    Douglas, Michael P; Ladabaum, Uri; Pletcher, Mark J; Marshall, Deborah A; Phillips, Kathryn A

    2016-02-01

    The American College of Medical Genetics and Genomics (ACMG) recommends that mutations in 56 genes for 24 conditions are clinically actionable and should be reported as secondary findings after whole-genome sequencing (WGS). Our aim was to identify published economic evaluations of detecting mutations in these genes among the general population or among targeted/high-risk populations and conditions and identify gaps in knowledge. A targeted PubMed search from 1994 through November 2014 was performed, and we included original, English-language articles reporting cost-effectiveness or a cost-to-utility ratio or net benefits/benefit-cost focused on screening (not treatment) for conditions and genes listed by the ACMG. Articles were screened, classified as targeting a high-risk or general population, and abstracted by two reviewers. General population studies were evaluated for actual cost-effectiveness measures (e.g., incremental cost-effectiveness ratios (ICER)), whereas studies of targeted populations were evaluated for whether at least one scenario proposed was cost-effective (e.g., ICER of ≤$100,000 per life-year or quality-adjusted life-year gained). A total of 607 studies were identified, and 32 relevant studies were included. Identified studies addressed fewer than one-third (7 of 24; 29%) of the ACMG conditions. The cost-effectiveness of screening in the general population was examined for only 2 of 24 conditions (8%). The cost-effectiveness of most genetic findings that the ACMG recommends for return has not been evaluated in economic studies or in the context of screening in the general population. The individual studies do not directly address the cost-effectiveness of WGS. PMID:25996638

  20. Exome Sequencing Identifies Biallelic MSH3 Germline Mutations as a Recessive Subtype of Colorectal Adenomatous Polyposis.

    PubMed

    Adam, Ronja; Spier, Isabel; Zhao, Bixiao; Kloth, Michael; Marquez, Jonathan; Hinrichsen, Inga; Kirfel, Jutta; Tafazzoli, Aylar; Horpaopan, Sukanya; Uhlhaas, Siegfried; Stienen, Dietlinde; Friedrichs, Nicolaus; Altmüller, Janine; Laner, Andreas; Holzapfel, Stefanie; Peters, Sophia; Kayser, Katrin; Thiele, Holger; Holinski-Feder, Elke; Marra, Giancarlo; Kristiansen, Glen; Nöthen, Markus M; Büttner, Reinhard; Möslein, Gabriela; Betz, Regina C; Brieger, Angela; Lifton, Richard P; Aretz, Stefan

    2016-08-01

    In ∼30% of families affected by colorectal adenomatous polyposis, no germline mutations have been identified in the previously implicated genes APC, MUTYH, POLE, POLD1, and NTHL1, although a hereditary etiology is likely. To uncover further genes with high-penetrance causative mutations, we performed exome sequencing of leukocyte DNA from 102 unrelated individuals with unexplained adenomatous polyposis. We identified two unrelated individuals with differing compound-heterozygous loss-of-function (LoF) germline mutations in the mismatch-repair gene MSH3. The impact of the MSH3 mutations (c.1148delA, c.2319-1G>A, c.2760delC, and c.3001-2A>C) was indicated at the RNA and protein levels. Analysis of the diseased individuals' tumor tissue demonstrated high microsatellite instability of di- and tetranucleotides (EMAST), and immunohistochemical staining illustrated a complete loss of nuclear MSH3 in normal and tumor tissue, confirming the LoF effect and causal relevance of the mutations. The pedigrees, genotypes, and frequency of MSH3 mutations in the general population are consistent with an autosomal-recessive mode of inheritance. Both index persons have an affected sibling carrying the same mutations. The tumor spectrum in these four persons comprised colorectal and duodenal adenomas, colorectal cancer, gastric cancer, and an early-onset astrocytoma. Additionally, we detected one unrelated individual with biallelic PMS2 germline mutations, representing constitutional mismatch-repair deficiency. Potentially causative variants in 14 more candidate genes identified in 26 other individuals require further workup. In the present study, we identified biallelic germline MSH3 mutations in individuals with a suspected hereditary tumor syndrome. Our data suggest that MSH3 mutations represent an additional recessive subtype of colorectal adenomatous polyposis. PMID:27476653

  1. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    SciTech Connect

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
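
    The distinction this record draws between percent identity and the percentage matched when conservative substitutions are allowed can be sketched as follows. The abstract does not state which substitutions the authors counted as conservative or how gaps were handled, so the residue groups, the gap convention (gapped columns are skipped), the function name, and the toy aligned sequences are all assumptions.

```python
# Hypothetical residue groups treated as conservative substitutions.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def identity_and_similarity(a, b):
    """Percent identity and percent similarity of two aligned sequences.

    Both strings must have the same length; '-' marks a gap, and columns
    containing a gap are excluded from the comparison.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1
        elif any(x in g and y in g for g in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    # Toy alignment, not taken from any fatty acid synthase sequence.
    print(identity_and_similarity("MKVLDESTF-A", "MRVIDQSTYGA"))
```

    Similarity is always at least as large as identity, because every identical column also counts as matched; the gap between the two figures reflects how many differences fall within a conservative group.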

  2. Exome Sequencing Identifies INPPL1 Mutations as a Cause of Opsismodysplasia

    PubMed Central

    Huber, Céline; Faqeih, Eissa Ali; Bartholdi, Deborah; Bole-Feysot, Christine; Borochowitz, Zvi; Cavalcanti, Denise P.; Frigo, Amandine; Nitschke, Patrick; Roume, Joelle; Santos, Heloísa G.; Shalev, Stavit A.; Superti-Furga, Andrea; Delezoide, Anne-Lise; Le Merrer, Martine; Munnich, Arnold; Cormier-Daire, Valérie

    2013-01-01

    Opsismodysplasia (OPS) is a severe autosomal-recessive chondrodysplasia characterized by pre- and postnatal micromelia with extremely short hands and feet. The main radiological features are severe platyspondyly, squared metacarpals, delayed skeletal ossification, and metaphyseal cupping. In order to identify mutations causing OPS, a total of 16 cases (7 terminated pregnancies and 9 postnatal cases) from 10 unrelated families were included in this study. We performed exome sequencing in three cases from three unrelated families and only one gene was found to harbor mutations in all three cases: inositol polyphosphate phosphatase-like 1 (INPPL1). Screening INPPL1 in the remaining cases identified a total of 12 distinct INPPL1 mutations in the 10 families, present at the homozygote state in 7 consanguineous families and at the compound heterozygote state in the 3 remaining families. Most mutations (6/12) resulted in premature stop codons, 2/12 were splice site, and 4/12 were missense mutations located in the catalytic domain, 5-phosphatase. INPPL1 belongs to the inositol-1,4,5-trisphosphate 5-phosphatase family, a family of signal-modulating enzymes that govern a plethora of cellular functions by regulating the levels of specific phosphoinositides. Our finding of INPPL1 mutations in OPS, a severe spondylodysplastic dysplasia with major growth plate disorganization, supports a key and specific role of this enzyme in endochondral ossification. PMID:23273569

  3. Whole-Exome Sequencing Identifies Novel Somatic Mutations in Chinese Breast Cancer Patients

    PubMed Central

    Zhang, Yanfeng; Cai, Qiuyin; Shu, Xiao-Ou; Gao, Yu-Tang; Li, Chun; Zheng, Wei; Long, Jirong

    2016-01-01

    Most breast cancer genomes harbor complex mutational landscapes. Somatic alterations have been predominantly discovered in breast cancer patients of European ancestry; however, little is known about somatic aberration in patients of other ethnic groups including Asians. In the present study, whole-exome sequencing (WES) was conducted in DNA extracted from tumor and matched adjacent normal tissue samples from eleven early onset breast cancer patients who were included in the Shanghai Breast Cancer Study. We discovered 159 somatic missense and ten nonsense mutations distributed among 167 genes. The most frequent 50 somatic mutations identified by WES were selected for validation using Sequenom MassARRAY system in the eleven breast cancer patients and an additional 433 tumor and 921 normal tissue/blood samples from the Shanghai Breast Cancer Study. Among these 50 mutations selected for validation, 32 were technically validated. Within the validated mutations, somatic mutations in the TRPM6, HYDIN, ENTHD1, and NDUFB10 genes were found in two or more tumor samples in the replication stage. Mutations in the ADRA1B, CBFB, KIAA2022, and RBM25 genes were observed once in the replication stage. To summarize, this study identified some novel somatic mutations for breast cancer. Future studies will need to be conducted to determine the function of these mutations/genes in the breast carcinogenesis. PMID:26870154

  4. Comparison of inherently essential genes of Porphyromonas gingivalis identified in two transposon-sequencing libraries.

    PubMed

    Hutcherson, J A; Gogeneni, H; Yoder-Himes, D; Hendrickson, E L; Hackett, M; Whiteley, M; Lamont, R J; Scott, D A

    2016-08-01

    Porphyromonas gingivalis is a Gram-negative anaerobe and keystone periodontal pathogen. A mariner transposon insertion mutant library has recently been used to define 463 genes as putatively essential for the in vitro growth of P. gingivalis ATCC 33277 in planktonic culture (Library 1). We have independently generated a transposon insertion mutant library (Library 2) for the same P. gingivalis strain and herein compare genes that are putatively essential for in vitro growth in complex media, as defined by both libraries. In all, 281 genes (61%) identified by Library 1 were common to Library 2. Many of these common genes are involved in fundamentally important metabolic pathways, notably pyrimidine cycling as well as lipopolysaccharide, peptidoglycan, pantothenate and coenzyme A biosynthesis, and nicotinate and nicotinamide metabolism. Also in common are genes encoding heat-shock protein homologues, sigma factors, enzymes with proteolytic activity, and the majority of sec-related protein export genes. In addition to facilitating a better understanding of critical physiological processes, transposon-sequencing technology has the potential to identify novel strategies for the control of P. gingivalis infections. Those genes defined as essential by two independently generated TnSeq mutant libraries are likely to represent particularly attractive therapeutic targets. PMID:26358096

  5. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma.

    PubMed

    Comino-Méndez, Iñaki; Gracia-Aznárez, Francisco J; Schiavi, Francesca; Landa, Iñigo; Leandro-García, Luis J; Letón, Rocío; Honrado, Emiliano; Ramos-Medina, Rocío; Caronia, Daniela; Pita, Guillermo; Gómez-Graña, Alvaro; de Cubas, Aguirre A; Inglada-Pérez, Lucía; Maliszewska, Agnieszka; Taschin, Elisa; Bobisse, Sara; Pica, Giuseppe; Loli, Paola; Hernández-Lavado, Rafael; Díaz, José A; Gómez-Morales, Mercedes; González-Neira, Anna; Roncador, Giovanna; Rodríguez-Antona, Cristina; Benítez, Javier; Mannelli, Massimo; Opocher, Giuseppe; Robledo, Mercedes; Cascón, Alberto

    2011-07-01

    Hereditary pheochromocytoma (PCC) is often caused by germline mutations in one of nine susceptibility genes described to date, but there are familial cases without mutations in these known genes. We sequenced the exomes of three unrelated individuals with hereditary PCC (cases) and identified mutations in MAX, the MYC associated factor X gene. Absence of MAX protein in the tumors and loss of heterozygosity caused by uniparental disomy supported the involvement of MAX alterations in the disease. A follow-up study of a selected series of 59 cases with PCC identified five additional MAX mutations and suggested an association with malignant outcome and preferential paternal transmission of MAX mutations. The involvement of the MYC-MAX-MXD1 network in the development and progression of neural crest cell tumors is further supported by the lack of functional MAX in rat PCC (PC12) cells and by the amplification of MYCN in neuroblastoma and suggests that loss of MAX function is correlated with metastatic potential. PMID:21685915

  6. A new method to identify flanking sequence tags in chlamydomonas using 3’-RACE

    PubMed Central

    2012-01-01

    Background: The green alga Chlamydomonas reinhardtii, although a premier model organism in biology, still lacks extensive insertion mutant libraries with well-identified Flanking Sequence Tags (FSTs). Rapid and efficient methods are needed for FST retrieval. Results: Here, we present a novel method to identify FSTs in insertional mutants of Chlamydomonas. Transformants can be obtained with a resistance cassette lacking a 3’ untranslated region (UTR), suggesting that the RNA that is produced from the resistance marker terminates in the flanking genome when it encounters a cleavage/polyadenylation signal. We have used a robust 3’-RACE method to specifically amplify such chimeric cDNAs. Out of 38 randomly chosen transformants, 27 (71%) yielded valid FSTs, of which 23 could be unambiguously mapped to the genome. Eighteen of the mutants lie within a predicted gene. All but two of the intragenic insertions occur in the sense orientation with respect to transcription, suggesting a bias against situations of convergent transcription. Among the 14 insertion sites tested by genomic PCR, 12 could be confirmed. Among these are insertions in genes coding for PSBS3 (possibly involved in non-photochemical quenching), the NimA-related protein kinase CNK2, the mono-dehydroascorbate reductase MDAR1, the phosphoglycerate mutase PGM5, etc. Conclusion: We propose that our 3’-RACE FST method can be used to build large scale FST libraries in Chlamydomonas and other transformable organisms. PMID:22735168

  7. Exome sequencing identifies SLC24A5 as a candidate gene for nonsyndromic oculocutaneous albinism.

    PubMed

    Wei, Ai-Hua; Zang, Dong-Jie; Zhang, Zhe; Liu, Xuan-Zhu; He, Xin; Yang, Lin; Wang, Yi; Zhou, Zhi-Yong; Zhang, Ming-Rong; Dai, Lan-Lan; Yang, Xiu-Min; Li, Wei

    2013-07-01

    Oculocutaneous albinism (OCA) is a heterogeneous and autosomal recessive disorder with hypopigmentation in the eye, hair, and skin color. Four genes, TYR, OCA2, TYRP1, and SLC45A2, have been identified as causative genes for nonsyndromic OCA1-4, respectively. The genetic identity of OCA5 locus on 4q24 is unknown. Additional unknown OCA genes may exist as at least 5% of OCA patients have not been characterized during mutational screening in several populations. We used exome sequencing with a family-based recessive mutation model to determine that SLC24A5 is a previously unreported candidate gene for nonsyndromic OCA, which we designate as OCA6. Two deleterious mutations in this patient, c.591G>A and c.1361insT, were identified. We found apparent increase of immature melanosomes and less mature melanosomes in the patient's skin melanocytes. However, no defects in the platelet dense granules were observed, excluding typical Hermansky-Pudlak syndrome (HPS), a well-known syndromic OCA. Moreover, the SLC24A5 protein was reduced in steady-state levels in mouse HPS mutants with deficiencies in BLOC-1 and BLOC-2. Our results suggest that SLC24A5 is a previously unreported nonsyndromic OCA candidate gene and that the SLC24A5 transporter is transported into mature melanosomes by HPS protein complexes. PMID:23364476

  8. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of...

  9. [Distribution of nontuberculous mycobacteria isolated from clinical specimens and identified with DNA sequence analysis].

    PubMed

    Özçolpan, O Olcay; Sürücüoğlu, Süheyla; Özkütük, Nuri; Çavuşoğlu, Cengiz

    2015-10-01

    The aims of the study were to perform the identification of nontuberculous mycobacteria (NTM) isolated from different clinical specimens in the Mycobacteriology Laboratory of Celal Bayar University, Manisa (located at Aegean region of Turkey), by DNA sequence analysis, and to discuss the epidemiological aspects of the data obtained. Out of 5122 clinical specimens sent to the laboratory with the initial diagnosis of tuberculosis in the period April 2007 to July 2011, M.tuberculosis complex and NTM were identified in 225 (4.39%) and 126 (2.46%) samples, respectively. DNA sequence analysis by targeting hsp65 and 16S rDNA gene regions was performed on 101 of the NTM strains in Mycobacteriology Laboratory of Ege University, Izmir. DNA sequence analysis data was evaluated using RIDOM and GenBLAST data bases. NTM strains were identified as 40 M.porcinum (39.60%), 36 M.lentiflavum (35.65%), six M.abscessus (5.64%), five M.peregrinum (4.95%), four M.gordonae (3.96%), three M.fortuitum (2.97%), two M.chelonae (1.98%), and one for each M.alvei (0.99%), M.scrofulaceum (0.99%), M.kansasii (0.99%) species. Two strains which were both 95-98% compatible with other mycobacteria in the data bases could not be identified with certainty. Seventy-two (94.73%) strains of M.lentiflavum and M.porcinum, which were the most frequent (75.24%) species in the study, were isolated from bronchoalveolar lavage (BAL) specimens. The remaining 99 strains examined could not be proven as the cause of the disease due to absence of patients' clinical data, whereas two M.abscessus strains isolated from the sputum were considered as the cause of the disease according to the ATS/IDSA criteria. The isolation rate of NTM in 2010 was found significantly higher (5.33%) than previous years. Review of the 2010 data showed that all strains of M.porcinum and M.lentiflavum, which were the most frequently identified strains were isolated from BAL specimens. This situation is in line with the start of using of an

  10. Amino acid sequence of Japanese quail (Coturnix japonica) and northern bobwhite (Colinus virginianus) myoglobin.

    PubMed

    Goodson, John; Beckstead, Robert B; Payne, Jason; Singh, Rakesh K; Mohan, Anand

    2015-08-15

    Myoglobin has an important physiological role in vertebrates, and as the primary sarcoplasmic pigment in meat, influences quality perception and consumer acceptability. In this study, the amino acid sequences of Japanese quail and northern bobwhite myoglobin were deduced by cDNA cloning of the coding sequence from mRNA. Japanese quail myoglobin was isolated from quail cardiac muscles, purified using ammonium sulphate precipitation and gel-filtration, and subjected to multiple enzymatic digestions. Mass spectrometry corroborated the deduced protein amino acid sequence at the protein level. Sequence analysis revealed both species' myoglobin structures consist of 153 amino acids, differing at only three positions. When compared with chicken myoglobin, Japanese quail showed 98% sequence identity, and northern bobwhite 97% sequence identity. The myoglobin in both quail species contained eight histidine residues instead of the nine present in chicken and turkey. PMID:25794748

  11. Transcriptome sequencing identifies ETV6-NTRK3 as a gene fusion involved in GIST.

    PubMed

    Brenca, Monica; Rossi, Sabrina; Polano, Maurizio; Gasparotto, Daniela; Zanatta, Lucia; Racanelli, Dominga; Valori, Laura; Lamon, Stefano; Dei Tos, Angelo Paolo; Maestro, Roberta

    2016-03-01

    Gastrointestinal stromal tumours (GISTs) are the most common mesenchymal neoplasms of the gastrointestinal tract. The vast majority of GISTs are driven by oncogenic activation of KIT, PDGFRA or, less commonly, BRAF. Loss of succinate dehydrogenase complex activity has been identified in subsets of KIT/PDGFRA/BRAF-mutation negative tumours, yet a significant fraction of GISTs are devoid of any of such alterations. To address the pathobiology of these 'quadruple-negative' GISTs, we sought to explore the possible involvement of fusion genes. To this end we performed transcriptome sequencing on five KIT/PDGFRA/BRAF-mutation negative, SDH-proficient tumours. Intriguingly, the analysis unveiled the presence of an ETV6-NTRK3 gene fusion. The screening by FISH of 26 additional cases, including KIT/PDGFRA-mutated GISTs, failed to detect other ETV6 rearrangements beside the index case. This was a 'quadruple-negative' GIST located in the rectum, an uncommon primary site for GIST development (∼4% of all GISTs). The fusion transcript identified encompasses exon 4 of ETV6 and exon 14 of NTRK3 and therefore differs from the canonical ETV6-NTRK3 chimera of infantile fibrosarcomas. However, it retains the ability to induce IRS1 phosphorylation, activate the IGF1R downstream signalling pathway and to be targeted by IGF1R and ALK inhibitors. Thus, the ETV6-NTRK3 fusion might identify a subset of GISTs with peculiar clinicopathological characteristics which could be eligible for such therapies. Copyright © 2015 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd. PMID:26606880

  12. Whole-exome sequencing identifies MST1R as a genetic susceptibility gene in nasopharyngeal carcinoma.

    PubMed

    Dai, Wei; Zheng, Hong; Cheung, Arthur Kwok Leung; Tang, Clara Sze-Man; Ko, Josephine Mun Yee; Wong, Bonnie Wing Yan; Leong, Merrin Man Long; Sham, Pak Chung; Cheung, Florence; Kwong, Dora Lai-Wan; Ngan, Roger Kai Cheong; Ng, Wai Tong; Yau, Chun Chung; Pan, Jianji; Peng, Xun; Tung, Stewart; Zhang, Zengfeng; Ji, Mingfang; Chiang, Alan Kwok-Shing; Lee, Anne Wing-Mui; Lee, Victor Ho-Fun; Lam, Ka-On; Au, Kwok Hung; Cheng, Hoi Ching; Yiu, Harry Ho-Yin; Lung, Maria Li

    2016-03-22

    Multiple factors, including host genetics, environmental factors, and Epstein-Barr virus (EBV) infection, contribute to nasopharyngeal carcinoma (NPC) development. To identify genetic susceptibility genes for NPC, a whole-exome sequencing (WES) study was performed in 161 NPC cases and 895 controls of Southern Chinese descent. The gene-based burden test discovered an association between macrophage-stimulating 1 receptor (MST1R) and NPC. We identified 13 independent cases carrying the MST1R pathogenic heterozygous germ-line variants, and 53.8% of these cases were diagnosed with NPC aged at or even younger than 20 y, indicating that MST1R germ-line variants are relevant to disease early-age onset (EAO) (age of ≤20 y). In total, five MST1R missense variants were found in EAO cases but were rare in controls (EAO vs. control, 17.9% vs. 1.2%, P = 7.94 × 10⁻¹²). The validation study, including 2,160 cases and 2,433 controls, showed that the MST1R variant c.G917A:p.R306H is highly associated with NPC (odds ratio of 9.0). MST1R is predominantly expressed in the tissue-resident macrophages and is critical for innate immunity that protects organs from tissue damage and inflammation. Importantly, MST1R expression is detected in the ciliated epithelial cells in normal nasopharyngeal mucosa and plays a role in the cilia motility important for host defense. Although no somatic mutation of MST1R was identified in the sporadic NPC tumors, copy number alterations and promoter hypermethylation at MST1R were often observed. Our findings provide new insights into the pathogenesis of NPC by highlighting the involvement of the MST1R-mediated signaling pathways. PMID:26951679

  13. Multilocus sequence typing identifies epidemic clones of Flavobacterium psychrophilum in Nordic countries.

    PubMed

    Nilsen, Hanne; Sundell, Krister; Duchaud, Eric; Nicolas, Pierre; Dalsgaard, Inger; Madsen, Lone; Aspán, Anna; Jansson, Eva; Colquhoun, Duncan J; Wiklund, Tom

    2014-05-01

    Flavobacterium psychrophilum is the causative agent of bacterial cold water disease (BCWD), which affects a variety of freshwater-reared salmonid species. A large-scale study was performed to investigate the genetic diversity of F. psychrophilum in the four Nordic countries: Denmark, Finland, Norway, and Sweden. Multilocus sequence typing of 560 geographically and temporally disparate F. psychrophilum isolates collected from various sources between 1983 and 2012 revealed 81 different sequence types (STs) belonging to 12 clonal complexes (CCs) and 30 singleton STs. The largest CC, CC-ST10, which represented almost exclusively isolates from rainbow trout and included the most predominant genotype, ST2, comprised 65% of all isolates examined. In Norway, with a shorter history (<10 years) of BCWD in rainbow trout, ST2 was the only isolated CC-ST10 genotype, suggesting a recent introduction of an epidemic clone. The study identified five additional CCs shared between countries and five country-specific CCs, some with apparent host specificity. Almost 80% of the singleton STs were isolated from non-rainbow trout species or the environment. The present study reveals a simultaneous presence of genetically distinct CCs in the Nordic countries and points out specific F. psychrophilum STs posing a threat to the salmonid production. The study provides a significant contribution toward mapping the genetic diversity of F. psychrophilum globally and support for the existence of an epidemic population structure where recombination is a significant driver in F. psychrophilum evolution. Evidence indicating dissemination of a putatively virulent clonal complex (CC-ST10) with commercial movement of fish or fish products is strengthened. PMID:24561585

  14. Diversity of the causal genes in hearing impaired Algerian individuals identified by whole exome sequencing.

    PubMed

    Ammar-Khodja, Fatima; Bonnet, Crystel; Dahmani, Malika; Ouhab, Sofiane; Lefèvre, Gaelle M; Ibrahim, Hassina; Hardelin, Jean-Pierre; Weil, Dominique; Louha, Malek; Petit, Christine

    2015-05-01

    The genetic heterogeneity of congenital hearing disorders makes molecular diagnosis expensive and time-consuming using conventional techniques such as Sanger sequencing of DNA. In order to design an appropriate strategy of molecular diagnosis in the Algerian population, we explored the diversity of the involved mutations by studying 65 families affected by autosomal recessive forms of nonsyndromic hearing impairment (DFNB forms), which are the most prevalent early onset forms. We first carried out a systematic screening for mutations in GJB2 and the recurrent p.(Arg34*) mutation in TMC1, which were found in 31 (47.7%) families and 1 (1.5%) family, respectively. We then performed whole exome sequencing in nine of the remaining families, and identified the causative mutations in all the patients analyzed, either in the homozygous state (eight families) or in the compound heterozygous state (one family): (c.709C>T: p.(Arg237*)) and (c.2122C>T: p.(Arg708*)) in OTOF, (c.1334T>G: p.(Leu445Trp)) in SLC26A4, (c.764T>A: p.(Met255Lys)) in GIPC3, (c.518T>A: p.(Cys173Ser)) in LHFPL5, (c.5336T>C: p.(Leu1779Pro)) in MYO15A, (c.1807G>T: p.(Val603Phe)) in OTOA, (c.6080dup: p.(Asn2027Lys*9)) in PTPRQ, and (c.6017del: p.(Gly2006Alafs*13); c.7188_7189ins14: p.(Val2397Leufs*2)) in GPR98. Notably, 7 of these 10 mutations affecting 8 different genes had not been reported previously. These results highlight for the first time the genetic heterogeneity of the early onset forms of nonsyndromic deafness in Algerian families. PMID:26029705

  15. Diversity of the causal genes in hearing impaired Algerian individuals identified by whole exome sequencing

    PubMed Central

    Ammar-Khodja, Fatima; Bonnet, Crystel; Dahmani, Malika; Ouhab, Sofiane; Lefèvre, Gaelle M; Ibrahim, Hassina; Hardelin, Jean-Pierre; Weil, Dominique; Louha, Malek; Petit, Christine

    2015-01-01

    The genetic heterogeneity of congenital hearing disorders makes molecular diagnosis expensive and time-consuming using conventional techniques such as Sanger sequencing of DNA. In order to design an appropriate strategy of molecular diagnosis in the Algerian population, we explored the diversity of the involved mutations by studying 65 families affected by autosomal recessive forms of nonsyndromic hearing impairment (DFNB forms), which are the most prevalent early onset forms. We first carried out a systematic screening for mutations in GJB2 and the recurrent p.(Arg34*) mutation in TMC1, which were found in 31 (47.7%) families and 1 (1.5%) family, respectively. We then performed whole exome sequencing in nine of the remaining families, and identified the causative mutations in all the patients analyzed, either in the homozygous state (eight families) or in the compound heterozygous state (one family): (c.709C>T: p.(Arg237*)) and (c.2122C>T: p.(Arg708*)) in OTOF, (c.1334T>G: p.(Leu445Trp)) in SLC26A4, (c.764T>A: p.(Met255Lys)) in GIPC3, (c.518T>A: p.(Cys173Ser)) in LHFPL5, (c.5336T>C: p.(Leu1779Pro)) in MYO15A, (c.1807G>T: p.(Val603Phe)) in OTOA, (c.6080dup: p.(Asn2027Lys*9)) in PTPRQ, and (c.6017del: p.(Gly2006Alafs*13); c.7188_7189ins14: p.(Val2397Leufs*2)) in GPR98. Notably, 7 of these 10 mutations affecting 8 different genes had not been reported previously. These results highlight for the first time the genetic heterogeneity of the early onset forms of nonsyndromic deafness in Algerian families. PMID:26029705

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  17. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  18. High density genome wide genotyping-by-sequencing and association identifies common and low frequency SNPs, and novel candidate genes influencing cow milk traits

    PubMed Central

    Ibeagha-Awemu, Eveline M.; Peters, Sunday O.; Akwanji, Kingsley A.; Imumorin, Ikhide G.; Zhao, Xin

    2016-01-01

    High-throughput sequencing technologies have increased the ability to detect sequence variations for complex trait improvement. A high throughput genome wide genotyping-by-sequencing (GBS) method was used to generate 515,787 single nucleotide polymorphisms (SNPs), from which 76,355 SNPs with call rates >85% and minor allele frequency ≥1.5% were used in genome wide association study (GWAS) of 44 milk traits in 1,246 Canadian Holstein cows. GWAS was accomplished with a mixed linear model procedure implementing the additive and dominant models. A strong signal within the centromeric region of bovine chromosome 14 was associated with test day fat percentage. Several SNPs were associated with eicosapentaenoic acid, docosapentaenoic acid, arachidonic acid, CLA:9c11t and gamma linolenic acid. Most of the significant SNPs for 44 traits studied are novel and located in intergenic regions or introns of genes. Novel potential candidate genes for milk traits or mammary gland functions include ERCC6, TONSL, NPAS2, ACER3, ITGB4, GGT6, ACOX3, MECR, ADAM12, ACHE, LRRC14, FUK, NPRL3, EVL, SLCO3A1, PSMA4, FTO, ADCK5, PP1R16A and TEP1. Our study further demonstrates the utility of the GBS approach for identifying population-specific SNPs for use in improvement of complex dairy traits. PMID:27506634

  19. High density genome wide genotyping-by-sequencing and association identifies common and low frequency SNPs, and novel candidate genes influencing cow milk traits.

    PubMed

    Ibeagha-Awemu, Eveline M; Peters, Sunday O; Akwanji, Kingsley A; Imumorin, Ikhide G; Zhao, Xin

    2016-01-01

    High-throughput sequencing technologies have increased the ability to detect sequence variations for complex trait improvement. A high throughput genome wide genotyping-by-sequencing (GBS) method was used to generate 515,787 single nucleotide polymorphisms (SNPs), from which 76,355 SNPs with call rates >85% and minor allele frequency ≥1.5% were used in genome wide association study (GWAS) of 44 milk traits in 1,246 Canadian Holstein cows. GWAS was accomplished with a mixed linear model procedure implementing the additive and dominant models. A strong signal within the centromeric region of bovine chromosome 14 was associated with test day fat percentage. Several SNPs were associated with eicosapentaenoic acid, docosapentaenoic acid, arachidonic acid, CLA:9c11t and gamma linolenic acid. Most of the significant SNPs for 44 traits studied are novel and located in intergenic regions or introns of genes. Novel potential candidate genes for milk traits or mammary gland functions include ERCC6, TONSL, NPAS2, ACER3, ITGB4, GGT6, ACOX3, MECR, ADAM12, ACHE, LRRC14, FUK, NPRL3, EVL, SLCO3A1, PSMA4, FTO, ADCK5, PP1R16A and TEP1. Our study further demonstrates the utility of the GBS approach for identifying population-specific SNPs for use in improvement of complex dairy traits. PMID:27506634
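
    The SNP quality filter quoted in this abstract (call rate >85%, minor allele frequency ≥1.5%) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the 0/1/2 genotype coding, the use of None for missing calls, and the toy genotypes are all assumptions; only the two thresholds come from the abstract.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype (None marks a missing call)."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for a biallelic SNP coded as 0/1/2 copies of the alternate allele."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_filter(genotypes, min_call_rate=0.85, min_maf=0.015):
    """Keep a SNP only if call rate > min_call_rate and MAF >= min_maf."""
    return (call_rate(genotypes) > min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Toy genotypes, invented purely for illustration (hypothetical SNP names).
toy_snps = {
    "snp_a": [0, 1, 2, 0, 1, 0, 0, 1, 0, 0],        # fully called, polymorphic
    "snp_b": [0, 0, None, None, 0, 0, 0, 0, 0, 0],  # two missing calls
    "snp_c": [0] * 10,                              # monomorphic
}
kept = [name for name, g in toy_snps.items() if passes_filter(g)]
print(kept)
```

    In this toy set only snp_a survives: snp_b fails the call-rate threshold and snp_c has no minor allele.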

  20. Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...

  1. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex gene product facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  2. Hartnup disorder: polymorphisms identified in the neutral amino acid transporter SLC1A5.

    PubMed

    Potter, S J; Lu, A; Wilcken, B; Green, K; Rasko, J E J

    2002-10-01

    Hartnup disorder is an inborn error of renal and gastrointestinal neutral amino acid transport. The cloning and functional characterization of the 'system B0' neutral amino acid transporter SLC1A5 led to it being proposed as a candidate gene for Hartnup disorder. Linkage analysis performed at 19q13.3, the chromosomal position of SLC1A5, was suggestive of an association with the Hartnup phenotype in some families. However, SLC1A5 was not linked to the Hartnup phenotype in other families. Linkage analysis also excluded an alternative candidate region at 11q13 implicated by a putative mouse model for Hartnup disorder. Sequencing of the coding region of SLC1A5 in Hartnup patients revealed two coding region polymorphisms. These mutations did not alter the predicted amino acid sequence of SLC1A5 and were considered unlikely to play a role in Hartnup disorder. There were no mutations in splice sites flanking each exon. Quantitative RT-PCR of SLC1A5 messenger RNA in affected and unaffected subjects did not support systemic differences in expression as an explanation for Hartnup disorder. In the six unrelated Hartnup pedigrees studied, examination of linkage at 19q13.3, polymorphisms in the coding sequence and quantitation of expression of SLC1A5 did not suffice to explain the defect in neutral amino acid transport. PMID:12555937

  3. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities. PMID:4029488

  4. Proteomic analysis of cerebrospinal fluid in California sea lions (Zalophus californianus) with domoic acid toxicosis identifies proteins associated with neurodegeneration.

    PubMed

    Neely, Benjamin A; Soper, Jennifer L; Gulland, Frances M D; Bell, P Darwin; Kindy, Mark; Arthur, John M; Janech, Michael G

    2015-12-01

    Proteomic studies including marine mammals are rare, largely due to the lack of fully sequenced genomes. This has hampered the application of these techniques toward biomarker discovery efforts for monitoring of health and disease in these animals. We conducted a pilot label-free LC-MS/MS study to profile and compare the cerebrospinal fluid from California sea lions with domoic acid toxicosis (DAT) and without DAT. Across 11 samples, a total of 206 proteins were identified (FDR<0.1) using a composite mammalian database. Several peptide identifications were validated using stable isotope labeled peptides. Comparison of spectral counts revealed seven proteins that were elevated in the cerebrospinal fluid from sea lions with DAT: complement C3, complement factor B, dickkopf-3, malate dehydrogenase 1, neuron cell adhesion molecule 1, gelsolin, and neuronal cell adhesion molecule. Immunoblot analysis found reelin to be depressed in the cerebrospinal fluid from California sea lions with DAT. Mice administered domoic acid also had lower hippocampal reelin protein levels suggesting that domoic acid depresses reelin similar to kainic acid. In summary, proteomic analysis of cerebrospinal fluid in marine mammals is a useful tool to characterize the underlying molecular pathology of neurodegenerative disease. All MS data have been deposited in the ProteomeXchange with identifier PXD002105 (http://proteomecentral.proteomexchange.org/dataset/PXD002105). PMID:26364553

  5. Long Non-Coding RNA and Alternative Splicing Modulations in Parkinson's Leukocytes Identified by RNA Sequencing

    PubMed Central

    Soreq, Lilach; Guffanti, Alessandro; Salomonis, Nathan; Simchovitz, Alon; Israel, Zvi; Bergman, Hagai; Soreq, Hermona

    2014-01-01

    The continuously prolonged human lifespan is accompanied by an increase in the incidence of neurodegenerative diseases, calling for the development of inexpensive blood-based diagnostics. Analyzing blood cell transcripts by RNA-Seq is a robust means to identify novel biomarkers and is rapidly becoming commonplace. However, there is a lack of tools to discover novel exons, junctions and splicing events and to precisely and sensitively assess differential splicing through RNA-Seq data analysis and across RNA-Seq platforms. Here, we present a new and comprehensive computational workflow for whole-transcriptome RNA-Seq analysis, using an updated version of the software AltAnalyze, to identify both known and novel high-confidence alternative splicing events, and to integrate them with both protein-domains and microRNA binding annotations. We applied the novel workflow on RNA-Seq data from Parkinson's disease (PD) patients' leukocytes pre- and post- Deep Brain Stimulation (DBS) treatment and compared to healthy controls. Disease-mediated changes included decreased usage of alternative promoters and N-termini, 5′-end variations and mutually-exclusive exons. The PD regulated FUS and HNRNP A/B included prion-like domains regulated regions. We also present here a workflow to identify and analyze long non-coding RNAs (lncRNAs) via RNA-Seq data. We identified reduced lncRNA expression and selective PD-induced changes in 13 of over 6,000 detected leukocyte lncRNAs, four of which were inversely altered post-DBS. These included the U1 spliceosomal lncRNA and RP11-462G22.1, each entailing sequence complementarity to numerous microRNAs. Analysis of RNA-Seq from PD and unaffected control brains revealed over 7,000 brain-expressed lncRNAs, of which 3,495 were co-expressed in the leukocytes including U1, which showed both leukocyte and brain increases. Furthermore, qRT-PCR validations confirmed these co-increases in PD leukocytes and two brain regions, the amygdala and substantia

  6. Sequence and structure-based comparative analysis to assess, identify and improve the thermostability of penicillin G acylases.

    PubMed

    Panigrahi, Priyabrata; Chand, Deepak; Mukherji, Ruchira; Ramasamy, Sureshkumar; Suresh, C G

    2015-11-01

    Penicillin acylases are enzymes employed by the pharmaceutical industry for the manufacture of semi-synthetic penicillins. There is a continuous demand for thermostable and alkalophilic enzymes in such applications. We have carried out a computational analysis of known penicillin G acylases (PGAs) in terms of their thermostable nature using various protein-stabilizing factors. While the presence of disulfide bridges was considered initially to screen putative thermostable PGAs from the database, various other factors such as a high arginine to lysine ratio, a lower content of thermolabile amino acids, the presence of proline in β-turns, and a greater number of ion-pair and other non-bonded interactions were also considered for comparison. A modified consensus approach could further identify stabilizing residue positions by site-specific comparison between mesostable and thermostable PGAs. The most likely thermostable enzyme identified from the analysis was PGA from Paracoccus denitrificans (PdPGA). This was cloned, expressed and tested for its thermostable nature using biochemical and biophysical experiments. The consensus site-specific sequence-based approach predicted PdPGA to be more thermostable than Escherichia coli PGA, but not as thermostable as the PGA from Achromobacter xylosoxidans. Experimental data showed that PdPGA was comparatively less thermostable than Achromobacter xylosoxidans PGA, although thermostability factors favored a much higher stability. Although PdPGA is mesostable, its activity and stability at alkaline pH are an advantage. Finally, several residue positions could be identified in PdPGA which, upon selective mutation, could improve the thermostability of the enzyme. PMID:26419382

  7. Highly recurring sequence elements identified in eukaryotic DNAs by computer analysis are often homologous to regulatory sequences or protein binding sites.

    PubMed Central

    Bodnar, J W; Ward, D C

    1987-01-01

    We have used computer assisted dot matrix and oligonucleotide frequency analyses to identify highly recurring sequence elements of 7-11 base pairs in eukaryotic genes and viral DNAs. Such elements are found much more frequently than expected, often with an average spacing of a few hundred base pairs. Furthermore, the most abundant repetitive elements observed in the ovalbumin locus, the beta-globin gene cluster, the metallothionein gene and the viral genomes of SV40, polyoma, Herpes simplex-1 and Mouse Mammary Tumor Virus were sequences shown previously to be protein binding sites or sequences important for regulating gene expression. These sequences were present in both exons and introns as well as promoter regions. These observations suggest that such sequences are often highly overrepresented within the specific gene segments with which they are associated. Computer analysis of other genetic units, including viral genomes and oncogenes, has identified a number of highly recurring sequence elements that could serve similar regulatory or protein-binding functions. A model for the role of such reiterated sequence elements in DNA organization and function is presented. PMID:3822840
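
    The oligonucleotide frequency analysis described in this abstract can be illustrated with a small Python sketch. It is not the authors' program: the function names, the toy sequence, and the `min_count` threshold are assumptions made for illustration; only the 7-11 base pair length range comes from the abstract.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping subsequence of length k in a DNA string."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def recurring_kmers(sequence, k_min=7, k_max=11, min_count=2):
    """Return {kmer: count} for k-mers of length k_min..k_max seen at least min_count times."""
    hits = {}
    for k in range(k_min, k_max + 1):
        for kmer, n in count_kmers(sequence, k).items():
            if n >= min_count:
                hits[kmer] = n
    return hits


# Toy sequence, invented for illustration: the 7-mer GGGCGGG occurs twice.
toy = "ATGGGCGGGTTACAGGGCGGGTA"
for kmer, n in sorted(recurring_kmers(toy, k_min=7, k_max=7).items()):
    print(kmer, n)
```

    A real analysis would also compare each observed count with the count expected by chance for the sequence's base composition, which is what "more frequently than expected" refers to; that step is omitted here.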

  8. Respiratory Syncytial Virus whole-genome sequencing identifies convergent evolution of sequence duplication in the C-terminus of the G gene

    PubMed Central

    Schobel, Seth A.; Stucker, Karla M.; Moore, Martin L.; Anderson, Larry J.; Larkin, Emma K.; Shankar, Jyoti; Bera, Jayati; Puri, Vinita; Shilts, Meghan H.; Rosas-Salazar, Christian; Halpin, Rebecca A.; Fedorova, Nadia; Shrivastava, Susmita; Stockwell, Timothy B.; Peebles, R. Stokes; Hartert, Tina V.; Das, Suman R.

    2016-01-01

    Respiratory Syncytial Virus (RSV) is responsible for considerable morbidity and mortality worldwide and is the most important respiratory viral pathogen in infants. Extensive sequence variability within and between RSV group A and B viruses and the ability of multiple clades and sub-clades of RSV to co-circulate are likely mechanisms contributing to the evasion of herd immunity. Surveillance and large-scale whole-genome sequencing of RSV is currently limited but would help identify its evolutionary dynamics and sites of selective immune evasion. In this study, we performed complete-genome next-generation sequencing of 92 RSV isolates from infants in central Tennessee during the 2012–2014 RSV seasons. We identified multiple co-circulating clades of RSV from both the A and B groups. Each clade is defined by signature N- and O-linked glycosylation patterns. Analyses of specific RSV genes revealed high rates of positive selection in the attachment (G) gene. We identified RSV-A viruses in circulation with and without a recently reported 72-nucleotide G gene sequence duplication. Furthermore, we show evidence of convergent evolution of G gene sequence duplication and fixation over time, which suggests a potential fitness advantage of RSV with the G sequence duplication. PMID:27212633

  9. Respiratory Syncytial Virus whole-genome sequencing identifies convergent evolution of sequence duplication in the C-terminus of the G gene.

    PubMed

    Schobel, Seth A; Stucker, Karla M; Moore, Martin L; Anderson, Larry J; Larkin, Emma K; Shankar, Jyoti; Bera, Jayati; Puri, Vinita; Shilts, Meghan H; Rosas-Salazar, Christian; Halpin, Rebecca A; Fedorova, Nadia; Shrivastava, Susmita; Stockwell, Timothy B; Peebles, R Stokes; Hartert, Tina V; Das, Suman R

    2016-01-01

    Respiratory Syncytial Virus (RSV) is responsible for considerable morbidity and mortality worldwide and is the most important respiratory viral pathogen in infants. Extensive sequence variability within and between RSV group A and B viruses and the ability of multiple clades and sub-clades of RSV to co-circulate are likely mechanisms contributing to the evasion of herd immunity. Surveillance and large-scale whole-genome sequencing of RSV is currently limited but would help identify its evolutionary dynamics and sites of selective immune evasion. In this study, we performed complete-genome next-generation sequencing of 92 RSV isolates from infants in central Tennessee during the 2012-2014 RSV seasons. We identified multiple co-circulating clades of RSV from both the A and B groups. Each clade is defined by signature N- and O-linked glycosylation patterns. Analyses of specific RSV genes revealed high rates of positive selection in the attachment (G) gene. We identified RSV-A viruses in circulation with and without a recently reported 72-nucleotide G gene sequence duplication. Furthermore, we show evidence of convergent evolution of G gene sequence duplication and fixation over time, which suggests a potential fitness advantage of RSV with the G sequence duplication. PMID:27212633

  10. The complete amino acid sequence of the A-chain of human plasma alpha 2HS-glycoprotein.

    PubMed

    Yoshioka, Y; Gejyo, F; Marti, T; Rickli, E E; Bürgi, W; Offner, G D; Troxler, R F; Schmid, K

    1986-02-01

    Normal human plasma alpha 2HS-glycoprotein has earlier been shown to be comprised of two polypeptide chains. Recently, the amino acid and carbohydrate sequences of the short chain were elucidated (Gejyo, F., Chang, J.-L., Bürgi, W., Schmid, K., Offner, G. D., Troxler, R.F., van Halbeck, H., Dorland, L., Gerwig, G. J., and Vliegenthart, J.F.G. (1983) J. Biol. Chem. 258, 4966-4971). In the present study, the amino acid sequence of the long chain of this protein, designated A-chain, was determined and found to consist of 282 amino acid residues. Twenty-four amino acid doublets were found; the most abundant of these are Pro-Pro and Ala-Ala which each occur five times. Of particular interest is the presence of three Gly-X-Pro and one Gly-Pro-X sequences that are characteristic of the repeating sequences of collagens. Chou-Fasman evaluation of the secondary structure suggested that the A-chain contains 29% alpha-helix, 24% beta-pleated sheet, and 26% reverse turns and, thus, approximately 80% of the polypeptide chain may display ordered structure. Four glycosylation sites were identified. The two N-glycosidic oligosaccharides were found in the center region (residues 138 and 158), whereas the two O-glycosidic heterosaccharides, both linked to threonine (residues 238 and 252), occur within the carboxyl-terminal region. The N-glycans are linked to Asn residues in beta-turns, while the O-glycans are located in short random segments. Comparison of the sequence of the amino- and carboxyl-terminal 30 residues with protein sequences in a data bank demonstrated that the A-chain is not significantly related to any known proteins. However, the proline-rich carboxyl-terminal region of the A-chain displays some sequence similarity to collagens and the collagen-like domains of complement subcomponent C1q. PMID:3944104

  11. Exome Sequencing in a Family Identifies RECQL5 Mutation Resulting in Early Myocardial Infarction

    PubMed Central

    Xie, Xiang; Zheng, Ying-Ying; Adi, Dilare; Yang, Yi-Ning; Ma, Yi-Tong; Li, Xiao-Mei; Fu, Zhen-Yan; Ma, Xiang; Liu, Fen; Yu, Zi-Xiang; Chen, You; Huang, Ying

    2016-01-01

    Coronary artery disease (CAD) including myocardial infarction (MI) is the leading cause of death worldwide and is commonly caused by the interaction between genetic factors and environmental risks. Despite intensive efforts using linkage and candidate gene approaches, the genetic etiology for the majority of families with a multigenerational early CAD/MI predisposition is unknown. In this study, we used whole-exome sequencing of 10 individuals from 1 early MI family, in which 4 siblings were diagnosed with MI before the age of 55, to identify potential predisposing genes. We identified a mutation in the RECQL5 gene, 1 of the 5 members of the RECQ family which are involved in the maintenance of genomic stability. This novel mutation is a TG insertion at position 73,626,918 on the 13 chromosome that occurs before the last nucleotide of the intron 11 acceptor splice site, affecting splicing of RECQL5. RT-PCR suggested the control subject had a full-length mRNA including exon 12, but the patients with the RECQL5 mutation had a shorter mRNA form involving splicing of exons 11 to 13 directly, with skipping of exon 12. Quantitative RT-PCR analysis of RECQL5 exon 12 demonstrated that individuals homozygous for the mutation had only trace amounts of mRNA containing this exon, and the family members who carry the heterozygous genotype had a level at 48% to 55% of the control's level. These findings provide insight into both the pathogenesis of MI and the role of the RECQL5 gene in human disease. PMID:26844521

  12. Exome Sequencing Identifies PDE4D Mutations as Another Cause of Acrodysostosis

    PubMed Central

    Michot, Caroline; Le Goff, Carine; Goldenberg, Alice; Abhyankar, Avinash; Klein, Céline; Kinning, Esther; Guerrot, Anne-Marie; Flahaut, Philippe; Duncombe, Alice; Baujat, Genevieve; Lyonnet, Stanislas; Thalassinos, Caroline; Nitschke, Patrick; Casanova, Jean-Laurent; Le Merrer, Martine; Munnich, Arnold; Cormier-Daire, Valérie

    2012-01-01

    Acrodysostosis is a rare autosomal-dominant condition characterized by facial dysostosis, severe brachydactyly with cone-shaped epiphyses, and short stature. Moderate intellectual disability and resistance to multiple hormones might also be present. Recently, a recurrent mutation (c.1102C>T [p.Arg368*]) in PRKAR1A has been identified in three individuals with acrodysostosis and resistance to multiple hormones. After studying ten unrelated acrodysostosis cases, we report here de novo PRKAR1A mutations in five out of the ten individuals (we found c.1102C>T [p.Arg368*] in four of the ten and c.1117T>C [p.Tyr373His] in one of the ten). We performed exome sequencing in two of the five remaining individuals and selected phosphodiesterase 4D (PDE4D) as a candidate gene. PDE4D encodes a class IV cyclic AMP (cAMP)-specific phosphodiesterase that regulates cAMP concentration. Exome analysis detected heterozygous PDE4D mutations (c.673C>A [p.Pro225Thr] and c.677T>C [p.Phe226Ser]) in these two individuals. Screening of PDE4D identified heterozygous mutations (c.568T>G [p.Ser190Ala] and c.1759A>C [p.Thr587Pro]) in two additional acrodysostosis cases. These mutations occurred de novo in all four cases. The four individuals with PDE4D mutations shared common clinical features, namely characteristic midface and nasal hypoplasia and moderate intellectual disability. Metabolic screening was normal in three of these four individuals. However, resistance to parathyroid hormone and thyrotropin was consistently observed in the five cases with PRKAR1A mutations. Finally, our study further supports the key role of the cAMP signaling pathway in skeletogenesis. PMID:22464250

  13. Identifying Children With Poor Cochlear Implantation Outcomes Using Massively Parallel Sequencing

    PubMed Central

    Wu, Chen-Chi; Lin, Yin-Hung; Liu, Tien-Chen; Lin, Kai-Nan; Yang, Wei-Shiung; Hsu, Chuan-Jen; Chen, Pei-Lung; Wu, Che-Ming

    2015-01-01

    Cochlear implantation is currently the treatment of choice for children with severe to profound hearing impairment. However, the outcomes with cochlear implants (CIs) vary significantly among recipients. The purpose of the present study is to identify the genetic determinants of poor CI outcomes. Twelve children with poor CI outcomes (the “cases”) and 30 “matched controls” with good CI outcomes were subjected to comprehensive genetic analyses using massively parallel sequencing, which targeted 129 known deafness genes. Audiological features, imaging findings, and auditory/speech performance with CIs were then correlated to the genetic diagnoses. We identified genetic variants which are associated with poor CI outcomes in 7 (58%) of the 12 cases; 4 cases had bi-allelic PCDH15 pathogenic mutations and 3 cases were homozygous for the DFNB59 p.G292R variant. Mutations in the WFS1, GJB3, ESRRB, LRTOMT, MYO3A, and POU3F4 genes were detected in 7 (23%) of the 30 matched controls. The allele frequencies of PCDH15 and DFNB59 variants were significantly higher in the cases than in the matched controls (both P < 0.001). In the 7 CI recipients with PCDH15 or DFNB59 variants, otoacoustic emissions were absent in both ears, and imaging findings were normal in all 7 implanted ears. PCDH15 or DFNB59 variants are associated with poor CI performance, yet children with PCDH15 or DFNB59 variants might show clinical features indistinguishable from those of other typical pediatric CI recipients. Accordingly, genetic examination is indicated in all CI candidates before operation. PMID:26166082

  14. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes.

    PubMed

    Sliva, Anna; Kuang, Zheng; Meluh, Pamela B; Boeke, Jef D

    2016-01-01

    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. Here, in the first ChIP-Seq of Sub1, our data shows that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. Here, we also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating. PMID:26837954

  15. Barcode Sequencing Screen Identifies SUB1 as a Regulator of Yeast Pheromone Inducible Genes

    PubMed Central

    Sliva, Anna; Kuang, Zheng; Meluh, Pamela B.; Boeke, Jef D.

    2016-01-01

    The yeast pheromone response pathway serves as a valuable model of eukaryotic mitogen-activated protein kinase (MAPK) pathways, and transcription of their downstream targets. Here, we describe application of a screening method combining two technologies: fluorescence-activated cell sorting (FACS), and barcode analysis by sequencing (Bar-Seq). Using this screening method, and pFUS1-GFP as a reporter for MAPK pathway activation, we readily identified mutants in known mating pathway components. In this study, we also include a comprehensive analysis of the FUS1 induction properties of known mating pathway mutants by flow cytometry, featuring single cell analysis of each mutant population. We also characterized a new source of false positives resulting from the design of this screen. Additionally, we identified a deletion mutant, sub1Δ, with increased basal expression of pFUS1-GFP. Here, in the first ChIP-Seq of Sub1, our data shows that Sub1 binds to the promoters of about half the genes in the genome (tripling the 991 loci previously reported), including the promoters of several pheromone-inducible genes, some of which show an increase upon pheromone induction. Here, we also present the first RNA-Seq of a sub1Δ mutant; the majority of genes have no change in RNA, but, of the small subset that do, most show decreased expression, consistent with biochemical studies implicating Sub1 as a positive transcriptional regulator. The RNA-Seq data also show that certain pheromone-inducible genes are induced less in the sub1Δ mutant relative to the wild type, supporting a role for Sub1 in regulation of mating pathway genes. The sub1Δ mutant has increased basal levels of a small subset of other genes besides FUS1, including IMD2 and FIG1, a gene encoding an integral membrane protein necessary for efficient mating. PMID:26837954

  16. Exome sequencing identifies PDE4D mutations as another cause of acrodysostosis.

    PubMed

    Michot, Caroline; Le Goff, Carine; Goldenberg, Alice; Abhyankar, Avinash; Klein, Céline; Kinning, Esther; Guerrot, Anne-Marie; Flahaut, Philippe; Duncombe, Alice; Baujat, Genevieve; Lyonnet, Stanislas; Thalassinos, Caroline; Nitschke, Patrick; Casanova, Jean-Laurent; Le Merrer, Martine; Munnich, Arnold; Cormier-Daire, Valérie

    2012-04-01

    Acrodysostosis is a rare autosomal-dominant condition characterized by facial dysostosis, severe brachydactyly with cone-shaped epiphyses, and short stature. Moderate intellectual disability and resistance to multiple hormones might also be present. Recently, a recurrent mutation (c.1102C>T [p.Arg368*]) in PRKAR1A has been identified in three individuals with acrodysostosis and resistance to multiple hormones. After studying ten unrelated acrodysostosis cases, we report here de novo PRKAR1A mutations in five out of the ten individuals (we found c.1102C>T [p.Arg368*] in four of the ten and c.1117T>C [p.Tyr373His] in one of the ten). We performed exome sequencing in two of the five remaining individuals and selected phosphodiesterase 4D (PDE4D) as a candidate gene. PDE4D encodes a class IV cyclic AMP (cAMP)-specific phosphodiesterase that regulates cAMP concentration. Exome analysis detected heterozygous PDE4D mutations (c.673C>A [p.Pro225Thr] and c.677T>C [p.Phe226Ser]) in these two individuals. Screening of PDE4D identified heterozygous mutations (c.568T>G [p.Ser190Ala] and c.1759A>C [p.Thr587Pro]) in two additional acrodysostosis cases. These mutations occurred de novo in all four cases. The four individuals with PDE4D mutations shared common clinical features, namely characteristic midface and nasal hypoplasia and moderate intellectual disability. Metabolic screening was normal in three of these four individuals. However, resistance to parathyroid hormone and thyrotropin was consistently observed in the five cases with PRKAR1A mutations. Finally, our study further supports the key role of the cAMP signaling pathway in skeletogenesis. PMID:22464250

  17. Adaptation and possible ancient interspecies introgression in pigs identified by whole-genome sequencing.

    PubMed

    Ai, Huashui; Fang, Xiaodong; Yang, Bin; Huang, Zhiyong; Chen, Hao; Mao, Likai; Zhang, Feng; Zhang, Lu; Cui, Leilei; He, Weiming; Yang, Jie; Yao, Xiaoming; Zhou, Lisheng; Han, Lijuan; Li, Jing; Sun, Silong; Xie, Xianhua; Lai, Boxian; Su, Ying; Lu, Yao; Yang, Hui; Huang, Tao; Deng, Wenjiang; Nielsen, Rasmus; Ren, Jun; Huang, Lusheng

    2015-03-01

    Domestic pigs have evolved genetic adaptations to their local environmental conditions, such as cold and hot climates. We sequenced the genomes of 69 pigs from 15 geographically divergent locations in China and detected 41 million variants, of which 21 million were absent from the dbSNP database. In a genome-wide scan, we identified a set of loci that likely have a role in regional adaptations to high- and low-latitude environments within China. Intriguingly, we found an exceptionally large (14-Mb) region with a low recombination rate on the X chromosome that appears to have two distinct haplotypes in the high- and low-latitude populations, possibly underlying their adaptation to cold and hot environments, respectively. Surprisingly, the adaptive sweep in the high-latitude regions has acted on DNA that might have been introgressed from an extinct Sus species. Our findings provide new insights into the evolutionary history of pigs and the role of introgression in adaptation. PMID:25621459

  18. Distinct genetic architectures for syndromic and nonsyndromic congenital heart defects identified by exome sequencing.

    PubMed

    Sifrim, Alejandro; Hitz, Marc-Phillip; Wilsdon, Anna; Breckpot, Jeroen; Turki, Saeed H Al; Thienpont, Bernard; McRae, Jeremy; Fitzgerald, Tomas W; Singh, Tarjinder; Swaminathan, Ganesh Jawahar; Prigmore, Elena; Rajan, Diana; Abdul-Khaliq, Hashim; Banka, Siddharth; Bauer, Ulrike M M; Bentham, Jamie; Berger, Felix; Bhattacharya, Shoumo; Bu'Lock, Frances; Canham, Natalie; Colgiu, Irina-Gabriela; Cosgrove, Catherine; Cox, Helen; Daehnert, Ingo; Daly, Allan; Danesh, John; Fryer, Alan; Gewillig, Marc; Hobson, Emma; Hoff, Kirstin; Homfray, Tessa; Kahlert, Anne-Karin; Ketley, Ami; Kramer, Hans-Heiner; Lachlan, Katherine; Lampe, Anne Katrin; Louw, Jacoba J; Manickara, Ashok Kumar; Manase, Dorin; McCarthy, Karen P; Metcalfe, Kay; Moore, Carmel; Newbury-Ecob, Ruth; Omer, Seham Osman; Ouwehand, Willem H; Park, Soo-Mi; Parker, Michael J; Pickardt, Thomas; Pollard, Martin O; Robert, Leema; Roberts, David J; Sambrook, Jennifer; Setchfield, Kerry; Stiller, Brigitte; Thornborough, Chris; Toka, Okan; Watkins, Hugh; Williams, Denise; Wright, Michael; Mital, Seema; Daubeney, Piers E F; Keavney, Bernard; Goodship, Judith; Abu-Sulaiman, Riyadh Mahdi; Klaassen, Sabine; Wright, Caroline F; Firth, Helen V; Barrett, Jeffrey C; Devriendt, Koenraad; FitzPatrick, David R; Brook, J David; Hurles, Matthew E

    2016-09-01

    Congenital heart defects (CHDs) have a neonatal incidence of 0.8-1% (refs. 1,2). Despite abundant examples of monogenic CHD in humans and mice, CHD has a low absolute sibling recurrence risk (∼2.7%), suggesting a considerable role for de novo mutations (DNMs) and/or incomplete penetrance. De novo protein-truncating variants (PTVs) have been shown to be enriched among the 10% of 'syndromic' patients with extra-cardiac manifestations. We exome sequenced 1,891 probands, including both syndromic CHD (S-CHD, n = 610) and nonsyndromic CHD (NS-CHD, n = 1,281). In S-CHD, we confirmed a significant enrichment of de novo PTVs but not inherited PTVs in known CHD-associated genes, consistent with recent findings. Conversely, in NS-CHD we observed significant enrichment of PTVs inherited from unaffected parents in CHD-associated genes. We identified three genome-wide significant S-CHD disorders caused by DNMs in CHD4, CDK13 and PRKD1. Our study finds evidence for distinct genetic architectures underlying the low sibling recurrence risk in S-CHD and NS-CHD. PMID:27479907

  19. Ectrodactyly and Lethal Pulmonary Acinar Dysplasia Associated with Homozygous FGFR2 Mutations Identified by Exome Sequencing.

    PubMed

    Barnett, Christopher P; Nataren, Nathalie J; Klingler-Hoffmann, Manuela; Schwarz, Quenten; Chong, Chan-Eng; Lee, Young K; Bruno, Damien L; Lipsett, Jill; McPhee, Andrew J; Schreiber, Andreas W; Feng, Jinghua; Hahn, Christopher N; Scott, Hamish S

    2016-09-01

    Ectrodactyly/split hand-foot malformation is genetically heterogeneous with more than 100 syndromic associations. Acinar dysplasia is a rare congenital lung lesion of unknown etiology, which is frequently lethal postnatally. To date, there have been no reports of combinations of these two phenotypes. Here, we present an infant from a consanguineous union with both ectrodactyly and autopsy confirmed acinar dysplasia. SNP array and whole-exome sequencing analyses of the affected infant identified a novel homozygous Fibroblast Growth Factor Receptor 2 (FGFR2) missense mutation (p.R255Q) in the IgIII domain (D3). Expression studies of Fgfr2 in development show localization to the affected limbs and organs. Molecular modeling and genetic and functional assays support that this mutation is at least a partial loss-of-function mutation, and contributes to ectrodactyly and acinar dysplasia only in homozygosity, unlike previously reported heterozygous activating FGFR2 mutations that cause Crouzon, Apert, and Pfeiffer syndromes. This is the first report of mutations in a human disease with ectrodactyly with pulmonary acinar dysplasia and, as such, homozygous loss-of-function FGFR2 mutations represent a unique syndrome. PMID:27323706

  20. Whole-exome sequencing identifies ADRA2A mutation in atypical familial partial lipodystrophy

    PubMed Central

    Garg, Abhimanyu; Sankella, Shireesha; Xing, Chao; Agarwal, Anil K.

    2016-01-01

    Despite identification of causal genes for various lipodystrophy syndromes, the molecular basis of some peculiar lipodystrophies remains obscure. In an African-American pedigree with a novel autosomal dominant, atypical familial partial lipodystrophy (FPLD), we performed linkage analysis for candidate regions and whole-exome sequencing to identify the disease-causing mutation. Affected adults reported marked loss of fat from the extremities, with excess fat in the face and neck at age 13–15 years, and developed metabolic complications later. A heterozygous g.112837956C>T mutation on chromosome 10 (c.202C>T, p.Leu68Phe) affecting a highly conserved residue in adrenoceptor α 2A (ADRA2A) was found in all affected subjects but not in unaffected relatives. ADRA2A is the main presynaptic inhibitory feedback G protein–coupled receptor regulating norepinephrine release. Activation of ADRA2A inhibits cAMP production and reduces lipolysis in adipocytes. As compared with overexpression of a wild-type ADRA2A construct in human embryonic kidney–293 cells and differentiated 3T3-L1 adipocytes, the mutant ADRA2A produced more cAMP and glycerol, which were resistant to the effects of the α2-adrenergic receptor agonist clonidine and the α2-adrenergic receptor antagonist yohimbine, suggesting loss of function. We conclude that heterozygous p.Leu68Phe ADRA2A mutation causes a rare atypical FPLD, most likely by inducing excessive lipolysis in some adipose tissue depots. PMID:27376152

  1. Identifying Highly Penetrant Disease Causal Mutations Using Next Generation Sequencing: Guide to Whole Process

    PubMed Central

    Erzurumluoglu, A. Mesut; Shihab, Hashem A.; Baird, Denis; Richardson, Tom G.; Day, Ian N. M.; Gaunt, Tom R.

    2015-01-01

    Recent technological advances have created challenges for geneticists and a need to adapt to a wide range of new bioinformatics tools and an expanding wealth of publicly available data (e.g., mutation databases and software). This wide range of methods and the diversity of file formats used in sequence analysis is a significant issue, with a considerable amount of time spent before anyone can even attempt to analyse the genetic basis of human disorders. Another point to consider is that although many possess “just enough” knowledge to analyse their data, they do not make full use of the tools and databases that are available and also do not fully understand how their data was created. The primary aim of this review is to document some of the key approaches and provide an analysis schema to make the analysis process more efficient and reliable in the context of discovering highly penetrant causal mutations/genes. This review will also compare the methods used to identify highly penetrant variants when data is obtained from consanguineous individuals as opposed to nonconsanguineous; and when Mendelian disorders are analysed as opposed to common-complex disorders. PMID:26106619

  2. A targeted next-generation sequencing method for identifying clinically relevant mutation profiles in lung adenocarcinoma

    PubMed Central

    Shao, Di; Lin, Yongping; Liu, Jilong; Wan, Liang; Liu, Zu; Cheng, Shaomin; Fei, Lingna; Deng, Rongqing; Wang, Jian; Chen, Xi; Liu, Liping; Gu, Xia; Liang, Wenhua; He, Ping; Wang, Jun; Ye, Mingzhi; He, Jianxing

    2016-01-01

    Molecular profiling of lung cancer has become essential for prediction of an individual’s response to targeted therapies. Next-generation sequencing (NGS) is a promising technique for routine diagnostics, but has not been sufficiently evaluated in terms of feasibility, reliability, cost and capacity with routine diagnostic formalin-fixed, paraffin-embedded (FFPE) materials. Here, we report the validation and application of a test based on Ion Proton technology for the rapid characterisation of single nucleotide variations (SNVs), short insertions and deletions (InDels), copy number variations (CNVs), and gene rearrangements in 145 genes with FFPE clinical specimens. The validation study, using 61 previously profiled clinical tumour samples, showed a concordance rate of 100% between results obtained by NGS and conventional test platforms. Analysis of tumour cell lines indicated reliable mutation detection in samples with 5% tumour content. Furthermore, application of the panel to 58 clinical cases identified at least one actionable mutation in 43 cases, 1.4 times the number of actionable alterations detected by current diagnostic tests. We demonstrated that targeted NGS is a cost-effective and rapid platform to detect multiple mutations simultaneously in various genes with high reproducibility and sensitivity. PMID:26936516

  3. De novo transcriptome sequencing of Momordica cochinchinensis to identify genes involved in the carotenoid biosynthesis.

    PubMed

    Hyun, Tae Kyung; Rim, Yeonggil; Jang, Hui-Jeong; Kim, Cheol Hong; Park, Jongsun; Kumar, Ritesh; Lee, Sunghoon; Kim, Byung Chul; Bhak, Jong; Nguyen-Quoc, Binh; Kim, Seon-Won; Lee, Sang Yeol; Kim, Jae-Yean

    2012-07-01

    The ripe fruit of Momordica cochinchinensis Spreng, known as gac, is characterized by very high carotenoid content. Although this plant might be a good resource for carotenoid metabolic engineering, so far, the genes involved in the carotenoid metabolic pathways in gac were unidentified due to lack of genomic information in the public database. In order to expedite the process of gene discovery, we have undertaken Illumina deep sequencing of mRNA prepared from aril of gac fruit. From 51,446,670 high-quality reads, we obtained 81,404 assembled unigenes with average length of 388 base pairs. At the protein level, gac aril transcripts showed about 81.5% similarity with cucumber proteomes. In addition, 17,104 unigenes have been assigned to specific metabolic pathways in Kyoto Encyclopedia of Genes and Genomes, and all known enzymes involved in the terpenoid backbone biosynthetic and carotenoid biosynthetic pathways were also identified in our library. To analyze the relationship between putative carotenoid biosynthesis genes and alteration of carotenoid content during fruit ripening, digital gene expression analysis was performed on three different ripening stages of aril. This study has revealed that putative phytoene synthase, 15-cis-phytoene desaturase, zeta-carotene desaturase, carotenoid isomerase and lycopene epsilon cyclase might be key factors for controlling carotenoid contents during aril ripening. Taken together, this study has also made a large gene database available. This unique information for gac gene discovery would be helpful to facilitate functional studies for improving carotenoid quantities. PMID:22580955

  4. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  5. Antimicrobial susceptibility among clinical Nocardia species identified by multilocus sequence analysis.

    PubMed

    McTaggart, Lisa R; Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E

    2015-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247(T), and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. PMID:25348540

  6. Antimicrobial Susceptibility among Clinical Nocardia Species Identified by Multilocus Sequence Analysis

    PubMed Central

    Doucet, Jennifer; Witkowska, Maria; Richardson, Susan E.

    2014-01-01

    Antimicrobial susceptibility patterns of 112 clinical isolates, 28 type strains, and 9 reference strains of Nocardia were determined using the Sensititre Rapmyco microdilution panel (Thermo Fisher, Inc.). Isolates were identified by highly discriminatory multilocus sequence analysis and were chosen to represent the diversity of species recovered from clinical specimens in Ontario, Canada. Susceptibility to the most commonly used drug, trimethoprim-sulfamethoxazole, was observed in 97% of isolates. Linezolid and amikacin were also highly effective; 100% and 99% of all isolates demonstrated a susceptible phenotype. For the remaining antimicrobials, resistance was species specific with isolates of Nocardia otitidiscaviarum, N. brasiliensis, N. abscessus complex, N. nova complex, N. transvalensis complex, N. farcinica, and N. cyriacigeorgica displaying the traditional characteristic drug pattern types. In addition, the antimicrobial susceptibility profiles of a variety of rarely encountered species isolated from clinical specimens are reported for the first time and were categorized into four additional drug pattern types. Finally, MICs for the control strains N. nova ATCC BAA-2227, N. asteroides ATCC 19247T, and N. farcinica ATCC 23826 were robustly determined to demonstrate method reproducibility and suitability of the commercial Sensititre Rapmyco panel for antimicrobial susceptibility testing of Nocardia spp. isolated from clinical specimens. The reported values will facilitate quality control and standardization among laboratories. PMID:25348540

  7. An approach to identify the novel miRNA encoded from H. annuus EST sequences.

    PubMed

    Gupta, Hemant; Tiwari, Tanushree; Patel, Maulik; Mehta, Aditya; Ghosh, Arpita

    2015-12-01

    MicroRNAs are a newly discovered class of non-protein-coding small RNAs of 22-24 nucleotides. They play multiple roles in biological processes including development, cell proliferation, apoptosis, stress responses and many other cell functions. In this research, several approaches were combined to make a computational prediction of potential miRNAs and their targets in Helianthus annuus (H. annuus). The plant miRNAs already available in miRBase v21 were searched against expressed sequence tags (ESTs). A total of three miRNAs were detected, of which one potential novel miRNA was identified following a range of strict filtering criteria. Target prediction was carried out for these three miRNAs, which have various targets. These targets were functionally annotated and GO terms were assigned. To study the conserved nature of the predicted miRNAs, phylogenetic analysis was carried out. These findings provide a broader picture for understanding miRNA functions in H. annuus. PMID:26697356

  8. Whole Exome Sequencing Identifies Mutations in Usher Syndrome Genes in Profoundly Deaf Tunisian Patients

    PubMed Central

    Riahi, Zied; Bonnet, Crystel; Zainine, Rim; Lahbib, Saida; Bouyacoub, Yosra; Bechraoui, Rym; Marrakchi, Jihène; Hardelin, Jean-Pierre; Louha, Malek; Largueche, Leila; Ben Yahia, Salim; Kheirallah, Moncef; Elmatri, Leila; Besbes, Ghazi; Abdelhak, Sonia; Petit, Christine

    2015-01-01

    Usher syndrome (USH) is an autosomal recessive disorder characterized by combined deafness-blindness. It accounts for about 50% of all hereditary deafness blindness cases. Three clinical subtypes (USH1, USH2, and USH3) are described, of which USH1 is the most severe form, characterized by congenital profound deafness, constant vestibular dysfunction, and a prepubertal onset of retinitis pigmentosa. We performed whole exome sequencing in four unrelated Tunisian patients affected by apparently isolated, congenital profound deafness, with reportedly normal ocular fundus examination. Four biallelic mutations were identified in two USH1 genes: a splice acceptor site mutation, c.2283-1G>T, and a novel missense mutation, c.5434G>A (p.Glu1812Lys), in MYO7A, and two previously unreported mutations in USH1G, i.e. a frameshift mutation, c.1195_1196delAG (p.Leu399Alafs*24), and a nonsense mutation, c.52A>T (p.Lys18*). Another ophthalmological examination including optical coherence tomography actually showed the presence of retinitis pigmentosa in all the patients. Our findings provide evidence that USH is under-diagnosed in Tunisian deaf patients. Yet, early diagnosis of USH is of utmost importance because these patients should undergo cochlear implant surgery in early childhood, in anticipation of the visual loss. PMID:25798947

  9. Whole-genome sequencing identifies genetic alterations in pediatric low-grade gliomas.

    PubMed

    Zhang, Jinghui; Wu, Gang; Miller, Claudia P; Tatevossian, Ruth G; Dalton, James D; Tang, Bo; Orisme, Wilda; Punchihewa, Chandanamali; Parker, Matthew; Qaddoumi, Ibrahim; Boop, Fredrick A; Lu, Charles; Kandoth, Cyriac; Ding, Li; Lee, Ryan; Huether, Robert; Chen, Xiang; Hedlund, Erin; Nagahawatte, Panduka; Rusch, Michael; Boggs, Kristy; Cheng, Jinjun; Becksfort, Jared; Ma, Jing; Song, Guangchun; Li, Yongjin; Wei, Lei; Wang, Jianmin; Shurtleff, Sheila; Easton, John; Zhao, David; Fulton, Robert S; Fulton, Lucinda L; Dooling, David J; Vadodaria, Bhavin; Mulder, Heather L; Tang, Chunlao; Ochoa, Kerri; Mullighan, Charles G; Gajjar, Amar; Kriwacki, Richard; Sheer, Denise; Gilbertson, Richard J; Mardis, Elaine R; Wilson, Richard K; Downing, James R; Baker, Suzanne J; Ellison, David W

    2013-06-01

    The most common pediatric brain tumors are low-grade gliomas (LGGs). We used whole-genome sequencing to identify multiple new genetic alterations involving BRAF, RAF1, FGFR1, MYB, MYBL1 and genes with histone-related functions, including H3F3A and ATRX, in 39 LGGs and low-grade glioneuronal tumors (LGGNTs). Only a single non-silent somatic alteration was detected in 24 of 39 (62%) tumors. Intragenic duplications of the portion of FGFR1 encoding the tyrosine kinase domain (TKD) and rearrangements of MYB were recurrent and mutually exclusive in 53% of grade II diffuse LGGs. Transplantation of Trp53-null neonatal astrocytes expressing FGFR1 with the duplication involving the TKD into the brains of nude mice generated high-grade astrocytomas with short latency and 100% penetrance. FGFR1 with the duplication induced FGFR1 autophosphorylation and upregulation of the MAPK/ERK and PI3K pathways, which could be blocked by specific inhibitors. Focusing on the therapeutically challenging diffuse LGGs, our study of 151 tumors has discovered genetic alterations and potential therapeutic targets across the entire range of pediatric LGGs and LGGNTs. PMID:23583981

  10. Whole exome sequencing identifies a recurrent RQCD1 P131L mutation in cutaneous melanoma

    PubMed Central

    Wong, Stephen Q.; Behren, Andreas; Mar, Victoria J.; Woods, Katherine; Li, Jason; Martin, Claire; Sheppard, Karen E.; Wolfe, Rory; Kelly, John; Cebon, Jonathan; Dobrovic, Alexander; McArthur, Grant A.

    2015-01-01

    Melanoma is often caused by mutations due to exposure to ultraviolet radiation. This study reports a recurrent somatic C > T change causing a P131L mutation in the RQCD1 (Required for Cell Differentiation1 Homolog) gene identified through whole exome sequencing of 20 metastatic melanomas. Screening in 715 additional primary melanomas revealed a prevalence of ~4%. This represents the first reported recurrent mutation in a member of the CCR4-NOT complex in cancer. Compared to tumors without the mutation, the P131L mutant positive tumors were associated with increased thickness (p = 0.02), head and neck (p = 0.009) and upper limb (p = 0.03) location, lentigo maligna melanoma subtype (p = 0.02) and BRAF V600K (p = 0.04) but not V600E or NRAS codon 61 mutations. There was no association with nodal disease (p = 0.3). Mutually exclusive mutations of other members of the CCR4-NOT complex were found in ~20% of the TCGA melanoma dataset suggesting the complex may play an important role in melanoma biology. Mutant RQCD1 was predicted to bind strongly to HLA-A0201 and HLA-Cw3 MHC1 complexes. From thirteen patients with mutant RQCD1, an anti-tumor CD8+ T cell response was observed from a single patient's peripheral blood mononuclear cell population stimulated with mutated peptide compared to wildtype indicating a neoantigen may be formed. PMID:25544760

  11. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)
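
    The kind of exercise this record describes can be sketched in a few lines of Python. This is only an illustrative sketch, not the program from the record: the function names (random_peptide, composition, check_guess), the restriction to the 20 standard amino acids, and the choice of overall composition as the "analytic data" given to the student are all assumptions made here.

```python
import random

# One-letter codes of the 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, seed=None):
    """Return a random amino acid string of the given length (hypothetical helper)."""
    rng = random.Random(seed)  # a seed makes the exercise reproducible
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def composition(peptide):
    """Count each residue, mimicking the result of a total hydrolysis analysis."""
    counts = {}
    for residue in peptide:
        counts[residue] = counts.get(residue, 0) + 1
    return counts


def check_guess(peptide, guess):
    """Return True if the student's proposed sequence matches the hidden peptide."""
    return peptide == guess.strip().upper()


if __name__ == "__main__":
    hidden = random_peptide(8, seed=42)
    print("Composition:", composition(hidden))
    print("N-terminal residue:", hidden[0])
    print("Correct guess?", check_guess(hidden, input("Proposed sequence: ")))
```

    A student would be shown the composition and the N-terminal residue, propose a full sequence, and have it checked against the hidden peptide; further clues (for example, fragments from a simulated cleavage) could be added in the same style.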

  12. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections.

    PubMed

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T; Rosenqvist Lund, Birthe S; Ameh, James A; Ambali, Abdul G; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M; Hendriksen, Rene S

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  13. Investigating Salmonella Eko from Various Sources in Nigeria by Whole Genome Sequencing to Identify the Source of Human Infections

    PubMed Central

    Leekitcharoenphon, Pimlapas; Raufu, Ibrahim; Nielsen, Mette T.; Rosenqvist Lund, Birthe S.; Ameh, James A.; Ambali, Abdul G.; Sørensen, Gitte; Le Hello, Simon; Aarestrup, Frank M.; Hendriksen, Rene S.

    2016-01-01

    Twenty-six Salmonella enterica serovar Eko isolated from various sources in Nigeria were investigated by whole genome sequencing to identify the source of human infections. Diversity among the isolates was observed and camel and cattle were identified as the primary reservoirs and the most likely source of the human infections. PMID:27228329

  14. Novel Pathogenic Variant (c.3178G>A) in the SMC1A Gene in a Family With Cornelia de Lange Syndrome Identified by Exome Sequencing

    PubMed Central

    Jang, Mi-Ae; Lee, Chang-Woo

    2015-01-01

    Cornelia de Lange syndrome (CdLS) is a clinically and genetically heterogeneous congenital anomaly. Mutations in the NIPBL gene account for half of the affected individuals. We describe a family with CdLS carrying a novel pathogenic variant of the SMC1A gene identified by exome sequencing. The proband was a 3-yr-old boy presenting with a developmental delay. He had distinctive facial features without major structural anomalies and tested negative for the NIPBL gene. His younger sister, mother, and maternal grandmother presented with mild mental retardation. By exome sequencing of the proband, a novel SMC1A variant, c.3178G>A, was identified, which was expected to cause an amino acid substitution (p.Glu1060Lys) in the highly conserved coiled-coil domain of the SMC1A protein. Sanger sequencing confirmed that the three female relatives with mental retardation also carry this variant. Our results reveal that SMC1A gene defects are associated with milder phenotypes of CdLS. Furthermore, we showed that exome sequencing could be a useful tool to identify pathogenic variants in patients with CdLS. PMID:26354354

  15. Novel pathogenic variant (c.3178G>A) in the SMC1A gene in a family with Cornelia de Lange syndrome identified by exome sequencing.

    PubMed

    Jang, Mi Ae; Lee, Chang Woo; Kim, Jin Kyung; Ki, Chang Seok

    2015-11-01

    Cornelia de Lange syndrome (CdLS) is a clinically and genetically heterogeneous congenital anomaly. Mutations in the NIPBL gene account for half of the affected individuals. We describe a family with CdLS carrying a novel pathogenic variant of the SMC1A gene identified by exome sequencing. The proband was a 3-yr-old boy presenting with a developmental delay. He had distinctive facial features without major structural anomalies and tested negative for the NIPBL gene. His younger sister, mother, and maternal grandmother presented with mild mental retardation. By exome sequencing of the proband, a novel SMC1A variant, c.3178G>A, was identified, which was expected to cause an amino acid substitution (p.Glu1060Lys) in the highly conserved coiled-coil domain of the SMC1A protein. Sanger sequencing confirmed that the three female relatives with mental retardation also carry this variant. Our results reveal that SMC1A gene defects are associated with milder phenotypes of CdLS. Furthermore, we showed that exome sequencing could be a useful tool to identify pathogenic variants in patients with CdLS. PMID:26354354

  16. Mass spectrometric detection of the amino acid sequence polymorphism of the hepatitis C virus antigen.

    PubMed

    Kaysheva, A L; Ivanov, Yu D; Frantsuzov, P A; Krohin, N V; Pavlova, T I; Uchaikin, V F; Konev, V A; Kovalev, O B; Ziborov, V S; Archakov, A I

    2016-03-01

    A method for detection and identification of the hepatitis C virus antigen (HCVcoreAg) in human serum with consideration for possible amino acid substitutions is proposed. The method is based on a combination of biospecific capturing and concentrating of the target protein on the surface of the chip for atomic force microscope (AFM chip) with subsequent protein identification by tandem mass spectrometric (MS/MS) analysis. Biospecific AFM-capturing of viral particles containing HCVcoreAg from serum samples was performed using AFM chips with monoclonal antibodies (anti-HCVcore) covalently immobilized on the surface. Biospecific complexes were registered and counted by AFM. Further MS/MS analysis allowed reliable identification of the HCVcoreAg in the complexes formed on the AFM chip surface. Analysis of MS/MS spectra, taking into account the possible polymorphisms in the amino acid sequence of the HCVcoreAg, enabled us to increase the number of identified peptides. PMID:26773170

  17. De Novo Transcriptome Analysis of Warburgia ugandensis to Identify Genes Involved in Terpenoids and Unsaturated Fatty Acids Biosynthesis

    PubMed Central

    Wang, Xin; Zhou, Chen; Yang, Xianpeng; Miao, Di; Zhang, Yansheng

    2015-01-01

    The bark of Warburgia ugandensis (Canellaceae family) has long been used as a medicinal source in many African countries. The presence of diverse terpenoids and abundant polyunsaturated fatty acids (PUFAs) in this organ contributes to its broad range of pharmacological properties. Despite its medicinal and economic importance, the biosynthesis of terpenoids and unsaturated fatty acids in W. ugandensis bark remains largely unknown. Therefore, it is necessary to construct a genomic and/or transcriptomic database for the functional genomics study on W. ugandensis. The chemical profiles of terpenoids and fatty acids between the bark and leaves of W. ugandensis were compared by gas chromatography-mass spectrometry (GC-MS) analysis. Meanwhile, the transcriptome database derived from both tissues was created using Illumina sequencing technology. In total, about 17.1 G clean nucleotides were obtained, and de novo assembled into 72,591 unigenes, of which about 38.06% can be aligned to the NCBI non-redundant protein database. Many candidate genes in the biosynthetic pathways of terpenoids and unsaturated fatty acids were identified, including 14 unigenes for terpene synthases. Furthermore, 2,324 unigenes were discovered to be differentially expressed between both tissues; the functions of those differentially expressed genes (DEGs) were predicted by gene ontology enrichment and metabolic pathway enrichment analyses. In addition, the expression of 12 DEGs with putative roles in terpenoid and unsaturated fatty acid metabolic pathways was confirmed by qRT-PCRs, which was consistent with the data of the RNA-sequencing. In conclusion, we constructed a comprehensive transcriptome dataset derived from the bark and leaf of W. ugandensis, which forms the basis for functional genomics studies on this plant species. Particularly, the comparative analysis of the transcriptome data between the bark and leaf will provide critical clues to reveal the regulatory

  18. Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets.

    PubMed

    Melo, Francisco; Marti-Renom, Marc A

    2006-06-01

    Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may allow increasing the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs. PMID:16506243
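
    As a rough illustration of what a reduced alphabet does, the sketch below maps each of the 20 standard residues to one representative letter per group. The six-group partition is a hypothetical example chosen for readability; it is not one of the alphabets evaluated in the cited work, and the function names are assumptions.

```python
# Hypothetical six-group partition of the 20 standard amino acids.
# The first letter of each group serves as its representative.
GROUPS = ["AVLIMC", "FWY", "STNQ", "KRH", "DE", "GP"]

# Lookup table from each residue to its group representative.
REDUCED = {residue: group[0] for group in GROUPS for residue in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet."""
    return "".join(REDUCED[residue] for residue in seq)


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


seq1, seq2 = "LKDE", "IRED"
print(identity(seq1, seq2))
print(identity(reduce_sequence(seq1), reduce_sequence(seq2)))
```

    Two sequences that share no identical residue in the 20-letter alphabet can become identical after reduction, which is the kind of grouping effect the abstract refers to when it discusses discriminative power.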

  19. Identifying and assessing the impact of wine acid-related genes in yeast.

    PubMed

    Chidi, Boredi S; Rossouw, Debra; Bauer, Florian F

    2016-02-01

    Saccharomyces cerevisiae strains used for winemaking show a wide range of fermentation phenotypes, and the genetic background of individual strains contributes significantly to the organoleptic properties of wine. This strain-dependent impact extends to the organic acid composition of the wine, an important quality parameter. However, little is known about the genes which may impact on organic acids during grape must fermentation. To generate novel insights into the genetic regulation of this metabolic network, a subset of genes was identified based on a comparative analysis of the transcriptomes and organic acid profiles of different yeast strains showing different production levels of organic acids. These genes showed significant inter-strain differences in their transcription levels at one or more stages of fermentation and were also considered likely to influence organic acid metabolism based on existing functional annotations. Genes selected in this manner were ADH3, AAD6, SER33, ICL1, GLY1, SFC1, SER1, KGD1, AGX1, OSM1 and GPD2. Yeast strains carrying deletions for these genes were used to conduct fermentations and determine organic acid levels at various stages of alcoholic fermentation in synthetic grape must. The impact of these deletions on organic acid profiles was quantified, leading to novel insights and hypothesis generation regarding the role/s of these genes in wine yeast acid metabolism under fermentative conditions. Overall, the data contribute to our understanding of the roles of selected genes in yeast metabolism in general and of organic acid metabolism in particular. PMID:26040556

  20. Characterization of mouse cellular deoxyribonucleic acid homologous to Abelson murine leukemia virus-specific sequences.

    PubMed Central

    Dale, B; Ozanne, B

    1981-01-01

    The genome of Abelson murine leukemia virus (A-MuLV) consists of sequences derived from both BALB/c mouse deoxyribonucleic acid and the genome of Moloney murine leukemia virus. Using deoxyribonucleic acid linear intermediates as a source of retroviral deoxyribonucleic acid, we isolated a recombinant plasmid which contained 1.9 kilobases of the 3.5-kilobase mouse-derived sequences found in A-MuLV (A-MuLV-specific sequences). We used this clone, designated pSA-17, as a probe in restriction enzyme and Southern blot analyses to examine the arrangement of homologous sequences in BALB/c deoxyribonucleic acid (endogenous Abelson sequences). The endogenous Abelson sequences within the mouse genome were interrupted by noncoding regions, suggesting that a rearrangement of the cell sequences was required to produce the sequence found in the virus. Endogenous Abelson sequences were arranged similarly in mice that were susceptible to A-MuLV tumors and in mice that were resistant to A-MuLV tumors. An examination of three BALB/c plasmacytomas and a BALB/c early B-cell tumor likewise revealed no alteration in the arrangement of the endogenous Abelson sequences. Homology to pSA-17 was also observed in deoxyribonucleic acids prepared from rat, hamster, chicken, and human cells. An isolate of A-MuLV which encoded a 160,000-dalton transforming protein (P160) contained 700 more base pairs of mouse sequences than the standard A-MuLV isolate, which encoded a 120,000-dalton transforming protein (P120). PMID:9279386

  1. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference of binding free energy change (-0.4 kcal/mol) at subsites A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434

  2. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 +/- 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722

  3. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  4. SEQUENCING OF CUCUMBER (CUCUMIS SATIVUS L.) CHLOROPLAST GENOMES IDENTIFIES PUTATIVE CANDIDATE GENES FOR CHILLING TOLERANCE

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Chilling injury in cucumber (Cucumis sativus L.) is conditioned by maternal factors and the sequencing of its chloroplast (cp) genome could lead to the identification of economically important candidate genes. Complete sequencing of cucumber cpDNA was facilitated by the development of 414 consensus...

  5. A Rare Coincidence of Sitosterolemia and Familial Mediterranean Fever Identified by Whole Exome Sequencing.

    PubMed

    Tada, Hayato; Kawashiri, Masa-Aki; Okada, Hirofumi; Endo, Saori; Toyoshima, Yuka; Konno, Tetsuo; Nohara, Atsushi; Inazu, Akihiro; Takao, Akira; Mabuchi, Hiroshi; Yamagishi, Masakazu; Hayashi, Kenshi

    2016-07-01

    Whole exome sequencing (WES) technologies have accelerated genetic studies of Mendelian disorders, yielding approximately 30% diagnostic success. We encountered a 13-year-old Japanese female initially diagnosed with familial hypercholesterolemia on the basis of clinical manifestations of severe hypercholesterolemia (initial LDL cholesterol=609 mg/dl at the age of one) and systemic intertriginous xanthomas with histories of recurrent self-limiting episodes of fever and arthritis. Both her phenotypes seemed to co-segregate in a recessive manner. We performed WES on this patient, who was considered a proband. Among 206,430 variants found in this individual, we found 18,220 nonsense, missense, or splice site variants, of which 3,087 were rare (minor allele frequency ≤ 0.01 or not reported) in 1000 Genome (Asian population). Filtering by assuming a recessive pattern of inheritance with the use of an in silico annotation prediction tool, we successfully narrowed down the candidates to the compound heterozygous mutations in the ABCG5 gene (c.1256G>A or p.Arg419His/c.1763-1G>A [splice acceptor site]) and to the double-compound heterozygous mutations in the MEFV gene (c.329T>C/C or p.Leu110Pro/c.442G>C/C or p.Glu148Val). The patient was genetically diagnosed with sitosterolemia and familial Mediterranean fever using WES for the first time. Such a comprehensive approach is useful for identifying causative mutations for multiple unrelated inheritable diseases. PMID:27170062

  6. Exome Sequencing of Cell-Free DNA from Metastatic Cancer Patients Identifies Clinically Actionable Mutations Distinct from Primary Disease.

    PubMed

    Butler, Timothy M; Johnson-Camacho, Katherine; Peto, Myron; Wang, Nicholas J; Macey, Tara A; Korkola, James E; Koppie, Theresa M; Corless, Christopher L; Gray, Joe W; Spellman, Paul T

    2015-01-01

    The identification of the molecular drivers of cancer by sequencing is the backbone of precision medicine and the basis of personalized therapy; however, biopsies of primary tumors provide only a snapshot of the evolution of the disease and may miss potential therapeutic targets, especially in the metastatic setting. A liquid biopsy, in the form of cell-free DNA (cfDNA) sequencing, has the potential to capture the inter- and intra-tumoral heterogeneity present in metastatic disease, and, through serial blood draws, track the evolution of the tumor genome. In order to determine the clinical utility of cfDNA sequencing we performed whole-exome sequencing on cfDNA and tumor DNA from two patients with metastatic disease; only minor modifications to our sequencing and analysis pipelines were required for sequencing and mutation calling of cfDNA. The first patient had metastatic sarcoma and 47 of 48 mutations present in the primary tumor were also found in the cell-free DNA. The second patient had metastatic breast cancer and sequencing identified an ESR1 mutation in the cfDNA and metastatic site, but not in the primary tumor. This likely explains tumor progression on Anastrozole. Significant heterogeneity between the primary and metastatic tumors, with cfDNA reflecting the metastases, suggested separation from the primary lesion early in tumor evolution. This is best illustrated by an activating PIK3CA mutation (H1047R) which was clonal in the primary tumor, but completely absent from either the metastasis or cfDNA. Here we show that cfDNA sequencing supplies clinically actionable information with minimal risks compared to metastatic biopsies. This study demonstrates the utility of whole-exome sequencing of cell-free DNA from patients with metastatic disease. cfDNA sequencing identified an ESR1 mutation, potentially explaining a patient's resistance to aromatase inhibition, and gave insight into how metastatic lesions differ from the primary tumor. PMID:26317216

  7. Exome Sequencing of Cell-Free DNA from Metastatic Cancer Patients Identifies Clinically Actionable Mutations Distinct from Primary Disease

    PubMed Central

    Butler, Timothy M.; Johnson-Camacho, Katherine; Peto, Myron; Wang, Nicholas J.; Macey, Tara A.; Korkola, James E.; Koppie, Theresa M.; Corless, Christopher L.; Gray, Joe W.; Spellman, Paul T.

    2015-01-01

    The identification of the molecular drivers of cancer by sequencing is the backbone of precision medicine and the basis of personalized therapy; however, biopsies of primary tumors provide only a snapshot of the evolution of the disease and may miss potential therapeutic targets, especially in the metastatic setting. A liquid biopsy, in the form of cell-free DNA (cfDNA) sequencing, has the potential to capture the inter- and intra-tumoral heterogeneity present in metastatic disease, and, through serial blood draws, track the evolution of the tumor genome. In order to determine the clinical utility of cfDNA sequencing we performed whole-exome sequencing on cfDNA and tumor DNA from two patients with metastatic disease; only minor modifications to our sequencing and analysis pipelines were required for sequencing and mutation calling of cfDNA. The first patient had metastatic sarcoma and 47 of 48 mutations present in the primary tumor were also found in the cell-free DNA. The second patient had metastatic breast cancer and sequencing identified an ESR1 mutation in the cfDNA and metastatic site, but not in the primary tumor. This likely explains tumor progression on Anastrozole. Significant heterogeneity between the primary and metastatic tumors, with cfDNA reflecting the metastases, suggested separation from the primary lesion early in tumor evolution. This is best illustrated by an activating PIK3CA mutation (H1047R) which was clonal in the primary tumor, but completely absent from either the metastasis or cfDNA. Here we show that cfDNA sequencing supplies clinically actionable information with minimal risks compared to metastatic biopsies. This study demonstrates the utility of whole-exome sequencing of cell-free DNA from patients with metastatic disease. cfDNA sequencing identified an ESR1 mutation, potentially explaining a patient’s resistance to aromatase inhibition, and gave insight into how metastatic lesions differ from the primary tumor. PMID:26317216

  8. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome.

    PubMed

    Pinto, Ameet J; Sharp, Jonathan O; Yoder, Michael J; Almstrand, Robert

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  9. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  10. Whole-Genome Sequencing Identifies Emergence of a Quinolone Resistance Mutation in a Case of Stenotrophomonas maltophilia Bacteremia

    PubMed Central

    Pak, Theodore R.; Altman, Deena R.; Attie, Oliver; Sebra, Robert; Hamula, Camille L.; Lewis, Martha; Deikus, Gintaras; Newman, Leah C.; Fang, Gang; Hand, Jonathan; Patel, Gopi; Wallach, Fran; Schadt, Eric E.; Huprikar, Shirish; van Bakel, Harm; Bashir, Ali

    2015-01-01

    Whole-genome sequences for Stenotrophomonas maltophilia serial isolates from a bacteremic patient before and after development of levofloxacin resistance were assembled de novo and differed by one single-nucleotide variant in smeT, a repressor for multidrug efflux operon smeDEF. Along with sequenced isolates from five contemporaneous cases, they displayed considerable diversity compared against all published complete genomes. Whole-genome sequencing and complete assembly can conclusively identify resistance mechanisms emerging in S. maltophilia strains during clinical therapy. PMID:26324280

  11. Whole-genome sequencing identifies emergence of a quinolone resistance mutation in a case of Stenotrophomonas maltophilia bacteremia.

    PubMed

    Pak, Theodore R; Altman, Deena R; Attie, Oliver; Sebra, Robert; Hamula, Camille L; Lewis, Martha; Deikus, Gintaras; Newman, Leah C; Fang, Gang; Hand, Jonathan; Patel, Gopi; Wallach, Fran; Schadt, Eric E; Huprikar, Shirish; van Bakel, Harm; Kasarskis, Andrew; Bashir, Ali

    2015-11-01

    Whole-genome sequences for Stenotrophomonas maltophilia serial isolates from a bacteremic patient before and after development of levofloxacin resistance were assembled de novo and differed by one single-nucleotide variant in smeT, a repressor for multidrug efflux operon smeDEF. Along with sequenced isolates from five contemporaneous cases, they displayed considerable diversity compared against all published complete genomes. Whole-genome sequencing and complete assembly can conclusively identify resistance mechanisms emerging in S. maltophilia strains during clinical therapy. PMID:26324280

  12. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    SciTech Connect

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; Giorgi, Elena; Bhattacharya, Tanmoy; Gnanakaran, S.; Lapedes, Alan S.; Learn, Gerald H.; Kreider, Edward F.; Li, Yingying; Shaw, George M.; Hahn, Beatrice H.; Montefiori, David C.; Alam, S. Munir; Bonsignori, Mattia; Moody, M. Anthony; Liao, Hua-Xin; Gao, Feng; Haynes, Barton

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  13. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) Identifies Immune-Selected HIV Variants

    PubMed Central

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; Giorgi, Elena E.; Bhattacharya, Tanmoy; Gnanakaran, S.; Lapedes, Alan S.; Learn, Gerald H.; Kreider, Edward F.; Li, Yingying; Shaw, George M.; Hahn, Beatrice H.; Montefiori, David C.; Alam, S. Munir; Bonsignori, Mattia; Moody, M. Anthony; Liao, Hua-Xin; Gao, Feng; Haynes, Barton F.

    2015-01-01

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. With well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines. PMID:26506369

  14. Longitudinal Antigenic Sequences and Sites from Intra-Host Evolution (LASSIE) identifies immune-selected HIV variants

    DOE PAGESBeta

    Hraber, Peter; Korber, Bette; Wagh, Kshitij; Giorgi, Elena; Bhattacharya, Tanmoy; Gnanakaran, S.; Lapedes, Alan S.; Learn, Gerald H.; Kreider, Edward F.; Li, Yingying; et al

    2015-10-21

    Within-host genetic sequencing from samples collected over time provides a dynamic view of how viruses evade host immunity. Immune-driven mutations might stimulate neutralization breadth by selecting antibodies adapted to cycles of immune escape that generate within-subject epitope diversity. Comprehensive identification of immune-escape mutations is experimentally and computationally challenging. With current technology, many more viral sequences can readily be obtained than can be tested for binding and neutralization, making down-selection necessary. Typically, this is done manually, by picking variants that represent different time-points and branches on a phylogenetic tree. Such strategies are likely to miss many relevant mutations and combinations of mutations, and to be redundant for other mutations. Longitudinal Antigenic Sequences and Sites from Intrahost Evolution (LASSIE) uses transmitted founder loss to identify virus “hot-spots” under putative immune selection and chooses sequences that represent recurrent mutations in selected sites. LASSIE favors earliest sequences in which mutations arise. Here, with well-characterized longitudinal Env sequences, we confirmed selected sites were concentrated in antibody contacts and selected sequences represented diverse antigenic phenotypes. Finally, practical applications include rapidly identifying immune targets under selective pressure within a subject, selecting minimal sets of reagents for immunological assays that characterize evolving antibody responses, and for immunogens in polyvalent “cocktail” vaccines.

  15. Many amino acid substitution variants identified in DNA repair genes during human population screenings are predicted to impact protein function

    SciTech Connect

    Xi, T; Jones, I M; Mohrenweiser, H W

    2003-11-03

    Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant From Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant". Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or Possibly Damaging". Another 9-15% of the variants were classed as "Potentially Intolerant or Damaging". The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty one to thirty one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign". Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.

  16. Canine preprorelaxin: nucleic acid sequence and localization within the canine placenta.

    PubMed

    Klonisch, T; Hombach-Klonisch, S; Froehlich, C; Kauffold, J; Steger, K; Steinetz, B G; Fischer, B

    1999-03-01

    Employing uteroplacental tissue at Day 35 of gestation, we determined the nucleic acid sequence of canine preprorelaxin using reverse transcription- and rapid amplification of cDNA ends-polymerase chain reaction. Canine preprorelaxin cDNA consisted of 534 base pairs encoding a protein of 177 amino acids with a signal peptide of 25 amino acids (aa), a B domain of 35 aa, a C domain of 93 aa, and an A domain of 24 aa. The putative receptor binding region in the N'-terminal part of the canine relaxin B domain GRDYVR contained two substitutions from the classical motif (E-->D and L-->Y). Canine preprorelaxin shared highest homology with porcine and equine preprorelaxin. Northern analysis revealed a 1-kilobase transcript present in total RNA of canine uteroplacental tissue but not of kidney tissue. Uteroplacental tissue from two bitches each at Days 30 and 35 of gestation were studied by in situ hybridization to localize relaxin mRNA. Immunohistochemistry for relaxin, cytokeratin, vimentin, and von Willebrand factor was performed on uteroplacental tissue at Day 30 of gestation. The basal cell layer at the core of the chorionic villi was devoid of relaxin mRNA and immunoreactive relaxin or vimentin but was immunopositive for cytokeratin and identified as cytotrophoblast cells. The cell layer surrounding the chorionic villi displayed specific hybridization signals for relaxin mRNA and immunoreactivity for relaxin and cytokeratin but not for vimentin, and was identified as syncytiotrophoblast. Those areas of the chorioallantoic tissue with most intense relaxin immunoreactivity were highly vascularized as demonstrated by immunoreactive von Willebrand factor expressed on vascular endothelium. The uterine glands and nonplacental uterine areas of the canine zonary girdle placenta were devoid of relaxin mRNA and relaxin. We conclude that the syncytiotrophoblast is the source of relaxin in the canine placenta. PMID:10026098

  17. Two distinct ferredoxins from Rhodobacter capsulatus: complete amino acid sequences and molecular evolution.

    PubMed

    Saeki, K; Suetsugu, Y; Yao, Y; Horio, T; Marrs, B L; Matsubara, H

    1990-09-01

    Two distinct ferredoxins were purified from Rhodobacter capsulatus SB1003. Their complete amino acid sequences were determined by a combination of protease digestion, BrCN cleavage and Edman degradation. Ferredoxins I and II were composed of 64 and 111 amino acids, respectively, with molecular weights of 6,728 and 12,549 excluding iron and sulfur atoms. Both contained two Cys clusters in their amino acid sequences. The first cluster of ferredoxin I and the second cluster of ferredoxin II had a sequence, CxxCxxCxxxCP, in common with the ferredoxins found in Clostridia. The second cluster of ferredoxin I had a sequence, CxxCxxxxxxxxCxxxCM, with extra amino acids between the second and third Cys, which has been reported for other photosynthetic bacterial ferredoxins and putative ferredoxins (nif-gene products) from nitrogen-fixing bacteria, and with a unique occurrence of Met. The first cluster of ferredoxin II had a CxxCxxxxCxxxCP sequence, with two additional amino acids between the second and third Cys, a characteristic feature of Azotobacter-[3Fe-4S] [4Fe-4S]-ferredoxin. Ferredoxin II was also similar to Azotobacter-type ferredoxins with an extended carboxyl (C-) terminal sequence compared to the common Clostridium-type. The evolutionary relationship of the two together with a putative one recently found to be encoded in nifENXQ region in this bacterium [Moreno-Vivian et al. (1989) J. Bacteriol. 171, 2591-2598] is discussed. PMID:2277040
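
    The Cys-cluster motifs quoted in this abstract translate directly into regular expressions, with "x" standing for any single residue. The following is a minimal sketch of such a scan, not the authors' method; the motif labels, function names and the toy sequence are made up for illustration, and the toy sequence is not a real ferredoxin.

```python
import re

# Motifs as written in the abstract; 'x' means "any one residue"
MOTIFS = {
    "clostridial-type": "CxxCxxCxxxCP",
    "ferredoxin I, second cluster": "CxxCxxxxxxxxCxxxCM",
    "ferredoxin II, first cluster": "CxxCxxxxCxxxCP",
}

def motif_to_regex(motif):
    """Convert a motif string into a regex source string ('x' -> '.')."""
    return "".join("." if ch == "x" else re.escape(ch) for ch in motif)

def find_motifs(sequence):
    """Return {motif label: [0-based start positions]} for motifs that occur."""
    hits = {}
    for label, motif in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well
        pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
        starts = [m.start() for m in pattern.finditer(sequence)]
        if starts:
            hits[label] = starts
    return hits

# Hypothetical toy sequence containing two of the three motifs
toy = "MA" + "CAACGGCEEECP" + "KL" + "CAAC" + "G" * 8 + "CEEECM"
print(find_motifs(toy))
```

    On the toy sequence this reports the clostridial-type motif at position 2 and the ferredoxin I second-cluster motif at position 16. A match only shows that the spacing of Cys residues fits; it says nothing about whether a cluster is actually bound.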

  18. Amino Acid Sequence of Anionic Peroxidase from the Windmill Palm Tree Trachycarpus fortunei

    PubMed Central

    2015-01-01

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications. PMID:25383699

  19. GENOME-SCALE SEQUENCING TO IDENTIFY GENES INVOLVED IN MENDELIAN DISORDERS

    PubMed Central

    Markello, Thomas C; Adams, David R

    2014-01-01

    The analysis of genome-scale sequence data can be defined as the interrogation of a complete set of genetic instructions in a search for individual loci that produce or contribute to a pathological state. Bioinformatic analysis of sequence data requires sufficient discriminant power to find this needle in a haystack. Current approaches make choices about selectivity and specificity thresholds, and the quality, quantity and completeness of the data in these analyses. There are many software tools available for individual analytic component-tasks, including commercial and open source options. Three major types of techniques have been included in most published exome projects to date: frequency/population genetic analysis, inheritance state consistency, and predictions of deleteriousness. We discuss the required infrastructure and use of each technique during analysis of genomic sequence data for clinical and research applications. Future developments will alter the strategies and sequence of using these tools and are speculated on in the closing section. PMID:24510651

  20. Complete Genome Sequence of a Genotype G23P[37] Pheasant Rotavirus Strain Identified in Hungary

    PubMed Central

    Gál, János; Marton, Szilvia; Ihász, Katalin; Papp, Hajnalka; Jakab, Ferenc; Malik, Yashpal S.; Bányai, Krisztián

    2016-01-01

    We investigated the genomic properties of a rotavirus A strain isolated from diarrheic pheasant poults in Hungary in 2015. Sequence analyses revealed a shared genomic constellation (G23-P[37]-I4-R4-C4-M4-A16-N10-T4-E4-H4) and close relationship (range of nucleotide sequence similarity: VP2, 88%; VP1 and NSP4, 98%) with another pheasant rotavirus strain isolated previously in Germany. PMID:27034484

  1. Complete Genome Sequence of a Genotype G23P[37] Pheasant Rotavirus Strain Identified in Hungary.

    PubMed

    Gál, János; Marton, Szilvia; Ihász, Katalin; Papp, Hajnalka; Jakab, Ferenc; Malik, Yashpal S; Bányai, Krisztián; Farkas, Szilvia L

    2016-01-01

    We investigated the genomic properties of a rotavirus A strain isolated from diarrheic pheasant poults in Hungary in 2015. Sequence analyses revealed a shared genomic constellation (G23-P[37]-I4-R4-C4-M4-A16-N10-T4-E4-H4) and close relationship (range of nucleotide sequence similarity: VP2, 88%; VP1 and NSP4, 98%) with another pheasant rotavirus strain isolated previously in Germany. PMID:27034484

  2. Protein chemotaxonomy. XIII. Amino acid sequence of ferredoxin from Panax ginseng.

    PubMed

    Mino, Yoshiki

    2006-08-01

    The complete amino acid sequence of [2Fe-2S] ferredoxin from Panax ginseng (Araliaceae) has been determined by automated Edman degradation of the entire S-carboxymethylcysteinyl protein and of the peptides obtained by enzymatic digestion. This ferredoxin has a unique amino acid sequence, which includes an insertion of Tyr at the 3rd position from the amino-terminus and a deletion of two amino acid residues at the carboxyl terminus. This ferredoxin had 18 differences in its amino acid sequence compared to that of Petroselinum sativum (Umbelliferae). In contrast, 23-33 differences were observed compared to other dicotyledonous plants. This suggests that Panax ginseng is related taxonomically to umbelliferous plants. PMID:16880642

  3. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  4. [Antimicrobial susceptibilities of clinical Nocardia isolates identified by 16S rRNA gene sequence analysis].

    PubMed

    Uner, Mahmut Celalettin; Hasçelik, Gülşen; Müştak, Hamit Kaan

    2016-01-01

    Nocardia species are ubiquitous in the environment and responsible for various human infections such as pulmonary, cutaneous, central nervous system and disseminated nocardiosis. Since the clinical pictures and antimicrobial susceptibilities of Nocardia species exhibit variability, susceptibility testing is recommended for every Nocardia isolate. The aims of this study were to determine the antimicrobial susceptibilities of Nocardia clinical isolates and to compare the results of broth microdilution and disc diffusion susceptibility tests. A total of 45 clinical Nocardia isolates (isolated from 17 respiratory tract, 8 brain abscess, 7 pus, 3 skin, 3 conjunctiva, 2 blood, 2 tissue, 2 pleural fluid and 1 cerebrospinal fluid samples) were identified by using conventional methods and 16S rRNA gene sequence analysis. Susceptibility testing was performed for amikacin, ciprofloxacin, ceftriaxone, linezolid and trimethoprim-sulfamethoxazole (TMP-SMX) by broth microdilution method according to the Clinical and Laboratory Standards Institute (CLSI) criteria recommended in 2011 approved standard (M24-A2) and disk diffusion method used as an alternative comparative susceptibility testing method. Among the 45 Nocardia strains, N.cyriacigeorgica (n: 26, 57.8%) was the most common species, followed by N.farcinica (n: 12, 26.7%), N.otitiscaviarum (n: 4, 8.9%), N.asteroides (n: 1, 2.2%), N.neocaledoniensis (n: 1, 2.2%) and N.abscessus (n: 1, 2.2%). Amikacin and linezolid were the only two antimicrobials to which all isolates were susceptible in both broth microdilution and disk diffusion tests. In the broth microdilution test, resistance rates to TMP-SMX, ceftriaxone and ciprofloxacin were 15.6%, 37.8% and 84.4% respectively, whereas in the disk diffusion test, the highest resistance rate was observed against ciprofloxacin (n: 33, 73.3%), followed by TMP-SMX (n: 22, 48.9%) and ceftriaxone (n: 15, 33.3%). In both of these tests, N.cyriacigeorgica was the species with the

  5. Jack bean α-mannosidase: amino acid sequencing and N-glycosylation analysis of a valuable glycomics tool.

    PubMed

    Gnanesh Kumar, B S; Pohlentz, Gottfried; Schulte, Mona; Mormann, Michael; Siva Kumar, Nadimpalli

    2014-03-01

    Jack bean (Canavalia ensiformis) seeds contain several biologically important proteins among which α-mannosidase (EC 3.2.1.24) has been purified, its biochemical properties studied and widely used in glycan analysis. In the present study, we have used the purified enzyme and derived its amino acid sequence covering both the known subunits (molecular mass of ∼66,000 and ∼44,000 Da) hitherto not known in its entirety. Peptide de novo sequencing and structural elucidation of N-glycopeptides obtained either directly from proteolytic digestion or after zwitterionic hydrophilic interaction liquid chromatography solid phase extraction-based separation were performed by use of nanoelectrospray ionization quadrupole time-of-flight mass spectrometry and low-energy collision-induced dissociation experiments. De novo sequencing provided new insights into the disulfide linkage organization, intersection of subunits and complete N-glycan structures along with site specificities. The primary sequence suggests that the enzyme belongs to glycosyl hydrolase family 38 and the N-glycan sequence analysis revealed high-mannose oligosaccharides, which were found to be heterogeneous with varying number of hexoses viz, Man8-9GlcNAc2 and Glc1Man9GlcNAc2 in an evolutionarily conserved N-glycosylation site. This site with two proximal cysteines is present in all the acidic α-mannosidases reported so far in eukaryotes. Further, a truncated paucimannose type was identified to be lacking terminal two mannose, Man1(Xyl)GlcNAc2 (Fuc). PMID:24295789

  6. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  7. Integrated genome and transcriptome sequencing identifies a novel form of hybrid and aggressive prostate cancer.

    PubMed

    Wu, Chunxiao; Wyatt, Alexander W; Lapuk, Anna V; McPherson, Andrew; McConeghy, Brian J; Bell, Robert H; Anderson, Shawn; Haegert, Anne; Brahmbhatt, Sonal; Shukin, Robert; Mo, Fan; Li, Estelle; Fazli, Ladan; Hurtado-Coll, Antonio; Jones, Edward C; Butterfield, Yaron S; Hach, Faraz; Hormozdiari, Fereydoun; Hajirasouliha, Iman; Boutros, Paul C; Bristow, Robert G; Jones, Steven Jm; Hirst, Martin; Marra, Marco A; Maher, Christopher A; Chinnaiyan, Arul M; Sahinalp, S Cenk; Gleave, Martin E; Volik, Stanislav V; Collins, Colin C

    2012-05-01

    Next-generation sequencing is making sequence-based molecular pathology and personalized oncology viable. We selected an individual initially diagnosed with conventional but aggressive prostate adenocarcinoma and sequenced the genome and transcriptome from primary and metastatic tissues collected prior to hormone therapy. The histology-pathology and copy number profiles were remarkably homogeneous, yet it was possible to propose the quadrant of the prostate tumour that likely seeded the metastatic diaspora. Despite a homogeneous cell type, our transcriptome analysis revealed signatures of both luminal and neuroendocrine cell types. Remarkably, the repertoire of expressed but apparently private gene fusions, including C15orf21:MYC, recapitulated this biology. We hypothesize that the amplification and over-expression of the stem cell gene MSI2 may have contributed to the stable hybrid cellular identity. This hybrid luminal-neuroendocrine tumour appears to represent a novel and highly aggressive case of prostate cancer with unique biological features and, conceivably, a propensity for rapid progression to castrate-resistance. Overall, this work highlights the importance of integrated analyses of genome, exome and transcriptome sequences for basic tumour biology, sequence-based molecular pathology and personalized oncology. PMID:22294438

  8. A novel mutation identified in PKHD1 by targeted exome sequencing: guiding prenatal diagnosis for an ARPKD family.

    PubMed

    Xu, Yan; Xiao, Bing; Jiang, Wen-Ting; Wang, Lei; Gen, Hong-Quan; Chen, Ying-Wei; Sun, Yu; Ji, Xing

    2014-11-01

    Autosomal recessive polycystic kidney disease (ARPKD) is a rare hereditary renal cystic disease involving multiple organs, mainly the kidney and liver. Parents who have had a child affected by ARPKD have a strong need for early and reliable prenatal diagnosis to guide future pregnancies. Here we provide an example of prenatal diagnosis in an ARPKD family where traditional antenatal ultrasound examinations failed to produce conclusive results until the 26th week of gestation. Compound heterozygous mutations c.274C>T (p.Arg92Trp) and c.9059T>C (p.Leu3020Pro) were identified using targeted exome sequencing in the patient and confirmed by Sanger sequencing. Further, the mother and father were revealed to be carriers of heterozygous c.274C>T and c.9059T>C mutations, respectively. Molecular prenatal diagnosis was performed for the current pregnancy by direct sequencing plus linkage analysis. The two mutations identified in the patient were both found in the fetus. In conclusion, compound heterozygous PKHD1 mutations were shown to be the molecular basis of ARPKD in the patient. The newly identified c.9059T>C mutation in the patient expands the mutation spectrum of the PKHD1 gene. For cases in which ultrasound fails to provide a clear diagnosis, we propose a new prenatal diagnosis procedure: first, screen for underlying mutations in the PKHD1 gene in the proband by targeted exome sequencing; then detect the causative mutations in fetal DNA by direct sequencing and confirm the results by linkage analysis. PMID:25153916
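
    The compound heterozygous call in this abstract rests on each variant being carried by a different parent. A minimal sketch of that check is given below; the function name and the set-based representation of parental carrier status are illustrative assumptions, and the sketch ignores de novo variants, genotype quality and phasing by linkage analysis.

```python
def inherited_in_trans(variant_a, variant_b, maternal, paternal):
    """True if the two variants are each carried by a different parent only."""
    a_from_mother = variant_a in maternal and variant_a not in paternal
    a_from_father = variant_a in paternal and variant_a not in maternal
    b_from_mother = variant_b in maternal and variant_b not in paternal
    b_from_father = variant_b in paternal and variant_b not in maternal
    return (a_from_mother and b_from_father) or (a_from_father and b_from_mother)

# Carrier status as reported in the abstract
mother = {"c.274C>T"}
father = {"c.9059T>C"}
print(inherited_in_trans("c.274C>T", "c.9059T>C", mother, father))  # True
```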

  9. Fad7 gene identification and fatty acids phenotypic variation in an olive collection by EcoTILLING and sequencing approaches.

    PubMed

    Sabetta, Wilma; Blanco, Antonio; Zelasco, Samanta; Lombardo, Luca; Perri, Enzo; Mangini, Giacomo; Montemurro, Cinzia

    2013-08-01

    The ω-3 fatty acid desaturases (FADs) are enzymes responsible for catalyzing the conversion of linoleic acid to α-linolenic acid localized in the plastid or in the endoplasmic reticulum. In this research we report the genotypic and phenotypic variation of Italian Olea europaea L. germplasm for fatty acid composition. The phenotypic oil characterization was followed by the molecular analysis of the plastidial-type ω-3 FAD gene (fad7) (EC 1.14.19), whose full-length sequence is identified here in cultivar Leccino. The gene consisted of 2635 bp with 8 exons and 5'- and 3'-UTRs of 336 and 282 bp respectively, and showed a high level of heterozygosity (1/110 bp). The natural allelic variation was investigated both by a LiCOR EcoTILLING assay and by direct sequencing of the PCR product. Only three haplotypes were identified among the 96 analysed cultivars, highlighting the strong degree of conservation of this gene. PMID:23685785

  10. A simple ligation-based method to increase the information density in sequencing reactions used to deconvolute nucleic acid selections

    PubMed Central

    Childs-Disney, Jessica L.; Disney, Matthew D.

    2008-01-01

    Herein, a method is described to increase the information density of sequencing experiments used to deconvolute nucleic acid selections. The method is facile and should be applicable to any selection experiment. A critical feature of this method is the use of biotinylated primers to amplify and encode a BamHI restriction site on both ends of a PCR product. After amplification, the PCR reaction is captured onto streptavidin resin, washed, and digested directly on the resin. Resin-based digestion affords clean product that is devoid of partially digested products and unincorporated PCR primers. The product's complementary ends are annealed and ligated together with T4 DNA ligase. Analysis of ligation products shows formation of concatemers of different length and little detectable monomer. Sequencing results produced data that routinely contained three to four copies of the library. This method allows for more efficient formulation of structure-activity relationships since multiple active sequences are identified from a single clone. PMID:18065718
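
    Because the ligated product carries a BamHI site between library copies, a sequencing read of a concatemer can be split back into individual inserts at the recognition sequence GGATCC. The sketch below assumes exactly that and nothing more; the function name and the toy read are hypothetical, and the abstract does not describe how the authors parsed their reads.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

def split_concatemer(read, site=BAMHI_SITE):
    """Split a concatemer read at each restriction site and drop empty pieces."""
    return [insert for insert in read.split(site) if insert]

# Hypothetical read: three short inserts joined and flanked by BamHI sites
read = "GGATCC" + "AAAC" + "GGATCC" + "TTTG" + "GGATCC" + "CCCA" + "GGATCC"
print(split_concatemer(read))  # ['AAAC', 'TTTG', 'CCCA']
```

    Two caveats apply to this simplification: an insert that itself contains GGATCC would be split in two, and because BamHI ends are self-complementary, an insert may be ligated in either orientation, so some pieces may need to be reverse-complemented before comparison.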

  11. Whole-genome sequencing in autism identifies hot spots for de novo germline mutation.

    PubMed

    Michaelson, Jacob J; Shi, Yujian; Gujral, Madhusudan; Zheng, Hancheng; Malhotra, Dheeraj; Jin, Xin; Jian, Minghan; Liu, Guangming; Greer, Douglas; Bhandari, Abhishek; Wu, Wenting; Corominas, Roser; Peoples, Aine; Koren, Amnon; Gore, Athurva; Kang, Shuli; Lin, Guan Ning; Estabillo, Jasper; Gadomski, Therese; Singh, Balvindar; Zhang, Kun; Akshoomoff, Natacha; Corsello, Christina; McCarroll, Steven; Iakoucheva, Lilia M; Li, Yingrui; Wang, Jun; Sebat, Jonathan

    2012-12-21

    De novo mutation plays an important role in autism spectrum disorders (ASDs). Notably, pathogenic copy number variants (CNVs) are characterized by high mutation rates. We hypothesize that hypermutability is a property of ASD genes and may also include nucleotide-substitution hot spots. We investigated global patterns of germline mutation by whole-genome sequencing of monozygotic twins concordant for ASD and their parents. Mutation rates varied widely throughout the genome (by 100-fold) and could be explained by intrinsic characteristics of DNA sequence and chromatin structure. Dense clusters of mutations within individual genomes were attributable to compound mutation or gene conversion. Hypermutability was a characteristic of genes involved in ASD and other diseases. In addition, genes impacted by mutations in this study were associated with ASD in independent exome-sequencing data sets. Our findings suggest that regional hypermutation is a significant factor shaping patterns of genetic variation and disease risk in humans. PMID:23260136

  12. Benefits and Challenges with Applying Unique Molecular Identifiers in Next Generation Sequencing to Detect Low Frequency Mutations

    PubMed Central

    Kou, Ruqin; Lam, Ham; Duan, Hairong; Ye, Li; Jongkam, Narisra; Chen, Weizhi; Zhang, Shifang; Li, Shihong

    2016-01-01

    Indexing individual template molecules with a unique identifier (UID) before PCR and deep sequencing is promising for detecting low-frequency mutations, as true mutations can be distinguished from PCR errors or sequencing errors based on consensus among reads sharing the same index. In an effort to develop a robust assay to detect, in urine, low-abundance bladder cancer cells carrying well-documented mutations, we first tested the idea on a set of mock templates, with wild type and known mutants mixed at defined ratios. We measured the combined error rate for PCR and Illumina sequencing at each nucleotide position of three exons, and demonstrated the power of a UID in distinguishing and correcting errors. In addition, we demonstrated that PCR sampling bias, rather than PCR errors, challenges the UID-deep sequencing method in faithfully detecting low-frequency mutations. PMID:26752634
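
    The consensus idea described in this abstract can be sketched as follows: reads are grouped by UID, and at each position the majority base within a family is kept, so that an error present in only a few reads of a family is voted out. This is a generic illustration rather than the authors' pipeline; the function name, the thresholds and the assumption that reads in a family are already aligned and of equal length are all hypothetical.

```python
from collections import Counter, defaultdict

def uid_consensus(reads, min_family_size=3, min_agreement=0.75):
    """Build one consensus sequence per UID family.

    reads: iterable of (uid, sequence) pairs; sequences sharing a UID are
    assumed to be aligned and of equal length.
    Positions where the majority base falls below min_agreement become 'N'.
    """
    families = defaultdict(list)
    for uid, seq in reads:
        families[uid].append(seq)

    consensus = {}
    for uid, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to outvote an error
        bases = []
        for column in zip(*seqs):
            base, count = Counter(column).most_common(1)[0]
            bases.append(base if count / len(seqs) >= min_agreement else "N")
        consensus[uid] = "".join(bases)
    return consensus

reads = [
    ("AAT", "ACGT"), ("AAT", "ACGT"), ("AAT", "ACTT"), ("AAT", "ACGT"), ("AAT", "ACGT"),
    ("GGC", "ACTT"), ("GGC", "ACTT"), ("GGC", "ACTT"),
    ("TTA", "ACGT"),
]
print(uid_consensus(reads))  # {'AAT': 'ACGT', 'GGC': 'ACTT'}
```

    In this toy input the single T in family AAT is treated as an error and removed, whereas the T shared by every read in family GGC is retained as a candidate true variant. Family TTA has only one read and is dropped, which illustrates how uneven amplification of templates can limit the approach.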

  13. Diverse Array of New Viral Sequences Identified in Worldwide Populations of the Asian Citrus Psyllid (Diaphorina citri) Using Viral Metagenomics

    PubMed Central

    Nouri, Shahideh; Salem, Nidá; Nigg, Jared C.

    2015-01-01

    ABSTRACT The Asian citrus psyllid, Diaphorina citri, is the natural vector of the causal agent of Huanglongbing (HLB), or citrus greening disease. Together, HLB and D. citri represent a major threat to world citrus production. As there is no cure for HLB, insect vector management is considered one strategy to help control the disease, and D. citri viruses might be useful. In this study, we used a metagenomic approach to analyze viral sequences associated with the global population of D. citri. By sequencing small RNAs and the transcriptome coupled with bioinformatics analysis, we showed that the virus-like sequences of D. citri are diverse. We identified novel viral sequences belonging to the picornavirus superfamily, the Reoviridae, Parvoviridae, and Bunyaviridae families, and an unclassified positive-sense single-stranded RNA virus. Moreover, a Wolbachia prophage-related sequence was identified. This is the first comprehensive survey to assess the viral community from worldwide populations of an agricultural insect pest. Our results provide valuable information on new putative viruses, some of which may have the potential to be used as biocontrol agents. IMPORTANCE Insects have the most species of all animals, and are hosts to, and vectors of, a great variety of known and unknown viruses. Some of these most likely have the potential to be important fundamental and/or practical resources. In this study, we used high-throughput next-generation sequencing (NGS) technology and bioinformatics analysis to identify putative viruses associated with Diaphorina citri, the Asian citrus psyllid. D. citri is the vector of the bacterium causing Huanglongbing (HLB), currently the most serious threat to citrus worldwide. Here, we report several novel viral sequences associated with D. citri. PMID:26676774

  14. Genome Sequences for a Cluster of Human Isolates of Listeria monocytogenes Identified in South Africa in 2015

    PubMed Central

    Naicker, Preneshni; Bamford, Colleen; Shuping, Liliwe; McCarthy, Kerrigan M.; Sooka, Arvinda; Smouse, Shannon L.; Tau, Nomsa; Keddy, Karen H.

    2016-01-01

    Listeria monocytogenes is a Gram-positive bacterium with a ubiquitous presence in the environment. There is growing concern about the increasing prevalence of L. monocytogenes associated with food-borne outbreaks. Here we report genome sequences for a cluster of human isolates of L. monocytogenes identified in South Africa in 2015. PMID:27056221

  15. Newly Identified Enterovirus C Genotypes, Identified in the Netherlands through Routine Sequencing of All Enteroviruses Detected in Clinical Materials from 2008 to 2015

    PubMed Central

    Poelman, Randy; Borger, Renze; Niesters, Hubert G. M.

    2016-01-01

    Enteroviruses (EVs) are a group of human and animal viruses that are capable of causing a variety of clinical syndromes. Different genotypes classified into species can be distinguished on the basis of sequence divergence in the VP1 capsid-coding region. Apparently new genotypes are discovered regularly, often as incidental findings in studies investigating respiratory syndromes or as part of poliovirus surveillance. Recently, some EVs have become recognized as significant respiratory pathogens, and a number of new genotypes belonging to species C have been identified. The circulation of these newly identified species C EVs, such as EV-C104, EV-C105, EV-C109, and EV-C117, nevertheless appears to be limited. In this report, we show the results of routine genotyping of all enteroviruses detected in our tertiary care hospital between January 2008 and April 2015. We detected 365 EVs belonging to 40 genotypes. Interestingly, several newly identified species C EVs were detected during the study period. Sequencing of the 5′ untranslated region (5′ UTR) of these viruses shows divergence in this region, which is a target region in many detection assays. PMID:27358467

  16. Newly Identified Enterovirus C Genotypes, Identified in the Netherlands through Routine Sequencing of All Enteroviruses Detected in Clinical Materials from 2008 to 2015.

    PubMed

    Van Leer-Buter, Coretta C; Poelman, Randy; Borger, Renze; Niesters, Hubert G M

    2016-09-01

    Enteroviruses (EVs) are a group of human and animal viruses that are capable of causing a variety of clinical syndromes. Different genotypes classified into species can be distinguished on the basis of sequence divergence in the VP1 capsid-coding region. Apparently new genotypes are discovered regularly, often as incidental findings in studies investigating respiratory syndromes or as part of poliovirus surveillance. Recently, some EVs have become recognized as significant respiratory pathogens, and a number of new genotypes belonging to species C have been identified. The circulation of these newly identified species C EVs, such as EV-C104, EV-C105, EV-C109, and EV-C117, nevertheless appears to be limited. In this report, we show the results of routine genotyping of all enteroviruses detected in our tertiary care hospital between January 2008 and April 2015. We detected 365 EVs belonging to 40 genotypes. Interestingly, several newly identified species C EVs were detected during the study period. Sequencing of the 5' untranslated region (5' UTR) of these viruses shows divergence in this region, which is a target region in many detection assays. PMID:27358467

  17. CandiSSR: An Efficient Pipeline used for Identifying Candidate Polymorphic SSRs Based on Multiple Assembled Sequences

    PubMed Central

    Xia, En-Hua; Yao, Qiu-Yang; Zhang, Hai-Bin; Jiang, Jian-Jun; Zhang, Li-Ping; Gao, Li-Zhi

    2016-01-01

    Simple sequence repeats (SSRs), also known as microsatellites, are ubiquitous short tandem duplications commonly found in genomes and/or transcriptomes of diverse organisms. They represent one of the most powerful molecular markers for genetic analysis and breeding programs because of their high mutation rate and neutral evolution. However, traditional experimental screening of the SSR polymorphic status and their subsequent applicability to genetic studies are extremely labor-intensive and time-consuming. Thankfully, the recently decreased costs of next generation sequencing and increasing availability of large genome and/or transcriptome sequences have provided an excellent opportunity and sources for large-scale mining of this type of molecular marker. However, current tools are limited. Thus we here developed a new pipeline, CandiSSR, to identify candidate polymorphic SSRs (PolySSRs) based on the multiple assembled sequences. The pipeline allows users to identify putative PolySSRs not only from the transcriptome datasets but also from multiple assembled genome sequences. In addition, two confidence metrics including standard deviation and missing rate of the SSR repetitions are provided to systematically assess the feasibility of the detected PolySSRs for subsequent application to genetic characterization. Meanwhile, primer pairs for each identified PolySSR are also automatically designed and further evaluated by the global sequence similarities of the primer-binding region, to improve the success rate of marker development. Screening rice genomes with CandiSSR and subsequent experimental validation showed an accuracy rate of over 90%. Besides, the application of CandiSSR has successfully identified a large number of PolySSRs in the Arabidopsis genomes and Camellia transcriptomes. CandiSSR and the PolySSR marker sources are publicly available at: http://www.plantkingdomgdb.com/CandiSSR/index.html. PMID:26779212
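
    The two confidence metrics named in this abstract, the standard deviation and the missing rate of SSR repeat numbers, can be illustrated for a single locus as below. This is a sketch under stated assumptions and not CandiSSR's own code: the function names and the missing-rate cutoff are hypothetical, and the use of the population standard deviation is a choice made here, since the abstract does not say which form the pipeline uses.

```python
from statistics import pstdev

def ssr_metrics(repeat_counts):
    """Return (standard deviation, missing rate) for one SSR locus.

    repeat_counts holds one entry per assembled sequence: the number of
    repeat units found, or None if the locus was not found in that sequence.
    """
    observed = [c for c in repeat_counts if c is not None]
    missing_rate = 1 - len(observed) / len(repeat_counts)
    # Assumption: population standard deviation over the observed counts
    sd = pstdev(observed) if len(observed) > 1 else 0.0
    return sd, missing_rate

def is_candidate_polyssr(repeat_counts, max_missing=0.5):
    """Flag a locus whose repeat number varies and is not missing too often."""
    sd, missing_rate = ssr_metrics(repeat_counts)
    return sd > 0 and missing_rate <= max_missing

# Hypothetical locus typed in four assemblies, absent from one of them
print(ssr_metrics([8, 8, 10, None]))
print(is_candidate_polyssr([8, 8, 10, None]))  # True
print(is_candidate_polyssr([5, 5, 5, 5]))      # False
```

    A locus with identical repeat numbers in every sequence has a standard deviation of zero and is therefore not reported as polymorphic, while a high missing rate signals that the locus may be unreliable for marker development.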

  18. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  19. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  20. Draft Genome Sequence of "Candidatus Mycoplasma haemobos," a Hemotropic Mycoplasma Identified in Cattle in Mexico.

    PubMed

    Martínez-Ocampo, Fernando; Rodríguez-Camarillo, Sergio D; Amaro-Estrada, Itzel; Quiroz-Castañeda, Rosa Estela

    2016-01-01

    We present here the draft genome sequence of the first "Candidatus Mycoplasma haemobos" strain found in cattle in Mexico. This hemotropic mycoplasma causes acute and chronic disease in animals. This genome is a starting point for studying the role of this mycoplasma in coinfections and synergistic mechanisms associated with the disease. PMID:27389272

  1. Identifying, Sequencing and Managing Intellectual Risks to Students: Discussion in the Foreign Language Literature Course.

    ERIC Educational Resources Information Center

    Nance, Kimberly A.

    Student apprehension about discussing intellectually "risky" ideas in the foreign language literature class can be addressed through construction of a classroom environment in which students gain confidence. The governing principle is the sequencing of risk. Students perceive risks to be in: (1) making a linguistic error; (2) making an error of…

  2. Genome Sequences of the Novel Porcine Parvovirus 3, Identified in Guangxi Province, China.

    PubMed

    Zhong, Hui; Li, Xiangmin; Zhao, Zekai; An, Chunjing; Wan, Peng; Wu, Mengge; Chen, Huanchun; Qian, Ping

    2016-01-01

    Porcine parvovirus 3 is a novel parvovirus that infects pigs. Here, we report two genome sequences of porcine parvovirus 3 strains GX1 and GX2, which are highly prevalent in Guangxi province. It will help in understanding the epidemiology and molecular characteristics of the porcine parvovirus 3. PMID:26941135

  3. Genome Sequences of the Novel Porcine Parvovirus 3, Identified in Guangxi Province, China

    PubMed Central

    Zhong, Hui; Li, Xiangmin; Zhao, Zekai; An, Chunjing; Wan, Peng; Wu, Mengge; Chen, Huanchun

    2016-01-01

    Porcine parvovirus 3 is a novel parvovirus that infects pigs. Here, we report two genome sequences of porcine parvovirus 3 strains GX1 and GX2, which are highly prevalent in Guangxi province. It will help in understanding the epidemiology and molecular characteristics of the porcine parvovirus 3. PMID:26941135

  4. A sequencing strategy for identifying variation throughout the prion gene of BSE-affected cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cattle prion gene (PRNP) polymorphisms have been associated with bovine spongiform encephalopathy (BSE) susceptibility. We developed a method for sequencing bovine PRNP through all exons, introns and part of the promoter (25.2 kb) that accounts for known variation. The method can be used to detect...

  5. Identifying Learning Behaviors by Contextualizing Differential Sequence Mining with Action Features and Performance Evolution

    ERIC Educational Resources Information Center

    Kinnebrew, John S.; Biswas, Gautam

    2012-01-01

    Our learning-by-teaching environment, Betty's Brain, captures a wealth of data on students' learning interactions as they teach a virtual agent. This paper extends an exploratory data mining methodology for assessing and comparing students' learning behaviors from these interaction traces. The core algorithm employs sequence mining techniques to…

  6. Identifying microbial fitness determinants by Insertion Sequencing (INSeq) using genome-wide transposon mutant libraries

    PubMed Central

    Goodman, Andrew L.; Wu, Meng; Gordon, Jeffrey I.

    2012-01-01

    Insertion Sequencing (INSeq) is a method for determining the insertion site and relative abundance of large numbers of transposon mutants in a mixed population of isogenic mutants of a sequenced microbial species. INSeq is based on a modified mariner transposon containing MmeI sites at its ends, allowing cleavage at chromosomal sites 16–17bp from the inserted transposon. Genomic regions adjacent to the transposons are amplified by linear PCR with a biotinylated primer. Products are bound to magnetic beads, digested with MmeI, and barcoded with sample-specific linkers appended to each restriction fragment. After limited PCR amplification, fragments are sequenced using a high-throughput instrument. The sequence of each read can be used to map the location of a transposon in the genome. Read count measures the relative abundance of that mutant in the population. Solid-phase library preparation makes this protocol rapid (18h), easy to scale-up, amenable to automation, and useful for a variety of samples. A protocol for characterizing libraries of transposon mutant strains clonally arrayed in multi-well format is provided. PMID:22094732
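
    The abstract's statement that read count measures the relative abundance of each mutant can be illustrated with a minimal sketch. It assumes reads have already been mapped to insertion coordinates; the function name and the example coordinates are hypothetical and are not taken from the INSeq protocol.

```python
from collections import Counter


def relative_abundance(read_positions):
    """Return {insertion_site: fraction of mapped reads} for one sample.

    read_positions holds one mapped insertion coordinate per sequencing read.
    """
    counts = Counter(read_positions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {site: n / total for site, n in counts.items()}


# Hypothetical mapped insertion sites (genome coordinates), one entry per read.
reads = [1200, 1200, 1200, 5400, 98000, 98000]
abundance = relative_abundance(reads)
for site, fraction in sorted(abundance.items()):
    print(site, round(fraction, 3))
```

    Comparing these fractions for the same site between an input and an output population is one simple way to look for fitness effects; a real analysis would also normalize for sequencing depth.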

  7. Genome sequencing identifies Listeria fleischmannii subsp. coloradonensis subsp. nov., isolated from a ranch.

    PubMed

    den Bakker, Henk C; Manuel, Clyde S; Fortes, Esther D; Wiedmann, Martin; Nightingale, Kendra K

    2013-09-01

    Twenty Listeria-like isolates were obtained from environmental samples collected on a cattle ranch in northern Colorado; all of these isolates were found to share an identical partial sigB sequence, suggesting close relatedness. The isolates were similar to members of the genus Listeria in that they were Gram-stain-positive, short rods, oxidase-negative and catalase-positive; the isolates were similar to Listeria fleischmannii because they were non-motile at 25 °C. 16S rRNA gene sequencing for representative isolates and whole genome sequencing for one isolate was performed. The genome of the type strain of Listeria fleischmannii (strain LU2006-1(T)) was also sequenced. The draft genomes were very similar in size and the average MUMmer nucleotide identity across 91% of the genomes was 95.16%. Genome sequence data were used to design primers for a six-gene multi-locus sequence analysis (MLSA) scheme. Phylogenies based on (i) the near-complete 16S rRNA gene, (ii) 31 core genes and (iii) six housekeeping genes illustrated the close relationship of these Listeria-like isolates to Listeria fleischmannii LU2006-1(T). Sufficient genetic divergence of the Listeria-like isolates from the type strain of Listeria fleischmannii and differing phenotypic characteristics warrant these isolates to be classified as members of a distinct infraspecific taxon, for which the name Listeria fleischmannii subsp. coloradonensis subsp. nov. is proposed. The type strain is TTU M1-001(T) ( =BAA-2414(T) =DSM 25391(T)). The isolates of Listeria fleischmannii subsp. coloradonensis subsp. nov. differ from the nominate subspecies by the inability to utilize melezitose, turanose and sucrose, and the ability to utilize inositol. The results also demonstrate the utility of whole genome sequencing to facilitate identification of novel taxa within a well-described genus. The genomes of both subspecies of Listeria fleischmannii contained putative enhancin genes; the Listeria fleischmannii subsp

  8. The developmental transcriptome landscape of bovine skeletal muscle defined by Ribo-Zero ribonucleic acid sequencing.

    PubMed

    Sun, X; Li, M; Sun, Y; Cai, H; Li, R; Wei, X; Lan, X; Huang, Y; Lei, C; Chen, H

    2015-12-01

    Ribonucleic acid sequencing (RNA-Seq) libraries are normally prepared with oligo(dT) selection of poly(A)+ mRNA, but it depends on intact total RNA samples. Recent studies have described Ribo-Zero technology, a novel method that can capture both poly(A)+ and poly(A)- transcripts from intact or fragmented RNA samples. We report here the first application of Ribo-Zero RNA-Seq for the analysis of the bovine embryonic, neonatal, and adult skeletal muscle whole transcriptome at an unprecedented depth. Overall, 19,893 genes were found to be expressed, with a high correlation of expression levels between the calf and the adult. Hundreds of genes were found to be highly expressed in the embryo and decreased at least 10-fold after birth, indicating their potential roles in embryonic muscle development. In addition, we present for the first time the analysis of global transcript isoform discovery in bovine skeletal muscle and identified 36,694 transcript isoforms. Transcriptomic data were also analyzed to unravel sequence variations; 185,036 putative SNP and 12,428 putative short insertions-deletions (InDel) were detected. Specifically, many stop-gain, stop-loss, and frameshift mutations were identified that probably change the relative protein production and sequentially affect the gene function. Notably, the numbers of stage-specific transcripts, alternative splicing events, SNP, and InDel were greater in the embryo than in the calf and the adult, suggesting that gene expression is most active in the embryo. The resulting view of the transcriptome at a single-base resolution greatly enhances the comprehensive transcript catalog and uncovers the global trends in gene expression during bovine skeletal muscle development. PMID:26641174

  9. Purification and partial amino acid sequence of the chloroplast cytochrome b-559.

    PubMed

    Widger, W R; Cramer, W A; Hermodson, M; Meyer, D; Gullifor, M

    1984-03-25

    The hydrophobic cytochrome b-559, purified from unstacked, ethanol-washed spinach thylakoid membranes, using extraction with 2% Triton X-100 in 4 M urea and three chromatographic steps in the presence of protease inhibitors, has a dominant band on sodium dodecyl sulfate-urea gels corresponding to Mr = 10,000. The yield of this preparation is 30-50% (5-10 mg) starting with 600 mg of chlorophyll. The heme content yields a calculated molecular weight of no more than 17,500/heme, and perhaps somewhat smaller after correction for impurities. The Mr = 10,000 band is stained by the tetramethylbenzidine-H2O2 heme reagent on lithium dodecyl sulfate gels run at 0 degrees C. The Mr = 10,000 protein, further separated by high performance liquid chromatography, contains a unique NH2 terminus that is not blocked, and the amino acid sequence for the first 27 residues is NH2-Ser-Gly-Ser-Thr-Gly-Glu-Arg-Ser-Phe-Ala-Asp-Ile-Ile-Thr-Ser-Ile-Arg-Tyr-Trp-Val-Ile-X-Ser-Ile-Thr-Ile-Pro...COOH. Approximately 55% of the amino acids are hydrophobic, based on amino acid analysis of the Mr = 10,000 peptide, which also indicated the presence of at least one histidine. Only one cytochrome b-559 component could be identified, whose yield indicated that it arises from a single b-559 protein in chloroplasts corresponding to the in situ high potential cytochrome of the chloroplast photosystem II. PMID:6706983

  10. Nucleolar targeting of proteins by the tandem array of basic amino acid stretches identified in the RNA polymerase I-associated factor PAF49

    SciTech Connect

    Ushijima, Ryujiro; Matsuyama, Toshifumi; Nagata, Izumi; Yamamoto, Kazuo

    2008-05-16

    There is accumulating evidence that the regulation of subnuclear compartmentalization plays important roles in cellular processes. The RNA polymerase I-associated factor PAF49 has been shown to accumulate in the nucleolus in growing cells, but to disperse into the nucleoplasm in growth-arrested cells. Serial deletion analysis revealed that amino acids 199-338 were necessary for the nucleolar localization of PAF49. Combinatorial point mutation analysis indicated that the individual basic amino acid stretches (BS) within the central (BS1-4) and the C-terminal (BS5 and 6) regions may cooperatively confer the nucleolar localization of PAF49. Addition of the basic stretches in tandem to a heterologous protein, such as the interferon regulatory factor-3, translocated the tagged protein into the nucleolus, even in the presence of an intrinsic nuclear export sequence. Thus, the tandem array of basic amino acid stretches identified here functions as a dominant nucleolar targeting sequence.

  11. Whole genome sequencing to identify host genetic risk factors for severe outcomes of hepatitis A virus infection

    PubMed Central

    Long, Dustin; Fix, Oren K.; Deng, Xutao; Seielstad, Mark; Lauring, Adam S.

    2014-01-01

    Acute liver failure is a severe, but rare, outcome of hepatitis A virus infection. Unusual presentations of prevalent infections have often been attributed to pathogen-specific immune deficits that exhibit Mendelian inheritance. Genome-wide resequencing of unrelated cases has proven to be a powerful approach for identifying highly penetrant risk alleles that underlie such syndromes. Rare mutations likely to affect protein expression or function can be identified from sequence data, and their association with a similarly rare phenotype rests on their existence in multiple affected individuals. A rare or novel sequence variant that is enriched to a significant degree in a genetically diverse cohort suggests a candidate susceptibility allele. Whole genome sequencing of ten individuals from ethnically diverse backgrounds with HAV-associated acute liver failure was performed. A set of rational filtering criteria was used to identify genetic variants that are rare in the population, but enriched in this cohort. Single nucleotide polymorphisms, insertions, and deletions were considered and autosomal dominant, autosomal recessive, and polygenic models were applied. Analysis of the protein-coding exome identified no single gene with putatively deleterious mutations shared by multiple individuals, arguing against a simple Mendelian model of inheritance. A number of rare variants were significantly enriched in this cohort, consistent with a complex and genetically heterogeneous trait. Several of the variants identified in this genome-wide study lie within genes important to hepatic pathophysiology and are candidate susceptibility alleles for hepatitis A virus infection. PMID:24978929
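
    The filtering idea in this abstract (keep variants that are rare in the population but carried by several cohort members) can be sketched as follows. The thresholds, field names and example records are all hypothetical; the abstract does not state the study's actual criteria.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str             # gene symbol (placeholder names below)
    population_af: float  # allele frequency in a reference population
    carriers: int         # number of cohort members carrying the variant


def rare_and_shared(variants, max_af=0.01, min_carriers=2):
    """Keep variants rare in the population yet seen in several cases.

    max_af and min_carriers are illustrative cut-offs, not the study's.
    """
    return [
        v for v in variants
        if v.population_af <= max_af and v.carriers >= min_carriers
    ]


# Made-up cohort summary for ten sequenced individuals.
cohort = [
    Variant("GENE_A", 0.002, 3),   # rare and shared: kept
    Variant("GENE_B", 0.150, 6),   # common in the population: dropped
    Variant("GENE_C", 0.0005, 1),  # rare but private to one case: dropped
]
candidates = rare_and_shared(cohort)
print([v.gene for v in candidates])
```

    A filter like this only nominates candidates; the abstract's conclusions rest on statistical enrichment and on testing several inheritance models, which this sketch does not attempt.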

  12. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
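
    As a rough illustration of the quantities discussed here, the sketch below computes a first-order Shannon redundancy, R = 1 - H / log2(20), from amino acid composition and estimates a Monte Carlo baseline from uniformly random sequences of the same length. This is the textbook single-residue definition and is only an assumption about the flavor of the analysis; the record's own parameters (including pair-frequency measures) are not reproduced, and the example sequence is invented.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def shannon_entropy(seq):
    """First-order entropy (bits per residue) of the residue composition."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redundancy(seq, alphabet_size=len(AMINO_ACIDS)):
    """Redundancy R = 1 - H / H_max, with H_max = log2(alphabet size)."""
    return 1.0 - shannon_entropy(seq) / math.log2(alphabet_size)


def random_baseline(length, trials=200, seed=0):
    """Mean redundancy of uniformly random sequences of a given length."""
    rng = random.Random(seed)
    values = [
        redundancy("".join(rng.choices(AMINO_ACIDS, k=length)))
        for _ in range(trials)
    ]
    return sum(values) / len(values)


example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # invented sequence
print(round(redundancy(example), 3), round(random_baseline(len(example)), 3))
```

    Even random sequences show nonzero redundancy when they are short, which is one way to see why chain length has to be controlled for when comparing real proteins against a random standard.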

  13. Clinical next generation sequencing of pediatric-type malignancies in adult patients identifies novel somatic aberrations

    PubMed Central

    Silva, Jorge Galvez; Corrales-Medina, Fernando F.; Maher, Ossama M.; Tannir, Nizar; Huh, Winston W.; Rytting, Michael E.; Subbiah, Vivek

    2015-01-01

    Pediatric malignancies in adults, in contrast to the same diseases in children, are clinically more aggressive, resistant to chemotherapeutics, and carry a higher risk of relapse. Molecular profiling of tumor samples using next generation sequencing (NGS) has recently become clinically available. We report the results of targeted exome sequencing of six adult patients with pediatric-type malignancies: Wilms tumor (n=2), medulloblastoma (n=2), Ewing's sarcoma (n=1) and desmoplastic small round cell tumor (n=1), with a median age of 28.8 years. Detection of druggable somatic aberrations in tumors is feasible. However, identification of actionable target therapies in these rare adult patients with pediatric-type malignancies is challenging. Continuous efforts to establish a rare disease registry are warranted. PMID:25859559

  14. Clinical next generation sequencing of pediatric-type malignancies in adult patients identifies novel somatic aberrations.

    PubMed

    Silva, Jorge Galvez; Corrales-Medina, Fernando F; Maher, Ossama M; Tannir, Nizar; Huh, Winston W; Rytting, Michael E; Subbiah, Vivek

    2015-01-01

    Pediatric malignancies in adults, in contrast to the same diseases in children, are clinically more aggressive, resistant to chemotherapeutics, and carry a higher risk of relapse. Molecular profiling of tumor samples using next generation sequencing (NGS) has recently become clinically available. We report the results of targeted exome sequencing of six adult patients with pediatric-type malignancies: Wilms tumor (n=2), medulloblastoma (n=2), Ewing's sarcoma (n=1) and desmoplastic small round cell tumor (n=1), with a median age of 28.8 years. Detection of druggable somatic aberrations in tumors is feasible. However, identification of actionable target therapies in these rare adult patients with pediatric-type malignancies is challenging. Continuous efforts to establish a rare disease registry are warranted. PMID:25859559

  15. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome

    PubMed Central

    Benoit, Joshua B.; Adelman, Zach N.; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C.; Szuter, Elise M.; Hagan, Richard W.; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M.; Nelson, David R.; Rosendale, Andrew J.; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M.; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R.; Ioannidis, Panagiotis; Waterhouse, Robert M.; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J. Spencer; Gondhalekar, Ameya D.; Scharf, Michael E.; Peterson, Brittany F.; Raje, Kapil R.; Hottel, Benjamin A.; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S. T.; Duncan, Elizabeth J.; Murali, Shwetha C.; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L.; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C.; Muzny, Donna M.; Wheeler, David; Panfilio, Kristen A.; Vargas Jentzsch, Iris M.; Vargo, Edward L.; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T.; Anderson, Michelle A. E.; Jones, Jeffery W.; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D.; Attardo, Geoffrey M.; Robertson, Hugh M.; Zdobnov, Evgeny M.; Ribeiro, Jose M. C.; Gibbs, Richard A.; Werren, John H.; Palli, Subba R.; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host–symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human–bed bug and symbiont–bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  16. Unique features of a global human ectoparasite identified through sequencing of the bed bug genome.

    PubMed

    Benoit, Joshua B; Adelman, Zach N; Reinhardt, Klaus; Dolan, Amanda; Poelchau, Monica; Jennings, Emily C; Szuter, Elise M; Hagan, Richard W; Gujar, Hemant; Shukla, Jayendra Nath; Zhu, Fang; Mohan, M; Nelson, David R; Rosendale, Andrew J; Derst, Christian; Resnik, Valentina; Wernig, Sebastian; Menegazzi, Pamela; Wegener, Christian; Peschel, Nicolai; Hendershot, Jacob M; Blenau, Wolfgang; Predel, Reinhard; Johnston, Paul R; Ioannidis, Panagiotis; Waterhouse, Robert M; Nauen, Ralf; Schorn, Corinna; Ott, Mark-Christoph; Maiwald, Frank; Johnston, J Spencer; Gondhalekar, Ameya D; Scharf, Michael E; Peterson, Brittany F; Raje, Kapil R; Hottel, Benjamin A; Armisén, David; Crumière, Antonin Jean Johan; Refki, Peter Nagui; Santos, Maria Emilia; Sghaier, Essia; Viala, Sèverine; Khila, Abderrahman; Ahn, Seung-Joon; Childers, Christopher; Lee, Chien-Yueh; Lin, Han; Hughes, Daniel S T; Duncan, Elizabeth J; Murali, Shwetha C; Qu, Jiaxin; Dugan, Shannon; Lee, Sandra L; Chao, Hsu; Dinh, Huyen; Han, Yi; Doddapaneni, Harshavardhan; Worley, Kim C; Muzny, Donna M; Wheeler, David; Panfilio, Kristen A; Vargas Jentzsch, Iris M; Vargo, Edward L; Booth, Warren; Friedrich, Markus; Weirauch, Matthew T; Anderson, Michelle A E; Jones, Jeffery W; Mittapalli, Omprakash; Zhao, Chaoyang; Zhou, Jing-Jiang; Evans, Jay D; Attardo, Geoffrey M; Robertson, Hugh M; Zdobnov, Evgeny M; Ribeiro, Jose M C; Gibbs, Richard A; Werren, John H; Palli, Subba R; Schal, Coby; Richards, Stephen

    2016-01-01

    The bed bug, Cimex lectularius, has re-established itself as a ubiquitous human ectoparasite throughout much of the world during the past two decades. This global resurgence is likely linked to increased international travel and commerce in addition to widespread insecticide resistance. Analyses of the C. lectularius sequenced genome (650 Mb) and 14,220 predicted protein-coding genes provide a comprehensive representation of genes that are linked to traumatic insemination, a reduced chemosensory repertoire of genes related to obligate hematophagy, host-symbiont interactions, and several mechanisms of insecticide resistance. In addition, we document the presence of multiple putative lateral gene transfer events. Genome sequencing and annotation establish a solid foundation for future research on mechanisms of insecticide resistance, human-bed bug and symbiont-bed bug associations, and unique features of bed bug biology that contribute to the unprecedented success of C. lectularius as a human ectoparasite. PMID:26836814

  17. Genome Sequence of Erythromelalgia-Related Poxvirus Identifies it as an Ectromelia Virus Strain

    PubMed Central

    Mendez-Rios, Jorge D.; Martens, Craig A.; Bruno, Daniel P.; Porcella, Stephen F.; Zheng, Zhi-Ming; Moss, Bernard

    2012-01-01

    Erythromelalgia is a condition characterized by attacks of burning pain and inflammation in the extremities. An epidemic form of this syndrome occurs in secondary students in rural China and a virus referred to as erythromelalgia-associated poxvirus (ERPV) was reported to have been recovered from throat swabs in 1987. Studies performed at the time suggested that ERPV belongs to the orthopoxvirus genus and has similarities with ectromelia virus, the causative agent of mousepox. We have determined the complete genome sequence of ERPV and demonstrated that it has 99.8% identity to the Naval strain of ectromelia virus and a slightly lower identity to the Moscow strain. Small DNA deletions in the Naval genome that are absent from ERPV may suggest that the sequenced strain of Naval was not the immediate progenitor of ERPV. PMID:22558090

  18. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color

    PubMed Central

    2013-01-01

    Background: Theobroma cacao L. cultivar Matina 1-6 belongs to the most cultivated cacao type. The availability of its genome sequence and methods for identifying genes responsible for important cacao traits will aid cacao researchers and breeders. Results: We describe the sequencing and assembly of the genome of Theobroma cacao L. cultivar Matina 1-6. The genome of the Matina 1-6 cultivar is 445 Mbp, which is significantly larger than a sequenced Criollo cultivar, and more typical of other cultivars. The chromosome-scale assembly, version 1.1, contains 711 scaffolds covering 346.0 Mbp, with a contig N50 of 84.4 kbp, a scaffold N50 of 34.4 Mbp, and an evidence-based gene set of 29,408 loci. Version 1.1 has 10x the scaffold N50 and 4x the contig N50 of Criollo, and includes 111 Mb more anchored sequence. The version 1.1 assembly has 4.4% gap sequence, while Criollo has 10.9%. Through a combination of haplotype, association mapping and gene expression analyses, we leverage this robust reference genome to identify a promising candidate gene responsible for pod color variation. We demonstrate that green/red pod color in cacao is likely regulated by the R2R3 MYB transcription factor TcMYB113, homologs of which determine pigmentation in Rosaceae, Solanaceae, and Brassicaceae. One SNP within the target site for a highly conserved trans-acting siRNA in dicots, found within TcMYB113, seems to affect transcript levels of this gene and therefore pod color variation. Conclusions: We report a high-quality sequence and annotation of Theobroma cacao L. and demonstrate its utility in identifying candidate genes regulating traits. PMID:23731509
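
    The contig and scaffold N50 values quoted in this abstract follow a standard assembly statistic: the length L such that sequences of length at least L together cover at least half of the total assembled length. A minimal sketch, using invented lengths rather than the cacao assembly data:

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sort lengths from longest to shortest and accumulate them; N50 is the
    length at which the running total first reaches half of the overall sum.
    """
    if not lengths:
        raise ValueError("n50() needs at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Invented contig lengths (arbitrary units), total = 32.
contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(contigs))
```

    Because N50 is driven by the longest sequences, it rewards contiguity but says nothing about correctness, which is why the abstract also reports gap percentage and anchored sequence.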

  19. Exome sequencing identifies a DNAJB6 mutation in a family with dominantly-inherited limb-girdle muscular dystrophy.

    PubMed

    Couthouis, Julien; Raphael, Alya R; Siskind, Carly; Findlay, Andrew R; Buenrostro, Jason D; Greenleaf, William J; Vogel, Hannes; Day, John W; Flanigan, Kevin M; Gitler, Aaron D

    2014-05-01

    Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D. PMID:24594375

  20. Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles.

    PubMed

    Nandi, Tannistha; Holden, Matthew T G; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J; Titball, Richard; Chen, Swaine L; Parkhill, Julian; Tan, Patrick

    2015-01-01

    Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617

  1. Burkholderia pseudomallei sequencing identifies genomic clades with distinct recombination, accessory, and epigenetic profiles

    PubMed Central

    Nandi, Tannistha; Holden, Matthew T.G.; Didelot, Xavier; Mehershahi, Kurosh; Boddey, Justin A.; Beacham, Ifor; Peak, Ian; Harting, John; Baybayan, Primo; Guo, Yan; Wang, Susana; How, Lee Chee; Sim, Bernice; Essex-Lopresti, Angela; Sarkar-Tyson, Mitali; Nelson, Michelle; Smither, Sophie; Ong, Catherine; Aw, Lay Tin; Hoon, Chua Hui; Michell, Stephen; Studholme, David J.; Titball, Richard; Chen, Swaine L.; Parkhill, Julian

    2015-01-01

    Burkholderia pseudomallei (Bp) is the causative agent of the infectious disease melioidosis. To investigate population diversity, recombination, and horizontal gene transfer in closely related Bp isolates, we performed whole-genome sequencing (WGS) on 106 clinical, animal, and environmental strains from a restricted Asian locale. Whole-genome phylogenies resolved multiple genomic clades of Bp, largely congruent with multilocus sequence typing (MLST). We discovered widespread recombination in the Bp core genome, involving hundreds of regions associated with multiple haplotypes. Highly recombinant regions exhibited functional enrichments that may contribute to virulence. We observed clade-specific patterns of recombination and accessory gene exchange, and provide evidence that this is likely due to ongoing recombination between clade members. Reciprocally, interclade exchanges were rarely observed, suggesting mechanisms restricting gene flow between clades. Interrogation of accessory elements revealed that each clade harbored a distinct complement of restriction-modification (RM) systems, predicted to cause clade-specific patterns of DNA methylation. Using methylome sequencing, we confirmed that representative strains from separate clades indeed exhibit distinct methylation profiles. Finally, using an E. coli system, we demonstrate that Bp RM systems can inhibit uptake of non-self DNA. Our data suggest that RM systems borne on mobile elements, besides preventing foreign DNA invasion, may also contribute to limiting exchanges of genetic material between individuals of the same species. Genomic clades may thus represent functional units of genetic isolation in Bp, modulating intraspecies genetic diversity. PMID:25236617

  2. TCR Sequencing Can Identify and Track Glioma-Infiltrating T Cells after DC Vaccination.

    PubMed

    Hsu, Melody S; Sedighim, Shaina; Wang, Tina; Antonios, Joseph P; Everson, Richard G; Tucker, Alexander M; Du, Lin; Emerson, Ryan; Yusko, Erik; Sanders, Catherine; Robins, Harlan S; Yong, William H; Davidson, Tom B; Li, Gang; Liau, Linda M; Prins, Robert M

    2016-05-01

    Although immunotherapeutic strategies are emerging as adjunctive treatments for cancer, sensitive methods of monitoring the immune response after treatment remain to be established. We used a novel next-generation sequencing approach to determine whether quantitative assessments of tumor-infiltrating lymphocyte (TIL) content and the degree of overlap of T-cell receptor (TCR) sequences in brain tumors and peripheral blood were predictors of immune response and overall survival in glioblastoma patients treated with autologous tumor lysate-pulsed dendritic cell immunotherapy. A statistically significant correlation was found between a higher estimated TIL content and increased time to progression and overall survival. In addition, we were able to assess the proportion of shared TCR sequences between tumor and peripheral blood at time points before and after therapy, and found the level of TCR overlap to correlate with survival outcomes. Higher degrees of overlap, or the development of an increased overlap following immunotherapy, were correlated with improved clinical outcome, and may provide insights into the successful, antigen-specific immune response. Cancer Immunol Res; 4(5); 412-8. ©2016 AACR. PMID:26968205
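
    The abstract does not say how TCR overlap between tumor and blood was scored. As a rough illustration only, the sketch below assumes each clonotype is represented by a CDR3 amino acid string and uses two simple, hypothetical measures: the fraction of distinct tumor clonotypes also seen in blood, and the Jaccard index. The sequences are made up.

```python
def shared_fraction(tumor, blood):
    """Fraction of distinct tumor clonotypes that also occur in blood."""
    tumor_set, blood_set = set(tumor), set(blood)
    if not tumor_set:
        return 0.0
    return len(tumor_set & blood_set) / len(tumor_set)


def jaccard(tumor, blood):
    """Shared clonotypes divided by all distinct clonotypes in either sample."""
    tumor_set, blood_set = set(tumor), set(blood)
    union = tumor_set | blood_set
    if not union:
        return 0.0
    return len(tumor_set & blood_set) / len(union)


# Made-up CDR3 strings, not data from the study.
tumor = ["CASSLGQYF", "CASSPGTEAFF", "CASRDRGYEQYF", "CASSLGQYF"]
blood_pre = ["CASSPGTEAFF", "CASSQETQYF"]
blood_post = ["CASSPGTEAFF", "CASSLGQYF", "CASSQETQYF"]

print(shared_fraction(tumor, blood_pre), shared_fraction(tumor, blood_post))
print(jaccard(tumor, blood_pre), jaccard(tumor, blood_post))  # 0.25 0.5
```

    In this toy case the overlap rises after therapy, which is the kind of change the abstract associates with better outcome; a real analysis would also need to account for clone frequencies and sequencing depth.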

  3. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data

    PubMed Central

    2010-01-01

    Background: High throughput sequencing has become an important technology for studying expression levels in many types of genomic, and particularly transcriptomic, data. One key way of analysing such data is to look for elements of the data which display particular patterns of differential expression in order to take these forward for further analysis and validation. Results: We propose a framework for defining patterns of differential expression and develop a novel algorithm, baySeq, which uses an empirical Bayes approach to detect these patterns of differential expression within a set of sequencing samples. The method assumes a negative binomial distribution for the data and derives an empirically determined prior distribution from the entire dataset. We examine the performance of the method on real and simulated data. Conclusions: Our method performs at least as well as, and often better than, existing methods for analyses of pairwise differential expression in both real and simulated data. When we compare methods for the analysis of data from experimental designs involving multiple sample groups, our method again shows substantial gains in performance. We believe that this approach thus represents an important step forward for the analysis of count data from sequencing experiments. PMID:20698981
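
    The negative binomial assumption mentioned above can be made concrete with a small sketch. This is not the baySeq algorithm (which also derives empirical priors and posterior probabilities); it only evaluates the negative binomial log-likelihood of toy counts under two patterns, "one shared mean" versus "one mean per group". The mean/dispersion parameterization, the fixed dispersion value, and the counts are assumptions chosen for illustration.

```python
import math


def nb_logpmf(k, mu, phi):
    """Log-probability of count k under a negative binomial with mean mu > 0
    and dispersion phi > 0, so that variance = mu + phi * mu**2."""
    r = 1.0 / phi          # "size" parameter
    p = r / (r + mu)       # success probability
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))


def loglik(counts, mu, phi):
    """Sum of log-probabilities of independent counts sharing one mean."""
    return sum(nb_logpmf(k, mu, phi) for k in counts)


def mean(values):
    return sum(values) / len(values)


# Toy read counts for one gene in two sample groups (made up).
group_a = [12, 15, 9, 14]
group_b = [48, 55, 60, 51]
phi = 0.1  # assumed dispersion, held fixed for simplicity

pooled = group_a + group_b
ll_same = loglik(pooled, mean(pooled), phi)
ll_diff = loglik(group_a, mean(group_a), phi) + loglik(group_b, mean(group_b), phi)
print(ll_diff > ll_same)  # True: separate means fit these counts better
```

    A full empirical Bayes treatment would weigh such likelihoods against priors estimated from the whole dataset rather than compare them directly.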

  4. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  5. Exome sequencing identifies frequent inactivating mutations in BAP1, ARID1A and PBRM1 in intrahepatic cholangiocarcinomas.

    PubMed

    Jiao, Yuchen; Pawlik, Timothy M; Anders, Robert A; Selaru, Florin M; Streppel, Mirte M; Lucas, Donald J; Niknafs, Noushin; Guthrie, Violeta Beleva; Maitra, Anirban; Argani, Pedram; Offerhaus, G Johan A; Roa, Juan Carlos; Roberts, Lewis R; Gores, Gregory J; Popescu, Irinel; Alexandrescu, Sorin T; Dima, Simona; Fassan, Matteo; Simbolo, Michele; Mafficini, Andrea; Capelli, Paola; Lawlor, Rita T; Ruzzenente, Andrea; Guglielmi, Alfredo; Tortora, Giampaolo; de Braud, Filippo; Scarpa, Aldo; Jarnagin, William; Klimstra, David; Karchin, Rachel; Velculescu, Victor E; Hruban, Ralph H; Vogelstein, Bert; Kinzler, Kenneth W; Papadopoulos, Nickolas; Wood, Laura D

    2013-12-01

    Through exomic sequencing of 32 intrahepatic cholangiocarcinomas, we discovered frequent inactivating mutations in multiple chromatin-remodeling genes (including BAP1, ARID1A and PBRM1), and mutation in one of these genes occurred in almost half of the carcinomas sequenced. We also identified frequent mutations at previously reported hotspots in the IDH1 and IDH2 genes encoding metabolic enzymes in intrahepatic cholangiocarcinomas. In contrast, TP53 was the most frequently altered gene in a series of nine gallbladder carcinomas. These discoveries highlight the key role of dysregulated chromatin remodeling in intrahepatic cholangiocarcinomas. PMID:24185509

  6. Exome sequencing identifies frequent inactivating mutations in BAP1, ARID1A and PBRM1 in intrahepatic cholangiocarcinomas

    PubMed Central

    Selaru, Florin M; Streppel, Mirte M; Lucas, Donald J; Niknafs, Noushin; Guthrie, Violeta Beleva; Maitra, Anirban; Argani, Pedram; Offerhaus, G Johan A; Roa, Juan Carlos; Roberts, Lewis R; Gores, Gregory J; Popescu, Irinel; Alexandrescu, Sorin T; Dima, Simona; Fassan, Matteo; Simbolo, Michele; Mafficini, Andrea; Capelli, Paola; Lawlor, Rita T; Ruzzenente, Andrea; Guglielmi, Alfredo; Tortora, Giampaolo; de Braud, Filippo; Scarpa, Aldo; Jarnagin, William; Klimstra, David; Karchin, Rachel; Velculescu, Victor E; Hruban, Ralph H; Vogelstein, Bert; Kinzler, Kenneth W; Papadopoulos, Nickolas; Wood, Laura D

    2014-01-01

    Through exomic sequencing of 32 intrahepatic cholangiocarcinomas, we discovered frequent inactivating mutations in multiple chromatin-remodeling genes (including BAP1, ARID1A and PBRM1), and mutation in one of these genes occurred in almost half of the carcinomas sequenced. We also identified frequent mutations at previously reported hotspots in the IDH1 and IDH2 genes encoding metabolic enzymes in intrahepatic cholangiocarcinomas. In contrast, TP53 was the most frequently altered gene in a series of nine gallbladder carcinomas. These discoveries highlight the key role of dysregulated chromatin remodeling in intrahepatic cholangiocarcinomas. PMID:24185509

  7. Amino acid sequence of the serine-repeat antigen (SERA) of Plasmodium falciparum determined from cloned cDNA.

    PubMed

    Bzik, D J; Li, W B; Horii, T; Inselburg, J

    1988-09-01

    We report the isolation of cDNA clones for a Plasmodium falciparum gene that encodes the complete amino acid sequence of a previously identified exported blood stage antigen. The Mr of this antigen protein had been determined by sodium dodecylsulphate-polyacrylamide gel electrophoresis analysis, by different workers, to be 113,000, 126,000, and 140,000. We show, by cDNA nucleotide sequence analysis, that this antigen gene encodes a 989 amino acid protein (111 kDa) that contains a potential signal peptide, but not a membrane anchor domain. In the FCR3 strain the serine content of the protein was 11%, of which 57% of the serine residues were localized within a 201 amino acid sequence that included 35 consecutive serine residues. The protein also contained three possible N-linked glycosylation sites and numerous possible O-linked glycosylation sites. The mRNA was abundant during late trophozoite-schizont parasite stages. We propose to identify this antigen, which had been called p126, by the acronym SERA, serine-repeat antigen, based on its complete structure. The usefulness of the cloned cDNA as a source of a possible malaria vaccine is considered in view of the previously demonstrated ability of the antigen to induce parasite-inhibitory antibodies and a protective immune response in Saimiri monkeys. PMID:2847041
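
    The composition figures in this abstract (serine content, a run of consecutive serines) are the kind of statistic that is easy to compute from any protein sequence. The sketch below uses a made-up toy sequence, not the SERA sequence, and the helper names are hypothetical. The last lines only restate the abstract's percentages as approximate residue counts.

```python
def residue_fraction(sequence, residue="S"):
    """Fraction of positions in the sequence occupied by the given residue."""
    return sequence.count(residue) / len(sequence) if sequence else 0.0


def longest_run(sequence, residue="S"):
    """Length of the longest uninterrupted run of the given residue."""
    best = current = 0
    for aa in sequence:
        current = current + 1 if aa == residue else 0
        best = max(best, current)
    return best


# Toy sequence for illustration only (22 residues, 10 of them serine).
toy = "MKSYISLFF" + "S" * 6 + "GNDTSSK"
print(round(residue_fraction(toy), 3), longest_run(toy))

# Approximate counts implied by the abstract's rounded percentages.
approx_serines = round(0.11 * 989)                # about 109 serine residues
approx_in_region = round(0.57 * approx_serines)   # about 62 within the 201-residue region
print(approx_serines, approx_in_region)  # 109 62
```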

  8. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    The subcellular location of a protein is useful information for determining its function, screening for drug candidates, designing vaccines, annotating gene products, and selecting relevant proteins for further study. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used effectively with multiple physicochemical properties, amino acid composition, three part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100, 82.47, and 88.81 for the self-consistency test, jackknife test, and independent data test, respectively. Our results provide evidence that prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminal alone. Considering the features as a distribution within the entire sequence brings out the underlying property distribution in greater detail and enhances the prediction accuracy.
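
    The abstract combines four SVM modules through weighted voting but gives neither the weights nor the exact rule. The sketch below shows one plain form of weighted voting under assumed inputs: each module emits a single label and carries a fixed weight. Module names, labels, and weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest summed weight.

    predictions maps module name -> predicted label;
    weights maps module name -> non-negative weight.
    Ties are broken alphabetically by label.
    """
    scores = defaultdict(float)
    for module, label in predictions.items():
        scores[label] += weights[module]
    return max(sorted(scores), key=scores.get)


# Hypothetical module outputs and weights, for illustration only.
predictions = {
    "atomic_composition": "nucleus",
    "physicochemical": "cytoplasm",
    "aa_composition": "nucleus",
    "similarity": "cytoplasm",
}
weights = {
    "atomic_composition": 0.2,
    "physicochemical": 0.3,
    "aa_composition": 0.2,
    "similarity": 0.4,
}
print(weighted_vote(predictions, weights))  # cytoplasm
```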

  9. Peptide Array on Cellulose Support—A Screening Tool to Identify Peptides with Dipeptidyl-Peptidase IV Inhibitory Activity within the Sequence of α-Lactalbumin

    PubMed Central

    Lacroix, Isabelle M. E.; Li-Chan, Eunice C. Y.

    2014-01-01

    The inhibition of the enzyme dipeptidyl-peptidase IV (DPP-IV) is an effective pharmacotherapeutic approach for the management of type 2 diabetes. Recent findings have suggested that dietary proteins, including bovine α-lactalbumin, could be precursors of peptides able to inhibit DPP-IV. However, information on the location of active peptide sequences within the proteins is far from being comprehensive. Moreover, the traditional approach to identify bioactive peptides from foods can be tedious and long. Therefore, the objective of this study was to use peptide arrays to screen α-lactalbumin-derived peptides for their interaction with DPP-IV. Deca-peptides spanning the entire α-lactalbumin sequence, with a frame shift of 1 amino acid between successive sequences, were synthesized on cellulose membranes using “SPOT” technology, and their binding to and inhibition of DPP-IV was studied. Among the 114 α-lactalbumin-derived decamers investigated, the peptides 60WCKDDQNPHS69 (αKi = 76 µM), 105LAHKALCSEK114 (Ki = 217 µM) and 110LCSEKLDQWL119 (Ki = 217 µM) were among the strongest DPP-IV inhibitors. While the SPOT- and traditionally-synthesized peptides showed consistent trends in DPP-IV inhibitory activity, the cellulose-bound peptides’ binding behavior was not correlated to their ability to inhibit the enzyme. This research showed, for the first time, that peptide arrays are useful screening tools to identify DPP-IV inhibitory peptides from dietary proteins. PMID:25402645
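
    The peptide-array design described above (decamers covering the whole sequence, shifted by one residue each time) amounts to a sliding window. A minimal sketch follows; the function name is hypothetical and a placeholder string stands in for the real α-lactalbumin sequence. A window of 10 with a shift of 1 yields 114 peptides exactly when the sequence is 123 residues long, so a 123-residue placeholder is used.

```python
def overlapping_peptides(sequence, length=10, step=1):
    """Return (start, end, peptide) tuples with 1-based inclusive positions."""
    return [(i + 1, i + length, sequence[i:i + length])
            for i in range(0, len(sequence) - length + 1, step)]


# Placeholder of 123 residues; NOT the real protein sequence.
placeholder = "A" * 123
windows = overlapping_peptides(placeholder)

print(len(windows))                         # 114
print(windows[0][:2], windows[-1][:2])      # (1, 10) (114, 123)
print(windows[104][:2], windows[109][:2])   # (105, 114) (110, 119)
```

    The start and end positions follow the same labelling used in the abstract for peptides such as 105LAHKALCSEK114.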

  10. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences

    PubMed Central

    2012-01-01

    Background: Tandem repetitions within protein amino acid sequences often correspond to regular secondary structures and form multi-repeat 3D assemblies of varied size and function. Developing internal repetitions is one of the evolutionary mechanisms that proteins employ to adapt their structure and function under evolutionary pressure. While there is keen interest in understanding such phenomena, detection of repeating structures based only on sequence analysis is considered an arduous task, since structure and function are often preserved even under considerable sequence divergence (fuzzy tandem repeats). Results: In this paper we present PTRStalker, a new algorithm for ab initio detection of fuzzy tandem repeats in protein amino acid sequences. In the reported results we show that by feeding PTRStalker with amino acid sequences from the UniProtKB/Swiss-Prot database we detect novel tandemly repeated structures not captured by other state-of-the-art tools. Experiments with membrane proteins indicate that PTRStalker can detect global symmetries in the primary structure which are then reflected in the tertiary structure. Conclusions: PTRStalker is able to detect fuzzy tandem repeating structures in protein sequences, with performance beyond the current state of the art. Such a tool may be a valuable support for investigating protein structural properties when tertiary X-ray data are not available. PMID:22536906

  11. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased interest in microbial taxonomy research. Phylogenetic sequence analyses have contributed significantly to advances in this field, particularly in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be based on protein-encoding nucleotide sequences or on the encoded amino acid sequences: the two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that integrates gene and protein information to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous field whose power has already been demonstrated in other pattern recognition contexts. The proposed approach can be applied to distance matrix-based phylogenetic techniques (such as neighbor joining), resulting in a simple and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011
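
    The abstract does not state which fusion rule was applied to the nucleotide and amino acid distance matrices. As an illustration of the general idea only, the sketch below assumes the simplest scheme from the multiclassifier literature, a mean rule: each matrix is rescaled to a common range and the two are averaged entry by entry. The matrices are made up, and the fused matrix would then be passed to a distance-based tree builder such as neighbor joining (not shown).

```python
def normalize(matrix):
    """Scale a distance matrix so that its largest entry equals 1."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return [row[:] for row in matrix]
    return [[d / largest for d in row] for row in matrix]


def fuse_mean(nucleotide_dist, amino_acid_dist):
    """Average two same-shaped distance matrices after normalization."""
    n, a = normalize(nucleotide_dist), normalize(amino_acid_dist)
    return [[(x + y) / 2 for x, y in zip(row_n, row_a)]
            for row_n, row_a in zip(n, a)]


# Toy pairwise distances among three taxa (made up).
nuc = [[0.0, 0.2, 0.4],
       [0.2, 0.0, 0.3],
       [0.4, 0.3, 0.0]]
aa = [[0.0, 0.10, 0.10],
      [0.10, 0.0, 0.05],
      [0.10, 0.05, 0.0]]

fused = fuse_mean(nuc, aa)
for row in fused:
    print([round(d, 3) for d in row])
```

    Normalizing first keeps the matrix with the larger raw distances from dominating the average; other basic rules (product, minimum, maximum) would slot into the same structure.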

  12. The amino-acid sequence of leghemoglobin component a from Phaseolus vulgaris (kidney bean).

    PubMed

    Lehtovaara, P; Ellfolk, N

    1975-06-01

    1. Leghemoglobin component a from Phaseolus vulgaris (kidney bean) was digested with trypsin; 15 tryptic peptides and free lysine were purified and the amino acid sequences of the peptides determined. 2. The internal order of the tryptic peptides was determined by the bridge peptides obtained from the thermolytic digest and the dilute acid hydrolyzate of kidney bean leghemoglobin a; 12 thermolytic peptides and two acid hydrolysis peptides were purified and the sequences were partially or completely determined. 3. The complete amino acid sequence of kidney bean leghemoglobin a is compared to that of leghemoglobin a from soybean (Glycine max) and to some animal globins. As regards sequence, the kidney bean globin has 79% identity with the soybean globin and 21% identity with human hemoglobin gamma-chain. Seven of the 14 amino acid residues common to most globins are found in the kidney bean globin. Trp-15 and Tyr-145 are evolutionarily conserved in this globin, which confirms the concept of a common origin of animal and plant globins. PMID:809270
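
    The study above digested the protein with trypsin in the laboratory and later compared sequences by percent identity. Both steps have simple in silico counterparts, sketched here under stated assumptions: the commonly used simplified trypsin rule (cleave after K or R unless the next residue is P) and identity counted over two already aligned, equal-length strings. The sequences are made up and are not leghemoglobin.

```python
def tryptic_digest(sequence):
    """Split a sequence after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if aa in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned columns holding the same residue (gaps never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


print(tryptic_digest("GAFKKSWRPLAKDE"))   # ['GAFK', 'K', 'SWRPLAK', 'DE']
print(percent_identity("GAFKK", "GALKR"))  # 60.0
```

    Two adjacent lysines release a single free lysine in the toy digest, which mirrors the "free lysine" reported alongside the tryptic peptides.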

  13. Application of combined mass spectrometry and partial amino acid sequence to the identification of gel-separated proteins.

    PubMed

    Patterson, S D; Thomas, D; Bradshaw, R A

    1996-05-01

    The combined use of peptide mass information with amino acid sequence information derived by chemical sequencing or mass spectrometry (MS)-based approaches provides a powerful means of protein identification. We have used a two-part strategy to identify proteins from nerve growth factor (NGF)-stimulated rat adrenal pheochromocytoma cell line PC-12 cell lysates that associate with the adaptor protein Shc (Shc homologous and collagen protein). Initial experiments with metabolically radiolabeled cell extracts separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) revealed a number of proteins that coimmunoprecipitated with anti-Shc antibody compared with control (unstimulated) cell extracts. The experiment was scaled up and cell lysate from NGF-stimulated PC-12 cells was applied to a glutathione-S-transferase (GST)-Shc affinity column, eluted, separated by SDS-PAGE and blotted to Immobilon-CD. The blotted proteins were proteolytically digested in situ, and the masses obtained from the extracted peptides were used in a peptide-mass search program in an attempt to identify the protein. Even if a strong candidate was found using this search, an additional step was performed to confirm the identification. The mixtures were fractionated by reversed-phase high-performance liquid chromatography (RP-HPLC) and subjected to chemical sequencing to obtain (partial) sequence information, or post-source decay (PSD-) matrix-assisted laser-desorption ionization (MALDI)-MS to obtain sequence-specific fragment ions. This data was used in a peptide-sequence tag search to confirm the identity of the proteins. This combined approach allowed identification of four proteins of M(r) 43,000 to 200,000. In one case the identified protein clearly did not correspond to the radiolabeled band, but to a protein contaminant from the column. The advantages and pitfalls of the approach are discussed. PMID:8783013

  14. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data have been deposited at DDBJ/EMBL/GenBank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  15. Disease-Targeted Sequencing of Ion Channel Genes identifies de novo mutations in Patients with Non-Familial Brugada Syndrome

    PubMed Central

    Juang, Jyh-Ming Jimmy; Lu, Tzu-Pin; Lai, Liang-Chuan; Ho, Chia-Chuan; Liu, Yen-Bin; Tsai, Chia-Ti; Lin, Lian-Yu; Yu, Chih-Chieh; Chen, Wen-Jone; Chiang, Fu-Tien; Yeh, Shih-Fan Sherri; Lai, Ling-Ping; Chuang, Eric Y.; Lin, Jiunn-Lee

    2014-01-01

    Brugada syndrome (BrS) is one of the ion channelopathies associated with sudden cardiac death (SCD). The most common BrS-associated gene (SCN5A) only accounts for approximately 20–25% of BrS patients. This study aims to identify novel mutations across human ion channels in non-familial BrS patients without SCN5A variants through disease-targeted sequencing. We performed disease-targeted multi-gene sequencing across 133 human ion channel genes and 12 reported BrS-associated genes in 15 unrelated, non-familial BrS patients without SCN5A variants. Candidate variants were validated by mass spectrometry and Sanger sequencing. Five de novo mutations were identified in four genes (SCNN1A, KCNJ16, KCNB2, and KCNT1) in three BrS patients (20%). Two of the three patients presented with SCD and one had syncope. Interestingly, the two patients who presented with SCD had compound mutations (SCNN1A:Arg350Gln and KCNB2:Glu522Lys; SCNN1A:Arg597* and KCNJ16:Ser261Gly). Importantly, two SCNN1A mutations were identified from different families. The KCNT1:Arg1106Gln mutation was identified in a patient with syncope. Bioinformatics algorithms predicted severe functional interruptions in these four mutation loci, suggesting their pivotal roles in BrS. This study identified four novel BrS-associated genes and indicated the effectiveness of this disease-targeted sequencing across ion channel genes for non-familial BrS patients without SCN5A variants. PMID:25339316

  16. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104
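
    The monospecific/polyspecific distinction defined above (one EC number per family versus at least two) is easy to express in code, and the abstract's counts can be checked for internal consistency. In this sketch the family labels and their EC assignments are hypothetical; only the totals in the final lines come from the abstract.

```python
def classify_families(family_to_ec):
    """Split families into monospecific (one EC number) and polyspecific (two or more)."""
    mono, poly = [], []
    for family, ec_numbers in sorted(family_to_ec.items()):
        distinct = set(ec_numbers)
        if len(distinct) == 1:
            mono.append(family)
        elif len(distinct) >= 2:
            poly.append(family)
    return mono, poly


# Hypothetical family labels and assignments, for illustration only.
toy_families = {
    "F1": ["3.2.1.4"],
    "F2": ["3.2.1.4", "3.2.1.91"],
    "F3": ["3.2.1.1", "3.2.1.1"],
}
print(classify_families(toy_families))  # (['F1', 'F3'], ['F2'])

# Consistency of the figures quoted in the abstract.
classified, unclassified, total = 291, 10, 301
print(classified + unclassified == total)   # True
print(18 + 17 == 35)                        # True: mono + poly families
print(round(100 * unclassified / total, 1)) # 3.3, i.e. less than 5%
```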

  17. Sequence-specific purification of nucleic acids by PNA-controlled hybrid selection.

    PubMed

    Orum, H; Nielsen, P E; Jørgensen, M; Larsson, C; Stanley, C; Koch, T

    1995-09-01

    Using an oligohistidine peptide nucleic acid (oligohistidine-PNA) chimera, we have developed a rapid hybrid selection method that allows efficient, sequence-specific purification of a target nucleic acid. The method exploits two fundamental features of PNA. First, PNA binds with high affinity and specificity to its complementary nucleic acid. Second, amino acids are easily attached to the PNA oligomer during synthesis. We show that a (His)6-PNA chimera exhibits strong binding to chelated Ni2+ ions without compromising its native PNA hybridization properties. We further show that these characteristics allow the (His)6-PNA/DNA complex to be purified by the well-established method of metal ion affinity chromatography using a Ni(2+)-NTA (nitrilotriacetic acid) resin. Specificity and efficiency are the touchstones of any nucleic acid purification scheme. We show that the specificity of the (His)6-PNA selection approach is such that oligonucleotides differing by only a single nucleotide can be selectively purified. We also show that large RNAs (2224 nucleotides) can be captured with high efficiency by using multiple (His)6-PNA probes. PNA can hybridize to nucleic acids in low-salt concentrations that destabilize native nucleic acid structures. We demonstrate that this property of PNA can be utilized to purify an oligonucleotide in which the target sequence forms part of an intramolecular stem/loop structure. PMID:7495562

  18. Characterization of microsatellites identified by next-generation sequencing in the Neotropical tree Handroanthus billbergii (Bignoniaceae)

    PubMed Central

    Morillo, Eduardo; Buitron, Johanna; Limongi, Ricardo; Vignes, Helene; Argout, Xavier

    2016-01-01

    Premise of the study: We developed microsatellite (simple sequence repeat [SSR]) markers in the Neotropical tree Handroanthus billbergii (Bignoniaceae), to be applied in assessment of genetic diversity in this species as a reference for inferring the impact of dry forest fragmentation in Ecuador. Methods and Results: Using next-generation sequencing, we detected a total of 26,893 putative SSRs reported here. Using an ABI 3500xl sequencer, we identified and characterized a set of polymorphic markers in 23 individuals belonging to three populations of H. billbergii. Conclusions: We report a set of 30 useful SSR markers for H. billbergii and a large list of potential microsatellites for developing new markers for this or related species. PMID:27213123

  19. Draft Genome Sequences of Two Species of "Difficult-to-Identify" Human-Pathogenic Corynebacteria: Implications for Better Identification Tests.

    PubMed

    Pacheco, Luis G C; Mattos-Guaraldi, Ana L; Santos, Carolina S; Veras, Adonney A O; Guimarães, Luis C; Abreu, Vinícius; Pereira, Felipe L; Soares, Siomar C; Dorella, Fernanda A; Carvalho, Alex F; Leal, Carlos G; Figueiredo, Henrique C P; Ramos, Juliana N; Vieira, Veronica V; Farfour, Eric; Guiso, Nicole; Hirata, Raphael; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

    2015-01-01

    Non-diphtheriae Corynebacterium species have been increasingly recognized as the causative agents of infections in humans. Differential identification of these bacteria in the clinical microbiology laboratory by the most commonly used biochemical tests is challenging, and normally requires additional molecular methods. Herein, we present the annotated draft genome sequences of two isolates of "difficult-to-identify" human-pathogenic corynebacterial species: C. xerosis and C. minutissimum. The genome sequences of ca. 2.7 Mbp, with a mean number of 2,580 protein encoding genes, were also compared with the publicly available genome sequences of strains of C. amycolatum and C. striatum. These results will aid the exploration of novel biochemical reactions to improve existing identification tests as well as the development of more accurate molecular identification methods through detection of species-specific target genes for isolate identification or drug susceptibility profiling. PMID:26516374

  20. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides a significantly better description of the somatic evolution of mouse and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with the most frequent amino acid exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models. We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in

  1. Use of nucleotide sequence data to identify a microsporidian pathogen of Pieris rapae (Lepidoptera, Pieridae).

    PubMed

    Malone, L A; McIvor, C A

    1996-11-01

    Nucleotide sequence was determined for a portion of genomic DNA which spans the V4 variable region of the small subunit ribosomal RNA gene of an unidentified microsporidium from the cabbage white butterfly, Pieris rapae (174 base pairs). Comparison with equivalent sequence data obtained here for two other microsporidian species, Nosema bombycis (240 base pairs) and Nosema bombi (200 base pairs), and from the GenBank database for 11 other microsporidian species suggests that the unidentified species from P. rapae is most closely related to some Vairimorpha species. Light and electron microscopic observations of the developmental stages of this parasite were in accord with this. Infection experiments conducted at 20 and 26 degrees C demonstrated temperature-dependent dimorphism, with the production of both binucleate free spores (mean dimensions: 3.8 x 1.8 microns; 10-13 polar filament coils) and membrane-bound uninucleate octospores (mean dimensions: 3.1 x 1.9 microns). Macrospores (mean dimensions 8.0 x 2.1 microns) were also observed. Sites of infection were the gut epithelium, the Malpighian tubules, the salivary glands, and the fat body. Infections were found in all insect life stages, including the egg. This microsporidium was found to be indistinguishable from both Nosema mesnili (Paillot) and Microsporidium (Thelohania) mesnili (Paillot) and we propose that these species be combined and transferred to the genus Vairimorpha Pilley. PMID:8931362

  2. No increase in bleeding identified in type 1 VWD subjects with D1472H sequence variation.

    PubMed

    Flood, Veronica H; Friedman, Kenneth D; Gill, Joan Cox; Haberichter, Sandra L; Christopherson, Pamela A; Branchford, Brian R; Hoffmann, Raymond G; Abshire, Thomas C; Dunn, Amy L; Di Paola, Jorge A; Hoots, W Keith; Brown, Deborah L; Leissinger, Cindy; Lusher, Jeanne M; Ragni, Margaret V; Shapiro, Amy D; Montgomery, Robert R

    2013-05-01

    The diagnosis of von Willebrand disease (VWD) is complicated by issues with current laboratory testing, particularly the ristocetin cofactor activity assay (VWF:RCo). We have recently reported a sequence variation in the von Willebrand factor (VWF) A1 domain, p.D1472H (D1472H), associated with a decrease in the VWF:RCo/VWF antigen (VWF:Ag) ratio but not associated with bleeding in healthy control subjects. This report expands the previous study to include subjects with symptoms leading to the diagnosis of type 1 VWD. Type 1 VWD subjects with D1472H had a significant decrease in the VWF:RCo/VWF:Ag ratio compared with those without D1472H, similar to the findings in the healthy control population. No increase in bleeding score was observed, however, for VWD subjects with D1472H compared with those without D1472H. These results suggest that the presence of the D1472H sequence variation is not associated with a significant increase in bleeding symptoms, even in type 1 VWD subjects. PMID:23520336

  3. Exome sequencing in pooled DNA samples to identify maternal pre-eclampsia risk variants

    PubMed Central

    Kaartokallio, Tea; Wang, Jingwen; Heinonen, Seppo; Kajantie, Eero; Kivinen, Katja; Pouta, Anneli; Gerdhem, Paul; Jiao, Hong; Kere, Juha; Laivuori, Hannele

    2016-01-01

    Pre-eclampsia is a common pregnancy disorder that is a major cause for maternal and perinatal mortality and morbidity. Variants predisposing to pre-eclampsia might be under negative evolutionary selection that is likely to keep their population frequencies low. We exome sequenced samples from a hundred Finnish pre-eclamptic women in pools of ten to screen for low-frequency, large-effect risk variants for pre-eclampsia. After filtering and additional genotyping steps, we selected 28 low-frequency missense, nonsense and splice site variants that were enriched in the pre-eclampsia pools compared to reference data, and genotyped the variants in 1353 pre-eclamptic and 699 non-pre-eclamptic women to test the association of them with pre-eclampsia and quantitative traits relevant for the disease. Genotypes from the SISu project (n = 6118 exome sequenced Finnish samples) were included in the binary trait association analysis as a population reference to increase statistical power. In these analyses, none of the variants tested reached genome-wide significance. In conclusion, the genetic risk for pre-eclampsia is likely complex even in a population isolate like Finland, and larger sample sizes will be necessary to detect risk variants. PMID:27384325

  4. Identifying Selection in the Within-Host Evolution of Influenza Using Viral Sequence Data

    PubMed Central

    Illingworth, Christopher J. R.; Fischer, Andrej; Mustonen, Ville

    2014-01-01

    The within-host evolution of influenza is a vital component of its epidemiology. A question of particular interest is the role that selection plays in shaping the viral population over the course of a single infection. We here describe a method to measure selection acting upon the influenza virus within an individual host, based upon time-resolved genome sequence data from an infection. Analysing sequence data from a transmission study conducted in pigs, describing part of the haemagglutinin gene (HA1) of an influenza virus, we find signatures of non-neutrality in six of a total of sixteen infections. We find evidence for both positive and negative selection acting upon specific alleles, while in three cases, the data suggest the presence of time-dependent selection. In one infection we observe what is potentially a specific immune response against the virus; a non-synonymous mutation in an epitope region of the virus is found to be under initially positive, then strongly negative selection. Crucially, given the lack of homologous recombination in influenza, our method accounts for linkage disequilibrium between nucleotides at different positions in the haemagglutinin gene, allowing for the analysis of populations in which multiple mutations are present at any given time. Our approach offers a new insight into the dynamics of influenza infection, providing a detailed characterisation of the forces that underlie viral evolution. PMID:25080215

  5. Exome sequencing in pooled DNA samples to identify maternal pre-eclampsia risk variants.

    PubMed

    Kaartokallio, Tea; Wang, Jingwen; Heinonen, Seppo; Kajantie, Eero; Kivinen, Katja; Pouta, Anneli; Gerdhem, Paul; Jiao, Hong; Kere, Juha; Laivuori, Hannele

    2016-01-01

    Pre-eclampsia is a common pregnancy disorder that is a major cause of maternal and perinatal mortality and morbidity. Variants predisposing to pre-eclampsia might be under negative evolutionary selection that is likely to keep their population frequencies low. We exome sequenced samples from a hundred Finnish pre-eclamptic women in pools of ten to screen for low-frequency, large-effect risk variants for pre-eclampsia. After filtering and additional genotyping steps, we selected 28 low-frequency missense, nonsense and splice site variants that were enriched in the pre-eclampsia pools compared to reference data, and genotyped the variants in 1353 pre-eclamptic and 699 non-pre-eclamptic women to test their association with pre-eclampsia and quantitative traits relevant for the disease. Genotypes from the SISu project (n = 6118 exome sequenced Finnish samples) were included in the binary trait association analysis as a population reference to increase statistical power. In these analyses, none of the variants tested reached genome-wide significance. In conclusion, the genetic risk for pre-eclampsia is likely complex even in a population isolate like Finland, and larger sample sizes will be necessary to detect risk variants. PMID:27384325

  6. Next generation exome sequencing of paediatric inflammatory bowel disease patients identifies rare and novel variants in candidate genes

    PubMed Central

    Christodoulou, Katja; Wiskin, Anthony E; Gibson, Jane; Tapper, William; Willis, Claire; Afzal, Nadeem A; Upstill-Goddard, Rosanna; Holloway, John W; Simpson, Michael A; Beattie, R Mark; Collins, Andrew

    2013-01-01

    Background Multiple genes have been implicated by association studies in altering inflammatory bowel disease (IBD) predisposition. Paediatric patients often manifest more extensive disease and a particularly severe disease course. It is likely that genetic predisposition plays a more substantial role in this group. Objective To identify the spectrum of rare and novel variation in known IBD susceptibility genes using exome sequencing analysis in eight individual cases of childhood onset severe disease. Design DNA samples from the eight patients underwent targeted exome capture and sequencing. Data were processed through an analytical pipeline to align sequence reads, conduct quality checks, and identify and annotate variants where patient sequence differed from the reference sequence. For each patient, the entire complement of rare variation within strongly associated candidate genes was catalogued. Results Across the panel of 169 known IBD susceptibility genes, approximately 300 variants in 104 genes were found. Excluding splicing and HLA-class variants, 58 variants across 39 of these genes were classified as rare, with an alternative allele frequency of <5%, of which 17 were novel. Only two patients with early onset Crohn's disease exhibited rare deleterious variations within NOD2: the previously described R702W variant was the sole NOD2 variant in one patient, while the second patient also carried the L1007 frameshift insertion. Both patients harboured other potentially damaging mutations in the GSDMB, ERAP2 and SEC16A genes. The two patients severely affected with ulcerative colitis exhibited a distinct profile: both carried potentially detrimental variation in the BACH2 and IL10 genes not seen in other patients. Conclusion For each of the eight individuals studied, all non-synonymous, truncating and frameshift mutations across all known IBD genes were identified. A unique profile of rare and potentially damaging variants was evident for each patient with this
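
    The rare/novel classification described in the abstract above (alternative allele frequency below 5%, with novel variants counted among the rare ones) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Variant` record, the convention that a missing frequency means "novel", and the gene names are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional

RARE_AF_THRESHOLD = 0.05  # "rare" = alternative allele frequency < 5%, as stated in the abstract


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    # Assumption: None means the variant is absent from the reference database ("novel").
    alt_allele_freq: Optional[float]


def is_novel(v: Variant) -> bool:
    return v.alt_allele_freq is None


def is_rare(v: Variant) -> bool:
    # Novel variants are counted among the rare ones ("58 rare ... of which 17 were novel").
    return is_novel(v) or v.alt_allele_freq < RARE_AF_THRESHOLD


def summarize(variants):
    """Count rare and novel variants and list the genes carrying rare variants."""
    rare = [v for v in variants if is_rare(v)]
    novel = [v for v in rare if is_novel(v)]
    genes = sorted({v.gene for v in rare})
    return {"rare": len(rare), "novel": len(novel), "genes": genes}


# Made-up illustration data; gene names and frequencies are placeholders, not study results.
example = [
    Variant("GENE_A", "missense_1", 0.04),
    Variant("GENE_A", "missense_2", 0.25),   # common, filtered out
    Variant("GENE_B", "frameshift_1", None), # novel
    Variant("GENE_C", "nonsense_1", 0.01),
]
summary = summarize(example)
print(summary)
```

    Treating a missing database frequency as "novel" is one possible convention; a real pipeline would also need the splicing and HLA-class exclusions mentioned in the abstract.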

  7. Identifying Likely Transmission Pathways within a 10-Year Community Outbreak of Tuberculosis by High-Depth Whole Genome Sequencing

    PubMed Central

    Sadsad, Rosemarie; Martinez, Elena; Jelfs, Peter; Hill-Cawthorne, Grant A.; Gilbert, Gwendolyn L.; Marais, Ben J.; Sintchenko, Vitali

    2016-01-01

    Background Improved tuberculosis control and the need to contain the spread of drug-resistant strains provide a strong rationale for exploring tuberculosis transmission dynamics at the population level. Whole-genome sequencing provides optimal strain resolution, facilitating detailed mapping of potential transmission pathways. Methods We sequenced 22 isolates from a Mycobacterium tuberculosis cluster in New South Wales, Australia, identified during routine 24-locus mycobacterial interspersed repetitive unit typing. Following high-depth paired-end sequencing using the Illumina HiSeq 2000 platform, two independent pipelines were employed for analysis, both employing read mapping onto reference genomes as well as de novo assembly, to control biases in variant detection. In addition to single-nucleotide polymorphisms, the analyses also sought to identify insertions, deletions and structural variants. Results Isolates were highly similar, with a distance of 13 variants between the most distant members of the cluster. The most sensitive analysis classified the 22 isolates into 18 groups. Four of the isolates did not appear to share a recent common ancestor with the largest clade; another four isolates had an uncertain ancestral relationship with the largest clade. Conclusion Whole genome sequencing, with analysis of single-nucleotide polymorphisms, insertions, deletions, structural variants and subpopulations, enabled the highest possible level of discrimination between cluster members, clarifying likely transmission pathways and exposing the complexity of strain origin. The analysis provides a basis for targeted public health intervention and enhanced classification of future isolates linked to the cluster. PMID:26938641
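
    The idea of a variant distance between isolates, and of grouping isolates by profile, can be sketched in a few lines of Python. This is a simplified illustration and not either of the two pipelines used in the study: it assumes each isolate is reduced to a set of variant calls against one reference, defines distance as the number of calls present in only one of two isolates, and treats a "group" as a set of isolates with identical profiles. The isolate names and variants are invented.

```python
from itertools import combinations


def variant_distance(a: frozenset, b: frozenset) -> int:
    """Number of variant calls present in exactly one of the two isolates."""
    return len(a ^ b)


def max_pairwise_distance(isolates: dict) -> tuple:
    """Return (distance, (name1, name2)) for the most distant pair of isolates."""
    return max(
        (variant_distance(isolates[x], isolates[y]), (x, y))
        for x, y in combinations(sorted(isolates), 2)
    )


def group_identical(isolates: dict) -> list:
    """Group isolates whose variant profiles are identical (distance 0)."""
    groups = {}
    for name, profile in isolates.items():
        groups.setdefault(profile, []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Made-up profiles: each variant is a (position, alternate base) pair relative to a reference.
isolates = {
    "iso1": frozenset({(101, "A"), (250, "T")}),
    "iso2": frozenset({(101, "A"), (250, "T")}),
    "iso3": frozenset({(101, "A"), (900, "G"), (1200, "C")}),
}
distance, pair = max_pairwise_distance(isolates)
groups = group_identical(isolates)
print(distance, pair, groups)
```

    Real analyses would also have to handle indels, structural variants and mixed subpopulations, which a plain set difference does not capture.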

  8. Phenotypic chemical screening using a zebrafish neural crest EMT reporter identifies retinoic acid as an inhibitor of epithelial morphogenesis.

    PubMed

    Jimenez, Laura; Wang, Jindong; Morrison, Monique A; Whatcott, Clifford; Soh, Katherine K; Warner, Steven; Bearss, David; Jette, Cicely A; Stewart, Rodney A

    2016-04-01

    The epithelial-to-mesenchymal transition (EMT) is a highly conserved morphogenetic program essential for embryogenesis, regeneration and cancer metastasis. In cancer cells, EMT also triggers cellular reprogramming and chemoresistance, which underlie disease relapse and decreased survival. Hence, identifying compounds that block EMT is essential to prevent or eradicate disseminated tumor cells. Here, we establish a whole-animal-based EMT reporter in zebrafish for rapid drug screening, called Tg(snai1b:GFP), which labels epithelial cells undergoing EMT to produce sox10-positive neural crest (NC) cells. Time-lapse and lineage analysis of Tg(snai1b:GFP) embryos reveal that cranial NC cells delaminate from two regions: an early population delaminates adjacent to the neural plate, whereas a later population delaminates from within the dorsal neural tube. Treating Tg(snai1b:GFP) embryos with candidate small-molecule EMT-inhibiting compounds identified TP-0903, a multi-kinase inhibitor that blocked cranial NC cell delamination in both the lateral and medial populations. RNA sequencing (RNA-Seq) analysis and chemical rescue experiments show that TP-0903 acts through stimulating retinoic acid (RA) biosynthesis and RA-dependent transcription. These studies identify TP-0903 as a new therapeutic for activating RA in vivo and raise the possibility that RA-dependent inhibition of EMT contributes to its prior success in eliminating disseminated cancer cells. PMID:26794130

  9. Phenotypic chemical screening using a zebrafish neural crest EMT reporter identifies retinoic acid as an inhibitor of epithelial morphogenesis

    PubMed Central

    Jimenez, Laura; Wang, Jindong; Morrison, Monique A.; Whatcott, Clifford; Soh, Katherine K.; Warner, Steven; Bearss, David; Jette, Cicely A.; Stewart, Rodney A.

    2016-01-01

    The epithelial-to-mesenchymal transition (EMT) is a highly conserved morphogenetic program essential for embryogenesis, regeneration and cancer metastasis. In cancer cells, EMT also triggers cellular reprogramming and chemoresistance, which underlie disease relapse and decreased survival. Hence, identifying compounds that block EMT is essential to prevent or eradicate disseminated tumor cells. Here, we establish a whole-animal-based EMT reporter in zebrafish for rapid drug screening, called Tg(snai1b:GFP), which labels epithelial cells undergoing EMT to produce sox10-positive neural crest (NC) cells. Time-lapse and lineage analysis of Tg(snai1b:GFP) embryos reveal that cranial NC cells delaminate from two regions: an early population delaminates adjacent to the neural plate, whereas a later population delaminates from within the dorsal neural tube. Treating Tg(snai1b:GFP) embryos with candidate small-molecule EMT-inhibiting compounds identified TP-0903, a multi-kinase inhibitor that blocked cranial NC cell delamination in both the lateral and medial populations. RNA sequencing (RNA-Seq) analysis and chemical rescue experiments show that TP-0903 acts through stimulating retinoic acid (RA) biosynthesis and RA-dependent transcription. These studies identify TP-0903 as a new therapeutic for activating RA in vivo and raise the possibility that RA-dependent inhibition of EMT contributes to its prior success in eliminating disseminated cancer cells. PMID:26794130

  10. Two Different High Throughput Sequencing Approaches Identify Thousands of De Novo Genomic Markers for the Genetically Depleted Bornean Elephant

    PubMed Central

    Sharma, Reeta; Goossens, Benoit; Kun-Rodrigues, Célia; Teixeira, Tatiana; Othman, Nurzhafarina; Boone, Jason Q.; Jue, Nathaniel K.; Obergfell, Craig; O'Neill, Rachel J.; Chikhi, Lounès

    2012-01-01

    High throughput sequencing technologies are being applied to an increasing number of model species with a high-quality reference genome. The application and analyses of whole-genome sequence data in non-model species with no prior genomic information are currently under way. Recent sequencing technologies provide new opportunities for gathering genomic data in natural populations, laying the empirical foundation for future research in the field of conservation and population genomics. Here we present the case study of the Bornean elephant, which is the most endangered subspecies of Asian elephant and exhibits very low genetic diversity. We used two different sequencing platforms, the Roche 454 FLX (shotgun) and Illumina GAIIx (restriction site associated DNA, RAD), to evaluate the feasibility of the two methodologies for the discovery of de novo markers (single nucleotide polymorphisms, SNPs, and microsatellites) using low coverage data. Approximately 6,683 (shotgun) and 14,724 (RAD) SNPs were detected within our elephant sequence dataset. Genotyping of a representative sample of 194 SNPs resulted in a SNP validation rate of ∼83 to 94%, and 17% of the loci were polymorphic with a low diversity (Ho = 0.057). Different numbers of microsatellites were identified through shotgun (27,226) and RAD (868) techniques. Out of all di-, tri-, and tetra-microsatellite loci, 1,706 loci had sufficient flanking regions (shotgun) while only 7 were found with RAD. All microsatellites were monomorphic in the Bornean but polymorphic in another elephant subspecies. Despite using different sample sizes, and the well known differences in the two platforms used regarding sequence length and throughput, the two approaches showed a high validation rate. The approaches used here for marker development in a threatened species demonstrate the utility of high throughput sequencing technologies as a starting point for the development of genomic tools in a non-model species and in particular

  11. Targeted Next-generation Sequencing of Advanced Prostate Cancer Identifies Potential Therapeutic Targets and Disease Heterogeneity

    PubMed Central

    Beltran, Himisha; Yelensky, Roman; Frampton, Garrett M.; Park, Kyung; Downing, Sean R.; MacDonald, Theresa Y.; Jarosz, Mirna; Lipson, Doron; Tagawa, Scott T.; Nanus, David M.; Stephens, Philip J.; Mosquera, Juan Miguel; Cronin, Maureen T.; Rubin, Mark A.

    2012-01-01

    Background Most personalized cancer care strategies involving DNA sequencing are highly reliant on acquiring sufficient fresh or frozen tissue. It has been challenging to comprehensively evaluate the genome of advanced prostate cancer (PCa) because of limited access to metastatic tissue. Objective To demonstrate the feasibility of a novel next-generation sequencing (NGS) based platform that can be used with archival formalin-fixed paraffin-embedded (FFPE) biopsy tissue to evaluate the spectrum of DNA alterations seen in advanced PCa. Design, setting, and participants FFPE samples (including archival prostatectomies and prostate needle biopsies) were obtained from 45 patients representing the spectrum of disease: localized PCa, metastatic hormone-naive PCa, and metastatic castration-resistant PCa (CRPC). We also assessed paired primaries and metastases to understand disease heterogeneity and disease progression. Intervention At least 50 ng of tumor DNA was extracted from FFPE samples and used for hybridization capture and NGS using the Illumina HiSeq 2000 platform. Outcome measurements and statistical analysis A total of 3320 exons of 182 cancer-associated genes and 37 introns of 14 commonly rearranged genes were evaluated for genomic alterations. Results and limitations We obtained an average sequencing depth of >900X. Overall, 44% of CRPCs harbored genomic alterations involving the androgen receptor gene (AR), including AR copy number gain (24% of CRPCs) or AR point mutation (20% of CRPCs). Other recurrent mutations included transmembrane protease, serine 2 gene (TMPRSS2):v-ets erythroblastosis virus E26 oncogene homolog (avian) gene (ERG) fusion (44%); phosphatase and tensin homolog gene (PTEN) loss (44%); tumor protein p53 gene (TP53) mutation (40%); retinoblastoma gene (RB) loss (28%); v-myc myelocytomatosis viral oncogene homolog (avian) gene (MYC) gain (12%); and phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit α gene (PIK3CA) mutation (4

  12. Site-directed gene mutation at mixed sequence targets by psoralen-conjugated pseudo-complementary peptide nucleic acids.

    PubMed

    Kim, Ki-Hyun; Nielsen, Peter E; Glazer, Peter M

    2007-01-01

    Sequence-specific DNA-binding molecules such as triple helix-forming oligonucleotides (TFOs) provide a means for inducing site-specific mutagenesis and recombination at chromosomal sites in mammalian cells. However, the utility of TFOs is limited by the requirement for homopurine stretches in the target duplex DNA. Here, we report the use of pseudo-complementary peptide nucleic acids (pcPNAs) for intracellular gene targeting at mixed sequence sites. Due to steric hindrance, pcPNAs are unable to form pcPNA-pcPNA duplexes but can bind to complementary DNA sequences by Watson-Crick pairing via double duplex-invasion complex formation. We show that psoralen-conjugated pcPNAs can deliver site-specific photoadducts and mediate targeted gene modification within both episomal and chromosomal DNA in mammalian cells without detectable off-target effects. Most of the induced psoralen-pcPNA mutations were single-base substitutions and deletions at the predicted pcPNA-binding sites. The pcPNA-directed mutagenesis was found to be dependent on PNA concentration and UVA dose and required matched pairs of pcPNAs. Neither of the individual pcPNAs alone had any effect nor did complementary PNA pairs of the same sequence. These results identify pcPNAs as new tools for site-specific gene modification in mammalian cells without purine sequence restriction, thereby providing a general strategy for designing gene targeting molecules. PMID:17977869

  13. Amino acid sequence of a vitamin K-dependent Ca2+-binding peptide from bovine prothrombin.

    PubMed

    Howard, J B; Fausch, M D

    1975-08-10

    The amino acid sequence of a 31-residue peptide from bovine prothrombin has been determined. This peptide has been shown to contain the vitamin K-dependent modification required for Ca2+ binding (Nelsestuen, G. L., and Suttie, J. W. (1973) Proc. Natl. Acad. Sci. U. S. A. 70, 3366-3370) and the modified amino acid, gamma-carboxyglutamic acid (Nelsestuen, G. L., Zytkovicz, T., and Howard, J. B. (1974) J. Biol. Chem. 249, 6347-6350). The peptide was shown to correspond to residues 12 to 42 of prothrombin. PMID:807581

  14. Amino acid sequences around the cysteine residues of rabbit muscle triose phosphate isomerase

    PubMed Central

    Miller, Janet C.; Waley, S. G.

    1971-01-01

    1. The nature of the subunits in rabbit muscle triose phosphate isomerase has been investigated. 2. Amino acid analyses show that there are five cysteine residues and two methionine residues/subunit. 3. The amino acid sequences around the cysteine residues have been determined; these account for about 75 residues. 4. Cleavage at the methionine residues with cyanogen bromide gave three fragments. 5. These results show that the subunits correspond to polypeptide chains, containing about 230 amino acid residues. The chains in triose phosphate isomerase seem to be shorter than those of other glycolytic enzymes. PMID:5165707

  15. Novel Exons and Splice Variants in the Human Antibody Heavy Chain Identified by Single Cell and Single Molecule Sequencing

    PubMed Central

    Vollmers, Christopher; Penland, Lolita; Kanbar, Jad N.; Quake, Stephen R.

    2015-01-01

    Antibody heavy chains contain a variable and a constant region. The constant region of the antibody heavy chain is encoded by multiple groups of exons which define the isotype and therefore many functional characteristics of the antibody. We performed both single B cell RNAseq and long read single molecule sequencing of antibody heavy chain transcripts and were able to identify novel exons for IGHA1 and IGHA2 as well as novel isoforms for IGHM antibody heavy chain. PMID:25611855

  16. Exome sequencing identifies recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas.

    PubMed

    Krauthammer, Michael; Kong, Yong; Bacchiocchi, Antonella; Evans, Perry; Pornputtapong, Natapol; Wu, Cen; McCusker, James P; Ma, Shuangge; Cheng, Elaine; Straub, Robert; Serin, Merdan; Bosenberg, Marcus; Ariyan, Stephan; Narayan, Deepak; Sznol, Mario; Kluger, Harriet M; Mane, Shrikant; Schlessinger, Joseph; Lifton, Richard P; Halaban, Ruth

    2015-09-01

    We report on whole-exome sequencing (WES) of 213 melanomas. Our analysis established NF1, encoding a negative regulator of RAS, as the third most frequently mutated gene in melanoma, after BRAF and NRAS. Inactivating NF1 mutations were present in 46% of melanomas expressing wild-type BRAF and RAS, occurred in older patients and showed a distinct pattern of co-mutation with other RASopathy genes, particularly RASA2. Functional studies showed that NF1 suppression led to increased RAS activation in most, but not all, melanoma cases. In addition, loss of NF1 did not predict sensitivity to MEK or ERK inhibitors. The rebound pathway, as seen by the induction of phosphorylated MEK, occurred in cells both sensitive and resistant to the studied drugs. We conclude that NF1 is a key tumor suppressor lost in melanomas, and that concurrent RASopathy gene mutations may enhance its role in melanomagenesis. PMID:26214590

  17. Exome sequencing identifies recurrent mutations in NF1 and RASopathy genes in sun-exposed melanomas

    PubMed Central

    Krauthammer, Michael; Kong, Yong; Bacchiocchi, Antonella; Evans, Perry; Pornputtapong, Natapol; Wu, Cen; McCusker, James P; Ma, Shuangge; Cheng, Elaine; Straub, Robert; Serin, Merdan; Bosenberg, Marcus; Ariyan, Stephan; Narayan, Deepak; Sznol, Mario; Kluger, Harriet M; Mane, Shrikant; Schlessinger, Joseph; Lifton, Richard P; Halaban, Ruth

    2016-01-01

    We report on whole-exome sequencing (WES) of 213 melanomas. Our analysis established NF1, encoding a negative regulator of RAS, as the third most frequently mutated gene in melanoma, after BRAF and NRAS. Inactivating NF1 mutations were present in 46% of melanomas expressing wild-type BRAF and RAS, occurred in older patients and showed a distinct pattern of co-mutation with other RASopathy genes, particularly RASA2. Functional studies showed that NF1 suppression led to increased RAS activation in most, but not all, melanoma cases. In addition, loss of NF1 did not predict sensitivity to MEK or ERK inhibitors. The rebound pathway, as seen by the induction of phosphorylated MEK, occurred in cells both sensitive and resistant to the studied drugs. We conclude that NF1 is a key tumor suppressor lost in melanomas, and that concurrent RASopathy gene mutations may enhance its role in melanomagenesis. PMID:26214590

  18. Exome Sequencing of Uterine Leiomyosarcomas Identifies Frequent Mutations in TP53, ATRX, and MED12.

    PubMed

    Mäkinen, Netta; Aavikko, Mervi; Heikkinen, Tuomas; Taipale, Minna; Taipale, Jussi; Koivisto-Korander, Riitta; Bützow, Ralf; Vahteristo, Pia

    2016-02-01

    Uterine leiomyosarcomas (ULMSs) are aggressive smooth muscle tumors associated with poor clinical outcome. Despite previous cytogenetic and molecular studies, their molecular background has remained elusive. To examine somatic variation in ULMS, we performed exome sequencing on 19 tumors. Altogether, 43 genes were mutated in at least two ULMSs. Most frequently mutated genes included tumor protein P53 (TP53; 6/19; 33%), alpha thalassemia/mental retardation syndrome X-linked (ATRX; 5/19; 26%), and mediator complex subunit 12 (MED12; 4/19; 21%). Unlike ATRX mutations, both TP53 and MED12 alterations have repeatedly been associated with ULMSs. All the observed ATRX alterations were either nonsense or frameshift mutations. ATRX protein levels were reliably analyzed by immunohistochemistry in altogether 44 ULMSs, and the majority of tumors (23/44; 52%) showed clearly reduced expression. Loss of ATRX expression has been associated with alternative lengthening of telomeres (ALT), and thus the telomere length was analyzed with telomere-specific fluorescence in situ hybridization. The ALT phenotype was confirmed in all ULMSs showing diminished ATRX expression. Exome data also revealed one nonsense mutation in death-domain associated protein (DAXX), another gene previously associated with ALT, and the tumor showed ALT positivity. In conclusion, exome sequencing revealed that TP53, ATRX, and MED12 are frequently mutated in ULMSs. ALT phenotype was commonly seen in tumors, indicating that ATR inhibitors, which were recently suggested as possible new drugs for ATRX-deficient tumors, could provide a potential novel therapeutic option for ULMS. PMID:26891131

  19. Exome Sequencing of Uterine Leiomyosarcomas Identifies Frequent Mutations in TP53, ATRX, and MED12

    PubMed Central

    Mäkinen, Netta; Aavikko, Mervi; Heikkinen, Tuomas; Taipale, Minna; Taipale, Jussi; Koivisto-Korander, Riitta; Bützow, Ralf; Vahteristo, Pia

    2016-01-01

    Uterine leiomyosarcomas (ULMSs) are aggressive smooth muscle tumors associated with poor clinical outcome. Despite previous cytogenetic and molecular studies, their molecular background has remained elusive. To examine somatic variation in ULMS, we performed exome sequencing on 19 tumors. Altogether, 43 genes were mutated in at least two ULMSs. Most frequently mutated genes included tumor protein P53 (TP53; 6/19; 33%), alpha thalassemia/mental retardation syndrome X-linked (ATRX; 5/19; 26%), and mediator complex subunit 12 (MED12; 4/19; 21%). Unlike ATRX mutations, both TP53 and MED12 alterations have repeatedly been associated with ULMSs. All the observed ATRX alterations were either nonsense or frameshift mutations. ATRX protein levels were reliably analyzed by immunohistochemistry in altogether 44 ULMSs, and the majority of tumors (23/44; 52%) showed clearly reduced expression. Loss of ATRX expression has been associated with alternative lengthening of telomeres (ALT), and thus the telomere length was analyzed with telomere-specific fluorescence in situ hybridization. The ALT phenotype was confirmed in all ULMSs showing diminished ATRX expression. Exome data also revealed one nonsense mutation in death-domain associated protein (DAXX), another gene previously associated with ALT, and the tumor showed ALT positivity. In conclusion, exome sequencing revealed that TP53, ATRX, and MED12 are frequently mutated in ULMSs. ALT phenotype was commonly seen in tumors, indicating that ATR inhibitors, which were recently suggested as possible new drugs for ATRX-deficient tumors, could provide a potential novel therapeutic option for ULMS. PMID:26891131

  20. Mulibrey nanism: Two novel mutations in a child identified by Array CGH and DNA sequencing.

    PubMed

    Mozzillo, Enza; Cozzolino, Carla; Genesio, Rita; Melis, Daniela; Frisso, Giulia; Orrico, Ada; Lombardo, Barbara; Fattorusso, Valentina; Discepolo, Valentina; Della Casa, Roberto; Simonelli, Francesca; Nitsch, Lucio; Salvatore, Francesco; Franzese, Adriana

    2016-08-01

    In childhood, several rare genetic diseases have overlapping symptoms and signs, including those regarding growth alterations, thus the differential diagnosis is sometimes difficult. The proband, aged 3 years, was suspected to have Silver-Russell syndrome because of intrauterine growth retardation, postnatal growth retardation, typical facial dysmorphic features, macrocephaly, body asymmetry, and bilateral fifth finger clinodactyly. Other features were left atrial and ventricular enlargement and patent foramen ovale. A total skeletal X-ray showed hypoplasia of the twelfth rib bilaterally and of the coccyx, slender long bones with thick cortex, and narrow medullary channels. The genetic investigation did not confirm Silver-Russell syndrome. At the age of 5 the patient developed an additional sign: hepatomegaly. Array CGH revealed a 147 kb deletion (involving the TRIM37 and SKA2 genes) on one allele of chromosome 17, inherited from his mother. These results suggested Mulibrey nanism. The clinical features were found to fit this hypothesis. Sequencing of the TRIM37 gene showed a single base change at a splicing locus, inherited from his father, that resulted in a truncated protein. The combined use of Array CGH and DNA sequencing confirmed the diagnosis of Mulibrey nanism. The large deletion involving the SKA2 gene, along with the increased frequency of malignant tumours in Mulibrey nanism patients, suggests close monitoring for cancer in our patient and his mother. Array CGH should be performed as a first-tier test in all infants with multiple anomalies. The clinician should reconsider the clinical features when the genetics suggests this. © 2016 Wiley Periodicals, Inc. PMID:27256967

  1. Applications of molecular markers and DNA sequences in identifying fungal pathogens of cool season grain legumes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Molecular techniques have now been widely applied in many disciplines of biological sciences including fungal identification in microbial ecology and plant pathology. In plant pathology, it is now common to use molecular techniques to identify and study plant pathogens of many agronomic and horticul...

  2. Whole Genome Sequencing Identifies a Deletion in Protein Phosphatase 2A That Affects Its Stability and Localization in Chlamydomonas reinhardtii

    PubMed Central

    Lin, Huawen; Miller, Michelle L.; Granas, David M.; Dutcher, Susan K.

    2013-01-01

    Whole genome sequencing is a powerful tool in the discovery of single nucleotide polymorphisms (SNPs) and small insertions/deletions (indels) among mutant strains, which simplifies forward genetics approaches. However, identification of the causative mutation among a large number of non-causative SNPs in a mutant strain remains a big challenge. In the unicellular biflagellate green alga Chlamydomonas reinhardtii, we generated a SNP/indel library that contains over 2 million polymorphisms from four wild-type strains, one highly polymorphic strain that is frequently used in meiotic mapping, ten mutant strains that have flagellar assembly or motility defects, and one mutant strain, imp3, which has a mating defect. A comparison of polymorphisms in the imp3 strain and the other 15 strains allowed us to identify a deletion of the last three amino acids, Y313F314L315, in a protein phosphatase 2A catalytic subunit (PP2A3) in the imp3 strain. Introduction of a wild-type HA-tagged PP2A3 rescues the mutant phenotype, but mutant HA-PP2A3 at Y313 or L315 fail to rescue. Our immunoprecipitation results indicate that the Y313, L315, or YFLΔ mutations do not affect the binding of PP2A3 to the scaffold subunit, PP2A-2r. In contrast, the Y313, L315, or YFLΔ mutations affect both the stability and the localization of PP2A3. The PP2A3 protein is less abundant in these mutants and fails to accumulate in the basal body area as observed in transformants with either wild-type HA-PP2A3 or a HA-PP2A3 with a V310T change. The accumulation of HA-PP2A3 in the basal body region disappears in mated dikaryons, which suggests that the localization of PP2A3 may be essential to the mating process. Overall, our results demonstrate that the terminal YFL tail of PP2A3 is important in the regulation of Chlamydomonas mating. PMID:24086163

  3. Complete amino acid sequence of the Mu heavy chain of a human IgM immunoglobulin.

    PubMed

    Putnam, F W; Florent, G; Paul, C; Shinoda, T; Shimizu, A

    1973-10-19

    The amino acid sequence of the mu chain of a human IgM immunoglobulin, including the location of all disulfide bridges and oligosaccharides, has been determined. The homology of the constant regions of immunoglobulin mu, gamma, alpha, and epsilon heavy chains reveals evolutionary relationships and suggests that two genes code for each heavy chain. PMID:4742735

  4. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  5. Draft Genome Sequence of Perfluorooctane Acid-Degrading Bacterium Pseudomonas parafulva YAB-1

    PubMed Central

    Tang, Chongjian; Peng, Qingjing; Peng, Qingzhong

    2015-01-01

    Pseudomonas parafulva YAB-1, isolated from perfluorinated compound-contaminated soil, has the ability to degrade perfluorooctane acid (PFOA) compound. Here, we report the draft genome sequence and annotation of the PFOA-degrading bacterium P. parafulva YAB-1. The data provide the basis to investigate the molecular mechanism of PFOA metabolism. PMID:26337877

  6. Circulating tumor DNA identified by targeted sequencing in advanced-stage non-small cell lung cancer patients.

    PubMed

    Xu, Song; Lou, Feng; Wu, Yi; Sun, Da-Qiang; Zhang, Jing-Bo; Chen, Wei; Ye, Hua; Liu, Jing-Hao; Wei, Sen; Zhao, Ming-Yu; Wu, Wen-Jun; Su, Xue-Xia; Shi, Rong; Jones, Lindsey; Huang, Xue F; Chen, Si-Yi; Chen, Jun

    2016-01-28

    Non-small cell lung cancers (NSCLC) have unique mutation patterns, and some of these mutations may be used to predict prognosis or guide patient treatment. Mutation profiling before and during treatment often requires repeated tumor biopsies, which is not always possible. Recently, cell-free, circulating tumor DNA (ctDNA) isolated from blood plasma has been shown to contain genetic mutations representative of those found in the primary tumor tissue DNA (tDNA), and these samples can readily be obtained using non-invasive techniques. However, there are still no standardized methods to identify mutations in ctDNA. In the current study, we used a targeted sequencing approach with a semi-conductor based next-generation sequencing (NGS) platform to identify gene mutations in matched tDNA and ctDNA samples from 42 advanced-stage NSCLC patients from China. We identified driver mutations in matched tDNA and ctDNA in EGFR, KRAS, PIK3CA, and TP53, with an overall concordance of 76%. In conclusion, targeted sequencing of plasma ctDNA may be a feasible option for clinical monitoring of NSCLC in the near future. PMID:26582655

  7. Bile salt export pump deficiency disease: two novel, late onset, ABCB11 mutations identified by next generation sequencing.

    PubMed

    Vitale, Giovanni; Pirillo, Martina; Mantovani, Vilma; Marasco, Elena; Aquilano, Adelia; Gamal, Nesrine; Francalanci, Paola; Conti, Fabio; Andreone, Pietro

    2016-01-01

    Progressive familial intrahepatic cholestasis (PFIC) is a heterogeneous group of autosomal recessive cholestatic diseases of childhood and represents the main indication for liver transplantation at this age; PFIC2 involves the ABCB11 gene, which encodes the ATP-dependent canalicular bile salt export pump (BSEP). Benign recurrent intrahepatic cholestasis (BRIC) identifies a group of diseases involving the same genes and characterized by intermittent attacks of cholestasis with no progression to liver cirrhosis. Diagnosis with standard sequencing techniques is expensive and available only at a few tertiary centers. We report the application of next generation sequencing (NGS) in the diagnosis of familial intrahepatic cholestasis with parallel sequencing of three causative genes. We identified the molecular defects in the ABCB11 gene in two different probands who developed a severe cholestatic disease of unknown origin. In the first patient, compound heterozygosity for the novel frameshift mutation p.Ser1100GlnfsX38 and the missense variant p.Glu135Lys was detected. In the second patient, whose disease was triggered by contraceptive therapy, we identified homozygosity for a novel missense variant p.Ala523Gly. In conclusion, these mutations seem to have a late onset and a less aggressive clinical impact, acting as an intermediate form between BRIC and PFIC. PMID:27493120

  8. Comparison of gene expression methods to identify genes responsive to perfluorooctane sulfonic acid.

    PubMed

    Hu, Wenyue; Jones, Paul D; Decoen, Wim; Newsted, John L; Giesy, John P

    2005-01-01

    Genome-wide expression techniques are being increasingly used to assess the effects of environmental contaminants. Oligonucleotide or cDNA microarray methods make possible the screening of large numbers of known sequences for a given model species, while differential display analysis makes possible analysis of the expression of all the genes from any species. We report a comparison of two currently popular methods for genome-wide expression analysis in rat hepatoma cells treated with perfluorooctane sulfonic acid. The two analyses provided complementary information. Approximately 5% of the 8000 genes analyzed by the GeneChip array were altered by a factor of three or greater. Differential display results were more difficult to interpret, since multiple gene products were present in most gel bands, so a probabilistic approach was used to determine which pathways were affected. The mechanistic interpretations derived from these two methods were in agreement, both showing similar alterations in a specific set of genes. PMID:21783471

  9. The amino acid sequence of cytochrome c-555 from the methane-oxidizing bacterium Methylococcus capsulatus.

    PubMed Central

    Ambler, R P; Dalton, H; Meyer, T E; Bartsch, R G; Kamen, M D

    1986-01-01

    The amino acid sequence of the cytochrome c-555 from the obligate methanotroph Methylococcus capsulatus strain Bath (N.C.I.B. 11132) was determined. It is a single polypeptide chain of 96 residues, binding a haem group through the cysteine residues at positions 19 and 22, and the only methionine residue is at position 59. The sequence does not closely resemble that of any other cytochrome c that has yet been characterized. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50131 (12 pages) at the British Library Lending Division, Boston Spa, West Yorkshire LS23 7BQ, U.K., from whom copies are available on prepayment. PMID:3006666

  10. Clinically relevant variants identified in thoracic aortic aneurysm patients by research exome sequencing.

    PubMed

    Schubert, Jeffrey A; Landis, Benjamin J; Shikany, Amy R; Hinton, Robert B; Ware, Stephanie M

    2016-05-01

    Thoracic aortic aneurysm (TAA) is a genetically heterogeneous disease involving subclinical and progressive dilation of the thoracic aorta, which can lead to life-threatening complications such as dissection or rupture. Genetic testing is important for risk stratification and identification of at risk family members, and clinically available genetic testing panels have been expanding rapidly. However, when past testing results are normal, there is little evidence to guide decision-making about the indications and timing to pursue additional clinical genetic testing. Results from research based genetic testing can help inform this process. Here we present 10 TAA patients who have a family history of disease and who enrolled in research-based exome testing. Nine of these ten patients had previous clinical genetic testing that did not identify the cause of disease. We sought to determine the number of rare variants in 23 known TAA associated genes identified by research-based exome testing. In total, we found 10 rare variants in six patients. Likely pathogenic variants included a TGFB2 variant in one patient and a SMAD3 variant in another. These variants have been reported previously in individuals with similar phenotypes. Variants of uncertain significance of particular interest included novel variants in MYLK and MFAP5, which were identified in a third patient. In total, clinically reportable rare variants were found in 6/10 (60%) patients, with at least 2/10 (20%) patients having likely pathogenic variants identified. These data indicate that consideration of re-testing is important in TAA patients with previous negative or inconclusive results. PMID:26854089

  11. Whole Exome Sequencing Identifies CRB1 Defect in an Unusual Maculopathy Phenotype

    PubMed Central

    Tsang, Stephen H.; Burke, Tomas; Oll, Maris; Yzer, Suzanne; Lee, Winston; Xie, Yajing (Angela); Allikmets, Rando

    2014-01-01

    Objective To report a new phenotype caused by mutations in the CRB1 gene in a family with 2 affected siblings. Design Molecular genetics and observational case studies. Participants Two affected siblings and 3 unaffected family members. Methods Each subject received a complete ophthalmic examination together with color fundus photography, fundus autofluorescence (FAF), and spectral domain optical coherence tomography (SD-OCT). Microperimetry 1 (MP-1) mapping and electroretinogram (ERG) analysis were performed on the proband. Screening for disease-causing mutations was performed by whole exome sequencing in 3 family members followed by segregation analyses in the entire family. Main Outcome Measures Appearance of the macula as examined by clinical examination, fundus photography, FAF imaging, SD-OCT, and visual function by MP-1 and ERG. Results The proband and her affected brother exhibited unusual, previously unreported, findings of a macular dystrophy with relative sparing of the retinal periphery beyond the vascular arcades. The FAF imaging showed severely affected areas of hypoautofluorescence that extended nasally beyond the optic disc in both eyes. A central macular patch of retinal pigment epithelium (RPE) sparing was evident in both eyes on FAF, whereas photoreceptor sparing was documented in the right eye only using SD-OCT. The affected brother presented with irregular patterns of autofluorescence in both eyes characterized by concentric rings of alternating hyper- and hypoautofluorescence, and foveal sparing of photoreceptors and RPE, as seen on SD-OCT, bilaterally. After negative results in screening for mutations in candidate genes including ABCA4 and PRPH2, DNA from 3 members of the family, including both affected siblings and their mother, was screened by whole exome sequencing resulting in identification of 2 CRB1 missense mutations, c.C3991T:p.R1331C and c.C4142T:p.P1381L, which segregated with the disease in the family. 
Of the 2, the p.R1331C CRB1

  12. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  13. Novel somatic mutations identified by whole-exome sequencing in muscle-invasive transitional cell carcinoma of the bladder

    PubMed Central

    PAN, HUIXING; XU, XIAOJIAN; WU, DEYAO; QIU, QIAOCHENG; ZHOU, SHOUJUN; HE, XUEFENG; ZHOU, YUNFENG; QU, PING; HOU, JIANQUAN; HE, JUN; ZHOU, JIAN

    2016-01-01

    Transitional cell carcinoma (TCC) is one of the most commonly observed types of cancer globally. The identification of novel disease-associated genes in TCC has had a significant effect on the diagnosis and treatment of bladder cancer; however, there may be a large number of novel genes that have not been identified. In the present study, the exomes of two individuals who were diagnosed with muscle-invasive TCC (MI-TCC) were sequenced to investigate potential variants. Subsequently, following algorithm and filter analysis, Sanger sequencing was used to validate the results of deep sequencing. Immunohistochemistry (IHC) was employed to observe the differences in HECT, C2 and WW domain-containing E3 ubiquitin protein ligase 1 (HECW1) protein expression between tumor tissues and para-carcinoma tissues. A total of 6 genes with nonsynonymous mutations were identified in MI-TCC: copine VII; RNA binding motif protein, X-linked-like 3; acyl-CoA synthetase medium-chain family member 2A; HECW1; zinc finger protein 273; and trichohyalin. Furthermore, 5 of 61 MI-TCC specimens were found to carry a HECW1 gene mutation, and all of these were point mutations located at exon 11 on chromosome 7. The HECW1 mutations comprised 4 missense mutations and 1 nonsense mutation. IHC revealed that HECW1 protein was expressed at significantly increased levels in MI-TCC compared with normal bladder urothelium (P<0.001). The present study provided a novel approach for investigating genetic changes in the MI-TCC exome, and identified the novel mutant gene HECW1, which may possess a significant role in the pathogenesis of TCC. PMID:26893765

  14. Whole-exome sequencing identifies novel MPL and JAK2 mutations in triple-negative myeloproliferative neoplasms

    PubMed Central

    Milosevic Feenstra, Jelena D.; Nivarthi, Harini; Gisslinger, Heinz; Leroy, Emilie; Rumi, Elisa; Chachoua, Ilyas; Bagienski, Klaudia; Kubesova, Blanka; Pietra, Daniela; Gisslinger, Bettina; Milanesi, Chiara; Jäger, Roland; Chen, Doris; Berg, Tiina; Schalling, Martin; Schuster, Michael; Bock, Christoph; Constantinescu, Stefan N.; Cazzola, Mario

    2016-01-01

    Essential thrombocythemia (ET) and primary myelofibrosis (PMF) are chronic diseases characterized by clonal hematopoiesis and hyperproliferation of terminally differentiated myeloid cells. The disease is driven by somatic mutations in exon 9 of CALR or exon 10 of MPL or JAK2-V617F in >90% of the cases, whereas the remaining cases are termed “triple negative.” We aimed to identify the disease-causing mutations in the triple-negative cases of ET and PMF by applying whole-exome sequencing (WES) on paired tumor and control samples from 8 patients. We found evidence of clonal hematopoiesis in 5 of 8 studied cases based on clonality analysis and presence of somatic genetic aberrations. WES identified somatic mutations in 3 of 8 cases. We did not detect any novel recurrent somatic mutations. In 3 patients with clonal hematopoiesis analyzed by WES, we identified a somatic MPL-S204P, a germline MPL-V285E mutation, and a germline JAK2-G571S variant. We performed Sanger sequencing of the entire coding region of MPL in 62, and of JAK2 in 49 additional triple-negative cases of ET or PMF. New somatic (T119I, S204F, E230G, Y591D) and 1 germline (R321W) MPL mutation were detected. All of the identified MPL mutations were gain-of-function when analyzed in functional assays. JAK2 variants were identified in 5 of 57 triple-negative cases analyzed by WES and Sanger sequencing combined. We could demonstrate that JAK2-V625F and JAK2-F556V are gain-of-function mutations. Our results suggest that triple-negative cases of ET and PMF do not represent a homogenous disease entity. Cases with polyclonal hematopoiesis might represent hereditary disorders. PMID:26423830

  15. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
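
    The gene frequencies quoted in this abstract can be checked by simple allele counting. The sketch below assumes, as an interpretation not stated explicitly in the abstract, that the six one-band animals are Lys/Lys homozygotes and the five two-band animals are Lys/Gln heterozygotes; the function name allele_frequencies is illustrative.

```python
from fractions import Fraction


def allele_frequencies(n_hom_a, n_het, n_hom_b):
    """Count alleles in a diploid sample and return (freq_a, freq_b)."""
    total_alleles = 2 * (n_hom_a + n_het + n_hom_b)
    # Each homozygote contributes two copies, each heterozygote one of each.
    freq_a = Fraction(2 * n_hom_a + n_het, total_alleles)
    freq_b = Fraction(2 * n_hom_b + n_het, total_alleles)
    return freq_a, freq_b


# Assumed genotype counts: 6 Lys/Lys, 5 Lys/Gln, 0 Gln/Gln (11 camels).
lys, gln = allele_frequencies(6, 5, 0)
print(f"Lys: {lys} = {float(lys):.3f}")
print(f"Gln: {gln} = {float(gln):.3f}")
```

    Under these assumptions the counts give 17/22 for Lys and 5/22 for Gln, in line with the rounded values of about 0.75 and 0.25 reported in the abstract.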

  16. Genome sequencing of disease and carriage isolates of nontypeable Haemophilus influenzae identifies discrete population structure.

    PubMed

    De Chiara, Matteo; Hood, Derek; Muzzi, Alessandro; Pickard, Derek J; Perkins, Tim; Pizza, Mariagrazia; Dougan, Gordon; Rappuoli, Rino; Moxon, E Richard; Soriani, Marco; Donati, Claudio

    2014-04-01

    One of the main hurdles for the development of an effective and broadly protective vaccine against nonencapsulated isolates of Haemophilus influenzae (NTHi) lies in the genetic diversity of the species, which makes the identification of cross-protective candidate antigens extremely difficult. To assess whether a population structure of NTHi could be defined, we performed genome sequencing of a collection of diverse clinical isolates representative of both carriage and disease and of the diversity of the natural population. Analysis of the distribution of polymorphic sites in the core genome and of the composition of the accessory genome defined distinct evolutionary clades and supported a predominantly clonal evolution of NTHi, with the majority of genetic information transmitted vertically within lineages. A correlation between the population structure and the presence of selected surface-associated proteins and lipooligosaccharide structure, known to contribute to virulence, was found. This high-resolution, genome-based population structure of NTHi provides the foundation for a better understanding of NTHi adaptation to the host, as well as its commensal and virulence behavior, that could facilitate intervention strategies against disease caused by this important human pathogen. PMID:24706866

  17. Novel variants in MLL confer to bladder cancer recurrence identified by whole-exome sequencing

    PubMed Central

    Wang, Yongqiang; Huang, Yi; Liu, Huan; Li, Feida; He, Luyun; Sun, Da; Yu, Yuan; Li, Qiaoling; Huang, Peide; Zhang, Meng; Zhao, Xin; Bi, Tengteng; Zhuang, Xuehan; Zhang, Liyan; Lu, Jingxiao; Sun, Xiaojuan; Zhou, Fangjian; Liu, Chunxiao; Yang, Guosheng; Hou, Yong; Fan, Zusen; Cai, Zhiming

    2016-01-01

    Bladder cancer (BC) is distinguished by a high rate of recurrence after surgery, but the underlying mechanisms remain poorly understood. Here we performed whole-exome sequencing of 37 BC individuals, including 20 primary and 17 recurrent samples, in which the primary and recurrent samples were not from the same patient. We found that MLL, EP400, PRDM2, ANK3 and CHD5 were altered exclusively in recurrent BCs. Specifically, the recurrent BCs and bladder cancer cells with MLL mutation displayed increased histone H3 tri-methyl K4 (H3K4me3) modification at the tissue and cell levels and showed enhanced expression of the downstream genes GATA4 and ETS1. Moreover, MLL-mutated bladder cancer cells obtained with CRISPR/Cas9 showed greater resistance to epirubicin (a chemotherapy drug for bladder cancer) than wild-type cells. Additionally, BC patients with high expression of GATA4 and ETS1 had a significantly shorter lifespan than patients with low expression. Our study provided an overview of the genetic basis of recrudescent bladder cancer and discovered that genetic alterations of MLL were involved in BC relapse. The increased H3K4me3 modification and expression of GATA4 and ETS1 could be promising targets for the diagnosis and therapy of relapsed bladder cancer. PMID:26625313

  18. Identifying neuronal lineages of Drosophila by sequence analysis of axon tracts.

    PubMed

    Cardona, Albert; Saalfeld, Stephan; Arganda, Ignacio; Pereanu, Wayne; Schindelin, Johannes; Hartenstein, Volker

    2010-06-01

    The Drosophila brain is formed by an invariant set of lineages, each of which is derived from a unique neural stem cell (neuroblast) and forms a genetic and structural unit of the brain. The task of reconstructing brain circuitry at the level of individual neurons can be made significantly easier by assigning neurons to their respective lineages. In this article we address the automation of neuron and lineage identification. We focused on the Drosophila brain lineages at the larval stage when they form easily recognizable secondary axon tracts (SATs) that were previously partially characterized. We now generated an annotated digital database containing all lineage tracts reconstructed from five registered wild-type brains, at higher resolution and including some that were previously not characterized. We developed a method for SAT structural comparisons based on a dynamic programming approach akin to nucleotide sequence alignment and a machine learning classifier trained on the annotated database of reference SATs. We quantified the stereotypy of SATs by measuring the residual variability of aligned wild-type SATs. Next, we used our method for the identification of SATs within wild-type larval brains, and found it highly accurate (93-99%). The method proved highly robust for the identification of lineages in mutant brains and in brains that differed in developmental time or labeling. We describe for the first time an algorithm that quantifies neuronal projection stereotypy in the Drosophila brain and use the algorithm for automatic neuron and lineage recognition. PMID:20519528
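
    The abstract likens its SAT comparison to nucleotide sequence alignment by dynamic programming. As a point of reference, the sketch below shows the standard global alignment (Needleman-Wunsch) scoring recurrence on plain strings. It is not the authors' SAT method, and the scoring values (match=1, mismatch=-1, gap=-1) are arbitrary assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per element.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align the two elements, or gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "AGT"))
```

    The same table-filling idea carries over to other ordered data once a pairwise similarity between elements is defined, which is presumably what makes it adaptable to axon tract geometry.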

  19. Identifying neuronal lineages of Drosophila by sequence analysis of axon tracts

    PubMed Central

    Cardona, Albert; Saalfeld, Stephan; Arganda, Ignacio; Pereanu, Wayne; Schindelin, Johannes; Hartenstein, Volker

    2010-01-01

    The Drosophila brain is formed by an invariant set of lineages, each of which is derived from a unique neural stem cell (neuroblast) and forms a genetic and structural unit of the brain. The task of reconstructing brain circuitry at the level of individual neurons can be made significantly easier by assigning neurons to their respective lineages. In this paper we address the automatization of neuron and lineage identification. We focused on the Drosophila brain lineages at the larval stage when they form easily recognizable secondary axon tracts (SATs) that were previously partially characterized. We now generated an annotated digital database containing all lineage tracts reconstructed from five registered wild-type brains, at higher resolution and including some that were previously not characterized. We developed a method for SAT structural comparisons based on a dynamic programming approach akin to nucleotide sequence alignment, and a machine learning classifier trained on the annotated database of reference SATs. We quantified the stereotypy of SATs by measuring the residual variability of aligned wild-type SATs. Next, we employed our method for the identification of SATs within wild-type larval brains, and found it highly accurate (93–99 %). The method proved highly robust for the identification of lineages in mutant brains, and in brains that differed in developmental time or labeling. We describe for the first time an algorithm that quantifies neuronal projection stereotypy in the Drosophila brain, and use the algorithm for automatic neuron and lineage recognition. PMID:20519528

  20. Exome sequencing identifies somatic mutations of DNA methyltransferase gene DNMT3A in acute monocytic leukemia.

    PubMed

    Yan, Xiao-Jing; Xu, Jie; Gu, Zhao-Hui; Pan, Chun-Ming; Lu, Gang; Shen, Yang; Shi, Jing-Yi; Zhu, Yong-Mei; Tang, Lin; Zhang, Xiao-Wei; Liang, Wen-Xue; Mi, Jian-Qing; Song, Huai-Dong; Li, Ke-Qin; Chen, Zhu; Chen, Sai-Juan

    2011-04-01

    Abnormal epigenetic regulation has been implicated in oncogenesis. We report here the identification of somatic mutations by exome sequencing in acute monocytic leukemia, the M5 subtype of acute myeloid leukemia (AML-M5). We discovered mutations in DNMT3A (encoding DNA methyltransferase 3A) in 23 of 112 (20.5%) cases. The DNMT3A mutants showed reduced enzymatic activity or aberrant affinity to histone H3 in vitro. Notably, there were alterations of DNA methylation patterns and/or gene expression profiles (such as HOXB genes) in samples with DNMT3A mutations as compared with those without such changes. Leukemias with DNMT3A mutations constituted a group of poor prognosis with elderly disease onset and of promonocytic as well as monocytic predominance among AML-M5 individuals. Screening other leukemia subtypes showed Arg882 alterations in 13.6% of acute myelomonocytic leukemia (AML-M4) cases. Our work suggests a contribution of aberrant DNA methyltransferase activity to the pathogenesis of acute monocytic leukemia and provides a useful new biomarker for relevant cases. PMID:21399634

  1. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next- Generation Sequencing (NGS)

    PubMed Central

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-dos-Santos, André M.; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-dos-Santos, Ândrea

    2016-01-01

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in a HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  2. CDH1 mutations in gastric cancer patients from northern Brazil identified by Next- Generation Sequencing (NGS).

    PubMed

    El-Husny, Antonette; Raiol-Moraes, Milene; Amador, Marcos; Ribeiro-Dos-Santos, André M; Montagnini, André; Barbosa, Silvanira; Silva, Artur; Assumpção, Paulo; Ishak, Geraldo; Santos, Sidney; Pinto, Pablo; Cruz, Aline; Ribeiro-Dos-Santos, Ândrea

    2016-05-13

    Gastric cancer is considered to be the fifth highest incident tumor worldwide and the third leading cause of cancer deaths. Developing regions report a higher number of sporadic cases, but there are only a few local studies related to hereditary cases of gastric cancer in Brazil to confirm this fact. CDH1 germline mutations have been described both in familial and sporadic cases, but there is only one recent molecular description of individuals from Brazil. In this study we performed Next Generation Sequencing (NGS) to assess CDH1 germline mutations in individuals who match the clinical criteria for Hereditary Diffuse Gastric Cancer (HDGC), or who exhibit very early diagnosis of gastric cancer. Among five probands we detected CDH1 germline mutations in two cases (40%). The mutation c.1023T > G was found in a HDGC family and the mutation c.1849G > A, which is nearly exclusive to African populations, was found in an early-onset case of gastric adenocarcinoma. The mutations described highlight the existence of gastric cancer cases caused by CDH1 germline mutations in northern Brazil, although such information is frequently ignored due to the existence of a large number of environmental factors locally. Our report represents the first CDH1 mutations in HDGC described from Brazil by an NGS platform. PMID:27192129

  3. TCR sequencing facilitates diagnosis and identifies mature T cells as the cell of origin in CTCL

    PubMed Central

    O'Malley, John T.; Williamson, David W.; Scott, Laura-Louise; Elco, Christopher P.; Teague, Jessica E.; Gehad, Ahmed; Lowry, Elizabeth L.; LeBoeuf, Nicole R.; Krueger, James G.; Robins, Harlan S.; Kupper, Thomas S.; Clark, Rachael A.

    2016-01-01

    Early diagnosis of CTCL is difficult and takes on average six years after presentation, in part because the clinical appearance and histopathology of CTCL can resemble that of benign inflammatory skin diseases. Detection of a malignant T cell clone is critical in making the diagnosis of CTCL, but the TCRγ PCR analysis in current clinical use detects clones in only a subset of patients. High-throughput TCR sequencing (HTS) detected T cell clones in 46/46 CTCL patients, was more sensitive and specific than TCRγ PCR, and successfully discriminated CTCL from benign inflammatory diseases. HTS also accurately assessed responses to therapy and facilitated diagnosis of disease recurrence. In patients with new skin lesions and no involvement of blood by flow cytometry, HTS demonstrated hematogenous spread of small numbers of malignant T cells. Analysis of CTCL TCRγ genes demonstrated that CTCL is a malignancy derived from mature T cells. There was a maximal T cell density in skin in benign inflammatory diseases that was exceeded in CTCL, suggesting a niche of finite size may exist for benign T cells in skin. Lastly, immunostaining demonstrated that the malignant T cell clones in mycosis fungoides and leukemic CTCL localized to different anatomic compartments in the skin. In summary, HTS accurately diagnosed CTCL in all stages, discriminated CTCL from benign inflammatory skin diseases and provided insights into the cell of origin and location of malignant CTCL cells in skin. PMID:26446955

  4. Bioinformatic analysis of neurotropic HIV envelope sequences identifies polymorphisms in the gp120 bridging sheet that increase macrophage-tropism through enhanced interactions with CCR5

    SciTech Connect

    Mefford, Megan E.; Kunstman, Kevin; Wolinsky, Steven M.; Gabuzda, Dana

    2015-07-15

    Macrophages express low levels of the CD4 receptor compared to T-cells. Macrophage-tropic HIV strains replicating in brain of untreated patients with HIV-associated dementia (HAD) express Envs that are adapted to overcome this restriction through mechanisms that are poorly understood. Here, bioinformatic analysis of env sequence datasets together with functional studies identified polymorphisms in the β3 strand of the HIV gp120 bridging sheet that increase M-tropism. D197, which results in loss of an N-glycan located near the HIV Env trimer apex, was detected in brain in some HAD patients, while position 200 was estimated to be under positive selection. D197 and T/V200 increased fusion and infection of cells expressing low CD4 by enhancing gp120 binding to CCR5. These results identify polymorphisms in the HIV gp120 bridging sheet that overcome the restriction to macrophage infection imposed by low CD4 through enhanced gp120–CCR5 interactions, thereby promoting infection of brain and other macrophage-rich tissues. - Highlights: • We analyze HIV Env sequences and identify amino acids in beta 3 of the gp120 bridging sheet that enhance macrophage tropism. • These amino acids at positions 197 and 200 are present in brain of some patients with HIV-associated dementia. • D197 results in loss of a glycan near the HIV Env trimer apex, which may increase exposure of V3. • These variants may promote infection of macrophages in the brain by enhancing gp120–CCR5 interactions.

  5. Use of Illumina sequencing to identify transposon insertions underlying mutant phenotypes in high-copy Mutator lines of maize.

    PubMed

    Williams-Carrier, Rosalind; Stiffler, Nicholas; Belcher, Susan; Kroeger, Tiffany; Stern, David B; Monde, Rita-Ann; Coalter, Robert; Barkan, Alice

    2010-07-01

    High-copy transposons have been effectively exploited as mutagens in a variety of organisms. However, their utility for phenotype-driven forward genetics has been hampered by the difficulty of identifying the specific insertions responsible for phenotypes of interest. We describe a new method that can substantially increase the throughput of linking a disrupted gene to a known phenotype in high-copy Mutator (Mu) transposon lines in maize. The approach uses the Illumina platform to obtain sequences flanking Mu elements in pooled, bar-coded DNA samples. Insertion sites are compared among individuals of suitable genotype to identify those that are linked to the mutation of interest. DNA is prepared for sequencing by mechanical shearing, adapter ligation, and selection of DNA fragments harboring Mu flanking sequences by hybridization to a biotinylated oligonucleotide corresponding to the Mu terminal inverted repeat. This method yields dense clusters of sequence reads that tile approximately 400 bp flanking each side of each heritable insertion. The utility of the approach is demonstrated by identifying the causal insertions in four genes whose disruption blocks chloroplast biogenesis at various steps: thylakoid protein targeting (cpSecE), chloroplast gene expression (polynucleotide phosphorylase and PTAC12), and prosthetic group attachment (HCF208/CCB2). This method adds to the tools available for phenotype-driven Mu tagging in maize, and could be adapted for use with other high-copy transposons. A by-product of the approach is the identification of numerous heritable insertions that are unrelated to the targeted phenotype, which can contribute to community insertion resources. PMID:20409008
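
    The comparison step described above (keeping insertion sites that track with the mutation of interest) can be sketched as set arithmetic. This is only an illustration under stated assumptions, not the authors' pipeline: each individual is represented by a set of hypothetical (chromosome, position) insertion sites, "non-mutant" individuals are assumed to be known non-carriers of the mutation, and the function name and example coordinates are invented for the sketch.

```python
def linked_insertions(mutant_sites, nonmutant_sites):
    """Return insertion sites present in every mutant individual and
    absent from every non-carrier individual.

    Both arguments are lists of sets of (chromosome, position) tuples.
    Exact position matching is an assumption; real read clusters would
    likely need a tolerance window.
    """
    if not mutant_sites:
        return set()
    # Sites shared by all mutant individuals.
    shared = set.intersection(*(set(s) for s in mutant_sites))
    # Sites seen in any individual known not to carry the mutation.
    excluded = set().union(*(set(s) for s in nonmutant_sites))
    return shared - excluded


# Hypothetical example data.
mutants = [
    {("chr1", 100), ("chr2", 200), ("chr3", 300)},
    {("chr1", 100), ("chr2", 200)},
]
non_carriers = [{("chr2", 200)}]
candidates = linked_insertions(mutants, non_carriers)
print(candidates)
```

    Insertions that survive the filter are candidates only; the abstract indicates that many heritable insertions unrelated to the phenotype are also recovered.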

  6. Analytical Framework for Identifying and Differentiating Recent Hitchhiking and Severe Bottleneck Effects from Multi-Locus DNA Sequence Data

    DOE PAGESBeta

    Sargsyan, Ori

    2012-05-25

    Hitchhiking and severe bottleneck effects have impact on the dynamics of genetic diversity of a population by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identification and differentiation of the signatures of such events from DNA sequence data at a single locus is challenging. This study develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event is modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction with constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes for inferring evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50000 or greater in contrast to 10000, and the estimates of the recent homogenization events agree with the “Out of Africa” hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversions. The estimates are contrasted with other estimates derived as the mid-time point between the last HIV-negative and first HIV-positive screening tests. Finally, the results show that significant discrepancies can exist between the estimates.

  7. Analytical Framework for Identifying and Differentiating Recent Hitchhiking and Severe Bottleneck Effects from Multi-Locus DNA Sequence Data

    SciTech Connect

    Sargsyan, Ori

    2012-05-25

    Hitchhiking and severe bottleneck effects have impact on the dynamics of genetic diversity of a population by inducing homogenization at a single locus and at the genome-wide scale, respectively. As a result, identification and differentiation of the signatures of such events from DNA sequence data at a single locus is challenging. This study develops an analytical framework for identifying and differentiating recent homogenization events at multiple neutral loci in low recombination regions. The dynamics of genetic diversity at a locus after a recent homogenization event is modeled according to the infinite-sites mutation model and the Wright-Fisher model of reproduction with constant population size. In this setting, I derive analytical expressions for the distribution, mean, and variance of the number of polymorphic sites in a random sample of DNA sequences from a locus affected by a recent homogenization event. Based on this framework, three likelihood-ratio based tests are presented for identifying and differentiating recent homogenization events at multiple loci. Lastly, I apply the framework to two data sets. First, I consider human DNA sequences from four non-coding loci on different chromosomes for inferring evolutionary history of modern human populations. The results suggest, in particular, that recent homogenization events at the loci are identifiable when the effective human population size is 50000 or greater in contrast to 10000, and the estimates of the recent homogenization events agree with the “Out of Africa” hypothesis. Second, I use HIV DNA sequences from HIV-1-infected patients to infer the times of HIV seroconversions. The estimates are contrasted with other estimates derived as the mid-time point between the last HIV-negative and first HIV-positive screening tests. Finally, the results show that significant discrepancies can exist between the estimates.

  8. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in Perl programming language and are executable under the cgibin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program and are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but the scripts are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided. PMID:11414222
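
    As a rough illustration of reading a 384-well plate "96 samples at a time", the sketch below enumerates well labels and splits them into batches of 96. It is not taken from the published Perl scripts: the row-major well order, the A1 to P24 labelling, and the function names are all assumptions made for the example, and it is written in Python rather than Perl.

```python
import string


def plate_wells(rows=16, cols=24):
    """Well labels for a plate in row-major order (A1, A2, ..., P24).

    The defaults describe a 384-well plate (16 rows x 24 columns).
    """
    return [
        f"{string.ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


def batches(items, size=96):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


wells = plate_wells()
well_batches = batches(wells)
for number, batch in enumerate(well_batches, start=1):
    print(number, batch[0], batch[-1], len(batch))
```

    A laboratory that maps 384-well plates to 96-well quadrants by interleaving rows and columns would need a different ordering than the consecutive chunks used here.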

  9. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing.

    PubMed

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-09-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes has never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  10. Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing

    PubMed Central

    Faucon, Frederic; Dusfour, Isabelle; Gaude, Thierry; Navratil, Vincent; Boyer, Frederic; Chandre, Fabrice; Sirisopa, Patcharawan; Thanispong, Kanutcharee; Juntarajumnong, Waraporn; Poupardin, Rodolphe; Chareonviriyaphap, Theeraphap; Girod, Romain; Corbel, Vincent; Reynaud, Stephane; David, Jean-Philippe

    2015-01-01

    The capacity of mosquitoes to resist insecticides threatens the control of diseases such as dengue and malaria. Until alternative control tools are implemented, characterizing resistance mechanisms is crucial for managing resistance in natural populations. Insecticide biodegradation by detoxification enzymes is a common resistance mechanism; however, the genomic changes underlying this mechanism have rarely been identified, precluding individual resistance genotyping. In particular, the role of copy number variations (CNVs) and polymorphisms of detoxification enzymes has never been investigated at the genome level, although they can represent robust markers of metabolic resistance. In this context, we combined target enrichment with high-throughput sequencing for conducting the first comprehensive screening of gene amplifications and polymorphisms associated with insecticide resistance in mosquitoes. More than 760 candidate genes were captured and deep sequenced in several populations of the dengue mosquito Ae. aegypti displaying distinct genetic backgrounds and contrasted resistance levels to the insecticide deltamethrin. CNV analysis identified 41 gene amplifications associated with resistance, most affecting cytochrome P450s overtranscribed in resistant populations. Polymorphism analysis detected more than 30,000 variants and strong selection footprints in specific genomic regions. Combining Bayesian and allele frequency filtering approaches identified 55 nonsynonymous variants strongly associated with resistance. Both CNVs and polymorphisms were conserved within regions but differed across continents, confirming that genomic changes underlying metabolic resistance to insecticides are not universal. By identifying novel DNA markers of insecticide resistance, this study opens the way for tracking down metabolic changes developed by mosquitoes to resist insecticides within and among populations. PMID:26206155

  11. Exome Sequencing Identifies a Novel Homozygous Mutation in the Phosphate Transporter SLC34A1 in Hypophosphatemia and Nephrocalcinosis

    PubMed Central

    Rajagopal, Abbhirami; Braslavsky, Débora; Lu, James T.; Kleppe, Soledad; Clément, Florencia; Cassinelli, Hamilton; Liu, David S.; Liern, Jose Miguel; Vallejo, Graciela; Bergadá, Ignacio; Gibbs, Richard A.; Campeau, Phillipe M.

    2014-01-01

    Context: Two Argentinean siblings (a boy and a girl) from a nonconsanguineous family presented with hypercalcemia, hypercalciuria, hypophosphatemia, low parathyroid hormone (PTH), and nephrocalcinosis. Objective: The goal of this study was to identify genetic causes of the clinical findings in the two siblings. Design: Whole exome sequencing was performed to identify disease-causing mutations in the youngest sibling, and a candidate variant was screened in other family members by Sanger sequencing. In vitro experiments were conducted to determine the effects of the mutation that was identified. Patients and Other Participants: Affected siblings (2 y.o. female and 10 y.o. male) and their parents were included in the study. Informed consent was obtained for genetic studies. Results: A novel homozygous mutation in the gene encoding the renal sodium-dependent phosphate transporter SLC34A1 was identified in both siblings (c.1484G>A, p.Arg495His). In vitro studies showed that the p.Arg495His mutation resulted in decreased phosphate uptake when compared to wild-type SLC34A1. Conclusions: The homozygous G>A transition that results in the substitution of histidine for arginine at position 495 of the renal sodium-dependent phosphate transporter, SLC34A1, is involved in disease pathogenesis in these patients. Our report of the second family with two mutated SLC34A1 alleles expands the known phenotype of this rare condition. PMID:25050900

  12. Candidate DNA repair susceptibility genes identified by exome sequencing in high-risk pancreatic cancer.

    PubMed

    Smith, Alyssa L; Alirezaie, Najmeh; Connor, Ashton; Chan-Seng-Yue, Michelle; Grant, Robert; Selander, Iris; Bascuñana, Claire; Borgida, Ayelet; Hall, Anita; Whelan, Thomas; Holter, Spring; McPherson, Treasa; Cleary, Sean; Petersen, Gloria M; Omeroglu, Atilla; Saloustros, Emmanouil; McPherson, John; Stein, Lincoln D; Foulkes, William D; Majewski, Jacek; Gallinger, Steven; Zogopoulos, George

    2016-01-28

    The genetic basis underlying the majority of hereditary pancreatic adenocarcinoma (PC) is unknown. Since DNA repair genes are widely implicated in gastrointestinal malignancies, including PC, we hypothesized that there are novel DNA repair PC susceptibility genes. As germline DNA repair gene mutations may lead to PC subtypes with selective therapeutic responses, we also hypothesized that there is an overall survival (OS) difference in mutation carriers versus non-carriers. We therefore interrogated the germline exomes of 109 high-risk PC cases for rare protein-truncating variants (PTVs) in 513 putative DNA repair genes. We identified PTVs in 41 novel genes among 36 kindred. Additional genetic evidence for causality was obtained for 17 genes, with FAN1, NEK1 and RHNO1 emerging as the strongest candidates. An OS difference was observed for carriers versus non-carriers of PTVs with early stage (≤IIB) disease. This adverse survival trend in carriers with early stage disease was also observed in an independent series of 130 PC cases. We identified candidate DNA repair PC susceptibility genes and suggest that carriers of a germline PTV in a DNA repair gene with early stage disease have worse survival. PMID:26546047

  13. Distinct myeloid progenitor-differentiation pathways identified through single-cell RNA sequencing.

    PubMed

    Drissen, Roy; Buza-Vidas, Natalija; Woll, Petter; Thongjuea, Supat; Gambardella, Adriana; Giustacchini, Alice; Mancini, Elena; Zriwil, Alya; Lutteropp, Michael; Grover, Amit; Mead, Adam; Sitnicka, Ewa; Jacobsen, Sten Eirik W; Nerlov, Claus

    2016-06-01

    According to current models of hematopoiesis, lymphoid-primed multi-potent progenitors (LMPPs) (Lin(-)Sca-1(+)c-Kit(+)CD34(+)Flt3(hi)) and common myeloid progenitors (CMPs) (Lin(-)Sca-1(+)c-Kit(+)CD34(+)CD41(hi)) establish an early branch point for separate lineage-commitment pathways from hematopoietic stem cells, with the notable exception that both pathways are proposed to generate all myeloid innate immune cell types through the same myeloid-restricted pre-granulocyte-macrophage progenitor (pre-GM) (Lin(-)Sca-1(-)c-Kit(+)CD41(-)FcγRII/III(-)CD150(-)CD105(-)). By single-cell transcriptome profiling of pre-GMs, we identified distinct myeloid differentiation pathways: a pathway expressing the gene encoding the transcription factor GATA-1 generated mast cells, eosinophils, megakaryocytes and erythroid cells, and a pathway lacking expression of that gene generated monocytes, neutrophils and lymphocytes. These results identify an early hematopoietic-lineage bifurcation that separates the myeloid lineages before their segregation from other hematopoietic-lineage potential. PMID:27043410

  14. Metagenomic sequencing of bile from gallstone patients to identify different microbial community patterns and novel biliary bacteria

    PubMed Central

    Shen, Hongzhang; Ye, Fuqiang; Xie, Lu; Yang, Jianfeng; Li, Zhen; Xu, Peisong; Meng, Fei; Li, Lei; Chen, Ying; Bo, Xiaochen; Ni, Ming; Zhang, Xiaofeng

    2015-01-01

    Despite the high worldwide prevalence of gallstone disease, the role of the biliary microbiota in gallstone pathogenesis remains obscure. Next-generation sequencing offers advantages for systematically understanding the human microbiota; however, there have been few such investigations of the biliary microbiome. Here, we performed whole-metagenome shotgun (WMS) sequencing and 16S rRNA sequencing on bile samples from 15 Chinese patients with gallstone disease. Microbial communities of most individuals were clustered into two types, according to the relative enrichment of different intestinal bacterial species. In the bile samples, oral cavity/respiratory tract inhabitants were more prevalent than intestinal inhabitants and existed in both community types. Unexpectedly, the two types were not associated with fever status or surgical history, and many bacteria were patient-specific. We identified 13 novel biliary bacteria based on WMS sequencing, as well as genes encoding putative proteins related to gallstone formation and bile resistance (e.g., β-glucuronidase and multidrug efflux pumps). Bile samples from gallstone patients had reduced microbial diversity compared to healthy faecal samples. Patient samples were enriched in pathways related to oxidative stress and flagellar assembly, whereas carbohydrate metabolic pathways showed varying behaviours. As the first biliary WMS survey, our study reveals the complexity and specificity of biliary microecology. PMID:26625708

  15. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews Correlation Coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818
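
    For readers unfamiliar with the metric quoted above, the following minimal sketch computes the Matthews correlation coefficient from confusion-matrix counts and checks the "balanced protein" ratio criterion. It is not the ENTPRISE code: the function names are hypothetical, the inclusive bounds on the 0.3 to 3 ratio are an assumption, and returning 0.0 for a zero denominator is a common convention rather than something stated in the abstract.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def is_balanced(n_neutral, n_disease, low=0.3, high=3.0):
    """True if the neutral/disease-associated ratio lies within [low, high].

    Inclusive bounds are an assumption; proteins with no
    disease-associated variations are treated as unbalanced.
    """
    if n_disease == 0:
        return False
    return low <= n_neutral / n_disease <= high


print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))
print(is_balanced(n_neutral=50, n_disease=100))
```

    The coefficient ranges from -1 (complete disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over raw accuracy when classes are unevenly sized.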

  16. Whole exome sequencing identifies a novel frameshift mutation in GPC3 gene in a patient with overgrowth syndrome.

    PubMed

    Das Bhowmik, Aneek; Dalal, Ashwin

    2015-11-10

    Overgrowth syndromes are a heterogeneous group of diseases characterized by focal or generalized overgrowth. Many of the syndromes have overlapping clinical features and it is difficult to diagnose the condition based on clinical features alone. In the present study we report on a patient with overgrowth syndrome where extensive investigation did not reveal the cause of disease. Finally exome sequencing revealed a novel hemizygous single base pair deletion in exon 8 of GPC3 gene (chrX:132670203delA) resulting in a frameshift and creating a new stop codon at 62 amino acids downstream to codon 564 (c.1692delT; p.Leu565SerfsTer63) of the protein. The mutation was confirmed by Sanger sequencing. The mother was found to be heterozygous for the mutation. This variation is not reported in the 1000 Genomes, Exome Variant Server (EVS), Exome Aggregation Consortium (ExAC) and dbSNP databases and the region is conserved across primates. Exome sequencing was helpful in establishing diagnosis of Simpson-Golabi-Behmel syndrome type 1 (SGBS1) in a patient with unknown overgrowth syndrome. PMID:26321508

  17. Unbiased phosphoproteomic method identifies the initial effects of a methacrylic acid copolymer on macrophages.

    PubMed

    Chamberlain, Michael Dean; Wells, Laura A; Lisovsky, Alexandra; Guo, Hongbo; Isserlin, Ruth; Talior-Volodarsky, Ilana; Mahou, Redouan; Emili, Andrew; Sefton, Michael V

    2015-08-25

    An unbiased phosphoproteomic method was used to identify biomaterial-associated changes in the phosphorylation patterns of macrophage-like cells. The phosphorylation differences between differentiated THP1 (dTHP1) cells treated for 10, 20, or 30 min with a vascular regenerative methacrylic acid (MAA) copolymer or a control methyl methacrylate (MM) copolymer were determined by MS. There were 1,470 peptides (corresponding to 729 proteins) that were differentially phosphorylated in dTHP1 cells treated with the two materials with a greater cellular response to MAA treatment. In addition to identifying pathways (such as integrin signaling and cytoskeletal arrangement) that are well known to change with cell-material interaction, previously unidentified pathways, such as apoptosis and mRNA splicing, were also discovered. PMID:26261332

  18. Unbiased phosphoproteomic method identifies the initial effects of a methacrylic acid copolymer on macrophages

    PubMed Central

    Chamberlain, Michael Dean; Wells, Laura A.; Lisovsky, Alexandra; Guo, Hongbo; Isserlin, Ruth; Talior-Volodarsky, Ilana; Mahou, Redouan; Emili, Andrew; Sefton, Michael V.

    2015-01-01

    An unbiased phosphoproteomic method was used to identify biomaterial-associated changes in the phosphorylation patterns of macrophage-like cells. The phosphorylation differences between differentiated THP1 (dTHP1) cells treated for 10, 20, or 30 min with a vascular regenerative methacrylic acid (MAA) copolymer or a control methyl methacrylate (MM) copolymer were determined by MS. There were 1,470 peptides (corresponding to 729 proteins) that were differentially phosphorylated in dTHP1 cells treated with the two materials with a greater cellular response to MAA treatment. In addition to identifying pathways (such as integrin signaling and cytoskeletal arrangement) that are well known to change with cell–material interaction, previously unidentified pathways, such as apoptosis and mRNA splicing, were also discovered. PMID:26261332

  19. Sequence-based association and selection scans identify drug resistance loci in the Plasmodium falciparum malaria parasite

    PubMed Central

    Park, Daniel J.; Lukens, Amanda K.; Neafsey, Daniel E.; Schaffner, Stephen F.; Chang, Hsiao-Han; Valim, Clarissa; Ribacke, Ulf; Van Tyne, Daria; Galinsky, Kevin; Galligan, Meghan; Becker, Justin S.; Ndiaye, Daouda; Mboup, Souleymane; Wiegand, Roger C.; Hartl, Daniel L.; Sabeti, Pardis C.; Wirth, Dyann F.; Volkman, Sarah K.

    2012-01-01

    Through rapid genetic adaptation and natural selection, the Plasmodium falciparum parasite—the deadliest of those that cause malaria—is able to develop resistance to antimalarial drugs, thwarting present efforts to control it. Genome-wide association studies (GWAS) provide a critical hypothesis-generating tool for understanding how this occurs. However, in P. falciparum, the limited amount of linkage disequilibrium hinders the power of traditional array-based GWAS. Here, we demonstrate the feasibility and power improvements gained by using whole-genome sequencing for association studies. We analyzed data from 45 Senegalese parasites and identified genetic changes associated with the parasites’ in vitro response to 12 different antimalarials. To further increase statistical power, we adapted a common test for natural selection, XP-EHH (cross-population extended haplotype homozygosity), and used it to identify genomic regions associated with resistance to drugs. Using this sequence-based approach and the combination of association and selection-based tests, we detected several loci associated with drug resistance. These loci included the previously known signals at pfcrt, dhfr, and pfmdr1, as well as many genes not previously implicated in drug-resistance roles, including genes in the ubiquitination pathway. Based on the success of the analysis presented in this study, and on the demonstrated shortcomings of array-based approaches, we argue for a complete transition to sequence-based GWAS for small, low linkage-disequilibrium genomes like that of P. falciparum. PMID:22826220

  20. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996
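
    The segment lengths given above can be checked against the stated total of 339 residues with a few lines of arithmetic. The sketch below derives residue ranges from the lengths in the abstract; numbering from residue 1 at the start of the prepeptide is an assumption, and the function name is hypothetical.

```python
# Segment lengths as stated in the abstract, in NH2-to-COOH order.
SEGMENTS = [
    ("prepeptide", 17),
    ("propeptide", 62),
    ("mature cathepsin B", 254),
    ("COOH-terminal extension", 6),
]


def segment_ranges(segments):
    """Convert (name, length) pairs into (name, start, end) residue ranges.

    Numbering starts at 1 for the first residue of the first segment
    (an assumption for this sketch).
    """
    ranges = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


ranges = segment_ranges(SEGMENTS)
total = sum(length for _, length in SEGMENTS)
for name, start, end in ranges:
    print(f"{name}: {start}-{end}")
print("total residues:", total)
```

    Under this numbering the four segments sum to 339 residues, matching the predicted length of preprocathepsin B.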

  1. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  2. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  3. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken]; SNL,

    2013-01-25

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  4. The amino acid sequence of ribonuclease U2 from Ustilago sphaerogena.

    PubMed Central

    Sato, S; Uchida, T

    1975-01-01

    1. RNAase (ribonuclease) U2, a purine-specific RNAase, was reduced, aminoethylated and hydrolysed with trypsin, chymotrypsin and thermolysin. On the basis of the analyses of the resulting peptides, the complete amino acid sequence of RNAase U2 was determined. 2. When the sequence was compared with the amino acid sequence of RNAase T1 (EC 3.1.4.8), the following regions were found to be similar in the two enzymes: Tyr-Pro-His-Gln-Tyr (38-42) in RNAase U2 and Tyr-Pro-His-Lys-Tyr (38-42) in RNAase T1, Glu-Phe-Pro-Leu-Val (61-65) in RNAase U2 and Glu-Trp-Pro-Ile-Leu (58-62) in RNAase T1, Asp-Arg-Val-Ile-Tyr-Gln (83-88) in RNAase U2 and Asp-Arg-Val-Phe-Asn (76-81) in RNAase T1 and Val-Thr-His-Thr-Gly-Ala (98-103) in RNAase U2 and Ile-Thr-His-Thr-Gly-Ala (90-95) in RNAase T1. All of the amino acid residues, histidine-40, glutamate-58, arginine-77 and histidine-92, which were found to play a crucial role in the biological activity of RNAase T1, were included in the regions cited here. 3. Detailed evidence for the amino acid sequence of the proteins has been deposited as Supplementary Publication SUP 50041 (33 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1975), 145, 5. PMID:1156364
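
    The similarity between the listed regions can be made concrete by counting identical residues position by position. The following is a minimal sketch, not the authors' method: it compares only the three region pairs for which the abstract lists the same number of residues in both enzymes. The Asp-Arg-Val pair is left out because the record gives six residues for RNAase U2 and five for RNAase T1, so a positional comparison is not defined without an alignment. The names `REGION_PAIRS` and `identity` are hypothetical.

```python
# Region pairs copied from the abstract as (RNAase U2, RNAase T1).
# Only pairs listed with equal numbers of residues are included.
REGION_PAIRS = [
    ("Tyr-Pro-His-Gln-Tyr", "Tyr-Pro-His-Lys-Tyr"),
    ("Glu-Phe-Pro-Leu-Val", "Glu-Trp-Pro-Ile-Leu"),
    ("Val-Thr-His-Thr-Gly-Ala", "Ile-Thr-His-Thr-Gly-Ala"),
]


def identity(u2_region, t1_region):
    """Return (identical_positions, region_length) for two ungapped regions."""
    u2_residues = u2_region.split("-")
    t1_residues = t1_region.split("-")
    if len(u2_residues) != len(t1_residues):
        raise ValueError("regions must have the same number of residues")
    matches = sum(a == b for a, b in zip(u2_residues, t1_residues))
    return matches, len(u2_residues)


for u2_region, t1_region in REGION_PAIRS:
    matches, length = identity(u2_region, t1_region)
    print(f"{u2_region} vs {t1_region}: {matches}/{length} identical")
```

    Exact residue identity is a stricter measure than the "similar" used in the abstract, since conservative substitutions such as Leu/Ile count as mismatches here.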

  5. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance.

    PubMed

    Bart, Rebecca; Cohn, Megan; Kassen, Andrew; McCallum, Emily J; Shybut, Mikel; Petriello, Annalise; Krasileva, Ksenia; Dahlbeck, Douglas; Medina, Cesar; Alicai, Titus; Kumar, Lava; Moreira, Leandro M; Rodrigues Neto, Júlio; Verdier, Valerie; Santana, María Angélica; Kositcharoenkul, Nuttima; Vanderschuren, Hervé; Gruissem, Wilhelm; Bernal, Adriana; Staskawicz, Brian J

    2012-07-10

    Cassava bacterial blight (CBB), incited by Xanthomonas axonopodis pv. manihotis (Xam), is the most important bacterial disease of cassava, a staple food source for millions of people in developing countries. Here we present a widely applicable strategy for elucidating the virulence components of a pathogen population. We report Illumina-based draft genomes for 65 Xam strains and deduce the phylogenetic relatedness of Xam across the areas where cassava is grown. Using an extensive database of effector proteins from animal and plant pathogens, we identify the effector repertoire for each sequenced strain and use a comparative sequence analysis to deduce the least polymorphic of the conserved effectors. These highly conserved effectors have been maintained over 11 countries, three continents, and 70 y of evolution and as such represent ideal targets for developing resistance strategies. PMID:22699502

  6. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0 kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  7. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing of different DNA fragments of the virus. The genome organisation of PstDNV revealed that there were three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and amino acid sequences of three proteins viz., NS1, NS2 and VP were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequence available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand. PMID:21402111

  8. Human liver type pyruvate kinase: complete amino acid sequence and the expression in mammalian cells.

    PubMed Central

    Tani, K; Fujii, H; Nagata, S; Miwa, S

    1988-01-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. We isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method. PMID:3126495
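
    The base-pair and amino-acid counts in this abstract are linked by the three-nucleotide codon. The following is a minimal sketch, not part of the cited record, that checks that relationship and sums the three reported segments. The constant names are hypothetical, and treating the sum as the length of the three reported segments only (rather than of the whole clone) is an assumption, since the abstract does not say whether anything else, such as a stop codon or poly(A) tail, is counted separately.

```python
# Segment sizes in base pairs, as reported in the abstract.
FIVE_PRIME_NONCODING_BP = 68
CODING_BP = 1629
THREE_PRIME_NONCODING_BP = 734

# Each amino acid is encoded by one codon of three nucleotides.
codons, leftover_bp = divmod(CODING_BP, 3)

# Sum of the three reported segments only (see the assumption above).
reported_segments_bp = (
    FIVE_PRIME_NONCODING_BP + CODING_BP + THREE_PRIME_NONCODING_BP
)

print("codons in coding region:", codons)
print("leftover base pairs:", leftover_bp)
print("sum of reported segments (bp):", reported_segments_bp)
```

    The coding region divides evenly into 543 codons, which matches the 543 amino acids stated in the abstract.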

  9. Human liver type pyruvate kinase: Complete amino acid sequence and the expression in mammalian cells

    SciTech Connect

    Tani, Kenzaburo; Nagata, Shigekazu; Fujii, Hisaichi; Miwa, Shiro

    1988-03-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. The authors isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1,629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method.

  10. Genome-Wide Linkage, Exome Sequencing and Functional Analyses Identify ABCB6 as the Pathogenic Gene of Dyschromatosis Universalis Hereditaria

    PubMed Central

    Wang, Na; Wang, Chuan; Chen, Xuechao; Sheng, Donglai; Fu, Xi’an; See, Kelvin; Foo, Jia Nee; Low, Huiqi; Liany, Herty; Irwan, Ishak Darryl; Liu, Jian; Yang, Baoqi; Chen, Mingfei; Yu, Yongxiang; Yu, Gongqi; Niu, Guiye; You, Jiabao; Zhou, Yan; Ma, Shanshan; Wang, Ting; Yan, Xiaoxiao; Goh, Boon Kee; Common, John E. A.; Lane, Birgitte E.; Sun, Yonghu; Zhou, Guizhi; Lu, Xianmei; Wang, Zhenhua; Tian, Hongqing; Cao, Yuanhua; Chen, Shumin; Liu, Qiji; Liu, Jianjun; Zhang, Furen

    2014-01-01

    Background As a genetic disorder of abnormal pigmentation, the molecular basis of dyschromatosis universalis hereditaria (DUH) had remained unclear until recently when ABCB6 was reported as a causative gene of DUH. Methodology We performed a genome-wide linkage scan using Illumina Human 660W-Quad BeadChip and exome sequencing analyses using Agilent SureSelect Human All Exon Kits in a multiplex Chinese DUH family to identify the pathogenic mutations and verified the candidate mutations using Sanger sequencing. Quantitative RT-PCR and immunohistochemistry were performed to verify the expression of the pathogenic gene; zebrafish was also used to confirm the functional role of ABCB6 in melanocytes and pigmentation. Results Genome-wide linkage (assuming autosomal dominant inheritance mode) and exome sequencing analyses identified ABCB6 as the disease candidate gene by discovering a coding mutation (c.1358C>T; p.Ala453Val) that co-segregates with the disease phenotype. Further mutation analysis of ABCB6 in four other DUH families and two sporadic cases by Sanger sequencing confirmed the mutation (c.1358C>T; p.Ala453Val) and discovered a second, co-segregating coding mutation (c.964A>C; p.Ser322Lys) in one of the four families. Both mutations were heterozygous in DUH patients and not present in the 1000 Genomes Project and dbSNP database as well as 1,516 unrelated Chinese healthy controls. Expression analysis in human skin and mutagenesis interrogation in zebrafish confirme