sequences previously identified: Topics by Science.gov

Sample records for sequences previously identified

Characterization and complete genome sequence of a previously uncharacterized panicovirus from Bermuda grass detected by high throughput sequencing

USDA-ARS's Scientific Manuscript database

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high throughput sequencing (HTS). The nearly full genome sequence of a previously uncharacterized Panicovirus was identified from...
Targeted next-generation sequencing makes new molecular diagnoses and expands genotype-phenotype relationship in Ehlers-Danlos syndrome.

PubMed

Weerakkody, Ruwan A; Vandrovcova, Jana; Kanonidou, Christina; Mueller, Michael; Gampawar, Piyush; Ibrahim, Yousef; Norsworthy, Penny; Biggs, Jennifer; Abdullah, Abdulshakur; Ross, David; Black, Holly A; Ferguson, David; Cheshire, Nicholas J; Kazkaz, Hanadi; Grahame, Rodney; Ghali, Neeti; Vandersteen, Anthony; Pope, F Michael; Aitman, Timothy J

2016-11-01

Ehlers-Danlos syndrome (EDS) comprises a group of overlapping hereditary disorders of connective tissue with significant morbidity and mortality, including major vascular complications. We sought to identify the diagnostic utility of a next-generation sequencing (NGS) panel in a mixed EDS cohort. We developed and applied PCR-based NGS assays for targeted, unbiased sequencing of 12 collagen and aortopathy genes to a cohort of 177 unrelated EDS patients. Variants were scored blind to previous genetic testing and then compared with results of previous Sanger sequencing. Twenty-eight pathogenic variants in COL5A1/2, COL3A1, FBN1, and COL1A1 and four likely pathogenic variants in COL1A1, TGFBR1/2, and SMAD3 were identified by the NGS assays. These included all previously detected single-nucleotide and other short pathogenic variants in these genes, and seven newly detected pathogenic or likely pathogenic variants leading to clinically significant diagnostic revisions. Twenty-two variants of uncertain significance were identified, seven of which were in aortopathy genes and required clinical follow-up. Unbiased NGS-based sequencing made new molecular diagnoses outside the expected EDS genotype-phenotype relationship and identified previously undetected clinically actionable variants in aortopathy susceptibility genes. These data may be of value in guiding future clinical pathways for genetic diagnosis in EDS. Genet Med 18(11), 1119-1127.
Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease.

PubMed

Carss, Keren J; Arno, Gavin; Erwood, Marie; Stephens, Jonathan; Sanchis-Juan, Alba; Hull, Sarah; Megy, Karyn; Grozeva, Detelina; Dewhurst, Eleanor; Malka, Samantha; Plagnol, Vincent; Penkett, Christopher; Stirrups, Kathleen; Rizzo, Roberta; Wright, Genevieve; Josifova, Dragana; Bitner-Glindzicz, Maria; Scott, Richard H; Clement, Emma; Allen, Louise; Armstrong, Ruth; Brady, Angela F; Carmichael, Jenny; Chitre, Manali; Henderson, Robert H H; Hurst, Jane; MacLaren, Robert E; Murphy, Elaine; Paterson, Joan; Rosser, Elisabeth; Thompson, Dorothy A; Wakeling, Emma; Ouwehand, Willem H; Michaelides, Michel; Moore, Anthony T; Webster, Andrew R; Raymond, F Lucy

2017-01-05

Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in CHM in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease. Copyright © 2017. Published by Elsevier Inc.
Enhanced arbovirus surveillance with deep sequencing: Identification of novel rhabdoviruses and bunyaviruses in Australian mosquitoes.

PubMed

Coffey, Lark L; Page, Brady L; Greninger, Alexander L; Herring, Belinda L; Russell, Richard C; Doggett, Stephen L; Haniotis, John; Wang, Chunlin; Deng, Xutao; Delwart, Eric L

2014-01-05

Viral metagenomics characterizes known and identifies unknown viruses based on sequence similarities to any previously sequenced viral genomes. A metagenomics approach was used to identify virus sequences in Australian mosquitoes causing cytopathic effects in inoculated mammalian cell cultures. Sequence comparisons revealed strains of Liao Ning virus (Reovirus, Seadornavirus), previously detected only in China, livestock-infecting Stretch Lagoon virus (Reovirus, Orbivirus), two novel dimarhabdoviruses, named Beaumont and North Creek viruses, and two novel orthobunyaviruses, named Murrumbidgee and Salt Ash viruses. The novel virus proteomes diverged by ≥ 50% relative to their closest previously genetically characterized viral relatives. Deep sequencing also generated genomes of Warrego and Wallal viruses, orbiviruses linked to kangaroo blindness, whose genomes had not been fully characterized. This study highlights viral metagenomics in concert with traditional arbovirus surveillance to characterize known and new arboviruses in field-collected mosquitoes. Follow-up epidemiological studies are required to determine whether the novel viruses infect humans. © 2013 Elsevier Inc. All rights reserved.
Automated conserved non-coding sequence (CNS) discovery reveals differences in gene content and promoter evolution among grasses

PubMed Central

Turco, Gina; Schnable, James C.; Pedersen, Brent; Freeling, Michael

2013-01-01

Conserved non-coding sequences (CNS) are islands of non-coding sequence that, like protein coding exons, show less divergence in sequence between related species than functionless DNA. Several CNSs have been demonstrated experimentally to function as cis-regulatory regions. However, the specific functions of most CNSs remain unknown. Previous searches for CNS in plants have either anchored on exons and only identified nearby sequences or required years of painstaking manual annotation. Here we present an open source tool that can accurately identify CNSs between any two related species with sequenced genomes, including both those immediately adjacent to exons and distal sequences separated by >12 kb of non-coding sequence. We have used this tool to characterize new motifs, associate CNSs with additional functions, and identify previously undetected genes encoding RNA and protein in the genomes of five grass species. We provide a list of 15,363 orthologous CNSs conserved across all grasses tested. We were also able to identify regulatory sequences present in the common ancestor of grasses that have been lost in one or more extant grass lineages. Lists of orthologous gene pairs and associated CNSs are provided for reference inbred lines of arabidopsis, Japonica rice, foxtail millet, sorghum, brachypodium, and maize. PMID:23874343
Targeted Analysis of Whole Genome Sequence Data to Diagnose Genetic Cardiomyopathy

DOE PAGES

Golbus, Jessica R.; Puckelwartz, Megan J.; Dellefave-Castillo, Lisa; ...

2014-09-01

Background: Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of more than 50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift towards comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Methods and Results: Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1-14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and/or segregation analysis. Three of three previously identified primary mutations were detected by this analysis. In six subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and/or had additional pathological correlation to provide evidence for causality. For two subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. We conclude that these pilot data demonstrate that ~30-40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes.
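The targeted-analysis step described in this abstract (restricting genome-wide calls to a gene panel, then scoring by predicted effect and population frequency) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Variant` fields, the gene names, the frequency cutoff, and the vote threshold are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str                # gene symbol (hypothetical placeholder names below)
    allele_frequency: float  # population frequency from a public database
    damaging_votes: int      # number of prediction tools calling it damaging


def targeted_filter(variants, panel, max_af=0.01, min_votes=2):
    """Keep rare, predicted-damaging variants that fall in panel genes.

    max_af and min_votes are illustrative thresholds, not the study's values.
    """
    return [
        v for v in variants
        if v.gene in panel
        and v.allele_frequency <= max_af
        and v.damaging_votes >= min_votes
    ]


# Toy data: GENE_A and GENE_B stand in for panel genes; GENE_X is off-panel.
panel = {"GENE_A", "GENE_B"}
calls = [
    Variant("GENE_A", 0.0001, 3),  # rare, damaging, on panel: kept
    Variant("GENE_A", 0.2500, 3),  # too common: dropped
    Variant("GENE_B", 0.0005, 1),  # too few damaging votes: dropped
    Variant("GENE_X", 0.0001, 4),  # off panel: dropped
]
candidates = targeted_filter(calls, panel)
print([v.gene for v in candidates])
```

A real pipeline would follow this with the clinical and segregation review that the abstract describes; the filter only narrows the candidate list.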
Use of Genome Sequence Information for Meat Quality Trait QTL Mining for Causal Genes and Mutations on Pig Chromosome 17

PubMed Central

Hu, Zhi-Liang; Ramos, Antonio M.; Humphray, Sean J.; Rogers, Jane; Reecy, James M.; Rothschild, Max F.

2011-01-01

The newly available pig genome sequence has provided new information to fine map quantitative trait loci (QTL) in order to eventually identify causal variants. With targeted genomic sequencing efforts, we were able to obtain high quality BAC sequences that cover a region on pig chromosome 17 where a number of meat quality QTL have been previously discovered. Sequences from 70 BAC clones were assembled to form an 8-Mbp contig. Subsequently, we successfully mapped five previously identified QTL, three for meat color and two for lactate related traits, to the contig. With an additional 25 genetic markers that were identified by sequence comparison, we were able to carry out further linkage disequilibrium analysis to narrow down the genomic locations of these QTL, which allowed identification of the chromosomal regions that likely contain the causative variants. This research has provided one practical approach to combine genetic and molecular information for QTL mining. PMID:22303339
Common Viral Integration Sites Identified in Avian Leukosis Virus-Induced B-Cell Lymphomas

PubMed Central

Justice, James F.; Morgan, Robin W.

2015-01-01

Avian leukosis virus (ALV) induces B-cell lymphoma and other neoplasms in chickens by integrating within or near cancer genes and perturbing their expression. Four genes (MYC, MYB, Mir-155, and TERT) have previously been identified as common integration sites in these virus-induced lymphomas and are thought to play a causal role in tumorigenesis. In this study, we employ high-throughput sequencing to identify additional genes driving tumorigenesis in ALV-induced B-cell lymphomas. In addition to the four genes implicated previously, we identify other genes as common integration sites, including TNFRSF1A, MEF2C, CTDSPL, TAB2, RUNX1, MLL5, CXorf57, and BACH2. We also analyze the genome-wide ALV integration landscape in vivo and find increased frequency of ALV integration near transcriptional start sites and within transcripts. Previous work has shown ALV prefers a weak consensus sequence for integration in cultured human cells. We confirm this consensus sequence for ALV integration in vivo in the chicken genome. PMID:26670384
Experience of targeted Usher exome sequencing as a clinical test

PubMed Central

Besnard, Thomas; García-García, Gema; Baux, David; Vaché, Christel; Faugère, Valérie; Larrieu, Lise; Léonard, Susana; Millan, Jose M; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

2014-01-01

We show that massively parallel targeted sequencing of 19 genes provides a new and reliable strategy for molecular diagnosis of Usher syndrome (USH) and nonsyndromic deafness, particularly appropriate for these disorders characterized by a high clinical and genetic heterogeneity and a complex structure of several of the genes involved. A series of 71 patients including Usher patients previously screened by Sanger sequencing plus newly referred patients was studied. Ninety-eight percent of the variants previously identified by Sanger sequencing were found by next-generation sequencing (NGS). NGS proved to be efficient as it offers analysis of all relevant genes, which is laborious to achieve with Sanger sequencing. Among the 13 newly referred Usher patients, both mutations in the same gene were identified in 77% of cases (10 patients) and one candidate pathogenic variant in two additional patients. This work can be considered as a pilot for implementing NGS for genetically heterogeneous diseases in clinical service. PMID:24498627
Detection of Low-Level Cardinium and Wolbachia Infections in Culicoides

PubMed Central

Mee, Peter T.; Weeks, Andrew R.; Walker, Peter J.; Hoffmann, Ary A.

2015-01-01

Bacterial endosymbionts have been identified as potentially useful biological control agents for a range of invertebrate vectors of disease. Previous studies of Culicoides (Diptera: Ceratopogonidae) species using conventional PCR assays have provided evidence of Wolbachia (1/33) and Cardinium (8/33) infections. Here, we screened 20 species of Culicoides for Wolbachia and Cardinium, utilizing a combination of conventional PCR and more sensitive quantitative PCR (qPCR) assays. Low levels of Cardinium DNA were detected in females of all but one of the Culicoides species screened, and low levels of Wolbachia were detected in females of 9 of the 20 Culicoides species. Sequence analysis based on partial 16S rRNA gene and gyrB sequences identified “Candidatus Cardinium hertigii” from group C, which has previously been identified in Culicoides from Japan, Israel, and the United Kingdom. Wolbachia strains detected in this study showed 98 to 99% sequence identity to Wolbachia previously detected from Culicoides based on the 16S rRNA gene, whereas a strain with a novel wsp sequence was identified in Culicoides narrabeenensis. Cardinium isolates grouped to geographical regions independent of the host Culicoides species, suggesting possible geographical barriers to Cardinium movement. Screening also identified Asaia bacteria in Culicoides. These findings point to a diversity of low-level endosymbiont infections in Culicoides, providing candidates for further characterization and highlighting the widespread occurrence of these endosymbionts in this insect group. PMID:26150447
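The "98 to 99% sequence identity" figure in this abstract is a pairwise comparison of aligned gene sequences. The sketch below shows one common way to compute such a value, assuming the two sequences are already aligned to equal length and that gapped columns are excluded from the denominator. Other conventions exist (for example, counting gaps as mismatches), and the abstract does not say which one the authors used; the sequences are made up.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, which is
    one convention among several.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (not real 16S rRNA data): 9 of 10 positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```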
The spectrum and clinical impact of epigenetic modifier mutations in myeloma

PubMed Central

Pawlyn, Charlotte; Kaiser, Martin F; Heuck, Christoph; Melchor, Lorenzo; Wardell, Christopher P; Murison, Alex; Chavan, Shweta; Johnson, David C; Begum, Dil; Dahir, Nasrin; Proszek, Paula; Cairns, David A; Boyle, Eileen M; Jones, John R; Cook, Gordon; Drayson, Mark T; Owen, Roger G; Gregory, Walter M; Jackson, Graham H; Barlogie, Bart; Davies, Faith E; Walker, Brian A; Morgan, Gareth J

2016-01-01

Purpose: Epigenetic dysregulation is known to be an important contributor to myeloma pathogenesis but, unlike in other B cell malignancies, the full spectrum of somatic mutations in epigenetic modifiers has not been previously reported. We sought to address this using results from whole-exome sequencing in the context of a large prospective clinical trial of newly diagnosed patients and targeted sequencing in a cohort of previously treated patients for comparison. Experimental Design: Whole-exome sequencing analysis of 463 presenting myeloma cases entered in the UK NCRI Myeloma XI study and targeted sequencing analysis of 156 previously treated cases from the University of Arkansas for Medical Sciences. We correlated the presence of mutations with clinical outcome from diagnosis and compared the mutations found at diagnosis with later stages of disease. Results: In diagnostic myeloma patient samples we identify significant mutations in genes encoding the histone 1 linker protein, previously identified in other B-cell malignancies. Our data suggest an adverse prognostic impact from the presence of lesions in genes encoding DNA methylation modifiers and the histone demethylase KDM6A/UTX. The frequency of mutations in epigenetic modifiers appears to increase following treatment, most notably in genes encoding histone methyltransferases and DNA methylation modifiers. Conclusions: Numerous mutations identified raise the possibility of targeted treatment strategies for patients either at diagnosis or relapse, supporting the use of sequencing-based diagnostics in myeloma to help guide therapy as more epigenetic targeted agents become available. PMID:27235425
Ribosomal DNA intergenic spacer sequence in foxtail millet, Setaria italica (L.) P. Beauv. and its characterization and application to typing of foxtail millet landraces.

PubMed

Fukunaga, Kenji; Ichitani, Katsuyuki; Taura, Satoru; Sato, Muneharu; Kawase, Makoto

2005-02-01

We determined the sequence of ribosomal DNA (rDNA) intergenic spacer (IGS) of foxtail millet isolated in our previous study, and identified subrepeats in the polymorphic region. We also developed a PCR-based method for identifying rDNA types based on sequence information and assessed 153 accessions of foxtail millet. Results were congruent with our previous works. This study provides new findings regarding the geographical distribution of rDNA variants. This new method facilitates analyses of numerous foxtail millet accessions. It is helpful for typing of foxtail millet germplasms and elucidating the evolution of this millet.
Exome sequencing identifies a DNAJB6 mutation in a family with dominantly-inherited limb-girdle muscular dystrophy.

PubMed

Couthouis, Julien; Raphael, Alya R; Siskind, Carly; Findlay, Andrew R; Buenrostro, Jason D; Greenleaf, William J; Vogel, Hannes; Day, John W; Flanigan, Kevin M; Gitler, Aaron D

2014-05-01

Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D. Copyright © 2014 Elsevier B.V. All rights reserved.
Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat.

PubMed

Harris, J Kirk; Caporaso, J Gregory; Walker, Jeffrey J; Spear, John R; Gold, Nicholas J; Robertson, Charles E; Hugenholtz, Philip; Goodrich, Julia; McDonald, Daniel; Knights, Dan; Marshall, Paul; Tufo, Henry; Knight, Rob; Pace, Norman R

2013-01-01

The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119,000 nearly full-length sequences and 28,000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life.
The Faintest WISE Debris Disks: Enhanced Methods for Detection and Verification

NASA Astrophysics Data System (ADS)

Patel, Rahul I.; Metchev, Stanimir A.; Heinze, Aren; Trollo, Joseph

2017-02-01

In an earlier study, we reported nearly 100 previously unknown dusty debris disks around Hipparcos main-sequence stars within 75 pc by selecting stars with excesses in individual WISE colors. Here, we further scrutinize the Hipparcos 75 pc sample to (1) gain sensitivity to previously undetected, fainter mid-IR excesses and (2) remove spurious excesses contaminated by previously unidentified blended sources. We improve on our previous method by adopting a more accurate measure of the confidence threshold for excess detection and by adding an optimally weighted color average that incorporates all shorter-wavelength WISE photometry, rather than using only individual WISE colors. The latter is equivalent to spectral energy distribution fitting, but only over WISE bandpasses. In addition, we leverage the higher-resolution WISE images available through the unWISE.me image service to identify contaminated WISE excesses based on photocenter offsets among the W3- and W4-band images. Altogether, we identify 19 previously unreported candidate debris disks. Combined with the results from our earlier study, we have found a total of 107 new debris disks around 75 pc Hipparcos main-sequence stars using precisely calibrated WISE photometry. This expands the 75 pc debris disk sample by 22% around Hipparcos main-sequence stars and by 20% overall (including non-main-sequence and non-Hipparcos stars).
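The "optimally weighted color average" mentioned in this abstract can be illustrated with an inverse-variance weighted mean, which is the standard minimum-variance way to combine independent measurements. This is a sketch under that assumption, not the authors' actual procedure: the abstract does not give their weights, and real WISE colors that share a band are correlated, which this simple formula ignores. The excess values and uncertainties are hypothetical.

```python
import math


def weighted_color_excess(excesses, sigmas):
    """Inverse-variance weighted mean of color excesses and its 1-sigma error.

    Assumes the individual excesses are independent, which is an
    idealization for colors that share a photometric band.
    """
    if len(excesses) != len(sigmas) or not excesses:
        raise ValueError("need equally sized, non-empty inputs")
    weights = [1.0 / s**2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, excesses)) / total
    sigma_mean = math.sqrt(1.0 / total)
    return mean, sigma_mean


# Hypothetical excesses (mag) in three colors with equal uncertainties.
mean, sigma = weighted_color_excess([0.30, 0.20, 0.25], [0.10, 0.10, 0.10])
print(f"excess = {mean:.3f} +/- {sigma:.3f} mag, significance = {mean / sigma:.2f}")
```

With equal uncertainties the result reduces to the plain average, and the combined uncertainty shrinks by the square root of the number of colors, which is how averaging can gain sensitivity to fainter excesses.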
High depth, whole-genome sequencing of cholera isolates from Haiti and the Dominican Republic.

PubMed

Sealfon, Rachel; Gire, Stephen; Ellis, Crystal; Calderwood, Stephen; Qadri, Firdausi; Hensley, Lisa; Kellis, Manolis; Ryan, Edward T; LaRocque, Regina C; Harris, Jason B; Sabeti, Pardis C

2012-09-11

Whole-genome sequencing is an important tool for understanding microbial evolution and identifying the emergence of functionally important variants over the course of epidemics. In October 2010, a severe cholera epidemic began in Haiti, with additional cases identified in the neighboring Dominican Republic. We used whole-genome approaches to sequence four Vibrio cholerae isolates from Haiti and the Dominican Republic and three additional V. cholerae isolates to a high depth of coverage (>2000x); four of the seven isolates were previously sequenced. Using these sequence data, we examined the effect of depth of coverage and sequencing platform on genome assembly and identification of sequence variants. We found that 50x coverage is sufficient to construct a whole-genome assembly and to accurately call most variants from 100 base pair paired-end sequencing reads. Phylogenetic analysis between the newly sequenced and thirty-three previously sequenced V. cholerae isolates indicates that the Haitian and Dominican Republic isolates are closest to strains from South Asia. The Haitian and Dominican Republic isolates form a tight cluster, with only four variants unique to individual isolates. These variants are located in the CTX region, the SXT region, and the core genome. Of the 126 mutations identified that separate the Haiti-Dominican Republic cluster from the V. cholerae reference strain (N16961), 73 are non-synonymous changes, and a number of these changes cluster in specific genes and pathways. Sequence variant analyses of V. cholerae isolates, including multiple isolates from the Haitian outbreak, identify coverage-specific and technology-specific effects on variant detection, and provide insight into genomic change and functional evolution during an epidemic.
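To put the 50x coverage figure from this abstract in concrete terms, mean depth of coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that relation to 100 base pair reads. The genome size of roughly 4.03 Mb for V. cholerae is an approximate value assumed here for illustration and is not stated in the abstract.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Expected mean depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads that reaches the target mean depth."""
    return math.ceil(target_coverage * genome_size / read_length)


GENOME_SIZE_BP = 4_030_000  # assumed approximate V. cholerae genome size
READ_LENGTH_BP = 100        # read length quoted in the abstract

reads_50x = reads_needed(50, READ_LENGTH_BP, GENOME_SIZE_BP)
print(reads_50x, "reads, i.e.", reads_50x // 2, "read pairs for paired-end data")
print(mean_coverage(reads_50x, READ_LENGTH_BP, GENOME_SIZE_BP))
```

This is a mean only; actual depth varies along the genome, so variant calling at a given position can see considerably less than the average.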
The Sequence of Study Changes What Information Is Attended to, Encoded, and Remembered during Category Learning

ERIC Educational Resources Information Center

Carvalho, Paulo F.; Goldstone, Robert L.

2017-01-01

The sequence of study influences how we learn. Previous research has identified different sequences as potentially beneficial for learning in different contexts and with different materials. Here we investigate the mechanisms involved in inductive category learning that give rise to these sequencing effects. Across 3 experiments we show evidence…
Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.

Molecular diagnosis of putative Stargardt disease probands by exome sequencing

PubMed Central

2012-01-01

Background: The commonest genetic form of juvenile or early adult onset macular degeneration is Stargardt Disease (STGD) caused by recessive mutations in the gene ABCA4. However, high phenotypic and allelic heterogeneity and a small but non-trivial amount of locus heterogeneity currently impede conclusive molecular diagnosis in a significant proportion of cases. Methods: We performed whole exome sequencing (WES) of nine putative Stargardt Disease probands and searched for potentially disease-causing genetic variants in previously identified retinal or macular dystrophy genes. Follow-up dideoxy sequencing was performed for confirmation and to screen for mutations in an additional set of affected individuals lacking a definitive molecular diagnosis. Results: Whole exome sequencing revealed seven likely disease-causing variants across four genes, providing a confident genetic diagnosis in six previously uncharacterized participants. We identified four previously missed mutations in ABCA4 across three individuals. Likely disease-causing mutations in RDS/PRPH2, ELOVL, and CRB1 were also identified. Conclusions: Our findings highlight the enormous potential of whole exome sequencing in Stargardt Disease molecular diagnosis and research. WES adequately assayed all coding sequences and canonical splice sites of ABCA4 in this study. Additionally, WES enables the identification of disease-related alleles in other genes. This work highlights the importance of collecting parental genetic material for WES testing as the current knowledge of human genome variation limits the determination of causality between identified variants and disease. While larger sample sizes are required to establish the precision and accuracy of this type of testing, this study supports WES for inherited early onset macular degeneration disorders as an alternative to standard mutation screening techniques. PMID:22863181
Genomic patterns of introgression in rainbow and westslope cutthroat trout illuminated by overlapping paired-end RAD sequencing

USGS Publications Warehouse

Hohenlohe, Paul A.; Day, Mitch D.; Amish, Stephen J.; Miller, Michael R.; Kamps-Hughes, Nick; Boyer, Matthew C.; Muhlfeld, Clint C.; Allendorf, Fred W.; Johnson, Eric A.; Luikart, Gordon

2013-01-01

Rapid and inexpensive methods for genomewide single nucleotide polymorphism (SNP) discovery and genotyping are urgently needed for population management and conservation. In hybridized populations, genomic techniques that can identify and genotype thousands of species-diagnostic markers would allow precise estimates of population- and individual-level admixture as well as identification of 'super invasive' alleles, which show elevated rates of introgression above the genomewide background (likely due to natural selection). Techniques like restriction-site-associated DNA (RAD) sequencing can discover and genotype large numbers of SNPs, but they have been limited by the length of continuous sequence data they produce with Illumina short-read sequencing. We present a novel approach, overlapping paired-end RAD sequencing, to generate RAD contigs of >300–400 bp. These contigs provide sufficient flanking sequence for design of high-throughput SNP genotyping arrays and strict filtering to identify duplicate paralogous loci. We applied this approach in five populations of native westslope cutthroat trout that previously showed varying (low) levels of admixture from introduced rainbow trout (RBT). We produced 77 141 RAD contigs and used these data to filter and genotype 3180 previously identified species-diagnostic SNP loci. Our population-level and individual-level estimates of admixture were generally consistent with previous microsatellite-based estimates from the same individuals. However, we observed slightly lower admixture estimates from genomewide markers, which might result from natural selection against certain genome regions, different genomic locations for microsatellites vs. RAD-derived SNPs and/or sampling error from the small number of microsatellite loci (n = 7). We also identified candidate adaptive super invasive alleles from RBT that had excessively high admixture proportions in hybridized cutthroat trout populations.
Characterization of 47 MHC class I sequences in Filipino cynomolgus macaques

PubMed Central

Campbell, Kevin J.; Detmer, Ann M.; Karl, Julie A.; Wiseman, Roger W.; Blasky, Alex J.; Hughes, Austin L.; Bimber, Benjamin N.; O’Connor, Shelby L.; O’Connor, David H.

2009-01-01

Cynomolgus macaques (Macaca fascicularis) provide increasingly common models for infectious disease research. Several geographically distinct populations of these macaques from Southeast Asia and the Indian Ocean island of Mauritius are available for pathogenesis studies. Though host genetics may profoundly impact results of such studies, similarities and differences between populations are often overlooked. In this study we identified 47 full-length MHC class I nucleotide sequences in 16 cynomolgus macaques of Filipino origin. The majority of MHC class I sequences characterized (39 of 47) were unique to this regional population. However, we discovered eight sequences with perfect identity and six sequences with close similarity to previously defined MHC class I sequences from other macaque populations. We identified two ancestral MHC haplotypes that appear to be shared between Filipino and Mauritian cynomolgus macaques, notably a Mafa-B haplotype that has previously been shown to protect Mauritian cynomolgus macaques against challenge with a simian/human immunodeficiency virus, SHIV89.6P. We also identified a Filipino cynomolgus macaque MHC class I sequence for which the predicted protein sequence differs from Mamu-B*17 by a single amino acid. This is important because Mamu-B*17 is strongly associated with protection against simian immunodeficiency virus (SIV) challenge in Indian rhesus macaques. These findings have implications for the evolutionary history of Filipino cynomolgus macaques as well as for the use of this model in SIV/SHIV research protocols. PMID:19107381
Mapping-by-sequencing in complex polyploid genomes using genic sequence capture: a case study to map yellow rust resistance in hexaploid wheat.

PubMed

Gardiner, Laura-Jayne; Bansept-Basler, Pauline; Olohan, Lisa; Joynson, Ryan; Brenchley, Rachel; Hall, Neil; O'Sullivan, Donal M; Hall, Anthony

2016-08-01

Previously we extended the utility of mapping-by-sequencing by combining it with sequence capture and mapping sequence data to pseudo-chromosomes that were organized using wheat-Brachypodium synteny. This, with a bespoke haplotyping algorithm, enabled us to map the flowering time locus in the diploid wheat Triticum monococcum L. identifying a set of deleted genes (Gardiner et al., 2014). Here, we develop this combination of gene enrichment and sliding window mapping-by-synteny analysis to map the Yr6 locus for yellow stripe rust resistance in hexaploid wheat. A 110 MB NimbleGen capture probe set was used to enrich and sequence a doubled haploid mapping population of hexaploid wheat derived from an Avalon and Cadenza cross. The Yr6 locus was identified by mapping to the POPSEQ chromosomal pseudomolecules using a bespoke pipeline and algorithm (Chapman et al., 2015). Furthermore the same locus was identified using newly developed pseudo-chromosome sequences as a mapping reference that are based on the genic sequence used for sequence enrichment. The pseudo-chromosomes allow us to demonstrate the application of mapping-by-sequencing to even poorly defined polyploidy genomes where chromosomes are incomplete and sub-genome assemblies are collapsed. This analysis uniquely enabled us to: compare wheat genome annotations; identify the Yr6 locus - defining a smaller genic region than was previously possible; associate the interval with one wheat sub-genome and increase the density of SNP markers associated. Finally, we built the pipeline in iPlant, making it a user-friendly community resource for phenotype mapping. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Whole Genome Complete Resequencing of Bacillus subtilis Natto by Combining Long Reads with High-Quality Short Reads

PubMed Central

Kamada, Mayumi; Hase, Sumitaka; Sato, Kengo; Toyoda, Atsushi; Fujiyama, Asao; Sakakibara, Yasubumi

2014-01-01

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer, and we successfully obtained a complete genome sequence from one scaffold without any gaps, and we also applied Illumina MiSeq short reads to enhance quality. Compared with the previous BEST195 draft genome and Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributed to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome. PMID:25329997
Repetitive Elements May Comprise Over Two-Thirds of the Human Genome

PubMed Central

de Koning, A. P. Jason; Gu, Wanjun; Castoe, Todd A.; Batzer, Mark A.; Pollock, David D.

2011-01-01

Transposable elements (TEs) are conventionally identified in eukaryotic genomes by alignment to consensus element sequences. Using this approach, about half of the human genome has been previously identified as TEs and low-complexity repeats. We recently developed a highly sensitive alternative de novo strategy, P-clouds, that instead searches for clusters of high-abundance oligonucleotides that are related in sequence space (oligo “clouds”). We show here that P-clouds predicts >840 Mbp of additional repetitive sequences in the human genome, thus suggesting that 66%–69% of the human genome is repetitive or repeat-derived. To investigate this remarkable difference, we conducted detailed analyses of the ability of both P-clouds and a commonly used conventional approach, RepeatMasker (RM), to detect different sized fragments of the highly abundant human Alu and MIR SINEs. RM can have surprisingly low sensitivity for even moderately long fragments, in contrast to P-clouds, which has good sensitivity down to small fragment sizes (∼25 bp). Although short fragments have a high intrinsic probability of being false positives, we performed a probabilistic annotation that reflects this fact. We further developed “element-specific” P-clouds (ESPs) to identify novel Alu and MIR SINE elements, and using it we identified ∼100 Mb of previously unannotated human elements. ESP estimates of new MIR sequences are in good agreement with RM-based predictions of the amount that RM missed. These results highlight the need for combined, probabilistic genome annotation approaches and suggest that the human genome consists of substantially more repetitive sequence than previously believed. PMID:22144907
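The idea of grouping high-abundance oligonucleotides that lie close together in sequence space can be illustrated with a toy sketch. The code below counts k-mers and attaches to each abundant "core" k-mer any observed k-mer that differs from it by a single substitution. This is a simplified illustration under stated assumptions, not the P-clouds implementation: the function names, the word length `k`, and the `core_min` threshold are all hypothetical.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def one_substitution_neighbors(kmer, alphabet="ACGT"):
    """Yield all k-mers that differ from kmer at exactly one position."""
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt != base:
                yield kmer[:i] + alt + kmer[i + 1:]


def build_clouds(seq, k=4, core_min=3):
    """Group observed k-mers around abundant cores (hypothetical toy rule).

    A k-mer seen at least core_min times and not yet assigned becomes a core;
    its cloud is the core plus every unassigned observed k-mer one
    substitution away from it.
    """
    counts = count_kmers(seq, k)
    clouds = {}
    assigned = set()
    for kmer, n in counts.most_common():
        if n < core_min:
            break  # remaining k-mers are too rare to seed a cloud
        if kmer in assigned:
            continue
        members = {kmer}
        for neighbor in one_substitution_neighbors(kmer):
            if neighbor in counts and neighbor not in assigned:
                members.add(neighbor)
        assigned |= members
        clouds[kmer] = members
    return clouds


if __name__ == "__main__":
    toy = "ACGT" * 5 + "ACGA"  # made-up repeat with one diverged copy
    for core, members in build_clouds(toy).items():
        print(core, sorted(members))
```

In this toy input the diverged copy `ACGA` is too rare to seed a cloud of its own but is still collected by its abundant neighbor `ACGT`, which is the intuition behind detecting degraded repeat fragments.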
Genomewide Analysis of the Antimicrobial Peptides in Python bivittatus and Characterization of Cathelicidins with Potent Antimicrobial Activity and Low Cytotoxicity.

PubMed

Kim, Dayeong; Soundrarajan, Nagasundarapandian; Lee, Juyeon; Cho, Hye-Sun; Choi, Minkyeung; Cha, Se-Yeoun; Ahn, Byeongyong; Jeon, Hyoim; Le, Minh Thong; Song, Hyuk; Kim, Jin-Hoi; Park, Chankyu

2017-09-01

In this study, we sought to identify novel antimicrobial peptides (AMPs) in Python bivittatus through bioinformatic analyses of publicly available genome information and experimental validation. In our analysis of the python genome, we identified 29 AMP-related candidate sequences. Of these, we selected five cathelicidin-like sequences and subjected them to further in silico analyses. The results showed that these sequences likely have antimicrobial activity. The sequences were named Pb-CATH1 to Pb-CATH5 according to their sequence similarity to previously reported snake cathelicidins. We predicted their molecular structure and then chemically synthesized the mature peptide for three putative cathelicidins and subjected them to biological activity tests. Interestingly, all three peptides showed potent antimicrobial effects against Gram-negative bacteria but very weak activity against Gram-positive bacteria. Remarkably, ΔPb-CATH4 showed potent activity against antibiotic-resistant clinical isolates and also was observed to possess very low hemolytic activity and cytotoxicity. ΔPb-CATH4 also showed considerable serum stability. Electron microscopic analysis indicated that ΔPb-CATH4 exerts its effects via toroidal pore preformation. Structural comparison of the cathelicidins identified in this study to previously reported ones revealed that these Pb-CATHs are representatives of a new group of reptilian cathelicidins lacking the acidic connecting domain. Furthermore, Pb-CATH4 possesses a completely different mature peptide sequence from those of previously described reptilian cathelicidins. These new AMPs may be candidates for the development of alternatives to or complements of antibiotics to control multidrug-resistant pathogens. Copyright © 2017 American Society for Microbiology.
Genomewide Analysis of the Antimicrobial Peptides in Python bivittatus and Characterization of Cathelicidins with Potent Antimicrobial Activity and Low Cytotoxicity

PubMed Central

Kim, Dayeong; Soundrarajan, Nagasundarapandian; Lee, Juyeon; Cho, Hye-sun; Choi, Minkyeung; Cha, Se-Yeoun; Ahn, Byeongyong; Jeon, Hyoim; Le, Minh Thong; Song, Hyuk; Kim, Jin-Hoi

2017-01-01

In this study, we sought to identify novel antimicrobial peptides (AMPs) in Python bivittatus through bioinformatic analyses of publicly available genome information and experimental validation. In our analysis of the python genome, we identified 29 AMP-related candidate sequences. Of these, we selected five cathelicidin-like sequences and subjected them to further in silico analyses. The results showed that these sequences likely have antimicrobial activity. The sequences were named Pb-CATH1 to Pb-CATH5 according to their sequence similarity to previously reported snake cathelicidins. We predicted their molecular structure and then chemically synthesized the mature peptide for three putative cathelicidins and subjected them to biological activity tests. Interestingly, all three peptides showed potent antimicrobial effects against Gram-negative bacteria but very weak activity against Gram-positive bacteria. Remarkably, ΔPb-CATH4 showed potent activity against antibiotic-resistant clinical isolates and also was observed to possess very low hemolytic activity and cytotoxicity. ΔPb-CATH4 also showed considerable serum stability. Electron microscopic analysis indicated that ΔPb-CATH4 exerts its effects via toroidal pore preformation. Structural comparison of the cathelicidins identified in this study to previously reported ones revealed that these Pb-CATHs are representatives of a new group of reptilian cathelicidins lacking the acidic connecting domain. Furthermore, Pb-CATH4 possesses a completely different mature peptide sequence from those of previously described reptilian cathelicidins. These new AMPs may be candidates for the development of alternatives to or complements of antibiotics to control multidrug-resistant pathogens. PMID:28630199
Mouse mammary tumor virus-like gene sequences are present in lung patient specimens

PubMed Central

2011-01-01

Background: Previous studies have reported the presence of Murine Mammary Tumor Virus (MMTV)-like gene sequences in human cancer tissue specimens. Here, we searched for MMTV-like gene sequences in lung disease specimens, including carcinomas, from a Mexican population. This work builds on our previous study reporting that the INER51 lung cancer cell line, derived from a pleural effusion of a Mexican patient, contains MMTV-like env gene sequences. Results: MMTV-like env gene sequences were detected in three of 18 specimens studied, by PCR using a specific set of MMTV-like primers. The three identified MMTV-like gene sequences, designated INER6, HZ101, and HZ14, were 99%, 98%, and 97% homologous, respectively, to GenBank sequence accession number AY161347. The INER6 and HZ-101 samples were isolated from lung cancer specimens, and the HZ-14 sample was isolated from an acute inflammatory lung infiltrate sample. Two of the env sequences exhibited disruption of the reading frame due to mutations. Conclusion: In summary, we identified MMTV-like gene sequences in 2 of 11 (18%) lung carcinomas and 1 of 7 (14%) acute inflammatory lung infiltrate specimens from a Mexican population. PMID:21943279
Expanded subgenomic mRNA transcriptome and coding capacity of a nidovirus

PubMed Central

Di, Han; Madden, Joseph C.; Morantz, Esther K.; Tang, Hsin-Yao; Graham, Rachel L.; Baric, Ralph S.

2017-01-01

Members of the order Nidovirales express their structural protein ORFs from a nested set of 3′ subgenomic mRNAs (sg mRNAs), and for most of these ORFs, a single genomic transcription regulatory sequence (TRS) was identified. Nine TRSs were previously reported for the arterivirus Simian hemorrhagic fever virus (SHFV). In the present study, which was facilitated by next-generation sequencing, 96 SHFV body TRSs were identified that were functional in both infected MA104 cells and macaque macrophages. The abundance of sg mRNAs produced from individual TRSs was consistent over time in the two different cell types. Most of the TRSs are located in the genomic 3′ region, but some are in the 5′ ORF1a/1b region and provide alternative sources of nonstructural proteins. Multiple functional TRSs were identified for the majority of the SHFV 3′ ORFs, and four previously identified TRSs were found not to be the predominant ones used. A third of the TRSs generated sg mRNAs with variant leader–body junction sequences. Sg mRNAs encoding E′, GP2, or ORF5a as their 5′ ORF as well as sg mRNAs encoding six previously unreported alternative frame ORFs or 14 previously unreported C-terminal ORFs of known proteins were also identified. Mutation of the start codon of two C-terminal ORFs in an infectious clone reduced virus yield. Mass spectrometry detected one previously unreported protein and suggested translation of some of the C-terminal ORFs. The results reveal the complexity of the transcriptional regulatory mechanism and expanded coding capacity for SHFV, which may also be characteristic of other nidoviruses. PMID:29073030
Whole exome sequencing identifies genetic variants in inherited thrombocytopenia with secondary qualitative function defects

PubMed Central

Johnson, Ben; Lowe, Gillian C.; Futterer, Jane; Lordkipanidzé, Marie; MacDonald, David; Simpson, Michael A.; Sanchez-Guiú, Isabel; Drake, Sian; Bem, Danai; Leo, Vincenzo; Fletcher, Sarah J.; Dawood, Ban; Rivera, José; Allsup, David; Biss, Tina; Bolton-Maggs, Paula HB; Collins, Peter; Curry, Nicola; Grimley, Charlotte; James, Beki; Makris, Mike; Motwani, Jayashree; Pavord, Sue; Talks, Katherine; Thachil, Jecko; Wilde, Jonathan; Williams, Mike; Harrison, Paul; Gissen, Paul; Mundell, Stuart; Mumford, Andrew; Daly, Martina E.; Watson, Steve P.; Morgan, Neil V.

2016-01-01

Inherited thrombocytopenias are a heterogeneous group of disorders characterized by abnormally low platelet counts which can be associated with abnormal bleeding. Next-generation sequencing has previously been employed in these disorders for the confirmation of suspected genetic abnormalities, and more recently in the discovery of novel disease-causing genes. However, its full potential has not yet been exploited. Over the past 6 years we have sequenced the exomes from 55 patients, including 37 index cases and 18 additional family members, all of whom were recruited to the UK Genotyping and Phenotyping of Platelets study. All patients had inherited or sustained thrombocytopenia of unknown etiology with platelet counts varying from 11×10^9/L to 186×10^9/L. Of the 51 patients phenotypically tested, 37 (73%) had an additional secondary qualitative platelet defect. Using whole exome sequencing analysis we have identified “pathogenic” or “likely pathogenic” variants in 46% (17/37) of our index patients with thrombocytopenia. In addition, we report variants of uncertain significance in 12 index cases, including novel candidate genetic variants in previously unreported genes in four index cases. These results demonstrate that whole exome sequencing is an efficient method for elucidating potential pathogenic genetic variants in inherited thrombocytopenia. Whole exome sequencing also has the added benefit of discovering potentially pathogenic genetic variants for further study in novel genes not previously implicated in inherited thrombocytopenia. PMID:27479822
Using variable rate models to identify genes under selection in sequence pairs: their validity and limitations for EST sequences.

PubMed

Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H

2007-02-01

Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.
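The likelihood ratio test mentioned above compares the maximized log-likelihoods of a null model and a model that allows positive selection. The sketch below shows only the arithmetic of that comparison. It assumes, purely for illustration, that the alternative model adds two free parameters, so the statistic is referred to a chi-square distribution with 2 degrees of freedom (whose survival function has the closed form exp(-x/2)). The abstract does not state which models or degrees of freedom were used, and the function names and log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Return 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf_df2(x):
    """Survival function of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-x / 2.0)


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """P-value of the LRT, assuming (hypothetically) 2 extra free parameters."""
    return chi2_sf_df2(lrt_statistic(lnl_null, lnl_alt))


if __name__ == "__main__":
    # Made-up log-likelihoods for one sequence pair.
    lnl_null, lnl_alt = -1000.0, -995.0
    print("statistic:", lrt_statistic(lnl_null, lnl_alt))
    print("p-value:", lrt_pvalue_df2(lnl_null, lnl_alt))
```

With hundreds of sequence pairs tested, as in the study, the resulting p-values would normally also need a multiple-testing correction before genes are called as selected.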
Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis.

PubMed

Asgari, Samira; McLaren, Paul J; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R; Abarca, Katia; Gelderman, Kyra A; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J

2016-01-01

One out of three pediatric sepsis deaths in high income countries occur in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtrations, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing can identify rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs.
Exome Sequencing Reveals Primary Immunodeficiencies in Children with Community-Acquired Pseudomonas aeruginosa Sepsis

PubMed Central

Asgari, Samira; McLaren, Paul J.; Peake, Jane; Wong, Melanie; Wong, Richard; Bartha, Istvan; Francis, Joshua R.; Abarca, Katia; Gelderman, Kyra A.; Agyeman, Philipp; Aebi, Christoph; Berger, Christoph; Fellay, Jacques; Schlapbach, Luregn J.; Posfay-Barbe, Klara

2016-01-01

One out of three pediatric sepsis deaths in high income countries occur in previously healthy children. Primary immunodeficiencies (PIDs) have been postulated to underlie fulminant sepsis, but this concept remains to be confirmed in clinical practice. Pseudomonas aeruginosa (P. aeruginosa) is a common bacterium mostly associated with health care-related infections in immunocompromised individuals. However, in rare cases, it can cause sepsis in previously healthy children. We used exome sequencing and bioinformatic analysis to systematically search for genetic factors underpinning severe P. aeruginosa infection in the pediatric population. We collected blood samples from 11 previously healthy children, with no family history of immunodeficiency, who presented with severe sepsis due to community-acquired P. aeruginosa bacteremia. Genomic DNA was extracted from blood or tissue samples obtained intravitam or postmortem. We obtained high-coverage exome sequencing data and searched for rare loss-of-function variants. After rigorous filtrations, 12 potentially causal variants were identified. Two out of eight (25%) fatal cases were found to carry novel pathogenic variants in PID genes, including BTK and DNMT3B. This study demonstrates that exome sequencing can identify rare, deleterious human genetic variants responsible for fulminant sepsis in apparently healthy children. Diagnosing PIDs in such patients is of high relevance to survivors and affected families. We propose that unusually severe and fatal sepsis cases in previously healthy children should be considered for exome/genome sequencing to search for underlying PIDs. PMID:27703454
Identification of genes in anonymous DNA sequences. Final report: Report period, 15 April 1993--15 April 1994

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fields, C.A.

1994-09-01

This Report concludes the DOE Human Genome Program project, “Identification of Genes in Anonymous DNA Sequence.” The central goals of this project have been (1) understanding the problem of identifying genes in anonymous sequences, and (2) development of tools, primarily the automated identification system gm, for identifying genes. The activities supported under the previous award are summarized here to provide a single complete report on the activities supported as part of the project from its inception to its completion.
Modeling the integration of bacterial rRNA fragments into the human cancer genome.

PubMed

Sieber, Karsten B; Gajer, Pawel; Dunning Hotopp, Julie C

2016-03-21

Cancer is a disease driven by the accumulation of genomic alterations, including the integration of exogenous DNA into the human somatic genome. We previously identified in silico evidence of DNA fragments from a Pseudomonas-like bacterium integrating into the 5'-UTR of four proto-oncogenes in stomach cancer sequencing data. The functional and biological consequences of these bacterial DNA integrations remain unknown. Modeling of these integrations suggests that the previously identified sequences cover most of the sequence flanking the junction between the bacterial and human DNA. Further examination of these reads reveals that these integrations are rich in guanine nucleotides and that the integrated bacterial DNA may have complex transcript secondary structures. The models presented here lay the foundation for future experiments to test whether bacterial DNA integrations alter the transcription of the human genes.
Discovery of Neuropeptides in the Nematode Ascaris suum by Database Mining and Tandem Mass Spectrometry

PubMed Central

Jarecki, Jessica L.; Frey, Brian L.; Smith, Lloyd M.; Stretton, Antony O.

2011-01-01

Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) was used to discover peptides in extracts of the large parasitic nematode Ascaris suum. This required the assembly of a new database of known and predicted peptides. In addition to those already sequenced, peptides were either previously predicted to be processed from precursor proteins identified in an A. suum library of expressed sequence tags (ESTs), or newly predicted from a library of A. suum genome survey sequences (GSSs). The predicted MS/MS fragmentation patterns of this collection of real and putative peptides were compared with the actual fragmentation patterns found in the MS/MS spectra of peptides fractionated by MS; this enabled individual peptides to be sequenced. Many previously identified peptides were found, and 21 novel peptides were discovered. Thus, this approach is very useful, despite the fact that the available GSS database is still preliminary, having only 1X coverage. PMID:21524146
Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence

PubMed Central

Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik

2016-01-01

RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41 nucleotide long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203 202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195
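To make the 41-nucleotide input concrete, the sketch below extracts fixed-length windows around candidate adenosines and turns them into a numeric encoding of the kind a classifier could consume. It rests on assumptions the abstract does not confirm: that the candidate site sits at the center of the window with 20 flanking bases on each side, and that a one-hot encoding is used. All function names are hypothetical.

```python
def window_around(seq, pos, flank=20):
    """Return the (2*flank + 1)-nt window centered on 0-based pos, or None near the ends."""
    start, end = pos - flank, pos + flank + 1
    if start < 0 or end > len(seq):
        return None
    return seq[start:end]


def one_hot(window, alphabet="ACGT"):
    """Encode each base as a 4-element 0/1 list; unknown bases become all zeros."""
    return [[1 if base == letter else 0 for letter in alphabet] for base in window]


def candidate_windows(seq, flank=20):
    """Yield (position, window) for every adenosine that has a full-length window."""
    for pos, base in enumerate(seq):
        if base == "A":
            window = window_around(seq, pos, flank)
            if window is not None:
                yield pos, window


if __name__ == "__main__":
    toy = "C" * 20 + "A" + "G" * 20  # made-up sequence with one central adenosine
    for pos, window in candidate_windows(toy):
        features = one_hot(window)
        print(pos, len(window), len(features))
```

Each window encoded this way would be paired with a label (known edited site or not) when training, and unlabeled windows would then be scored by the trained model.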
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library

PubMed Central

Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul

2005-01-01

The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm and compared to the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
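The clone counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are copied from the abstract, while the variable names and category labels are assumptions introduced here and are not part of the study's own analysis.

```python
# Counts copied from the abstract; names and labels are illustrative only.
total_clones = 2400
poor_quality = 315
analyzed = total_clones - poor_quality

categories = {
    "known, characterized": 918,
    "known, uncharacterized": 1141,
    "genomic-only or no alignment": 26,
}

# The three categories should account for every analyzed sequence.
assert sum(categories.values()) == analyzed


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


poor_pct = percent(poor_quality, total_clones)
summary = {name: percent(n, analyzed) for name, n in categories.items()}

print(analyzed, poor_pct, summary)
```

The rounded shares agree with the 13%, 44%, 55% and 1% figures quoted in the abstract.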
Discovery of a Bovine Enterovirus in Alpaca

PubMed Central

McClenahan, Shasta D.; Scherba, Gail; Borst, Luke; Fredrickson, Richard L.; Krause, Philip R.; Uhlenhaut, Christine

2013-01-01

A cytopathic virus was isolated using Madin-Darby bovine kidney (MDBK) cells from lung tissue of alpaca that died of a severe respiratory infection. To identify the virus, the infected cell culture supernatant was enriched for virus particles and a generic, PCR-based method was used to amplify potential viral sequences. Genomic sequence data of the alpaca isolate was obtained and compared with sequences of known viruses. The new alpaca virus sequence was most similar to recently designated Enterovirus species F, previously bovine enterovirus (BEVs), viruses that are globally prevalent in cattle, although they appear not to cause significant disease. Because bovine enteroviruses have not been previously reported in U.S. alpaca, we suspect that this type of infection is fairly rare, and in this case appeared not to spread beyond the original outbreak. The capsid sequence of the detected virus had greatest homology to Enterovirus F type 1 (indicating that the virus should be considered a member of serotype 1), but the virus had greater homology in 2A protease sequence to type 3, suggesting that it may have been a recombinant. Identifying pathogens that infect a new host species for the first time can be challenging. As the disease in a new host species may be quite different from that in the original or natural host, the pathogen may not be suspected based on the clinical presentation, delaying diagnosis. Although this virus replicated in MDBK cells, existing standard culture and molecular methods could not identify it. In this case, a highly sensitive generic PCR-based pathogen-detection method was used to identify this pathogen. PMID:23950875
Comprehensive analysis of the T-cell receptor beta chain gene in rhesus monkey by high throughput sequencing

PubMed Central

Li, Zhoufang; Liu, Guangjie; Tong, Yin; Zhang, Meng; Xu, Ying; Qin, Li; Wang, Zhanhui; Chen, Xiaoping; He, Jiankui

2015-01-01

Profiling immune repertoires by high throughput sequencing enhances our understanding of immune system complexity and immune-related diseases in humans. Previously, cloning and Sanger sequencing identified limited numbers of T cell receptor (TCR) nucleotide sequences in rhesus monkeys, thus their full immune repertoire is unknown. We applied multiplex PCR and Illumina high throughput sequencing to study the TCRβ of rhesus monkeys. We identified 1.26 million TCRβ sequences corresponding to 643,570 unique TCRβ sequences and 270,557 unique complementarity-determining region 3 (CDR3) gene sequences. Precise measurements of CDR3 length distribution, CDR3 amino acid distribution, length distribution of N nucleotide of junctional region, and TCRV and TCRJ gene usage preferences were performed. A comprehensive profile of rhesus monkey immune repertoire might aid human infectious disease studies using rhesus monkeys. PMID:25961410
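As a rough illustration of the kind of summary statistics mentioned above (unique sequence counts and a CDR3 length distribution), the following sketch uses a handful of made-up amino acid strings. The sequences, variable names, and the use of `collections.Counter` are assumptions for illustration; this is not the authors' pipeline.

```python
from collections import Counter

# Hypothetical CDR3 amino acid sequences (toy data, not from the study).
reads = [
    "CASSLGQGNTEAFF",
    "CASSLGQGNTEAFF",  # duplicate read of the same clonotype
    "CASSPDRGYEQYF",
    "CASSQETQYF",
]

# Collapse identical reads into unique sequences.
unique_cdr3 = set(reads)

# Length distribution over the unique sequences.
length_distribution = Counter(len(seq) for seq in unique_cdr3)

print(len(reads), len(unique_cdr3), dict(length_distribution))
```

The same two steps, deduplication followed by counting per length, would apply unchanged to a real list of CDR3 sequences.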
Whole exome sequencing identifies genetic variants in inherited thrombocytopenia with secondary qualitative function defects.

PubMed

Johnson, Ben; Lowe, Gillian C; Futterer, Jane; Lordkipanidzé, Marie; MacDonald, David; Simpson, Michael A; Sanchez-Guiú, Isabel; Drake, Sian; Bem, Danai; Leo, Vincenzo; Fletcher, Sarah J; Dawood, Ban; Rivera, José; Allsup, David; Biss, Tina; Bolton-Maggs, Paula Hb; Collins, Peter; Curry, Nicola; Grimley, Charlotte; James, Beki; Makris, Mike; Motwani, Jayashree; Pavord, Sue; Talks, Katherine; Thachil, Jecko; Wilde, Jonathan; Williams, Mike; Harrison, Paul; Gissen, Paul; Mundell, Stuart; Mumford, Andrew; Daly, Martina E; Watson, Steve P; Morgan, Neil V

2016-10-01

Inherited thrombocytopenias are a heterogeneous group of disorders characterized by abnormally low platelet counts which can be associated with abnormal bleeding. Next-generation sequencing has previously been employed in these disorders for the confirmation of suspected genetic abnormalities, and more recently in the discovery of novel disease-causing genes. However, its full potential has not yet been exploited. Over the past 6 years we have sequenced the exomes from 55 patients, including 37 index cases and 18 additional family members, all of whom were recruited to the UK Genotyping and Phenotyping of Platelets study. All patients had inherited or sustained thrombocytopenia of unknown etiology with platelet counts varying from 11×10⁹/L to 186×10⁹/L. Of the 51 patients phenotypically tested, 37 (73%) had an additional secondary qualitative platelet defect. Using whole exome sequencing analysis we have identified "pathogenic" or "likely pathogenic" variants in 46% (17/37) of our index patients with thrombocytopenia. In addition, we report variants of uncertain significance in 12 index cases, including novel candidate genetic variants in previously unreported genes in four index cases. These results demonstrate that whole exome sequencing is an efficient method for elucidating potential pathogenic genetic variants in inherited thrombocytopenia. Whole exome sequencing also has the added benefit of discovering potentially pathogenic genetic variants for further study in novel genes not previously implicated in inherited thrombocytopenia. Copyright© Ferrata Storti Foundation.
Newly identified allatostatin Bs and their receptor in the two-spotted cricket, Gryllus bimaculatus.

PubMed

Tsukamoto, Yusuke; Nagata, Shinji

2016-06-01

A cDNA encoding allatostatin Bs (ASTBs) containing the W(X)6W motif was identified using a database generated by a next generation sequencer (NGS) in the two-spotted cricket, Gryllus bimaculatus. The contig sequence revealed the presence of five novel putative ASTBs (GbASTBs) in addition to GbASTBs previously identified in G. bimaculatus. MALDI-TOF MS analyses revealed the presence of these novel and previously identified GbASTBs with three missing GbASTBs. We also identified a cDNA encoding G. bimaculatus GbASTB receptor (GbASTBR) in the NGS data. Phylogenetic analysis demonstrated that this receptor was highly conserved with other insect ASTBRs, including the sex peptide receptor of Drosophila melanogaster. Calcium imaging analyses indicated that the GbASTBR heterologously expressed in HEK293 cells exhibited responses to all identified GbASTBs at a concentration range of 10⁻¹⁰ to 10⁻⁵ M. Copyright © 2016 Elsevier Inc. All rights reserved.
Remnants of an Ancient Deltaretrovirus in the Genomes of Horseshoe Bats (Rhinolophidae).

PubMed

Hron, Tomáš; Farkašová, Helena; Gifford, Robert J; Benda, Petr; Hulva, Pavel; Görföl, Tamás; Pačes, Jan; Elleder, Daniel

2018-04-10

Endogenous retrovirus (ERV) sequences provide a rich source of information about the long-term interactions between retroviruses and their hosts. However, most ERVs are derived from a subset of retrovirus groups, while ERVs derived from certain other groups remain extremely rare. In particular, only a single ERV sequence has been identified that shows evidence of being related to an ancient Deltaretrovirus, despite the large number of vertebrate genome sequences now available. In this report, we identify a second example of an ERV sequence putatively derived from a past deltaretroviral infection, in the genomes of several species of horseshoe bats (Rhinolophidae). This sequence represents a fragment of viral genome derived from a single integration. The time of the integration was estimated to be 11-19 million years ago. This finding, together with the previously identified endogenous Deltaretrovirus in long-fingered bats (Miniopteridae), suggests a close association of bats with ancient deltaretroviruses.
Evaluation of the Abbott RealTime HCV genotype II plus RUO (PLUS) assay with reference to core and NS5B sequencing.

PubMed

Mallory, Melanie A; Lucic, Danijela; Ebbert, Mark T W; Cloherty, Gavin A; Toolsie, Dan; Hillyard, David R

2017-05-01

HCV genotyping remains a critical tool for guiding initiation of therapy and selecting the most appropriate treatment regimen. Current commercial genotyping assays may have difficulty identifying 1a, 1b and genotype 6. The objective was to evaluate the concordance for identifying 1a, 1b, and genotype 6 between two methods: the PLUS assay and core/NS5B sequencing. This study included 236 plasma and serum samples previously genotyped by core/NS5B sequencing. Of these, 25 samples were also previously tested by the Abbott RealTime HCV GT II Research Use Only (RUO) assay and yielded ambiguous results. The remaining 211 samples were routine genotype 1 (n=169) and genotype 6 (n=42). Genotypes obtained from sequence data were determined using a laboratory-developed HCV sequence analysis tool and the NCBI non-redundant database. Agreement between the PLUS assay and core/NS5B sequencing for genotype 1 samples was 95.8% (162/169), with 96% (127/132) and 95% (35/37) agreement for 1a and 1b samples respectively. PLUS results agreed with core/NS5B sequencing for 83% (35/42) of unselected genotype 6 samples, with the remaining seven "not detected" by the PLUS assay. Among the 25 samples with ambiguous GT II results, 15 were concordant by PLUS and core/NS5B sequencing, nine were not detected by PLUS, and one sample had an internal control failure. The PLUS assay is an automated method that identifies 1a, 1b and genotype 6 with good agreement with gold-standard core/NS5B sequencing and can aid in the resolution of certain genotype samples with ambiguous GT II results. Copyright © 2017 Elsevier B.V. All rights reserved.
Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations.

PubMed

Chin, Ephrem L H; da Silva, Cristina; Hegde, Madhuri

2013-02-19

Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput, and due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is only beginning to be used in clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, mutations in regions of a gene typically not tested, such as deep intronic and promoter regions, can also be detected. Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting using standard PCR-based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data, identifying all 119 previously identified changes in the 20 samples. We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
Neuropeptidomics of the Mosquito Aedes Aegypti

DTIC Science & Technology

2010-01-01

translational processing ( pyroglutamate formation) was detected for AST-C and CAPA-PVK-2. For the first time in insects, we succeeded in the direct...hormones, trace DNA sequences generated by TIGR and the Broad Institute were first searched by TBLASTN24 using amino acid sequences of candidate peptides...previously described.1 TBLASTN searches, using the amino acid sequences of putative Ae. aegypti neuropeptide and peptide hormone orthologs identified in
Simultaneous activation of parallel sensory pathways promotes a grooming sequence in Drosophila

PubMed Central

Hampel, Stefanie; McKellar, Claire E

2017-01-01

A central model that describes how behavioral sequences are produced features a neural architecture that readies different movements simultaneously, and a mechanism where prioritized suppression between the movements determines their sequential performance. We previously described a model whereby suppression drives a Drosophila grooming sequence that is induced by simultaneous activation of different sensory pathways that each elicit a distinct movement (Seeds et al., 2014). Here, we confirm this model using transgenic expression to identify and optogenetically activate sensory neurons that elicit specific grooming movements. Simultaneous activation of different sensory pathways elicits a grooming sequence that resembles the naturally induced sequence. Moreover, the sequence proceeds after the sensory excitation is terminated, indicating that a persistent trace of this excitation induces the next grooming movement once the previous one is performed. This reveals a mechanism whereby parallel sensory inputs can be integrated and stored to elicit a delayed and sequential grooming response. PMID:28887878
Analysis of a MULE-cyanide hydratase gene fusion in Verticillium dahliae

USDA-ARS's Scientific Manuscript database

The genome of the phytopathogenic fungus Verticillium dahliae encodes numerous Class II “cut-and-paste” transposable elements, including those of a small group of MULE transposons. We have previously identified a fusion event between a MULE transposon sequence and sequence encoding a cyanide hydrata...
Complete Genome Sequence of a Porcine Polyomavirus from Nasal Swabs of Pigs with Respiratory Disease

PubMed Central

Smith, Catherine; Bishop, Brian; Stewart, Chelsea; Simonson, Randy

2018-01-01

Metagenomic sequencing of pooled nasal swabs from pigs with unexplained respiratory disease identified a large number of reads mapping to a previously uncharacterized porcine polyomavirus. Sus scrofa polyomavirus 2 was most closely related to betapolyomaviruses frequently detected in mammalian respiratory samples. PMID:29700160
Gene identification and analysis of transcripts differentially regulated in fracture healing by EST sequencing in the domestic sheep.

PubMed

Hecht, Jochen; Kuhl, Heiner; Haas, Stefan A; Bauer, Sebastian; Poustka, Albert J; Lienau, Jasmin; Schell, Hanna; Stiege, Asita C; Seitz, Volkhard; Reinhardt, Richard; Duda, Georg N; Mundlos, Stefan; Robinson, Peter N

2006-07-05

The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes for the initiation of osteogenesis. The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.
Sequence Segmentation with changeptGUI.

PubMed

Tasker, Edward; Keith, Jonathan M

2017-01-01

Many biological sequences have a segmental structure that can provide valuable clues to their content, structure, and function. The program changept is a tool for investigating the segmental structure of a sequence, and can also be applied to multiple sequences in parallel to identify a common segmental structure, thus providing a method for integrating multiple data types to identify functional elements in genomes. In the previous edition of this book, a command line interface for changept is described. Here we present a graphical user interface for this package, called changeptGUI. This interface also includes tools for pre- and post-processing of data and results to facilitate investigation of the number and characteristics of segment classes.
Re-sequencing and genetic variation identification of a rice line with ideal plant architecture.

PubMed

Li, Shuangcheng; Xie, Kailong; Li, Wenbo; Zou, Ting; Ren, Yun; Wang, Shiquan; Deng, Qiming; Zheng, Aiping; Zhu, Jun; Liu, Huainian; Wang, Lingxia; Ai, Peng; Gao, Fengyan; Huang, Bin; Cao, Xuemei; Li, Ping

2012-12-01

The ideal plant architecture (IPA) includes several important characteristics such as low tiller numbers, few or no unproductive tillers, more grains per panicle, and thick and sturdy stems. We have developed an indica restorer line 7302R that displays the IPA phenotype in terms of tiller number, grain number, and stem strength. However, its mechanism remained to be clarified. We performed re-sequencing and genome-wide variation analysis of 7302R using the Solexa sequencing technology. With the genomic sequence of the indica cultivar 9311 as reference, 307 627 SNPs, 57 372 InDels, and 3 096 SVs were identified in the 7302R genome. The 7302R-specific variations were investigated via the synteny analysis of all the SNPs of 7302R with those of the previously sequenced non-IPA-type lines IR24, MH63, and SH527. Moreover, we found 178 168 7302R-specific SNPs across the whole genome and 30 239 SNPs in the predicted mRNA regions, among which 8 517 were Non-syn CDS. In addition, 263 large-effect SNPs that were expected to affect the integrity of encoded proteins were identified from the 7302R-specific SNPs. SNPs of several important previously cloned rice genes were also identified by aligning the 7302R sequence with other sequence lines. Our results provide several candidates that may account for the IPA phenotype of 7302R. These results therefore lay the groundwork for long-term efforts to uncover important genes and alleles for rice plant architecture construction, and also offer useful data resources for future genetic and genomic studies in rice.
Functional brain activation differences in stuttering identified with a rapid fMRI sequence

PubMed Central

Kraft, Shelly Jo; Choo, Ai Leen; Sharma, Harish; Ambrose, Nicoline G.

2011-01-01

The purpose of this study was to investigate whether brain activity related to the presence of stuttering can be identified with rapid functional MRI (fMRI) sequences that involved overt and covert speech processing tasks. The long-term goal is to develop sensitive fMRI approaches with developmentally appropriate tasks to identify deviant speech motor and auditory brain activity in children who stutter closer to the age at which recovery from stuttering is documented. Rapid sequences may be preferred for individuals or populations who do not tolerate long scanning sessions. In this report, we document the application of a picture naming and phoneme monitoring task in three-minute fMRI sequences with adults who stutter (AWS). If relevant brain differences are found in AWS with these approaches that conform to previous reports, then these approaches can be extended to younger populations. Pairwise contrasts of brain BOLD activity between AWS and normally fluent adults indicated the AWS showed higher BOLD activity in the right inferior frontal gyrus (IFG), right temporal lobe and sensorimotor cortices during picture naming and higher activity in the right IFG during phoneme monitoring. The right lateralized pattern of BOLD activity together with higher activity in sensorimotor cortices is consistent with previous reports, which indicates rapid fMRI sequences can be considered for investigating stuttering in younger participants. PMID:22133409
Sequence requirement of the ade6-4095 meiotic recombination hotspot in Schizosaccharomyces pombe.

PubMed

Foulis, Steven J; Fowler, Kyle R; Steiner, Walter W

2018-02-01

Homologous recombination occurs at a greatly elevated frequency in meiosis compared to mitosis and is initiated by programmed double-strand DNA breaks (DSBs). DSBs do not occur at uniform frequency throughout the genome in most organisms, but occur preferentially at a limited number of sites referred to as hotspots. The locations of hotspots have been determined at nucleotide-level resolution in both the budding and fission yeasts, and while several patterns have emerged regarding preferred locations for DSB hotspots, it remains unclear why particular sites experience DSBs at much higher frequency than other sites with seemingly similar properties. Short sequence motifs, which are often sites for binding of transcription factors, are known to be responsible for a number of hotspots. In this study we identified the minimum sequence required for activity of one such motif, identified in a screen of random sequences capable of producing recombination hotspots. The experimentally determined sequence, GGTCTRGACC, closely matches the previously inferred sequence. Full hotspot activity requires an effective sequence length of 9.5 bp, whereas moderate activity requires an effective sequence length of approximately 8.2 bp and shows significant association with DSB hotspots. In combination with our previous work, this result is consistent with a large number of different sequence motifs capable of producing recombination hotspots, and supports a model in which hotspots can be rapidly regenerated by mutation as they are lost through recombination.
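The motif GGTCTRGACC uses the IUPAC ambiguity code R (A or G). A minimal sketch of how such a degenerate motif could be located in a DNA string is given below; the helper name `find_motif`, the toy sequence, and the choice to scan only the forward strand are assumptions made for illustration and do not reproduce the authors' analysis.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def find_motif(sequence, motif):
    """Return 0-based start positions of a degenerate motif (forward strand only)."""
    core = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead is used so that overlapping occurrences are also reported.
    pattern = re.compile(f"(?=({core}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence containing both variants allowed by R.
toy_sequence = "AAGGTCTAGACCTTGGTCTGGACCAA"
positions = find_motif(toy_sequence, "GGTCTRGACC")
print(positions)
```

A genome-wide scan would normally also search the reverse complement; that step is omitted here to keep the sketch short.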
Molecular characterization of a distinct monopartite begomovirus associated with betasatellites and alphasatellites infecting Pisum sativum in Nepal.

PubMed

Shahid, M S; Pudashini, B J; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

2017-04-01

Pea (Pisum sativum) plants exhibiting leaf distortion, yellowing, stunted growth and reduction in leaf size from Rampur, Nepal were shown to be infected by a begomovirus in association with betasatellites and alphasatellites. The begomovirus associated with the disease showed only low levels of nucleotide sequence identity (<91%) to previously characterized begomoviruses. This finding indicates that the pea samples were infected with an as yet undescribed begomovirus for which the name Pea leaf distortion virus (PLDV) is proposed. Two species of betasatellite were identified in association with PLDV. One group of sequences had high (>78%) nucleotide sequence identity to isolates of Ludwigia leaf distortion betasatellite (LuLDB), and the second group had less than 78% identity to all other betasatellite sequences. This showed PLDV to be associated with either LuLDB or a previously undescribed betasatellite for which the name Pea leaf distortion betasatellite is proposed. Two types of alphasatellites were identified in the PLDV-infected pea plants. The first type showed high levels of sequence identity to Ageratum yellow vein alphasatellite, and the second type showed high levels of identity to isolates of Sida yellow vein China alphasatellite. These are the first begomovirus, betasatellites and alphasatellites isolated from pea.
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing.

PubMed

Jäger, Marten; Ott, Claus-Eric; Grünhagen, Johannes; Hecht, Jochen; Schell, Hanna; Mundlos, Stefan; Duda, Georg N; Robinson, Peter N; Lienau, Jasmin

2011-03-24

The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. 
This information will provide a basis for future molecular research involving the sheep as a model organism.
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing

PubMed Central

2011-01-01

Background: The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Results: Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Conclusions: Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. This information will provide a basis for future molecular research involving the sheep as a model organism. PMID:21435219
Petri net modeling of high-order genetic systems using grammatical evolution.

PubMed

Moore, Jason H; Hahn, Lance W

2003-11-01

Understanding how DNA sequence variations impact human health through a hierarchy of biochemical and physiological systems is expected to improve the diagnosis, prevention, and treatment of common, complex human diseases. We have previously developed a hierarchical dynamic systems approach based on Petri nets for generating biochemical network models that are consistent with genetic models of disease susceptibility. This modeling approach uses an evolutionary computation approach called grammatical evolution as a search strategy for optimal Petri net models. We have previously demonstrated that this approach routinely identifies biochemical network models that are consistent with a variety of genetic models in which disease susceptibility is determined by nonlinear interactions between two DNA sequence variations. In the present study, we evaluate whether the Petri net approach is capable of identifying biochemical networks that are consistent with disease susceptibility due to higher order nonlinear interactions between three DNA sequence variations. The results indicate that our model-building approach is capable of routinely identifying good, but not perfect, Petri net models. Ideas for improving the algorithm for this high-dimensional problem are presented.
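For readers unfamiliar with Petri nets, the basic firing rule can be sketched in a few lines. The example below is a generic place/transition net with made-up place names; it does not implement the authors' hierarchical model or the grammatical evolution search, and every name in it is an assumption for illustration.

```python
def is_enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(place, 0) >= weight
               for place, weight in transition["inputs"].items())


def fire(marking, transition):
    """Return the new marking after firing; the original marking is not modified."""
    if not is_enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, weight in transition["inputs"].items():
        new_marking[place] -= weight          # consume input tokens
    for place, weight in transition["outputs"].items():
        new_marking[place] = new_marking.get(place, 0) + weight  # produce outputs
    return new_marking


# Hypothetical biochemical step: an enzyme converts one substrate to one product.
marking = {"substrate": 2, "enzyme": 1, "product": 0}
reaction = {
    "inputs": {"substrate": 1, "enzyme": 1},
    "outputs": {"product": 1, "enzyme": 1},
}

after_one = fire(marking, reaction)
after_two = fire(after_one, reaction)
print(after_one, after_two, is_enabled(after_two, reaction))
```

After two firings the substrate place is empty, so the transition is no longer enabled; a search procedure would compare many such nets against the target genetic model.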

Early diagnosis of a Mexican variant of Papaya meleira virus (PMeV-Mx) by RT-PCR.

PubMed

Zamudio-Moreno, E; Ramirez-Prado, J H; Moreno-Valenzuela, O A; Lopez-Ochoa, L A

2015-02-06

Papaya meleira disease was identified in Brazil in the 1980s. The disease is caused by a double-stranded RNA virus known as Papaya meleira virus (PMeV), which has also been recently reported in Mexico. However, previously reported PMeV primers failed to diagnose the Mexican form of the disease. A genomic approach was used to identify sequences of the Mexican virus isolate, referred to here as PMeV-Mx, to develop a diagnostic method. A mini cDNA library was generated using total RNA from the latex of fruits; this RNA was also sequenced using the Illumina platform. Sequences corresponding to the previously reported 669-base pair sequence for PMeV from Brazil (PMeV-Br) were identified within the PMeV-Mx genome, exhibiting 79-92% identity with PMeV-Br. In addition, a new sequence of 1154 base pairs encoding a putative RNA-dependent RNA polymerase was identified in PMeV-Mx. Primers designed against this sequence detected both virus isolates, producing 2 amplicons of 173 and 491 base pairs from PMeV-Br and PMeV-Mx, which shared 100 and 98% identity, respectively. PMeV-Mx was found in the latex of fruits, in seedlings, and in the leaves, flowers, petioles, and seeds of mature plants. PMeV-Mx was more abundant in the latex of fruits than in the leaves. The limit of detection of the CB38/CB39 primer pair was 1 fg and 1 pg using total RNA extracted from the latex of fruits and from seedlings, respectively. A sensitive and early diagnosis protocol was developed; this method will enable the certification of seeds and seedlings prior to transplantation to the field.
Proteomic analysis of the venom from the scorpion Mesobuthus martensii.

PubMed

Xu, Xiaobo; Duan, Zhigui; Di, Zhiyong; He, Yawen; Li, Jianglin; Li, Zhongjie; Xie, Chunliang; Zeng, Xiongzhi; Cao, Zhijian; Wu, Yingliang; Liang, Songping; Li, Wenxin

2014-06-25

The scorpion Mesobuthus martensii is the most populous species in eastern Asian countries, and several toxic components have been identified from its venom. Nevertheless, a complete proteomic profile of the venom of M. martensii is still not available. In this study, the venom of M. martensii was analyzed by comprehensive proteomic approaches. 153 fractions were isolated from the M. martensii venom by 2-DE, SDS-PAGE and RP-HPLC. The ESI-Q-TOF MS results of all fractions were used to search the scorpion genomic and transcriptomic databases. In total, 227 non-redundant protein sequences were unambiguously identified, composed of 134 previously known and 93 previously unknown proteins. Among the 134 previously known proteins, 115 were confirmed for the first time in M. martensii crude venom and 19 toxins were confirmed once again; together these comprised 43 typical toxins, 7 atypical toxins, 12 venom enzymes and 72 cell associated proteins. Among the typical toxins, 7 novel toxin sequences were identified, including 3 Na(+)-channel toxins, 3 K(+)-channel toxins and 1 unannotated toxin. These results increased the number of known venom components by 230% (115/50) compared with previous studies of M. martensii venom, including a 50% (24/48) increase in typical toxins. Additionally, a mass fingerprint obtained by MALDI-TOF MS indicated that the scorpion venom contained more than 200 different molecular mass components. This work provides the first systematic investigation of the M. martensii venom by a combined proteomics strategy coupled with genomics and transcriptomics. A large number of protein components were unambiguously identified from the venom of M. martensii, most of which were confirmed for the first time. We also contributed 7 novel toxin sequences and 93 protein sequences previously unknown to be part of the venom, for which we assigned potential biological functions. In addition, we obtained a mass fingerprint of the M. martensii venom. Together, our study not only provides the most comprehensive catalog of the molecular diversity of the M. martensii venom at the proteomic level, but also enriches the composition information of scorpion venom. Copyright © 2014 Elsevier B.V. All rights reserved.
A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance.

PubMed

Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

2016-01-01

Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available.
Intergenic Sequence Ribotyping using a region neighboring dkgB links genovar to Kauffman-White serotype of Salmonella enterica

USDA-ARS's Scientific Manuscript database

Previous research identified that the 5S ribosomal (rrn) gene and associated flanking sequences that are closely linked to the dkgB gene of Salmonella enterica were highly variable between serotypes, but not between subpopulations within the same serotype (PMID: 17005008). The degree of variability ...
Analysis of the complete genome of subgroup A' hepatitis B virus isolates from South Africa.

PubMed

Kramvis, Anna; Weitzmann, Louise; Owiredu, William K B A; Kew, Michael C

2002-04-01

A phylogenetic analysis is presented of six complete and seven pre-S1/S2/S gene sequences of hepatitis B virus (HBV) isolates from South Africa. Five of the full-length sequences and all of the pre-S2/S sequences have been previously reported. Four of the six complete genomes and three of the five incomplete sequences clustered with subgroup A', a unique segment of genotype A of HBV previously identified in 60% of South African isolates using analysis of the pre-S2/S region alone. This separation was also evident when the polymerase open reading frame was analysed, but not on analysis of either the X or pre-core/core genes. Amino acids were identified in the pre-S1 and polymerase regions specific to subgroup A'. In common with genotype D, 10 of 11 genotype A South African isolates had an 11 amino acid deletion in the amino end of the pre-S1 region. This deletion is also found in hepadnaviruses from non-human primates.
Taxonomic evaluation of putative Streptomyces scabiei strains held in the ARS Culture Collection (NRRL) using multi-locus sequence analysis.

PubMed

Labeda, David P

2016-03-01

Multi-locus sequence analysis has been demonstrated to be a useful tool for identification of Streptomyces species and was previously applied to phylogenetically differentiate the type strains of species pathogenic on potatoes (Solanum tuberosum L.). The ARS Culture Collection (NRRL) contains 43 strains identified as Streptomyces scabiei deposited at various times since the 1950s and these were subjected to multi-locus sequence analysis utilising partial sequences of the house-keeping genes atpD, gyrB, recA, rpoB and trpB. Phylogenetic analyses confirmed the identity of 17 of these strains as Streptomyces scabiei, 9 of the strains as the potato-pathogenic species Streptomyces europaeiscabiei and 6 strains as potentially new phytopathogenic species. Of the 16 other strains, 12 were identified as members of previously described non-pathogenic Streptomyces species while the remaining 4 strains may represent heretofore unrecognised non-pathogenic species. This study demonstrated the value of this technique for the relatively rapid, simple and sensitive molecular identification of Streptomyces strains held in culture collections.
The membrane skeleton in Paramecium: Molecular characterization of a novel epiplasmin family and preliminary GFP expression results.

PubMed

Pomel, Sébastien; Diogon, Marie; Bouchard, Philippe; Pradel, Lydie; Ravet, Viviane; Coffe, Gérard; Viguès, Bernard

2006-02-01

Previous attempts to identify the membrane skeleton of Paramecium cells have revealed a protein pattern that is both complex and specific. The most prominent structural elements, epiplasmic scales, are centered around ciliary units and are closely apposed to the cytoplasmic side of the inner alveolar membrane. We sought to characterize epiplasmic scale proteins (epiplasmins) at the molecular level. PCR approaches enabled the cloning and sequencing of two closely related genes by amplifications of sequences from a macronuclear genomic library. Using these two genes (EPI-1 and EPI-2), we have contributed to the annotation of the Paramecium tetraurelia macronuclear genome and identified 39 additional (paralogous) sequences. Two orthologous sequences were found in the Tetrahymena thermophila genome. Structural analysis of the 43 sequences indicates that the hallmark of this new multigenic family is a 79 aa domain flanked by two Q-, P- and V-rich stretches of sequence that are much more variable in amino-acid composition. Such features clearly distinguish members of the multigenic family from epiplasmic proteins previously sequenced in other ciliates. The expression of Green Fluorescent Protein (GFP)-tagged epiplasmin showed significant labeling of epiplasmic scales as well as oral structures. We expect that the GFP construct described herein will prove to be a useful tool for comparative subcellular localization of different putative epiplasmins in Paramecium.
Structural and sequence features of two residue turns in beta-hairpins.

PubMed

Madan, Bharat; Seo, Sung Yong; Lee, Sun-Gu

2014-09-01

Beta-turns in beta-hairpins have been implicated as important sites in protein folding. In particular, two residue β-turns, the most abundant connecting elements in beta-hairpins, have been a major target for engineering protein stability and folding. In this study, we attempted to investigate and update the structural and sequence properties of two residue turns in beta-hairpins with a large data set. For this, 3977 beta-turns were extracted from 2394 nonhomologous protein chains and analyzed. First, the distribution, dihedral angles and twists of two residue turn types were determined, and compared with previous data. The trend of turn type occurrence and most structural features of the turn types were similar to previous results, but for the first time Type II turns in beta-hairpins were identified. Second, sequence motifs for the turn types were devised based on amino acid positional potentials of two-residue turns, and their distributions were examined. From this study, we could identify code-like sequence motifs for the two residue beta-turn types. Finally, structural and sequence properties of beta-strands in the beta-hairpins were analyzed, which revealed that the beta-strands showed no specific sequence and structural patterns for turn types. The analytical results in this study are expected to be a reference in the engineering or design of beta-hairpin turn structures and sequences. © 2014 Wiley Periodicals, Inc.
The transcriptome of sesquiterpenoid biosynthesis in heartwood xylem of Western Australian sandalwood (Santalum spicatum).

PubMed

Moniodis, Jessie; Jones, Christopher G; Barbour, E Liz; Plummer, Julie A; Ghisalberti, Emilio L; Bohlmann, Joerg

2015-05-01

The fragrant heartwood oil of West Australian sandalwood (Santalum spicatum) contains a mixture of sesquiterpene olefins and alcohols, including variable levels of the valuable sesquiterpene alcohols, α- and β-santalol, and often high levels of E,E-farnesol. Transcriptome analysis revealed sequences for a nearly complete set of genes of the sesquiterpenoid biosynthetic pathway in this commercially valuable sandalwood species. Transcriptome sequences were produced from heartwood xylem tissue of a farnesol-rich individual tree. From the assembly of 12,537 contigs, seven different terpene synthases (TPSs), several cytochromes P450, and allylic phosphatases were identified, as well as transcripts of the mevalonic acid and methylerythritol phosphate pathways. Five of the S. spicatum TPS sequences were previously unknown. The full-length cDNA of SspiTPS4 was cloned and the enzyme functionally characterized as a multi-product sesquisabinene B synthase, which complements previous characterization of santalene and bisabolol synthases in S. spicatum. While SspiTPS4 and previously cloned sandalwood TPSs do not explain the prevalence of E,E-farnesol in S. spicatum, the genes identified in this and previous work can form a basis for future studies on natural variation of sandalwood terpenoid oil profiles. Copyright © 2014 Elsevier Ltd. All rights reserved.
X-exome sequencing in Finnish families with Intellectual Disability - four novel mutations and two novel syndromic phenotypes

PubMed Central

2014-01-01

Background: X-linked intellectual disability (XLID) is a group of genetically heterogeneous disorders characterized by substantial impairment in cognitive abilities, social and behavioral adaptive skills. Next generation sequencing technologies have become a powerful approach for identifying molecular gene mutations relevant for diagnosis. Methods & objectives: Enrichment of X-chromosome specific exons and massively parallel sequencing was performed for identifying the causative mutations in 14 Finnish families, each of them having several males affected with intellectual disability of unknown cause. Results: We found four novel mutations in known XLID genes. Two mutations, one previously reported missense mutation (c.1111C > T) and one novel frameshift mutation (c.990_991insGCTGC), were identified in SLC16A2, a gene that has been linked to Allan-Herndon-Dudley syndrome (AHDS). One novel missense mutation (c.1888G > C) was found in GRIA3 and two novel splice donor site mutations (c.357 + 1G > C and c.985 + 1G > C) were identified in the DLG3 gene. One missense mutation (c.1321C > T) was identified in the candidate gene ZMYM3 in three affected males with a previously unrecognized syndrome characterized by unique facial features, aortic stenosis and hypospadias. All of the identified mutations segregated in the corresponding families and were absent in > 100 Finnish controls and in the publicly available databases. In addition, a previously reported benign variant (c.877G > A) in SYP was identified in a large family with nine affected males in three generations, who have a syndromic phenotype. Conclusions: All of the mutations found in this study are being reported for the first time in Finnish families with several affected male patients whose etiological diagnoses have remained unknown to us, in some families, for more than 30 years. This study illustrates the impact of X-exome sequencing to identify rare gene mutations and the challenges of interpreting the results. Further functional studies are required to confirm the cause of the syndromic phenotypes associated with ZMYM3 and SYP in this study. PMID:24721225
Analysis of MHC class I genes across horse MHC haplotypes

PubMed Central

Tallmadge, Rebecca L.; Campbell, Julie A.; Miller, Donald C.; Antczak, Douglas F.

2010-01-01

The genomic sequences of 15 horse Major Histocompatibility Complex (MHC) class I genes and a collection of MHC class I homozygous horses of five different haplotypes were used to investigate the genomic structure and polymorphism of the equine MHC. A combination of conserved and locus-specific primers was used to amplify horse MHC class I genes with classical and non-classical characteristics. Multiple clones from each haplotype identified three to five classical sequences per homozygous animal, and two to three non-classical sequences. Phylogenetic analysis was applied to these sequences and groups were identified which appear to be allelic series, but some sequences were left ungrouped. Sequences determined from MHC class I heterozygous horses and previously described MHC class I sequences were then added, representing a total of ten horse MHC haplotypes. These results were consistent with those obtained from the MHC homozygous horses alone, and 30 classical sequences were assigned to four previously confirmed loci and three new provisional loci. The non-classical genes had few alleles and the classical genes had higher levels of allelic polymorphism. Alleles for two classical loci with the expected pattern of polymorphism were found in the majority of haplotypes tested, but alleles at two other commonly detected loci had more variation outside of the hypervariable region than within. Our data indicate that the equine Major Histocompatibility Complex is characterized by variation in the complement of class I genes expressed in different haplotypes in addition to the expected allelic polymorphism within loci. PMID:20099063
A gyrovirus infecting a sea bird

PubMed Central

Li, Linlin; Pesavento, Patricia A.; Gaynor, Anne M.; Duerr, Rebecca S.; Phan, Tung Gia; Zhang, Wen; Deng, Xutao

2015-01-01

We characterized the genome of a highly divergent gyrovirus (GyV8) in the spleen and uropygial gland tissues of a diseased northern fulmar (Fulmarus glacialis), a pelagic bird beached in San Francisco, California. No other exogenous viral sequences could be identified using viral metagenomics. The small circular DNA genome shared no significant nucleotide sequence identity, and only 38–42 % amino acid sequence identity in VP1, with any of the previously identified gyroviruses. GyV8 is the first member of the third major phylogenetic clade of this viral genus and the first gyrovirus detected in an avian species other than chicken. PMID:26036564
Exome sequencing of a large family identifies potential candidate genes contributing risk to bipolar disorder.

PubMed

Zhang, Tianxiao; Hou, Liping; Chen, David T; McMahon, Francis J; Wang, Jen-Chyong; Rice, John P

2018-03-01

Bipolar disorder is a mental illness with a lifetime prevalence of about 1%. Previous genetic studies have identified multiple chromosomal linkage regions and candidate genes that might be associated with bipolar disorder. The present study aimed to identify potential susceptibility variants for bipolar disorder using 6 related case samples from a four-generation family. A combination of exome sequencing and linkage analysis was performed to identify potential susceptibility variants for bipolar disorder. Our study identified a list of five potential candidate genes for bipolar disorder. Among these five genes, GRID1 (Glutamate Receptor Delta-1 Subunit), which was previously reported to be associated with several psychiatric disorders and brain related traits, is particularly interesting. Variants with functional significance in this gene were identified from two cousins in our bipolar disorder pedigree. Our findings suggest a potential role for these genes and the related rare variants in the onset and development of bipolar disorder in this one family. Additional research is needed to replicate these findings and evaluate their patho-biological significance. Copyright © 2017 Elsevier B.V. All rights reserved.
Genotyping of Leptospira directly in urine samples of cattle demonstrates a diversity of species and strains in Brazil.

PubMed

Hamond, C; Pestana, C P; Medeiros, M A; Lilenbaum, W

2016-01-01

The aim of this study was to identify Leptospira in urine samples of cattle by direct sequencing of the secY gene. The validity of this approach was assessed using ten Leptospira strains obtained from cattle in Brazil and 77 DNA samples previously extracted from cattle urine, that were positive by PCR for the genus-specific lipL32 gene of Leptospira. Direct sequencing identified 24 (31·1%) interpretable secY sequences and these were identical to those obtained from direct DNA sequencing of the urine samples from which they were recovered. Phylogenetic analyses identified four species: L. interrogans, L. borgpetersenii, L. noguchii, and L. santarosai with the most prevalent genotypes being associated with L. borgpetersenii. While direct sequencing cannot, as yet, replace culturing of leptospires, it is a valid additional tool for epidemiological studies. An unexpected finding from this study was the genetic diversity of Leptospira infecting Brazilian cattle.
Diversity of virus-host systems in hypersaline Lake Retba, Senegal.

PubMed

Sime-Ngando, Télesphore; Lucas, Soizick; Robin, Agnès; Tucker, Kimberly Pause; Colombet, Jonathan; Bettarel, Yvan; Desmond, Elie; Gribaldo, Simonetta; Forterre, Patrick; Breitbart, Mya; Prangishvili, David

2011-08-01

Remarkable morphological diversity of virus-like particles was observed by transmission electron microscopy in a hypersaline water sample from Lake Retba, Senegal. The majority of particles morphologically resembled hyperthermophilic archaeal DNA viruses isolated from extreme geothermal environments. Some hypersaline viral morphotypes have not been previously observed in nature, and less than 1% of observed particles had a head-and-tail morphology, which is typical for bacterial DNA viruses. Culture-independent analysis of the microbial diversity in the sample suggested the dominance of extremely halophilic archaea. Few of the 16S sequences corresponded to known archaeal genera (Haloquadratum, Halorubrum and Natronomonas), whereas the majority represented novel archaeal clades. Three sequences corresponded to a new basal lineage of the haloarchaea. Bacteria belonged to four major phyla, consistent with the known diversity in saline environments. Metagenomic sequencing of DNA from the purified virus-like particles revealed very few similarities to the NCBI non-redundant database at either the nucleotide or amino acid level. Some of the identifiable virus sequences were most similar to previously described haloarchaeal viruses, but no sequence similarities were found to archaeal viruses from extreme geothermal environments. A large proportion of the sequences had similarity to previously sequenced viral metagenomes from solar salterns. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
Validation of rearrangement break points identified by paired-end sequencing in natural populations of Drosophila melanogaster.

PubMed

Cridland, Julie M; Thornton, Kevin R

2010-01-13

Several recent studies have focused on the evolution of recently duplicated genes in Drosophila. Currently, however, little is known about the evolutionary forces acting upon duplications that are segregating in natural populations. We used a high-throughput, paired-end sequencing platform (Illumina) to identify structural variants in a population sample of African D. melanogaster. Polymerase chain reaction and sequencing confirmation of duplications detected by multiple, independent paired-ends showed that paired-end sequencing reliably uncovered the break points of structural rearrangements and allowed us to identify a number of tandem duplications segregating within a natural population. Our confirmation experiments show that rates of confirmation are very high, even at modest coverage. Our results also compare well with previous studies using microarrays (Emerson J, Cardoso-Moreira M, Borevitz JO, Long M. 2008. Natural selection shapes genome wide patterns of copy-number polymorphism in Drosophila melanogaster. Science. 320:1629-1631; and Dopman EB, Hartl DL. 2007. A portrait of copy-number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci U S A. 104:19920-19925), which both gives us confidence in the results of this study and confirms previous microarray results. We were also able to identify whole-gene duplications, such as a novel duplication of Or22a, an olfactory receptor, and identify copy-number differences in genes previously known to be under positive selection, like Cyp6g1, which confers resistance to dichlorodiphenyltrichloroethane. Several "hot spots" of duplications were detected in this study, which indicate that particular regions of the genome may be more prone to generating duplications. Finally, population frequency analysis of confirmed events also showed an excess of rare variants in our population, which indicates that duplications segregating in the population may be deleterious and ultimately destined to be lost from the population.
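The abstract does not spell out how duplications were called from paired-end data, so the following is a sketch under stated assumptions and not the authors' pipeline. It assumes the commonly used signature that read pairs spanning a tandem duplication junction map in an outward-facing ("everted") orientation, and it requires support from at least two independent pairs, echoing the abstract's "multiple, independent paired-ends". The `ReadPair` record, the 1 kb binning and the `min_support` threshold are all hypothetical choices.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class ReadPair:
    chrom: str
    left_pos: int      # leftmost mapped coordinate of the pair
    right_pos: int     # rightmost mapped coordinate of the pair
    left_strand: str   # strand of the read mapped at left_pos
    right_strand: str  # strand of the read mapped at right_pos


def is_everted(pair: ReadPair) -> bool:
    """A concordant pair faces inward (+ then -); an everted pair faces outward (- then +)."""
    return pair.left_strand == "-" and pair.right_strand == "+"


def candidate_duplications(
    pairs: Iterable[ReadPair], bin_size: int = 1000, min_support: int = 2
) -> Dict[Tuple[str, int, int], int]:
    """Count everted pairs per (chromosome, left bin, right bin) and keep supported bins."""
    support: Counter = Counter()
    for pair in pairs:
        if is_everted(pair):
            key = (pair.chrom, pair.left_pos // bin_size, pair.right_pos // bin_size)
            support[key] += 1
    return {key: count for key, count in support.items() if count >= min_support}


# Toy input (invented coordinates, for illustration only).
toy_pairs = [
    ReadPair("2L", 10100, 15200, "-", "+"),  # everted
    ReadPair("2L", 10350, 15400, "-", "+"),  # everted, same bins as above
    ReadPair("2L", 20000, 20300, "+", "-"),  # concordant, ignored
    ReadPair("3R", 5000, 9000, "-", "+"),    # everted but only one supporting pair
]
calls = candidate_duplications(toy_pairs)
print(calls)
```

Fixed-width bins are the simplest way to group nearby pairs; a real caller would more likely cluster pairs by overlap and then confirm break points, as the abstract reports doing by PCR and sequencing.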
Clonal architecture of secondary acute myeloid leukemia defined by single-cell sequencing.

PubMed

Hughes, Andrew E O; Magrini, Vincent; Demeter, Ryan; Miller, Christopher A; Fulton, Robert; Fulton, Lucinda L; Eades, William C; Elliott, Kevin; Heath, Sharon; Westervelt, Peter; Ding, Li; Conrad, Donald F; White, Brian S; Shao, Jin; Link, Daniel C; DiPersio, John F; Mardis, Elaine R; Wilson, Richard K; Ley, Timothy J; Walter, Matthew J; Graubert, Timothy A

2014-07-01

Next-generation sequencing has been used to infer the clonality of heterogeneous tumor samples. These analyses yield specific predictions (the population frequency of individual clones, their genetic composition, and their evolutionary relationships), which we set out to test by sequencing individual cells from three subjects diagnosed with secondary acute myeloid leukemia, each of whom had been previously characterized by whole genome sequencing of unfractionated tumor samples. Single-cell mutation profiling strongly supported the clonal architecture implied by the analysis of bulk material. In addition, it resolved the clonal assignment of single nucleotide variants that had been initially ambiguous and identified areas of previously unappreciated complexity. Accordingly, we find that many of the key assumptions underlying the analysis of tumor clonality by deep sequencing of unfractionated material are valid. Furthermore, we illustrate a single-cell sequencing strategy for interrogating the clonal relationships among known variants that is cost-effective, scalable, and adaptable to the analysis of both hematopoietic and solid tumors, or any heterogeneous population of cells.
Discovery of T Cell Receptor β Motifs Specific to HLA-B27-Positive Ankylosing Spondylitis by Deep Repertoire Sequence Analysis.

PubMed

Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D

2017-04-01

Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.
Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

PubMed Central

Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

2003-01-01

Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Evolutionary Analysis Predicts Sensitive Positions of MMP20 and Validates Newly- and Previously-Identified MMP20 Mutations Causing Amelogenesis Imperfecta

PubMed Central

Gasse, Barbara; Prasad, Megana; Delgado, Sidney; Huckert, Mathilde; Kawczynski, Marzena; Garret-Bernardin, Annelyse; Lopez-Cazaux, Serena; Bailleul-Forestier, Isabelle; Manière, Marie-Cécile; Stoetzel, Corinne; Bloch-Zupan, Agnès; Sire, Jean-Yves

2017-01-01

Amelogenesis imperfecta (AI) designates a group of genetic diseases characterized by a large range of enamel disorders causing important social and health problems. These defects can result from mutations in enamel matrix proteins or protease encoding genes. A range of mutations in the enamel cleavage enzyme matrix metalloproteinase-20 gene (MMP20) produce enamel defects of varying severity. To address how various alterations produce a range of AI phenotypes, we performed a targeted analysis to find MMP20 mutations in French patients diagnosed with non-syndromic AI. Genomic DNA was isolated from saliva and MMP20 exons and exon-intron boundaries sequenced. We identified several homozygous or heterozygous mutations, putatively involved in the AI phenotypes. To validate missense mutations and predict sensitive positions in the MMP20 sequence, we evolutionarily compared 75 sequences extracted from the public databases using the Datamonkey webserver. These sequences were representative of mammalian lineages, covering more than 150 million years of evolution. This analysis allowed us to find 324 sensitive positions (out of the 483 MMP20 residues), pinpoint functionally important domains, and build an evolutionary chart of important conserved MMP20 regions. This is an efficient tool to identify new- and previously-identified mutations. We thus identified six functional MMP20 mutations in unrelated families, finding two novel mutated sites. The genotypes and phenotypes of these six mutations are described and compared. To date, 13 MMP20 mutations causing AI have been reported, making these genotypes and associated hypomature enamel phenotypes the most frequent in AI. PMID:28659819
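The abstract does not state how a position was scored as "sensitive" beyond the use of the Datamonkey webserver, so the sketch below uses a deliberately simple stand-in: it treats a column of a multiple alignment as sensitive when every sequence carries the same residue. That criterion, the function name `invariant_positions` and the toy alignment are assumptions for illustration, not the authors' method. The last lines only restate the abstract's own counts (324 of 483 residues) as a percentage.

```python
from typing import List


def invariant_positions(alignment: List[str]) -> List[int]:
    """Return 1-based alignment columns where all sequences share one residue (gaps excluded)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    positions = []
    for index in range(length):
        column = {seq[index] for seq in alignment}
        if len(column) == 1 and "-" not in column:
            positions.append(index + 1)
    return positions


# Toy alignment of three short peptides (invented, not MMP20 sequence).
toy_alignment = ["MKLHE", "MKIHE", "MRLHE"]
conserved = invariant_positions(toy_alignment)
print(conserved)

# Share of MMP20 residues reported as sensitive in the abstract: 324 of 483.
sensitive_share = round(324 / 483 * 100, 1)
print(sensitive_share)
```

Under this proxy, a missense mutation falling at one of the returned positions would be flagged for closer inspection; a selection-based analysis would use a statistical criterion instead of strict invariance.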

Evolutionary Analysis Predicts Sensitive Positions of MMP20 and Validates Newly- and Previously-Identified MMP20 Mutations Causing Amelogenesis Imperfecta.

PubMed

Gasse, Barbara; Prasad, Megana; Delgado, Sidney; Huckert, Mathilde; Kawczynski, Marzena; Garret-Bernardin, Annelyse; Lopez-Cazaux, Serena; Bailleul-Forestier, Isabelle; Manière, Marie-Cécile; Stoetzel, Corinne; Bloch-Zupan, Agnès; Sire, Jean-Yves

2017-01-01

Amelogenesis imperfecta (AI) designates a group of genetic diseases characterized by a large range of enamel disorders causing important social and health problems. These defects can result from mutations in enamel matrix proteins or protease encoding genes. A range of mutations in the enamel cleavage enzyme matrix metalloproteinase-20 gene (MMP20) produce enamel defects of varying severity. To address how various alterations produce a range of AI phenotypes, we performed a targeted analysis to find MMP20 mutations in French patients diagnosed with non-syndromic AI. Genomic DNA was isolated from saliva and MMP20 exons and exon-intron boundaries sequenced. We identified several homozygous or heterozygous mutations, putatively involved in the AI phenotypes. To validate missense mutations and predict sensitive positions in the MMP20 sequence, we evolutionarily compared 75 sequences extracted from the public databases using the Datamonkey webserver. These sequences were representative of mammalian lineages, covering more than 150 million years of evolution. This analysis allowed us to find 324 sensitive positions (out of the 483 MMP20 residues), pinpoint functionally important domains, and build an evolutionary chart of important conserved MMP20 regions. This is an efficient tool to identify new- and previously-identified mutations. We thus identified six functional MMP20 mutations in unrelated families, finding two novel mutated sites. The genotypes and phenotypes of these six mutations are described and compared. To date, 13 MMP20 mutations causing AI have been reported, making these genotypes and associated hypomature enamel phenotypes the most frequent in AI.
Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex.

PubMed

Pollen, Alex A; Nowakowski, Tomasz J; Shuga, Joe; Wang, Xiaohui; Leyrat, Anne A; Lui, Jan H; Li, Nianzhen; Szpankowski, Lukasz; Fowler, Brian; Chen, Peilin; Ramalingam, Naveen; Sun, Gang; Thu, Myo; Norris, Michael; Lebofsky, Ronald; Toppani, Dominique; Kemp, Darnell W; Wong, Michael; Clerkson, Barry; Jones, Brittnee N; Wu, Shiquan; Knutsson, Lawrence; Alvarado, Beatriz; Wang, Jing; Weaver, Lesley S; May, Andrew P; Jones, Robert C; Unger, Marc A; Kriegstein, Arnold R; West, Jay A A

2014-10-01

Large-scale surveys of single-cell gene expression have the potential to reveal rare cell populations and lineage relationships but require efficient methods for cell capture and mRNA sequencing. Although cellular barcoding strategies allow parallel sequencing of single cells at ultra-low depths, the limitations of shallow sequencing have not been investigated directly. By capturing 301 single cells from 11 populations using microfluidics and analyzing single-cell transcriptomes across downsampled sequencing depths, we demonstrate that shallow single-cell mRNA sequencing (~50,000 reads per cell) is sufficient for unbiased cell-type classification and biomarker identification. In the developing cortex, we identify diverse cell types, including multiple progenitor and neuronal subtypes, and we identify EGR1 and FOS as previously unreported candidate targets of Notch signaling in human but not mouse radial glia. Our strategy establishes an efficient method for unbiased analysis and comparison of cell populations from heterogeneous tissue by microfluidic single-cell capture and low-coverage sequencing of many cells.
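The downsampling analysis described in this abstract can be illustrated with a minimal sketch. It assumes the reads of one cell are already held in memory as a list and draws a random subset without replacement; the function name `downsample`, the `seed` parameter, and the in-memory representation are hypothetical and are not taken from the study.

```python
import random
from typing import List, Sequence


def downsample(reads: Sequence[str], depth: int, seed: int = 0) -> List[str]:
    """Return `depth` reads drawn uniformly at random without replacement.

    If the cell has no more than `depth` reads, all reads are returned
    unchanged. A fixed seed keeps the subsample reproducible.
    """
    pool = list(reads)
    if depth >= len(pool):
        return pool
    rng = random.Random(seed)  # local generator, does not touch global state
    return rng.sample(pool, depth)


if __name__ == "__main__":
    # Toy example: 100 placeholder reads, subsampled to several depths.
    cell_reads = [f"read{i}" for i in range(100)]
    for target in (10, 50, 100):
        print(target, len(downsample(cell_reads, target)))
```

In practice the same cell would be subsampled to a series of depths (the abstract reports about 50,000 reads per cell as sufficient) and the downstream classification repeated at each depth.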
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

PubMed Central

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

2013-01-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

PubMed

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A

2013-07-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
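As a rough illustration of the k-mer sequence features mentioned in this abstract, the sketch below turns a DNA sequence into a normalized k-mer frequency vector of the kind that could be passed to any SVM implementation. It is a simplified assumption, not the kmer-SVM server's code: the function names are hypothetical, reverse complements are not merged, and k-mers containing characters other than A, C, G, T are ignored.

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_vocabulary(k: int) -> List[str]:
    """All 4**k DNA k-mers in a fixed lexicographic order (A, C, G, T)."""
    return ["".join(p) for p in product("ACGT", repeat=k)]


def kmer_features(seq: str, k: int) -> List[float]:
    """Normalized k-mer frequencies of `seq`, ordered as kmer_vocabulary(k)."""
    seq = seq.upper()
    vocab = kmer_vocabulary(k)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers over A/C/G/T contribute; avoid dividing by zero.
    total = sum(counts[kmer] for kmer in vocab) or 1
    return [counts[kmer] / total for kmer in vocab]


if __name__ == "__main__":
    features = kmer_features("ACGA", 2)
    print(len(features))  # 16 possible 2-mers
```

Each bound or open-chromatin region and each background region would be converted this way, and the resulting vectors with their labels used to train a classifier; large feature weights then point to candidate predictive k-mers.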
Genome-based insights into the resistome and mobilome of multidrug-resistant Aeromonas sp. ARM81 isolated from wastewater.

PubMed

Adamczuk, Marcin; Dziewit, Lukasz

2017-01-01

The draft genome of multidrug-resistant Aeromonas sp. ARM81 isolated from a wastewater treatment plant in Warsaw (Poland) was obtained. Sequence analysis revealed multiple genes conferring resistance to aminoglycosides, β-lactams or tetracycline. Three different β-lactamase genes were identified, including an extended-spectrum β-lactamase gene blaPER-1. The antibiotic susceptibility was experimentally tested. Genome sequencing also allowed us to investigate the plasmidome and transposable mobilome of ARM81. Four plasmids, of which two carry phenotypic modules (i.e., genes encoding a zinc transporter ZitB and a putative glucosyltransferase), and 28 putative transposase genes were identified. The mobility of three insertion sequences (isoforms of previously identified elements ISAs12, ISKpn9 and ISAs26) was confirmed using trap plasmids.
Sequence analysis of the complete genome of Trichoplusia ni single nucleopolyhedrovirus and the identification of a baculoviral photolyase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Willis, Leslie G.; Siepp, Robyn; Stewart, Taryn M.

2005-08-01

The genome of the Trichoplusia ni single nucleopolyhedrovirus (TnSNPV), a group II NPV which infects the cabbage looper (T. ni), has been completely sequenced and analyzed. The TnSNPV DNA genome consists of 134,394 bp and has an overall G + C content of 39%. Gene analysis predicted 144 open reading frames (ORFs) of 150 nucleotides or greater that showed minimal overlap. Comparisons with previously sequenced baculoviruses indicate that 119 TnSNPV ORFs were homologues of previously reported viral gene sequences. Ninety-four TnSNPV ORFs returned an Autographa californica multiple NPV (AcMNPV) homologue while 25 ORFs returned poor or no sequence matches with the current databases. A putative photolyase gene was also identified that had highest amino acid identity to the photolyase genes of Chrysodeixis chalcites NPV (ChchNPV) (47%) and Danio rerio (zebrafish) (40%). In addition, unlike all other baculoviruses, no obvious homologous repeat (hr) sequences were identified. Comparison of the TnSNPV and AcMNPV genomes provides a unique opportunity to examine two baculoviruses that are highly virulent for a common insect host (T. ni) yet belong to diverse baculovirus taxonomic groups and possess distinct biological features. In vitro fusion assays demonstrated that the TnSNPV F protein induces membrane fusion and syncytia formation and were compared to syncytia formed by AcMNPV GP64.
Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures

PubMed Central

Wang, Ying; Fu, Lei; Ren, Jie; Yu, Zhaoxia; Chen, Ting; Sun, Fengzhu

2018-01-01

Comparing metagenomic samples is crucial for understanding microbial communities. For different groups of microbial communities, such as human gut metagenomic samples from patients with a certain disease and healthy controls, identifying group-specific sequences offers essential information for potential biomarker discovery. A sequence that is present, or rich, in one group, but absent, or scarce, in another group is considered “group-specific” in our study. Our main purpose is to discover group-specific sequence regions between control and case groups as disease-associated markers. We developed a long k-mer (k ≥ 30 bps)-based computational pipeline to detect group-specific sequences at strain resolution free from reference sequences, sequence alignments, and metagenome-wide de novo assembly. We called our method MetaGO: Group-specific oligonucleotide analysis for metagenomic samples. An open-source pipeline on Apache Spark was developed with parallel computing. We applied MetaGO to one simulated and three real metagenomic datasets to evaluate the discriminative capability of identified group-specific markers. In the simulated dataset, 99.11% of group-specific logical 40-mers covered 98.89% disease-specific regions from the disease-associated strain. In addition, 97.90% of group-specific numerical 40-mers covered 99.61 and 96.39% of differentially abundant genome and regions between two groups, respectively. For a large-scale metagenomic liver cirrhosis (LC)-associated dataset, we identified 37,647 group-specific 40-mer features. Any one of the features can predict disease status of the training samples with the average of sensitivity and specificity higher than 0.8. The random forests classification using the top 10 group-specific features yielded a higher AUC (from ∼0.8 to ∼0.9) than that of previous studies. All group-specific 40-mers were present in LC patients, but not healthy controls. All the assembled 11 LC-specific sequences can be mapped to two strains of Veillonella parvula: UTDB1-3 and DSM2008. The experiments on the other two real datasets related to Inflammatory Bowel Disease and Type 2 Diabetes in Women consistently demonstrated that MetaGO achieved better prediction accuracy with fewer features compared to previous studies. The experiments showed that MetaGO is a powerful tool for identifying group-specific k-mers, which would be clinically applicable for disease prediction. MetaGO is available at https://github.com/VVsmileyx/MetaGO. PMID:29774017
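The presence/absence ("logical") notion of a group-specific k-mer described in this abstract can be sketched in a few lines. This is a toy, in-memory illustration under stated assumptions, not the MetaGO pipeline: the function names and the `min_case_fraction` threshold are hypothetical, each sample is simply a list of read strings, and no statistical testing or Spark parallelism is included.

```python
from typing import Dict, Iterable, List, Set


def sample_kmers(reads: Iterable[str], k: int) -> Set[str]:
    """Distinct k-mers observed in any read of one sample."""
    found: Set[str] = set()
    for read in reads:
        read = read.upper()
        found.update(read[i:i + k] for i in range(len(read) - k + 1))
    return found


def group_specific_kmers(
    cases: List[List[str]],
    controls: List[List[str]],
    k: int,
    min_case_fraction: float = 1.0,
) -> Set[str]:
    """k-mers seen in at least `min_case_fraction` of case samples
    and in none of the control samples."""
    case_sets = [sample_kmers(sample, k) for sample in cases]
    control_union: Set[str] = set()
    for sample in controls:
        control_union |= sample_kmers(sample, k)

    # Count in how many case samples each k-mer occurs.
    occurrences: Dict[str, int] = {}
    for kmer_set in case_sets:
        for kmer in kmer_set:
            occurrences[kmer] = occurrences.get(kmer, 0) + 1

    needed = min_case_fraction * len(case_sets)
    return {
        kmer
        for kmer, n in occurrences.items()
        if n >= needed and kmer not in control_union
    }


if __name__ == "__main__":
    cases = [["ACGTAC"], ["TTACGT"]]
    controls = [["GTACGG"]]
    print(group_specific_kmers(cases, controls, k=4))
```

The toy call uses k = 4 only to keep the example readable; the abstract describes long k-mers (k ≥ 30, with 40-mers in the reported experiments), for which the same logic applies but memory use grows quickly.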
Isolation of a gammaherpesvirus similar to asinine herpesvirus-2 (AHV-2) from a mule and a survey of mules and donkeys for AHV-2 infection by real-time PCR.

PubMed

Bell, Stephanie A; Pusterla, Nicola; Balasuriya, Udeni B R; Mapes, Samantha M; Nyberg, Nicole L; MacLachlan, N James

2008-07-27

Equids are commonly infected by herpesviruses, but isolation of herpesviruses from mules has apparently not been previously reported. Furthermore, the genomic relationships among the various equid herpesviruses are poorly characterized. We describe the isolation and preliminary characterization of a mule gammaherpesvirus tentatively identified as asinine herpesvirus-2 (AHV-2; also designated equid herpesvirus-7 (EHV-7)) from the nasal secretions (NS) of a healthy mule in northern California. The virus was initially identified by transmission electron microscopic examination of lysates of cell culture inoculated with NS collected from the mule. A 913 nucleotide sequence of the DNA polymerase gene was amplified using degenerate primers, and comparison of this sequence with those of various other herpesviruses showed that the mule herpesvirus was most closely related to EHV-2 (AHV-2 sequences were not available for comparison). The sequence of a shorter portion (166 nucleotides) of the mule herpesvirus DNA polymerase gene was identical to that of the published sequence of an asinine gammaherpesvirus, previously designated as AHV-4-3 (AY054992). AHV-2 was detected by real-time polymerase chain reaction assay in the NS of approximately 8% of a cohort of 114 healthy mules and 13 donkeys.
Genome-Wide Association Study Identifies Loci for Salt Tolerance during Germination in Autotetraploid Alfalfa (Medicago sativa L.) Using Genotyping-by-Sequencing

PubMed Central

Yu, Long-Xi; Liu, Xinchun; Boge, William; Liu, Xiang-Ping

2016-01-01

Salinity is one of the major abiotic stresses limiting alfalfa (Medicago sativa L.) production in the arid and semi-arid regions of the US and other countries. In this study, we used a diverse panel of alfalfa accessions previously described by Zhang et al. (2015) to identify molecular markers associated with salt tolerance during germination using genome-wide association study (GWAS) and genotyping-by-sequencing (GBS). Phenotyping was done by germinating alfalfa seeds under different levels of salt stress. Phenotypic data of adjusted germination rates and SNP markers generated by GBS were used for marker-trait association. Thirty-six markers were significantly associated with salt tolerance in at least one level of salt treatment. Alignment of sequence tags to the Medicago truncatula genome revealed genetic locations of the markers on all chromosomes except chromosome 3. The most significant markers were found on chromosomes 1, 2, and 4. BLAST search using the flanking sequences of significant markers identified 14 putative candidate genes linked to 23 significant markers. Most of them were repeatedly identified in two or three salt treatments. Several loci identified in the present study had similar genetic locations to the reported QTL associated with salt tolerance in M. truncatula. A locus identified on chromosome 6 in this study overlapped with a drought-associated locus from our previous study. To our knowledge, this is the first report on mapping loci associated with salt tolerance during germination in autotetraploid alfalfa. Further investigation of these loci and their linked genes would provide insight into the molecular mechanisms by which salt and drought stresses affect alfalfa growth. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance to drought and salt stresses. PMID:27446182
Genome-Wide Association Study Identifies Loci for Salt Tolerance during Germination in Autotetraploid Alfalfa (Medicago sativa L.) Using Genotyping-by-Sequencing.

PubMed

Yu, Long-Xi; Liu, Xinchun; Boge, William; Liu, Xiang-Ping

2016-01-01

Salinity is one of the major abiotic stresses limiting alfalfa (Medicago sativa L.) production in the arid and semi-arid regions of the US and other countries. In this study, we used a diverse panel of alfalfa accessions previously described by Zhang et al. (2015) to identify molecular markers associated with salt tolerance during germination using genome-wide association study (GWAS) and genotyping-by-sequencing (GBS). Phenotyping was done by germinating alfalfa seeds under different levels of salt stress. Phenotypic data of adjusted germination rates and SNP markers generated by GBS were used for marker-trait association. Thirty-six markers were significantly associated with salt tolerance in at least one level of salt treatment. Alignment of sequence tags to the Medicago truncatula genome revealed genetic locations of the markers on all chromosomes except chromosome 3. The most significant markers were found on chromosomes 1, 2, and 4. BLAST search using the flanking sequences of significant markers identified 14 putative candidate genes linked to 23 significant markers. Most of them were repeatedly identified in two or three salt treatments. Several loci identified in the present study had similar genetic locations to the reported QTL associated with salt tolerance in M. truncatula. A locus identified on chromosome 6 in this study overlapped with a drought-associated locus from our previous study. To our knowledge, this is the first report on mapping loci associated with salt tolerance during germination in autotetraploid alfalfa. Further investigation of these loci and their linked genes would provide insight into the molecular mechanisms by which salt and drought stresses affect alfalfa growth. Functional markers closely linked to the resistance loci would be useful for MAS to improve alfalfa cultivars with enhanced resistance to drought and salt stresses.
An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins.

PubMed

Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S

2017-02-01

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins

PubMed Central

Babbitt, Patricia C.; Ferrin, Thomas E.

2017-01-01

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method’s novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
Baseline Survey of Root-Associated Microbes of Taxus chinensis (Pilger) Rehd

PubMed Central

Sun, Guiling; Wilson, Iain W.; Wu, Jianqiang; Hoffman, Angela; Cheng, Junwen; Qiu, Deyou

2015-01-01

Taxol (paclitaxel), a diterpenoid, is one of the most effective anticancer drugs identified. Biosynthesis of taxol was considered restricted to the Taxus genera until Stierle et al. discovered that an endophytic fungus isolated from Taxus brevifolia could independently synthesize taxol. Little is known about the mechanism of taxol biosynthesis in microbes, but it has been speculated that its biosynthesis may differ from that in plants. The microbiome from the roots of Taxus chinensis has been extensively investigated with culture-dependent methods to identify taxol-synthesizing microbes, but not with culture-independent methods. Using bar-coded high-throughput sequencing in combination with a metagenomics approach, we surveyed the microbial diversity and gene composition of the root-associated microbiome from Taxus chinensis (Pilger) Rehd. High-throughput amplicon sequencing revealed 187 fungal OTUs, which is higher than any previously reported fungal number identified with the culture-dependent method, suggesting that T. chinensis roots harbor novel and diverse fungi. Some operational taxonomic units (OTUs) identified were identical to reported microbe strains possessing the ability to synthesize taxol, and several genes previously associated with taxol biosynthesis were identified through metagenomics analysis. PMID:25821956
Baseline survey of root-associated microbes of Taxus chinensis (Pilger) Rehd.

PubMed

Zhang, Qian; Liu, Hongwei; Sun, Guiling; Wilson, Iain W; Wu, Jianqiang; Hoffman, Angela; Cheng, Junwen; Qiu, Deyou

2015-01-01

Taxol (paclitaxel), a diterpenoid, is one of the most effective anticancer drugs identified. Biosynthesis of taxol was considered restricted to the Taxus genera until Stierle et al. discovered that an endophytic fungus isolated from Taxus brevifolia could independently synthesize taxol. Little is known about the mechanism of taxol biosynthesis in microbes, but it has been speculated that its biosynthesis may differ from that in plants. The microbiome from the roots of Taxus chinensis has been extensively investigated with culture-dependent methods to identify taxol-synthesizing microbes, but not with culture-independent methods. Using bar-coded high-throughput sequencing in combination with a metagenomics approach, we surveyed the microbial diversity and gene composition of the root-associated microbiome from Taxus chinensis (Pilger) Rehd. High-throughput amplicon sequencing revealed 187 fungal OTUs, which is higher than any previously reported fungal number identified with the culture-dependent method, suggesting that T. chinensis roots harbor novel and diverse fungi. Some operational taxonomic units (OTUs) identified were identical to reported microbe strains possessing the ability to synthesize taxol, and several genes previously associated with taxol biosynthesis were identified through metagenomics analysis.
Fragmentation of contaminant and endogenous DNA in ancient samples determined by shotgun sequencing; prospects for human palaeogenomics.

PubMed

García-Garcerà, Marc; Gigli, Elena; Sanchez-Quinto, Federico; Ramirez, Oscar; Calafell, Francesc; Civit, Sergi; Lalueza-Fox, Carles

2011-01-01

Despite the successful retrieval of genomes from past remains, the prospects for human palaeogenomics remain unclear because of the difficulty of distinguishing contaminant from endogenous DNA sequences. Previous sequence data generated on high-throughput sequencing platforms indicate that fragmentation of ancient DNA sequences is a characteristic trait primarily arising due to depurination processes that create abasic sites leading to DNA breaks. METHODOLOGY/PRINCIPAL FINDINGS: To investigate whether this pattern is present in ancient remains from a temperate environment, we have 454-FLX pyrosequenced different samples dated between 5,500 and 49,000 years ago: a bone from an extinct goat (Myotragus balearicus) that was treated with a depurinating agent (bleach), an Iberian lynx bone not subjected to any treatment, a human Neolithic sample from Barcelona (Spain), and a Neandertal sample from the El Sidrón site (Asturias, Spain). The efficiency of retrieval of endogenous sequences is below 1% in all cases. We have used the non-human samples to identify human sequences (0.35 and 1.4%, respectively) that we positively know are contaminants. We observed that bleach treatment appears to create a depurination-associated fragmentation pattern in resulting contaminant sequences that is indistinguishable from previously described endogenous sequences. Furthermore, the nucleotide composition pattern observed in 5' and 3' ends of contaminant sequences is much more complex than the flat pattern previously described in some Neandertal contaminants. Although much research on samples with known contaminant histories is needed, our results suggest that endogenous and contaminant sequences cannot be distinguished by the fragmentation pattern alone.
First report of the root-rot pathogen, Armillaria nabsnona, from Hawaii

Treesearch

J. W. Hanna; N. B. Klopfenstein; M. -S. Kim

2007-01-01

The genus Armillaria (2) and Armillaria mellea sensu lato (3) have been reported previously from Hawaii. However, Armillaria species in Hawaii have not been previously identified by DNA sequences, compatibility tests, or other methods that distinguish currently recognized taxa. In August 2005, Armillaria rhizomorphs and mycelial bark fans were collected from two...
Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

PubMed

Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

2013-07-30

Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.
Acral peeling skin syndrome resulting from a homozygous nonsense mutation in the CSTA gene encoding cystatin A.

PubMed

Krunic, Aleksandar L; Stone, Kristina L; Simpson, Michael A; McGrath, John A

2013-01-01

Acral peeling skin syndrome (APSS) is a clinically and genetically heterogeneous disorder. We used whole-exome sequencing to identify the molecular basis of APSS in a consanguineous Jordanian-American pedigree. We identified a homozygous nonsense mutation (p.Lys22X) in the CSTA gene, encoding cystatin A, that was confirmed using Sanger sequencing. Cystatin A is a protease inhibitor found in the cornified cell envelope, and loss-of-function mutations have previously been reported in two cases of exfoliative ichthyosis. Our study expands the molecular pathology of APSS and demonstrates the value of next-generation sequencing in the genetic characterization of inherited skin diseases. © 2013 Wiley Periodicals, Inc.
Whole-Genome Characterization of Prunus necrotic ringspot virus Infecting Sweet Cherry in China

PubMed Central

2018-01-01

Prunus necrotic ringspot virus (PNRSV) causes yield loss in most cultivated stone fruits, including sweet cherry. Using a small RNA deep-sequencing approach combined with end-genome sequence cloning, we identified the complete genomes of all three PNRSV strands from PNRSV-infected sweet cherry trees and compared them with those of two previously reported isolates. PMID:29496825
Whole-exome sequencing for mutation detection in pediatric disorders of insulin secretion: Maturity onset diabetes of the young and congenital hyperinsulinism.

PubMed

Johnson, S R; Leo, P J; McInerney-Leo, A M; Anderson, L K; Marshall, M; McGown, I; Newell, F; Brown, M A; Conwell, L S; Harris, M; Duncan, E L

2018-06-01

To assess the utility of whole-exome sequencing (WES) for mutation detection in maturity-onset diabetes of the young (MODY) and congenital hyperinsulinism (CHI). MODY and CHI are the two commonest monogenic disorders of glucose-regulated insulin secretion in childhood, with 13 causative genes known for MODY and 10 causative genes identified for CHI. The large number of potential genes makes comprehensive screening using traditional methods expensive and time-consuming. Ten subjects with MODY and five with CHI with known mutations underwent WES using two different exome capture kits (Nimblegen SeqCap EZ Human v3.0 Exome Enrichment Kit, Nextera Rapid Capture Exome Kit). Analysis was blinded to previously identified mutations, and included assessment for large deletions. The target capture of five exome capture technologies was also analyzed using sequencing data from >2800 unrelated samples. Four of five MODY mutations were identified using Nimblegen (including a large deletion in HNF1B). Although targeted, one mutation (in INS) had insufficient coverage for detection. Eleven of eleven mutations (six MODY, five CHI) were identified using Nextera Rapid (including the previously missed mutation). On reconciliation, all mutations concorded with previous data and no additional variants in MODY genes were detected. There were marked differences in the performance of the capture technologies. WES can be useful for screening for MODY/CHI mutations, detecting both point mutations and large deletions. However, capture technologies require careful selection. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

First complete genome sequence of infectious laryngotracheitis virus

PubMed Central

2011-01-01

Background Infectious laryngotracheitis virus (ILTV) is an alphaherpesvirus that causes acute respiratory disease in chickens worldwide. To date, only one complete genomic sequence of ILTV has been reported. This sequence was generated by concatenating partial sequences from six different ILTV strains. Thus, the full genomic sequence of a single (individual) strain of ILTV has not been determined previously. This study aimed to use high throughput sequencing technology to determine the complete genomic sequence of a live attenuated vaccine strain of ILTV. Results The complete genomic sequence of the Serva vaccine strain of ILTV was determined, annotated and compared to the concatenated ILTV reference sequence. The genome size of the Serva strain was 152,628 bp, with a G + C content of 48%. A total of 80 predicted open reading frames were identified. The Serva strain had 96.5% DNA sequence identity with the concatenated ILTV sequence. Notably, the concatenated ILTV sequence was found to lack four large regions of sequence, including 528 bp and 594 bp of sequence in the UL29 and UL36 genes, respectively, and two copies of a 1,563 bp sequence in the repeat regions. Considerable differences in the size of the predicted translation products of 4 other genes (UL54, UL30, UL37 and UL38) were also identified. More than 530 single-nucleotide polymorphisms (SNPs) were identified. Most SNPs were located within three genomic regions, corresponding to sequence from the SA-2 ILTV vaccine strain in the concatenated ILTV sequence. Conclusions This is the first complete genomic sequence of an individual ILTV strain. This sequence will facilitate future comparative genomic studies of ILTV by providing an appropriate reference sequence for the sequence analysis of other ILTV strains. PMID:21501528
Methods for determining the genetic affinity of microorganisms and viruses

NASA Technical Reports Server (NTRS)

Fox, George E. (Inventor); Willson, III, Richard C. (Inventor); Zhang, Zhengdong (Inventor)

2012-01-01

Selecting which sub-sequences in a database of nucleic acid such as 16S rRNA are highly characteristic of particular groupings of bacteria, microorganisms, fungi, etc. on a substantially phylogenetic tree. Also applicable to viruses comprising viral genomic RNA or DNA. A catalogue of highly characteristic sequences identified by this method is assembled to establish the genetic identity of an unknown organism. The characteristic sequences are used to design nucleic acid hybridization probes that include the characteristic sequence or its complement, or are derived from one or more characteristic sequences. A plurality of these characteristic sequences is used in hybridization to determine the phylogenetic tree position of the organism(s) in a sample. Target organisms that are represented in the original sequence database by sufficient characteristic sequences can be identified to the species or subspecies level. Oligonucleotide arrays of many probes are especially preferred. A hybridization signal can comprise fluorescence, chemiluminescence, or isotopic labeling, etc.; or sequences in a sample can be detected by direct means, e.g. mass spectrometry. The method's characteristic sequences can also be used to design specific PCR primers. The method uniquely identifies the phylogenetic affinity of an unknown organism without requiring prior knowledge of what is present in the sample. Even if the organism has not been previously encountered, the method still provides useful information about which phylogenetic tree bifurcation nodes encompass the organism.
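The selection step described in this record can be illustrated with a minimal sketch. It assumes a strict simplification that the abstract does not state: a sub-sequence counts as "characteristic" of a group only if it occurs in every member of that group and in no sequence outside it. The function name, the fixed length k, and the toy sequences are hypothetical and are not taken from the patent.

```python
def characteristic_kmers(groups, k):
    """For each group, return the k-mers found in every member of the
    group and in no sequence belonging to any other group.

    groups: dict mapping a group name to a list of sequences (strings).
    This all-or-nothing rule is an assumed simplification.
    """
    def kmers(seq):
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    shared = {}    # k-mers common to all members of a group
    present = {}   # k-mers seen in at least one member of a group
    for name, seqs in groups.items():
        sets = [kmers(s) for s in seqs]
        shared[name] = set.intersection(*sets) if sets else set()
        present[name] = set().union(*sets)

    result = {}
    for name in groups:
        outside = set()
        for other, ks in present.items():
            if other != name:
                outside |= ks
        result[name] = shared[name] - outside
    return result


# Toy input; the sequences are invented for illustration only.
toy = {"A": ["ACGTAC", "TACGTA"], "B": ["GGGTAC", "TTGGGT"]}
found = characteristic_kmers(toy, 3)
```

In practice a tolerance for partial presence within a group would probably be needed, since the abstract speaks of "highly characteristic" rather than strictly diagnostic sequences.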
Diversity and phylogeography of begomovirus-associated beta satellites of okra in India

PubMed Central

2011-01-01

Background Okra (Abelmoschus esculentus; family Malvaceae) is grown in temperate as well as subtropical regions of the world, both for human consumption as a vegetable and for industrial uses. Okra yields are affected by the diseases caused by phytopathogenic viruses. India is the largest producer of okra and in this region a major biotic constraint to production are viruses of the genus Begomovirus. Begomoviruses affecting okra across the Old World are associated with specific, symptom modulating satellites (beta satellites). We describe a comprehensive analysis of the diversity of beta satellites associated with okra in India. Results The full-length sequences of 36 beta satellites, isolated from okra exhibiting typical begomovirus symptoms (leaf curl and yellow vein), were determined. The sequences segregated into four groups. Two groups correspond to the beta satellites Okra leaf curl beta satellite (OLCuB) and Bhendi yellow vein beta satellite (BYVB) that have previously been identified in okra from the sub-continent. One sequence was distinct from all other, previously isolated beta satellites and represents a new species for which we propose the name Bhendi yellow vein India beta satellite (BYVIB). This new beta satellite was nevertheless closely related to BYVB and OLCuB. Most surprising was the identification of Croton yellow vein mosaic beta satellite (CroYVMB) in okra; a beta satellite not previously identified in a malvaceous plant species. The okra beta satellites were shown to have distinct geographic host ranges with BYVB occurring across India whereas OLCuB was only identified in northwestern India. Okra infections with CroYVMB were only identified across the northern and eastern central regions of India. A more detailed analysis of the sequences showed that OLCuB, BYVB and BYVIB share highest identity with respect to the βC1 gene. βC1 is the only gene encoded by beta satellites, the product of which is the major pathogenicity determinant of begomovirus-beta satellite complexes and is involved in overcoming host defenses based on RNAi. Conclusion The diversity of beta satellites in okra across the sub-continent is higher than previously realized and is higher than for any other malvaceous plant species so far analyzed. The beta satellites identified in okra show geographic segregation, which has implications for the development and introduction of resistant okra varieties. However, the finding that the βC1 gene of the major okra beta satellites (OLCuB, BYVB and BYVIB) shares high sequence identity provides a possible avenue to achieve broad-spectrum resistance. PMID:22188644
Characterization of NIST human mitochondrial DNA SRM-2392 and SRM-2392-I standard reference materials by next generation sequencing.

PubMed

Riman, Sarah; Kiesler, Kevin M; Borsuk, Lisa A; Vallone, Peter M

2017-07-01

Standard Reference Materials SRM 2392 and 2392-I are intended to provide quality control when amplifying and sequencing human mitochondrial genome sequences. The National Institute of Standards and Technology (NIST) offers these SRMs to laboratories performing DNA-based forensic human identification, molecular diagnosis of mitochondrial diseases, mutation detection, evolutionary anthropology, and genetic genealogy. The entire mtGenome (~16,569 bp) of SRM 2392 and 2392-I has previously been characterized at NIST by Sanger sequencing. Herein, we used the sensitivity, specificity, and accuracy offered by next generation sequencing (NGS) to: (1) re-sequence the certified values of the SRM 2392 and 2392-I; (2) confirm Sanger data with a high coverage new sequencing technology; (3) detect lower level heteroplasmies (<20%); and thus (4) support mitochondrial sequencing communities in the adoption of NGS methods. To obtain a consensus sequence for the SRMs as well as identify and control any bias, sequencing was performed using two NGS platforms and data were analyzed using different bioinformatics pipelines. Our results confirm five low level heteroplasmy sites that were not previously observed with Sanger sequencing: three sites in the GM09947A template in SRM 2392 and two sites in the HL-60 template in SRM 2392-I. Copyright © 2017 Elsevier B.V. All rights reserved.
A metagenomic viral discovery approach identifies potential zoonotic and novel mammalian viruses in Neoromicia bats within South Africa

PubMed Central

Geldenhuys, Marike; Mortlock, Marinda; Weyer, Jacqueline; Bezuidt, Oliver; Seamark, Ernest C. J.; Kearney, Teresa; Gleasner, Cheryl; Erkkila, Tracy H.; Cui, Helen; Markotter, Wanda

2018-01-01

Species within the Neoromicia bat genus are abundant and widely distributed in Africa. It is common for these insectivorous bats to roost in anthropogenic structures in urban regions. Additionally, Neoromicia capensis have previously been identified as potential hosts for Middle East respiratory syndrome (MERS)-related coronaviruses. This study aimed to ascertain the gastrointestinal virome of these bats, as viruses excreted in fecal material or which may be replicating in rectal or intestinal tissues have the greatest opportunities of coming into contact with other hosts. Samples were collected in five regions of South Africa over eight years. Initial virome composition was determined by viral metagenomic sequencing by pooling samples and enriching for viral particles. Libraries were sequenced on the Illumina MiSeq and NextSeq500 platforms, producing a combined 37 million reads. Bioinformatics analysis of the high throughput sequencing data detected the full genome of a novel species of the Circoviridae family, and also identified sequence data from the Adenoviridae, Coronaviridae, Herpesviridae, Parvoviridae, Papillomaviridae, Phenuiviridae, and Picornaviridae families. Metagenomic sequencing data was insufficient to determine the viral diversity of certain families due to the fragmented coverage of genomes and lack of suitable sequencing depth, as some viruses were detected from the analysis of reads-data only. Follow up conventional PCR assays targeting conserved gene regions for the Adenoviridae, Coronaviridae, and Herpesviridae families were used to confirm metagenomic data and generate additional sequences to determine genetic diversity. The complete coding genome of a MERS-related coronavirus was recovered with additional amplicon sequencing on the MiSeq platform. The new genome shared 97.2% overall nucleotide identity to a previous Neoromicia-associated MERS-related virus, also from South Africa. 
Conventional PCR analysis detected diverse adenovirus and herpesvirus sequences that were widespread throughout Neoromicia populations in South Africa. Furthermore, similar adenovirus sequences were detected within these populations throughout several years. With the exception of the coronaviruses, the study represents the first report of sequence data from several viral families within a Southern African insectivorous bat genus; highlighting the need for continued investigations in this regard. PMID:29579103
Identification of the Maize Gravitropism Gene lazy plant1 by a Transposon-Tagging Genome Resequencing Strategy

PubMed Central

Howard, Thomas P.; Hayward, Andrew P.; Tordillos, Anthony; Fragoso, Christopher; Moreno, Maria A.; Tohme, Joe; Kausch, Albert P.; Mottinger, John P.; Dellaporta, Stephen L.

2014-01-01

Since their initial discovery, transposons have been widely used as mutagens for forward and reverse genetic screens in a range of organisms. The problems of high copy number and sequence divergence among related transposons have often limited the efficiency at which tagged genes can be identified. A method was developed to identify the locations of Mutator (Mu) transposons in the Zea mays genome using a simple enrichment method combined with genome resequencing to identify transposon junction fragments. The sequencing library was prepared from genomic DNA by digesting with a restriction enzyme that cuts within a perfectly conserved motif of the Mu terminal inverted repeats (TIR). Paired-end reads containing Mu TIR sequences were computationally identified and chromosomal sequences flanking the transposon were mapped to the maize reference genome. This method has been used to identify Mu insertions in a number of alleles and to isolate the previously unidentified lazy plant1 (la1) gene. The la1 gene is required for the negatively gravitropic response of shoots and mutant plants lack the ability to sense gravity. Using bioinformatic and fluorescence microscopy approaches, we show that the la1 gene encodes a cell membrane and nuclear localized protein. Our Mu-Taq method is readily adaptable to identify the genomic locations of any insertion of a known sequence in any organism using any sequencing platform. PMID:24498020
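The computational step in this record (finding reads that contain the transposon terminal sequence and keeping the adjacent chromosomal flank) might be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the motif "GATTACA" is a placeholder rather than the real Mu TIR sequence, exact string matching stands in for whatever tolerant matching was actually used, and the function names and the minimum flank length are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_flank(read, tir_end, min_flank=5):
    """Return the sequence that follows tir_end in the read, searching
    both strands, or None if the motif is absent or the flank is too short.

    tir_end is the known terminal sequence of the insertion (placeholder
    here); min_flank is an assumed cutoff for a mappable flank.
    """
    for candidate in (read, revcomp(read)):
        pos = candidate.find(tir_end)
        if pos != -1:
            flank = candidate[pos + len(tir_end):]
            if len(flank) >= min_flank:
                return flank
    return None


# Placeholder motif and invented reads, for illustration only.
TIR_END = "GATTACA"
forward_read = "CCGATTACATTTGGGCC"
flank = extract_flank(forward_read, TIR_END)
```

The returned flanks would then be aligned to the reference genome with a read mapper to locate each insertion; that step is not shown.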
Novel species including Mycobacterium fukienense sp. is found from tuberculosis patients in Fujian Province, China, using phylogenetic analysis of Mycobacterium chelonae/abscessus complex.

PubMed

Zhang, Yuan Yuan; Li, Yan Bing; Huang, Ming Xiang; Zhao, Xiu Qin; Zhang, Li Shui; Liu, Wen En; Wan, Kang Lin

2013-11-01

To identify the novel species 'Mycobacterium fukienense' sp. nov of Mycobacterium chelonae/abscessus complex from tuberculosis patients in Fujian Province, China. Five of 27 clinical Mycobacterium isolates (Cls) were previously identified as M. chelonae/abscessus complex by sequencing the hsp65, rpoB, 16S-23S rRNA internal transcribed spacer region (ITS), recA and sodA house-keeping genes commonly used to describe the molecular characteristics of Mycobacterium. Clinical Mycobacterium isolates were classified according to the gene sequence using a clustering analysis program. Sequence similarity within clusters and diversity between clusters were analyzed. The 5 isolates were identified with distinct sequences exhibiting 99.8% homology in the hsp65 gene. However, a complete lack of homology was observed among the sequences of the rpoB, 16S-23S rRNA internal transcribed spacer region (ITS), sodA, and recA genes compared with M. abscessus. Furthermore, no match for rpoB, sodA, and recA genes was identified among the published sequences. The novel species, Mycobacterium fukienense, is identified from tuberculosis patients in Fujian Province, China, and does not belong to any existing subspecies of M. chelonae/abscessus complex. Copyright © 2013 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
[Molecular-genetic characterization of shiga-toxin producing Escherichia coli isolated during a food-borne outbreak in St. Petersburg in 2013].

PubMed

Onishchenko, G G; Dyatlov, I A; Svetoch, E A; Volozhantsev, N V; Bannov, V A; Kartsev, N N; Borzenkov, V N; Fursova, N K; Shemyakin, I G; Bogun, A G; Kislichkina, A A; Popova, A V; Myakinina, V P; Teimurazov, M G; Polosenko, O V; Kaftyreva, L A; Makarova, M A; Matveeva, Z N; Grechaninova, T A; Grigor'eva, N S; Kicha, E V; Zabalueva, G V; Kutasova, T B; Korzhaev, Yu N; Bashketova, N S; Bushmanova, O N; Stalevskaya, A V; Tchinjeria, I G; Zhebrun, F B

2015-01-01

Shiga toxin-producing Escherichia coli (STEC) food-borne infections are reported worldwide and represent a serious problem for public healthcare. In the Russian Federation there is little information on the epidemiology and etiology of STEC infections or on the molecular-genetic features of STEC pathogens. Our aim was to describe a food-borne outbreak that presented as hemorrhagic colitis (HC) along with hemolytic uremic syndrome (HUS), enterocolitis, and acute gastroenteritis in children in St. Petersburg in 2013. Epidemiological, microbiological, molecular-genetic and bioinformatic methods were applied. The objects of study were clinical specimens, milk and food samples, as well as STEC strains isolated during the outbreak. The outbreak of food-borne infection was found to be caused by STEC-contaminated raw milk, as confirmed by epidemiological analysis, detection of STEC DNA, and isolation of the relevant pathogens from milk and from fecal specimens of sick children. Whole-genome sequencing revealed two groups of pathogens, E. coli O157:H7 and E. coli O101:H33, among the collected strains. Group I strains were attributed to the previously known sequence type ST24, while group II strains belonged to the previously non-described sequence type ST145. Nucleotide sequences of a VT2-like prophage carrying the stx2c gene, the plasmid enterohemolysin gene, and the gene for the main STEC adhesion factor, intimin, were identified in the genomes of strains from both groups. The intimin gamma gene was identified in E. coli O157:H7 strains and the intimin iota 2 gene in E. coli O101:H33 strains. The latter had previously been identified only in enteropathogenic E. coli (EPEC) strains. Additional knowledge of the epidemiology and biology of STEC pathogens would assist clinicians and epidemiologists in diagnosing, treating and preventing hemorrhagic colitis.
Identification of novel Theileria genotypes from Grant's gazelle

PubMed Central

Hooge, Janis; Howe, Laryssa; Ezenwa, Vanessa O.

2015-01-01

Blood samples collected from Grant's gazelles (Nanger granti) in Kenya were screened for hemoparasites using a combination of microscopic and molecular techniques. All 69 blood smears examined by microscopy were positive for hemoparasites. In addition, Theileria/Babesia DNA was detected in all 65 samples screened by PCR for a ~450-base pair fragment of the V4 hypervariable region of the 18S rRNA gene. Sequencing and BLAST analysis of a subset of PCR amplicons revealed widespread co-infection (25/39) and the existence of two distinct Grant's gazelle Theileria subgroups. One group of 11 isolates clustered as a subgroup with previously identified Theileria ovis isolates from small ruminants from Europe, Asia and Africa; another group of 3 isolates clustered with previously identified Theileria spp. isolates from other African antelope. Based on extensive levels of sequence divergence (1.2–2%) from previously reported Theileria species within Kenya and worldwide, the Theileria isolates detected in Grant's gazelles appear to represent at least two novel Theileria genotypes. PMID:25973394
Detection of mumps virus genotype H in two previously vaccinated patients from Mexico City.

PubMed

Del Valle, Alberto; García, Alí A; Barrón, Blanca L

2016-06-01

Infections caused by mumps virus (MuV) have been successfully prevented through vaccination; however, in recent years, an increasing number of mumps outbreaks have been reported within vaccinated populations. In this study, MuV was genotyped for the first time in Mexico. Saliva samples were obtained from two previously vaccinated patients in Mexico City who had developed parotitis. Viral isolation was carried out in Vero cells, and the SH and HN genes were amplified by RT-PCR. Amplicons were sequenced and compared to a set of reference sequences to identify the MuV genotype.
A high-throughput venom-gland transcriptome for the Eastern Diamondback Rattlesnake (Crotalus adamanteus) and evidence for pervasive positive selection across toxin classes.

PubMed

Rokyta, Darin R; Wray, Kenneth P; Lemmon, Alan R; Lemmon, Emily Moriarty; Caudle, S Brian

2011-04-01

Despite causing considerable human mortality and morbidity, animal toxins represent a valuable source of pharmacologically active macromolecules, a unique system for studying molecular adaptation, and a powerful framework for examining structure-function relationships in proteins. Snake venoms are particularly useful in the latter regard as they consist primarily of a moderate number of proteins and peptides that have been found to belong to just a handful of protein families. As these proteins and peptides are produced in dedicated glands, transcriptome sequencing has proven to be an effective approach to identifying the expressed toxin genes. We generated a venom-gland transcriptome for the Eastern Diamondback Rattlesnake (Crotalus adamanteus) using Roche 454 sequencing technology. In the current work, we focus on transcripts encoding toxins. We identified 40 unique toxin transcripts, 30 of which have full-length coding sequences, and 10 have only partial coding sequences. These toxins account for 24% of the total sequencing reads. We found toxins from 11 previously described families of snake-venom toxins and have discovered two putative, previously undescribed toxin classes. The most diverse and highly expressed toxin classes in the C. adamanteus venom-gland transcriptome are the serine proteinases, metalloproteinases, and C-type lectins. The serine proteinases are the most abundant class, accounting for 35% of the toxin sequencing reads. Metalloproteinases are the most diverse; 11 different forms have been identified. Using our sequences and those available in public databases, we detected positive selection in seven of the eight toxin families for which sufficient sequences were available for the analysis. We find that the vast majority of the genes that contribute directly to this vertebrate trait show evidence for a role for positive selection in their evolutionary history. Copyright © 2011 Elsevier Ltd. All rights reserved.
Novel sequence variants in the TMIE gene in families with autosomal recessive nonsyndromic hearing impairment

PubMed Central

Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim

2010-01-01

To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551
Exome Sequence Analysis of 14 Families With High Myopia.

PubMed

Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

2017-04-01

To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

PubMed Central

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

2015-01-01

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identified 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166
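The average read depth approach described in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the thresholds (`max_ratio`, `min_male`), and the scaffold names and depths are hypothetical, and the sketch assumes that most scaffolds are autosomal, so that the median depth is a usable normalizer for each sex.

```python
from statistics import median

def classify_scaffolds(male_depth, female_depth, max_ratio=0.3, min_male=0.1):
    """Label scaffolds from normalized female/male read depth.

    male_depth / female_depth map scaffold name -> mean read depth.
    Thresholds are illustrative assumptions, not published values.
    """
    # Normalize each sample by its median scaffold depth (assumed autosomal).
    male_norm = median(male_depth.values())
    female_norm = median(female_depth.values())
    labels = {}
    for name, depth in male_depth.items():
        m_rel = depth / male_norm
        f_rel = female_depth.get(name, 0.0) / female_norm
        if m_rel < min_male:
            # Too little male coverage to say anything about this scaffold.
            labels[name] = "low_coverage"
        elif f_rel / m_rel <= max_ratio:
            # Covered in the male but nearly absent in the female.
            labels[name] = "Y_candidate"
        else:
            labels[name] = "other"
    return labels

# Hypothetical depths: scaf_y behaves like a Y scaffold, scaf_x like an X scaffold.
example_male = {"scaf_a": 30.0, "scaf_b": 31.0, "scaf_c": 29.0, "scaf_y": 15.0, "scaf_x": 15.5}
example_female = {"scaf_a": 30.0, "scaf_b": 29.0, "scaf_c": 31.0, "scaf_y": 0.4, "scaf_x": 30.5}
labels = classify_scaffolds(example_male, example_female)
```

In this toy input, an X-linked scaffold shows roughly double the relative depth in the female, whereas a Y-linked scaffold shows almost none, which is the contrast the ratio test relies on.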
Genetic and phylogenetic analysis of a novel parvovirus isolated from chickens in Guangxi, China.

PubMed

Feng, Bin; Xie, Zhixun; Deng, Xianwen; Xie, Liji; Xie, Zhiqin; Huang, Li; Fan, Qin; Luo, Sisi; Huang, Jiaoling; Zhang, Yanfang; Zeng, Tingting; Wang, Sheng; Wang, Leyi

2016-11-01

A previously unidentified chicken parvovirus (ChPV) strain, associated with runting-stunting syndrome (RSS), is now endemic among chickens in China. To explore the genetic diversity of ChPV strains, we determined the first complete genome sequence of a novel ChPV isolate (GX-CH-PV-7) identified in chickens in Guangxi, China, which showed moderate genome sequence similarity to reference strains. Analysis showed that the viral genome sequence is 86.4%–93.9% identical to those of other ChPVs. Genetic and phylogenetic analyses showed that this newly emergent GX-CH-PV-7 is closely related to Gallus gallus enteric parvovirus isolate ChPV 798 from the USA, indicating that they may share a common ancestor. The complete DNA sequence is 4612 bp long with an A+T content of 56.66%. We determined the first complete genome sequence of a previously unidentified ChPV strain to elucidate its origin and evolutionary status.
Defiant: (DMRs: easy, fast, identification and ANnoTation) identifies differentially Methylated regions from iron-deficient rat hippocampus.

PubMed

Condon, David E; Tran, Phu V; Lien, Yu-Chin; Schug, Jonathan; Georgieff, Michael K; Simmons, Rebecca A; Won, Kyoung-Jae

2018-02-05

Identification of differentially methylated regions (DMRs) is the initial step towards the study of DNA methylation-mediated gene regulation. Previous approaches to call DMRs suffer from false prediction, use extreme resources, and/or require library installation and input conversion. We developed a new approach called Defiant to identify DMRs. Employing Weighted Welch Expansion (WWE), Defiant showed superior performance to other predictors in the series of benchmarking tests on artificial and real data. Defiant was subsequently used to investigate DNA methylation changes in iron-deficient rat hippocampus. Defiant identified DMRs close to genes associated with neuronal development and plasticity, which were not identified by its competitor. Importantly, Defiant runs between 5 and 479 times faster than currently available software packages. Also, Defiant accepts 10 different input formats widely used for DNA methylation data. Defiant effectively identifies DMRs for whole-genome bisulfite sequencing (WGBS), reduced-representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-seq), and HpaII tiny fragment enrichment by ligation-mediated PCR-tag (HELP) assays.
The genetic architecture of type 2 diabetes.

PubMed

Fuchsberger, Christian; Flannick, Jason; Teslovich, Tanya M; Mahajan, Anubha; Agarwala, Vineeta; Gaulton, Kyle J; Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; 
Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; 
Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

2016-08-04

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of the heritability of this disease. Here, to test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole-genome sequencing in 2,657 European individuals with and without diabetes, and exome sequencing in 12,940 individuals from five ancestry groups. To increase statistical power, we expanded the sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support the idea that lower-frequency variants have a major role in predisposition to type 2 diabetes.

The genetic architecture of type 2 diabetes

PubMed Central

Ma, Clement; Fontanillas, Pierre; Moutsianas, Loukas; McCarthy, Davis J; Rivas, Manuel A; Perry, John R B; Sim, Xueling; Blackwell, Thomas W; Robertson, Neil R; Rayner, N William; Cingolani, Pablo; Locke, Adam E; Tajes, Juan Fernandez; Highland, Heather M; Dupuis, Josee; Chines, Peter S; Lindgren, Cecilia M; Hartl, Christopher; Jackson, Anne U; Chen, Han; Huyghe, Jeroen R; van de Bunt, Martijn; Pearson, Richard D; Kumar, Ashish; Müller-Nurasyid, Martina; Grarup, Niels; Stringham, Heather M; Gamazon, Eric R; Lee, Jaehoon; Chen, Yuhui; Scott, Robert A; Below, Jennifer E; Chen, Peng; Huang, Jinyan; Go, Min Jin; Stitzel, Michael L; Pasko, Dorota; Parker, Stephen C J; Varga, Tibor V; Green, Todd; Beer, Nicola L; Day-Williams, Aaron G; Ferreira, Teresa; Fingerlin, Tasha; Horikoshi, Momoko; Hu, Cheng; Huh, Iksoo; Ikram, Mohammad Kamran; Kim, Bong-Jo; Kim, Yongkang; Kim, Young Jin; Kwon, Min-Seok; Lee, Juyoung; Lee, Selyeong; Lin, Keng-Han; Maxwell, Taylor J; Nagai, Yoshihiko; Wang, Xu; Welch, Ryan P; Yoon, Joon; Zhang, Weihua; Barzilai, Nir; Voight, Benjamin F; Han, Bok-Ghee; Jenkinson, Christopher P; Kuulasmaa, Teemu; Kuusisto, Johanna; Manning, Alisa; Ng, Maggie C Y; Palmer, Nicholette D; Balkau, Beverley; Stančáková, Alena; Abboud, Hanna E; Boeing, Heiner; Giedraitis, Vilmantas; Prabhakaran, Dorairaj; Gottesman, Omri; Scott, James; Carey, Jason; Kwan, Phoenix; Grant, George; Smith, Joshua D; Neale, Benjamin M; Purcell, Shaun; Butterworth, Adam S; Howson, Joanna M M; Lee, Heung Man; Lu, Yingchang; Kwak, Soo-Heon; Zhao, Wei; Danesh, John; Lam, Vincent K L; Park, Kyong Soo; Saleheen, Danish; So, Wing Yee; Tam, Claudia H T; Afzal, Uzma; Aguilar, David; Arya, Rector; Aung, Tin; Chan, Edmund; Navarro, Carmen; Cheng, Ching-Yu; Palli, Domenico; Correa, Adolfo; Curran, Joanne E; Rybin, Denis; Farook, Vidya S; Fowler, Sharon P; Freedman, Barry I; Griswold, Michael; Hale, Daniel Esten; Hicks, Pamela J; Khor, Chiea-Chuen; Kumar, Satish; Lehne, Benjamin; Thuillier, Dorothée; Lim, 
Wei Yen; Liu, Jianjun; van der Schouw, Yvonne T; Loh, Marie; Musani, Solomon K; Puppala, Sobha; Scott, William R; Yengo, Loïc; Tan, Sian-Tsung; Taylor, Herman A; Thameem, Farook; Wilson, Gregory; Wong, Tien Yin; Njølstad, Pål Rasmus; Levy, Jonathan C; Mangino, Massimo; Bonnycastle, Lori L; Schwarzmayr, Thomas; Fadista, João; Surdulescu, Gabriela L; Herder, Christian; Groves, Christopher J; Wieland, Thomas; Bork-Jensen, Jette; Brandslund, Ivan; Christensen, Cramer; Koistinen, Heikki A; Doney, Alex S F; Kinnunen, Leena; Esko, Tõnu; Farmer, Andrew J; Hakaste, Liisa; Hodgkiss, Dylan; Kravic, Jasmina; Lyssenko, Valeriya; Hollensted, Mette; Jørgensen, Marit E; Jørgensen, Torben; Ladenvall, Claes; Justesen, Johanne Marie; Käräjämäki, Annemari; Kriebel, Jennifer; Rathmann, Wolfgang; Lannfelt, Lars; Lauritzen, Torsten; Narisu, Narisu; Linneberg, Allan; Melander, Olle; Milani, Lili; Neville, Matt; Orho-Melander, Marju; Qi, Lu; Qi, Qibin; Roden, Michael; Rolandsson, Olov; Swift, Amy; Rosengren, Anders H; Stirrups, Kathleen; Wood, Andrew R; Mihailov, Evelin; Blancher, Christine; Carneiro, Mauricio O; Maguire, Jared; Poplin, Ryan; Shakir, Khalid; Fennell, Timothy; DePristo, Mark; de Angelis, Martin Hrabé; Deloukas, Panos; Gjesing, Anette P; Jun, Goo; Nilsson, Peter; Murphy, Jacquelyn; Onofrio, Robert; Thorand, Barbara; Hansen, Torben; Meisinger, Christa; Hu, Frank B; Isomaa, Bo; Karpe, Fredrik; Liang, Liming; Peters, Annette; Huth, Cornelia; O'Rahilly, Stephen P; Palmer, Colin N A; Pedersen, Oluf; Rauramaa, Rainer; Tuomilehto, Jaakko; Salomaa, Veikko; Watanabe, Richard M; Syvänen, Ann-Christine; Bergman, Richard N; Bharadwaj, Dwaipayan; Bottinger, Erwin P; Cho, Yoon Shin; Chandak, Giriraj R; Chan, Juliana C N; Chia, Kee Seng; Daly, Mark J; Ebrahim, Shah B; Langenberg, Claudia; Elliott, Paul; Jablonski, Kathleen A; Lehman, Donna M; Jia, Weiping; Ma, Ronald C W; Pollin, Toni I; Sandhu, Manjinder; Tandon, Nikhil; Froguel, Philippe; Barroso, Inês; Teo, Yik Ying; Zeggini, 
Eleftheria; Loos, Ruth J F; Small, Kerrin S; Ried, Janina S; DeFronzo, Ralph A; Grallert, Harald; Glaser, Benjamin; Metspalu, Andres; Wareham, Nicholas J; Walker, Mark; Banks, Eric; Gieger, Christian; Ingelsson, Erik; Im, Hae Kyung; Illig, Thomas; Franks, Paul W; Buck, Gemma; Trakalo, Joseph; Buck, David; Prokopenko, Inga; Mägi, Reedik; Lind, Lars; Farjoun, Yossi; Owen, Katharine R; Gloyn, Anna L; Strauch, Konstantin; Tuomi, Tiinamaija; Kooner, Jaspal Singh; Lee, Jong-Young; Park, Taesung; Donnelly, Peter; Morris, Andrew D; Hattersley, Andrew T; Bowden, Donald W; Collins, Francis S; Atzmon, Gil; Chambers, John C; Spector, Timothy D; Laakso, Markku; Strom, Tim M; Bell, Graeme I; Blangero, John; Duggirala, Ravindranath; Tai, E Shyong; McVean, Gilean; Hanis, Craig L; Wilson, James G; Seielstad, Mark; Frayling, Timothy M; Meigs, James B; Cox, Nancy J; Sladek, Rob; Lander, Eric S; Gabriel, Stacey; Burtt, Noël P; Mohlke, Karen L; Meitinger, Thomas; Groop, Leif; Abecasis, Goncalo; Florez, Jose C; Scott, Laura J; Morris, Andrew P; Kang, Hyun Min; Boehnke, Michael; Altshuler, David; McCarthy, Mark I

2016-01-01

The genetic architecture of common traits, including the number, frequency, and effect sizes of inherited variants that contribute to individual risk, has been long debated. Genome-wide association studies have identified scores of common variants associated with type 2 diabetes, but in aggregate, these explain only a fraction of heritability. To test the hypothesis that lower-frequency variants explain much of the remainder, the GoT2D and T2D-GENES consortia performed whole genome sequencing in 2,657 Europeans with and without diabetes, and exome sequencing in a total of 12,940 subjects from five ancestral groups. To increase statistical power, we expanded sample size via genotyping and imputation in a further 111,548 subjects. Variants associated with type 2 diabetes after sequencing were overwhelmingly common and most fell within regions previously identified by genome-wide association studies. Comprehensive enumeration of sequence variation is necessary to identify functional alleles that provide important clues to disease pathophysiology, but large-scale sequencing does not support a major role for lower-frequency variants in predisposition to type 2 diabetes. PMID:27398621
Candida guilliermondii and Other Species of Candida Misidentified as Candida famata: Assessment by Vitek 2, DNA Sequencing Analysis, and Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry in Two Global Antifungal Surveillance Programs

PubMed Central

Woosley, Leah N.; Diekema, Daniel J.; Jones, Ronald N.; Pfaller, Michael A.

2013-01-01

Candida famata (teleomorph Debaryomyces hansenii) has been described as a medically relevant yeast, and this species has been included in many commercial identification systems that are currently used in clinical laboratories. Among 53 strains collected during the SENTRY and ARTEMIS surveillance programs and previously identified as C. famata (includes all submitted strains with this identification) by a variety of commercial methods (Vitek, MicroScan, API, and AuxaColor), DNA sequencing methods demonstrated that 19 strains were C. guilliermondii, 14 were C. parapsilosis, 5 were C. lusitaniae, 4 were C. albicans, and 3 were C. tropicalis, and five isolates belonged to other Candida species (two C. fermentati and one each C. intermedia, C. pelliculosa, and Pichia fabianni). Additionally, three misidentified C. famata strains were correctly identified as Kodomaea ohmeri, Debaryomyces nepalensis, and Debaryomyces fabryi using intergenic transcribed spacer (ITS) and/or intergenic spacer (IGS) sequencing. The Vitek 2 system identified three isolates with high confidence to be C. famata and another 15 with low confidence between C. famata and C. guilliermondii or C. parapsilosis, displaying only 56.6% agreement with DNA sequencing results. Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) results displayed 81.1% agreement with DNA sequencing. One strain each of C. metapsilosis, C. fermentati, and C. intermedia demonstrated a low score for identification (<2.0) in the MALDI Biotyper. K. ohmeri, D. nepalensis, and D. fabryi identified by DNA sequencing in this study were not in the current database for the MALDI Biotyper. These results suggest that the occurrence of C. famata in fungal infections is much lower than previously appreciated and that commercial systems do not produce accurate identifications except for the newly introduced MALDI-TOF instruments. PMID:23100350
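The counts in this record can be cross-checked with a few lines of arithmetic. The sketch below tallies the species counts given in the abstract and then searches for the number of concordant isolates that would reproduce each reported agreement percentage. The concordant counts are inferred from the percentages and are not stated in the source; the helper name `concordant_for` is hypothetical.

```python
# Species counts from DNA sequencing, as listed in the abstract.
sequencing_counts = {
    "C. guilliermondii": 19,
    "C. parapsilosis": 14,
    "C. lusitaniae": 5,
    "C. albicans": 4,
    "C. tropicalis": 3,
    "other Candida species": 5,
    "K. ohmeri / D. nepalensis / D. fabryi": 3,
}
total = sum(sequencing_counts.values())

def concordant_for(percent, n):
    """Return every count k (0..n) whose share of n rounds to `percent`."""
    return [k for k in range(n + 1) if round(100 * k / n, 1) == percent]

# Inferred (not reported) numbers of isolates agreeing with DNA sequencing.
vitek_matches = concordant_for(56.6, total)
maldi_matches = concordant_for(81.1, total)
```

Under these assumptions the species counts sum to the 53 strains in the study, and each percentage corresponds to a single whole number of isolates.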
Deep Illumina-Based Shotgun Sequencing Reveals Dietary Effects on the Structure and Function of the Fecal Microbiome of Growing Kittens

PubMed Central

Deusch, Oliver; O’Flynn, Ciaran; Colyer, Alison; Morris, Penelope; Allaway, David; Jones, Paul G.; Swanson, Kelly S.

2014-01-01

Background: Previously, we demonstrated that dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome. Methodology & Principal Findings: Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high–protein, low–carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that differed (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome. Conclusions: These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary protein:carbohydrate ratio and highlight the impact of diet early in life. PMID:25010839
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout

PubMed Central

Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo

2015-01-01

Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler, yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with a large number of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
Drinking from the Fire Hose: Why the Flight Management System Can Be Hard to Train and Difficult to Use

NASA Technical Reports Server (NTRS)

Sherry, Lance; Feary, Michael; Polson, Peter; Fennell, Karl

2003-01-01

The Flight Management Computer (FMC) and its interface, the Multi-function Control and Display Unit (MCDU), have been identified by researchers and airlines as difficult to train and use. Specifically, airline pilots have described the "drinking from the fire-hose" effect during training. Previous research has identified memorized action sequences as a major factor in a user's ability to learn and operate complex devices. This paper discusses the use of a method to examine the quantity of memorized action sequences required to perform a sample of 102 tasks, using features of the Boeing 777 Flight Management Computer Interface. The analysis identified a large number of memorized action sequences that must be learned during training and then recalled during line operations. Seventy-five percent of the tasks examined require recall of at least one memorized action sequence. Forty-five percent of the tasks require recall of a memorized action sequence and occur infrequently. The large number of memorized action sequences may provide an explanation for the difficulties in training and usage of the automation. Based on these findings, implications for training and the design of new user-interfaces are discussed.
Whole Transcriptome Sequencing Enables Discovery and Analysis of Viruses in Archived Primary Central Nervous System Lymphomas

PubMed Central

DeBoever, Christopher; Reid, Erin G.; Smith, Erin N.; Wang, Xiaoyun; Dumaop, Wilmar; Harismendy, Olivier; Carson, Dennis; Richman, Douglas; Masliah, Eliezer; Frazer, Kelly A.

2013-01-01

Primary central nervous system lymphomas (PCNSL) have a dramatically increased prevalence among persons living with AIDS and are known to be associated with human Epstein Barr virus (EBV) infection. Previous work suggests that in some cases, co-infection with other viruses may be important for PCNSL pathogenesis. Viral transcription in tumor samples can be measured using next generation transcriptome sequencing. We demonstrate the ability of transcriptome sequencing to identify viruses, characterize viral expression, and identify viral variants by sequencing four archived AIDS-related PCNSL tissue samples and analyzing raw sequencing reads. EBV was detected in all four PCNSL samples and cytomegalovirus (CMV), JC polyomavirus (JCV), and HIV were also discovered, consistent with clinical diagnoses. CMV was found to express three long non-coding RNAs recently reported as expressed during active infection. Single nucleotide variants were observed in each of the viruses observed and three indels were found in CMV. No viruses were found in several control tumor types including 32 diffuse large B-cell lymphoma samples. This study demonstrates the ability of next generation transcriptome sequencing to accurately identify viruses, including DNA viruses, in solid human cancer tissue samples. PMID:24023918
Identification and functional characterization of a novel bipartite nuclear localization sequence in ARID1A

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bateman, Nicholas W.; The John P. Murtha Cancer Center, Walter Reed National Military Medical Center, 8901 Wisconsin Avenue, Bethesda 20889, MD; Shoji, Yutaka

2016-01-01

AT-rich interactive domain-containing protein 1A (ARID1A) is a recently identified nuclear tumor suppressor frequently altered in solid tumor malignancies. We have identified a previously undescribed bipartite-like nuclear localization sequence (NLS) that contributes to nuclear import of ARID1A. We functionally confirm activity using GFP constructs fused with wild-type or mutant NLS sequences. We further show that cyto-nuclear localized, bipartite NLS mutant ARID1A exhibits greater stability than nuclear-localized, wild-type ARID1A. Identification of this undescribed functional NLS within ARID1A contributes vital insights to rationalize the impact of ARID1A missense mutations observed in patient tumors. - Highlights: • We have identified a bipartite nuclear localization sequence (NLS) in ARID1A. • Confirmation of the NLS was performed using GFP constructs. • NLS mutant ARID1A exhibits greater stability than wild-type ARID1A.
Insights into fungal communities in composts revealed by 454-pyrosequencing: implications for human health and safety.

PubMed

De Gannes, Vidya; Eudoxie, Gaius; Hickey, William J

2013-01-01

Fungal community composition in composts of lignocellulosic wastes was assessed via 454-pyrosequencing of ITS1 libraries derived from the three major composting phases. Ascomycota represented most (93%) of the 27,987 fungal sequences. A total of 102 genera, 120 species, and 222 operational taxonomic units (OTUs; >97% similarity) were identified. Thirty genera predominated (ca. 94% of the sequences), and at the species level, sequences matching Chaetomium funicola and Fusarium oxysporum were the most abundant (26 and 12%, respectively). In all composts, fungal diversity in the mature phase exceeded that of the mesophilic phase, but there was no consistent pattern in diversity changes occurring in the thermophilic phase. Fifteen species of human pathogens were identified, eight of which have not been previously identified in composts. This study demonstrated that deep sequencing can elucidate fungal community diversity in composts, and that this information can have important implications for compost use and human health.
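The grouping of sequences into operational taxonomic units at >97% similarity can be sketched as a greedy centroid clustering. This is a toy illustration only: the function names are hypothetical, identity is computed position by position on equal-length strings rather than from an alignment, and it does not represent the software used in the study.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity measure needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first centroid it matches above threshold.

    A sequence that matches no existing centroid becomes a new centroid,
    so the result depends on input order (an accepted property of greedy
    clustering, often handled by sorting reads by abundance first).
    """
    centroids = []
    otus = []
    for seq in seqs:
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) > threshold:
                otus[i].append(seq)
                break
        else:
            centroids.append(seq)
            otus.append([seq])
    return otus

# Hypothetical 100-nt reads: one near-identical variant and one divergent read.
base = "ACGT" * 25
near = "T" + base[1:]        # 1 mismatch, 99% identity
far = "G" * 10 + base[10:]   # 8 mismatches, 92% identity
otus = greedy_otus([base, near, far])
```

With these inputs the near-identical read joins the first OTU and the divergent read founds a second one.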
Insights into fungal communities in composts revealed by 454-pyrosequencing: implications for human health and safety

PubMed Central

De Gannes, Vidya; Eudoxie, Gaius; Hickey, William J.

2013-01-01

Fungal community composition in composts of lignocellulosic wastes was assessed via 454-pyrosequencing of ITS1 libraries derived from the three major composting phases. Ascomycota represented most (93%) of the 27,987 fungal sequences. A total of 102 genera, 120 species, and 222 operational taxonomic units (OTUs; >97% similarity) were identified. Thirty genera predominated (ca. 94% of the sequences), and at the species level, sequences matching Chaetomium funicola and Fusarium oxysporum were the most abundant (26 and 12%, respectively). In all composts, fungal diversity in the mature phase exceeded that of the mesophilic phase, but there was no consistent pattern in diversity changes occurring in the thermophilic phase. Fifteen species of human pathogens were identified, eight of which have not been previously identified in composts. This study demonstrated that deep sequencing can elucidate fungal community diversity in composts, and that this information can have important implications for compost use and human health. PMID:23785368
Whole-Genome Characterization of Prunus necrotic ringspot virus Infecting Sweet Cherry in China.

PubMed

Wang, Jiawei; Zhai, Ying; Zhu, Dongzi; Liu, Weizhen; Pappu, Hanu R; Liu, Qingzhong

2018-03-01

Prunus necrotic ringspot virus (PNRSV) causes yield loss in most cultivated stone fruits, including sweet cherry. Using a small RNA deep-sequencing approach combined with end-genome sequence cloning, we identified the complete genomes of all three PNRSV strands from PNRSV-infected sweet cherry trees and compared them with those of two previously reported isolates.
[Multiplexing mapping of human cDNAs]. Final report, September 1, 1991--February 28, 1994

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

Using PCR with automated product analysis, 329 human brain cDNA sequences have been assigned to individual human chromosomes. Primers were designed from single-pass cDNA sequences, or expressed sequence tags (ESTs). Primers were used in PCR reactions with DNA from somatic cell hybrid mapping panels as templates, often with multiplexing. Many of the mapped ESTs match sequence database records. To evaluate these matches, the position of the primers relative to the matching region (In), the BLAST scores and the Poisson probability values of the EST/sequence record match were determined. In cases where the gene product stringently identified by the sequence match had already been mapped, the gene locus determined by EST was consistent with the previous position, which strongly supports the validity of assigning unknown genes to human chromosomes based on the EST sequence matches. In the present cases mapping the ESTs to a chromosome can also be considered to have mapped the known gene product: rolipram-sensitive cAMP phosphodiesterase, chromosome 1; protein phosphatase 2Aβ, chromosome 4; alpha-catenin, chromosome 5; the ELE1 oncogene, chromosome 10q11.2 or q2.1-q23; MXI1 protein, chromosome 10q24-qter; ribosomal protein L18a homologue, chromosome 14; ribosomal protein L3, chromosome 17; and moesin, Xp11-cen. There were also ESTs mapped that were closely related to non-human sequence records. These matches therefore can be considered to identify human counterparts of known gene products, or members of known gene families. Examples of these include membrane proteins, translation-associated proteins, structural proteins, and enzymes. These data then demonstrate that single-pass sequence information is sufficient to design PCR primers useful for assigning cDNA sequences to human chromosomes. 
When the EST sequence matches previous sequence database records, the chromosome assignments of the EST can be used to make preliminary assignments of the human gene to a chromosome.
Bacterivory by a Summer Assemblage of Nanoplankton in the Ross Sea, Antarctica: Mixotrophic Versus Heterotrophic Protists

NASA Astrophysics Data System (ADS)

Sanders, R. W.; Gast, R. J.

2016-02-01

Many protists traditionally described as phototrophic have recently been shown to have retained the primitive trait of phagotrophy, and thus function as mixotrophs. Mixotrophic nanoflagellates were identified in every sample examined from a summer cruise in the Ross Sea, Antarctica, where they often were more abundant than heterotrophic nanoflagellates that have previously been considered the major bacterivores in marine waters. Mixotrophs, identified by uptake of fluorescent tracers, comprised similar proportions (9-75%) of the total bacterivorous flagellates in summer as were previously determined for an earlier spring cruise in the Ross Sea. Protist diversity also was linked to functional bacterivores using a culture-independent method in which BrdU-labeled DNA of bacterial prey was incorporated into the DNA of eukaryotic grazers. Immunoprecipitation of the BrdU-labeled DNA was followed by high-throughput sequencing to identify a diverse group of bacterivores, including numerous uncultured eukaryotes. However, the utility of this method for identification of mixotrophs was limited by the availability of sequences from known mixotrophs.
Morphological identification and COI barcodes of adult flies help determine species identities of chironomid larvae (Diptera, Chironomidae).

PubMed

Failla, A J; Vasquez, A A; Hudson, P; Fujimoto, M; Ram, J L

2016-02-01

Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or 'species group' level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. 
Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
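
The OTU definition used in this record (clusters with >97% identity) can be illustrated with a minimal Python sketch. It assumes the barcode sequences are already aligned to equal length and uses simple single-linkage grouping; the study itself worked from a neighbor-joining taxon-ID tree, so the function names, the clustering strategy, and the toy sequences below are illustrative assumptions rather than the authors' pipeline.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Score only the columns where neither sequence has a gap.
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def cluster_otus(seqs: dict, threshold: float = 97.0) -> list:
    """Group sequence names into OTUs by single linkage.

    Two sequences are linked when their identity exceeds `threshold`;
    an OTU is a connected group of linked sequences (union-find).
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Walk up to the representative, compressing the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical 100-nt toy "barcodes" (not real COI data).
reference = "ACGT" * 25
toy = {
    "adult_1": reference,
    "larva_1": "T" + reference[1:],        # differs at one position
    "larva_2": "T" * 10 + reference[10:],  # differs at several positions
}
print(cluster_otus(toy))
```

In this sketch a larva that groups with a morphologically identified adult would inherit the adult's species label, which mirrors the way adult barcodes were used to classify larval OTUs in the record above.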
Morphological identification and COI barcodes of adult flies help determine species identities of chironomid larvae (Diptera, Chironomidae)

USGS Publications Warehouse

Failla, Andrew Joseph; Vasquez, Adrian Amelio; Hudson, Patrick L.; Fujimoto, Masanori; Ram, Jeffrey L.

2016-01-01

Establishing reliable methods for the identification of benthic chironomid communities is important due to their significant contribution to biomass, ecology and the aquatic food web. Immature larval specimens are more difficult to identify to species level by traditional morphological methods than their fully developed adult counterparts, and few keys are available to identify the larval species. In order to develop molecular criteria to identify species of chironomid larvae, larval and adult chironomids from Western Lake Erie were subjected to both molecular and morphological taxonomic analysis. Mitochondrial cytochrome c oxidase I (COI) barcode sequences of 33 adults that were identified to species level by morphological methods were grouped with COI sequences of 189 larvae in a neighbor-joining taxon-ID tree. Most of these larvae could be identified only to genus level by morphological taxonomy (only 22 of the 189 sequenced larvae could be identified to species level). The taxon-ID tree of larval sequences had 45 operational taxonomic units (OTUs, defined as clusters with >97% identity or individual sequences differing from nearest neighbors by >3%; supported by analysis of all larval pairwise differences), of which seven could be identified to species or ‘species group’ level by larval morphology. Reference sequences from the GenBank and BOLD databases assigned six larval OTUs with presumptive species level identifications and confirmed one previously assigned species level identification. Sequences from morphologically identified adults in the present study grouped with and further classified the identity of 13 larval OTUs. The use of morphological identification and subsequent DNA barcoding of adult chironomids proved to be beneficial in revealing possible species level identifications of larval specimens. 
Sequence data from this study also contribute to currently inadequate public databases relevant to the Great Lakes region, while the neighbor-joining analysis reported here describes the application and confirmation of a useful tool that can accelerate identification and bioassessment of chironomid communities.
Complete genome sequence of the thermotolerant foodborne pathogen Salmonella enterica serovar Senftenberg ATCC 43845 and phylogenetic analysis of loci encoding thermotolerance

USDA-ARS's Scientific Manuscript database

Introduction: Previous studies in Cronobacter sakazakii, Klebsiella spp., and Escherichia coli have identified a genomic island that confers thermotolerance to its hosts. This island has recently been identified in Salmonella enterica serovar Senftenberg ATCC 43845, a historically important, heat ...
Screening for duplications, deletions and a common intronic mutation detects 35% of second mutations in patients with USH2A monoallelic mutations on Sanger sequencing.

PubMed

Steele-Stallard, Heather B; Le Quesne Stabej, Polona; Lenassi, Eva; Luxon, Linda M; Claustres, Mireille; Roux, Anne-Francoise; Webster, Andrew R; Bitner-Glindzicz, Maria

2013-08-08

Usher Syndrome is the leading cause of inherited deaf-blindness. It is divided into three subtypes, of which the most common is Usher type 2, and the USH2A gene accounts for 75-80% of cases. Despite recent sequencing strategies, in our cohort a significant proportion of individuals with Usher type 2 have just one heterozygous disease-causing mutation in USH2A, or no convincing disease-causing mutations across nine Usher genes. The purpose of this study was to improve the molecular diagnosis in these families by screening USH2A for duplications, heterozygous deletions and a common pathogenic deep intronic variant USH2A: c.7595-2144A>G. Forty-nine Usher type 2 or atypical Usher families who had missing mutations (mono-allelic USH2A or no mutations following Sanger sequencing of nine Usher genes) were screened for duplications/deletions using the USH2A SALSA MLPA reagent kit (MRC-Holland). Identification of USH2A: c.7595-2144A>G was achieved by Sanger sequencing. Mutations were confirmed by a combination of reverse transcription PCR using RNA extracted from nasal epithelial cells or fibroblasts, and by array comparative genomic hybridisation with sequencing across the genomic breakpoints. Eight mutations were identified in 23 Usher type 2 families (35%) with one previously identified heterozygous disease-causing mutation in USH2A. These consisted of five heterozygous deletions, one duplication, and two heterozygous instances of the pathogenic variant USH2A: c.7595-2144A>G. No variants were found in the 15 Usher type 2 families with no previously identified disease-causing mutations. In 11 atypical families, none of whom had any previously identified convincing disease-causing mutations, the mutation USH2A: c.7595-2144A>G was identified in a heterozygous state in one family. All five deletions and the heterozygous duplication we report here are novel. This is the first time that a duplication in USH2A has been reported as a cause of Usher syndrome. 
We found that 8 of 23 (35%) of 'missing' mutations in Usher type 2 probands with only a single heterozygous USH2A mutation detected with Sanger sequencing could be attributed to deletions, duplications or a pathogenic deep intronic variant. Future mutation detection strategies and genetic counselling will need to take into account the prevalence of these types of mutations in order to provide a more comprehensive diagnostic service.
Intragenic SNP haplotypes associated with 84dup18 mutation in TNFRSF11A in four FEO pedigrees suggest three independent origins for this mutation.

PubMed

Elahi, Elahe; Shafaghati, Yousef; Asadi, Sareh; Absalan, Farnaz; Goodarzi, Hani; Gharaii, Nava; Karimi-Nejad, Mohammad Hassan; Shahram, Farhad; Hughes, Anne E

2007-01-01

Familial expansile osteolysis (FEO) is a rare disorder causing bone dysplasia. The clinical features of FEO include early-onset hearing loss, tooth destruction, and progressive lytic expansion within limb bones causing pain, fracture, and deformity. An 18-bp duplication in the first exon of the TNFRSF11A gene encoding RANK has been previously identified in four FEO pedigrees. Despite having the identical mutation, phenotypic variations among affected individuals of the same and different pedigrees were noted. Another 18-bp duplication, one base proximal to the duplication previously reported, was subsequently found in two unrelated FEO patients. Finally, mutations overlapping with the mutations found in the FEO pedigrees have been found in ESH and early-onset PDB pedigrees. An Iranian FEO pedigree that contains six affected individuals dispersed in three generations has previously been introduced; here, the clinical features of the proband are reported in greater detail, and the genetic defect of the pedigree is presented. Direct sequencing of the entire coding region and upstream and downstream noncoding regions of TNFRSF11A in her DNA revealed the same 18-bp duplication mutation as previously found in the four FEO pedigrees. Additionally, eight sequence variations as compared to the TNFRSF11A reference sequence were identified, and a haplotype linked to the mutation based on these variations was defined. Although the mutation in the Iranian and four of the previously described FEO pedigrees was the same, haplotypes based on the intragenic SNPs suggest that the mutations do not share a common descent.
Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB-LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations

PubMed Central

Jupe, Florian; Witek, Kamil; Verweij, Walter; Śliwka, Jadwiga; Pritchard, Leighton; Etherington, Graham J; Maclean, Dan; Cock, Peter J; Leggett, Richard M; Bryan, Glenn J; Cardle, Linda; Hein, Ingo; Jones, Jonathan DG

2013-01-01

RenSeq is a NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ∼80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum ‘Heinz 1706’ extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines. PMID:23937694
Long-range PCR facilitates the identification of PMS2-specific mutations.

PubMed

Clendenning, Mark; Hampel, Heather; LaJeunesse, Jennifer; Lindblom, Annika; Lockman, Jan; Nilbert, Mef; Senter, Leigha; Sotamaa, Kaisa; de la Chapelle, Albert

2006-05-01

Mutations within the DNA mismatch repair gene, "postmeiotic segregation increased 2" (PMS2), have been associated with a predisposition to hereditary nonpolyposis colorectal cancer (HNPCC; Lynch syndrome). The presence of a large family of highly homologous PMS2 pseudogenes has made previous attempts to sequence PMS2 very difficult. Here, we describe a novel method that utilizes long-range PCR as a way to preferentially amplify PMS2 and not the pseudogenes. A second, exon-specific, amplification from diluted long-range products enables us to obtain a clean sequence that shows no evidence of pseudogene contamination. This method has been used to screen a cohort of patients whose tumors were negative for the PMS2 protein by immunohistochemistry and had not shown any mutations within the MLH1 gene. Sequencing of the PMS2 gene from 30 colorectal and 11 endometrial cancer patients identified 10 novel sequence changes as well as 17 sequence changes that had previously been identified. In total, putative pathologic mutations were detected in 11 of the 41 families. Among these were five novel mutations, c.705+1G>T, c.736_741del6ins11, c.862_863del, c.1688G>T, and c.2007-1G>A. We conclude that PMS2 mutation detection in selected Lynch syndrome and Lynch syndrome-like patients is both feasible and desirable. Published 2006 Wiley-Liss, Inc.
Identification and functional characterization of a novel bipartite nuclear localization sequence in ARID1A.

PubMed

Bateman, Nicholas W; Shoji, Yutaka; Conrads, Kelly A; Stroop, Kevin D; Hamilton, Chad A; Darcy, Kathleen M; Maxwell, George L; Risinger, John I; Conrads, Thomas P

2016-01-01

AT-rich interactive domain-containing protein 1A (ARID1A) is a recently identified nuclear tumor suppressor frequently altered in solid tumor malignancies. We have identified a previously undescribed bipartite-like nuclear localization sequence (NLS) that contributes to nuclear import of ARID1A. We functionally confirm activity using GFP constructs fused with wild-type or mutant NLS sequences. We further show that cyto-nuclear localized, bipartite NLS mutant ARID1A exhibits greater stability than nuclear-localized, wild-type ARID1A. Identification of this undescribed functional NLS within ARID1A contributes vital insights to rationalize the impact of ARID1A missense mutations observed in patient tumors. Copyright © 2015 Elsevier Inc. All rights reserved.

Mapping autosomal recessive intellectual disability: combined microarray and exome sequencing identifies 26 novel candidate genes in 192 consanguineous families.

PubMed

Harripaul, R; Vasli, N; Mikhailov, A; Rafiq, M A; Mittal, K; Windpassinger, C; Sheikh, T I; Noor, A; Mahmood, H; Downey, S; Johnson, M; Vleuten, K; Bell, L; Ilyas, M; Khan, F S; Khan, V; Moradi, M; Ayaz, M; Naeem, F; Heidari, A; Ahmed, I; Ghadami, S; Agha, Z; Zeinali, S; Qamar, R; Mozhdehipanah, H; John, P; Mir, A; Ansar, M; French, L; Ayub, M; Vincent, J B

2018-04-01

Approximately 1% of the global population is affected by intellectual disability (ID), and the majority receive no molecular diagnosis. Previous studies have indicated high levels of genetic heterogeneity, with estimates of more than 2500 autosomal ID genes, the majority of which are autosomal recessive (AR). Here, we combined microarray genotyping, homozygosity-by-descent (HBD) mapping, copy number variation (CNV) analysis, and whole exome sequencing (WES) to identify disease genes/mutations in 192 multiplex Pakistani and Iranian consanguineous families with non-syndromic ID. We identified definite or candidate mutations (or CNVs) in 51% of families in 72 different genes, including 26 not previously reported for ARID. The new ARID genes include nine with loss-of-function mutations (ABI2, MAPK8, MPDZ, PIDD1, SLAIN1, TBC1D23, TRAPPC6B, UBA7 and USP44), and missense mutations include the first reports of variants in BDNF or TET1 associated with ID. The genes identified also showed overlap with de novo gene sets for other neuropsychiatric disorders. Transcriptional studies showed prominent expression in the prenatal brain. The high yield of AR mutations for ID indicated that this approach has excellent clinical potential and should inform clinical diagnostics, including clinical whole exome and genome sequencing, for populations in which consanguinity is common. As with other AR disorders, the relevance will also apply to outbred populations.
Rapid Hypothesis Testing with Candida albicans through Gene Disruption with Short Homology Regions

PubMed Central

Wilson, R. Bryce; Davis, Dana; Mitchell, Aaron P.

1999-01-01

Disruption of newly identified genes in the pathogen Candida albicans is a vital step in determination of gene function. Several gene disruption methods described previously employ long regions of homology flanking a selectable marker. Here, we describe disruption of C. albicans genes with PCR products that have 50 to 60 bp of homology to a genomic sequence on each end of a selectable marker. We used the method to disrupt two known genes, ARG5 and ADE2, and two sequences newly identified through the Candida genome project, HRM101 and ENX3. HRM101 and ENX3 are homologous to genes in the conserved RIM101 (previously called RIM1) and PacC pathways of Saccharomyces cerevisiae and Aspergillus nidulans. We show that three independent hrm101/hrm101 mutants and two independent enx3/enx3 mutants are defective in filamentation on Spider medium. These observations argue that HRM101 and ENX3 sequences are indeed portions of genes and that the respective gene products have related functions. PMID:10074081
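
The primer layout implied by this record (50 to 60 bp of genomic homology on each end of a selectable marker) can be sketched in Python as follows. This is a hedged illustration only: the function names, the 0-based coordinate convention, and the placeholder marker-annealing sequences are assumptions made for the example and are not taken from the paper.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def disruption_primers(genome: str, start: int, end: int,
                       marker_fwd: str, marker_rev: str,
                       homology: int = 60):
    """Build long primers for a PCR-product gene disruption.

    genome      genomic sequence around the target (top strand, 5'->3')
    start, end  0-based half-open interval to be replaced by the marker
    marker_fwd  3' portion of the forward primer, annealing to the marker
    marker_rev  3' portion of the reverse primer, annealing to the marker
                (already written 5'->3' on the bottom strand)
    homology    length of genomic homology on each side (50 to 60 bp in
                the record above; the default here is an assumption)
    """
    if start - homology < 0 or end + homology > len(genome):
        raise ValueError("not enough flanking sequence for the homology arms")
    upstream = genome[start - homology:start]
    downstream = genome[end:end + homology]
    # Forward primer: upstream homology followed by the marker-annealing part.
    forward = upstream + marker_fwd
    # Reverse primer: reverse complement of the downstream homology,
    # followed by the marker-annealing part.
    reverse = reverse_complement(downstream) + marker_rev
    return forward, reverse


# Toy example with made-up sequences (hypothetical, not C. albicans data).
toy_genome = "A" * 60 + "CCCC" + "G" * 60
fwd, rev = disruption_primers(toy_genome, 60, 64, "TTTT", "CATG")
print(len(fwd), len(rev))
```

Amplifying the marker with such primers would yield a product whose ends match the genome on either side of the target interval, which is the configuration the record describes for homologous integration.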
Unconventional P-35S sequence identified in genetically modified maize

PubMed Central

Al-Hmoud, Nisreen; Al-Husseini, Nawar; Ibrahim-Alobaide, Mohammed A; Kübler, Eric; Farfoura, Mahmoud; Alobydi, Hytham; Al-Rousan, Hiyam

2014-01-01

The Cauliflower Mosaic Virus 35S promoter sequence, CaMV P-35S, is one of several commonly used genetic targets to detect genetically modified maize and is found in most GMOs. In this research we report the finding of an alternative P-35S sequence and its incidence in GM maize marketed in Jordan. The primer pair normally used to amplify a 123 bp DNA fragment of the CaMV P-35S promoter in GMOs also amplified a previously undetected alternative sequence of CaMV P-35S in GM maize samples which we term V3. The amplified V3 sequence comprises 386 base pairs and was not found in the standard wild-type maize, MON810 and MON 863 GM maize. The identified GM maize samples carrying the V3 sequence were found free of CaMV when compared with a CaMV-infected brown mustard sample. Sequence alignment analysis of the V3 genetic element showed 90% similarity with the matching P-35S sequence of the cauliflower mosaic virus isolate CabbB-JI and 99% similarity with matching P-35S sequences found in several binary plant vectors, of which the binary vector locus JQ693018 is one example. The current study showed an increase of 44% in the incidence of the identified 386 bp sequence in GM maize sold in Jordan’s markets between 2009 and 2012. PMID:24495911
Molecular Evidence of Chlamydia-Like Organisms in the Feces of Myotis daubentonii Bats.

PubMed

Hokynar, K; Vesterinen, E J; Lilley, T M; Pulliainen, A T; Korhonen, S J; Paavonen, J; Puolakkainen, M

2017-01-15

Chlamydia-like organisms (CLOs) are recently identified members of the Chlamydiales order. CLOs share intracellular lifestyles and biphasic developmental cycles, and they have been detected in environmental samples as well as in various hosts such as amoebae and arthropods. In this study, we screened bat feces for the presence of CLOs by molecular analysis. Using pan-Chlamydiales PCR targeting the 16S rRNA gene, Chlamydiales DNA was detected in 54% of the specimens. PCR amplification, sequencing, and phylogenetic analysis of the 16S rRNA and 23S rRNA genes were used to classify positive specimens and infer their phylogenetic relationships. Most sequences matched best with Rhabdochlamydia species or uncultured Chlamydia sequences identified in ticks. Another set of sequences matched best with sequences of the Chlamydia genus or uncultured Chlamydiales from snakes. To gain evidence of whether CLOs in bat feces are merely diet borne, we analyzed insects trapped from the same location where the bats foraged. Interestingly, the CLO sequences resembling Rhabdochlamydia spp. were detected in insect material as well, but the other set of CLO sequences was not, suggesting that this set might not originate from prey. Thus, bats represent another potential host for Chlamydiales and could harbor novel, previously unidentified members of this order. Several pathogenic viruses are known to colonize bats, and recent analyses indicate that bats are also reservoir hosts for bacterial genera. Chlamydia-like organisms (CLOs) have been detected in several animal species. CLOs have high 16S rRNA sequence similarity to Chlamydiaceae and exhibit similar intracellular lifestyles and biphasic developmental cycles. Our study describes the frequent occurrence of CLO DNA in bat feces, suggesting an expanding host species spectrum for the Chlamydiales. As bats can acquire various infectious agents through their diet, prey insects were also studied. 
We identified CLO sequences in bats that matched best with sequences in prey insects but also CLO sequences not detected in prey insects. This suggests that a portion of CLO DNA present in bat feces is not prey borne. Furthermore, some sequences from bat droppings not originating from their diet might well represent novel, previously unidentified members of the Chlamydiales order. Copyright © 2016 American Society for Microbiology.
A statistical method for the detection of variants from next-generation resequencing of DNA pools.

PubMed

Bansal, Vikas

2010-06-15

Next-generation sequencing technologies have enabled the sequencing of several human genomes in their entirety. However, the routine resequencing of complete genomes remains infeasible. The massive capacity of next-generation sequencers can be harnessed for sequencing specific genomic regions in hundreds to thousands of individuals. Sequencing-based association studies are currently limited by the low level of multiplexing offered by sequencing platforms. Pooled sequencing represents a cost-effective approach for studying rare variants in large populations. To utilize the power of DNA pooling, it is important to accurately identify sequence variants from pooled sequencing data. Detection of rare variants from pooled sequencing represents a different challenge than detection of variants from individual sequencing. We describe a novel statistical approach, CRISP [Comprehensive Read analysis for Identification of Single Nucleotide Polymorphisms (SNPs) from Pooled sequencing] that is able to identify both rare and common variants by using two approaches: (i) comparing the distribution of allele counts across multiple pools using contingency tables and (ii) evaluating the probability of observing multiple non-reference base calls due to sequencing errors alone. Information about the distribution of reads between the forward and reverse strands and the size of the pools is also incorporated within this framework to filter out false variants. Validation of CRISP on two separate pooled sequencing datasets generated using the Illumina Genome Analyzer demonstrates that it can detect 80-85% of SNPs identified using individual sequencing while achieving a low false discovery rate (3-5%). Comparison with previous methods for pooled SNP detection demonstrates the significantly lower false positive and false negative rates for CRISP. Implementation of this method is available at http://polymorphism.scripps.edu/~vbansal/software/CRISP/.
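
The two ideas named in this record, (i) comparing allele counts across pools with a contingency table and (ii) asking how likely the observed non-reference calls are under sequencing error alone, can be illustrated with a short Python sketch. It is not the CRISP implementation: the Pearson chi-square statistic and the plain binomial error model are stand-ins chosen for simplicity, and the function names and example counts are assumptions.

```python
from math import comb


def error_only_tail(alt_reads: int, depth: int, error_rate: float) -> float:
    """P(X >= alt_reads) for X ~ Binomial(depth, error_rate).

    A small value means the non-reference calls at this site are unlikely
    to be explained by sequencing errors alone.
    """
    return sum(
        comb(depth, k) * error_rate ** k * (1 - error_rate) ** (depth - k)
        for k in range(alt_reads, depth + 1)
    )


def chi_square_statistic(table) -> float:
    """Pearson chi-square statistic for a pools x (ref, alt) count table.

    `table` is a list of (ref_count, alt_count) pairs, one per pool.
    A large value indicates that allele counts differ between pools.
    """
    row_totals = [ref + alt for ref, alt in table]
    ref_total = sum(ref for ref, _ in table)
    alt_total = sum(alt for _, alt in table)
    grand_total = ref_total + alt_total
    statistic = 0.0
    for (ref, alt), row_total in zip(table, row_totals):
        for observed, column_total in ((ref, ref_total), (alt, alt_total)):
            expected = row_total * column_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts at one site in three pools (not data from the paper).
pools = [(195, 5), (200, 0), (198, 2)]
print(chi_square_statistic(pools))
print(error_only_tail(alt_reads=5, depth=200, error_rate=0.005))
```

A site would be a candidate variant when either signal is strong; the record also mentions strand distribution and pool size as additional filters, which this sketch leaves out.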
Complementary DNA sequencing and identification of mRNAs from the venomous gland of Agkistrodon piscivorus leucostoma.

PubMed

Jia, Ying; Cantu, Bruno A; Sánchez, Elda E; Pérez, John C

2008-06-15

To advance our knowledge on the snake venom composition and transcripts expressed in venom gland at the molecular level, we constructed a cDNA library from the venom gland of Agkistrodon piscivorus leucostoma for the generation of expressed sequence tags (ESTs) database. From the randomly sequenced 2112 independent clones, we have obtained ESTs for 1309 (62%) cDNAs, which showed significant deduced amino acid sequence similarity (scores >80) to previously characterized proteins in National Center for Biotechnology Information (NCBI) database. Ribosomal proteins make up 47 clones (2%) and the remaining 756 (36%) cDNAs represent either unknown identity or show BLASTX sequence identity scores of <80 with known GenBank accessions. The most highly expressed gene encoding phospholipase A(2) (PLA(2)) accounting for 35% of A. p. leucostoma venom gland cDNAs was identified and further confirmed by crude venom applied to sodium dodecyl sulfate/polyacrylamide gel electrophoresis (SDS-PAGE) electrophoresis and protein sequencing. A total of 180 representative genes were obtained from the sequence assemblies and deposited to EST database. Clones showing sequence identity to disintegrins, thrombin-like enzymes, hemorrhagic toxins, fibrinogen clotting inhibitors and plasminogen activators were also identified in our EST database. These data can be used to develop a research program that will help us identify genes encoding proteins that are of medical importance or proteins involved in the mechanisms of the toxin venom.
VWF mutations and new sequence variations identified in healthy controls are more frequent in the African-American population.

PubMed

Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R

2012-03-01

Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

PubMed

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

2015-05-27

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
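
The read-depth approach in this record can be sketched in Python under a few stated assumptions: per-scaffold average depths are already available for one male and one female, most scaffolds are autosomal (so the median depth approximates diploid coverage), and a Y-linked scaffold shows roughly half the autosomal depth in the male and almost none in the female. The thresholds, function name, and example depths are illustrative and are not taken from the paper.

```python
from statistics import median


def y_linked_candidates(male_depth: dict, female_depth: dict,
                        male_band=(0.3, 0.7), max_female=0.1) -> list:
    """Return scaffolds whose depth profile is consistent with Y linkage.

    male_depth, female_depth  scaffold name -> average read depth
    male_band                 allowed range of male depth relative to the
                              male median (haploid scaffolds sit near 0.5)
    max_female                maximum female depth relative to the female
                              median (Y scaffolds should sit near 0)
    """
    male_median = median(male_depth.values())
    female_median = median(female_depth.values())
    low, high = male_band
    candidates = []
    for scaffold, depth in male_depth.items():
        male_relative = depth / male_median
        female_relative = female_depth.get(scaffold, 0.0) / female_median
        if low <= male_relative <= high and female_relative <= max_female:
            candidates.append(scaffold)
    return candidates


# Hypothetical average depths (not polar bear data).
male = {"auto_1": 30, "auto_2": 32, "auto_3": 28, "x_1": 15, "y_1": 14}
female = {"auto_1": 30, "auto_2": 31, "auto_3": 29, "x_1": 30, "y_1": 0.5}
print(y_linked_candidates(male, female))
```

X-linked scaffolds also show about half depth in the male but full depth in the female, which is why the sketch checks both individuals; candidates found this way would still need the kind of male-specific in vitro amplification the record describes.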
Identification of a Recently Active Mammalian SINE Derived from Ribosomal RNA

PubMed Central

Longo, Mark S.; Brown, Judy D.; Zhang, Chu; O’Neill, Michael J.; O’Neill, Rachel J.

2015-01-01

Complex eukaryotic genomes are riddled with repeated sequences whose derivation does not coincide with phylogenetic history and thus is often unknown. Among such sequences, the capacity for transcriptional activity coupled with the adaptive use of reverse transcription can lead to a diverse group of genomic elements across taxa, otherwise known as selfish elements or mobile elements. Short interspersed nuclear elements (SINEs) are nonautonomous mobile elements found in eukaryotic genomes, typically derived from cellular RNAs such as tRNAs, 7SL or 5S rRNA. Here, we identify and characterize a previously unknown SINE derived from the 3′-end of the large ribosomal subunit (LSU or 28S rDNA) and transcribed via RNA polymerase III. This new element, SINE28, is represented in low-copy numbers in the human reference genome assembly, wherein we have identified 27 discrete loci. Phylogenetic analysis indicates these elements have been transpositionally active within primate lineages as recently as 6 MYA while modern humans still carry transcriptionally active copies. Moreover, we have identified SINE28s in all currently available assembled mammalian genome sequences. Phylogenetic comparisons indicate that these elements are frequently rederived from the highly conserved LSU rRNA sequences in a lineage-specific manner. We propose that this element has not been previously recognized as a SINE given its high identity to the canonical LSU, and that SINE28 likely represents one of possibly many unidentified, active transposable elements within mammalian genomes. PMID:25637222
Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies

PubMed Central

Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

2016-01-01

The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets of selection, such as PaPRR3 and PaGI. PMID:27172202
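As a rough illustration of combining FST with the allele-frequency difference between two pools, the sketch below uses a simple Wright-style estimator, FST = (HT - HS) / HT, for a biallelic SNP. This is a stand-in chosen for brevity, not the authors' method; the SNP names, frequencies, and cut-offs are hypothetical, and pool-specific sampling corrections are ignored.

```python
def fst_two_pops(p1, p2):
    """Wright-style FST for one biallelic SNP from two population frequencies."""
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_t == 0:
        return 0.0  # monomorphic in both populations
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_t - h_s) / h_t


def flag_candidates(snps, fst_min=0.2, diff_min=0.3):
    """Return SNPs exceeding both (hypothetical) thresholds."""
    return [
        name
        for name, (p1, p2) in snps.items()
        if fst_two_pops(p1, p2) >= fst_min and abs(p1 - p2) >= diff_min
    ]


# Hypothetical allele frequencies in the two domains.
snps = {"snp1": (0.2, 0.8), "snp2": (0.5, 0.55)}
flagged = flag_candidates(snps)
```

Requiring both statistics to be large is one simple way to avoid flagging SNPs whose FST is inflated only because the allele is rare overall.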
Identifying Genetic Signatures of Natural Selection Using Pooled Population Sequencing in Picea abies.

PubMed

Chen, Jun; Källman, Thomas; Ma, Xiao-Fei; Zaina, Giusi; Morgante, Michele; Lascoux, Martin

2016-07-07

The joint inference of selection and past demography remains a costly and demanding task. We used next generation sequencing of two pools of 48 Norway spruce mother trees, one corresponding to the Fennoscandian domain, and the other to the Alpine domain, to assess nucleotide polymorphism at 88 nuclear genes. These genes are candidate genes for phenological traits, and most belong to the photoperiod pathway. Estimates of population genetic summary statistics from the pooled data are similar to previous estimates, suggesting that pooled sequencing is reliable. The nonsynonymous SNPs tended to have both lower frequency differences and lower FST values between the two domains than silent ones. These results suggest the presence of purifying selection. The divergence between the two domains based on synonymous changes was around 5 million yr, a time similar to a recent phylogenetic estimate of 6 million yr, but much larger than earlier estimates based on isozymes. Two approaches, one of them novel and considering both FST and the difference in allele frequencies between the two domains, were used to identify SNPs potentially under diversifying selection. SNPs from around 20 genes were detected, including genes previously identified as main targets of selection, such as PaPRR3 and PaGI. Copyright © 2016 Chen et al.
Morphological characters and DNA barcoding of Syngnathus schlegeli in the coastal waters of China

NASA Astrophysics Data System (ADS)

Chen, Zhi; Zhang, Yan; Han, Zhiqiang; Song, Na; Gao, Tianxiang

2018-03-01

A Syngnathus species widely distributed in Chinese seas has long been identified as Syngnathus acus by native ichthyologists, but the taxonomic description of this species was inadequate and lacked conclusive molecular evidence. To identify this species, 357 individuals from the coastal waters of Dandong, Yantai, Qingdao and Zhoushan were collected and measured. Morphological results showed that these slender specimens were mainly brownish, usually with pale mottling. Standard length ranged from 117 mm to 213 mm with an average length of 180.3 mm. These characters were consistent with S. schlegeli distributed in Japan, but the specimens were colored differently from and much smaller than typical S. acus reported in Europe. Thus, morphological studies revealed that this species was previously misidentified as S. acus and might in reality be S. schlegeli. In addition, a fragment of the cytochrome oxidase subunit I (COI) gene of mitochondrial DNA was sequenced for species identification, and 15 COI sequences belonging to different Syngnathus species were also used for the molecular identification. COI sequences of our specimens had the minimum genetic distance from recognized S. schlegeli from Japan and clustered with it first. The phylogenetic analysis similarly suggested that the species previously identified as S. acus in the coastal waters of China was actually S. schlegeli.
Genetic heterogeneity of diffuse large B-cell lymphoma.

PubMed

Zhang, Jenny; Grubor, Vladimir; Love, Cassandra L; Banerjee, Anjishnu; Richards, Kristy L; Mieczkowski, Piotr A; Dunphy, Cherie; Choi, William; Au, Wing Yan; Srivastava, Gopesh; Lugar, Patricia L; Rizzieri, David A; Lagoo, Anand S; Bernal-Mizrachi, Leon; Mann, Karen P; Flowers, Christopher; Naresh, Kikkeri; Evens, Andrew; Gordon, Leo I; Czader, Magdalena; Gill, Javed I; Hsi, Eric D; Liu, Qingquan; Fan, Alice; Walsh, Katherine; Jima, Dereje; Smith, Lisa L; Johnson, Amy J; Byrd, John C; Luftig, Micah A; Ni, Ting; Zhu, Jun; Chadburn, Amy; Levy, Shawn; Dunson, David; Dave, Sandeep S

2013-01-22

Diffuse large B-cell lymphoma (DLBCL) is the most common form of lymphoma in adults. The disease exhibits a striking heterogeneity in gene expression profiles and clinical outcomes, but its genetic causes remain to be fully defined. Through whole genome and exome sequencing, we characterized the genetic diversity of DLBCL. In all, we sequenced 73 DLBCL primary tumors (34 with matched normal DNA). Separately, we sequenced the exomes of 21 DLBCL cell lines. We identified 322 DLBCL cancer genes that were recurrently mutated in primary DLBCLs. We identified recurrent mutations implicating a number of known and not previously identified genes and pathways in DLBCL including those related to chromatin modification (ARID1A and MEF2B), NF-κB (CARD11 and TNFAIP3), PI3 kinase (PIK3CD, PIK3R1, and MTOR), B-cell lineage (IRF8, POU2F2, and GNA13), and WNT signaling (WIF1). We also experimentally validated a mutation in PIK3CD, a gene not previously implicated in lymphomas. The patterns of mutation demonstrated a classic long tail distribution with substantial variation of mutated genes from patient to patient and also between published studies. Thus, our study reveals the tremendous genetic heterogeneity that underlies lymphomas and highlights the need for personalized medicine approaches to treating these patients.
Whole Genome Sequence Analysis of Pig Respiratory Bacterial Pathogens with Elevated Minimum Inhibitory Concentrations for Macrolides.

PubMed

Dayao, Denise Ann Estarez; Seddon, Jennifer M; Gibson, Justine S; Blackall, Patrick J; Turni, Conny

2016-10-01

Macrolides are often used to treat and control bacterial pathogens causing respiratory disease in pigs. This study analyzed the whole genome sequences of one clinical isolate each of Actinobacillus pleuropneumoniae, Haemophilus parasuis, Pasteurella multocida, and Bordetella bronchiseptica, all isolated from Australian pigs, to identify the mechanism underlying the elevated minimum inhibitory concentrations (MICs) for erythromycin, tilmicosin, or tulathromycin. The H. parasuis assembled genome had a nucleotide transition at position 2059 (A to G) in the six copies of the 23S rRNA gene. This mutation has previously been associated with macrolide resistance, but this is the first reported mechanism associated with elevated macrolide MICs in H. parasuis. There was no known macrolide resistance mechanism identified in the other three bacterial genomes. However, strA and sul2, aminoglycoside and sulfonamide resistance genes, respectively, were detected in one contiguous sequence (contig 1) of the A. pleuropneumoniae assembled genome. This contig was identical to plasmids previously identified in Pasteurellaceae. This study has provided one possible explanation of elevated MICs to macrolides in H. parasuis. Further studies are necessary to clarify the mechanism causing the unexplained macrolide resistance in other Australian pig respiratory pathogens, including the role of efflux systems, which were detected in all analyzed genomes.
Novel genomic findings in multiple myeloma identified through routine diagnostic sequencing.

PubMed

Ryland, Georgina L; Jones, Kate; Chin, Melody; Markham, John; Aydogan, Elle; Kankanige, Yamuna; Caruso, Marisa; Guinto, Jerick; Dickinson, Michael; Prince, H Miles; Yong, Kwee; Blombery, Piers

2018-05-14

Multiple myeloma is a genomically complex haematological malignancy with many genomic alterations recognised as important in diagnosis, prognosis and therapeutic decision making. Here, we provide a summary of genomic findings identified through routine diagnostic next-generation sequencing at our centre. A cohort of 86 patients with multiple myeloma underwent diagnostic sequencing using a custom hybridisation-based panel targeting 104 genes. Sequence variants, genome-wide copy number changes and structural rearrangements were detected using a bioinformatics pipeline developed in-house. At least one mutation was found in 69 (80%) patients. Frequently mutated genes included TP53 (36%), KRAS (22.1%), NRAS (15.1%), FAM46C/DIS3 (8.1%) and TET2/FGFR3 (5.8%), including multiple mutations not previously described in myeloma. Importantly, we observed TP53 mutations in the absence of a 17p deletion in 8% of the cohort, highlighting the need for sequencing-based assessment in addition to cytogenetics to identify these high-risk patients. Multiple novel copy number changes and immunoglobulin heavy chain translocations are also discussed. Our results demonstrate that many clinically relevant genomic findings remain in multiple myeloma which have not yet been identified through large-scale sequencing efforts, and provide important mechanistic insights into plasma cell pathobiology. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Recognition of Potentially Novel Human Disease-Associated Pathogens by Implementation of Systematic 16S rRNA Gene Sequencing in the Diagnostic Laboratory

PubMed Central

Keller, Peter M.; Rampini, Silvana K.; Büchler, Andrea C.; Eich, Gerhard; Wanner, Roger M.; Speck, Roberto F.; Böttger, Erik C.; Bloemberg, Guido V.

2010-01-01

Clinical isolates that are difficult to identify by conventional means form a valuable source of novel human pathogens. We report on a 5-year study based on systematic 16S rRNA gene sequence analysis. We found 60 previously unknown 16S rRNA sequences corresponding to potentially novel bacterial taxa. For 30 of 60 isolates, clinical relevance was evaluated; 18 of the 30 isolates analyzed were considered to be associated with human disease. PMID:20631113
Generation of a Maize B Centromere Minimal Map Containing the Central Core Domain.

PubMed

Ellis, Nathanael A; Douglas, Ryan N; Jackson, Caroline E; Birchler, James A; Dawe, R Kelly

2015-10-28

The maize B centromere has been used as a model for centromere epigenetics and as the basis for building artificial chromosomes. However, there are no sequence resources for this important centromere. Here we used transposon display for the centromere-specific retroelement CRM2 to identify a collection of 40 sequence tags that flank CRM2 insertion points on the B chromosome. These were confirmed to lie within the centromere by assaying deletion breakpoints from centromere misdivision derivatives (intracentromere breakages caused by centromere fission). Markers were grouped together on the basis of their association with other markers in the misdivision series and assembled into a pseudocontig containing 10.1 kb of sequence. To identify sequences that interact directly with centromere proteins, we carried out chromatin immunoprecipitation using antibodies to centromeric histone H3 (CENH3), a defining feature of functional centromeric sequences. The CENH3 chromatin immunoprecipitation map was interpreted relative to the known transmission rates of centromere misdivision derivatives to identify a centromere core domain spanning 33 markers. A subset of seven markers was mapped in additional B centromere misdivision derivatives with the use of unique primer pairs. A derivative previously shown to have no canonical centromere sequences (Telo3-3) lacks these core markers. Our results provide a molecular map of the B chromosome centromere and identify key sequences within the map that interact directly with centromeric histone H3. Copyright © 2015 Ellis et al.
Feline hypersomatotropism and acromegaly tumorigenesis: a potential role for the AIP gene.

PubMed

Scudder, C J; Niessen, S J; Catchpole, B; Fowkes, R C; Church, D B; Forcada, Y

2017-04-01

Acromegaly in humans is usually sporadic; however, up to 20% of familial isolated pituitary adenomas are caused by germline sequence variants of the aryl-hydrocarbon-receptor interacting protein (AIP) gene. Feline acromegaly has similarities to human acromegalic families with AIP mutations. The aim of this study was to sequence the feline AIP gene, identify sequence variants and compare the AIP gene sequence between acromegalic and control cats, and in acromegalic siblings. The feline AIP gene was amplified through PCR using whole blood genomic DNA from 10 acromegalic and 10 control cats, and 3 sibling pairs affected by acromegaly. PCR products were sequenced and compared with the published predicted feline AIP gene. A single nonsynonymous SNP was identified in exon 1 (AIP:c.9T > G) of two acromegalic cats and none of the control cats, as well as both members of one sibling pair. The region of this SNP is considered essential for the interaction of the AIP protein with its receptor. This sequence variant has not previously been reported in humans. Two additional synonymous sequence variants were identified (AIP:c.481C > T and AIP:c.826C > T). This is the first molecular study to investigate a potential genetic cause of feline acromegaly and identified a nonsynonymous AIP single nucleotide polymorphism in 20% of the acromegalic cat population evaluated, as well as in one of the sibling pairs evaluated. Copyright © 2016 Elsevier Inc. All rights reserved.
Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

PubMed Central

Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

2015-01-01

We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
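The reciprocity test at the core of the approach named in this abstract can be sketched as follows. This is a simplified illustration only: the full RSD procedure also involves similarity searches, alignment, and distance estimation, all of which are replaced here by a precomputed table. The gene identifiers and distance values are hypothetical.

```python
def reciprocal_smallest_distance(dist):
    """Return ortholog pairs that are each other's smallest-distance partner.

    `dist` maps (gene_in_genome_A, gene_in_genome_B) -> evolutionary distance.
    """
    best_for_a = {}  # gene in A -> (closest gene in B, distance)
    best_for_b = {}  # gene in B -> (closest gene in A, distance)
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep a pair only when the choice is mutual.
    return {
        a: (b, d)
        for a, (b, d) in best_for_a.items()
        if best_for_b[b][0] == a
    }


# Hypothetical distances between Chinese (ch*) and Indian (in*) sequences.
dist = {
    ("chA1", "inB1"): 0.02,
    ("chA1", "inB2"): 0.40,
    ("chA2", "inB1"): 0.10,
    ("chA2", "inB2"): 0.35,
}
orthologs = reciprocal_smallest_distance(dist)
```

Here chA2 is dropped because its closest partner, inB1, is itself closer to chA1. The retained distances are what a divergence ranking like the one in the abstract would then be built on.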
Whole-Exome Sequencing to Identify Novel Biological Pathways Associated With Infertility After Pelvic Inflammatory Disease.

PubMed

Taylor, Brandie D; Zheng, Xiaojing; Darville, Toni; Zhong, Wujuan; Konganti, Kranti; Abiodun-Ojo, Olayinka; Ness, Roberta B; O'Connell, Catherine M; Haggerty, Catherine L

2017-01-01

Ideal management of sexually transmitted infections (STI) may require risk markers for pathology or vaccine development. Previously, we identified common genetic variants associated with chlamydial pelvic inflammatory disease (PID) and reduced fecundity. As this explains only a proportion of the long-term morbidity risk, we used whole-exome sequencing to identify biological pathways that may be associated with STI-related infertility. We obtained stored DNA from 43 non-Hispanic black women with PID from the PID Evaluation and Clinical Health Study. Infertility was assessed at a mean of 84 months. Principal component analysis revealed no population stratification. Potential covariates did not significantly differ between groups. The sequencing kernel association test was used to examine associations between aggregates of variants on a single gene and infertility. The results from the sequencing kernel association test were used to choose "focus genes" (P < 0.01; n = 150) for subsequent Ingenuity Pathway Analysis to identify "gene sets" that are enriched in biologically relevant pathways. Pathway analysis revealed that focus genes were enriched in canonical pathways including IL-1 signaling, P2Y purinergic receptor signaling, and bone morphogenic protein signaling. Focus genes were enriched in pathways that impact innate and adaptive immunity, protein kinase A activity, cellular growth, and DNA repair. These may alter host resistance or immunopathology after infection. Targeted sequencing of biological pathways identified in this study may provide insight into STI-related infertility.

Cataloging the 1811-1812 New Madrid, central U.S., earthquake sequence

USGS Publications Warehouse

Hough, S.E.

2009-01-01

The three principal New Madrid, central U.S., mainshocks of 1811-1812 were followed by extensive aftershock sequences that included numerous felt events. Although no instrumental data are available for the sequence, historical accounts provide information that can be used to estimate magnitudes and locations for the large aftershocks as well as the mainshocks. Several detailed eyewitness accounts of the sequence provide sufficient information to identify times and rough magnitude estimates for a number of aftershocks that have not been analyzed previously. I also use three extended compilations of felt events to explore the overall sequence productivity. Although one generally cannot estimate magnitudes or locations for individual events, the intensity distributions of recent, instrumentally recorded earthquakes in the region provide a basis for estimation of the magnitude distribution of 1811-1812 aftershocks. The distribution is consistent with a b-value distribution. I estimate Mw 6-6.3 for the three largest identifiable aftershocks, apart from the so-called dawn aftershock on 16 December 1811.
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA.

PubMed

Xu, Weijia; Ozer, Stuart; Gutell, Robin R

2009-01-01

With an increasingly large number of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large-scale alignments and are less effective with sequences from diverse phylogenetic classifications. We propose a new approach that utilizes coevolutionary rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times larger and with 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure.
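For readers unfamiliar with covariation analysis, the sketch below scores a pair of alignment columns with mutual information. This is a swapped-in, phylogeny-unaware measure used purely for illustration; it is not the coevolutionary-rate method or the database schema described in the abstract, and the four-sequence alignment is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (x, y), count in freq_ij.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((freq_i[x] / n) * (freq_j[y] / n)))
    return mi


# Hypothetical alignment: positions 0 and 2 always form a Watson-Crick pair,
# position 1 is invariant.
alignment = ["GAC", "CAG", "AAU", "UAA"]
columns = list(zip(*alignment))
paired_score = mutual_information(columns[0], columns[2])
invariant_score = mutual_information(columns[0], columns[1])
```

Positions that change together score high, while a conserved position carries no covariation signal. A measure like this ignores shared ancestry, which is one motivation for phylogeny-aware approaches such as the one the abstract proposes.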
Covariant Evolutionary Event Analysis for Base Interaction Prediction Using a Relational Database Management System for RNA

PubMed Central

Xu, Weijia; Ozer, Stuart; Gutell, Robin R.

2010-01-01

With an increasingly large number of sequences properly aligned, comparative sequence analysis can accurately identify not only common structures formed by standard base pairing but also new types of structural elements and constraints. However, traditional methods are too computationally expensive to perform well on large-scale alignments and are less effective with sequences from diverse phylogenetic classifications. We propose a new approach that utilizes coevolutionary rates among pairs of nucleotide positions using phylogenetic and evolutionary relationships of the organisms of aligned sequences. With a novel data schema to manage relevant information within a relational database, our method, implemented with Microsoft SQL Server 2005, showed 90% sensitivity in identifying base pair interactions among 16S ribosomal RNA sequences from Bacteria, at a scale 40 times larger and with 50% better sensitivity than a previous study. The results also indicated covariation signals for a few sets of cross-strand base stacking pairs in secondary structure helices, and other subtle constraints in the RNA structure. PMID:20502534
The GENCODE exome: sequencing the complete human exome

PubMed Central

Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno

2011-01-01

Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695
Functional SNP associated with birth weight in independent populations identified with a permutation step added to GBLUP-GWAS

USDA-ARS's Scientific Manuscript database

This study was conducted as an initial assessment of a newly available genotyping assay containing about 34,000 common SNP included on previous SNP chips, and 199,000 sequence variants predicted to affect gene function. Objectives were to identify functional variants associated with birth weight in...
Elucidating the 16S rRNA 3' boundaries and defining optimal SD/aSD pairing in Escherichia coli and Bacillus subtilis using RNA-Seq data.

PubMed

Wei, Yulong; Silke, Jordan R; Xia, Xuhua

2017-12-15

Bacterial translation initiation is influenced by base pairing between the Shine-Dalgarno (SD) sequence in the 5' UTR of mRNA and the anti-SD (aSD) sequence at the free 3' end of the 16S rRNA (3' TAIL) due to: 1) the SD/aSD sequence binding location and 2) SD/aSD binding affinity. In order to understand what makes an SD/aSD interaction optimal, we must define: 1) the terminus of the 3' TAIL and 2) the extent of the core aSD sequence within the 3' TAIL. Our approach to characterize these components in Escherichia coli and Bacillus subtilis involves 1) mapping the 3' boundary of the mature 16S rRNA using high-throughput RNA sequencing (RNA-Seq), and 2) identifying the segment within the 3' TAIL that is strongly preferred in SD/aSD pairing. Using RNA-Seq data, we resolve previous discrepancies in the reported 3' TAIL in B. subtilis and recover the established 3' TAIL in E. coli. Furthermore, we extend previous studies to suggest that both highly and lowly expressed genes favor SD sequences with intermediate binding affinity, but this trend is exclusive to SD sequences that complement the core aSD sequences defined herein.
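A minimal sketch of how a candidate SD site might be located by complementarity to the 3' TAIL is given below. It considers only contiguous Watson-Crick pairing (no G-U wobble, no binding energies), which is a deliberate simplification and not the analysis used in the paper. The tail and UTR strings are illustrative placeholders assumed for this example, not sequences taken from the abstract.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def longest_sd_match(utr, tail_3prime):
    """Longest UTR substring that pairs contiguously with the 3' tail."""
    target = reverse_complement(tail_3prime)  # ideal mRNA partner of the tail
    best = ""
    for i in range(len(utr)):
        # Only try substrings longer than the current best.
        for j in range(i + len(best) + 1, len(utr) + 1):
            if utr[i:j] in target:
                best = utr[i:j]
            else:
                break  # extending further cannot match either
    return best


# Placeholder sequences (assumptions for illustration), written 5' to 3'.
tail = "GAUCACCUCCUUA"
utr = "CCAUAAGGAGGUAUAAAUG"
match = longest_sd_match(utr, tail)
```

Changing the assumed tail terminus changes which UTR segments can pair, which is why the abstract treats the exact 3' boundary as something that must be defined first.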
Bifidobacterium animalis subsp. lactis ATCC 27673 Is a Genomically Unique Strain within Its Conserved Subspecies

PubMed Central

Loquasto, Joseph R.; Barrangou, Rodolphe; Dudley, Edward G.; Stahl, Buffy; Chen, Chun

2013-01-01

Many strains of Bifidobacterium animalis subsp. lactis are considered health-promoting probiotic microorganisms and are commonly formulated into fermented dairy foods. Analyses of previously sequenced genomes of B. animalis subsp. lactis have revealed little genetic diversity, suggesting that it is a monomorphic subspecies. However, during a multilocus sequence typing survey of Bifidobacterium, it was revealed that B. animalis subsp. lactis ATCC 27673 gave a profile distinct from that of the other strains of the subspecies. As part of an ongoing study designed to understand the genetic diversity of this subspecies, the genome of this strain was sequenced and compared to other sequenced genomes of B. animalis subsp. lactis and B. animalis subsp. animalis. The complete genome of ATCC 27673 was 1,963,012 bp, contained 1,616 genes and 4 rRNA operons, and had a G+C content of 61.55%. Comparative analyses revealed that the genome of ATCC 27673 contained six distinct genomic islands encoding 83 open reading frames not found in other strains of the same subspecies. In four islands, either phage or mobile genetic elements were identified. In island 6, a novel clustered regularly interspaced short palindromic repeat (CRISPR) locus which contained 81 unique spacers was identified. This type I-E CRISPR-cas system differs from the type I-C systems previously identified in this subspecies, representing the first identification of a different system in B. animalis subsp. lactis. This study revealed that ATCC 27673 is a strain of B. animalis subsp. lactis with novel genetic content and suggests that the lack of genetic variability observed is likely due to the repeated sequencing of a limited number of widely distributed commercial strains. PMID:23995933
Exome-wide Sequencing Shows Low Mutation Rates and Identifies Novel Mutated Genes in Seminomas.

PubMed

Cutcutache, Ioana; Suzuki, Yuka; Tan, Iain Beehuat; Ramgopal, Subhashini; Zhang, Shenli; Ramnarayanan, Kalpana; Gan, Anna; Lee, Heng Hong; Tay, Su Ting; Ooi, Aikseng; Ong, Choon Kiat; Bolthouse, Jonathan T; Lane, Brian R; Anema, John G; Kahnoski, Richard J; Tan, Patrick; Teh, Bin Tean; Rozen, Steven G

2015-07-01

Testicular germ cell tumors are the most common cancer diagnosed in young men, and seminomas are the most common type of these cancers. There have been no exome-wide examinations of genes mutated in seminomas or of overall rates of nonsilent somatic mutations in these tumors. The objective was to analyze somatic mutations in seminomas to determine which genes are affected and to determine rates of nonsilent mutations. Eight seminomas and matched normal samples were surgically obtained from eight patients. DNA was extracted from tissue samples and exome sequenced on massively parallel Illumina DNA sequencers. Single-nucleotide polymorphism chip-based copy number analysis was also performed to assess copy number alterations. The DNA sequencing read data were analyzed to detect somatic mutations including single-nucleotide substitutions and short insertions and deletions. The detected mutations were validated by independent sequencing and further checked for subclonality. The rate of nonsynonymous somatic mutations averaged 0.31 mutations/Mb. We detected nonsilent somatic mutations in 96 genes that were not previously known to be mutated in seminomas, of which some may be driver mutations. Many of the mutations appear to have been present in subclonal populations. In addition, two genes, KIT and KRAS, were affected in two tumors each with mutations that were previously observed in other cancers and are presumably oncogenic. Our study, the first report on exome sequencing of seminomas, detected somatic mutations in 96 new genes, several of which may be targetable drivers. Furthermore, our results show that seminoma mutation rates are five times higher than previously thought, but are nevertheless low compared to other common cancers. Similar low rates are seen in other cancers that also have excellent rates of remission achieved with chemotherapy. We examined the DNA sequences of seminomas, the most common type of testicular germ cell cancer. 
Our study identified 96 new genes in which mutations occurred during seminoma development, some of which might contribute to cancer development or progression. The study also showed that the rates of DNA mutations during seminoma development are higher than previously thought, but still lower than for other common solid-organ cancers. Such low rates are also observed among other cancers that, like seminomas, show excellent rates of disease remission after chemotherapy. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Two novel disease-causing mutations in the CLRN1 gene in patients with Usher syndrome type 3

PubMed Central

García-García, Gema; Aparisi, María J.; Rodrigo, Regina; Sequedo, María D.; Espinós, Carmen; Rosell, Jordi; Olea, José L.; Mendívil, M. Paz; Ramos-Arroyo, María A; Ayuso, Carmen; Jaijo, Teresa; Aller, Elena

2012-01-01

Purpose: To identify the genetic defect in Spanish families with Usher syndrome (USH) and probable involvement of the CLRN1 gene. Methods: DNA samples of the affected members of our cohort of USH families were tested using an USH genotyping array, and/or genotyped with polymorphic markers specific for the USH3A locus. Based on these previous analyses and clinical findings, CLRN1 was directly sequenced in 17 patients susceptible to carrying mutations in this gene. Results: Microarray analysis revealed the previously reported mutation p.Y63X in two unrelated patients, one of them homozygous for the mutation. After CLRN1 sequencing, we found two novel mutations, p.R207X and p.I168N. Both novel mutations segregated with the phenotype. Conclusions: To date, 18 mutations in CLRN1 have been reported. In this work, we report two novel mutations and a third one previously identified in the Spanish USH sample. The prevalence of CLRN1 among our patients with USH is low. PMID:23304067
Gene expression divergence and nucleotide differentiation between males of different color morphs and mating strategies in the ruff

PubMed Central

Ekblom, Robert; Farrell, Lindsay L; Lank, David B; Burke, Terry

2012-01-01

By next generation transcriptome sequencing, it is possible to obtain data on both nucleotide sequence variation and gene expression. We have used this approach (RNA-Seq) to investigate the genetic basis for differences in plumage coloration and mating strategies in a non-model bird species, the ruff (Philomachus pugnax). Ruff males show enormous variation in the coloration of ornamental feathers, used for individual recognition. This polymorphism is linked to reproductive strategies, with dark males (Independents) defending territories on leks against other Independents, whereas white morphs (Satellites) co-occupy Independents' courts without agonistic interactions. Previous work found a strong genetic component for mating strategy, but the genes involved were not identified. We present feather transcriptome data of more than 6,000 de-novo sequenced ruff genes (although with limited coverage for many of them). None of the identified genes showed significant expression divergence between males, but many genetic markers showed nucleotide differentiation between different color morphs and mating strategies. These include several feather keratin genes, splicing factors, and the Xg blood-group gene. Many of the genes with significant genetic structure between mating strategies have not yet been annotated and their functions remain to be elucidated. We also conducted in-depth investigations of 28 pre-identified coloration candidate genes. Two of these (EDNRB and TYR) were specifically expressed in black- and rust-colored males, respectively. We have demonstrated the utility of next generation transcriptome sequencing for identifying and genotyping large numbers of genetic markers in a non-model species without previous genomic resources, and highlight the potential of this approach for addressing the genetic basis of ecologically important variation. PMID:23145334
Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

PubMed Central

de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

2000-01-01

Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by genscan (http://genes.mit.edu/GENSCAN.html). PMID:11070084
Characterization of full-length MHC class II sequences in Indonesian and Vietnamese cynomolgus macaques.

PubMed

Creager, Hannah M; Becker, Ericka A; Sandman, Kelly K; Karl, Julie A; Lank, Simon M; Bimber, Benjamin N; Wiseman, Roger W; Hughes, Austin L; O'Connor, Shelby L; O'Connor, David H

2011-09-01

In recent years, the use of cynomolgus macaques in biomedical research has increased greatly. However, with the exception of the Mauritian population, knowledge of the MHC class II genetics of the species remains limited. Here, using cDNA cloning and Sanger sequencing, we identified 127 full-length MHC class II alleles in a group of 12 Indonesian and 12 Vietnamese cynomolgus macaques. Forty-two of these were completely novel to cynomolgus macaques, while 61 extended the sequence of previously identified alleles from partial to full length. This more than doubles the number of full-length cynomolgus macaque MHC class II alleles available in GenBank, significantly expanding the allele library for the species and laying the groundwork for future evolutionary and functional studies.
Computational complexity of algorithms for sequence comparison, short-read assembly and genome alignment.

PubMed

Baichoo, Shakuntala; Ouzounis, Christos A

A multitude of algorithms for sequence comparison, short-read assembly and whole-genome alignment have been developed in the general context of molecular biology, to support technology development for high-throughput sequencing, numerous applications in genome biology and fundamental research on comparative genomics. The computational complexity of these algorithms has been previously reported in original research papers, yet this often neglected property has not been reviewed previously in a systematic manner and for a wider audience. We provide a review of space and time complexity of key sequence analysis algorithms and highlight their properties in a comprehensive manner, in order to identify potential opportunities for further research in algorithm or data structure optimization. The complexity aspect is poised to become pivotal as we will be facing challenges related to the continuous increase of genomic data on unprecedented scales and complexity in the foreseeable future, when robust biological simulation at the cell level and above becomes a reality. Copyright © 2017 Elsevier B.V. All rights reserved.
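As an illustration of the kind of time and space trade-off such a review covers, the sketch below computes the edit distance between two sequences with the classic dynamic-programming recurrence. It is a generic textbook example, not taken from the review itself: it runs in time proportional to the product of the two lengths but keeps only two rows of the table, so memory grows with the shorter sequence only. The function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance in O(len(a) * len(b)) time and O(min(len(a), len(b))) space."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence along the row to minimise memory
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]  # cost of deleting the first i characters of a
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current  # only the last completed row is retained
    return previous[-1]
```

Keeping two rows is enough when only the distance is needed; recovering the alignment itself would require the full table or a divide-and-conquer scheme.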
Numerous uncharacterized and highly divergent microbes which colonize humans are revealed by circulating cell-free DNA

PubMed Central

Camunas-Soler, Joan; Kertesz, Michael; De Vlaminck, Iwijn; Koh, Winston; Pan, Wenying; Martin, Lance; Neff, Norma F.; Okamoto, Jennifer; Wong, Ronald J.; Kharbanda, Sandhya; El-Sayed, Yasser; Blumenfeld, Yair; Stevenson, David K.; Shaw, Gary M.; Wolfe, Nathan D.; Quake, Stephen R.

2017-01-01

Blood circulates throughout the human body and contains molecules drawn from virtually every tissue, including the microbes and viruses which colonize the body. Through massive shotgun sequencing of circulating cell-free DNA from the blood, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Analyzing cumulative sequence data from 1,351 blood samples collected from 188 patients enabled us to assemble 7,190 contiguous regions (contigs) larger than 1 kbp, of which 3,761 are novel with little or no sequence homology in any existing databases. The vast majority of these novel contigs possess coding sequences, and we have validated their existence both by finding their presence in independent experiments and by performing direct PCR amplification. When their nearest neighbors are located in the tree of life, many of the organisms represent entirely novel taxa, showing that microbial diversity within the human body is substantially broader than previously appreciated. PMID:28830999
A third genotype of the human parvovirus PARV4 in sub-Saharan Africa.

PubMed

Simmonds, Peter; Douglas, Jill; Bestetti, Giovanna; Longhi, Erika; Antinori, Spinello; Parravicini, Carlo; Corbellino, Mario

2008-09-01

PARV4 is a recently discovered human parvovirus widely distributed in injecting drug users in the USA and Europe, particularly in those co-infected with human immunodeficiency virus (HIV). Like parvovirus B19, PARV4 persists in previously exposed individuals. In bone marrow and lymphoid tissue, PARV4 sequences were detected in two sub-Saharan African study subjects with AIDS but without a reported history of parenteral exposure and who were uninfected with hepatitis C virus. PARV4 variants infecting these subjects were phylogenetically distinct from genotypes 1 and 2 (formerly PARV5) that were reported previously. Analysis of near-complete genome sequences demonstrated that they should be classified as a third (equidistant) PARV4 genotype. The availability of a further near-complete genome sequence of this novel genotype facilitated identification of conserved novel open reading frames embedded in the ORF2 coding sequence; one encoded a putative protein with identifiable homology to SAT proteins of members of the genus Parvovirus.
Multilocus sequence typing of Pseudomonas syringae sensu lato confirms previously described genomospecies and permits rapid identification of P. syringae pv. coriandricola and P. syringae pv. apii causing bacterial leaf spot on parsley.

PubMed

Bull, Carolee T; Clarke, Christopher R; Cai, Rongman; Vinatzer, Boris A; Jardini, Teresa M; Koike, Steven T

2011-07-01

Since 2002, severe leaf spotting on parsley (Petroselinum crispum) has occurred in Monterey County, CA. Either of two different pathovars of Pseudomonas syringae sensu lato were isolated from diseased leaves from eight distinct outbreaks and once from the same outbreak. Fragment analysis of DNA amplified between repetitive sequence polymerase chain reaction; 16S rDNA sequence analysis; and biochemical, physiological, and host range tests identified the pathogens as Pseudomonas syringae pv. apii and P. syringae pv. coriandricola. Koch's postulates were completed for the isolates from parsley, and host range tests with parsley isolates and pathotype strains demonstrated that P. syringae pv. apii and P. syringae pv. coriandricola cause leaf spot diseases on parsley, celery, and coriander or cilantro. In a multilocus sequence typing (MLST) approach, four housekeeping gene fragments were sequenced from 10 strains isolated from parsley and 56 pathotype strains of P. syringae. Allele sequences were uploaded to the Plant-Associated Microbes Database and a phylogenetic tree was built based on concatenated sequences. Tree topology directly corresponded to P. syringae genomospecies and P. syringae pv. apii was allocated appropriately to genomospecies 3. This is the first demonstration that MLST can accurately allocate new pathogens directly to P. syringae sensu lato genomospecies. According to MLST, P. syringae pv. coriandricola is a member of genomospecies 9, P. cannabina. In a blind test, both P. syringae pv. coriandricola and P. syringae pv. apii isolates from parsley were correctly identified to pathovar. In both cases, MLST described diversity within each pathovar that was previously unknown.
Targeted next-generation sequencing helps to decipher the genetic and phenotypic heterogeneity of hypertrophic cardiomyopathy

PubMed Central

Cecconi, Massimiliano; Parodi, Maria I.; Formisano, Francesco; Spirito, Paolo; Autore, Camillo; Musumeci, Maria B.; Favale, Stefano; Forleo, Cinzia; Rapezzi, Claudio; Biagini, Elena; Davì, Sabrina; Canepa, Elisabetta; Pennese, Loredana; Castagnetta, Mauro; Degiorgio, Dario; Coviello, Domenico A.

2016-01-01

Hypertrophic cardiomyopathy (HCM) is mainly associated with myosin, heavy chain 7 (MYH7) and myosin binding protein C, cardiac (MYBPC3) mutations. In order to better explain the clinical and genetic heterogeneity in HCM patients, in this study, we implemented a targeted next-generation sequencing (NGS) assay. We created an Ion AmpliSeq™ Custom Panel for the enrichment of 19 genes, 9 of which did not encode thick/intermediate and thin myofilament (TTm) proteins, including 3 responsible for HCM phenocopies. Ninety-two DNA samples were analyzed by the Ion Personal Genome Machine: 73 DNA samples (training set), previously genotyped in some of the genes by Sanger sequencing, were used to optimize the NGS strategy, whereas 19 DNA samples (discovery set) allowed the evaluation of NGS performance. In the training set, we identified 72 out of 73 expected mutations and 15 additional mutations: the molecular diagnosis was achieved in one patient with a previously wild-type status and the pre-excitation syndrome was explained in another. In the discovery set, we identified 20 mutations, 5 of which were in genes encoding non-TTm proteins, increasing the diagnostic yield by approximately 20%: a single mutation in genes encoding non-TTm proteins was identified in 2 out of 3 borderline HCM patients, whereas co-occurring mutations in genes encoding TTm and galactosidase alpha (GLA) altered proteins were characterized in a male with HCM and multiorgan dysfunction. Our combined targeted NGS-Sanger sequencing-based strategy allowed the molecular diagnosis of HCM with greater efficiency than conventional (Sanger) sequencing alone.
Mutant alleles encoding non-TTm proteins may aid in the complete understanding of the genetic and phenotypic heterogeneity of HCM: co-occurring mutations of genes encoding TTm and non-TTm proteins could explain the wide variability of the HCM phenotype, whereas mutations in genes encoding only the non-TTm proteins are identifiable in patients with a milder HCM status. PMID:27600940
Frequent genes in rare diseases: panel-based next generation sequencing to disclose causal mutations in hereditary neuropathies.

PubMed

Dohrn, Maike F; Glöckle, Nicola; Mulahasanovic, Lejla; Heller, Corina; Mohr, Julia; Bauer, Christine; Riesch, Erik; Becker, Andrea; Battke, Florian; Hörtnagel, Konstanze; Hornemann, Thorsten; Suriyanarayanan, Saranya; Blankenburg, Markus; Schulz, Jörg B; Claeys, Kristl G; Gess, Burkhard; Katona, Istvan; Ferbert, Andreas; Vittore, Debora; Grimm, Alexander; Wolking, Stefan; Schöls, Ludger; Lerche, Holger; Korenke, G Christoph; Fischer, Dirk; Schrank, Bertold; Kotzaeridou, Urania; Kurlemann, Gerhard; Dräger, Bianca; Schirmacher, Anja; Young, Peter; Schlotter-Weigel, Beate; Biskup, Saskia

2017-12-01

Hereditary neuropathies comprise a wide variety of chronic diseases associated with more than 80 genes identified to date. We herein examined 612 index patients with either a Charcot-Marie-Tooth phenotype, hereditary sensory neuropathy, familial amyloid neuropathy, or small fiber neuropathy using a customized multigene panel based on the next generation sequencing technique. In 121 cases (19.8%), we identified at least one putative pathogenic mutation. Of these, 54.4% showed an autosomal dominant, 33.9% an autosomal recessive, and 11.6% an X-linked inheritance. The most frequently affected genes were PMP22 (16.4%), GJB1 (10.7%), MPZ, and SH3TC2 (both 9.9%), and MFN2 (8.3%). We further detected likely or known pathogenic variants in HINT1, HSPB1, NEFL, PRX, IGHMBP2, NDRG1, TTR, EGR2, FIG4, GDAP1, LMNA, LRSAM1, POLG, TRPV4, AARS, BIC2, DHTKD1, FGD4, HK1, INF2, KIF5A, PDK3, REEP1, SBF1, SBF2, SCN9A, and SPTLC2 with a declining frequency. Thirty-four novel variants were considered likely pathogenic, not having previously been described in association with any disorder in the literature. In one patient, two homozygous mutations in HK1 were detected in the multigene panel, but not by whole exome sequencing. A novel missense mutation in KIF5A was considered pathogenic because of the highly compatible phenotype. In one patient, the plasma sphingolipid profile could functionally prove the pathogenicity of a mutation in SPTLC2. One pathogenic mutation in MPZ was identified after being previously missed by Sanger sequencing. We conclude that panel based next generation sequencing is a useful, time- and cost-effective approach to assist clinicians in identifying the correct diagnosis and enable causative treatment considerations. © 2017 International Society for Neurochemistry.
Novel Molecular Method for Identification of Streptococcus pneumoniae Applicable to Clinical Microbiology and 16S rRNA Sequence-Based Microbiome Studies

PubMed Central

Scholz, Christian F. P.; Poulsen, Knud

2012-01-01

The close phylogenetic relationship of the important pathogen Streptococcus pneumoniae and several species of commensal streptococci, particularly Streptococcus mitis and Streptococcus pseudopneumoniae, and the recently demonstrated sharing of genes and phenotypic traits previously considered specific for S. pneumoniae hamper the exact identification of S. pneumoniae. Based on sequence analysis of 16S rRNA genes of a collection of 634 streptococcal strains, identified by multilocus sequence analysis, we detected a cytosine at position 203 present in all 440 strains of S. pneumoniae but replaced by an adenosine residue in all strains representing other species of mitis group streptococci. The S. pneumoniae-specific sequence signature could be demonstrated by sequence analysis or indirectly by restriction endonuclease digestion of a PCR amplicon covering the site. The S. pneumoniae-specific signature offers an inexpensive means for validation of the identity of clinical isolates and should be used as an integrated marker in the annotation procedure employed in 16S rRNA-based molecular studies of complex human microbiotas. This may avoid frequent misidentifications such as those we demonstrate to have occurred in previous reports and in reference sequence databases. PMID:22442329
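The signature check described in this abstract amounts to reading a single base. The sketch below shows one way it could be expressed, assuming the input 16S rRNA gene sequence is already in the same coordinate system as the abstract's position 203 (the abstract does not state the reference numbering, so that alignment step is an assumption here). The function name and the returned labels are hypothetical.

```python
SIGNATURE_POSITION = 203  # 1-based position quoted in the abstract


def classify_16s_signature(sequence: str, position: int = SIGNATURE_POSITION) -> str:
    """Classify a 16S rRNA gene sequence by the base at the signature position.

    Assumes `sequence` uses the same numbering as the abstract; mapping a raw
    read onto that numbering is outside the scope of this sketch.
    """
    if len(sequence) < position:
        raise ValueError("sequence does not cover the signature position")
    base = sequence[position - 1].upper()  # convert 1-based position to 0-based index
    if base == "C":
        return "consistent with S. pneumoniae"
    if base == "A":
        return "consistent with other mitis group streptococci"
    return "undetermined"
```

A result of "undetermined" covers ambiguous base calls such as `N`, which the abstract does not discuss.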
Ataxia telangiectasia presenting as dopa-responsive cervical dystonia

PubMed Central

Mohire, Mahavir D.; Schneider, Susanne A.; Stamelou, Maria; Wood, Nicholas W.; Bhatia, Kailash P.

2013-01-01

Objective: To identify the cause of cervical dopa-responsive dystonia (DRD) in a Muslim Indian family inherited in an apparently autosomal recessive fashion, as previously described in this journal. Methods: Previous testing for mutations in the genes known to cause DRD (GCH1, TH, and SPR) had been negative. Whole exome sequencing was performed on all 3 affected individuals for whom DNA was available to identify potentially pathogenic shared variants. Genotyping data obtained for all 3 affected individuals using the OmniExpress single nucleotide polymorphism chip (Illumina, San Diego, CA) were used to perform linkage analysis, autozygosity mapping, and copy number variation analysis. Sanger sequencing was used to confirm all variants. Results: After filtering of the variants, exome sequencing revealed 2 genes harboring potentially pathogenic compound heterozygous variants (ATM and LRRC16A). Of these, the variants in ATM segregated perfectly with the cervical DRD. Both mutations detected in ATM have been shown to be pathogenic, and α-fetoprotein, a marker of ataxia telangiectasia, was increased in all affected individuals. Conclusion: Biallelic mutations in ATM can cause DRD, and mutations in this gene should be considered in the differential diagnosis of unexplained DRD, particularly if the dystonia is cervical and if there is a recessive family history. ATM has previously been reported to cause isolated cervical dystonia, but never, to our knowledge, DRD. Individuals with dystonia related to ataxia telangiectasia may benefit from a trial of levodopa. PMID:23946315

Gustatory Receptor Expression in the Labella and Tarsi of Aedes aegypti

DTIC Science & Technology

2013-01-01

Gibbs, R., Chen, R., 2011. The Drosophila melanogaster transcriptome by paired-end RNA sequencing. Genome Res. 21, 315–324. Debboun, M., Strickman, D...from genomic sequences and compared to previously identified insect GRs (Kent et al., 2008). In general, GRs of the two main mosquito subfamilies...almost always demonstrated conservation in Drosophila melanogaster as well. Twelve out of the total 40 AaegGRs with likely orthologs in An. gambiae had
Genetic Mapping and Exome Sequencing Identify Variants Associated with Five Novel Diseases

PubMed Central

Puffenberger, Erik G.; Jinks, Robert N.; Sougnez, Carrie; Cibulskis, Kristian; Willert, Rebecca A.; Achilly, Nathan P.; Cassidy, Ryan P.; Fiorentini, Christopher J.; Heiken, Kory F.; Lawrence, Johnny J.; Mahoney, Molly H.; Miller, Christopher J.; Nair, Devika T.; Politi, Kristin A.; Worcester, Kimberly N.; Setton, Roni A.; DiPiazza, Rosa; Sherman, Eric A.; Eastman, James T.; Francklyn, Christopher; Robey-Bond, Susan; Rider, Nicholas L.; Gabriel, Stacey; Morton, D. Holmes; Strauss, Kevin A.

2012-01-01

The Clinic for Special Children (CSC) has integrated biochemical and molecular methods into a rural pediatric practice serving Old Order Amish and Mennonite (Plain) children. Among the Plain people, we have used single nucleotide polymorphism (SNP) microarrays to genetically map recessive disorders to large autozygous haplotype blocks (mean = 4.4 Mb) that contain many genes (mean = 79). For some, uninformative mapping or large gene lists preclude disease-gene identification by Sanger sequencing. Seven such conditions were selected for exome sequencing at the Broad Institute; all had been previously mapped at the CSC using low density SNP microarrays coupled with autozygosity and linkage analyses. Using between 1 and 5 patient samples per disorder, we identified sequence variants in the known disease-causing genes SLC6A3 and FLVCR1, and present evidence to strongly support the pathogenicity of variants identified in TUBGCP6, BRAT1, SNIP1, CRADD, and HARS. Our results reveal the power of coupling new genotyping technologies to population-specific genetic knowledge and robust clinical data. PMID:22279524
Regulatory elements of the floral homeotic gene AGAMOUS identified by phylogenetic footprinting and shadowing.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, R. L., Hamaguchi, L., Busch, M. A., and Weigel, D.

2003-06-01

OAK-B135 In Arabidopsis thaliana, cis-regulatory sequences of the floral homeotic gene AGAMOUS (AG) are located in the second intron. This 3 kb intron contains binding sites for two direct activators of AG, LEAFY (LFY) and WUSCHEL (WUS), along with other putative regulatory elements. We have used phylogenetic footprinting and the related technique of phylogenetic shadowing to identify putative cis-regulatory elements in this intron. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. Although there is little obvious sequence similarity outside the Brassicaceae, the intron from cucumber AG has at least partial activity in A. thaliana. Our studies underscore the value of the comparative approach as a tool that complements gene-by-gene promoter dissection, but also highlight that sequence-based studies alone are insufficient for a complete identification of cis-regulatory sites.
Detection of Mixed Infection from Bacterial Whole Genome Sequence Data Allows Assessment of Its Role in Clostridium difficile Transmission

PubMed Central

Eyre, David W.; Cule, Madeleine L.; Griffiths, David; Crook, Derrick W.; Peto, Tim E. A.

2013-01-01

Bacterial whole genome sequencing offers the prospect of rapid and high precision investigation of infectious disease outbreaks. Close genetic relationships between microorganisms isolated from different infected cases suggest transmission is a strong possibility, whereas transmission between cases with genetically distinct bacterial isolates can be excluded. However, undetected mixed infections (infection with ≥2 unrelated strains of the same species where only one is sequenced) potentially impair exclusion of transmission with certainty, and may therefore limit the utility of this technique. We investigated the problem by developing a computationally efficient method for detecting mixed infection without the need for resource-intensive independent sequencing of multiple bacterial colonies. Given the relatively low density of single nucleotide polymorphisms within bacterial sequence data, direct reconstruction of mixed infection haplotypes from current short-read sequence data is not consistently possible. We therefore use a two-step maximum likelihood-based approach, assuming each sample contains up to two infecting strains. We jointly estimate the proportion of the infection arising from the dominant and minor strains, and the sequence divergence between these strains. In cases where mixed infection is confirmed, the dominant and minor haplotypes are then matched to a database of previously sequenced local isolates. We demonstrate the performance of our algorithm with in silico and in vitro mixed infection experiments, and apply it to transmission of an important healthcare-associated pathogen, Clostridium difficile. Using hospital ward movement data in a previously described stochastic transmission model, 15 pairs of cases enriched for likely transmission events associated with mixed infection were selected. 
Our method identified four previously undetected mixed infections, and a previously undetected transmission event, but no direct transmission between the pairs of cases under investigation. These results demonstrate that mixed infections can be detected without additional sequencing effort, and this will be important in assessing the extent of cryptic transmission in our hospitals. PMID:23658511
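The joint estimation step mentioned in this abstract can be illustrated with a deliberately simplified model. The sketch below is not the authors' algorithm: it assumes each site is summarised by the number of reads carrying the non-majority base and the total depth, treats those counts as binomial, fixes the sequencing error rate, and finds the minor-strain proportion and between-strain divergence by a plain grid search. All names and default values (`fit_mixed_infection`, `error_rate`, `steps`) are hypothetical.

```python
import math


def log_binom_pmf(k: int, n: int, q: float) -> float:
    """Log of the binomial probability of k successes in n trials with rate q."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(q) + (n - k) * math.log(1 - q)


def fit_mixed_infection(sites, error_rate=0.01, steps=50):
    """Grid-search maximum likelihood estimate of (minor proportion, divergence).

    sites: list of (non_majority_reads, depth) pairs, one per genome position.
    Each site is modelled as a two-component mixture: with probability
    `divergence` the two strains differ there and non-majority reads occur at
    the minor-strain proportion; otherwise they occur at `error_rate`.
    """
    best_loglik, best_p, best_d = -math.inf, None, None
    for i in range(1, steps + 1):
        p = 0.5 * i / steps  # minor-strain proportion in (0, 0.5]
        for j in range(1, steps):
            d = j / steps  # fraction of sites at which the strains differ
            loglik = 0.0
            for k, n in sites:
                differ = math.log(d) + log_binom_pmf(k, n, p)
                same = math.log(1 - d) + log_binom_pmf(k, n, error_rate)
                top = max(differ, same)  # log-sum-exp for numerical stability
                loglik += top + math.log(math.exp(differ - top) + math.exp(same - top))
            if loglik > best_loglik:
                best_loglik, best_p, best_d = loglik, p, d
    return best_p, best_d
```

For real data the divergence between strains is tiny relative to genome length, so a practical version would search a much finer range of `d` near zero and use a numerical optimiser rather than a coarse grid.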
High-resolution identification and abundance profiling of cassava (Manihot esculenta Crantz) microRNAs.

PubMed

Khatabi, Behnam; Arikit, Siwaret; Xia, Rui; Winter, Stephan; Oumar, Doungous; Mongomake, Kone; Meyers, Blake C; Fondong, Vincent N

2016-01-28

Small RNAs (sRNAs) are endogenous RNAs that play regulatory roles in plant growth, development, and biotic and abiotic stress responses. In plants, one subset of sRNAs, microRNAs (miRNAs), exhibit tissue-differential expression and regulate gene expression mainly through direct cleavage of mRNA or indirectly via production of secondary phased siRNAs (phasiRNAs) that silence cognate target transcripts in trans. Here, we have identified cassava (Manihot esculenta Crantz) miRNAs using high resolution sequencing of sRNA libraries from leaf, stem, callus, male and female flower tissues. To analyze the data, we built a cassava genome database and, via sequence analysis and secondary structure prediction, 38 miRNAs not previously reported in cassava were identified. These new cassava miRNAs included two miRNAs not previously reported in any plant species. The miRNAs exhibited tissue-differential accumulation as confirmed by quantitative RT-PCR and Northern blot analysis, largely reflecting levels observed in sequencing data. Some of the miRNAs identified were predicted to trigger production of secondary phased siRNAs (phasiRNAs) from 80 PHAS loci. Cassava is a woody perennial shrub, grown principally for its starch-rich storage roots, which are rich in calories. In this study, new miRNAs were identified and their expression was validated using qRT-PCR of RNA from five different tissues. The data obtained expand the list of annotated miRNAs and provide additional new resources for cassava improvement research.
Occurrence of ascaridoid nematodes in selected edible fish from the Persian Gulf and description of Hysterothylacium larval type XV and Hysterothylacium persicum n. sp. (Nematoda: Raphidascarididae).

PubMed

Shamsi, Shokoofeh; Ghadam, Masoumeh; Suthar, Jaydipbhai; Ebrahimzadeh Mousavi, Hoseinali; Soltani, Mehdi; Mirzargar, Saeed

2016-11-07

Despite several reports on the presence of the potentially zoonotic nematodes among edible fishes in the Persian Gulf, there is still no study on the specific identification of these parasites or their genetic characterisation. In the present study, a total of 600 fish belonging to five popular species of fish in the region, including Otolithes ruber, Psettodes erumei, Saurida tumbil, Scomberomorus commerson and Sphyraena jello were examined for infection with nematode parasites. Detailed microscopy of nematodes found in the present study followed by characterisation of the first and second internal transcribed spacers (ITS-1 and ITS-2, respectively) showed that they belong to five distinct taxa that could be potentially zoonotic. Anisakis type I was found in four species of fish, had identical ITS sequences as Anisakis typica previously reported in Australian waters and was different from those reported in the Nearctic. Hysterothylacium type VI in the present study was morphologically similar to those previously described from Australasian waters and ITS sequences were identical among Australian specimens and those found in the present study. Another Hysterothylacium larval type was also found in the present study which had identical ITS sequences and similar morphology to those previously reported and identified as H. amoyense in China Sea. Since no ITS sequence data from a well identified adult H. amoyense with an identifiable museum voucher number is yet available and due to some other issues discussed in the article we suggest assignment of this larval type from the China Sea and the Persian Gulf to H. amoyense is doubtful until future studies on a well identified male specimen of H. amoyense or other species reveals the specific identity of this larval type. We propose to refer to this larval type as Hysterothylacium larval type XV. 
In the present study we also describe a new species, Hysterothylacium persicum, and discuss how to differentiate it from closely related species. We also found some adult females with distinct morphology and ITS sequence but due to lack of male specimens they have been referred to as Hysterothylacium sp. in this paper. They had the same ITS sequence data as Hysterothylacium larval type VI. This study shows the presence of a relatively broad diversity of potentially zoonotic nematodes in edible fish of the Persian Gulf. Therefore, educational campaigns for public and local health practitioners are suggested to protect consumers from becoming infected with these parasites. Copyright © 2016 Elsevier B.V. All rights reserved.
Characterization of the Campylobacter jejuni cryptic plasmid pTIW94 recovered from wild birds in the southeastern United States.

PubMed

Hiett, Kelli L; Rothrock, Michael J; Seal, Bruce S

2013-09-01

The complete nucleotide sequence was determined for a cryptic plasmid, pTIW94, recovered from several Campylobacter jejuni isolates from wild birds in the southeastern United States. pTIW94 is a circular molecule of 3860 nucleotides, with a G+C content (31.0%) similar to that of many Campylobacter spp. genomes. A typical origin of replication, with iteron sequences, was identified upstream of DNA sequences that demonstrated similarity to replication initiation proteins. A total of five open reading frames (ORFs) were identified; two of the five ORFs demonstrated significant similarity to plasmid pCC2228-2 found within Campylobacter coli. These two ORFs were similar to essential replication proteins RepA (100%; 26/26 aa identity) and RepB (95%; 327/346 aa identity). A third identified ORF demonstrated significant similarity (99%; 421/424 aa identity) to the MOB protein from C. coli 67-8, originally recovered from swine. The other two identified ORFs were either similar to hypothetical proteins from other Campylobacter spp., or exhibited no significant similarity to any DNA or protein sequence in the GenBank database. Promoter regions (-35 and -10 signal sites), ribosomal binding sites upstream of ORFs, and stem-loop structures were also identified within the plasmid. These results demonstrate that pTIW94 represents a previously un-reported small cryptic plasmid with unique sequences as well as highly similar sequences to other small plasmids found within Campylobacter spp., and that this cryptic plasmid is present among Campylobacter spp. recovered from different genera of wild birds. Copyright © 2013. Published by Elsevier Inc.
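The G+C content and the percent identities quoted in the record above (for example 327/346 aa reported as 95%) are simple ratios. The following is a minimal sketch of how such figures can be computed; the function names and the toy sequence are illustrative assumptions and are not taken from the study.

```python
def gc_content(seq: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def percent_identity(matches: int, aligned_length: int) -> float:
    """Return identity as a percentage of the aligned length."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return 100.0 * matches / aligned_length


# Hypothetical toy sequence, NOT the pTIW94 plasmid sequence.
toy_sequence = "ATGCATTTAAGC"
toy_gc = gc_content(toy_sequence)

# Identity counts as quoted in the abstract (matches / aligned residues).
repb_identity = percent_identity(327, 346)
mob_identity = percent_identity(421, 424)
```

Rounding `repb_identity` and `mob_identity` to whole numbers gives the 95% and 99% values stated in the abstract.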
SNPs in putative regulatory regions identified by human mouse comparative sequencing and transcription factor binding site data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Banerjee, Poulabi; Bahlo, Melanie; Schwartz, Jody R.

2002-01-01

Genome wide disease association analysis using SNPs is being explored as a method for dissecting complex genetic traits and a vast number of SNPs have been generated for this purpose. As there are cost and throughput limitations of genotyping large numbers of SNPs and statistical issues regarding the large number of dependent tests on the same data set, to make association analysis practical it has been proposed that SNPs should be prioritized based on likely functional importance. The most easily identifiable functional SNPs are coding SNPs (cSNPs) and accordingly cSNPs have been screened in a number of studies. SNPs in gene regulatory sequences embedded in noncoding DNA are another class of SNPs suggested for prioritization due to their predicted quantitative impact on gene expression. The main challenge in evaluating these SNPs, in contrast to cSNPs is a lack of robust algorithms and databases for recognizing regulatory sequences in noncoding DNA. Approaches that have been previously used to delineate noncoding sequences with gene regulatory activity include cross-species sequence comparisons and the search for sequences recognized by transcription factors. We combined these two methods to sift through mouse human genomic sequences to identify putative gene regulatory elements and subsequently localized SNPs within these sequences in a 1 Megabase (Mb) region of human chromosome 5q31, orthologous to mouse chromosome 11 containing the Interleukin cluster.
Design of the Illumina Porcine 50K+ SNP Iselect(TM) Beadchip and Characterization of the Porcine HapMap Population

USDA-ARS's Scientific Manuscript database

Using next generation sequencing technology the International Swine SNP Consortium has identified 500,000 SNPs and used these to design an Illumina Infinium iSelect™ SNP BeadChip with a selection of 60,218 SNPs. The selected SNPs include previously validated SNPs and SNPs identified de novo using se...
Identification of novel mutations in the α-galactosidase A gene in patients with Fabry disease: pitfalls of mutation analyses in patients with low α-galactosidase A activity.

PubMed

Yoshimitsu, Makoto; Higuchi, Koji; Miyata, Masaaki; Devine, Sean; Mattman, Andre; Sirrs, Sandra; Medin, Jeffrey A; Tei, Chuwa; Takenaka, Toshihiro

2011-05-01

Fabry disease is an X-linked lysosomal storage disorder caused by mutations of the α-galactosidase A (GLA) gene, and the disease is a relatively prevalent cause of left ventricular hypertrophy followed by conduction abnormalities and arrhythmias. Mutation analysis of the GLA gene is a valuable tool for accurate diagnosis of affected families. In this study, we carried out molecular studies of 10 unrelated families diagnosed with Fabry disease. Genetic analysis of the GLA gene using conventional genomic sequencing was performed in 9 hemizygous males and 6 heterozygous females. In patients with no mutations in coding DNA sequence, multiplex ligation-dependent probe amplification (MLPA) and/or cDNA sequencing were performed. We identified a novel exon 2 deletion (IVS1_IVS2) in a heterozygous female by MLPA, which was undetectable by conventional sequencing methods. In addition, the g.9331G>A mutation that has previously been found only in patients with cardiac Fabry disease was found in 3 unrelated, newly-diagnosed, cardiac Fabry patients by sequencing GLA genomic DNA and cDNA. Two other novel mutations, g.8319A>G and 832delA were also found in addition to 4 previously reported mutations (R112C, C142Y, M296I, and G373D) in 6 other families. We could identify GLA gene mutations in all hemizygotes and heterozygotes from 10 families with Fabry disease. Mutations in 4 out of 10 families could not be identified by classical genomic analysis, which focuses on exons and the flanking region. Instead, these data suggest that MLPA analysis and cDNA sequence should be considered in genetic testing surveys of patients with Fabry disease. Copyright © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
Molecular analysis of the anaerobic rumen fungus Orpinomyces - insights into an AT-rich genome.

PubMed

Nicholson, Matthew J; Theodorou, Michael K; Brookman, Jayne L

2005-01-01

The anaerobic gut fungi occupy a unique niche in the intestinal tract of large herbivorous animals and are thought to act as primary colonizers of plant material during digestion. They are the only known obligately anaerobic fungi but molecular analysis of this group has been hampered by difficulties in their culture and manipulation, and by their extremely high A+T nucleotide content. This study begins to answer some of the fundamental questions about the structure and organization of the anaerobic gut fungal genome. Directed plasmid libraries using genomic DNA digested with highly or moderately rich AT-specific restriction enzymes (VspI and EcoRI) were prepared from a polycentric Orpinomyces isolate. Clones were sequenced from these libraries and the breadth of genomic inserts, both genic and intergenic, was characterized. Genes encoding numerous functions not previously characterized for these fungi were identified, including cytoskeletal, secretory pathway and transporter genes. A peptidase gene with no introns and having sequence similarity to a gene encoding a bacterial peptidase was also identified, extending the range of metabolic enzymes resulting from apparent trans-kingdom transfer from bacteria to fungi, as previously characterized largely for genes encoding plant-degrading enzymes. This paper presents the first thorough analysis of the genic, intergenic and rDNA regions of a variety of genomic segments from an anaerobic gut fungus and provides observations on rules governing intron boundaries, the codon biases observed with different types of genes, and the sequence of only the second anaerobic gut fungal promoter reported. Large numbers of retrotransposon sequences of different types were found and the authors speculate on the possible consequences of any such transposon activity in the genome. 
The coding sequences identified included several orphan gene sequences, including one with regions strongly suggestive of structural proteins such as collagens and lampirin. This gene was present as a single copy in Orpinomyces, was expressed during vegetative growth and was also detected in genomes from another gut fungal genus, Neocallimastix.
Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life

PubMed Central

Sheik, Cody S.; Reese, Brandi Kiel; Twing, Katrina I.; Sylvan, Jason B.; Grim, Sharon L.; Schrenk, Matthew O.; Sogin, Mitchell L.; Colwell, Frederick S.

2018-01-01

Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter. While the top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process. 
Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset were identified as contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be taken at every step of the collection and processing procedure when working with low biomass environments such as, but not limited to, portions of Earth’s deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination derived sequences must be taken prior to using this dataset. PMID:29780369
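The filtering step described in the record above (flagging sequences assigned to previously identified contaminant genera and reporting the fraction removed) can be sketched as follows. This is a minimal illustration under stated assumptions: the genus list is taken from the abstract, while the function name, the sample genera `GenusA` and `GenusB`, and all counts are hypothetical and do not come from the CoDL dataset.

```python
# Putative contaminant genera named in the abstract.
CONTAMINANT_GENERA = {
    "Propionibacterium",
    "Aquabacterium",
    "Ralstonia",
    "Acinetobacter",
    "Pseudomonas",
    "Sphingomonas",
}


def remove_contaminants(counts, contaminants=CONTAMINANT_GENERA):
    """Drop contaminant genera from a {genus: read_count} table.

    Returns the filtered table and the fraction of reads removed.
    """
    kept = {genus: n for genus, n in counts.items() if genus not in contaminants}
    total = sum(counts.values())
    removed = total - sum(kept.values())
    fraction_removed = removed / total if total else 0.0
    return kept, fraction_removed


# Hypothetical per-genus read counts for a single sample.
toy_counts = {"Pseudomonas": 100, "Ralstonia": 50, "GenusA": 600, "GenusB": 250}
kept, fraction_removed = remove_contaminants(toy_counts)
```

Filtering by genus name alone is a simplification; a genus on the list may also be indigenous to a given environment, so negative controls such as extraction blanks remain necessary, as the abstract stresses.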
Identification and Removal of Contaminant Sequences From Ribosomal Gene Databases: Lessons From the Census of Deep Life.

PubMed

Sheik, Cody S; Reese, Brandi Kiel; Twing, Katrina I; Sylvan, Jason B; Grim, Sharon L; Schrenk, Matthew O; Sogin, Mitchell L; Colwell, Frederick S

2018-01-01

Earth's subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter. While the top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process.
Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset were identified as contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be taken at every step of the collection and processing procedure when working with low biomass environments such as, but not limited to, portions of Earth's deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination derived sequences must be taken prior to using this dataset.
A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor. | Office of Cancer Genomics

Cancer.gov

We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A.
SNP-based real-time pyrosequencing as a sensitive and specific tool for identification and differentiation of Rickettsia species in Ixodes ricinus ticks.

PubMed

Janecek, Elisabeth; Streichan, Sabine; Strube, Christina

2012-10-18

Rickettsioses are caused by pathogenic species of the genus Rickettsia and play an important role as emerging diseases. The bacteria are transmitted to mammal hosts including humans by arthropod vectors. Since detection, especially in tick vectors, is usually based on PCR with genus-specific primers to include different occurring Rickettsia species, subsequent species identification is mainly achieved by Sanger sequencing. In the present study a real-time pyrosequencing approach was established with the objective to differentiate between species occurring in German Ixodes ticks, which are R. helvetica, R. monacensis, R. massiliae, and R. felis. Tick material from a quantitative real-time PCR (qPCR) based study on Rickettsia-infections in I. ricinus allowed direct comparison of both sequencing techniques, Sanger and real-time pyrosequencing. A sequence stretch of rickettsial citrate synthase (gltA) gene was identified to contain divergent single nucleotide polymorphism (SNP) sites suitable for Rickettsia species differentiation. Positive control plasmids inserting the respective target sequence of each Rickettsia species of interest were constructed for initial establishment of the real-time pyrosequencing approach using Qiagen's PSQ 96MA Pyrosequencing System operating in a 96-well format. The approach included an initial amplification reaction followed by the actual pyrosequencing, which is traceable by pyrograms in real-time. Afterwards, real-time pyrosequencing was applied to 263 Ixodes tick samples already detected Rickettsia-positive in previous qPCR experiments. Establishment of real-time pyrosequencing using positive control plasmids resulted in accurate detection of all SNPs in all included Rickettsia species. The method was then applied to 263 Rickettsia-positive Ixodes ricinus samples, of which 153 (58.2%) could be identified for their species (151 R. helvetica and 2 R. monacensis) by previous custom Sanger sequencing. 
Real-time pyrosequencing identified all Sanger-determined ticks as well as 35 previously undifferentiated ticks resulting in a total number of 188 (71.5%) identified samples. Pyrosequencing sensitivity was found to be strongly dependent on gltA copy numbers in the reaction setup. Whereas less than 10^1 copies in the initial amplification reaction resulted in identification of 15.1% of the samples only, the percentage increased to 54.2% at 10^1-10^2 copies, to 95.6% at >10^2-10^3 copies and reached 100% samples identified for their Rickettsia species if more than 10^3 copies were present in the template. The established real-time pyrosequencing approach represents a reliable method for detection and differentiation of Rickettsia spp. present in I. ricinus diagnostic material and prevalence studies. Furthermore, the method proved to be faster, more cost-effective as well as more sensitive than custom Sanger sequencing with simultaneous high specificity.
Whole genome re-sequencing identifies a mutation in an ABC transporter (mdr2) in a Plasmodium chabaudi clone with altered susceptibility to antifolate drugs

PubMed Central

Martinelli, Axel; Henriques, Gisela; Cravo, Pedro; Hunt, Paul

2011-01-01

In malaria parasites, mutations in two genes of folate biosynthesis encoding dihydrofolate reductase (dhfr) and dihydropteroate synthase (dhps) modify responses to antifolate therapies which target these enzymes. However, the involvement of other genes which modify the availability of exogenous folate, for example, has been proposed. Here, we used short-read whole-genome re-sequencing to determine the mutations in a clone of the rodent malaria parasite, Plasmodium chabaudi, which has altered susceptibility to both sulphadoxine and pyrimethamine. This clone bears a previously identified S106N mutation in dhfr and no mutation in dhps. Instead, three additional point mutations in genes on chromosomes 2, 13 and 14 were identified. The mutated gene on chromosome 13 (mdr2 K392Q) encodes an ABC transporter. Because Quantitative Trait Locus analysis previously indicated an association of genetic markers on chromosome 13 with responses to individual and combined antifolates, MDR2 is proposed to modulate antifolate responses, possibly mediated by the transport of folate intermediates. PMID:20858498
Strain/species identification in metagenomes using genome-specific markers

PubMed Central

Tu, Qichao; He, Zhili; Zhou, Jizhong

2014-01-01

Shotgun metagenome sequencing has become a fast, cheap and high-throughput technology for characterizing microbial communities in complex environments and human body sites. However, accurate identification of microorganisms at the strain/species level remains extremely challenging. We present a novel k-mer-based approach, termed GSMer, that identifies genome-specific markers (GSMs) from currently sequenced microbial genomes, which were then used for strain/species-level identification in metagenomes. Using 5390 sequenced microbial genomes, 8,770,321 50-mer strain-specific and 11,736,360 species-specific GSMs were identified for 4088 strains and 2005 species (4933 strains), respectively. The GSMs were first evaluated against mock community metagenomes, recently sequenced genomes and real metagenomes from different body sites, suggesting that the identified GSMs were specific to their targeting genomes. Sensitivity evaluation against synthetic metagenomes with different coverage suggested that 50 GSMs per strain were sufficient to identify most microbial strains with ≥0.25× coverage, and 10% of selected GSMs in a database should be detected for confident positive callings. Application of GSMs identified 45 and 74 microbial strains/species significantly associated with type 2 diabetes patients and obese/lean individuals from corresponding gastrointestinal tract metagenomes, respectively. Our result agreed with previous studies but provided strain-level information. The approach can be directly applied to identify microbial strains/species from raw metagenomes, without the effort of complex data pre-processing. PMID:24523352
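The core idea in the record above, k-mers that occur in exactly one genome of a reference collection, can be illustrated with a short sketch. This is not the GSMer implementation: the function names, the toy genomes and the small k are assumptions chosen for readability, and reverse complements and any further filtering that a real pipeline would need are omitted.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all k-mers (substrings of length k) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def genome_specific_kmers(genomes, k):
    """Map each genome name to the k-mers found in that genome only."""
    owners = defaultdict(set)  # k-mer -> names of genomes containing it
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            owners[kmer].add(name)

    specific = {name: set() for name in genomes}
    for kmer, names in owners.items():
        if len(names) == 1:  # present in exactly one genome
            (name,) = names
            specific[name].add(kmer)
    return specific


# Hypothetical toy genomes differing at a single position.
toy_genomes = {"strain_A": "ACGTACGGA", "strain_B": "ACGTTCGGA"}
markers = genome_specific_kmers(toy_genomes, k=4)
```

With the toy input, the 4-mers `ACGT` and `CGGA` are shared by both strains and are therefore excluded, while every 4-mer spanning the differing base is kept as a marker for its strain. The abstract uses 50-mers; k=4 is used here only to keep the example small.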
A Public Health Model for the Molecular Surveillance of HIV Transmission in San Diego, California

PubMed Central

May, Susanne; Tweeten, Samantha; Drumright, Lydia; Pacold, Mary E.; Kosakovsky Pond, Sergei L.; Pesano, Rick L.; Lie, Yolanda S.; Richman, Douglas D.; Frost, Simon D.W.; Woelk, Christopher H.; Little, Susan J.

2009-01-01

Background. Current public health efforts often use molecular technologies to identify and contain communicable disease networks, but not for HIV. Here, we investigate how molecular epidemiology can be used to identify highly-related HIV networks within a population and how voluntary contact tracing of sexual partners can be used to selectively target these networks. Methods. We evaluated the use of HIV-1 pol sequences obtained from participants of a community-recruited cohort (n=268) and a primary infection research cohort (n=369) to define highly related transmission clusters and the use of contact tracing to link other individuals (n=36) within these clusters. The presence of transmitted drug resistance was interpreted from the pol sequences (Calibrated Population Resistance v3.0). Results. Phylogenetic clustering was conservatively defined when the genetic distance between any two pol sequences was <1%, which identified 34 distinct transmission clusters within the combined community-recruited and primary infection research cohorts containing 160 individuals. Although sequences from the epidemiologically-linked partners represented approximately 5% of the total sequences, they clustered with 60% of the sequences that clustered from the combined cohorts (O.R. 21.7; p<0.01). Major resistance to at least one class of antiretroviral medication was found in 19% of clustering sequences. Conclusions. Phylogenetic methods can be used to identify individuals who are within highly related transmission groups, and contact tracing of epidemiologically-linked partners of recently infected individuals can be used to link into previously-defined transmission groups. These methods could be used to implement selectively targeted prevention interventions. PMID:19098493
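The clustering rule in the record above (two pol sequences are linked when their genetic distance is below 1%) can be sketched as a threshold on pairwise distances followed by grouping of linked sequences. The sketch below rests on assumptions not stated in the abstract: it uses the uncorrected p-distance (fraction of mismatched sites in aligned sequences) rather than a model-based distance, groups sequences by single linkage, and uses invented toy sequences and names.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def transmission_clusters(seqs, threshold=0.01):
    """Group sequences linked by pairwise distance below threshold.

    Returns only groups with at least two members.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical aligned toy sequences of length 200.
base = "ACGT" * 50
toy_seqs = {
    "p1": base,
    "p2": "T" + base[1:],        # 1 mismatch vs p1: distance 0.005
    "p3": "T" * 10 + base[10:],  # 8 mismatches vs p1: distance 0.04
}
found = transmission_clusters(toy_seqs)
```

Here `p1` and `p2` fall below the 1% threshold and form one cluster, while `p3` remains unclustered.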
Recall is not necessary for verbal sequence learning.

PubMed

Kalm, Kristjan; Norris, Dennis

2016-01-01

The question of whether overt recall of to-be-remembered material accelerates learning is important in a wide range of real-world learning settings. In the case of verbal sequence learning, previous research has proposed that recall either is necessary for verbal sequence learning (Cohen & Johansson Journal of Verbal Learning and Verbal Behavior, 6, 139-143, 1967; Cunningham, Healy, & Williams Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 575-597, 1984), or at least contributes significantly to it (Glass, Krejci, & Goldman Journal of Memory and Language, 28, 189-199, 1989; Oberauer & Meyer Memory, 17, 774-781, 2009). In contrast, here we show that the amount of previous spoken recall does not predict learning and is not necessary for it. We suggest that previous research may have underestimated participants' learning by using suboptimal performance measures, or by using manual or written recall. However, we show that the amount of spoken recall predicted how much interference from other to-be-remembered sequences would be observed. In fact, spoken recall mediated most of the error learning observed in the task. Our data support the view that the learning of overlapping auditory-verbal sequences is driven by learning the phonological representations and not the articulatory motor responses. However, spoken recall seems to reinforce already learned representations, whether they are correct or incorrect, thus contributing to a participant identifying a specific stimulus as either "learned" or "new" during the presentation phase.
Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

PubMed Central

Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

1985-01-01

The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A, T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

More Easily Cultivated Than Identified: Classical Isolation With Molecular Identification of Vaginal Bacteria

PubMed Central

Srinivasan, Sujatha; Munch, Matthew M.; Sizova, Maria V.; Fiedler, Tina L.; Kohler, Christina M.; Hoffman, Noah G.; Liu, Congzhou; Agnew, Kathy J.; Marrazzo, Jeanne M.; Epstein, Slava S.; Fredricks, David N.

2016-01-01

Background. Women with bacterial vaginosis (BV) have complex communities of anaerobic bacteria. There are no cultivated isolates of several bacteria identified using molecular methods and associated with BV. It is unclear whether this is due to the inability to adequately propagate these bacteria or to correctly identify them in culture. Methods. Vaginal fluid from 15 women was plated on 6 different media using classical cultivation approaches. Individual isolates were identified by 16S ribosomal RNA (rRNA) gene sequencing and compared with validly described species. Bacterial community profiles in vaginal samples were determined using broad-range 16S rRNA gene polymerase chain reaction and pyrosequencing. Results. We isolated and identified 101 distinct bacterial strains spanning 6 phyla including (1) novel strains with <98% 16S rRNA sequence identity to validly described species, (2) closely related species within a genus, (3) bacteria previously isolated from body sites other than the vagina, and (4) known bacteria formerly isolated from the vagina. Pyrosequencing showed that novel strains Peptoniphilaceae DNF01163 and Prevotellaceae DNF00733 were prevalent in women with BV. Conclusions. We isolated a diverse set of novel and clinically significant anaerobes from the human vagina using conventional approaches with systematic molecular identification. Several previously “uncultivated” bacteria are amenable to conventional cultivation. PMID:27449870
Whole-Exome Sequencing Identifies Rare and Low-Frequency Coding Variants Associated with LDL Cholesterol

PubMed Central

Lange, Leslie A.; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M.; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M.; Smith, Joshua D.; Turner, Emily H.; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A.; Holmen, Oddgeir L.; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A.; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C.; Correa, Adolfo; Griswold, Michael E.; Jakobsdottir, Johanna; Smith, Albert V.; Schreiner, Pamela J.; Feitosa, Mary F.; Zhang, Qunyuan; Huffman, Jennifer E.; Crosby, Jacy; Wassel, Christina L.; Do, Ron; Franceschini, Nora; Martin, Lisa W.; Robinson, Jennifer G.; Assimes, Themistocles L.; Crosslin, David R.; Rosenthal, Elisabeth A.; Tsai, Michael; Rieder, Mark J.; Farlow, Deborah N.; Folsom, Aaron R.; Lumley, Thomas; Fox, Ervin R.; Carlson, Christopher S.; Peters, Ulrike; Jackson, Rebecca D.; van Duijn, Cornelia M.; Uitterlinden, André G.; Levy, Daniel; Rotter, Jerome I.; Taylor, Herman A.; Gudnason, Vilmundur; Siscovick, David S.; Fornage, Myriam; Borecki, Ingrid B.; Hayward, Caroline; Rudan, Igor; Chen, Y. Eugene; Bottinger, Erwin P.; Loos, Ruth J.F.; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M.; Gabriel, Stacey B.; O’Donnell, Christopher J.; Post, Wendy S.; North, Kari E.; Reiner, Alexander P.; Boerwinkle, Eric; Psaty, Bruce M.; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P.; Cupples, L. 
Adrienne; Kooperberg, Charles; Wilson, James G.; Nickerson, Deborah A.; Abecasis, Goncalo R.; Rich, Stephen S.; Tracy, Russell P.; Willer, Cristen J.; Gabriel, Stacey B.; Altshuler, David M.; Abecasis, Gonçalo R.; Allayee, Hooman; Cresci, Sharon; Daly, Mark J.; de Bakker, Paul I.W.; DePristo, Mark A.; Do, Ron; Donnelly, Peter; Farlow, Deborah N.; Fennell, Tim; Garimella, Kiran; Hazen, Stanley L.; Hu, Youna; Jordan, Daniel M.; Jun, Goo; Kathiresan, Sekar; Kang, Hyun Min; Kiezun, Adam; Lettre, Guillaume; Li, Bingshan; Li, Mingyao; Newton-Cheh, Christopher H.; Padmanabhan, Sandosh; Peloso, Gina; Pulit, Sara; Rader, Daniel J.; Reich, David; Reilly, Muredach P.; Rivas, Manuel A.; Schwartz, Steve; Scott, Laura; Siscovick, David S.; Spertus, John A.; Stitziel, Nathaniel O.; Stoletzki, Nina; Sunyaev, Shamil R.; Voight, Benjamin F.; Willer, Cristen J.; Rich, Stephen S.; Akylbekova, Ermeg; Atwood, Larry D.; Ballantyne, Christie M.; Barbalic, Maja; Barr, R. Graham; Benjamin, Emelia J.; Bis, Joshua; Boerwinkle, Eric; Bowden, Donald W.; Brody, Jennifer; Budoff, Matthew; Burke, Greg; Buxbaum, Sarah; Carr, Jeff; Chen, Donna T.; Chen, Ida Y.; Chen, Wei-Min; Concannon, Pat; Crosby, Jacy; Cupples, L. Adrienne; D’Agostino, Ralph; DeStefano, Anita L.; Dreisbach, Albert; Dupuis, Josée; Durda, J. 
Peter; Ellis, Jaclyn; Folsom, Aaron R.; Fornage, Myriam; Fox, Caroline S.; Fox, Ervin; Funari, Vincent; Ganesh, Santhi K.; Gardin, Julius; Goff, David; Gordon, Ora; Grody, Wayne; Gross, Myron; Guo, Xiuqing; Hall, Ira M.; Heard-Costa, Nancy L.; Heckbert, Susan R.; Heintz, Nicholas; Herrington, David M.; Hickson, DeMarc; Huang, Jie; Hwang, Shih-Jen; Jacobs, David R.; Jenny, Nancy S.; Johnson, Andrew D.; Johnson, Craig W.; Kawut, Steven; Kronmal, Richard; Kurz, Raluca; Lange, Ethan M.; Lange, Leslie A.; Larson, Martin G.; Lawson, Mark; Lewis, Cora E.; Levy, Daniel; Li, Dalin; Lin, Honghuang; Liu, Chunyu; Liu, Jiankang; Liu, Kiang; Liu, Xiaoming; Liu, Yongmei; Longstreth, William T.; Loria, Cay; Lumley, Thomas; Lunetta, Kathryn; Mackey, Aaron J.; Mackey, Rachel; Manichaikul, Ani; Maxwell, Taylor; McKnight, Barbara; Meigs, James B.; Morrison, Alanna C.; Musani, Solomon K.; Mychaleckyj, Josyf C.; Nettleton, Jennifer A.; North, Kari; O’Donnell, Christopher J.; O’Leary, Daniel; Ong, Frank; Palmas, Walter; Pankow, James S.; Pankratz, Nathan D.; Paul, Shom; Perez, Marco; Person, Sharina D.; Polak, Joseph; Post, Wendy S.; Psaty, Bruce M.; Quinlan, Aaron R.; Raffel, Leslie J.; Ramachandran, Vasan S.; Reiner, Alexander P.; Rice, Kenneth; Rotter, Jerome I.; Sanders, Jill P.; Schreiner, Pamela; Seshadri, Sudha; Shea, Steve; Sidney, Stephen; Silverstein, Kevin; Smith, Nicholas L.; Sotoodehnia, Nona; Srinivasan, Asoke; Taylor, Herman A.; Taylor, Kent; Thomas, Fridtjof; Tracy, Russell P.; Tsai, Michael Y.; Volcik, Kelly A.; Wassel, Chrstina L.; Watson, Karol; Wei, Gina; White, Wendy; Wiggins, Kerri L.; Wilk, Jemma B.; Williams, O. 
Dale; Wilson, Gregory; Wilson, James G.; Wolf, Phillip; Zakai, Neil A.; Hardy, John; Meschia, James F.; Nalls, Michael; Singleton, Andrew; Worrall, Brad; Bamshad, Michael J.; Barnes, Kathleen C.; Abdulhamid, Ibrahim; Accurso, Frank; Anbar, Ran; Beaty, Terri; Bigham, Abigail; Black, Phillip; Bleecker, Eugene; Buckingham, Kati; Cairns, Anne Marie; Caplan, Daniel; Chatfield, Barbara; Chidekel, Aaron; Cho, Michael; Christiani, David C.; Crapo, James D.; Crouch, Julia; Daley, Denise; Dang, Anthony; Dang, Hong; De Paula, Alicia; DeCelie-Germana, Joan; Drumm, Allen DozorMitch; Dyson, Maynard; Emerson, Julia; Emond, Mary J.; Ferkol, Thomas; Fink, Robert; Foster, Cassandra; Froh, Deborah; Gao, Li; Gershan, William; Gibson, Ronald L.; Godwin, Elizabeth; Gondor, Magdalen; Gutierrez, Hector; Hansel, Nadia N.; Hassoun, Paul M.; Hiatt, Peter; Hokanson, John E.; Howenstine, Michelle; Hummer, Laura K.; Kanga, Jamshed; Kim, Yoonhee; Knowles, Michael R.; Konstan, Michael; Lahiri, Thomas; Laird, Nan; Lange, Christoph; Lin, Lin; Lin, Xihong; Louie, Tin L.; Lynch, David; Make, Barry; Martin, Thomas R.; Mathai, Steve C.; Mathias, Rasika A.; McNamara, John; McNamara, Sharon; Meyers, Deborah; Millard, Susan; Mogayzel, Peter; Moss, Richard; Murray, Tanda; Nielson, Dennis; Noyes, Blakeslee; O’Neal, Wanda; Orenstein, David; O’Sullivan, Brian; Pace, Rhonda; Pare, Peter; Parker, H. 
Worth; Passero, Mary Ann; Perkett, Elizabeth; Prestridge, Adrienne; Rafaels, Nicholas M.; Ramsey, Bonnie; Regan, Elizabeth; Ren, Clement; Retsch-Bogart, George; Rock, Michael; Rosen, Antony; Rosenfeld, Margaret; Ruczinski, Ingo; Sanford, Andrew; Schaeffer, David; Sell, Cindy; Sheehan, Daniel; Silverman, Edwin K.; Sin, Don; Spencer, Terry; Stonebraker, Jackie; Tabor, Holly K.; Varlotta, Laurie; Vergara, Candelaria I.; Weiss, Robert; Wigley, Fred; Wise, Robert A.; Wright, Fred A.; Wurfel, Mark M.; Zanni, Robert; Zou, Fei; Nickerson, Deborah A.; Rieder, Mark J.; Green, Phil; Shendure, Jay; Akey, Joshua M.; Bustamante, Carlos D.; Crosslin, David R.; Eichler, Evan E.; Fox, P. Keolu; Fu, Wenqing; Gordon, Adam; Gravel, Simon; Jarvik, Gail P.; Johnsen, Jill M.; Kan, Mengyuan; Kenny, Eimear E.; Kidd, Jeffrey M.; Lara-Garduno, Fremiet; Leal, Suzanne M.; Liu, Dajiang J.; McGee, Sean; O’Connor, Timothy D.; Paeper, Bryan; Robertson, Peggy D.; Smith, Joshua D.; Staples, Jeffrey C.; Tennessen, Jacob A.; Turner, Emily H.; Wang, Gao; Yi, Qian; Jackson, Rebecca; Peters, Ulrike; Carlson, Christopher S.; Anderson, Garnet; Anton-Culver, Hoda; Assimes, Themistocles L.; Auer, Paul L.; Beresford, Shirley; Bizon, Chris; Black, Henry; Brunner, Robert; Brzyski, Robert; Burwen, Dale; Caan, Bette; Carty, Cara L.; Chlebowski, Rowan; Cummings, Steven; Curb, J. 
David; Eaton, Charles B.; Ford, Leslie; Franceschini, Nora; Fullerton, Stephanie M.; Gass, Margery; Geller, Nancy; Heiss, Gerardo; Howard, Barbara V.; Hsu, Li; Hutter, Carolyn M.; Ioannidis, John; Jiao, Shuo; Johnson, Karen C.; Kooperberg, Charles; Kuller, Lewis; LaCroix, Andrea; Lakshminarayan, Kamakshi; Lane, Dorothy; Lasser, Norman; LeBlanc, Erin; Li, Kuo-Ping; Limacher, Marian; Lin, Dan-Yu; Logsdon, Benjamin A.; Ludlam, Shari; Manson, JoAnn E.; Margolis, Karen; Martin, Lisa; McGowan, Joan; Monda, Keri L.; Kotchen, Jane Morley; Nathan, Lauren; Ockene, Judith; O’Sullivan, Mary Jo; Phillips, Lawrence S.; Prentice, Ross L.; Robbins, John; Robinson, Jennifer G.; Rossouw, Jacques E.; Sangi-Haghpeykar, Haleh; Sarto, Gloria E.; Shumaker, Sally; Simon, Michael S.; Stefanick, Marcia L.; Stein, Evan; Tang, Hua; Taylor, Kira C.; Thomson, Cynthia A.; Thornton, Timothy A.; Van Horn, Linda; Vitolins, Mara; Wactawski-Wende, Jean; Wallace, Robert; Wassertheil-Smoller, Sylvia; Zeng, Donglin; Applebaum-Bowden, Deborah; Feolo, Michael; Gan, Weiniu; Paltoo, Dina N.; Sholinsky, Phyliss; Sturcke, Anne

2014-01-01

Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. PMID:24507775
Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

PubMed

Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J

2014-02-06

Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Real-Time PCR Typing of Escherichia coli Based on Multiple Single Nucleotide Polymorphisms--a Convenient and Rapid Method.

PubMed

Lager, Malin; Mernelius, Sara; Löfgren, Sture; Söderman, Jan

2016-01-01

Healthcare-associated infections caused by Escherichia coli and antibiotic resistance due to extended-spectrum beta-lactamase (ESBL) production constitute a threat to patient safety. To identify, track, and control outbreaks and to detect emerging virulent clones, typing tools of sufficient discriminatory power that generate reproducible and unambiguous data are needed. A probe-based real-time PCR method targeting multiple single nucleotide polymorphisms (SNPs) was developed. The method was based on the multilocus sequence typing scheme of the Institut Pasteur and on adaptation of previously described typing assays. An 8-SNP panel that reached a Simpson's diversity index of 0.95 was established, based on analysis of sporadic E. coli cases (ESBL n = 27 and non-ESBL n = 53). This multi-SNP assay was used to identify the sequence type 131 (ST131) complex according to Achtman's multilocus sequence typing scheme. However, it did not fully discriminate within the complex but provided a diagnostic signature that outperformed a previously described detection assay. Pulsed-field gel electrophoresis typing of isolates from a presumed outbreak (n = 22) identified two outbreaks (ST127 and ST131) and three different non-outbreak-related isolates. Multi-SNP typing generated congruent data except for one non-outbreak-related ST131 isolate. We consider multi-SNP real-time PCR typing an accessible primary generic E. coli typing tool for rapid and uniform type identification.
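
The discriminatory power quoted above is a Simpson's diversity index. The abstract does not say which form of the index was used; the sketch below assumes the Hunter-Gaston form commonly applied to typing schemes, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), and the isolate profiles in it are hypothetical, not data from the study.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Simpson's index of diversity (Hunter-Gaston form, assumed here).

    type_labels holds one typing result (e.g. a SNP profile) per isolate.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))


# Hypothetical SNP profiles for six isolates (illustration only).
profiles = ["A", "A", "B", "C", "C", "C"]
print(round(simpson_diversity(profiles), 3))
```

A value near 1 means two randomly chosen isolates almost always receive different types; a value of 0 means every isolate receives the same type.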
Two missense mutations in melanocortin 1 receptor (MC1R) are strongly associated with dark ventral coat color in reindeer (Rangifer tarandus).

PubMed

Våge, D I; Nieminen, M; Anderson, D G; Røed, K H

2014-10-01

The protein-coding region of melanocortin 1 receptor (MC1R) was sequenced to identify potential variation affecting coat color in reindeer (Rangifer tarandus). A T→C sequence variation at nucleotide position 218 (c.218T>C) causing an amino acid (aa) change from methionine to threonine at aa position 73 (p.Met73Thr) was identified. In addition, a T→G sequence variation was found at nucleotide position 839 (c.839T>G), causing phenylalanine to be replaced by cysteine at aa position 280 (p.Phe280Cys). The two sequence variants (c.218C and c.839G) were found to be closely associated with a darker belly coat compared with animals not having either of these two variants. The amino acid change p.Met73Thr affects the same position as p.Met73Lys, previously reported to give constitutive activation of MC1R in black sheep (Ovis aries), whereas p.Phe280Cys is identical to one of two variants previously reported to be associated with dark coat color in Arctic fox (Alopex lagopus), supporting the view that the two variants found in reindeer are functional. The complete absence of Thr73 and Cys280 among the 51 wild reindeer analyzed provides some evidence that these variants are more common in the domestic herds. © 2014 Stichting International Foundation for Animal Genetics.
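
The relationship between the coding positions and the amino acid positions reported above can be checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the codon table covers just the codons needed, and because the abstract does not give the reference Phe codon, both standard Phe codons (TTT and TTC) are tried.

```python
# Minimal codon lookup covering only the codons used in this example.
CODONS = {
    "ATG": "Met", "ACG": "Thr",
    "TTT": "Phe", "TTC": "Phe",
    "TGT": "Cys", "TGC": "Cys",
}


def locate(cdna_pos):
    """Return (amino acid position, 1-based base position within the codon)."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at pos_in_codon replaced by new_base."""
    bases = list(codon)
    bases[pos_in_codon - 1] = new_base
    return "".join(bases)


variants = [(218, "C", ["ATG"]), (839, "G", ["TTT", "TTC"])]
for cdna_pos, new_base, ref_codons in variants:
    aa_pos, offset = locate(cdna_pos)
    for ref in ref_codons:
        alt = substitute(ref, offset, new_base)
        print(f"c.{cdna_pos}: codon {aa_pos}, base {offset}: "
              f"{ref} ({CODONS[ref]}) -> {alt} ({CODONS[alt]})")
```

Both variants fall on the second base of their codon, which is consistent with the reported changes p.Met73Thr and p.Phe280Cys.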
Detection of signals in mRNAs that influence translation.

PubMed

Brown, Chris M; Jacobs, Grant; Stockwell, Peter; Schreiber, Mark

2003-01-01

Genome sequencing efforts mean that we now have extensive data from a wide range of organisms to study. Understanding the differing natures of the biology of these organisms is an important aim of genome analysis. We are interested in signals that affect translation of mRNAs. Some signals in the mRNA influence how efficiently it is translated into protein. Previous studies have indicated that many important signals are located around the initiation and termination codons. We have developed tools described here to extract the relevant sequence regions from GenBank. To create databases organised by species, or higher taxonomic groupings (e.g., planta), a program was developed to dynamically view and edit the taxonomy database. Data from relevant species were then extracted using our GenBank feature table parser. We analysed all available sequences, particularly those from complete genomes. Patterns were then identified using information theory. The software is available from http://transterm.otago.ac.nz. Patterns around the initiation codons for most of the organisms fall into two groups, containing the previously known Shine-Dalgarno and Kozak efficiency signals. However, we have identified several organisms that appear to utilise novel systems. Our analysis indicates that some organisms with extremely high GC% genomes do not have a strong dependence on base pairing ribosome binding sites, as the complementary sequence is absent from many genes.
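
The abstract says only that patterns were identified "using information theory". One common measure of this kind is the per-position information content of aligned sequence windows (as used in sequence logos). The sketch below assumes a uniform background over the four bases and applies no small-sample correction; it is not the authors' software, and the example windows are hypothetical.

```python
import math
from collections import Counter


def information_content(aligned_seqs):
    """Per-column information content in bits for equal-length DNA windows.

    Assumes a uniform background, so each column scores 2 - H, where H is
    the Shannon entropy of the observed base frequencies in that column.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("all windows must have the same length")
    bits = []
    for i in range(length):
        counts = Counter(s[i] for s in aligned_seqs)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        bits.append(2.0 - entropy)
    return bits


# Hypothetical windows ending at a start codon (illustration only).
windows = ["AGGAGGATG", "AGGAGCATG", "TGGAGAATG", "CGGAGTATG"]
for position, score in enumerate(information_content(windows), start=1):
    print(position, round(score, 2))
```

Columns that are identical in every window score 2 bits, and columns where all four bases are equally frequent score 0 bits, so conserved signals such as a ribosome binding site stand out as runs of high-scoring positions.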
A dehydration-inducible gene in the truffle Tuber borchii identifies a novel group of dehydrins

PubMed Central

Abba', Simona; Ghignone, Stefano; Bonfante, Paola

2006-01-01

Background: The expressed sequence tag M6G10 was originally isolated from a screening for differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii. mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the vegetative mycelial stage. Results: Bioinformatics tools, phylogenetic analysis and expression studies were used to support the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and previously uncharacterized conserved signature pattern was identified and proposed to the UniProt database as the main distinguishing feature of this new group of DHNs. Expression studies provide experimental evidence of a transcript induction of TbDHN1 during cellular dehydration. Conclusion: Expression pattern and sequence similarities to known plant DHNs indicate that TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 with homolog coding sequences implies the existence of a novel fungal/plant group of LEA Class II proteins characterized by a previously undescribed signature pattern. PMID:16512918
Random Amplification and Pyrosequencing for Identification of Novel Viral Genome Sequences

PubMed Central

Hang, Jun; Forshey, Brett M.; Kochel, Tadeusz J.; Li, Tao; Solórzano, Víctor Fiestas; Halsey, Eric S.; Kuschner, Robert A.

2012-01-01

ssRNA viruses have high levels of genomic divergence, which can lead to difficulty in genomic characterization of new viruses using traditional PCR amplification and sequencing methods. In this study, random reverse transcription, anchored random PCR amplification, and high-throughput pyrosequencing were used to identify orthobunyavirus sequences from total RNA extracted from viral cultures of acute febrile illness specimens. Draft genome sequence for the orthobunyavirus L segment was assembled and sequentially extended using de novo assembly contigs from pyrosequencing reads and orthobunyavirus sequences in GenBank as guidance. Accuracy and continuous coverage were achieved by mapping all reads to the L segment draft sequence. Subsequently, RT-PCR and Sanger sequencing were used to complete the genome sequence. The complete L segment was found to be 6936 bases in length, encoding a 2248-aa putative RNA polymerase. The identified L segment was distinct from previously published South American orthobunyaviruses, sharing 63% and 54% identity at the nucleotide and amino acid level, respectively, with the complete Oropouche virus L segment and 73% and 81% identity at the nucleotide and amino acid level, respectively, with a partial Caraparu virus L segment. The result demonstrated the effectiveness of a sequence-independent amplification and next-generation sequencing approach for obtaining complete viral genomes from total nucleic acid extracts and its use in pathogen discovery. PMID:22468136
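
Identity figures like those quoted above are computed from a pairwise alignment. Definitions of percent identity differ in what they use as the denominator, and the abstract does not state which one was applied; the sketch below assumes identical columns divided by all aligned columns (gap-only columns excluded). The function name and the toy alignment are hypothetical.

```python
def percent_identity(aln_a, aln_b):
    """Percent identity between two rows of a pairwise alignment.

    Gaps are written as '-'. Columns where both rows have a gap are
    ignored; a column with a gap in only one row counts as a mismatch.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy alignment (illustration only): 6 of 8 columns are identical.
print(percent_identity("ATGCCGTA", "ATGACG-A"))
```

The same function works on nucleotide or amino acid alignments, which is how a pair of segments can share different identity values at the two levels.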
Genetic characterization of L-Zagreb mumps vaccine strain.

PubMed

Ivancic, Jelena; Gulija, Tanja Kosutic; Forcic, Dubravko; Baricevic, Marijana; Jug, Renata; Mesko-Prejac, Majda; Mazuran, Renata

2005-04-01

Eleven mumps vaccine strains, all containing live attenuated virus, have been used throughout the world. Although L-Zagreb mumps vaccine has been licensed since 1972, only its partial nucleotide sequence was previously determined (accession numbers , and ). Therefore, we sequenced the entire genome of L-Zagreb vaccine strain (Institute of Immunology Inc., Zagreb, Croatia). In order to investigate the genetic stability of the vaccine, sequences of both L-Zagreb master seed and currently produced vaccine batch were determined and no difference between them was observed. A phylogenetic analysis based on SH gene sequence has shown that L-Zagreb strain does not belong to any of established mumps genotypes and that it is most similar to old, laboratory preserved European strains (1950s-1970s). L-Zagreb nucleotide and deduced protein sequences were compared with other mumps virus sequences obtained from the GenBank. Emphasis was put on functionally important protein regions and known antigenic epitopes. The extensive comparisons of nucleotide and deduced protein sequences between L-Zagreb vaccine strain and other previously determined mumps virus sequences have shown that while the functional regions of HN, V, and L proteins are well conserved among various mumps strains, there can be a substantial amino acid difference in antigenic epitopes of all proteins and in functional regions of F protein. No molecular pattern was identified that can be used as a distinction marker between virulent and attenuated strains.
Novel splice mutation in microthalmia-associated transcription factor in Waardenburg Syndrome.

PubMed

Brenner, Laura; Burke, Kelly; Leduc, Charles A; Guha, Saurav; Guo, Jiancheng; Chung, Wendy K

2011-01-01

Waardenburg Syndrome (WS) is a syndromic form of hearing loss associated with mutations in six different genes. We identified a large family with WS that had previously undergone clinical testing, with no reported pathogenic mutation. Using linkage analysis, a region on 3p14.1 with a LOD score of 6.6 was identified. Microphthalmia-Associated Transcription Factor, a gene known to cause WS, is located within this region of linkage. Sequencing of Microphthalmia-Associated Transcription Factor demonstrated a c.1212G>A synonymous variant that segregated with WS in the family and was predicted to create a novel splice site, which was confirmed by expression analysis of the mRNA. This case illustrates the need to computationally analyze novel synonymous sequence variants for possible effects on splicing to maximize the clinical sensitivity of sequence-based genetic testing.
Complete genome sequence of bacteriocin-producing Lactobacillus plantarum KLDS1.0391, a probiotic strain with gastrointestinal tract resistance and adhesion to the intestinal epithelial cells.

PubMed

Jia, Fang-Fang; Zhang, Lu-Ji; Pang, Xue-Hui; Gu, Xin-Xi; Abdelazez, Amro; Liang, Yu; Sun, Si-Rui; Meng, Xiang-Chen

2017-10-01

Lactobacillus plantarum KLDS1.0391 is a probiotic strain isolated from the traditional fermented dairy products and identified to produce bacteriocin against Gram-positive and Gram-negative bacteria. Previous studies showed that the strain has a high resistance to gastrointestinal stress and has a high adhesion ability to the intestinal epithelial cells (Caco-2). We reported the entire genome sequence of this strain, which contains a circular 2,886,607-bp chromosome and three circular plasmids. Genes, which are related to the biosynthesis of bacteriocins, the stress resistance to gastrointestinal tract environment and adhesive performance, were identified. Whole genome sequence of Lactobacillus plantarum KLDS1.0391 will be helpful for its applications in food industry. Copyright © 2017 Elsevier Inc. All rights reserved.
Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica

DOE Office of Scientific and Technical Information (OSTI.GOV)

Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar

2007-05-01

Recent fermentation studies have identified actinomycetes of the marine-dwelling genus Salinispora as prolific natural product producers. To further evaluate their biosynthetic potential, we analyzed all identifiable secondary natural product gene clusters from the recently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Our analysis shows that biosynthetic potential meets or exceeds that shown by previous Streptomyces genome sequences as well as other natural product-producing actinomycetes. The S. tropica genome features nine polyketide synthase systems of every known formally classified family, non-ribosomal peptide synthetases and several hybrid clusters. While a few clusters appear to encode molecules previously identified in Streptomyces species, the majority of the 15 biosynthetic loci are novel. Specific chemical information about putative and observed natural product molecules is presented and discussed. In addition, our bioinformatic analysis was critical for the structure elucidation of the novel polyene macrolactam salinilactam A. This study demonstrates the potential for genomic analysis to complement and strengthen traditional natural product isolation studies and firmly establishes the genus Salinispora as a rich source of novel drug-like molecules.
Whole exome sequencing identifies driver mutations in asymptomatic computed tomography-detected lung cancers with normal karyotype.

PubMed

Belloni, Elena; Veronesi, Giulia; Rotta, Luca; Volorio, Sara; Sardella, Domenico; Bernard, Loris; Pece, Salvatore; Di Fiore, Pier Paolo; Fumagalli, Caterina; Barberis, Massimo; Spaggiari, Lorenzo; Pelicci, Pier Giuseppe; Riva, Laura

2015-04-01

The efficacy of curative surgery for lung cancer could be largely improved by non-invasive screening programs, which can detect the disease at early stages. We previously showed that 18% of screening-identified lung cancers demonstrate a normal karyotype and, following high-density genome scanning, can be subdivided into samples with 1) numerous; 2) none; and 3) few copy number alterations. Whole exome sequencing was applied to the two normal karyotype, screening-detected lung cancers, constituting group 2, as well as normal controls. We identified mutations in both tumors, including KEAP1 (commonly mutated in lung cancers) in one, and TP53, PMS1, and MSH3 (well-characterized DNA-repair genes) in the other. The two normal karyotype screening-detected lung tumors displayed a typical lung cancer mutational profile that only next generation sequencing could reveal, which offered an additional contribution to the over-diagnosis bias concept hypothesized within lung cancer screening programs. Copyright © 2015 Elsevier Inc. All rights reserved.
Y-Chromosome Markers for the Red Fox.

PubMed

Rando, Halie M; Stutchman, Jeremy T; Bastounes, Estelle R; Johnson, Jennifer L; Driscoll, Carlos A; Barr, Christina S; Trut, Lyudmila N; Sacks, Benjamin N; Kukekova, Anna V

2017-09-01

The de novo assembly of the red fox (Vulpes vulpes) genome has facilitated the development of genomic tools for the species. Efforts to identify the population history of red foxes in North America have previously been limited by a lack of information about the red fox Y-chromosome sequence. However, a megabase of red fox Y-chromosome sequence was recently identified over 2 scaffolds in the reference genome. Here, these scaffolds were scanned for repeated motifs, revealing 194 likely microsatellites. Twenty-three of these loci were selected for primer development and, after testing, produced a panel of 11 novel markers that were analyzed alongside 2 markers previously developed for the red fox from dog Y-chromosome sequence. The markers were genotyped in 76 male red foxes from 4 populations: 7 foxes from Newfoundland (eastern Canada), 12 from Maryland (eastern United States), and 9 from the island of Great Britain, as well as 48 foxes of known North American origin maintained on an experimental farm in Novosibirsk, Russia. The full marker panel revealed 22 haplotypes among these red foxes, whereas the 2 previously known markers alone would have identified only 10 haplotypes. The haplotypes from the 4 populations clustered primarily by continent, but unidirectional gene flow from Great Britain and farm populations may influence haplotype diversity in the Maryland population. The development of new markers has increased the resolution at which red fox Y-chromosome diversity can be analyzed and provides insight into the contribution of males to red fox population diversity and patterns of phylogeography. © The American Genetic Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
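
Scanning scaffolds for repeated motifs, as described above, can be approximated with a regular expression that looks for a short unit repeated in tandem. The abstract does not name the tool or thresholds used, so the unit sizes (2 to 6 bp), the minimum of five repeats, and the example sequence below are assumptions for illustration.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    start is 0-based. Runs of a single base are skipped so that
    homopolymers are not reported as dinucleotide repeats.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical sequence with an (AC)5 and an (AGAT)5 repeat.
example = "TTG" + "AC" * 5 + "GGC" + "AGAT" * 5 + "CC"
for start, unit, count in find_microsatellites(example):
    print(start, unit, count)
```

Candidate loci found this way would still need flanking sequence for primer design and laboratory testing, which is why only a subset of the 194 candidates in the study became usable markers.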
Cryopyrin-associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation

PubMed Central

Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M.; Walts, Avram D.; Hoffmann, Patrycja; Remmers, Elaine F.; Kastner, Daniel L.; Ombrello, Amanda K.

2015-01-01

Objective: To identify the cause of disease in an adult patient presenting with recent onset fevers, chills, urticaria, fatigue, and profound myalgia, who was negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. Methods: We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient’s whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively-selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. Results: We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3%–16.8% in monocytes and 15.2%–18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, buccal cells, and in the patient’s cultured fibroblasts. Conclusion: These data document the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively-parallel sequencing in clinical diagnosis. PMID:25988971
Identification of a second flagellin gene and functional characterization of a sigma70-like promoter upstream of a Leptospira borgpetersenii flaB gene.

PubMed

Lin, Min; Dan, Hanhong; Li, Yijing

2004-02-01

Leptospira borgpetersenii, one of the causative agents of leptospirosis in both animals and humans, is a bacterial pathogen with characteristic motility that is mediated by the rotation of two periplasmic flagella (PF). The flaB gene coding for a core polypeptide subunit of PF was previously characterized by sequence analysis of its open reading frame (ORF) (M. Lin, J Biochem Mol Biol Biophys 2:181-187, 1999). The present study was undertaken to isolate and clone the uncharacterized sequence upstream of the flaB gene by using a PCR-based genome walking procedure. This has resulted in a 1470-bp genomic DNA sequence in which an 846-bp ORF coding for a 281-amino acid polypeptide (31.3 kDa) is identified 455 bp upstream from the flaB start codon. The encoded protein exhibits 72% amino acid identity to the deduced FlaB protein sequence of L. borgpetersenii and a high degree of sequence homology to the FlaB proteins of other spirochaetes. This has demonstrated for the first time that a second flaB gene homolog is present in a Leptospira species. The newly identified gene is designated flaB1, and the previously cloned flaB renamed flaB2. Within the intergenic sequence between flaB1 and flaB2, a potential stem-loop structure (12-bp inverted repeats) was identified 25 bp downstream of the flaB1 stop codon; this could serve as a transcription terminator for the flaB1 mRNA. Three E. coli-like promoter regions (I, II, and III) for binding Esigma(70), a regulatory sequence uncommonly found in flagellar genes, were predicted upstream of the flaB2 ORF. Only promoter region II contains a promoter that is functional in E. coli, as revealed at phenotypic and transcriptional levels by its capability of directing the expression of the chloramphenicol acetyltransferase (CAT) gene in the promoter probe vector pKK232-8. These observations may suggest that flaB1 and flaB2 are transcribed separately and do not form a transcriptional operon controlled by a single promoter.
CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison

PubMed Central

Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

2004-01-01

The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features. PMID:15215464
Genome sequencing of idiopathic pulmonary fibrosis in conjunction with a medical school human anatomy course.

PubMed

Kumar, Akash; Dougherty, Max; Findlay, Gregory M; Geisheker, Madeleine; Klein, Jason; Lazar, John; Machkovech, Heather; Resnick, Jesse; Resnick, Rebecca; Salter, Alexander I; Talebi-Liasi, Faezeh; Arakawa, Christopher; Baudin, Jacob; Bogaard, Andrew; Salesky, Rebecca; Zhou, Qian; Smith, Kelly; Clark, John I; Shendure, Jay; Horwitz, Marshall S

2014-01-01

Even in cases where there is no obvious family history of disease, genome sequencing may contribute to clinical diagnosis and management. Clinical application of the genome has not yet become routine, however, in part because physicians are still learning how best to utilize such information. As an educational research exercise performed in conjunction with our medical school human anatomy course, we explored the potential utility of determining the whole genome sequence of a patient who had died following a clinical diagnosis of idiopathic pulmonary fibrosis (IPF). Medical students performed dissection and whole genome sequencing of the cadaver. Gross and microscopic findings were more consistent with the fibrosing variant of nonspecific interstitial pneumonia (NSIP), as opposed to IPF per se. Variants in genes causing Mendelian disorders predisposing to IPF were not detected. However, whole genome sequencing identified several common variants associated with IPF, including a single nucleotide polymorphism (SNP), rs35705950, located in the promoter region of the gene encoding mucin glycoprotein MUC5B. The MUC5B promoter polymorphism was recently found to markedly elevate risk for IPF, though a particular association with NSIP has not been previously reported, nor has its contribution to disease risk previously been evaluated in the genome-wide context of all genetic variants. We did not identify additional predicted functional variants in a region of linkage disequilibrium (LD) adjacent to MUC5B, nor did we discover other likely risk-contributing variants elsewhere in the genome. Whole genome sequencing thus corroborates the association of rs35705950 with MUC5B dysregulation and interstitial lung disease. This novel exercise additionally served a unique mission in bridging clinical and basic science education.
Identification and characterization of EBV genomes in spontaneously immortalized human peripheral blood B lymphocytes by NGS technology.

PubMed

Lei, Haiyan; Li, Tianwei; Hung, Guo-Chiuan; Li, Bingjie; Tsai, Shien; Lo, Shyh-Ching

2013-11-19

We conducted genomic sequencing to identify Epstein Barr Virus (EBV) genomes in 2 human peripheral blood B lymphocytes that underwent spontaneous immortalization promoted by mycoplasma infections in culture, using the high-throughput sequencing (HTS) Illumina MiSeq platform. The purpose of this study was to examine if rapid detection and characterization of a viral agent could be effectively achieved by HTS using a platform that has become readily available in general biology laboratories. Raw read sequences, averaging 175 bps in length, were mapped with DNA databases of human, bacteria, fungi and virus genomes using the CLC Genomics Workbench bioinformatics tool. Overall 37,757 out of 49,520,834 total reads in one lymphocyte line (# K4413-Mi) and 28,178 out of 45,335,960 reads in the other lymphocyte line (# K4123-Mi) were identified as EBV sequences. The two EBV genomes with estimated 35.22-fold and 31.06-fold sequence coverage respectively, designated K4413-Mi EBV and K4123-Mi EBV (GenBank accession number KC440852 and KC440851 respectively), are characteristic of type-1 EBV. Sequence comparison and phylogenetic analysis among K4413-Mi EBV, K4123-Mi EBV and the EBV genomes previously reported to GenBank as well as the NA12878 EBV genome assembled from database of the 1000 Genome Project showed that these 2 EBVs are most closely related to B95-8, an EBV previously isolated from a patient with infectious mononucleosis and WT-EBV. They are less similar to EBVs associated with nasopharyngeal carcinoma (NPC) from Hong Kong and China as well as the Akata strain of a case of Burkitt's lymphoma from Japan. They are most different from type 2 EBV found in Western African Burkitt's lymphoma.
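As a rough sanity check on the read counts quoted in this abstract, the fraction of reads assigned to EBV and a naive depth estimate (reads × mean read length / genome length) can be sketched as follows. The genome length of about 172 kb is an assumption (an approximate EBV genome size, not stated in the abstract), and the function names are illustrative. The naive figure is not expected to reproduce the reported 35.22-fold and 31.06-fold coverage exactly, since those values depend on how reads were trimmed and mapped.

```python
# Naive read-fraction and depth estimate from the counts quoted in the abstract.
MEAN_READ_LEN_BP = 175            # average read length, from the abstract
ASSUMED_GENOME_LEN_BP = 172_000   # ASSUMPTION: approximate EBV genome size

SAMPLES = {
    "K4413-Mi": {"ebv_reads": 37_757, "total_reads": 49_520_834},
    "K4123-Mi": {"ebv_reads": 28_178, "total_reads": 45_335_960},
}


def ebv_fraction(ebv_reads, total_reads):
    """Fraction of all reads that were identified as EBV."""
    return ebv_reads / total_reads


def naive_depth(ebv_reads, read_len=MEAN_READ_LEN_BP,
                genome_len=ASSUMED_GENOME_LEN_BP):
    """Mean depth if every EBV read mapped over its full length."""
    return ebv_reads * read_len / genome_len


for name, counts in SAMPLES.items():
    frac = ebv_fraction(**counts)
    depth = naive_depth(counts["ebv_reads"])
    print(f"{name}: {frac:.4%} of reads are EBV, naive depth ~{depth:.1f}x")
```

The point of the sketch is the scale: well under 0.1% of reads are viral in both lines, yet that is enough for tens-fold coverage of a genome this small.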
Lactobacillus strain diversity based on partial hsp60 gene sequences and design of PCR-restriction fragment length polymorphism assays for species identification and differentiation.

PubMed

Blaiotta, Giuseppe; Fusco, Vincenzina; Ercolini, Danilo; Aponte, Maria; Pepe, Olimpia; Villani, Francesco

2008-01-01

A phylogenetic tree showing diversities among 116 partial (499-bp) Lactobacillus hsp60 (groEL, encoding a 60-kDa heat shock protein) nucleotide sequences was obtained and compared to those previously described for 16S rRNA and tuf gene sequences. The topology of the tree produced in this study showed a Lactobacillus species distribution similar, but not identical, to those previously reported. However, according to the most recent systematic studies, a clear differentiation of 43 single-species clusters was detected/identified among the sequences analyzed. The slightly higher variability of the hsp60 nucleotide sequences than of the 16S rRNA sequences offers better opportunities to design or develop molecular assays allowing identification and differentiation of either distant or very closely related Lactobacillus species. Therefore, our results suggest that hsp60 can be considered an excellent molecular marker for inferring the taxonomy and phylogeny of members of the genus Lactobacillus and that the chosen primers can be used in a simple PCR procedure allowing the direct sequencing of the hsp60 fragments. Moreover, in this study we performed a computer-aided restriction endonuclease analysis of all 499-bp hsp60 partial sequences and we showed that the PCR-restriction fragment length polymorphism (RFLP) patterns obtainable by using both endonucleases AluI and TacI (in separate reactions) can allow identification and differentiation of all 43 Lactobacillus species considered, with the exception of the pair L. plantarum/L. pentosus. However, the latter species can be differentiated by further analysis with Sau3AI or MseI. The hsp60 PCR-RFLP approach was efficiently applied to identify and to differentiate a total of 110 wild Lactobacillus strains (including closely related species, such as L. casei and L. rhamnosus or L. plantarum and L. pentosus) isolated from cheese and dry-fermented sausages.
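The computer-aided restriction analysis described in this abstract can be illustrated with a minimal in-silico digest. This is a sketch, not the authors' software: the `digest` helper and the toy amplicon are hypothetical, and only AluI (recognition site AGCT, blunt cut between G and C) is shown.

```python
import re


def digest(seq, site, cut_offset):
    """Return fragment lengths after cutting `seq` at every `site`.

    `cut_offset` is the number of bases of the site left of the cut.
    """
    seq = seq.upper()
    # Lookahead so that overlapping occurrences would also be found.
    cuts = [m.start() + cut_offset
            for m in re.finditer(f"(?={re.escape(site)})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


ALUI = ("AGCT", 2)  # AG^CT; palindromic, so one strand is enough

# Hypothetical 26-nt sequence standing in for a 499-bp hsp60 amplicon.
toy_amplicon = "TTGAAGCTGGCATTAGCTCCGATAGG"
print(digest(toy_amplicon, *ALUI))  # [6, 10, 10]
```

Two sequences would be distinguishable by PCR-RFLP with a given enzyme when their sorted fragment-length lists differ by more than the resolution of the gel.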

Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

2003-12-31

Because functional elements tend to be highly conserved, comparing DNA sequences among evolutionarily distant genomes makes it possible to identify functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Comparative sequence analysis of a region on human chromosome 13q14, frequently deleted in B-cell chronic lymphocytic leukemia, and its homologous region on mouse chromosome 14.

PubMed

Kapanadze, B; Makeeva, N; Corcoran, M; Jareborg, N; Hammarsund, M; Baranova, A; Zabarovsky, E; Vorontsova, O; Merup, M; Gahrton, G; Jansson, M; Yankovsky, N; Einhorn, S; Oscier, D; Grandér, D; Sangfelt, O

2000-12-15

Previous studies have indicated the presence of a putative tumor suppressor gene on human chromosome 13q14, commonly deleted in patients with B-cell chronic lymphocytic leukemia (B-CLL). We have recently identified a minimally deleted region encompassing parts of two adjacent genes, termed LEU1 and LEU2 (leukemia-associated genes 1 and 2), and several additional transcripts. In addition, 50 kb centromeric to this region we have identified another gene, LEU5/RFP2. To elucidate further the complex genomic organization of this region, we have identified, mapped, and sequenced the homologous region in the mouse. Fluorescence in situ hybridization analysis demonstrated that the region maps to mouse chromosome 14. The overall organization and gene order in this region were found to be highly conserved in the mouse. Sequence comparison between the human deletion hotspot region and its homologous mouse region revealed a high degree of sequence conservation with an overall score of 74%. However, our data also show that in terms of transcribed sequences, only two of those, human LEU2 and LEU5/RFP2, are clearly conserved, strengthening the case for these genes as putative candidate B-CLL tumor suppressor genes.
Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

PubMed

Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

2018-05-14

The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence-specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than within a linear oligonucleotide, suggesting A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
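To make the (T/C)TC(A/G) motif concrete, the sketch below scans a single-stranded DNA string and reports the position of the cytidine in each matching tetranucleotide. The function name and the toy sequence are hypothetical, and treating the C of the TC dinucleotide as the deamination target is an assumption of this example. It checks sequence context only and ignores the structural (hairpin) preference the abstract describes.

```python
def a3a_motif_sites(ssdna):
    """Return 0-based positions of the C in every (T/C)TC(A/G) match."""
    s = ssdna.upper()
    hits = []
    for i in range(len(s) - 3):
        five_prime, t, c, three_prime = s[i:i + 4]
        if five_prime in "TC" and t == "T" and c == "C" and three_prime in "AG":
            hits.append(i + 2)  # index of the C within the window
    return hits


# Hypothetical sequence: TTCA and CTCG match, TTCC does not (3' base is C).
print(a3a_motif_sites("AATTCAGGCTCGTTCC"))  # [4, 10]
```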
IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses

DOE PAGES

Paez-Espino, David; Chen, I. -Min A.; Palaniappan, Krishna; ...

2016-10-30

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from > 6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community.
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.

PubMed

Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J

1999-01-01

Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
Chicken parvovirus-induced runting-stunting syndrome in young broilers

USDA-ARS's Scientific Manuscript database

Previously we identified a novel parvovirus from enteric contents of chickens that were affected by enteric diseases. Comparative sequence analysis showed that the chicken parvovirus (ChPV) represented a new member in the Parvoviridae family. Here, we describe some of the pathogenic characteristics ...
Stable integration and expression of heterologous genes in several lactobacilli using an integration vector constructed from the integrase and attP sequences of phage ΦAT3 isolated from Lactobacillus casei ATCC 393.

PubMed

Lin, Chao-Fen; Lo, Ta-Chun; Kuo, Yang-Cheng; Lin, Thy-Hou

2013-04-01

An integration vector capable of stably integrating and maintaining in the chromosomes of several lactobacilli over hundreds of generations has been constructed. The major integration machinery used is based on the ΦAT3 integrase (int) and attP sequences determined previously. A novel core sequence located at the 3' end of the tRNA(leu) gene is identified in Lactobacillus fermentum ATCC 14931 as the integration target by the integration vector though most of such sequences found in other lactobacilli are similar to that determined previously. Due to the lack of an appropriate attB site in Lactococcus lactis MG1363, the integration vector is found to be unable to integrate into the chromosome of the strain. However, such integration can be successfully restored by cotransforming the integration vector with a replicative one harboring both attB and erythromycin resistance sequences into the strain. Furthermore, the integration vector constructed carries a promoter region of placT from the chromosome of Lactobacillus rhamnosus TCELL-1 which is used to express green fluorescence and luminance protein genes in the lactobacilli studied.
Phylogeography and genetic identification of the newly-discovered populations of torrent salamanders (Rhyacotriton cascadae and R. variegatus) in the central Cascades (USA)

USGS Publications Warehouse

Wagner, R.S.; Miller, Mark P.; Haig, Susan M.

2006-01-01

Newly discovered populations of Rhyacotritonidae were investigated for taxonomic identity, hybridization, and sympatry. Species in the genus Rhyacotriton have been historically difficult to identify using morphological characters. Mitochondrial (mtDNA) 16S ribosomal RNA sequences (491 bp) and allozymes (6 loci) were used to identify the distribution of populations occurring intermediate between the previously described ranges of R. variegatus and R. cascadae in the central Cascade Mountain region of Oregon. Allozyme and mitochondrial sequence data both indicated the presence of two distinct evolutionary lineages, with each lineage corresponding to the allopatric distribution of R. cascadae and R. variegatus. Results suggest the Willamette River acts as a phylogeographic barrier limiting the distribution of both species, although we cannot exclude the possibility that reproductive isolation also exists that reinforces species' distributions. This study extends the previously described geographical ranges of both R. cascadae and R. variegatus and defines an eastern range limit for R. variegatus conservation efforts.
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.

PubMed

Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T

1996-10-31

Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1.), were isolated and partially sequenced. Together with six previously isolated clones putatively identified to encode ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Detection and analysis of recombination in GII.4 norovirus strains causing gastroenteritis outbreaks in Alberta.

PubMed

Hasing, Maria E; Hazes, Bart; Lee, Bonita E; Preiksaitis, Jutta K; Pang, Xiaoli L

2014-10-01

Recombination is an important mechanism generating genetic diversity in norovirus (NoV) that occurs commonly at the NoV polymerase-capsid (ORF1/2) junction. The genotyping method based on partial ORF2 sequences currently used to characterize circulating NoV strains in gastroenteritis outbreaks in Alberta cannot detect such recombination events and provides only limited information on NoV genetic evolution. The objective of this study was to determine whether any NoV GII.4 strains causing outbreaks in Alberta are recombinants. Twenty stool samples collected during outbreaks occurring between July 2004 and January 2012 were selected to include the GII.4 variants Farmington Hills 2002, Hunter 2004, Yerseke 2006a, Den Haag 2006b, Apeldoorn 2007, New Orleans 2009, and Sydney 2012 based on previous NoV ORF2-genotyping results. Near full-length NoV genome sequences were obtained, aligned with reference sequences from GenBank and analyzed with RDPv4.13. Two sequences corresponding to Apeldoorn 2007, and Sydney 2012 were identified as recombinants with breakpoints near the ORF1/2 junction and putative parental strains as previously reported. We also identified, for the first time, a non-recombinant sequence resembling the ORF2-3 parent of the recombinant cluster Sydney 2012 responsible for the most recent pandemic. Our results confirmed the presence of recombinant NoV GII.4 strains in Alberta, and highlight the importance of including additional genomic regions in surveillance studies to trace the evolution of pandemic NoV GII.4 strains. Copyright © 2014 Elsevier B.V. All rights reserved.
KpnBI is the prototype of a new family (IE) of bacterial type I restriction-modification system

PubMed Central

Chin, V.; Valinluck, V.; Magaki, S.; Ryu, J.

2004-01-01

KpnBI is a restriction-modification (R-M) system recognized in the GM236 strain of Klebsiella pneumoniae. Here, the KpnBI modification genes were cloned into a plasmid using a modification expression screening method. The modification genes that consist of both hsdM (2631 bp) and hsdS (1344 bp) genes were identified on an 8.2 kb EcoRI chromosomal fragment. These two genes overlap by one base and share the same promoter located upstream of the hsdM gene. Using recently developed plasmid R-M tests and a computer program RM Search, the DNA recognition sequence for the KpnBI enzymes was identified as a new 8 nt sequence containing one degenerate base with a 6 nt spacer, CAAANNNNNNRTCA. From Dam methylation and HindIII sensitivity tests, the methylation loci were predicted to be the italicized third adenine in the 5′ specific region and the adenine opposite the italicized thymine in the 3′ specific region. Combined with previous sequence data for hsdR, we concluded that the KpnBI system is a typical type I R-M system. The deduced amino acid sequences of the three subunits of the KpnBI system show only limited homologies (25 to 33% identity) at best, to the four previously categorized type I families (IA, IB, IC, and ID). Furthermore, their identity scores to other uncharacterized putative genome type I sequences were 53% at maximum. Therefore, we propose that KpnBI is the prototype of a new ‘type IE’ family. PMID:15475385
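A bipartite recognition sequence such as CAAANNNNNNRTCA can be located in a DNA string by translating its IUPAC codes into a regular expression and searching both strands. The sketch below is illustrative only: it is not the RM Search program mentioned in the abstract, the helper names and the toy sequence are hypothetical, and only the IUPAC codes needed for this site are included.

```python
import re

# Subset of IUPAC nucleotide codes sufficient for this recognition site.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def revcomp(seq):
    """Reverse complement, preserving the degenerate codes R, Y and N."""
    return seq.translate(COMPLEMENT)[::-1]


def to_regex(site):
    # Lookahead wrapper so overlapping matches are all reported.
    return re.compile("(?=(" + "".join(IUPAC[b] for b in site) + "))")


def find_sites(dna, site="CAAANNNNNNRTCA"):
    """Return sorted (start, strand) hits for `site` on both strands."""
    dna = dna.upper()
    hits = [(m.start(), "+") for m in to_regex(site).finditer(dna)]
    hits += [(m.start(), "-") for m in to_regex(revcomp(site)).finditer(dna)]
    return sorted(hits)


toy = "GGCAAAACGTACGTCATT"  # hypothetical sequence with one site
print(find_sites(toy))      # [(2, '+')]
```

Because the site is asymmetric, a hit on the reverse strand appears in the forward string as TGAYNNNNNNTTTG, which is why both patterns are searched.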
Retroviral insertions in the VISION database identify molecular pathways in mouse lymphoid leukemia and lymphoma

PubMed Central

Weiser, Keith C.; Liu, Bin; Hansen, Gwenn M.; Skapura, Darlene; Hentges, Kathryn E.; Yarlagadda, Sujatha; Morse III, Herbert C.

2007-01-01

AKXD recombinant inbred (RI) strains develop a variety of leukemias and lymphomas due to somatically acquired insertions of retroviral DNA into the genome of hematopoietic cells that can mutate cellular proto-oncogenes and tumor suppressor genes. We generated a new set of tumors from nine AKXD RI strains selected for their propensity to develop B-cell tumors, the most common type of human hematopoietic cancers. We employed a PCR technique called viral insertion site amplification (VISA) to rapidly isolate genomic sequence at the site of provirus insertion. Here we describe 550 VISA sequence tags (VSTs) that identify 74 common insertion sites (CISs), of which 21 have not been identified previously. Several suspected proto-oncogenes and tumor suppressor genes lie near CISs, providing supportive evidence for their roles in cancer. Furthermore, numerous previously uncharacterized genes lie near CISs, providing a pool of candidate disease genes for future research. Pathway analysis of candidate genes identified several signaling pathways as common and powerful routes to blood cancer, including Notch, E-protein, NFκB, and Ras signaling. Misregulation of several Notch signaling genes was confirmed by quantitative RT-PCR. Our data suggest that analyses of insertional mutagenesis on a single genetic background are biased toward the identification of cooperating mutations. This tumor collection represents the most comprehensive study of the genetics of B-cell leukemia and lymphoma development in mice. We have deposited the VST sequences, CISs in a genome viewer, histopathology, and molecular tumor typing data in a public web database called VISION (Viral Insertion Sites Identifying Oncogenes), which is located at http://www.mouse-genome.bcm.tmc.edu/vision. PMID:17926094
Resolving the Origin of Rabbit Hemorrhagic Disease Virus: Insights from an Investigation of the Viral Stocks Released in Australia

PubMed Central

Eden, John-Sebastian; Read, Andrew J.; Duckworth, Janine A.; Strive, Tanja

2015-01-01

To resolve the evolutionary history of rabbit hemorrhagic disease virus (RHDV), we performed a genomic analysis of the viral stocks imported and released as a biocontrol measure in Australia, as well as a global phylogenetic analysis. Importantly, conflicts were identified between the sequences determined here and those previously published that may have affected evolutionary rate estimates. By removing likely erroneous sequences, we show that RHDV emerged only shortly before its initial description in China. PMID:26378178
Use of 16S rRNA Sequencing for Identification of Actinobacillus ureae Isolated from a Cerebrospinal Fluid Sample

PubMed Central

Whitelaw, A. C.; Shankland, I. M.; Elisha, B. G.

2002-01-01

Actinobacillus ureae, previously Pasteurella ureae, has on rare occasions been described as a cause of human infection. Owing to its rarity, it may not be easily identified in clinical microbiology laboratories by standard tests. This report describes a patient with acute bacterial meningitis due to A. ureae. The identity of the isolate was determined by means of DNA sequence analysis of a portion of the 16S rRNA gene. PMID:11825992
Unbiased Combinatorial Genomic Approaches to Identify Alternative Therapeutic Targets within the TSC Signaling Network

DTIC Science & Technology

2013-06-01

number of ways to generate either random mutations or specific alterations to the genome sequence . Unlike previous approaches however, both TALENs and...made to the donor construct will be incorporated into the endogenous genomic sequence (examples in Liu et al., 2012; Zu et al., 2013). One challenge... Drosophila with the CRISPR RNA-guided Cas9 nuclease. Genetics. 2013. Hwang WY, Fu Y, Reyon D, Maeder ML, Tsai SQ, Sander JD, et al. Efficient genome
Molecular detection of kobuviruses in European roe deer (Capreolus capreolus) in Italy.

PubMed

Di Martino, Barbara; Di Profio, Federica; Melegari, Irene; Di Felice, Elisabetta; Robetto, Serena; Guidetti, Cristina; Orusa, Riccardo; Martella, Vito; Marsilio, Fulvio

2015-08-01

Kobuvirus RNA was found in 6.6 % (13/198) of stool specimens from roe deer (Capreolus capreolus) captured during the regular hunting season. Upon sequence analysis of a fragment of the 3D gene, nine strains displayed the highest nucleotide sequence identity (91.2-97.4 %) to bovine kobuviruses previously detected in either diarrhoeic or asymptomatic calves. Interestingly, four strains were genetically related to the newly discovered caprine kobuviruses (84.2-87.6 % nucleotide identity) identified in black goats in Korea.
Detection and phylogenetic analysis of hepatitis E viruses from mongooses in Okinawa, Japan.

PubMed

Nidaira, Minoru; Takahashi, Kazuaki; Ogura, Go; Taira, Katsuya; Okano, Shou; Kudaka, Jun; Itokazu, Kiyomasa; Mishiro, Shunji; Nakamura, Masaji

2012-12-01

Hepatitis E virus (HEV) infection has previously been reported in wild mongooses on Okinawa Island; to date however, only one HEV RNA sequence has been identified in a mongoose. Hence, this study was performed to detect HEV RNA in 209 wild mongooses on Okinawa Island. Six (2.9%) samples tested positive for HEV RNA. Phylogenetic analysis revealed that 6 HEV RNAs belonged to genotype 3 and were classified into groups A and B. In group B, mongoose-derived HEV sequences were very similar to mongoose HEV previously detected on Okinawa Island, as well as to those of a pig. This investigation emphasized the possibility that the mongoose is a reservoir animal for HEV on Okinawa Island.

Isolation of Lagos bat virus from water mongoose.

PubMed

Markotter, Wanda; Kuzmin, Ivan; Rupprecht, Charles E; Randles, Jenny; Sabeta, Claude T; Wandeler, Alexander I; Nel, Louis H

2006-12-01

A genotype 2 lyssavirus, Lagos bat virus (LBV), was isolated from a terrestrial wildlife species (water mongoose) in August 2004 in the Durban area of the KwaZulu-Natal Province of South Africa. The virus isolate was confirmed as LBV by antigenic and genetic characterization, and the mongoose was identified as Atilax paludinosus by mitochondrial cytochrome b sequence analysis. Phylogenetic analysis demonstrated sequence homology with previous LBV isolates from South African bats. Studies performed in mice indicated that the peripheral pathogenicity of LBV had been underestimated in previous studies. Surveillance strategies for LBV in Africa must be improved to better understand the epidemiology of this virus and to make informed decisions on future vaccine strategies, because there is insufficient evidence that current rabies vaccines provide protection against LBV.
Application of a fast sorting algorithm to the assignment of mass spectrometric cross-linking data.

PubMed

Petrotchenko, Evgeniy V; Borchers, Christoph H

2014-09-01

Cross-linking combined with MS involves enzymatic digestion of cross-linked proteins and identifying cross-linked peptides. Assignment of cross-linked peptide masses requires a search of all possible binary combinations of peptides from the cross-linked proteins' sequences, which becomes impractical with increasing complexity of the protein system and/or if digestion enzyme specificity is relaxed. Here, we describe the application of a fast sorting algorithm to search large sequence databases for cross-linked peptide assignments based on mass. This same algorithm has been used previously for assigning disulfide-bridged peptides (Choi et al., ), but has not previously been applied to cross-linking studies. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
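The mass-based search described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a cross-linked species has mass equal to the sum of two peptide masses plus a fixed linker mass, and it replaces the all-pairs scan with a sorted list and binary search. The function name, the linker mass, the tolerance, and the peptide masses are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def find_crosslinked_pairs(peptide_masses, target_mass, linker_mass, tol):
    """Return sorted index pairs (a, b), a <= b, whose summed peptide masses
    plus linker_mass fall within tol of target_mass.

    Assumption (not from the source): the cross-link adds a fixed linker mass.
    """
    # Sort once, keeping the original indices so hits can be reported.
    indexed = sorted(enumerate(peptide_masses), key=lambda p: p[1])
    masses = [m for _, m in indexed]
    remainder = target_mass - linker_mass

    hits = []
    for i, (idx_a, mass_a) in enumerate(indexed):
        # Binary-search the window of partner masses; start at i so each
        # unordered pair is reported once (a peptide may pair with itself).
        lo = bisect_left(masses, remainder - mass_a - tol, lo=i)
        hi = bisect_right(masses, remainder - mass_a + tol, lo=i)
        for j in range(lo, hi):
            idx_b = indexed[j][0]
            hits.append((min(idx_a, idx_b), max(idx_a, idx_b)))
    return sorted(hits)


# Hypothetical masses in Da; the observed mass corresponds to peptides 0 and 2.
pairs = find_crosslinked_pairs([500.0, 800.0, 1200.0, 300.0], 1838.0, 138.0, 0.01)
```

Sorting costs O(n log n) and each lookup O(log n), so the search avoids enumerating all n(n+1)/2 binary combinations explicitly.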
Exome Sequencing Identifies Three Novel Candidate Genes Implicated in Intellectual Disability

PubMed Central

Azam, Maleeha; Ayub, Humaira; Vissers, Lisenka E. L. M.; Gilissen, Christian; Ali, Syeda Hafiza Benish; Riaz, Moeen; Veltman, Joris A.; Pfundt, Rolph; van Bokhoven, Hans; Qamar, Raheel

2014-01-01

Intellectual disability (ID) is a major health problem, mostly of unknown etiology. Recently, exome sequencing of individuals with ID has identified novel genes implicated in the disease. The purpose of the present study was therefore to identify the genetic cause of ID in one syndromic and two non-syndromic Pakistani families. The whole exome of three ID probands was sequenced. Missense variations in two plausible novel genes implicated in autosomal recessive ID were identified: lysine (K)-specific methyltransferase 2B (KMT2B), zinc finger protein 589 (ZNF589), as well as hedgehog acyltransferase (HHAT) with a de novo mutation with autosomal dominant mode of inheritance. The KMT2B recessive variant is the first report of a recessive Kleefstra syndrome-like phenotype. Identification of plausible causative mutations for two recessive and a dominant type of ID, in genes not previously implicated in disease, underscores the large genetic heterogeneity of ID. These results also support the viewpoint that a large number of ID genes converge on a limited number of common networks: ZNF589 belongs to the KRAB-domain zinc-finger proteins previously implicated in ID; HHAT is predicted to affect sonic hedgehog, which is involved in several disorders with ID; and KMT2B, associated with syndromic ID, fits the epigenetic module underlying the Kleefstra syndromic spectrum. The association of these novel genes in three different Pakistani ID families highlights the importance of screening these genes in more families with similar phenotypes from different populations to confirm the involvement of these genes in pathogenesis of ID. PMID:25405613
Identification and characterization of Burkholderia multivorans CCA53.

PubMed

Akita, Hironaga; Kimura, Zen-Ichiro; Yusoff, Mohd Zulkhairi Mohd; Nakashima, Nobutaka; Hoshino, Tamotsu

2017-07-06

A lignin-degrading bacterium, Burkholderia sp. CCA53, was previously isolated from leaf soil. The purpose of this study was to determine phenotypic and biochemical features of Burkholderia sp. CCA53. Multilocus sequence typing (MLST) analysis based on fragments of the atpD, gltD, gyrB, lepA, recA and trpB gene sequences was performed to identify Burkholderia sp. CCA53. The MLST analysis revealed that Burkholderia sp. CCA53 was tightly clustered with B. multivorans ATCC BAA-247T. The quinone and cellular fatty acid profiles, carbon source utilization, growth temperature and pH were consistent with the characteristics of B. multivorans species. Burkholderia sp. CCA53 was therefore identified as B. multivorans CCA53.
Use of conserved key amino acid positions to morph protein folds.

PubMed

Reddy, Boojala V B; Li, Wilfred W; Bourne, Philip E

2002-07-15

By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs), we propose a theoretical method to design mutations that can be used to morph protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences from the Protein Data Bank (PDB) identifiers 1ROP and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutations to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence, theoretically improving the challenge results. From secondary structure prediction experiments of the proposed mutant sequence structures, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native, potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to CKAAPs of a given structure are of value in protein engineering experiments. Copyright 2002 Wiley Periodicals, Inc.
Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

PubMed

Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

2007-06-01

As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
The determination of complete human mitochondrial DNA sequences in single cells: implications for the study of somatic mitochondrial DNA point mutations

PubMed Central

Taylor, Robert W.; Taylor, Geoffrey A.; Durham, Steve E.; Turnbull, Douglass M.

2001-01-01

Studies of single cells have previously shown intracellular clonal expansion of mitochondrial DNA (mtDNA) mutations to levels that can cause a focal cytochrome c oxidase (COX) defect. Whilst techniques are available to study mtDNA rearrangements at the level of the single cell, recent interest has focused on the possible role of somatic mtDNA point mutations in ageing, neurodegenerative disease and cancer. We have therefore developed a method that permits the reliable determination of the entire mtDNA sequence from single cells without amplifying contaminating, nuclear-embedded pseudogenes. Sequencing and PCR–RFLP analyses of individual COX-negative muscle fibres from a patient with a previously described heteroplasmic COX II (T7587C) mutation indicate that mutant loads as low as 30% can be reliably detected by sequencing. This technique will be particularly useful in identifying the mtDNA mutational spectra in age-related COX-negative cells and will increase our understanding of the pathogenetic mechanisms by which they occur. PMID:11470889
Annotation and sequence diversity of transposable elements in common bean (Phaseolus vulgaris).

PubMed

Gao, Dongying; Abernathy, Brian; Rohksar, Daniel; Schmutz, Jeremy; Jackson, Scott A

2014-01-01

Common bean (Phaseolus vulgaris) is an important legume crop grown and consumed worldwide. With the availability of the common bean genome sequence, the next challenge is to annotate the genome and characterize functional DNA elements. Transposable elements (TEs) are the most abundant component of plant genomes and can dramatically affect genome evolution and genetic variation. Thus, it is pivotal to identify TEs in the common bean genome. In this study, we performed a genome-wide transposon annotation in common bean using a combination of homology and sequence structure-based methods. We developed a 2.12-Mb transposon database which includes 791 representative transposon sequences and is available upon request or from www.phytozome.org. Of note, nearly all transposons in the database are previously unrecognized TEs. More than 5,000 transposon-related expressed sequence tags (ESTs) were detected, which indicates that some transposons may be transcriptionally active. Two Ty1-copia retrotransposon families were found to encode the envelope-like protein, which has rarely been identified in plant genomes. Also, we identified an extra open reading frame (ORF) termed ORF2 from 15 Ty3-gypsy families that was located between the ORF encoding the retrotransposase and the 3'LTR. The ORF2 was in opposite transcriptional orientation to the retrotransposase. Sequence homology searches and phylogenetic analysis suggested that the ORF2 may have an ancient origin, but its function is not clear. These transposon data provide a useful resource for understanding genome organization and evolution and may be used to identify active TEs for developing a transposon-tagging system in common bean and other related genomes.
Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

PubMed

Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

2018-03-01

Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
A family of cellular proteins related to snake venom disintegrins.

PubMed

Weskamp, G; Blobel, C P

1994-03-29

Disintegrins are short soluble integrin ligands that were initially identified in snake venom. A previously recognized cellular protein with a disintegrin domain was the guinea pig sperm protein PH-30, a protein implicated in sperm-egg membrane binding and fusion. Here we present peptide sequences that are characteristic for several cellular disintegrin-domain proteins. These peptide sequences were deduced from cDNA sequence tags that were generated by polymerase chain reaction from various mouse tissues and a mouse muscle cell line. Northern blot analysis with four sequence tags revealed distinct mRNA expression patterns. Evidently, cellular proteins containing a disintegrin domain define a superfamily of potential integrin ligands that are likely to function in important cell-cell and cell-matrix interactions.
Antibiotic Resistance Markers in Burkholderia pseudomallei Strain Bp1651 Identified by Genome Sequence Analysis

PubMed Central

Sue, David; Gee, Jay E.; Elrod, Mindy G.; Hoffmaster, Alex R.; Randall, Linnell B.; Chirakul, Sunisa; Tuanyok, Apichai; Schweizer, Herbert P.; Weigel, Linda M.

2017-01-01

Burkholderia pseudomallei Bp1651 is resistant to several classes of antibiotics that are usually effective for treatment of melioidosis, including tetracyclines, sulfonamides, and β-lactams such as penicillins (amoxicillin-clavulanic acid), cephalosporins (ceftazidime), and carbapenems (imipenem and meropenem). We sequenced, assembled, and annotated the Bp1651 genome and analyzed the sequence using comparative genomic analyses with susceptible strains, keyword searches of the annotation, publicly available antimicrobial resistance prediction tools, and published reports. More than 100 genes in the Bp1651 sequence were identified as potentially contributing to antimicrobial resistance. Most notably, we identified three previously uncharacterized point mutations in penA, which codes for a class A β-lactamase and was previously implicated in resistance to β-lactam antibiotics. The mutations result in amino acid changes T147A, D240G, and V261I. When individually introduced into select agent-excluded B. pseudomallei strain Bp82, D240G was found to contribute to ceftazidime resistance and T147A contributed to amoxicillin-clavulanic acid and imipenem resistance. This study provides the first evidence that mutations in penA may alter susceptibility to carbapenems in B. pseudomallei. Another mutation of interest was a point mutation affecting the dihydrofolate reductase gene folA, which likely explains the trimethoprim resistance of this strain. Bp1651 was susceptible to aminoglycosides likely because of a frameshift in the amrB gene, the transporter subunit of the AmrAB-OprA efflux pump. These findings expand the role of penA to include resistance to carbapenems and may assist in the development of molecular diagnostics that predict antimicrobial resistance and provide guidance for treatment of melioidosis. PMID:28396541
Identification and correction of systematic error in high-throughput sequence data

PubMed Central

2011-01-01

Background: A feature common to all DNA sequencing technologies is the presence of base-call errors in the sequenced reads. The implications of such errors are application specific, ranging from minor informatics nuisances to major problems affecting biological inferences. Recently developed "next-gen" sequencing technologies have greatly reduced the cost of sequencing, but have been shown to be more error prone than previous technologies. Both position specific (depending on the location in the read) and sequence specific (depending on the sequence in the read) errors have been identified in Illumina and Life Technology sequencing platforms. We describe a new type of systematic error that manifests as statistically unlikely accumulations of errors at specific genome (or transcriptome) locations. Results: We characterize and describe systematic errors using overlapping paired reads from high-coverage data. We show that such errors occur in approximately 1 in 1000 base pairs, and that they are highly replicable across experiments. We identify motifs that are frequent at systematic error sites, and describe a classifier that distinguishes heterozygous sites from systematic error. Our classifier is designed to accommodate data from experiments in which the allele frequencies at heterozygous sites are not necessarily 0.5 (such as in the case of RNA-Seq), and can be used with single-end datasets. Conclusions: Systematic errors can easily be mistaken for heterozygous sites in individuals, or for SNPs in population analyses. Systematic errors are particularly problematic in low coverage experiments, or in estimates of allele-specific expression from RNA-Seq data. Our characterization of systematic error has allowed us to develop a program, called SysCall, for identifying and correcting such errors. We conclude that correction of systematic errors is important to consider in the design and interpretation of high-throughput sequencing experiments. PMID:22099972
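The phrase "statistically unlikely accumulations of errors" can be made concrete with a small sketch. This is not SysCall and not the classifier described in the abstract: it only assumes that, under independent base-call errors at a fixed per-base rate, the mismatch count at a site is binomial, and it flags sites whose count is improbable under that model. It does not separate systematic errors from true heterozygous sites. The function names, error rate, threshold, and site counts are hypothetical.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def flag_unlikely_sites(sites, error_rate, alpha):
    """Return positions whose mismatch count is improbable under independent errors.

    sites maps position -> (read depth, number of mismatching base calls).
    """
    return [
        pos
        for pos, (depth, mismatches) in sorted(sites.items())
        if binomial_tail(depth, mismatches, error_rate) < alpha
    ]


# Hypothetical pileup: one mismatch in 100 reads is expected at a 1% error
# rate, whereas twelve mismatches at a single site is not.
flagged = flag_unlikely_sites({101: (100, 1), 202: (100, 12)}, 0.01, 1e-6)
```

A flagged site would still need the kind of follow-up the abstract describes (for example, comparison across overlapping paired reads) before being called a systematic error rather than a real variant.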
GWASeq: targeted re-sequencing follow up to GWAS.

PubMed

Salomon, Matthew P; Li, Wai Lok Sibon; Edlund, Christopher K; Morrison, John; Fortini, Barbara K; Win, Aung Ko; Conti, David V; Thomas, Duncan C; Duggan, David; Buchanan, Daniel D; Jenkins, Mark A; Hopper, John L; Gallinger, Steven; Le Marchand, Loïc; Newcomb, Polly A; Casey, Graham; Marjoram, Paul

2016-03-03

For the last decade the conceptual framework of the Genome-Wide Association Study (GWAS) has dominated the investigation of human disease and other complex traits. While GWAS have been successful in identifying a large number of variants associated with various phenotypes, the overall amount of heritability explained by these variants remains small. This raises the question of how best to follow up on a GWAS, localize causal variants accounting for GWAS hits, and as a consequence explain more of the so-called "missing" heritability. Advances in high throughput sequencing technologies now allow for the efficient and cost-effective collection of vast amounts of fine-scale genomic data to complement GWAS. We investigate these issues using a colon cancer dataset. After QC, our data consisted of 1993 cases and 899 controls. Using marginal tests of association, we identify 10 variants distributed among six targeted regions that are significantly associated with colorectal cancer, with eight of the variants being novel to this study. Additionally, we perform so-called 'SNP-set' tests of association and identify two sets of variants that implicate both common and rare variants in the etiology of colorectal cancer. Here we present a large-scale targeted re-sequencing resource focusing on genomic regions implicated in colorectal cancer susceptibility previously identified in several GWAS, which aims to 1) provide fine-scale targeted sequencing data for fine-mapping and 2) provide data resources to address methodological questions regarding the design of sequencing-based follow-up studies to GWAS. Additionally, we show that this strategy successfully identifies novel variants associated with colorectal cancer susceptibility and can implicate both common and rare variants.
Identification and characterization of tandem repeats in exon III of dopamine receptor D4 (DRD4) genes from different mammalian species.

PubMed

Larsen, Svend Arild; Mogensen, Line; Dietz, Rune; Baagøe, Hans Jørgen; Andersen, Mogens; Werge, Thomas; Rasmussen, Henrik Berg

2005-12-01

In this study we have identified and characterized dopamine receptor D4 (DRD4) exon III tandem repeats in 33 publicly available nucleotide sequences from different mammalian species. We found that the tandem repeat in canids could be described in a novel and simple way, namely, as a structure composed of 15- and 12-bp modules. Tandem repeats composed of 18-bp modules were found in sequences from the horse, zebra, onager, donkey, Asiatic bear, polar bear, common raccoon, dolphin, harbor porpoise, and domestic cat. Several of these sequences have been analyzed previously without a tandem repeat being found. In the domestic cow and gray seal we identified tandem repeats composed of 36-bp modules, each consisting of two closely related 18-bp basic units. A tandem repeat consisting of 9-bp modules was identified in sequences from mink and ferret. In the European otter we detected an 18-bp tandem repeat, while a tandem repeat consisting of 27-bp modules was identified in a sequence from European badger. Both these tandem repeats were composed of 9-bp basic units, which were closely related to the 9-bp repeat modules identified in the mink and ferret. Tandem repeats could not be identified in sequences from rodents. All tandem repeats possessed a high GC content with a strong bias for C. On phylogenetic analysis of the tandem repeats, evolutionarily related species were clustered into the same groups. The degree of conservation of the tandem repeats varied significantly between species. The deduced amino acid sequences of most of the tandem repeats exhibited a high propensity for disorder. This was also the case with an amino acid sequence of the human DRD4 exon III tandem repeat, which was included in the study for comparative purposes. We identified proline-containing motifs for SH3 and WW domain binding proteins, potential phosphorylation sites, PDZ domain binding motifs, and FHA domain binding motifs in the amino acid sequences of the tandem repeats.
The numbers of potential functional sites varied pronouncedly between species. Our observations provide a platform for future studies of the architecture and evolution of the DRD4 exon III tandem repeat, and they suggest that differences in the structure of this tandem repeat contribute to specialization and generation of diversity in receptor function.
De novo assembly of a haplotype-resolved human genome.

PubMed

Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

2015-06-01

The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
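For readers unfamiliar with the N50 statistic quoted in this abstract, the sketch below shows the standard definition: the length L such that blocks of length at least L together cover at least half of the total. Whether the authors computed their haplotype N50 over phased blocks in exactly this way is an assumption; the function name and the example lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of block or contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest block down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical block lengths (arbitrary units): 6 + 5 = 11 covers half of 20.
example_n50 = n50([2, 3, 4, 5, 6])
```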
DNA barcodes for dragonflies and damselflies (Odonata) of Mindanao, Philippines.

PubMed

Casas, Princess Angelie S; Sing, Kong-Wah; Lee, Ping-Shin; Nuñeza, Olga M; Villanueva, Reagan Joseph T; Wilson, John-James

2018-03-01

Reliable species identification provides a sounder basis for use of species in the order Odonata as biological indicators and for their conservation, an urgent concern as many species are threatened with imminent extinction. We generated 134 COI barcodes from 36 morphologically identified species of Odonata collected from Mindanao Island, representing 10 families and 19 genera. Intraspecific sequence divergences ranged from 0 to 6.7% with four species showing more than 2%, while interspecific sequence divergences ranged from 0.5 to 23.3% with seven species showing less than 2%. Consequently, no distinct gap was observed between intraspecific and interspecific DNA barcode divergences. The numerous islands of the Philippine archipelago may have facilitated rapid speciation in the Odonata and resulted in low interspecific sequence divergences among closely related groups of species. This study contributes DNA barcodes for 36 morphologically identified species of Odonata reported from Mindanao including 31 species with no previous DNA barcode records.
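The "no distinct gap" statement follows directly from the reported ranges, as this small sketch shows. It assumes the usual reading of a barcode gap: the smallest interspecific divergence must exceed the largest intraspecific divergence. The function name is hypothetical; the numbers are the ranges given in the abstract.

```python
def barcode_gap(intraspecific, interspecific):
    """Return min interspecific minus max intraspecific divergence (percent).

    A positive value means a barcode gap exists; zero or negative means the
    two distributions overlap.
    """
    return min(interspecific) - max(intraspecific)


# Ranges reported above: intraspecific 0-6.7 %, interspecific 0.5-23.3 %.
gap = barcode_gap([0.0, 6.7], [0.5, 23.3])
has_gap = gap > 0
```

Because the largest within-species divergence (6.7 %) exceeds the smallest between-species divergence (0.5 %), the computed gap is negative.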
A public platform for the verification of the phenotypic effect of candidate genes for resistance to aflatoxin accumulation and Aspergillus flavus infection in maize.

PubMed

Warburton, Marilyn L; Williams, William Paul; Hawkins, Leigh; Bridges, Susan; Gresham, Cathy; Harper, Jonathan; Ozkan, Seval; Mylroie, J Erik; Shan, Xueyan

2011-07-01

A public candidate gene testing pipeline for resistance to aflatoxin accumulation or Aspergillus flavus infection in maize is presented here. The pipeline consists of steps for identifying, testing, and verifying the association of selected maize gene sequences with resistance under field conditions. Resources include a database of genetic and protein sequences associated with the reduction in aflatoxin contamination from previous studies; eight diverse inbred maize lines for polymorphism identification within any maize gene sequence; four Quantitative Trait Loci (QTL) mapping populations and one association mapping panel, all phenotyped for aflatoxin accumulation resistance and associated phenotypes; and capacity for Insertion/Deletion (InDel) and SNP genotyping in the population(s) for mapping. To date, ten genes have been identified as possible candidate genes and put through the candidate gene testing pipeline, and results are presented here to demonstrate the utility of the pipeline.
Identification of a Divergent Environmental DNA Sequence Clade Using the Phylogeny of Gregarine Parasites (Apicomplexa) from Crustacean Hosts

PubMed Central

Rueckert, Sonja; Simdyanov, Timur G.; Aleoshin, Vladimir V.; Leander, Brian S.

2011-01-01

Background: Environmental SSU rDNA surveys have significantly improved our understanding of microeukaryotic diversity. Many of the sequences acquired using this approach are closely related to lineages previously characterized at both morphological and molecular levels, making interpretation of these data relatively straightforward. Some sequences, by contrast, appear to be phylogenetic orphans and are sometimes inferred to represent “novel lineages” of unknown cellular identity. Consequently, interpretation of environmental DNA surveys of cellular diversity relies on an adequately comprehensive database of DNA sequences derived from identified species. Several major taxa of microeukaryotes, however, are still very poorly represented in these databases, and this is especially true for diverse groups of single-celled parasites, such as gregarine apicomplexans. Methodology/Principal Findings: This study attempts to address this paucity of DNA sequence data by characterizing four different gregarine species, isolated from the intestines of crustaceans, at both morphological and molecular levels: Thiriotia pugettiae sp. n. from the graceful kelp crab (Pugettia gracilis), Cephaloidophora cf. communis from two different species of barnacles (Balanus glandula and B. balanus), Heliospora cf. longissima from two different species of freshwater amphipods (Eulimnogammarus verrucosus and E. vittatus), and Heliospora caprellae comb. n. from a skeleton shrimp (Caprella alaskana). SSU rDNA sequences were acquired from isolates of these gregarine species and added to a global apicomplexan alignment containing all major groups of gregarines characterized so far. Molecular phylogenetic analyses of these data demonstrated that all of the gregarines collected from crustacean hosts formed a very strongly supported clade with 48 previously unidentified environmental DNA sequences.
Conclusions/Significance: This expanded molecular phylogenetic context enabled us to establish a major clade of intestinal gregarine parasites and infer the cellular identities of several previously unidentified environmental SSU rDNA sequences, including several sequences that have formerly been discussed broadly in the literature as a suspected “novel” lineage of eukaryotes. PMID:21483868
Genomic Characterization of the Genus Nairovirus (Family Bunyaviridae).

PubMed

Kuhn, Jens H; Wiley, Michael R; Rodriguez, Sergio E; Bào, Yīmíng; Prieto, Karla; Travassos da Rosa, Amelia P A; Guzman, Hilda; Savji, Nazir; Ladner, Jason T; Tesh, Robert B; Wada, Jiro; Jahrling, Peter B; Bente, Dennis A; Palacios, Gustavo

2016-06-10

Nairovirus, one of five bunyaviral genera, includes seven species. Genomic sequence information is limited for members of the Dera Ghazi Khan, Hughes, Qalyub, Sakhalin, and Thiafora nairovirus species. We used next-generation sequencing and historical virus-culture samples to determine 14 complete and nine coding-complete nairoviral genome sequences to further characterize these species. Previously unsequenced viruses include Abu Mina, Clo Mor, Great Saltee, Hughes, Raza, Sakhalin, Soldado, and Tillamook viruses. In addition, we present genomic sequence information on additional isolates of previously sequenced Avalon, Dugbe, Sapphire II, and Zirqa viruses. Finally, we identify Tunis virus, previously thought to be a phlebovirus, as an isolate of Abu Hammad virus. Phylogenetic analyses indicate the need for reassignment of Sapphire II virus to Dera Ghazi Khan nairovirus and reassignment of Hazara, Tofla, and Nairobi sheep disease viruses to novel species. We also propose new species for the Kasokero group (Kasokero, Leopards Hill, Yogue viruses), the Ketarah group (Gossas, Issyk-kul, Keterah/soft tick viruses) and the Burana group (Wēnzhōu tick virus, Huángpí tick virus 1, Tǎchéng tick virus 1). Our analyses emphasize the sister relationship of nairoviruses and arenaviruses, and indicate that several nairo-like viruses (Shāyáng spider virus 1, Xīnzhōu spider virus, Sānxiá water strider virus 1, South Bay virus, Wǔhàn millipede virus 2) require establishment of novel genera in a larger nairovirus-arenavirus supergroup.
Identification of somatic mutations in non-small cell lung carcinomas using whole-exome sequencing

PubMed Central

Liu, Pengyuan; Morrison, Carl; Wang, Liang; Xiong, Donghai; Vedell, Peter; Cui, Peng; Hua, Xing; Ding, Feng; Lu, Yan; James, Michael; Ebben, John D.; Xu, Haiming; Adjei, Alex A.; Head, Karen; Andrae, Jaime W.; Tschannen, Michael R.; Jacob, Howard; Pan, Jing; Zhang, Qi; Van den Bergh, Francoise; Xiao, Haijie; Lo, Ken C.; Patel, Jigar; Richmond, Todd; Watt, Mary-Anne; Albert, Thomas; Selzer, Rebecca; Anderson, Marshall; Wang, Jiang; Wang, Yian; Starnes, Sandra; Yang, Ping; You, Ming

2012-01-01

Lung cancer is the leading cause of cancer-related death, with non-small cell lung cancer (NSCLC) being the predominant form of the disease. Most lung cancer is caused by the accumulation of genomic alterations due to tobacco exposure. To uncover its mutational landscape, we performed whole-exome sequencing in 31 NSCLCs and their matched normal tissue samples. We identified both common and unique mutation spectra and pathway activation in lung adenocarcinomas and squamous cell carcinomas, two major histologies in NSCLC. In addition to identifying previously known lung cancer genes (TP53, KRAS, EGFR, CDKN2A and RB1), the analysis revealed many genes not previously implicated in this malignancy. Notably, a novel gene CSMD3 was identified as the second most frequently mutated gene (next to TP53) in lung cancer. We further demonstrated that loss of CSMD3 results in increased proliferation of airway epithelial cells. The study provides unprecedented insights into mutational processes, cellular pathways and gene networks associated with lung cancer. Of potential immediate clinical relevance, several highly mutated genes identified in our study are promising druggable targets in cancer therapy including ALK, CTNNA3, DCC, MLL3, PCDH11X, PIK3C2B, PIK3CG and ROCK2. PMID:22510280

Sequence diversity of hepatitis C virus 6a within the extended interferon sensitivity-determining region correlates with interferon-alpha/ribavirin treatment outcomes.

PubMed

Zhou, Daniel X M; Chan, Paul K S; Zhang, Tiejun; Tully, Damien C; Tam, John S

2010-10-01

Studies on the association between sequence variability of the interferon sensitivity-determining region (ISDR) of hepatitis C virus and the outcome of treatment have reached conflicting results. In this study, 25 patients infected with HCV 6a who had received interferon-alpha/ribavirin combination treatment were analyzed for sequence variations. Fourteen of them had full genome sequences obtained from a previous study, whereas the other 11 samples were sequenced for the extended ISDR (eISDR). This eISDR fragment covers 192 bp (64 amino acids) upstream and 201 bp (67 amino acids) downstream from the ISDR previously defined for HCV 1b. The comparison between interferon-alpha resistance and response groups for the amino acid mutations located in the full genome (6 and 8 patients respectively) as well as the mutations located in the eISDR (10 and 15 patients respectively) showed that the mutations I2160V, I2256V, and V2292I (P<0.05) within the eISDR were significantly associated with resistance to treatment. However, the extent of amino acid variations within the previously defined ISDR was not associated with resistance to treatment as previously reported. Four amino acid variations located outside the eISDR, namely I248V (P=0.03-0.06) within E1, R445K (P=0.02-0.05) and S747T (P=0.03) within E2, and I861V (P=0.01) within NS2, may also be associated with treatment outcome, as identified by a prescreening of variations within 14 HCV 6a full genomes. (c) 2010 Elsevier B.V. All rights reserved.
ENU Mutagenesis in Mice Identifies Candidate Genes For Hypogonadism

PubMed Central

Weiss, Jeffrey; Hurley, Lisa A.; Harris, Rebecca M.; Finlayson, Courtney; Tong, Minghan; Fisher, Lisa A.; Moran, Jennifer L.; Beier, David R.; Mason, Christopher; Jameson, J. Larry

2012-01-01

Genome-wide mutagenesis was performed in mice to identify candidate genes for male infertility, for which the predominant causes remain idiopathic. Mice were mutagenized using N-ethyl-N-nitrosourea (ENU), bred, and screened for phenotypes associated with the male urogenital system. Fifteen heritable lines were isolated and chromosomal loci were assigned using low density genome-wide SNP arrays. Ten of the fifteen lines were pursued further using higher resolution SNP analysis to narrow the candidate gene regions. Exon sequencing of candidate genes identified mutations in mice with cystic kidneys (Bicc1), cryptorchidism (Rxfp2), restricted germ cell deficiency (Plk4), and severe germ cell deficiency (Prdm9). In two other lines with severe hypogonadism candidate sequencing failed to identify mutations, suggesting defects in genes with previously undocumented roles in gonadal function. These genomic intervals were sequenced in their entirety and a candidate mutation was identified in SnrpE in one of the two lines. The line harboring the SnrpE variant retains substantial spermatogenesis despite small testis size, an unusual phenotype. In addition to the reproductive defects, heritable phenotypes were observed in mice with ataxia (Myo5a), tremors (Pmp22), growth retardation (unknown gene), and hydrocephalus (unknown gene). These results demonstrate that the ENU screen is an effective tool for identifying potential causes of male infertility. PMID:22258617
Whole mitochondrial and plastid genome SNP analysis of nine date palm cultivars reveals plastid heteroplasmy and close phylogenetic relationships among cultivars.

PubMed

Sabir, Jamal S M; Arasappan, Dhivya; Bahieldin, Ahmed; Abo-Aba, Salah; Bafeel, Sameera; Zari, Talal A; Edris, Sherif; Shokry, Ahmed M; Gadalla, Nour O; Ramadan, Ahmed M; Atef, Ahmed; Al-Kordy, Magdy A; El-Domyati, Fotoh M; Jansen, Robert K

2014-01-01

Date palm is a very important crop in western Asia and northern Africa, and it is the oldest domesticated fruit tree with archaeological records dating back 5000 years. The huge economic value of this crop has generated considerable interest in breeding programs to enhance production of dates. One of the major limitations of these efforts is the uncertainty regarding the number of date palm cultivars, which are currently based on fruit shape, size, color, and taste. Whole mitochondrial and plastid genome sequences were utilized to examine single nucleotide polymorphisms (SNPs) of date palms to evaluate the efficacy of this approach for molecular characterization of cultivars. Mitochondrial and plastid genomes of nine Saudi Arabian cultivars were sequenced. For each species about 60 million 100 bp paired-end reads were generated from total genomic DNA using the Illumina HiSeq 2000 platform. For each cultivar, sequences were aligned separately to the published date palm plastid and mitochondrial reference genomes, and SNPs were identified. The results identified cultivar-specific SNPs for eight of the nine cultivars. Two previous SNP analyses of mitochondrial and plastid genomes identified substantial intra-cultivar (= intra-varietal) polymorphisms in organellar genomes but these studies did not properly take into account the fact that nearly half of the plastid genome has been integrated into the mitochondrial genome. Filtering all sequencing reads that mapped to both organellar genomes nearly eliminated mitochondrial heteroplasmy but all plastid SNPs remained heteroplasmic. This investigation provides valuable insights into how to deal with interorganellar DNA transfer in performing SNP analyses from total genomic DNA. The results confirm recent suggestions that plastid heteroplasmy is much more common than previously thought. Finally, low levels of sequence variation in plastid and mitochondrial genomes argue for using nuclear SNPs for molecular characterization of date palm cultivars.
Bioaccessible peptides released by in vitro gastrointestinal digestion of fermented goat milks.

PubMed

Moreno-Montoro, Miriam; Jauregi, Paula; Navarro-Alarcón, Miguel; Olalla-Herrera, Manuel; Giménez-Martínez, Rafael; Amigo, Lourdes; Miralles, Beatriz

2018-06-01

In this study, ultrafiltered goat milks fermented with the classical starter bacteria Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus salivarius subsp. thermophilus or with the classical starter plus the Lactobacillus plantarum C4 probiotic strain were analyzed using ultra-high performance liquid chromatography-quadrupole-time-of-flight tandem mass spectrometry (UPLC-Q-TOF-MS/MS) and/or high performance liquid chromatography-ion trap (HPLC-IT-MS/MS). Partial overlapping of the identified sequences with regard to fermentation culture was observed. Evaluation of the cleavage specificity suggested a lower proteolytic activity of the probiotic strain. Some of the potentially identified peptides had been previously reported as angiotensin-converting enzyme (ACE) inhibitory, antioxidant, and antibacterial and might account for the in vitro activity previously reported for these fermented milks. Simulated digestion of the products was conducted in the presence of a dialysis membrane to retrieve the bioaccessible peptide fraction. Some sequences with reported physiological activity resisted digestion but were found in the non-dialyzable fraction. However, new forms released by digestion, such as the antioxidant αs1-casein ¹⁴⁴YFYPQL¹⁴⁹, the antihypertensive αs2-casein ⁹⁰YQKFPQY⁹⁶, and the antibacterial αs2-casein ¹⁶⁵LKKISQ¹⁷⁰, were found in the dialyzable fraction of both fermented milks. Moreover, in the fermented milk including the probiotic strain, the κ-casein dipeptidyl peptidase IV (DPP-IV) inhibitor ⁵¹INNQFLPYPY⁶⁰ as well as additional ACE inhibitory or antioxidant sequences could be identified. With the aim of anticipating further biological outcomes, quantitative structure activity relationship (QSAR) analysis was applied to the bioaccessible fragments and led to potential ACE inhibitory sequences being proposed. Graphical abstract: Ultrafiltered goat milks were fermented with the classical starter bacteria (St) and with St plus the L. plantarum C4 probiotic strain. Samples were analyzed using HPLC-IT-MS/MS and UPLC-Q-TOF-MS/MS. After simulated digestion and dialysis, some of the active sequences remained and new peptides with reported beneficial activities were released.
Novel Hepatozoon in vertebrates from the southern United States.

PubMed

Allen, Kelly E; Yabsley, Michael J; Johnson, Eileen M; Reichard, Mason V; Panciera, Roger J; Ewing, Sidney A; Little, Susan E

2011-08-01

Novel Hepatozoon spp. sequences collected from previously unrecognized vertebrate hosts in North America were compared with documented Hepatozoon 18S rRNA sequences in an effort to examine phylogenetic relationships between the different Hepatozoon organisms found cycling in nature. An approximately 500-base pair fragment of 18S rDNA common to Hepatozoon spp. and some other apicomplexans was amplified and sequenced from the tissues or blood of 16 vertebrate host species from the southern United States, including 1 opossum (Didelphis virginiana), 2 bobcats (Lynx rufus), 1 domestic cat (Felis catus), 3 coyotes (Canis latrans), 1 gray fox (Urocyon cinereoargenteus), 4 raccoons (Procyon lotor), 1 pet boa constrictor (Boa constrictor imperator), 1 swamp rabbit (Sylvilagus aquaticus), 1 cottontail rabbit (Sylvilagus floridanus), 4 woodrats (Neotoma fuscipes and Neotoma micropus), 3 white-footed mice (Peromyscus leucopus), 8 cotton rats (Sigmodon hispidus), 1 cotton mouse (Peromyscus gossypinus), 1 eastern grey squirrel (Sciurus carolinensis), and 1 woodchuck (Marmota monax). Phylogenetic analyses and comparison with sequences in the existing database revealed distinct groups of Hepatozoon spp., with clusters formed by sequences obtained from scavengers and carnivores (opossum, raccoons, canids, and felids) and those obtained from rodents. Surprisingly, Hepatozoon spp. sequences from wild rabbits were most closely related to sequences obtained from carnivores (97.2% identical), and the sequence from the boa constrictor was most closely related to the rodent cluster (97.4% identical). These data are consistent with recent work identifying prey-predator transmission cycles in Hepatozoon spp. and suggest this pattern may be more common than previously recognized.
Improved deoxyribozymes for synthesis of covalently branched DNA and RNA.

PubMed

Lee, Christine S; Mui, Timothy P; Silverman, Scott K

2011-01-01

A covalently branched nucleic acid can be synthesized by joining the 2'-hydroxyl of the branch-site ribonucleotide of a DNA or RNA strand to the activated 5'-phosphorus of a separate DNA or RNA strand. We have previously used deoxyribozymes to synthesize several types of branched nucleic acids for experiments in biotechnology and biochemistry. Here, we report in vitro selection experiments to identify improved deoxyribozymes for synthesis of branched DNA and RNA. Each of the new deoxyribozymes requires Mn²⁺ as a cofactor, rather than Mg²⁺ as used by our previous branch-forming deoxyribozymes, and each has an initially random region of 40 rather than 22 or fewer combined nucleotides. The deoxyribozymes all function by forming a three-helix-junction (3HJ) complex with their two oligonucleotide substrates. For synthesis of branched DNA, the best new deoxyribozyme, 8LV13, has k_obs on the order of 0.1 min⁻¹, which is about two orders of magnitude faster than our previously identified 15HA9 deoxyribozyme. 8LV13 also functions at closer-to-neutral pH than does 15HA9 (pH 7.5 versus 9.0) and has useful tolerance for many DNA substrate sequences. For synthesis of branched RNA, two new deoxyribozymes, 8LX1 and 8LX6, were identified with broad sequence tolerances and substantial activity at pH 7.5, versus pH 9.0 for many of our previous deoxyribozymes that form branched RNA. These experiments provide new, and in key aspects improved, practical catalysts for preparation of synthetic branched DNA and RNA.
Microbial Characterization During the Early Habitation of the International Space Station

NASA Technical Reports Server (NTRS)

Castro, V. A.; Thrasher, A. N.; Healy, M.; Ott, C. M.; Pierson, D. L.

2004-01-01

An evaluation of the microbiota from air, water, and surface samples provided a baseline of microbial characterization onboard the International Space Station (ISS) to gain insight into bacterial and fungal contamination during the initial stages of construction and habitation. Using 16S genetic sequencing and rep-PCR, 63 bacterial strains were isolated for identification and fingerprinted for microbial tracking. Of the bacterial strains that were isolated and fingerprinted, 19 displayed similarity to each other. The use of these molecular tools allowed for the identification of bacteria not previously identified using automated biochemical analysis and provided a clear indication of the source of several ISS contaminants. Strains of Bradyrhizobium and Sphingomonas unable to be identified using sequencing were identified by comparison of rep-PCR DNA fingerprints. Distinct DNA fingerprints for several strains of Methylobacterium provided a clear indication of the source of an ISS water supply contaminant. Fungal and bacterial data acquired during monitoring do not suggest there is a current microbial hazard to the spacecraft, nor does any trend indicate a potential health risk. Previous spacecraft environmental analysis indicated that microbial contamination will increase with time and will require continued surveillance. Copyright 2004 Springer-Verlag.
Mutations in GBA are associated with familial Parkinson disease susceptibility and age at onset.

PubMed

Nichols, W C; Pankratz, N; Marek, D K; Pauciulo, M W; Elsaesser, V E; Halter, C A; Rudolph, A; Wojcieszek, J; Pfeiffer, R F; Foroud, T

2009-01-27

To characterize sequence variation within the glucocerebrosidase (GBA) gene in a select subset of our sample of patients with familial Parkinson disease (PD) and then to test in our full sample whether these sequence variants increased the risk for PD and were associated with an earlier onset of disease. We performed a comprehensive study of all GBA exons in one patient with PD from each of 96 PD families, selected based on the family-specific lod scores at the GBA locus. Identified GBA variants were subsequently screened in all 1325 PD cases from 566 multiplex PD families and in 359 controls. Nine different GBA variants, five previously reported, were identified in 21 of the 96 PD cases sequenced. Screening for these variants in the full sample identified 161 variant carriers (12.2%) in 99 different PD families. An unbiased estimate of the frequency of the five previously reported GBA variants in the familial PD sample was 12.6% and in the control sample was 5.3% (odds ratio 2.6; 95% confidence interval 1.5-4.4). Presence of a GBA variant was associated with an earlier age at onset (p = 0.0001). On average, those patients carrying a GBA variant had onset with PD 6.04 years earlier than those without a GBA variant. This study suggests that GBA is a susceptibility gene for familial Parkinson disease (PD) and patients with GBA variants have an earlier age at onset than patients with PD without GBA variants.
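
The carrier frequency and odds ratio quoted in this abstract can be re-derived from the reported figures. The sketch below is only an arithmetic check, not the authors' analysis: the function name `odds_ratio` is hypothetical, and the confidence interval cannot be reproduced here because the abstract does not give the underlying carrier counts for the unbiased estimate.

```python
def odds_ratio(p_case: float, p_control: float) -> float:
    """Odds ratio from two carrier proportions (each strictly between 0 and 1)."""
    odds_case = p_case / (1.0 - p_case)
    odds_control = p_control / (1.0 - p_control)
    return odds_case / odds_control


# Figures taken from the abstract above.
carrier_percent = 161 / 1325 * 100        # variant carriers among all PD cases
or_estimate = odds_ratio(0.126, 0.053)    # 12.6% of cases vs 5.3% of controls

print(f"carriers: {carrier_percent:.1f}%")
print(f"odds ratio: {or_estimate:.1f}")
```

Both values agree, after rounding, with the 12.2% and 2.6 reported in the abstract.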
The floral transcriptomes of four bamboo species (Bambusoideae; Poaceae): support for common ancestry among woody bamboos.

PubMed

Wysocki, William P; Ruiz-Sanchez, Eduardo; Yin, Yanbin; Duvall, Melvin R

2016-05-20

Next-generation sequencing now allows for total RNA extracts to be sequenced in non-model organisms such as bamboos, an economically and ecologically important group of grasses. Bamboos are divided into three lineages, two of which are woody perennials with bisexual flowers, which undergo gregarious monocarpy. The third lineage, which are herbaceous perennials, possesses unisexual flowers that undergo annual flowering events. Transcriptomes were assembled using both reference-based and de novo methods. These two methods were tested by characterizing transcriptome content using sequence alignment to previously characterized reference proteomes and by identifying Pfam domains. Because of the striking differences in floral morphology and phenology between the herbaceous and woody bamboo lineages, MADS-box genes, transcription factors that control floral development and timing, were characterized and analyzed in this study. Transcripts were identified using phylogenetic methods and categorized as A, B, C, D or E-class genes, which control floral development, or SOC or SVP-like genes, which control the timing of flowering events. Putative nuclear orthologues were also identified in bamboos to use as phylogenetic markers. Instances of gene copies exhibiting topological patterns that correspond to shared phenotypes were observed in several gene families including floral development and timing genes. Alignments and phylogenetic trees were generated for 3,878 genes and for all genes in a concatenated analysis. Both the concatenated analysis and those of 2,412 separate gene trees supported monophyly among the woody bamboos, which is incongruent with previous phylogenetic studies using plastid markers.
Isolation and genetic characterization of Aurantimonas and Methylobacterium strains from stems of hypernodulated soybeans.

PubMed

Anda, Mizue; Ikeda, Seishi; Eda, Shima; Okubo, Takashi; Sato, Shusei; Tabata, Satoshi; Mitsui, Hisayuki; Minamisawa, Kiwamu

2011-01-01

The aims of this study were to isolate Aurantimonas and Methylobacterium strains that responded to soybean nodulation phenotypes and nitrogen fertilization rates in a previous culture-independent analysis (Ikeda et al. ISME J. 4:315-326, 2010). Two strategies were adopted for isolation from enriched bacterial cells prepared from stems of field-grown, hypernodulated soybeans: PCR-assisted isolation for Aurantimonas and selective cultivation for Methylobacterium. Thirteen of 768 isolates cultivated on Nutrient Agar medium were identified as Aurantimonas by colony PCR specific for Aurantimonas and 16S rRNA gene sequencing. Meanwhile, among 187 isolates on methanol-containing agar media, 126 were identified by 16S rRNA gene sequences as Methylobacterium. A clustering analysis (>99% identity) of the 16S rRNA gene sequences for the combined datasets of the present and previous studies revealed 4 and 8 operational taxonomic units (OTUs) for Aurantimonas and Methylobacterium, respectively, and showed the successful isolation of target bacteria for these two groups. ERIC- and BOX-PCR showed the genomic uniformity of the target isolates. In addition, phylogenetic analyses of Aurantimonas revealed a phyllosphere-specific cluster in the genus. The isolates obtained in the present study will be useful for revealing unknown legume-microbe interactions in relation to the autoregulation of nodulation.
hsp65 PCR-restriction analysis (PRA) with capillary electrophoresis in comparison to three other methods for identification of Mycobacterium species.

PubMed

Sajduda, Anna; Martin, Anandi; Portaels, Françoise; Palomino, Juan Carlos

2010-02-01

We developed a scheme for rapid identification of Mycobacterium species using an automated fluorescence capillary electrophoresis instrument. A 441-bp region of the hsp65 gene was examined using PCR-restriction analysis (PRA). The assay was initially evaluated on 38 reference strains. The observed sizes of restriction fragments were consistently smaller than the real sizes for each of the species as deduced from the sequence analysis (mean variance=7bp). Nevertheless, the obtained PRA patterns were highly reproducible and resulted in correct species identifications. A blind test was then successfully performed on 64 test isolates previously characterized by conventional biochemical methods, a commercial INNO-LiPA Mycobacteria assay and/or sequence determination of the 5' end of 16S rRNA gene. A total of 14 of 64 isolates were erroneously identified by conventional methods (78% accuracy). In contrast, PRA performed very well in comparison with the LiPA (89% concordance) and especially with DNA sequencing (93.3% of concordant results). Also, PRA identified seven isolates representing five previously unreported hsp65 alleles. We conclude that hsp65 PRA based on automated capillary electrophoresis is a rapid, simple and reliable method for identification of mycobacteria. Copyright 2010 Elsevier B.V. All rights reserved.
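
As a quick check of the accuracy figure quoted for conventional methods, the following sketch recomputes it from the counts in the abstract. The helper name `percent_correct` is hypothetical. The LiPA and sequencing concordance figures are not recomputed, because the abstract does not state how many isolates were tested by each of those methods.

```python
def percent_correct(total: int, errors: int) -> float:
    """Percentage of isolates identified correctly, given the number of errors."""
    if total <= 0 or not 0 <= errors <= total:
        raise ValueError("need total > 0 and 0 <= errors <= total")
    return (total - errors) / total * 100


# 14 of 64 test isolates were misidentified by conventional methods.
conventional_accuracy = percent_correct(total=64, errors=14)
print(f"{conventional_accuracy:.0f}%")
```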
A Streamlined Protocol for Molecular Testing of the DMD Gene within a Diagnostic Laboratory: A Combination of Array Comparative Genomic Hybridization and Bidirectional Sequence Analysis

PubMed Central

Marquis-Nicholson, Renate; Lai, Daniel; Love, Jennifer M.; Love, Donald R.

2013-01-01

Purpose. The aim of this study was to develop a streamlined mutation screening protocol for the DMD gene in order to confirm a clinical diagnosis of Duchenne or Becker muscular dystrophy in affected males and to clarify the carrier status of female family members. Methods. Sequence analysis and array comparative genomic hybridization (aCGH) were used to identify mutations in the dystrophin (DMD) gene. We analysed genomic DNA from six individuals with a range of previously characterised mutations and from eight individuals who had not previously undergone any form of molecular analysis. Results. We successfully identified the known mutations in all six patients. A molecular diagnosis was also made in three of the four patients with a clinical diagnosis who had not undergone prior genetic screening, and testing for familial mutations was successfully completed for the remaining four patients. Conclusion. The mutation screening protocol described here meets best practice guidelines for molecular testing of the DMD gene in a diagnostic laboratory. The aCGH method is a superior alternative to more conventional assays such as multiplex ligation-dependent probe amplification (MLPA). The combination of aCGH and sequence analysis will detect mutations in 98% of patients with Duchenne or Becker muscular dystrophy. PMID:23476807
Global survey of genomic imprinting by transcriptome sequencing.

PubMed

Babak, Tomas; Deveale, Brian; Armour, Christopher; Raymond, Christopher; Cleary, Michele A; van der Kooy, Derek; Johnson, Jason M; Lim, Lee P

2008-11-25

Genomic imprinting restricts gene expression to a paternal or maternal allele. To date, approximately 90 imprinted transcripts have been identified in mouse, of which the majority were detected after intense interrogation of clusters of imprinted genes identified by phenotype-driven assays in mice with uniparental disomies [1]. Here we use selective priming and parallel sequencing to measure allelic bias in whole transcriptomes. By distinguishing parent-of-origin bias from strain-specific bias in embryos derived from a reciprocal cross of mice, we constructed a genome-wide map of imprinted transcription. This map was able to objectively locate over 80% of known imprinted loci and allowed the detection and confirmation of six novel imprinted genes. Even in the intensely studied embryonic day 9.5 developmental stage that we analyzed, more than half of all imprinted single-nucleotide polymorphisms did not overlap previously discovered imprinted transcripts; a large fraction of these represent novel noncoding RNAs within known imprinted loci. For example, a previously unnoticed, maternally expressed antisense transcript was mapped within the Grb10 locus. This study demonstrates the feasibility of using transcriptome sequencing for mapping of imprinted gene expression in physiologically normal animals. Such an approach will allow researchers to study imprinting without restricting themselves to individual loci or specific transcripts.
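
The logic of separating parent-of-origin bias from strain-specific bias in a reciprocal cross can be illustrated with a toy classifier. This is a minimal sketch under stated assumptions, not the authors' method: the names `AlleleCounts` and `classify_bias` and the 0.7 cutoff are hypothetical, and a real analysis would use a statistical test on read counts rather than a fixed threshold.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Read counts at one SNP, split by the strain each allele comes from."""
    strain_a: int
    strain_b: int

    def fraction_a(self) -> float:
        total = self.strain_a + self.strain_b
        if total == 0:
            raise ValueError("no reads cover this SNP")
        return self.strain_a / total


def classify_bias(cross_ab: AlleleCounts, cross_ba: AlleleCounts,
                  threshold: float = 0.7) -> str:
    """Classify allelic bias at one SNP from a reciprocal cross.

    cross_ab: embryos from a strain A mother and strain B father.
    cross_ba: embryos from a strain B mother and strain A father.
    threshold: hypothetical cutoff on the allelic fraction (assumption).
    """
    a_in_ab = cross_ab.fraction_a()
    a_in_ba = cross_ba.fraction_a()
    maternal_in_ab = a_in_ab          # mother is strain A in this cross
    maternal_in_ba = 1.0 - a_in_ba    # mother is strain B in this cross

    # The bias follows the parent: the same parental allele wins in both crosses.
    if maternal_in_ab >= threshold and maternal_in_ba >= threshold:
        return "maternal"
    if maternal_in_ab <= 1.0 - threshold and maternal_in_ba <= 1.0 - threshold:
        return "paternal"
    # The bias follows the strain: the same strain's allele wins in both crosses.
    if a_in_ab >= threshold and a_in_ba >= threshold:
        return "strain_a"
    if a_in_ab <= 1.0 - threshold and a_in_ba <= 1.0 - threshold:
        return "strain_b"
    return "none"


imprinted = classify_bias(AlleleCounts(90, 10), AlleleCounts(10, 90))
strain_effect = classify_bias(AlleleCounts(90, 10), AlleleCounts(90, 10))
balanced = classify_bias(AlleleCounts(50, 50), AlleleCounts(50, 50))
print(imprinted, strain_effect, balanced)
```

The reciprocal cross is what makes the two cases distinguishable: with only one cross direction, a maternally expressed transcript and a strain A expression bias would produce identical counts.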
A cohort of new adhesive proteins identified from transcriptomic analysis of mussel foot glands.

PubMed

DeMartini, Daniel G; Errico, John M; Sjoestroem, Sebastian; Fenster, April; Waite, J Herbert

2017-06-01

The adaptive attachment of marine mussels to a wide range of substrates in a high-energy, saline environment has been explored for decades and is a significant driver of bioinspired wet adhesion research. Mussel attachment relies on a fibrous holdfast known as the byssus, which is made by a specialized appendage called the foot. Multiple adhesive and structural proteins are rapidly synthesized, secreted and moulded by the foot into holdfast threads. About 10 well-characterized proteins, namely the mussel foot proteins (Mfps), the preCols and the thread matrix proteins, are reported as representing the bulk of these structures. To explore how robust this proposition is, we sequenced the transcriptome of the glandular tissues that produce and secrete the various holdfast components using next-generation sequencing methods. Surprisingly, we found around 15 highly expressed genes that have not previously been characterized, but bear key similarities to the previously defined mussel foot proteins, suggesting additional contribution to byssal function. We verified the validity of these transcripts by polymerase chain reaction, cloning and Sanger sequencing as well as confirming their presence as proteins in the byssus. These newly identified proteins greatly expand the palette of mussel holdfast biochemistry and provide new targets for investigation into bioinspired wet adhesion. © 2017 The Author(s).
More Easily Cultivated Than Identified: Classical Isolation With Molecular Identification of Vaginal Bacteria.

PubMed

Srinivasan, Sujatha; Munch, Matthew M; Sizova, Maria V; Fiedler, Tina L; Kohler, Christina M; Hoffman, Noah G; Liu, Congzhou; Agnew, Kathy J; Marrazzo, Jeanne M; Epstein, Slava S; Fredricks, David N

2016-08-15

Women with bacterial vaginosis (BV) have complex communities of anaerobic bacteria. There are no cultivated isolates of several bacteria identified using molecular methods and associated with BV. It is unclear whether this is due to the inability to adequately propagate these bacteria or to correctly identify them in culture. Vaginal fluid from 15 women was plated on 6 different media using classical cultivation approaches. Individual isolates were identified by 16S ribosomal RNA (rRNA) gene sequencing and compared with validly described species. Bacterial community profiles in vaginal samples were determined using broad-range 16S rRNA gene polymerase chain reaction and pyrosequencing. We isolated and identified 101 distinct bacterial strains spanning 6 phyla including (1) novel strains with <98% 16S rRNA sequence identity to validly described species, (2) closely related species within a genus, (3) bacteria previously isolated from body sites other than the vagina, and (4) known bacteria formerly isolated from the vagina. Pyrosequencing showed that novel strains Peptoniphilaceae DNF01163 and Prevotellaceae DNF00733 were prevalent in women with BV. We isolated a diverse set of novel and clinically significant anaerobes from the human vagina using conventional approaches with systematic molecular identification. Several previously "uncultivated" bacteria are amenable to conventional cultivation. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America.
Genomics and introgression: discovery and mapping of thousands of species-diagnostic SNPs using RAD sequencing

USGS Publications Warehouse

Hand, Brian K.; Hether, Tyler D; Kovach, Ryan P.; Muhlfeld, Clint C.; Amish, Stephen J.; Boyer, Matthew C.; O’Rourke, Sean M.; Miller, Michael R.; Lowe, Winsor H.; Hohenlohe, Paul A.; Luikart, Gordon

2015-01-01

Invasive hybridization and introgression pose a serious threat to the persistence of many native species. Understanding the effects of hybridization on native populations (e.g., fitness consequences) requires numerous species-diagnostic loci distributed genome-wide. Here we used RAD sequencing to discover thousands of single-nucleotide polymorphisms (SNPs) that are diagnostic between rainbow trout (RBT, Oncorhynchus mykiss), the world’s most widely introduced fish, and native westslope cutthroat trout (WCT, O. clarkii lewisi) in the northern Rocky Mountains, USA. We advanced previous work that identified 4,914 species-diagnostic loci by using longer sequence reads (100 bp vs. 60 bp) and a larger set of individuals (n = 84). We sequenced RAD libraries for individuals from diverse sampling sources, including native populations of WCT and hatchery broodstocks of WCT and RBT. We also took advantage of a newly released reference genome assembly for RBT to align our RAD loci. In total, we discovered 16,788 putatively diagnostic SNPs, 10,267 of which we mapped to anchored chromosome locations on the RBT genome. A small portion of previously discovered putative diagnostic loci (325 of 4,914) were no longer diagnostic (i.e., fixed between species) based on our wider survey of non-hybridized RBT and WCT individuals. Our study suggests that RAD loci mapped to a draft genome assembly could provide the marker density required to identify genes and chromosomal regions influencing selection in admixed populations of conservation concern and evolutionary interest.
First Report of a Fatal Case Associated with EV-D68 Infection in Hong Kong and Emergence of an Interclade Recombinant in China Revealed by Genome Analysis.

PubMed

Yip, Cyril C Y; Lo, Janice Y C; Sridhar, Siddharth; Lung, David C; Luk, Shik; Chan, Kwok-Hung; Chan, Jasper F W; Cheng, Vincent C C; Woo, Patrick C Y; Yuen, Kwok-Yung; Lau, Susanna K P

2017-05-16

A fatal case associated with enterovirus D68 (EV-D68) infection affecting a 10-year-old boy was reported in Hong Kong in 2014. To examine if a new strain has emerged in Hong Kong, we sequenced the partial genome of the EV-D68 strain identified from the fatal case and the complete VP1, and partial 5'UTR and 2C sequences of nine additional EV-D68 strains isolated from patients in Hong Kong. Sequence analysis indicated that a cluster of strains including the previously recognized A2 strains should belong to a separate clade, clade D, which is further divided into subclades D1 and D2. Among the 10 EV-D68 strains, 7 (including the fatal case) belonged to the previously described, newly emerged subclade B3, 2 belonged to subclade B1, and 1 belonged to subclade D1. Three EV-D68 strains, each from subclades B1, B3, and D1, were selected for complete genome sequencing and recombination analysis. While no evidence of recombination was noted among local strains, interclade recombination was identified in subclade D2 strains detected in mainland China in 2008 with VP2 acquired from clade A. This study supports the reclassification of subclade A2 into clade D1, and demonstrates interclade recombination between clades A and D2 in EV-D68 strains from China.
Discovery of Novel Viruses in Mosquitoes from the Zambezi Valley of Mozambique

PubMed Central

Hayer, Juliette; Abilio, Ana Paula; Mulandane, Fernando Chanisso; Verner-Carlsson, Jenny; Falk, Kerstin I.; Fafetine, Jose M.; Berg, Mikael; Blomström, Anne-Lie

2016-01-01

Mosquitoes carry a wide variety of viruses that can cause vector-borne infectious diseases and affect both human and veterinary public health. Although Mozambique can be considered a hot spot for emerging infectious diseases due to factors such as a rich vector population and a close vector/human/wildlife interface, the viral flora in mosquitoes have not previously been investigated. In this study, viral metagenomics was employed to analyze the viral communities in Culex and Mansonia mosquitoes in the Zambezia province of Mozambique. Among the 1.7 and 2.6 million sequences produced from the Culex and Mansonia samples, respectively, 3269 and 983 reads were classified as viral sequences. Viruses belonging to the Flaviviridae, Rhabdoviridae and Iflaviridae families were detected, and different unclassified single- and double-stranded RNA viruses were also identified. A near complete genome of a flavivirus, tentatively named Cuacua virus, was obtained from the Mansonia mosquitoes. Phylogenetic analysis of this flavivirus, using the NS5 amino acid sequence, showed that it grouped with ‘insect-specific’ viruses and was most closely related to Nakiwogo virus previously identified in Uganda. Both mosquito genera had viral sequences related to Rhabdoviruses, and these were most closely related to Culex tritaeniorhynchus rhabdovirus (CTRV). The results from this study suggest that several viruses specific for insects belonging to, for example, the Flaviviridae and Rhabdoviridae families, as well as a number of unclassified RNA viruses, are present in mosquitoes in Mozambique. PMID:27682810
Refined identification of Vibrio bacterial flora from Acanthasther planci based on biochemical profiling and analysis of housekeeping genes.

PubMed

Rivera-Posada, J A; Pratchett, M; Cano-Gomez, A; Arango-Gomez, J D; Owens, L

2011-09-09

We used a polyphasic approach for precise identification of bacterial flora (Vibrionaceae) isolated from crown-of-thorns starfish (COTS) from Lizard Island (Great Barrier Reef, Australia) and Guam (U.S.A., Western Pacific Ocean). Previous 16S rRNA gene phylogenetic analysis was useful to allocate and identify isolates within the Photobacterium, Splendidus and Harveyi clades but failed in the identification of Vibrio harveyi-like isolates. Species of the V. harveyi group have almost indistinguishable phenotypes and genotypes, and thus, identification by standard biochemical tests and 16S rRNA gene analysis is commonly inaccurate. Biochemical profiling and sequence analysis of additional topA and mreB housekeeping genes were carried out for definitive identification of 19 bacterial isolates recovered from sick and wild COTS. For 8 isolates, biochemical profiles and topA and mreB gene sequence alignments with the closest relatives (GenBank) confirmed previous 16S rRNA-based identification: V. fortis and Photobacterium eurosenbergii species (from wild COTS), and V. natriegens (from diseased COTS). Further phylogenetic analysis based on topA and mreB concatenated sequences served to identify the remaining 11 V. harveyi-like isolates: V. owensii and V. rotiferianus (from wild COTS), and V. owensii, V. rotiferianus, and V. harveyi (from diseased COTS). This study further confirms the reliability of topA-mreB gene sequence analysis for identification of these close species, and it reveals a wider distribution range of the potentially pathogenic V. harveyi group.
Excess of genomic defects in a woolly mammoth on Wrangel island

PubMed Central

Slatkin, Montgomery

2017-01-01

Woolly mammoths (Mammuthus primigenius) populated Siberia, Beringia, and North America during the Pleistocene and early Holocene. Recent breakthroughs in ancient DNA sequencing have allowed for complete genome sequencing for two specimens of woolly mammoths (Palkopoulou et al. 2015). One mammoth specimen is from a mainland population 45,000 years ago when mammoths were plentiful. The second, a 4300 yr old specimen, is derived from an isolated population on Wrangel island where mammoths subsisted with small effective population size more than 43-fold lower than previous populations. These extreme differences in effective population size offer a rare opportunity to test nearly neutral models of genome architecture evolution within a single species. Using these previously published mammoth sequences, we identify deletions, retrogenes, and non-functionalizing point mutations. In the Wrangel island mammoth, we identify a greater number of deletions, a larger proportion of deletions affecting gene sequences, a greater number of candidate retrogenes, and an increased number of premature stop codons. This accumulation of detrimental mutations is consistent with genomic meltdown in response to low effective population sizes in the dwindling mammoth population on Wrangel island. In addition, we observe high rates of loss of olfactory receptors and urinary proteins, either because these loci are non-essential or because they were favored by divergent selective pressures in island environments. Finally, at the locus of FOXQ1 we observe two independent loss-of-function mutations, which would confer a satin coat phenotype in this island woolly mammoth. PMID:28253255

Seismic sequences in the Sombrero Seismic Zone

NASA Astrophysics Data System (ADS)

Pulliam, J.; Huerfano, V. A.; ten Brink, U.; von Hillebrandt, C.

2007-05-01

The northeastern Caribbean, in the vicinity of Puerto Rico and the Virgin Islands, has a long and well-documented history of devastating earthquakes and tsunamis, including major events in 1670, 1787, 1867, 1916, 1918, and 1943. Recently, seismicity has been concentrated to the north and west of the British Virgin Islands, in the region referred to as the Sombrero Seismic Zone by the Puerto Rico Seismic Network (PRSN). In the combined seismicity catalog maintained by the PRSN, several hundred small to moderate magnitude events can be found in this region prior to 2006. However, beginning in 2006 and continuing to the present, the rate of seismicity in the Sombrero suddenly increased, and a new locus of activity developed to the east of the previous location. Accurate estimates of seismic hazard, and the tsunamigenic potential of seismic events, depend on an accurate and comprehensive understanding of how strain is being accommodated in this corner region. Are faults locked and accumulating strain for release in a major event? Or is strain being released via slip over a diffuse system of faults? A careful analysis of seismicity patterns in the Sombrero region has the potential to both identify faults and modes of failure, provided the aggregation scheme is tuned to properly identify related events. To this end, we experimented with a scheme to identify seismic sequences based on physical and temporal proximity, under the assumptions that (a) events occur on related fault systems as stress is refocused by immediately previous events and (b) such 'stress waves' die out with time, so that two events that occur on the same system within a relatively short time window can be said to have a similar 'trigger' in ways that two nearby events that occurred years apart cannot. 
Patterns that emerge from the identification, temporal sequence, and refined locations of such sequences of events carry information about stress accommodation that is obscured by large clouds of unrelated events in plots of the general catalog. One characteristic of these sequences is that their magnitudes tend to be consistently small (1.0 - 3.5 mb, with only five events greater than 3.5 mb). Nevertheless, the number of events and the temporal and geographic distribution of shocks in each sequence suggest that these are aftershock sequences, yet none includes an event that could confidently be identified as a "main" shock. This observation suggests several questions. Do these sequences truly represent aftershocks? If so, where are the main events? Are they perhaps related to "silent" or "slow" earthquakes in the subduction zone? If so, could such slow earthquakes be related to the dropping away of the subducting slab beneath the deep Puerto Rico Trench? Or do the sequences indicate tearing of the lithosphere of the North America plate as it subducts beneath the Caribbean plate?
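The grouping idea described in this abstract (events belong to the same sequence when they are close in both space and time) can be illustrated with a minimal sketch. This is not the authors' scheme: the function names, the 10 km and 3 day thresholds, and the sample coordinates are all hypothetical, and single-linkage grouping is only one of several ways to aggregate events.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def group_sequences(events, max_km=10.0, max_days=3.0):
    """Group events into sequences by single linkage.

    events: list of (time_in_days, latitude, longitude) tuples.
    Two events are linked when they are at most max_km apart and at most
    max_days apart; a sequence is a connected set of linked events.
    Both thresholds are hypothetical tuning parameters.
    Returns a list of lists of event indices, each in time order.
    """
    order = sorted(range(len(events)), key=lambda i: events[i][0])
    parent = list(range(len(events)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(order):
        ta, lat_a, lon_a = events[a]
        for b in order[pos + 1:]:
            tb, lat_b, lon_b = events[b]
            if tb - ta > max_days:
                break  # later events are even further away in time
            if haversine_km(lat_a, lon_a, lat_b, lon_b) <= max_km:
                parent[find(b)] = find(a)

    groups = {}
    for i in order:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Hypothetical catalog: two nearby events a day apart, one distant event,
# and one event at the same place more than a year later.
sample_events = [
    (0.0, 18.50, -64.00),
    (1.0, 18.52, -64.01),
    (400.0, 18.50, -64.00),
    (1.5, 19.50, -63.00),
]
sample_groups = group_sequences(sample_events)
```

With these made-up inputs only the first two events end up in the same sequence; the event that recurs at the same location 400 days later is kept separate, which mirrors assumption (b) in the abstract.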
Identification of mildew resistance in wild and cultivated Central Asian grape germplasm

PubMed Central

2013-01-01

Background Cultivated grapevines, Vitis vinifera subsp. sativa, evolved from their wild relative, V. vinifera subsp. sylvestris. They were domesticated in Central Asia in the absence of the powdery mildew fungus, Erysiphe necator, which is thought to have originated in North America. However, powdery mildew resistance has previously been discovered in two Central Asian cultivars and in Chinese Vitis species. Results A set of 380 unique genotypes were evaluated with data generated from 34 simple sequence repeat (SSR) markers. The set included 306 V. vinifera cultivars, 40 accessions of V. vinifera subsp. sylvestris, and 34 accessions of Vitis species from northern Pakistan, Afghanistan and China. Based on the presence of four SSR alleles previously identified as linked to the powdery mildew resistance locus, Ren1, 10 new mildew resistant genotypes were identified in the test set: eight were V. vinifera cultivars and two were V. vinifera subsp. sylvestris based on flower and seed morphology. Sequence comparison of a 620 bp region that includes the Ren1-linked allele (143 bp) of the co-segregating SSR marker SC8-0071-014, revealed that the ten newly identified genotypes have sequences that are essentially identical to the previously identified mildew resistant V. vinifera cultivars: ‘Kishmish vatkana’ and ‘Karadzhandal’. Kinship analysis determined that three of the newly identified powdery mildew resistant accessions had a relationship with ‘Kishmish vatkana’ and ‘Karadzhandal’, and that six were not related to any other accession in this study set. Clustering procedures assigned accessions into three groups: 1) Chinese species; 2) a mixed group of cultivated and wild V. vinifera; and 3) table grape cultivars, including nine of the powdery mildew resistant accessions. Gene flow was detected among the groups. Conclusions This study provides evidence that powdery mildew resistance is present in V. vinifera subsp. sylvestris, the dioecious wild progenitor of the cultivated grape. Four first-degree parent progeny relationships were discovered among the hermaphroditic powdery mildew resistant cultivars, supporting the existence of intentional grape breeding efforts. Although several Chinese grape species are resistant to powdery mildew, no direct genetic link to the resistance found in V. vinifera could be established. PMID:24093598
Border Disease Virus among Chamois, Spain

PubMed Central

Rosell, Rosa; Cabezón, Oscar; Mentaberre, Gregorio; Casas, Encarna; Velarde, Roser; Lavín, Santiago

2009-01-01

Approximately 3,000 Pyrenean chamois (Rupicapra pyrenaica pyrenaica) died in northeastern Spain during 2005–2007. Border disease virus infection was identified by reverse transcription–PCR and sequencing analysis. These results implicate this virus as the primary cause of death, similar to findings in the previous epizootic in 2001. PMID:19239761
Lineage and genogroup-defining single nucleotide polymorphisms of Escherichia coli O157:H7

USDA-ARS's Scientific Manuscript database

Escherichia coli O157:H7 is a zoonotic human pathogen for which cattle are an important reservoir host. Using both previously published and new sequencing data, a 48-locus single nucleotide polymorphism (SNP) based typing panel was developed that redundantly identified eleven genogroups that span ...
Genetic diversity of Clostridium perfringens type A isolates from animals, food poisoning outbreaks and sludge

PubMed Central

Johansson, Anders; Aspan, Anna; Bagge, Elisabeth; Båverud, Viveca; Engström, Björn E; Johansson, Karl-Erik

2006-01-01

Background Clostridium perfringens, a serious pathogen, causes enteric diseases in domestic animals and food poisoning in humans. The epidemiological relationship between C. perfringens isolates from the same source has previously been investigated chiefly by pulsed-field gel electrophoresis (PFGE). In this study the genetic diversity of C. perfringens isolated from various animals, from food poisoning outbreaks and from sludge was investigated. Results We used PFGE to examine the genetic diversity of 95 C. perfringens type A isolates from eight different sources. The isolates were also examined for the presence of the beta2 toxin gene (cpb2) and the enterotoxin gene (cpe). The cpb2 gene from the 28 cpb2-positive isolates was also partially sequenced (519 bp, corresponding to positions 188 to 706 in the consensus cpb2 sequence). The results of PFGE revealed a wide genetic diversity among the C. perfringens type A isolates. The genetic relatedness of the isolates ranged from 58 to 100% and 56 distinct PFGE types were identified. Almost all clusters with similar patterns comprised isolates with a known epidemiological correlation. Most of the isolates from pig, horse and sheep carried the cpb2 gene. All isolates originating from food poisoning outbreaks carried the cpe gene and three of these also carried cpb2. Two evolutionarily different populations were identified by sequence analysis of the partially sequenced cpb2 genes from our study and cpb2 sequences previously deposited in GenBank. Conclusion As revealed by PFGE, there was a wide genetic diversity among C. perfringens isolates from different sources. Epidemiologically related isolates showed a high genetic similarity, as expected, while isolates with no obvious epidemiological relationship expressed a lesser degree of genetic similarity. The wide diversity revealed by PFGE was not reflected in the 16S rRNA sequences, which had a considerable degree of sequence similarity. Sequence comparison of the partially sequenced cpb2 gene revealed two genetically different populations. This is to our knowledge the first study in which the genetic diversity of C. perfringens isolates from different animal species, from food poisoning outbreaks and from sludge has been investigated. PMID:16737528
Brief Report: Cryopyrin-Associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation.

PubMed

Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M; Walts, Avram D; Hoffmann, Patrycja; Remmers, Elaine F; Kastner, Daniel L; Ombrello, Amanda K

2015-09-01

To identify the cause of disease in an adult patient presenting with recent-onset fevers, chills, urticaria, fatigue, and profound myalgia, who was found to be negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient's whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3-16.8% in monocytes and 15.2-18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, in buccal cells, and in the patient's cultured fibroblasts. Our findings indicate the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively parallel sequencing in clinical diagnosis. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
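The mutant allele frequencies quoted in this abstract are, in the usual sense, the fraction of sequencing reads at a position that carry the variant. A minimal sketch of that arithmetic follows; the function name and the read counts are hypothetical and are not taken from the study.

```python
def mutant_allele_frequency(alt_reads, ref_reads):
    """Fraction of reads at one position that carry the mutant allele."""
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover this position")
    return alt_reads / total

# Hypothetical read counts per sample, as (mutant reads, reference reads).
# These numbers are illustrative only.
example_counts = {
    "whole blood": (15, 85),
    "monocytes": (14, 86),
    "B lymphocytes": (0, 100),
}
example_frequencies = {
    sample: mutant_allele_frequency(alt, ref)
    for sample, (alt, ref) in example_counts.items()
}
```

A germline heterozygous variant would be expected near 50% in every tissue, so a frequency well below that in some cell types and near zero in others is the pattern that points toward somatic mosaicism.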
Mutation Analysis of SLC26A4 for Pendred Syndrome and Nonsyndromic Hearing Loss by High-Resolution Melting

PubMed Central

Chen, Neng; Tranebjærg, Lisbeth; Rendtorff, Nanna Dahl; Schrijver, Iris

2011-01-01

Pendred syndrome and DFNB4 (autosomal recessive nonsyndromic congenital deafness, locus 4) are associated with autosomal recessive congenital sensorineural hearing loss and mutations in the SLC26A4 gene. Extensive allelic heterogeneity, however, necessitates analysis of all exons and splice sites to identify mutations for individual patients. Although Sanger sequencing is the gold standard for mutation detection, screening methods supplemented with targeted sequencing can provide a cost-effective alternative. One such method, denaturing high-performance liquid chromatography, was developed for clinical mutation detection in SLC26A4. However, this method inherently cannot distinguish homozygous changes from wild-type sequences. High-resolution melting (HRM), on the other hand, can detect heterozygous and homozygous changes cost-effectively, without any post-PCR modifications. We developed a closed-tube HRM mutation detection method specific for SLC26A4 that can be used in the clinical diagnostic setting. Twenty-eight primer pairs were designed to cover all 21 SLC26A4 exons and splice junction sequences. Using the resulting amplicons, initial HRM analysis detected all 45 variants previously identified by sequencing. Subsequently, a 384-well plate format was designed for up to three patient samples per run. Blinded HRM testing on these plates of patient samples collected over 1 year in a clinical diagnostic laboratory accurately detected all variants identified by sequencing. In conclusion, HRM with targeted sequencing is a reliable, simple, and cost-effective method for SLC26A4 mutation screening and detection. PMID:21704276
Comparative genomic sequence analysis of novel Helicoverpa armigera nucleopolyhedrovirus (NPV) isolated from Kenya and three other previously sequenced Helicoverpa spp. NPVs.

PubMed

Ogembo, Javier Gordon; Caoili, Barbara L; Shikata, Masamitsu; Chaeychomsri, Sudawan; Kobayashi, Michihiro; Ikeda, Motoko

2009-10-01

A newly cloned Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from Kenya, HearNPV-NNg1, has a higher insecticidal activity than HearNPV-G4, which also exhibits lower insecticidal activity than HearNPV-C1. In the search for genes and/or nucleotide sequences that might be involved in the observed virulence differences among Helicoverpa spp. NPVs, the entire genome of NNg1 was sequenced and compared with previously sequenced genomes of G4, C1 and Helicoverpa zea single-nucleocapsid NPV (Hz). The NNg1 genome was 132,425 bp in length, with a total of 143 putative open reading frames (ORFs), and shared high levels of overall amino acid and nucleotide sequence identities with G4, C1 and Hz. Three NNg1 ORFs, ORF5, ORF100 and ORF124, which were shared with C1, were absent in G4 and Hz, while NNg1 and C1 were missing a homologue of G4/Hz ORF5. Another three ORFs, ORF60 (bro-b), ORF119 and ORF120, and one direct repeat sequence (dr) were unique to NNg1. Relative to the overall nucleotide sequence identity, lower sequence identities were observed between NNg1 hrs and the homologous hrs in the other three Helicoverpa spp. NPVs, despite containing the same number of hrs located at essentially the same positions on the genomes. Differences were also observed between NNg1 and each of the other three Helicoverpa spp. NPVs in the diversity of bro genes encoded on the genomes. These results indicate several putative genes and nucleotide sequences that may be responsible for the virulence differences observed among Helicoverpa spp., yet the specific genes and/or nucleotide sequences responsible have not been identified.
Transposon diversity in Arabidopsis thaliana

PubMed Central

Le, Quang Hien; Wright, Stephen; Yu, Zhihui; Bureau, Thomas

2000-01-01

Recent availability of extensive genome sequence information offers new opportunities to analyze genome organization, including transposon diversity and accumulation, at a level of resolution that was previously unattainable. In this report, we used sequence similarity search and analysis protocols to perform a fine-scale analysis of a large sample (≈17.2 Mb) of the Arabidopsis thaliana (Columbia) genome for transposons. Consistent with previous studies, we report that the A. thaliana genome harbors diverse representatives of most known superfamilies of transposons. However, our survey reveals a higher density of transposons of which over one-fourth could be classified into a single novel transposon family designated as Basho, which appears unrelated to any previously known superfamily. We have also identified putative transposase-coding ORFs for miniature inverted-repeat transposable elements (MITEs), providing clues into the mechanism of mobility and origins of the most abundant transposons associated with plant genes. In addition, we provide evidence that most mined transposons have a clear distribution preference for A + T-rich sequences and show that structural variation for many mined transposons is partly due to interelement recombination. Taken together, these findings further underscore the complexity of transposons within the compact genome of A. thaliana. PMID:10861007
PlantTFDB 3.0: a portal for the functional and evolutionary study of plant transcription factors

PubMed Central

Jin, Jinpu; Zhang, He; Kong, Lei; Gao, Ge; Luo, Jingchu

2014-01-01

With the aim to provide a resource for functional and evolutionary study of plant transcription factors (TFs), we updated the plant TF database PlantTFDB to version 3.0 (http://planttfdb.cbi.pku.edu.cn). After refining the TF classification pipeline, we systematically identified 129 288 TFs from 83 species, of which 67 species have genome sequences, covering main lineages of green plants. Besides the abundant annotation provided in the previous version, we generated more annotations for identified TFs, including expression, regulation, interaction, conserved elements, phenotype information, expert-curated descriptions derived from UniProt, TAIR and NCBI GeneRIF, as well as references to provide clues for functional studies of TFs. To help identify evolutionary relationship among identified TFs, we assigned 69 450 TFs into 3924 orthologous groups, and constructed 9217 phylogenetic trees for TFs within the same families or same orthologous groups, respectively. In addition, we set up a TF prediction server in this version for users to identify TFs from their own sequences. PMID:24174544
Phylogeny of Theileria buffeli genotypes identified in the South African buffalo (Syncerus caffer) population.

PubMed

Chaisi, Mamohale E; Collins, Nicola E; Oosthuizen, Marinda C

2014-08-29

Theileria buffeli/orientalis is a group of benign and mildly pathogenic species of cattle and buffalo in various parts of the world. In a previous study, we identified T. buffeli in blood samples originating from the African buffalo (Syncerus caffer) in the Hluhluwe-iMfolozi Game Park (HIP) and the Addo Elephant Game Park (AEGP) in South Africa. The aim of this study was to characterise the 18S rRNA gene and complete internal transcribed spacer (ITS1-5.8S-ITS2) region of T. buffeli samples, and to establish the phylogenetic position of this species based on these loci. The 18S rRNA gene and the complete ITS region were amplified from DNA extracted from blood samples originating from buffalo in HIP and AEGP. The PCR products were cloned and the resulting recombinants sequenced. We identified novel T. buffeli-like 18S rRNA and ITS genotypes from buffalo in the AEGP, and novel Theileria sinensis-like 18S rRNA genotypes from buffalo in the HIP. Phylogenetic analyses indicated that the T. buffeli-like sequences were similar to T. buffeli sequences from cattle and buffalo in China and India, and the T. sinensis-like sequences were similar to T. sinensis 18S rRNA sequences of cattle and yak in China. There was extensive sequence variation between the novel T. buffeli genotypes of the African buffalo and previously described T. buffeli and T. sinensis genotypes. The presence of organisms with T. buffeli-like and T. sinensis-like genotypes in the African buffalo could be of significant importance, particularly to the cattle industry in South Africa as these animals might act as sources of infections to naïve cattle. This is the first report on the characterisation of the full-length 18S rRNA gene and ITS region of T. buffeli and T. sinensis genotypes in South Africa. Our study provides invaluable information towards the classification of this complex group of benign and mildly pathogenic species. Copyright © 2014 Elsevier B.V. All rights reserved.
1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

DOE PAGES

Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.; ...

2017-06-12

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.
Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

DOE Office of Scientific and Technical Information (OSTI.GOV)

von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

2001-01-01

Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.
1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mukherjee, Supratim; Seshadri, Rekha; Varghese, Neha J.

We present 1,003 reference genomes that were sequenced as part of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) initiative, selected to maximize sequence coverage of phylogenetic space. These genomes double the number of existing type strains and expand their overall phylogenetic diversity by 25%. Comparative analyses with previously available finished and draft genomes reveal a 10.5% increase in novel protein families as a function of phylogenetic diversity. The GEBA genomes recruit 25 million previously unassigned metagenomic proteins from 4,650 samples, improving their phylogenetic and functional interpretation. We identify numerous biosynthetic clusters and experimentally validate a divergent phenazine cluster with potential new chemical structure and antimicrobial activity. This Resource is the largest single release of reference genomes to date. Bacterial and archaeal isolate sequence space is still far from saturated, and future endeavors in this direction will continue to be a valuable resource for scientific discovery.
Identification and characterization of suppressors of plant cell death (SPD) effectors from Magnaporthe oryzae.

PubMed

Sharpee, William; Oh, Yeonyee; Yi, Mihwa; Franck, William; Eyre, Alex; Okagaki, Laura H; Valent, Barbara; Dean, Ralph A

2017-08-01

Phytopathogenic microorganisms, including the fungal pathogen Magnaporthe oryzae, secrete a myriad of effector proteins to facilitate infection. Utilizing the transient expression of candidate effectors in the leaves of the model plant Nicotiana benthamiana, we identified 11 suppressors of plant cell death (SPD) effectors from M. oryzae that were able to block the host cell death reaction induced by Nep1. Ten of these 11 were also able to suppress BAX-mediated plant cell death. Five of the 11 SPD genes have been identified previously as either essential for the pathogenicity of M. oryzae, secreted into the plant during disease development, or as suppressors or homologues of other characterized suppressors. In addition, of the remaining six, we showed that SPD8 (previously identified as BAS162) was localized to the rice cytoplasm in invaded and surrounding uninvaded cells during biotrophic invasion. Sequence analysis of the 11 SPD genes across 43 re-sequenced M. oryzae genomes revealed that SPD2, SPD4 and SPD7 have nucleotide polymorphisms amongst the isolates. SPD4 exhibited the highest level of nucleotide diversity of any currently known effector from M. oryzae in addition to the presence/absence polymorphisms, suggesting that this gene is potentially undergoing selection to avoid recognition by the host. Taken together, we have identified a series of effectors, some of which were previously unknown or whose function was unknown, that probably act at different stages of the infection process and contribute to the virulence of M. oryzae. © 2016 BSPP AND JOHN WILEY & SONS LTD.
In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

NASA Astrophysics Data System (ADS)

Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

2016-03-01

Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity.
Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

PubMed

Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

2018-03-01

Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
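The selection step described in this abstract (keep sequences whose BCAT-profile score and DAT-profile score are similar) can be sketched as a simple filter over the two-dimensional score space. This is an illustration under stated assumptions, not the authors' pipeline: the function name, the thresholds, the sequence identifiers, and the score values are all hypothetical, and the scores are assumed to have been computed beforehand by some profile-scoring tool.

```python
def select_similar_score_candidates(scores, max_gap=10.0, min_score=0.0):
    """Pick sequences lying near the diagonal of the (BCAT, DAT) score space.

    scores: dict mapping a sequence identifier to a (bcat_score, dat_score)
    pair. A sequence is kept when both scores reach min_score and the two
    scores differ by at most max_gap. Both thresholds are hypothetical.
    Returns the selected identifiers in sorted order.
    """
    selected = []
    for name, (bcat_score, dat_score) in scores.items():
        scores_are_high_enough = min(bcat_score, dat_score) >= min_score
        scores_are_similar = abs(bcat_score - dat_score) <= max_gap
        if scores_are_high_enough and scores_are_similar:
            selected.append(name)
    return sorted(selected)

# Hypothetical profile scores for four made-up sequences.
example_scores = {
    "seqA": (120.0, 115.0),  # similar scores in both profiles
    "seqB": (200.0, 40.0),   # clearly BCAT-like
    "seqC": (30.0, 190.0),   # clearly DAT-like
    "seqD": (5.0, 4.0),      # similar but too weak to be a family member
}
example_candidates = select_similar_score_candidates(
    example_scores, max_gap=10.0, min_score=50.0
)
```

The minimum-score cutoff is an added assumption: without it, sequences unrelated to either family would also count as having "similar" scores simply because both are near zero.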
In silico assessment of primers for eDNA studies using PrimerTree and application to characterize the biodiversity surrounding the Cuyahoga River

PubMed Central

Cannon, M. V.; Hester, J.; Shalkhauser, A.; Chan, E. R.; Logue, K.; Small, S. T.; Serre, D.

2016-01-01

Analysis of environmental DNA (eDNA) enables the detection of species of interest from water and soil samples, typically using species-specific PCR. Here, we describe a method to characterize the biodiversity of a given environment by amplifying eDNA using primer pairs targeting a wide range of taxa and high-throughput sequencing for species identification. We tested this approach on 91 water samples of 40 mL collected along the Cuyahoga River (Ohio, USA). We amplified eDNA using 12 primer pairs targeting mammals, fish, amphibians, birds, bryophytes, arthropods, copepods, plants and several microorganism taxa and sequenced all PCR products simultaneously by high-throughput sequencing. Overall, we identified DNA sequences from 15 species of fish, 17 species of mammals, 8 species of birds, 15 species of arthropods, one turtle and one salamander. Interestingly, in addition to aquatic and semi-aquatic animals, we identified DNA from terrestrial species that live near the Cuyahoga River. We also identified DNA from one Asian carp species invasive to the Great Lakes but that had not been previously reported in the Cuyahoga River. Our study shows that analysis of eDNA extracted from small water samples using wide-range PCR amplification combined with high-throughput sequencing can provide a broad perspective on biological diversity. PMID:26965911
Mutation Scanning in Wheat by Exon Capture and Next-Generation Sequencing.

PubMed

King, Robert; Bird, Nicholas; Ramirez-Gonzalez, Ricardo; Coghill, Jane A; Patil, Archana; Hassani-Pak, Keywan; Uauy, Cristobal; Phillips, Andrew L

2015-01-01

Targeted Induced Local Lesions in Genomes (TILLING) is a reverse genetics approach to identify novel sequence variation in genomes, with the aims of investigating gene function and/or developing useful alleles for breeding. Despite recent advances in wheat genomics, most current TILLING methods are low to medium in throughput, being based on PCR amplification of the target genes. We performed a pilot-scale evaluation of TILLING in wheat by next-generation sequencing through exon capture. An oligonucleotide-based enrichment array covering ~2 Mbp of wheat coding sequence was used to carry out exon capture and sequencing on three mutagenised lines of wheat containing previously-identified mutations in the TaGA20ox1 homoeologous genes. After testing different mapping algorithms and settings, candidate SNPs were identified by mapping to the IWGSC wheat Chromosome Survey Sequences. Where sequence data for all three homoeologues were found in the reference, mutant calls were unambiguous; however, where the reference lacked one or two of the homoeologues, captured reads from these genes were mis-mapped to other homoeologues, resulting either in dilution of the variant allele frequency or assignment of mutations to the wrong homoeologue. Competitive PCR assays were used to validate the putative SNPs and estimate cut-off levels for SNP filtering. At least 464 high-confidence SNPs were detected across the three mutagenized lines, including the three known alleles in TaGA20ox1, indicating a mutation rate of ~35 SNPs per Mb, similar to that estimated by PCR-based TILLING. This demonstrates the feasibility of using exon capture for genome re-sequencing as a method of mutation detection in polyploid wheat, but accurate mutation calling will require an improved genomic reference with more comprehensive coverage of homoeologues.
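The dilution effect mentioned in this abstract can be made concrete with a small worked example. This is a sketch under simplifying assumptions that are not stated in the source: every homoeologue contributes equal read depth, only one homoeologue carries the mutation, and the function name and parameters are hypothetical.

```python
from fractions import Fraction


def expected_vaf(collapsed_copies, mutant_fraction=Fraction(1)):
    """Expected variant allele frequency when reads from `collapsed_copies`
    homoeologues pile up on a single reference copy and only one of them
    carries the mutation at `mutant_fraction` (1 = homozygous, 1/2 = het).
    Assumes equal read depth from every homoeologue."""
    if collapsed_copies < 1:
        raise ValueError("collapsed_copies must be at least 1")
    return Fraction(mutant_fraction) / collapsed_copies


# All three homoeologues present in the reference: no dilution.
print(expected_vaf(1))                  # homozygous mutation seen at 1
# Reference lacks two homoeologues, so three copies collapse onto one.
print(expected_vaf(3))                  # homozygous mutation seen at 1/3
print(expected_vaf(3, Fraction(1, 2)))  # heterozygous mutation seen at 1/6
```

Under these assumptions a true homozygous mutation can fall below a filter tuned for uncollapsed loci, which is consistent with the abstract's call for a reference with fuller homoeologue coverage.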
Paralogues of nuclear ribosomal genes conceal phylogenetic signals within the invasive Asian fish tapeworm lineage: evidence from next generation sequencing data.

PubMed

Brabec, Jan; Kuchta, Roman; Scholz, Tomáš; Littlewood, D Timothy J

2016-08-01

Complete mitochondrial genomes and nuclear rRNA operons of eight geographically distinct isolates of the Asian fish tapeworm Schyzocotyle acheilognathi (syn. Bothriocephalus acheilognathi), representing the parasite's global diversity spanning four continents, were fully characterised using an Illumina sequencing platform. This cestode species represents an extreme example of a highly invasive, globally distributed pathogen of veterinary importance with exceptionally low host specificity unseen elsewhere within the parasitic flatworms. In addition to eight specimens of S. acheilognathi, we fully characterised its closest known relative and the only congeneric species, Schyzocotyle nayarensis, from cyprinids in the Indian subcontinent. Since previous nucleotide sequence data on the Asian fish tapeworm were restricted to a single molecular locus of questionable phylogenetic utility (the internal transcribed spacers separating the nuclear rRNA genes), the mitogenomic data presented here offer a unique opportunity to gain the first detailed insights into both the intraspecific phylogenetic relationships and population genetic structure of the parasite, providing key baseline information for future research in the field. Additionally, we identify a previously unnoticed source of error and demonstrate the limited utility of the nuclear rRNA sequences, including the internal transcribed spacers that has likely misled most of the previous molecular phylogenetic and population genetic estimates on the Asian fish tapeworm. Copyright © 2016 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.

Characterizing the Grape Transcriptome. Analysis of Expressed Sequence Tags from Multiple Vitis Species and Development of a Compendium of Gene Expression during Berry Development

PubMed Central

Silva, Francisco Goes da; Iandolino, Alberto; Al-Kayal, Fadi; Bohlmann, Marlene C.; Cushman, Mary Ann; Lim, Hyunju; Ergul, Ali; Figueroa, Rubi; Kabuloglu, Elif K.; Osborne, Craig; Rowe, Joan; Tattersall, Elizabeth; Leslie, Anna; Xu, Jane; Baek, JongMin; Cramer, Grant R.; Cushman, John C.; Cook, Douglas R.

2005-01-01

We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. 
A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation to previously characterized aspects of berry development and physiology. Comparison with published results for select genes, as well as correlation analysis between independent data sets, suggests that the inferred in silico patterns of expression are likely to be an accurate representation of transcript abundance for the conditions surveyed. Thus, the combined data set reveals the in silico expression patterns for hundreds of genes in V. vinifera, the majority of which have not been previously studied within this species. PMID:16219919
Deep sequencing and in silico analysis of small RNA library reveals novel miRNA from leaf Persicaria minor transcriptome.

PubMed

Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan

2018-03-01

In the current era, the majority of microRNAs (miRNAs) are discovered through computational approaches, which have been largely confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on previously well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence, which represented 11 unique putative miRNA sequences. The PsRobot target prediction tool was deployed to identify the target transcripts of the putative novel miRNAs. Most of the predicted target transcripts (mRNAs) were known to be involved in plant development and stress responses. Gene ontology showed that the majority of the putative novel miRNA targets were involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Out of 11 unique putative miRNAs, 7 miRNAs were validated through semi-quantitative PCR. These novel miRNA discoveries in P. minor may help develop and update the current public miRNA database.
Sequence analysis of the msp4 gene of Anaplasma ovis strains

USGS Publications Warehouse

de la Fuente, J.; Atkinson, M.W.; Naranjo, V.; Fernandez de Mera, I. G.; Mangold, A.J.; Keating, K.A.; Kocan, K.M.

2007-01-01

Anaplasma ovis (Rickettsiales: Anaplasmataceae) is a tick-borne pathogen of sheep, goats and wild ruminants. The genetic diversity of A. ovis strains has not been well characterized due to the lack of sequence information. In this study, we evaluated bighorn sheep (Ovis canadensis) and mule deer (Odocoileus hemionus) from Montana for infection with A. ovis by serology and sequence analysis of the msp4 gene. Antibodies to Anaplasma spp. were detected in 37% and 39% of bighorn sheep and mule deer analyzed, respectively. Four new msp4 genotypes were identified. The A. ovis msp4 sequences identified herein were analyzed together with sequences reported previously for the characterization of the genetic diversity of A. ovis strains in comparison with other Anaplasma spp. The results of these studies demonstrated that although A. ovis msp4 genotypes may vary among geographic regions and between sheep and deer hosts, the variation observed was less than the variation observed between A. marginale and A. phagocytophilum strains. The results reported herein further confirm that A. ovis infection occurs in natural wild ruminant populations in the western United States and that bighorn sheep and mule deer may serve as wildlife reservoirs of A. ovis. © 2006.
Novel Detection of Coxiella spp., Theileria luwenshuni, and T. ovis Endosymbionts in Deer Keds (Lipoptena fortisetosa).

PubMed

Lee, Seung-Hun; Kim, Kyoo-Tae; Kwon, Oh-Deog; Ock, Younsung; Kim, Taeil; Choi, Donghag; Kwak, Dongmi

2016-01-01

We describe for the first time the detection of Coxiella-like bacteria (CLB), Theileria luwenshuni, and T. ovis endosymbionts in blood-sucking deer keds. Eight deer keds attached to a Korean water deer were identified as Lipoptena fortisetosa (Diptera: Hippoboscidae) by morphological and genetic analyses. Among the endosymbionts assessed, CLB, Theileria luwenshuni, and T. ovis were identified in L. fortisetosa by PCR and nucleotide sequencing. Based on phylogeny, CLB 16S rRNA sequences were classified into clade B, sharing 99.4% identity with CLB from Haemaphysalis longicornis in South Korea. Although the virulence of CLB to vertebrates is still controversial, several studies have reported clinical symptoms in birds due to CLB infections. The 18S rRNA sequences of T. luwenshuni and T. ovis in this study were 98.8-100% identical to those in GenBank, and all of the obtained sequences of T. ovis and T. luwenshuni in this study were 100% identical to each other, respectively. Although further studies are required to positively confirm L. fortisetosa as a biological vector of these pathogens, strong genetic relationships among sequences from this and previous studies suggest potential transmission among mammalian hosts by ticks and keds.
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

PubMed

Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

2016-09-01

Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
Top-Down-Assisted Bottom-Up Method for Homologous Protein Sequencing: Hemoglobin from 33 Bird Species

NASA Astrophysics Data System (ADS)

Song, Yang; Laskay, Ünige A.; Vilcins, Inger-Marie E.; Barbour, Alan G.; Wysocki, Vicki H.

2015-11-01

Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.
Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster

PubMed Central

He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

2012-01-01

Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding sequences that can be mapped to the assembled euchromatic arms, we identified sequences that are specific to heterochromatin and used them to design heterochromatin specific probes (“H-probes”) for microarray. By comparative genomic hybridization (CGH) analyses of embryos deficient for each chromosome or chromosome arm, we were able to map most of our H-probes to specific chromosome arms. We also positioned sequences mapped to the second and X chromosomes to finer intervals by analyzing smaller deletions with breakpoints in heterochromatin. Using this approach, we were able to map >40% (13.9 Mb) of the previously unmapped heterochromatin sequences assembled by the whole-genome sequencing effort on arm U and arm Uextra to specific locations. We also identified and mapped 110 kb of novel heterochromatic sequences. Subsequent analyses revealed that sequences located within different heterochromatic regions have distinct properties, such as sequence composition, degree of repetitiveness, and level of underreplication in polytenized tissues. Surprisingly, although heterochromatin is generally considered to be transcriptionally silent, we detected region-specific temporal patterns of transcription in heterochromatin during oogenesis and early embryonic development. Our study provides a useful approach to elucidate the molecular organization and function of heterochromatin and reveals region-specific variation of heterochromatin. PMID:22745230
Isolation of Lagos Bat Virus from Water Mongoose

PubMed Central

Markotter, Wanda; Kuzmin, Ivan; Rupprecht, Charles E.; Randles, Jenny; Sabeta, Claude T.; Wandeler, Alexander I.

2006-01-01

A genotype 2 lyssavirus, Lagos bat virus (LBV), was isolated from a terrestrial wildlife species (water mongoose) in August 2004 in the Durban area of the KwaZulu-Natal Province of South Africa. The virus isolate was confirmed as LBV by antigenic and genetic characterization, and the mongoose was identified as Atilax paludinosus by mitochondrial cytochrome b sequence analysis. Phylogenetic analysis demonstrated sequence homology with previous LBV isolates from South African bats. Studies performed in mice indicated that the peripheral pathogenicity of LBV had been underestimated in previous studies. Surveillance strategies for LBV in Africa must be improved to better understand the epidemiology of this virus and to make informed decisions on future vaccine strategies because evidence is insufficient that current rabies vaccines provide protection against LBV. PMID:17326944
The nature of the embedded population in the Rho Ophiuchi dark cloud - Mid-infrared observations

NASA Technical Reports Server (NTRS)

Lada, C. J.; Wilking, B. A.

1984-01-01

In combination with previous IR and optical data, the present 10-20 micron observations of previously identified members of the embedded population of the Rho Ophiuchi dark cloud allow determinations to be made of the broadband energy distributions for 32 of the 44 sources. The majority of the sources are found to emit the bulk of their luminosity in the 1-20 micron range, and to be surrounded by dust shells. Because they are, in light of these characteristics, probably pre-main-sequence in nature, relatively accurate bolometric luminosities for these objects can be obtained through integration of their energy distributions. It is found that 44 percent of the sources are less luminous than the sun, and are among the lowest luminosity pre-main-sequence/protostellar objects observed to date.
IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

PubMed

Hepler, N Lance; Scheffler, Konrad; Weaver, Steven; Murrell, Ben; Richman, Douglas D; Burton, Dennis R; Poignard, Pascal; Smith, Davey M; Kosakovsky Pond, Sergei L

2014-09-01

Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals--a cure and a vaccine--remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleotide positions among circulating strains. Because of this, the genetic bases of many viral phenotypes, most notably the susceptibility to neutralization by a particular antibody, are difficult to identify computationally. Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package IDEPI (IDentify EPItopes) for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotypes, and also identify specific sequence features which contribute to a particular phenotype. We demonstrate that IDEPI achieves performance similar to or better than that of previously published approaches on four well-studied problems: finding the epitopes of broadly neutralizing antibodies (bNab), determining coreceptor tropism of the virus, identifying compartment-specific genetic signatures of the virus, and deducing drug-resistance associated mutations. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi.
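As a rough illustration of what "identifying specific sequence features which contribute to a particular phenotype" can mean, the sketch below ranks position/residue features of aligned sequences by how differently they occur in two phenotype classes. It is a plain frequency contrast written for this example, not IDEPI's machine learning pipeline or API; the function names and the toy sequences are hypothetical.

```python
from collections import Counter


def site_features(aligned_seq):
    """Encode an aligned sequence as a set of position+residue labels, e.g. '1N'."""
    return {f"{pos + 1}{res}" for pos, res in enumerate(aligned_seq)}


def rank_features(sequences, labels):
    """Rank features by the difference between their frequency in the
    positive class and in the negative class (largest absolute difference first)."""
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both phenotype classes must be represented")
    pos_counts, neg_counts = Counter(), Counter()
    for seq, y in zip(sequences, labels):
        (pos_counts if y else neg_counts).update(site_features(seq))
    scored = [
        (pos_counts[f] / n_pos - neg_counts[f] / n_neg, f)
        for f in set(pos_counts) | set(neg_counts)
    ]
    # Sort by descending absolute contrast, breaking ties by feature name.
    return sorted(scored, key=lambda t: (-abs(t[0]), t[1]))


# Toy aligned sequences with a made-up binary phenotype (True = resistant).
seqs = ["NKT", "NKS", "DKT", "DKS"]
phenotype = [True, True, False, False]
for score, feature in rank_features(seqs, phenotype):
    print(f"{feature}\t{score:+.2f}")
```

In the toy data only position 1 separates the classes, so its two residues receive contrasts of +1 and -1 while every other feature scores 0. A real genotype-to-phenotype model would also need cross-validation and some handling of correlated sites.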
RNAi screen for rapid therapeutic target identification in leukemia patients

PubMed Central

Tyner, Jeffrey W.; Deininger, Michael W.; Loriaux, Marc M.; Chang, Bill H.; Gotlib, Jason R.; Willis, Stephanie G.; Erickson, Heidi; Kovacsovics, Tibor; O'Hare, Thomas; Heinrich, Michael C.; Druker, Brian J.

2009-01-01

Targeted therapy has vastly improved outcomes in certain types of cancer. Extension of this paradigm across a broad spectrum of malignancies will require an efficient method to determine the molecular vulnerabilities of cancerous cells. Improvements in sequencing technology will soon enable high-throughput sequencing of entire genomes of cancer patients; however, determining the relevance of identified sequence variants will require complementary functional analyses. Here, we report an RNAi-assisted protein target identification (RAPID) technology that individually assesses targeting of each member of the tyrosine kinase gene family. We demonstrate that RAPID screening of primary leukemia cells from 30 patients identifies targets that are critical to survival of the malignant cells from 10 of these individuals. We identify known, activating mutations in JAK2 and K-RAS, as well as patient-specific sensitivity to down-regulation of FLT1, CSF1R, PDGFR, ROR1, EPHA4/5, JAK1/3, LMTK3, LYN, FYN, PTK2B, and N-RAS. We also describe a previously undescribed, somatic, activating mutation in the thrombopoietin receptor that is sensitive to down-stream pharmacologic inhibition. Hence, the RAPID technique can quickly identify molecular vulnerabilities in malignant cells. Combination of this technique with whole-genome sequencing will represent an ideal tool for oncogenic target identification such that specific therapies can be matched with individual patients. PMID:19433805
Targeted sequencing-based analyses of candidate gene variants in ulcerative colitis-associated colorectal neoplasia.

PubMed

Chakrabarty, Sanjiban; Varghese, Vinay Koshy; Sahu, Pranoy; Jayaram, Pradyumna; Shivakumar, Bhadravathi M; Pai, Cannanore Ganesh; Satyamoorthy, Kapaettu

2017-06-27

Long-standing ulcerative colitis (UC) leading to colorectal cancer (CRC) is one of the most serious and life-threatening consequences acknowledged globally. Ulcerative colitis-associated colorectal carcinogenesis showed distinct molecular alterations when compared with sporadic colorectal carcinoma. Targeted sequencing of 409 genes in tissue samples of 18 long-standing UC subjects at high risk of colorectal carcinoma (UCHR) was performed to identify somatic driver mutations, which may be involved in the molecular changes during the transformation of non-dysplastic mucosa to high-grade dysplasia. Findings from the study are also compared with previously published genome wide and exome sequencing data in inflammatory bowel disease-associated and sporadic colorectal carcinoma. Next-generation sequencing analysis identified 1107 mutations in 275 genes in UCHR subjects. In addition to TP53 (17%) and KRAS (22%) mutations, recurrent mutations in APC (33%), ACVR2A (61%), ARID1A (44%), RAF1 (39%) and MTOR (61%) were observed in UCHR subjects. In addition, APC, FGFR3, FGFR2 and PIK3CA driver mutations were identified in UCHR subjects. Recurrent mutations in ARID1A (44%), SMARCA4 (17%), MLL2 (44%), MLL3 (67%), SETD2 (17%) and TET2 (50%) genes involved in histone modification and chromatin remodelling were identified in UCHR subjects. Our study identifies new oncogenic driver mutations which may be involved in the transition of non-dysplastic cells to dysplastic phenotype in the subjects with long-standing UC with high risk of progression into colorectal neoplasia.
Investigating the viral ecology of global bee communities with high-throughput metagenomics.

PubMed

Galbraith, David A; Fuller, Zachary L; Ray, Allyson M; Brockmann, Axel; Frazier, Maryann; Gikungu, Mary W; Martinez, J Francisco Iturralde; Kapheim, Karen M; Kerby, Jeffrey T; Kocher, Sarah D; Losyev, Oleksiy; Muli, Elliud; Patch, Harland M; Rosa, Cristina; Sakamoto, Joyce M; Stanley, Scott; Vaudo, Anthony D; Grozinger, Christina M

2018-06-11

Bee viral ecology is a fascinating emerging area of research: viruses exert a range of effects on their hosts, exacerbate impacts of other environmental stressors, and, importantly, are readily shared across multiple bee species in a community. However, our understanding of bee viral communities is limited, as it is primarily derived from studies of North American and European Apis mellifera populations. Here, we examined viruses in populations of A. mellifera and 11 other bee species from 9 countries, across 4 continents and Oceania. We developed a novel pipeline to rapidly and inexpensively screen for bee viruses. This pipeline includes purification of encapsulated RNA/DNA viruses, sequence-independent amplification, high throughput sequencing, integrated assembly of contigs, and filtering to identify contigs specifically corresponding to viral sequences. We identified sequences for (+)ssRNA, (-)ssRNA, dsRNA, and ssDNA viruses. Overall, we found 127 contigs corresponding to novel viruses (i.e. previously not observed in bees), with 27 represented by >0.1% of the reads in a given sample, and 7 contained an RdRp or replicase sequence which could be used for robust phylogenetic analysis. This study provides a sequence-independent pipeline for viral metagenomics analysis, and greatly expands our understanding of the diversity of viruses found in bee communities.
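The abundance criterion quoted in this abstract (contigs represented by more than 0.1% of the reads in a given sample) amounts to a simple per-sample filter. The sketch below assumes read counts per contig are already available; the function name and the counts are hypothetical and are not taken from the study's pipeline.

```python
def abundant_contigs(read_counts, total_reads, min_fraction=0.001):
    """Return the ids of contigs whose share of a sample's reads exceeds
    min_fraction (0.001 corresponds to the 0.1% figure in the abstract)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return sorted(
        contig for contig, n in read_counts.items() if n / total_reads > min_fraction
    )


# Made-up read counts for one sample with 1,000,000 reads in total.
counts = {"contig_1": 5200, "contig_2": 40, "contig_3": 1001}
print(abundant_contigs(counts, total_reads=1_000_000))
```

Here contig_2 is excluded because 40 of 1,000,000 reads is 0.004%, well under the threshold.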
IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

PubMed

Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C

2017-01-04

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
XPAT: a toolkit to conduct cross-platform association studies with heterogeneous sequencing datasets.

PubMed

Yu, Yao; Hu, Hao; Bohlender, Ryan J; Hu, Fulan; Chen, Jiun-Sheng; Holt, Carson; Fowler, Jerry; Guthery, Stephen L; Scheet, Paul; Hildebrandt, Michelle A T; Yandell, Mark; Huff, Chad D

2018-04-06

High-throughput sequencing data are increasingly being made available to the research community for secondary analyses, providing new opportunities for large-scale association studies. However, heterogeneity in target capture and sequencing technologies often introduces strong technological stratification biases that overwhelm subtle signals of association in studies of complex traits. Here, we introduce the Cross-Platform Association Toolkit, XPAT, which provides a suite of tools designed to support and conduct large-scale association studies with heterogeneous sequencing datasets. XPAT includes tools to support cross-platform aware variant calling, quality control filtering, gene-based association testing and rare variant effect size estimation. To evaluate the performance of XPAT, we conducted case-control association studies for three diseases, including 783 breast cancer cases, 272 ovarian cancer cases, 205 Crohn disease cases and 3507 shared controls (including 1722 females) using sequencing data from multiple sources. XPAT greatly reduced Type I error inflation in the case-control analyses, while replicating many previously identified disease-gene associations. We also show that association tests conducted with XPAT using cross-platform data have comparable performance to tests using matched platform data. XPAT enables new association studies that combine existing sequencing datasets to identify genetic loci associated with common diseases and other complex traits.
Comprehensive Survey of Genetic Diversity in Chloroplast Genomes and 45S nrDNAs within Panax ginseng Species

PubMed Central

Kim, Kyunghee; Lee, Sang-Choon; Lee, Junki; Lee, Hyun Oh; Joh, Ho Jun; Kim, Nam-Hoon; Park, Hyun-Seung; Yang, Tae-Jin

2015-01-01

We report complete sequences of the chloroplast (cp) genome and 45S nuclear ribosomal DNA (45S nrDNA) for 11 Panax ginseng cultivars. We have obtained complete sequences of cp and 45S nrDNA, the representative barcoding target sequences for cytoplasm and nuclear genome, respectively, based on low coverage NGS sequence of each cultivar. The cp genome sizes ranged from 156,241 to 156,425 bp and the major size variation was derived from differences in copy number of tandem repeats in the ycf1 gene and in the intergenic regions of rps16-trnUUG and rpl32-trnUAG. The complete 45S nrDNA unit sequences were 11,091 bp, representing a consensus single transcriptional unit with an intergenic spacer region. Comparative analysis of these sequences as well as those previously reported for three Chinese accessions identified very rare but unique polymorphisms in the cp genome within P. ginseng cultivars. There were 12 intra-species polymorphisms (six SNPs and six InDels) among 14 cultivars. We also identified five SNPs from 45S nrDNA of 11 Korean ginseng cultivars. From the 17 unique informative polymorphic sites, we developed six reliable markers for analysis of ginseng diversity and cultivar authentication. PMID:26061692
A sequence-based survey of the complex structural organization of tumor genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav

2008-04-03

The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.
Thermodynamics-based models of transcriptional regulation with gene sequence.

PubMed

Wang, Shuqiang; Shen, Yanyan; Hu, Jinxing

2015-12-01

Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous-time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity on the basis of statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model yields more biologically meaningful results.
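As a rough illustration of the kind of model this abstract describes, the sketch below couples a Boltzmann-weighted promoter occupancy to a first-order differential equation for mRNA level. It is a generic textbook form, not the authors' model: the single binding site, the parameter names (`alpha`, `beta`, `binding_energy_kt`), and all numeric values are assumptions made for illustration.

```python
import math

def occupancy(tf_conc, binding_energy_kt):
    """Probability that one site is bound, from its Boltzmann weight.

    binding_energy_kt is the binding energy in units of kT (hypothetical value);
    more negative means tighter binding.
    """
    weight = tf_conc * math.exp(-binding_energy_kt)
    return weight / (1.0 + weight)

def simulate_mrna(tf_conc, binding_energy_kt, alpha=10.0, beta=0.5,
                  dt=0.01, steps=5000):
    """Euler integration of dm/dt = alpha * occupancy - beta * m, starting at m = 0."""
    p = occupancy(tf_conc, binding_energy_kt)
    m = 0.0
    for _ in range(steps):
        m += dt * (alpha * p - beta * m)
    return m

if __name__ == "__main__":
    p = occupancy(1.0, -2.0)
    print("occupancy:", p)
    print("mRNA level after integration:", simulate_mrna(1.0, -2.0))
    print("analytic steady state alpha*p/beta:", 10.0 * p / 0.5)
```

With these assumed parameters the integration runs long enough for the simulated level to settle near the analytic steady state `alpha * p / beta`, which is a convenient sanity check on the integrator.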
Three copies of a single protein II-encoding sequence in the genome of Neisseria gonorrhoeae JS3: evidence for gene conversion and gene duplication.

PubMed

van der Ley, P

1988-11-01

Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
A Snapshot Avian Surveillance Reveals West Nile Virus and Evidence of Wild Birds Participating in Toscana Virus Circulation.

PubMed

Hacioglu, Sabri; Dincer, Ender; Isler, Cafer Tayer; Karapinar, Zeynep; Ataseven, Veysel Soydal; Ozkul, Aykut; Ergunay, Koray

2017-10-01

Birds are involved in the epidemiology of several vector-borne viruses, as amplification hosts for viruses, dissemination vehicles for the vectors, and sources of emerging strains in cross-species transmission. Turkey provides diverse habitats for a variety of wild birds and is located along major bird migration routes. This study was undertaken to provide a cross-sectional screening of avian specimens for a spectrum of vector-borne viruses. The specimens were collected in Hatay province, on the Mediterranean coast of the Anatolian peninsula, located in the convergence zone of the known migration routes. Generic PCR assays were used for the detection of members of the Nairovirus, Flavivirus, and Phlebovirus genera of the Flaviviridae and Bunyaviridae families. The circulating viruses were characterized via sequencing and selected specimens were inoculated onto Vero cell lines for virus isolation. Specimens from 72 wild birds belonging to 8 orders and 14 species were collected. A total of 158 specimens that comprise 32 sera (20.3%) from 7 species and 126 tissues (79.7%) from 14 species were screened. Eight specimens (8/158, 5%), obtained from 4 individuals (4/72, 5.5%), were positive. West Nile virus (WNV) lineage 1 sequences were characterized in the spleen, heart, and kidney tissues from a lesser spotted eagle (Clanga pomarina), which clustered distinctly from sequences previously identified in Turkey. Toscana virus (TOSV) genotype A and B sequences were identified in brain and kidney tissues from a greater flamingo (Phoenicopterus roseus), a great white pelican (Pelecanus onocrotalus), and a black stork (Ciconia nigra), without successful virus isolation. Partial amino acid sequences of the viral nucleocapsid protein revealed previously unreported substitutions. This study documents the involvement of avians in WNV dispersion in Anatolia as well as in the TOSV life cycle.

Identification of a novel LMF1 nonsense mutation responsible for severe hypertriglyceridemia by targeted next-generation sequencing.

PubMed

Cefalù, Angelo B; Spina, Rossella; Noto, Davide; Ingrassia, Valeria; Valenti, Vincenza; Giammanco, Antonina; Fayer, Francesca; Misiano, Gabriella; Cocorullo, Gianfranco; Scrimali, Chiara; Palesano, Ornella; Altieri, Grazia I; Ganci, Antonina; Barbagallo, Carlo M; Averna, Maurizio R

Severe hypertriglyceridemia (HTG) may result from mutations in genes affecting the intravascular lipolysis of triglyceride (TG)-rich lipoproteins. The aim of this study was to develop a targeted next-generation sequencing panel for the molecular diagnosis of disorders characterized by severe HTG. We developed a targeted customized panel for next-generation sequencing on the Ion Torrent Personal Genome Machine to capture the coding exons and intron/exon boundaries of 18 genes affecting the main pathways of TG synthesis and metabolism. We sequenced 11 samples of patients with severe HTG (TG>885 mg/dL-10 mmol/L): 4 positive controls in whom pathogenic mutations had previously been identified by Sanger sequencing and 7 patients in whom the molecular defect was still unknown. The customized panel was accurate, and it confirmed the genetic variants previously identified in all positive controls with primary severe HTG. Only 1 of the 7 patients with HTG was found to be a carrier of a homozygous pathogenic mutation, the third novel mutation of the LMF1 gene (c.1380C>G-p.Y460X). The clinical and molecular familial cascade screening allowed the identification of 2 additional affected siblings and 7 heterozygous carriers of the mutation. We showed that our targeted resequencing approach for genetic diagnosis of severe HTG appears to be accurate, less time consuming, and more economical compared with traditional Sanger resequencing. The identification of pathogenic mutations in candidate genes remains challenging and clinical resequencing should be mainly intended for patients with strong clinical criteria for monogenic severe HTG. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.
Sequence analysis of the lactococcal plasmid pNP40: a mobile replicon for coping with environmental hazards.

PubMed

O'Driscoll, Jonathan; Glynn, Frances; Fitzgerald, Gerald F; van Sinderen, Douwe

2006-09-01

The conjugative lactococcal plasmid pNP40, identified in Lactococcus lactis subsp. diacetylactis DRC3, possesses a potent complement of bacteriophage resistance systems, which has stimulated its application as a fitness-improving, food-grade genetic element for industrial starter cultures. The complete sequence of this plasmid allowed the mapping of previously known functions including replication, conjugation, bacteriocin resistance, heavy metal tolerance, and bacteriophage resistance. In addition, functions for cold shock adaptation and DNA damage repair were identified, further confirming pNP40's contribution to environmental stress protection. A plasmid cointegration event appears to have been part of the evolution of pNP40, resulting in a "stockpiling" of bacteriophage resistance systems.
Characterization of Samples Identified as Hepatitis C Virus Genotype 1 without Subtype by Abbott RealTime HCV Genotype II Assay Using the New Abbott HCV Genotype Plus RUO Test.

PubMed

Mokhtari, Camelia; Ebel, Anne; Reinhardt, Birgit; Merlin, Sandra; Proust, Stéphanie; Roque-Afonso, Anne-Marie

2016-02-01

Hepatitis C virus (HCV) genotyping continues to be relevant for therapeutic strategies. Some samples are reported as genotype 1 (gt 1) without subtype by the Abbott RealTime HCV Genotype II (GT II) test. To characterize such samples further, the Abbott HCV Genotype Plus RUO (Plus) assay, which targets the core region for gt 1a, gt 1b, and gt 6 detection, was evaluated as a reflex test in reference to NS5B or 5'-untranslated region (UTR)/core region sequencing. Of 3,626 routine samples, results of gt 1 without subtype were received for 171 samples (4.7%), accounting for 11.5% of gt 1 specimens. The Plus assay and sequencing were applied to 98 of those samples. NS5B or 5'-UTR/core region sequencing was successful for 91/98 specimens (92.9%). Plus assay and sequencing results were concordant for 87.9% of specimens (80/91 samples). Sequencing confirmed Plus assay results for 82.6%, 85.7%, 100%, and 89.3% of gt 1a, gt 1b, gt 6, and non-gt 1a/1b/6 results, respectively. Notably, 12 gt 6 samples that had been identified previously as gt 1 without subtype were assigned correctly here; for 25/28 samples reported as "not detected" by the Plus assay, sequencing identified the samples as gt 1 with subtypes other than 1a/1b. The genetic variability of HCV continues to present challenges for the current genotyping platforms regardless of the applied methodology. Samples identified by the GT II assay as gt 1 without subtype can be further resolved and reliably characterized by the new Plus assay. Copyright © 2016 Mokhtari et al.
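The percentages quoted in this abstract can be re-derived from the sample counts it reports. The short sketch below does only that arithmetic; the helper name `pct` is introduced here for illustration and is not part of the study.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)

# Counts taken from the abstract above.
no_subtype_rate = pct(171, 3626)    # gt 1 without subtype among routine samples
sequencing_success = pct(91, 98)    # specimens successfully sequenced
concordance = pct(80, 91)           # Plus assay vs. sequencing agreement

if __name__ == "__main__":
    print(no_subtype_rate, sequencing_success, concordance)
```

Each value agrees with the figure stated in the abstract (4.7%, 92.9%, and 87.9%).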
Phylogenetic analysis of a spontaneous cocoa bean fermentation metagenome reveals new insights into its bacterial and fungal community diversity.

PubMed

Illeghems, Koen; De Vuyst, Luc; Papalexandratou, Zoi; Weckx, Stefan

2012-01-01

This is the first report on the phylogenetic analysis of the community diversity of a single spontaneous cocoa bean box fermentation sample through a metagenomic approach involving 454 pyrosequencing. Several sequence-based and composition-based taxonomic profiling tools were used and evaluated to avoid software-dependent results and their outcome was validated by comparison with previously obtained culture-dependent and culture-independent data. Overall, this approach revealed a wider bacterial (mainly γ-Proteobacteria) and fungal diversity than previously found. Further, the use of a combination of different classification methods, in a software-independent way, helped to understand the actual composition of the microbial ecosystem under study. In addition, bacteriophage-related sequences were found. The bacterial diversity depended partially on the methods used, as composition-based methods predicted a wider diversity than sequence-based methods, and as classification methods based solely on phylogenetic marker genes predicted a more restricted diversity compared with methods that took all reads into account. The metagenomic sequencing analysis identified Hanseniaspora uvarum, Hanseniaspora opuntiae, Saccharomyces cerevisiae, Lactobacillus fermentum, and Acetobacter pasteurianus as the prevailing species. Also, the presence of occasional members of the cocoa bean fermentation process was revealed (such as Erwinia tasmaniensis, Lactobacillus brevis, Lactobacillus casei, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc mesenteroides, and Oenococcus oeni). Furthermore, the sequence reads associated with viral communities were of a restricted diversity, dominated by Myoviridae and Siphoviridae, and reflecting Lactobacillus as the dominant host. To conclude, an accurate overview of all members of a cocoa bean fermentation process sample was revealed, indicating the superiority of metagenomic sequencing over previously used techniques.
Identification of three homologous latex-clearing protein (lcp) genes from the genome of Streptomyces sp. strain CFMR 7.

PubMed

Nanthini, Jayaram; Ong, Su Yean; Sudesh, Kumar

2017-09-10

Rubber materials have greatly contributed to human civilization. However, because rubber is a polymeric material that does not decompose easily, it has caused huge environmental problems. On the other hand, only a few bacteria are known to degrade rubber, and studies of them have focused intensively on the mechanism involved in microbial rubber degradation. The Streptomyces sp. strain CFMR 7, which was previously confirmed to possess rubber-degrading ability, was subjected to whole genome sequencing using the single molecule sequencing technology of the PacBio® RS II system. The genome was further analyzed and compared with previously reported rubber-degrading bacteria in order to identify the potential genes involved in rubber degradation. This led to the interesting discovery of three homologues of latex-clearing protein (Lcp) on the chromosome of this strain, which are probably responsible for rubber-degrading activities. Genes encoding oxidoreductase α-subunit (oxiA) and oxidoreductase β-subunit (oxiB) were also found downstream of two lcp genes which are located adjacent to each other. In silico analysis reveals genes that have been identified to be involved in the microbial degradation of rubber in the Streptomyces sp. strain CFMR 7. This is the first whole genome sequence of a clear-zone-forming natural rubber-degrading Streptomyces sp., which harbours three Lcp homologous genes with the presence of oxiA and oxiB genes compared to the previously reported Gordonia polyisoprenivorans strain VH2 (with two Lcp homologous genes) and Nocardia nova SH22a (with only one Lcp gene). Copyright © 2017 Elsevier B.V. All rights reserved.
Filling Gaps in Biodiversity Knowledge for Macrofungi: Contributions and Assessment of an Herbarium Collection DNA Barcode Sequencing Project

PubMed Central

Osmundson, Todd W.; Robert, Vincent A.; Schoch, Conrad L.; Baker, Lydia J.; Smith, Amy; Robich, Giovanni; Mizzan, Luca; Garbelotto, Matteo M.

2013-01-01

Despite recent advances spearheaded by molecular approaches and novel technologies, species description and DNA sequence information are significantly lagging for fungi compared to many other groups of organisms. Large scale sequencing of vouchered herbarium material can aid in closing this gap. Here, we describe an effort to obtain broad ITS sequence coverage of the approximately 6000 macrofungal-species-rich herbarium of the Museum of Natural History in Venice, Italy. Our goals were to investigate issues related to large sequencing projects, develop heuristic methods for assessing the overall performance of such a project, and evaluate the prospects of such efforts to reduce the current gap in fungal biodiversity knowledge. The effort generated 1107 sequences submitted to GenBank, including 416 previously unrepresented taxa and 398 sequences exhibiting a best BLAST match to an unidentified environmental sequence. Specimen age and taxon affected sequencing success, and subsequent work on failed specimens showed that an ITS1 mini-barcode greatly increased sequencing success without greatly reducing the discriminating power of the barcode. Similarity comparisons and nonmetric multidimensional scaling ordinations based on pairwise distance matrices proved to be useful heuristic tools for validating the overall accuracy of specimen identifications, flagging potential misidentifications, and identifying taxa in need of additional species-level revision. Comparison of within- and among-species nucleotide variation showed a strong increase in species discriminating power at 1–2% dissimilarity, and identified potential barcoding issues (same sequence for different species and vice-versa). All sequences are linked to a vouchered specimen, and results from this study have already prompted revisions of species-sequence assignments in several taxa. PMID:23638077
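The pairwise-distance comparisons mentioned in this abstract can be illustrated with a simple uncorrected p-distance (the proportion of differing sites) between aligned sequences. This is a minimal sketch only: the abstract does not say which distance measure was used, and the toy sequences and specimen names below are hypothetical.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)

def pairwise_distances(named_seqs):
    """Return {(name_i, name_j): distance} for every unordered pair."""
    return {(i, j): p_distance(named_seqs[i], named_seqs[j])
            for i, j in combinations(sorted(named_seqs), 2)}

if __name__ == "__main__":
    # Hypothetical aligned fragments, not real ITS data.
    toy = {
        "specimen_A": "ACGTACGTAC",
        "specimen_B": "ACGTACGTAT",
        "specimen_C": "ACGT-CGTAC",
    }
    for pair, dist in pairwise_distances(toy).items():
        print(pair, dist)
```

A matrix built this way could then be passed to an ordination method or compared against a dissimilarity threshold such as the 1–2% range discussed in the abstract.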
Neotropical Bats from Costa Rica harbour Diverse Coronaviruses.

PubMed

Moreira-Soto, A; Taylor-Castillo, L; Vargas-Vargas, N; Rodríguez-Herrera, B; Jiménez, C; Corrales-Aguilar, E

2015-11-01

Bats are hosts of diverse coronaviruses (CoVs) known to potentially cross the host-species barrier. For analysing coronavirus diversity in a bat species-rich country, a total of 421 anal swabs/faecal samples from Costa Rican bats were screened for CoV RNA-dependent RNA polymerase (RdRp) gene sequences by a pancoronavirus PCR. Six families, 24 genera and 41 species of bats were analysed. The detection rate for CoV was 1%. Individuals (n = 4) from four different species of frugivorous (Artibeus jamaicensis, Carollia perspicillata and Carollia castanea) and nectivorous (Glossophaga soricina) bats were positive for coronavirus-derived nucleic acids. Analysis of 440 nt RdRp sequences allocated all Costa Rican bat CoVs to the α-CoV group. Several CoV sequences clustered near previously described CoVs from the same species of bat, but were phylogenetically distant from the human CoV sequences identified to date, suggesting no recent spillover events. The Glossophaga soricina CoV sequence is sufficiently dissimilar (26% homology to the closest known bat CoVs) to represent a unique coronavirus not clustering near other CoVs found in the same bat species so far, implying an even higher CoV diversity than previously suspected. © 2015 Blackwell Verlag GmbH.
Purification and sequence of rat oxyntomodulin.

PubMed Central

Collie, N L; Walsh, J H; Wong, H C; Shively, J E; Davis, M T; Lee, T D; Reeve, J R

1994-01-01

Structural information about rat enteroglucagon, intestinal peptides containing the pancreatic glucagon sequence, has been based previously on cDNA, immunologic, and chromatographic data. Our interests in testing the physiological actions of synthetic enteroglucagon peptides in rats required that we identify precisely the forms present in vivo. From knowledge of the proglucagon gene sequence, we synthesized an enteroglucagon C-terminal octapeptide common to both proposed enteroglucagon forms, glicentin and oxyntomodulin, but sharing no sequence overlap with glucagon. We then developed a radioimmunoassay using antibodies raised against the octapeptide that was specific for enteroglucagon peptides without cross-reacting with glucagon. Rat intestine was extracted, and one presumptive enteroglucagon form was purified by following the enteroglucagon C-terminal octapeptide-like immunoreactivity through several HPLC purification steps. Structural characterization of the material by amino acid composition, microsequence, and mass spectral analyses identified the peptide as rat oxyntomodulin. The 37-residue peptide consists of pancreatic glucagon plus the C-terminal extension, Lys-Arg-Asn-Arg-Asn-Asn-Ile-Ala. This now permits synthesis of an unambiguous duplicate of endogenous rat oxyntomodulin for physiological studies. PMID:7937770
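The composition described in this abstract (glucagon plus an eight-residue C-terminal extension, 37 residues in total) can be checked with a few lines of string handling. The extension is taken from the abstract; the 29-residue glucagon sequence is the commonly cited mammalian one and is supplied here as an assumption, since the abstract does not list it.

```python
# Assumed: the widely reported 29-residue mammalian glucagon sequence.
GLUCAGON = "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT"

# Three-letter to one-letter codes for the residues in the extension.
THREE_TO_ONE = {"Lys": "K", "Arg": "R", "Asn": "N", "Ile": "I", "Ala": "A"}

# C-terminal extension exactly as written in the abstract.
EXTENSION_3LETTER = "Lys-Arg-Asn-Arg-Asn-Asn-Ile-Ala"
extension = "".join(THREE_TO_ONE[res] for res in EXTENSION_3LETTER.split("-"))

oxyntomodulin = GLUCAGON + extension

if __name__ == "__main__":
    print(extension, len(extension))
    print(oxyntomodulin, len(oxyntomodulin))
```

The lengths add up as stated: 29 residues of glucagon plus the octapeptide give a 37-residue peptide.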
Prediction of a common beta-propeller catalytic domain for fructosyltransferases of different origin and substrate specificity.

PubMed

Pons, T; Hernández, L; Batista, F R; Chinea, G

2000-11-01

The three-dimensional (3D) structure of fructan biosynthetic enzymes is still unknown. Here, we have explored folding similarities between reported microbial and plant enzymes that catalyze transfructosylation reactions. A sequence-structure compatibility search using TOPITS, SDP, 3D-PSSM, and SAM-T98 programs identified a beta-propeller fold with scores above the confidence threshold that indicate a structurally conserved catalytic domain in fructosyltransferases (FTFs) of diverse origin and substrate specificity. The predicted fold appeared related to that of neuraminidase and sialidase, of glycoside hydrolase families 33 and 34, respectively. The most reliable structural model was obtained using the crystal structure of neuraminidase (Protein Data Bank file: 5nn9) as template, and it is consistent with the location of previously identified functional residues of bacterial levansucrases (Batista et al., 1999; Song & Jacques, 1999). The sequence-sequence analysis presented here reinforces the recent inclusion of fungal and plant FTFs into glycoside hydrolase family 32, and suggests a modified sequence pattern H-x(2)-[PTV]-x(4)-[LIVMA]-[NSCAYG]-[DE]-P-[NDSC][GA]3 for this family.
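The sequence pattern quoted at the end of this abstract follows PROSITE-style notation, which maps directly onto a regular expression. The sketch below converts such a pattern and scans a sequence with it. Two caveats are assumptions of this example: the trailing `[GA]3` element is omitted because its repeat notation is ambiguous as printed, and the test sequence is made up rather than a real fructosyltransferase.

```python
import re

# One pattern element: x, a single residue, [allowed set] or {forbidden set},
# optionally followed by a repeat count (n) or range (n,m).
_ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?")

def prosite_to_regex(pattern):
    """Translate a hyphen-separated PROSITE-style pattern into a regex string."""
    parts = []
    for element in pattern.split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":
            regex = "."                       # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"   # any residue except these
        else:
            regex = core                      # literal residue or [set]
        if low:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(regex)
    return "".join(parts)

# Pattern from the abstract, without the ambiguous trailing [GA]3 element.
FTF_PATTERN = "H-x(2)-[PTV]-x(4)-[LIVMA]-[NSCAYG]-[DE]-P-[NDSC]"

if __name__ == "__main__":
    regex = prosite_to_regex(FTF_PATTERN)
    print(regex)
    hypothetical_seq = "MKHAAPWWWWLNDPNGG"  # invented sequence for illustration
    hit = re.search(regex, hypothetical_seq)
    print(hit.span() if hit else "no match")
```

Keeping the conversion separate from the search makes it easy to reuse the same function for other family signatures written in the same notation.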
Prediction of a common beta-propeller catalytic domain for fructosyltransferases of different origin and substrate specificity.

PubMed Central

Pons, T.; Hernández, L.; Batista, F. R.; Chinea, G.

2000-01-01

The three-dimensional (3D) structure of fructan biosynthetic enzymes is still unknown. Here, we have explored folding similarities between reported microbial and plant enzymes that catalyze transfructosylation reactions. A sequence-structure compatibility search using TOPITS, SDP, 3D-PSSM, and SAM-T98 programs identified a beta-propeller fold with scores above the confidence threshold that indicate a structurally conserved catalytic domain in fructosyltransferases (FTFs) of diverse origin and substrate specificity. The predicted fold appeared related to that of neuraminidase and sialidase, of glycoside hydrolase families 33 and 34, respectively. The most reliable structural model was obtained using the crystal structure of neuraminidase (Protein Data Bank file: 5nn9) as template, and it is consistent with the location of previously identified functional residues of bacterial levansucrases (Batista et al., 1999; Song & Jacques, 1999). The sequence-sequence analysis presented here reinforces the recent inclusion of fungal and plant FTFs into glycoside hydrolase family 32, and suggests a modified sequence pattern H-x(2)-[PTV]-x(4)-[LIVMA]-[NSCAYG]-[DE]-P-[NDSC][GA]3 for this family. PMID:11305239
Genotyping-By-Sequencing (GBS) Detects Genetic Structure and Confirms Behavioral QTL in Tame and Aggressive Foxes (Vulpes vulpes)

PubMed Central

Johnson, Jennifer L.; Wittgenstein, Helena; Mitchell, Sharon E.; Hyma, Katie E.; Temnykh, Svetlana V.; Kharlamova, Anastasiya V.; Gulevich, Rimma G.; Vladimirova, Anastasiya V.; Fong, Hiu Wa Flora; Acland, Gregory M.; Trut, Lyudmila N.; Kukekova, Anna V.

2015-01-01

The silver fox (Vulpes vulpes) offers a novel model for studying the genetics of social behavior and animal domestication. Selection of foxes, separately, for tame and for aggressive behavior has yielded two strains with markedly different, genetically determined, behavioral phenotypes. Tame strain foxes are eager to establish human contact while foxes from the aggressive strain are aggressive and difficult to handle. These strains have been maintained as separate outbred lines for over 40 generations but their genetic structure has not been previously investigated. We applied a genotyping-by-sequencing (GBS) approach to provide insights into the genetic composition of these fox populations. Sequence analysis of EcoT22I genomic libraries of tame and aggressive foxes identified 48,294 high quality SNPs. Population structure analysis revealed genetic divergence between the two strains and more diversity in the aggressive strain than in the tame one. Significant differences in allele frequency between the strains were identified for 68 SNPs. Three of these SNPs were located on fox chromosome 14 within an interval of a previously identified behavioral QTL, further supporting the importance of this region for behavior. The GBS SNP data confirmed that significant genetic diversity has been preserved in both fox populations despite many years of selective breeding. Analysis of SNP allele frequencies in the two populations identified several regions of genetic divergence between the tame and aggressive foxes, some of which may represent targets of selection for behavior. The GBS protocol used in this study significantly expanded genomic resources for the fox, and can be adapted for SNP discovery and genotyping in other canid species. PMID:26061395
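The allele-frequency comparison between strains described in this abstract rests on a simple calculation per SNP. The sketch below shows that calculation for one biallelic site. The genotype counts are hypothetical, and the abstract does not state which statistical test was used to call differences significant, so none is implemented here.

```python
def alt_allele_frequency(n_ref_hom, n_het, n_alt_hom):
    """Alternate-allele frequency from diploid genotype counts at one SNP."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    return (n_het + 2 * n_alt_hom) / total_alleles

# Hypothetical genotype counts (ref/ref, ref/alt, alt/alt) for one SNP.
tame_freq = alt_allele_frequency(2, 6, 12)
aggressive_freq = alt_allele_frequency(14, 5, 1)
frequency_difference = abs(tame_freq - aggressive_freq)

if __name__ == "__main__":
    print(tame_freq, aggressive_freq, frequency_difference)
```

In a real analysis this per-site difference would be paired with a significance test and a multiple-testing correction across all SNPs.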
Pediatric Neurosurgery Patients Need More than a Pediatric Neurosurgeon. Part II. A Clinical Report: In the USA Lack of Parent/Caregiver Compliance Interferes with the Patient Care Sequence.

PubMed

MacGregor, Teresa L; James, Hector E; Everett, Laurel; Childers, David O

2016-01-01

We have previously reported on the initiation, development, and preliminary results of a comprehensive multidisciplinary team for the long-term management of children with neurosurgical conditions other than spina bifida. This report addresses the follow-up of the care of these patients and identifies limitations in the care sequence including, but not limited to, lack of parental/caregiver compliance, unmet educational needs, and medical insurance issues. © 2016 S. Karger AG, Basel.
Tyrosine kinome sequencing of pediatric acute lymphoblastic leukemia: a report from the Children's Oncology Group TARGET Project | Office of Cancer Genomics

Cancer.gov

TARGET researchers sequenced the tyrosine kinome and downstream signaling genes in 45 high-risk pediatric ALL cases with activated kinase signaling, including Ph-like ALL, to establish the incidence of tyrosine kinase mutations in this cohort. The study confirmed previously identified somatic mutations in JAK and FLT3, but did not find novel alterations in any additional tyrosine kinases or downstream genes. The mechanism of kinase signaling activation in this high-risk subgroup of pediatric ALL remains largely unknown.
Assembly and diploid architecture of an individual human genome via single-molecule technologies

PubMed Central

Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

2015-01-01

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality. PMID:26121404
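The scaffold N50 cited in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the usual calculation follows; the example lengths are toy values, not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")

if __name__ == "__main__":
    toy_lengths = [80, 70, 50, 40, 30, 20]  # arbitrary units
    print(n50(toy_lengths))
```

For the toy list the total is 290, and the two longest scaffolds (80 + 70 = 150) already cover half of it, so the N50 is 70.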
Assembly and diploid architecture of an individual human genome via single-molecule technologies.

PubMed

Pendleton, Matthew; Sebra, Robert; Pang, Andy Wing Chun; Ummat, Ajay; Franzen, Oscar; Rausch, Tobias; Stütz, Adrian M; Stedman, William; Anantharaman, Thomas; Hastie, Alex; Dai, Heng; Fritz, Markus Hsi-Yang; Cao, Han; Cohain, Ariella; Deikus, Gintaras; Durrett, Russell E; Blanchard, Scott C; Altman, Roger; Chin, Chen-Shan; Guo, Yan; Paxinos, Ellen E; Korbel, Jan O; Darnell, Robert B; McCombie, W Richard; Kwok, Pui-Yan; Mason, Christopher E; Schadt, Eric E; Bashir, Ali

2015-08-01

We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
Genetically Diverse Low Pathogenicity Avian Influenza A Virus Subtypes Co-Circulate among Poultry in Bangladesh.

PubMed

Gerloff, Nancy A; Khan, Salah Uddin; Zanders, Natosha; Balish, Amanda; Haider, Najmul; Islam, Ausraful; Chowdhury, Sukanta; Rahman, Mahmudur Ziaur; Haque, Ainul; Hosseini, Parviez; Gurley, Emily S; Luby, Stephen P; Wentworth, David E; Donis, Ruben O; Sturm-Ramirez, Katharine; Davis, C Todd

2016-01-01

Influenza virus surveillance, poultry outbreak investigations and genomic sequencing were assessed to understand the ecology and evolution of low pathogenicity avian influenza (LPAI) A viruses in Bangladesh from 2007 to 2013. We analyzed 506 avian specimens collected from poultry in live bird markets and backyard flocks to identify influenza A viruses. Virus isolation-positive specimens (n = 50) were subtyped and their coding-complete genomes were sequenced. The most frequently identified subtypes among LPAI isolates were H9N2, H11N3, H4N6, and H1N1. Less frequently detected subtypes included H1N3, H2N4, H3N2, H3N6, H3N8, H4N2, H5N2, H6N1, H6N7, and H7N9. Gene sequences were compared to publicly available sequences using phylogenetic inference approaches. Among the 14 subtypes identified, the majority of viral gene segments were most closely related to poultry or wild bird viruses commonly found in Southeast Asia, Europe, and/or northern Africa. LPAI subtypes were distributed over several geographic locations in Bangladesh, and surface and internal protein gene segments clustered phylogenetically with a diverse number of viral subtypes suggesting extensive reassortment among these LPAI viruses. H9N2 subtype viruses differed from other LPAI subtypes because genes from these viruses consistently clustered together, indicating this subtype is enzootic in Bangladesh. The H9N2 strains identified in Bangladesh were phylogenetically and antigenically related to previous human-derived H9N2 viruses detected in Bangladesh representing a potential source for human infection. In contrast, the circulating LPAI H5N2 and H7N9 viruses were both phylogenetically and antigenically unrelated to H5 viruses identified previously in humans in Bangladesh and H7N9 strains isolated from humans in China. In Bangladesh, domestic poultry sold in live bird markets carried a wide range of LPAI virus subtypes and a high diversity of genotypes. 
These findings, combined with the seven year timeframe of sampling, indicate a continuous circulation of these viruses in the country.
Genetically Diverse Low Pathogenicity Avian Influenza A Virus Subtypes Co-Circulate among Poultry in Bangladesh

PubMed Central

Gerloff, Nancy A.; Khan, Salah Uddin; Zanders, Natosha; Balish, Amanda; Haider, Najmul; Islam, Ausraful; Chowdhury, Sukanta; Rahman, Mahmudur Ziaur; Haque, Ainul; Hosseini, Parviez; Gurley, Emily S.; Luby, Stephen P.; Wentworth, David E.; Donis, Ruben O.; Sturm-Ramirez, Katharine; Davis, C. Todd

2016-01-01

Influenza virus surveillance, poultry outbreak investigations and genomic sequencing were assessed to understand the ecology and evolution of low pathogenicity avian influenza (LPAI) A viruses in Bangladesh from 2007 to 2013. We analyzed 506 avian specimens collected from poultry in live bird markets and backyard flocks to identify influenza A viruses. Virus isolation-positive specimens (n = 50) were subtyped and their coding-complete genomes were sequenced. The most frequently identified subtypes among LPAI isolates were H9N2, H11N3, H4N6, and H1N1. Less frequently detected subtypes included H1N3, H2N4, H3N2, H3N6, H3N8, H4N2, H5N2, H6N1, H6N7, and H7N9. Gene sequences were compared to publicly available sequences using phylogenetic inference approaches. Among the 14 subtypes identified, the majority of viral gene segments were most closely related to poultry or wild bird viruses commonly found in Southeast Asia, Europe, and/or northern Africa. LPAI subtypes were distributed over several geographic locations in Bangladesh, and surface and internal protein gene segments clustered phylogenetically with a diverse number of viral subtypes suggesting extensive reassortment among these LPAI viruses. H9N2 subtype viruses differed from other LPAI subtypes because genes from these viruses consistently clustered together, indicating this subtype is enzootic in Bangladesh. The H9N2 strains identified in Bangladesh were phylogenetically and antigenically related to previous human-derived H9N2 viruses detected in Bangladesh representing a potential source for human infection. In contrast, the circulating LPAI H5N2 and H7N9 viruses were both phylogenetically and antigenically unrelated to H5 viruses identified previously in humans in Bangladesh and H7N9 strains isolated from humans in China. In Bangladesh, domestic poultry sold in live bird markets carried a wide range of LPAI virus subtypes and a high diversity of genotypes. 
These findings, combined with the seven year timeframe of sampling, indicate a continuous circulation of these viruses in the country. PMID:27010791
Characterisation of the subtelomeric regions of Giardia lamblia genome isolate WBC6.

PubMed

Prabhu, Anjali; Morrison, Hilary G; Martinez, Charles R; Adam, Rodney D

2007-04-01

Giardia trophozoites are polyploid and have five chromosomes. The chromosome homologues demonstrate considerable size heterogeneity due to variation in the subtelomeric regions. We used clones from the genome project with telomeric sequence at one end to identify six subtelomeric regions in addition to previously identified subtelomeric regions, to study the telomeric arrangement of the chromosomes. The subtelomeric regions included two retroposons, one retroposon pseudogene, and two vsp genes, in addition to the previously identified subtelomeric regions that include ribosomal DNA repeats. The presence of vsp genes in a subtelomeric region suggests that telomeric rearrangements may contribute to the generation of vsp diversity. These studies of the subtelomeric regions of Giardia may contribute to our understanding of the factors that maintain stability, while allowing diversity in chromosome structure.

Semiconductor Whole Exome Sequencing for the Identification of Genetic Variants in Colombian Patients Clinically Diagnosed with Long QT Syndrome.

PubMed

Burgos, Mariana; Arenas, Alvaro; Cabrera, Rodrigo

2016-08-01

Inherited long QT syndrome (LQTS) is a cardiac channelopathy characterized by a prolongation of QT interval and the risk of syncope, cardiac arrest, and sudden cardiac death. Genetic diagnosis of LQTS is critical in medical practice as results can guide adequate management of patients and distinguish phenocopies such as catecholaminergic polymorphic ventricular tachycardia (CPVT). However, extensive screening of large genomic regions is required in order to reliably identify genetic causes. Semiconductor whole exome sequencing (WES) is a promising approach for the identification of variants in the coding regions of most human genes. DNA samples from 21 Colombian patients clinically diagnosed with LQTS were enriched for coding regions using multiplex polymerase chain reaction (PCR) and subjected to WES using a semiconductor sequencer. Semiconductor WES showed mean coverage of 93.6 % for all coding regions relevant to LQTS at >10× depth with high intra- and inter-assay depth heterogeneity. Fifteen variants were detected in 12 patients in genes associated with LQTS. Three variants were identified in three patients in genes associated with CPVT. Co-segregation analysis was performed when possible. All variants were analyzed with two pathogenicity prediction algorithms. The overall prevalence of LQTS and CPVT variants in our cohort was 71.4 %. All LQTS variants previously identified through commercial genetic testing were identified. Standardized WES assays can be easily implemented, often at a lower cost than sequencing panels. Our results show that WES can identify LQTS-causing mutations and permits differential diagnosis of related conditions in a real-world clinical setting. However, high heterogeneity in sequencing depth and low coverage in the most relevant genes is expected to be associated with reduced analytical sensitivity.
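The coverage figure in this abstract (the share of target bases sequenced at >10× depth) can be illustrated with a short sketch. This is only a sketch under stated assumptions: the function name, the per-base depth list, and the threshold handling are hypothetical and are not taken from the study's pipeline.

```python
def breadth_of_coverage(depths, min_depth=10):
    """Percentage of positions whose read depth is strictly above min_depth."""
    if not depths:
        raise ValueError("need at least one position")
    covered = sum(1 for depth in depths if depth > min_depth)
    return 100 * covered / len(depths)


# Hypothetical per-base depths for a tiny target region, for illustration only.
example_depths = [0, 5, 11, 20, 10, 30]
example_breadth = breadth_of_coverage(example_depths)
```

A single percentage of this kind hides how unevenly depth is distributed, which is the heterogeneity the authors flag as a likely limit on analytical sensitivity.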
DNA-based stable isotope probing coupled with cultivation methods implicates Methylophaga in hydrocarbon degradation

PubMed Central

Mishamandani, Sara; Gutierrez, Tony; Aitken, Michael D.

2014-01-01

Marine hydrocarbon-degrading bacteria perform a fundamental role in the oxidation and ultimate removal of crude oil and its petrochemical derivatives in coastal and open ocean environments. Those with an almost exclusive ability to utilize hydrocarbons as a sole carbon and energy source have been found confined to just a few genera. Here we used stable isotope probing (SIP), a valuable tool to link the phylogeny and function of targeted microbial groups, to investigate hydrocarbon-degrading bacteria in coastal North Carolina sea water (Beaufort Inlet, USA) with uniformly labeled [13C]n-hexadecane. The dominant sequences in clone libraries constructed from 13C-enriched bacterial DNA (from n-hexadecane enrichments) were identified to belong to the genus Alcanivorax, with ≤98% sequence identity to the closest type strain—thus representing a putative novel phylogenetic taxon within this genus. Unexpectedly, we also identified 13C-enriched sequences in heavy DNA fractions that were affiliated to the genus Methylophaga. This is a contentious group since, though some of its members have been proposed to degrade hydrocarbons, substantive evidence has not previously confirmed this. We used quantitative PCR primers targeting the 16S rRNA gene of the SIP-identified Alcanivorax and Methylophaga to determine their abundance in incubations amended with unlabeled n-hexadecane. Both showed substantial increases in gene copy number during the experiments. Subsequently, we isolated a strain representing the SIP-identified Methylophaga sequences (99.9% 16S rRNA gene sequence identity) and used it to show, for the first time, direct evidence of hydrocarbon degradation by a cultured Methylophaga sp. This study demonstrates the value of coupling SIP with cultivation methods to identify and expand on the known diversity of hydrocarbon-degrading bacteria in the marine environment. PMID:24578702
Identification and characterization of microRNAs in Phaseolus vulgaris by high-throughput sequencing

PubMed Central

2012-01-01

Background: MicroRNAs (miRNAs) are endogenously encoded small RNAs that post-transcriptionally regulate gene expression. MiRNAs play essential roles in almost all plant biological processes. Currently, few miRNAs have been identified in the model food legume Phaseolus vulgaris (common bean). Recent advances in next generation sequencing technologies have allowed the identification of conserved and novel miRNAs in many plant species. Here, we used Illumina's sequencing by synthesis (SBS) technology to identify and characterize the miRNA population of Phaseolus vulgaris. Results: Small RNA libraries were generated from roots, flowers, leaves, and seedlings of P. vulgaris. Based on similarity to previously reported plant miRNAs, 114 miRNAs belonging to 33 conserved miRNA families were identified. Stem-loop precursors and target gene sequences for several conserved common bean miRNAs were determined from publicly available databases. Less conserved miRNA families and species-specific common bean miRNA isoforms were also characterized. Moreover, novel miRNAs based on the small RNAs were found and their potential precursors were predicted. In addition, new target candidates for novel and conserved miRNAs were proposed. Finally, we studied organ-specific miRNA family expression levels through miRNA read frequencies. Conclusions: This work represents the first massive-scale RNA sequencing study performed in Phaseolus vulgaris to identify and characterize its miRNA population. It significantly increases the number of miRNAs, precursors, and targets identified in this agronomically important species. The miRNA expression analysis provides a foundation for understanding common bean miRNA organ-specific expression patterns. The present study offers an expanded picture of P. vulgaris miRNAs in relation to those of other legumes. PMID:22394504
Sequence variation in the env gene of simian immunodeficiency virus recovered from immunized macaques is predominantly in the V1 region.

PubMed

Almond, N; Jenkins, A; Heath, A B; Kitchin, P

1993-05-01

Three cynomolgus macaques were immunized with recombinant envelope protein preparations derived from simian immunodeficiency virus (SIV). Although humoral and cellular responses were elicited by the immunization regime, all macaques became infected upon challenge with 10 MID50 of the 11/88 virus challenge stock of SIVmac251-32H. The polymerase chain reaction was used to amplify proviral SIV gp120 sequences present in the blood of both immunized and control macaques at 2 months post-infection. A comparison of the predominant sequences found in the region from V2 to V5 of gp120 failed to differentiate provirus recovered from either immunized or control animals. A detailed investigation of sequences obtained from the hypervariable V1 region identified a mixture of sequences in both immunized and control macaques. Some sequences were identical to those previously detected in the virus challenge stock, whereas others had not been detected previously. Phenogram analysis of the new V1 sequences found in immunized animals revealed that they were quite distinct from those from the virus challenge stock and that they included alterations to potential N-linked glycosylation sites. In contrast, new sequence variants recovered from the control animals were closely related to sequences from the virus challenge stock. The difference in diversity of new V1 sequences recovered from immunized and control macaques was highly significant (P < 0.001). Thus, the presence of pre-existing immune responses to SIV envelope protein is associated with greater genetic change in the V1 region of gp120. These data are discussed in relation to the epitopes of SIV gp120 that may confer protection from in vivo challenge.
Miniprimer PCR, a New Lens for Viewing the Microbial World

PubMed Central

Isenbarger, Thomas A.; Finney, Michael; Ríos-Velázquez, Carlos; Handelsman, Jo; Ruvkun, Gary

2008-01-01

Molecular methods based on the 16S rRNA gene sequence are used widely in microbial ecology to reveal the diversity of microbial populations in environmental samples. Here we show that a new PCR method using an engineered polymerase and 10-nucleotide “miniprimers” expands the scope of detectable sequences beyond those detected by standard methods using longer primers and Taq polymerase. After testing the method in silico to identify divergent ribosomal genes in previously cloned environmental sequences, we applied the method to soil and microbial mat samples, which revealed novel 16S rRNA gene sequences that would not have been detected with standard primers. Deeply divergent sequences were discovered with high frequency and included representatives that define two new division-level taxa, designated CR1 and CR2, suggesting that miniprimer PCR may reveal new dimensions of microbial diversity. PMID:18083877
Cytogenetic evidence for asexual evolution of bdelloid rotifers.

PubMed

Mark Welch, Jessica L; Mark Welch, David B; Meselson, Matthew

2004-02-10

DNA sequencing has shown individual bdelloid rotifer genomes to contain two or more diverged copies of every gene examined and has revealed no closely similar copies. These and other findings are consistent with long-term asexual evolution of bdelloids. It is not entirely ruled out, however, that bdelloid genomes consist of previously undetected pairs of sequences so similar as to be identical over the regions sequenced, as might result if bdelloids were highly inbred sexual diploids or polyploids. Here, we employ fluorescent in situ hybridization with cosmid probes to determine the copy number and chromosomal distribution of the heat shock gene hsp82 and adjacent sequences in the bdelloid Philodina roseola. We conclude that the four copies identified by sequencing are the only ones present and that each is on a separate chromosome. Bdelloids therefore are not highly homozygous sexually reproducing diploids or polyploids.
Sequence analysis of the 5.8S ribosomal DNA and internal transcribed spacers (ITS1 and ITS2) from five species of the Oxalis tuberosa alliance.

PubMed

Tosto, D S; Hopp, H E

1996-01-01

The internal transcribed spacer region (ITS1 and ITS2) of the 18S-25S nuclear ribosomal DNA sequence and the intervening 5.8S region from five species of the genus Oxalis were amplified by polymerase chain reaction and subjected to direct DNA sequencing. On the basis of cytogenetic studies, some species of this genus were postulated to be related by the number of chromosomes. Sequence homologies in the ITS1, 5.8S and ITS2 among species are in good agreement with previous relationships established on the basis of chromosome numbers. We also identified a highly conserved sequence of six bp in the ITS1, reported to be present in a wide range of flowering plants, but not in the Oxalidaceae family to which the genus Oxalis belongs.
Coordinated regulation of accessory genetic elements produces cyclic di-nucleotides for V. cholerae virulence.

PubMed

Davies, Bryan W; Bogard, Ryan W; Young, Travis S; Mekalanos, John J

2012-04-13

The function of the Vibrio 7th pandemic island-1 (VSP-1) in cholera pathogenesis has remained obscure. Utilizing chromatin immunoprecipitation sequencing and RNA sequencing to map the regulon of the master virulence regulator ToxT, we identify a TCP island-encoded small RNA that reduces the expression of a previously unrecognized VSP-1-encoded transcription factor termed VspR. VspR modulates the expression of several VSP-1 genes including one that encodes a novel class of di-nucleotide cyclase (DncV), which preferentially synthesizes a previously undescribed hybrid cyclic AMP-GMP molecule. We show that DncV is required for efficient intestinal colonization and downregulates V. cholerae chemotaxis, a phenotype previously associated with hyperinfectivity. This pathway couples the actions of previously disparate genomic islands, defines VSP-1 as a pathogenicity island in V. cholerae, and implicates its occurrence in 7th pandemic strains as a benefit for host adaptation through the production of a regulatory cyclic di-nucleotide. Copyright © 2012 Elsevier Inc. All rights reserved.
Top-down mass spectrometry reveals new sequence variants of the major bovine seminal plasma protein PDC-109.

PubMed

Laitaoja, Mikko; Sankhala, Rajeshwer S; Swamy, Musti J; Jänis, Janne

2012-07-01

The major protein of bovine seminal plasma, PDC-109, is a 109-residue polypeptide that exists as a polydisperse aggregate under native conditions. The oligomeric state of this aggregate varies with ionic strength and the presence of lipids. Binding of PDC-109 to choline phospholipids on the sperm plasma membrane results in an efflux of cholesterol and choline phospholipids, which is an important step in sperm capacitation. In this study, Fourier transform ion cyclotron resonance mass spectrometry was used to analyze PDC-109 purified from bovine seminal plasma. In addition to the previously known PDC-109 variants, four new sequence variants were identified by top-down mass spectrometry. For example, a protein variant containing point mutations P10L and G14R was identified along with another form having a 14-residue truncation in the N-terminal region. Two other minor variants could also be identified from the affinity-purified PDC-109. These results demonstrate that PDC-109 is naturally produced as a mixture of several protein forms, most of which have not been detected in previous studies. Native mass spectrometry revealed that PDC-109 is exclusively monomeric at low protein concentrations, suggesting that the protein oligomers are weakly bound and can easily be disrupted. Ligand binding to PDC-109 was also investigated, and it was observed that two molecules of O-phosphorylcholine bind to each PDC-109 monomer, consistent with previous reports. Copyright © 2012 John Wiley & Sons, Ltd.
Ultra-deep sequencing of ribosome-associated poly-adenylated RNA in early Drosophila embryos reveals hundreds of conserved translated sORFs.

PubMed

Li, Hongmei; Hu, Chuansheng; Bai, Ling; Li, Hua; Li, Mingfa; Zhao, Xiaodong; Czajkowsky, Daniel M; Shao, Zhifeng

2016-12-01

There is growing recognition that small open reading frames (sORFs) encoding peptides shorter than 100 amino acids are an important class of functional elements in the eukaryotic genome, with several already identified to play critical roles in growth, development, and disease. However, our understanding of their biological importance has been hindered owing to the significant technical challenges limiting their annotation. Here we combined ultra-deep sequencing of ribosome-associated poly-adenylated RNAs with rigorous conservation analysis to identify a comprehensive population of translated sORFs during early Drosophila embryogenesis. In total, we identify 399 sORFs, including those previously annotated but without evidence of translational capacity, those found within transcripts previously classified as non-coding, and those not previously known to be transcribed. Further, we find, for the first time, evidence for translation of many sORFs with different isoforms, suggesting their regulation is as complex as longer ORFs. Furthermore, many sORFs are found not associated with ribosomes in late-stage Drosophila S2 cells, suggesting that many of the translated sORFs may have stage-specific functions during embryogenesis. These results thus provide the first comprehensive annotation of the sORFs present during early Drosophila embryogenesis, a necessary basis for a detailed delineation of their function in embryogenesis and other biological processes. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Novel mutations in CRB1 gene identified in a Chinese pedigree with retinitis pigmentosa by targeted capture and next generation sequencing

PubMed Central

Lo, David; Weng, Jingning; Liu, Xiaohong; Yang, Juhua; He, Fen; Wang, Yun; Liu, Xuyang

2016-01-01

PURPOSE: To detect the disease-causing gene in a Chinese pedigree with autosomal-recessive retinitis pigmentosa (ARRP). METHODS: All subjects in this family underwent a complete ophthalmic examination. Targeted-capture next generation sequencing (NGS) was performed on the proband to detect variants. All variants were verified in the remaining family members by PCR amplification and Sanger sequencing. RESULTS: All the affected subjects in this pedigree were diagnosed with retinitis pigmentosa (RP). The compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations in the Crumbs homolog 1 (CRB1) gene were identified in all the affected patients but not in the unaffected individuals in this family. One mutation was inherited from each parent. CONCLUSION: Novel compound heterozygous mutations in CRB1 were identified in a Chinese pedigree with ARRP using targeted-capture next generation sequencing. Based on their pattern of inheritance and the impaired protein function, the compound heterozygous c.138delA (p.Asp47IlefsX24) and c.1841G>T (p.Gly614Val) mutations are the cause of early-onset ARRP in this pedigree. To the best of our knowledge, these compound mutations have not been reported previously. PMID:27806333
A high-resolution genetic, physical, and comparative gene map of the doublefoot (Dbf) region of mouse chromosome 1 and the region of conserved synteny on human chromosome 2q35.

PubMed

Hayes, C; Rump, A; Cadman, M R; Harrison, M; Evans, E P; Lyon, M F; Morriss-Kay, G M; Rosenthal, A; Brown, S D

2001-12-01

The mouse doublefoot (Dbf) mutant exhibits preaxial polydactyly in association with craniofacial defects. This mutation has previously been mapped to mouse chromosome 1. We have used a positional cloning strategy, coupled with a comparative sequencing approach using available human draft sequence, to identify putative candidates for the Dbf gene in the mouse and in the homologous human region. We have constructed a high-resolution genetic map of the region, localizing the mutation to a 0.4-cM (±0.0061) interval on mouse chromosome 1. Furthermore, we have constructed contiguous BAC/PAC clone maps across the mouse and human Dbf region. Using existing markers and additional sequence-tagged sites that we generated, we have anchored the physical map to the genetic map. Through the comparative sequencing of these clones we have identified 35 genes within this interval, indicating that the region is gene-rich. From this we have identified several genes that are known to be differentially expressed in the developing mid-gestation mouse embryo, some in the developing embryonic limb buds. These genes include those encoding known developmental signaling molecules such as WNT proteins and IHH, and we provide evidence that these genes are candidates for the Dbf mutation.
Diagnostic Yield of Next-Generation Sequencing in Very Early-Onset Inflammatory Bowel Diseases: A Multicenter Study.

PubMed

Charbit-Henrion, Fabienne; Parlato, Marianna; Hanein, Sylvain; Duclaux-Loras, Rémi; Nowak, Jan; Begue, Bernadette; Rakotobe, Sabine; Bruneau, Julie; Fourrage, Cécile; Alibeu, Olivier; Rieux-Laucat, Frédéric; Lévy, Eva; Stolzenberg, Marie-Claude; Mazerolles, Fabienne; Latour, Sylvain; Lenoir, Christelle; Fischer, Alain; Picard, Capucine; Aloi, Marina; Amil Dias, Jorge; Ben Hariz, Mongi; Bourrier, Anne; Breuer, Christian; Breton, Anne; Bronski, Jiri; Buderus, Stephan; Cananzi, Mara; Coopman, Stéphanie; Crémilleux, Clara; Dabadie, Alain; Dumant-Forest, Clémentine; Egritas Gurkan, Odul; Fabre, Alexandre; Fischer, Aude; German Diaz, Marta; Gonzalez-Lama, Yago; Goulet, Olivier; Guariso, Graziella; Gurcan, Neslihan; Homan, Matjaz; Hugot, Jean-Pierre; Jeziorski, Eric; Karanika, Evi; Lachaux, Alain; Lewindon, Peter; Lima, Rosa; Magro, Fernando; Major, Janos; Malamut, Georgia; Mas, Emmanuel; Mattyus, Istvan; Mearin, Luisa M; Melek, Jan; Navas-Lopez, Victor Manuel; Paerregaard, Anders; Pelatan, Cecile; Pigneur, Bénédicte; Pinto Pais, Isabel; Rebeuh, Julie; Romano, Claudio; Siala, Nadia; Strisciuglio, Caterina; Tempia-Caliera, Michela; Tounian, Patrick; Turner, Dan; Urbonas, Vaidotas; Willot, Stéphanie; Ruemmele, Frank M; Cerf-Bensussan, Nadine

2018-05-18

An expanding number of monogenic defects have been identified as causative of severe forms of very early-onset inflammatory bowel diseases (VEO-IBD). The present study aimed at defining how next-generation sequencing (NGS) methods can be used to improve identification of known molecular diagnoses and adapt treatment. A total of 207 children were recruited in 45 paediatric centres through an international collaborative network (ESPGHAN GENIUS working group) with a clinical presentation of severe VEO-IBD (n=185) or an anamnesis suggestive of a monogenic disorder (n=22). Patients were divided at inclusion into three phenotypic subsets: predominantly small bowel inflammation, colitis with perianal lesions, and colitis only. Methods to obtain molecular diagnosis included functional tests followed by specific Sanger sequencing, custom-made targeted NGS, and in selected cases whole exome sequencing (WES) of parents-child trios. Genetic findings were validated clinically and/or functionally. Molecular diagnosis was achieved in 66/207 children (32%): 61% with small bowel inflammation, 39% with colitis and perianal lesions and 18% with colitis only. Targeted NGS pinpointed gene mutations causative of atypical presentations and identified large exonic copy number variations previously missed by WES. Our results lead us to propose an optimised diagnostic strategy to identify known monogenic causes of severe IBD.
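The headline diagnostic yield in this abstract is simple arithmetic on the reported counts: 66 diagnosed children out of a cohort of 207 (185 + 22). The Python sketch below reproduces that calculation; the function name is an illustrative assumption, and only the counts stated in the abstract are used.

```python
def diagnostic_yield(diagnosed, cohort):
    """Return the diagnostic yield as a percentage of the cohort."""
    if cohort <= 0:
        raise ValueError("cohort size must be positive")
    return 100 * diagnosed / cohort


# Counts as reported in the abstract.
severe_veo_ibd = 185          # clinical presentation of severe VEO-IBD
suggestive_anamnesis = 22     # anamnesis suggestive of a monogenic disorder
cohort_size = severe_veo_ibd + suggestive_anamnesis   # 207 children
overall_yield = diagnostic_yield(66, cohort_size)     # about 31.9, reported as 32%
```

The per-subset percentages (61%, 39%, 18%) cannot be recomputed the same way, because the abstract does not give the size of each phenotypic subset.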
Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

PubMed

Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

2014-01-01

High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.
Screening strategies for a highly polymorphic gene: DHPLC analysis of the Fanconi anemia group A gene.

PubMed

Rischewski, J; Schneppenheim, R

2001-01-30

Patients with Fanconi anemia (Fanc) are at risk of developing leukemia. Mutations of the group A gene (FancA) are most common. A multitude of polymorphisms and mutations within the 43 exons of the gene are described. To examine the role of heterozygosity as a risk factor for malignancies, a partially automatized screening method to identify aberrations was needed. We report on our experience with DHPLC (WAVE (Transgenomic)). PCR amplification of all 43 exons from one individual was performed on one microtiter plate on a gradient thermocycler. DHPLC analysis conditions were established via melting curves, prediction software, and test runs with aberrant samples. PCR products were analyzed twice: native, and after adding a WT-PCR product. Retention patterns were compared with previously identified polymorphic PCR products or mutants. We have defined the mutation screening conditions for all 43 exons of FancA using DHPLC. So far, 40 different sequence variations have been detected in more than 100 individuals. The native analysis identifies heterozygous individuals, and the second run detects homozygous aberrations. Retention patterns are specific for the underlying sequence aberration, thus reducing sequencing demand and costs. DHPLC is a valuable tool for reproducible recognition of known sequence aberrations and screening for unknown mutations in the highly polymorphic FancA gene.
The evolutionary history of the DMRT3 'Gait keeper' haplotype.

PubMed

Staiger, E A; Almén, M S; Promerová, M; Brooks, S; Cothran, E G; Imsland, F; Jäderkvist Fegraeus, K; Lindgren, G; Mehrabani Yeganeh, H; Mikko, S; Vega-Pla, J L; Tozaki, T; Rubin, C J; Andersson, L

2017-10-01

A previous study revealed a strong association between the DMRT3:Ser301STOP mutation in horses and alternate gaits as well as performance in harness racing. Several follow-up studies have confirmed a high frequency of the mutation in gaited horse breeds and an effect on gait quality. The aim of this study was to determine when and where the mutation arose, to identify additional potential causal mutations and to determine the coalescence time for contemporary haplotypes carrying the stop mutation. We utilized sequences from 89 horses representing 26 breeds to identify 102 SNPs encompassing the DMRT3 gene that are in strong linkage disequilibrium with the stop mutation. These 102 SNPs were genotyped in an additional 382 horses representing 72 breeds, and we identified 14 unique haplotypes. The results provided conclusive evidence that DMRT3:Ser301STOP is causal, as no other sequence polymorphisms showed an equally strong association to locomotion traits. The low sequence diversity among mutant chromosomes demonstrated that they must have diverged from a common ancestral sequence within the last 10 000 years. Thus, the mutation occurred either just before domestication or more likely some time after domestication and then spread across the world as a result of selection on locomotion traits. © 2017 Stichting International Foundation for Animal Genetics.
Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A

PubMed Central

2018-01-01

New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM. PMID:29495282
Refining the Results of a Classical SELEX Experiment by Expanding the Sequence Data Set of an Aptamer Pool Selected for Protein A.

PubMed

Stoltenburg, Regina; Strehlitz, Beate

2018-02-24

New, as yet undiscovered aptamers for Protein A were identified by applying next generation sequencing (NGS) to a previously selected aptamer pool. This pool was obtained in a classical SELEX (Systematic Evolution of Ligands by EXponential enrichment) experiment using the FluMag-SELEX procedure followed by cloning and Sanger sequencing. PA#2/8 was identified as the only Protein A-binding aptamer from the Sanger sequence pool, and was shown to be able to bind intact cells of Staphylococcus aureus. In this study, we show the extension of the SELEX results by re-sequencing of the same aptamer pool using a medium throughput NGS approach and data analysis. Both data pools were compared. They confirm the selection of a highly complex and heterogeneous oligonucleotide pool and show consistently a high content of orphans as well as a similar relative frequency of certain sequence groups. But in contrast to the Sanger data pool, the NGS pool was clearly dominated by one sequence group containing the known Protein A-binding aptamer PA#2/8 as the most frequent sequence in this group. In addition, we found two new sequence groups in the NGS pool represented by PA-C10 and PA-C8, respectively, which also have high specificity for Protein A. Comparative affinity studies reveal differences between the aptamers and confirm that PA#2/8 remains the most potent sequence within the selected aptamer pool reaching affinities in the low nanomolar range of KD = 20 ± 1 nM.
Development and Verification of an RNA Sequencing (RNA-Seq) Assay for the Detection of Gene Fusions in Tumors.

PubMed

Winters, Jennifer L; Davila, Jaime I; McDonald, Amber M; Nair, Asha A; Fadra, Numrah; Wehrs, Rebecca N; Thomas, Brittany C; Balcom, Jessica R; Jin, Long; Wu, Xianglin; Voss, Jesse S; Klee, Eric W; Oliver, Gavin R; Graham, Rondell P; Neff, Jadee L; Rumilla, Kandelaria M; Aypar, Umut; Kipp, Benjamin R; Jenkins, Robert B; Jen, Jin; Halling, Kevin C

2018-06-13

We assessed the performance characteristics of an RNA sequencing (RNA-Seq) assay designed to detect gene fusions in 571 genes to help manage patients with cancer. Polyadenylated RNA was converted to cDNA, which was then used to prepare next-generation sequencing libraries that were sequenced on an Illumina HiSeq 2500 instrument and analyzed with an in-house developed bioinformatic pipeline. The assay identified 38 of 41 gene fusions detected by another method, such as fluorescence in situ hybridization or RT-PCR, for a sensitivity of 93%. No false-positive gene fusions were identified in 15 normal tissue specimens and 10 tumor specimens that were negative for fusions by RNA sequencing or Mate Pair NGS (100% specificity). The assay also identified 22 fusions in 17 tumor specimens that had not been detected by other methods. Eighteen of the 22 fusions had not previously been described. Good intra-assay and interassay reproducibility was observed with complete concordance for the presence or absence of gene fusions in replicates. The analytical sensitivity of the assay was tested by diluting RNA isolated from gene fusion-positive cases with fusion-negative RNA. Gene fusions were generally detectable down to 12.5% dilutions for most fusions and as little as 3% for some fusions. This assay can help identify fusions in patients with cancer; these patients may in turn benefit from both US Food and Drug Administration-approved and investigational targeted therapies. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
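The sensitivity and specificity figures quoted in the abstract above follow directly from the reported counts. The short sketch below reproduces that arithmetic; the helper name `rate` is hypothetical, and treating the 15 normal and 10 fusion-negative tumor specimens as 25 true negatives is an assumption drawn from the wording of the abstract, not a description of the authors' pipeline.

```python
def rate(hits: int, total: int) -> float:
    """Return hits/total as a percentage (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * hits / total


# Counts taken from the abstract: 38 of 41 known fusions were detected.
sensitivity = rate(38, 41)

# Assumption: the 15 normal and 10 fusion-negative tumor specimens are
# pooled as 25 true negatives, with no false positives reported.
specificity = rate(15 + 10, 15 + 10)

print(f"sensitivity = {sensitivity:.0f}%")
print(f"specificity = {specificity:.0f}%")
```

Rounded to whole percentages, this matches the 93% sensitivity and 100% specificity stated in the record.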
The characteristics of seismic activity during the 2016 Kumamoto Earthquake sequence

NASA Astrophysics Data System (ADS)

Yano, T. E.; Matsubara, M.

2016-12-01

We relocated hypocenters of the 2016 Kumamoto Earthquake sequence (N = 37,136 hypocenters in total, relocated within five independent regions) by applying NIED Hi-net phase pick data and waveform cross-correlations to hypoDD (Waldhauser and Ellsworth, 2000), the double-difference method. The relocated seismicity clearly traces lines along the background seismicity, such as the Hinagu, Futagawa, and Beppu-Haneyama fault zones and the Mt. Aso area, but also forms linear seismic activity in previously quiet areas, including the northern edge of the caldera of Mt. Aso (Aso caldera) and some areas within the Beppu-Haneyama fault zone. The two mainshocks, M6.5 on April 14th and M7.3 on April 16th, occurred in the region where the Hinagu and Futagawa faults meet. Our results show that the seismicity forms a shape distinct enough to identify a line along the Hinagu fault for about 20 km immediately after the M6.5 event, and that this activity continues after the M7.3 event. It is also possible to trace a line of seismicity along the Futagawa fault to the east (about 28 km in total), while the northern part of the Aso caldera and the Ohita region along the Beppu-Haneyama fault zone become active only after the M7.3 event. Seismicity appears during the Kumamoto Earthquake sequence not only along the known faults but also in areas where it was not confirmed in the background seismicity of a previous relocation study covering 2000 to 2012 (Yano et al., 2016). Comparing our high-resolution relocated catalogs of the Kumamoto region from the previous study and this study enabled us to identify several interesting characteristics: (1) the quiet area forming a gap in seismicity between the northeast extension of the Futagawa fault zone and the Mt. Aso region appears only after the M7.3 event; (2) seismicity forming vertical or high-angle dipping structures in the Aso and Ohita regions is selectively activated; and (3) linear seismicity appears in previously unconfirmed regions in the northern part of the Aso caldera and along the Beppu-Haneyama fault zone. We present these characteristics of seismicity during the Kumamoto Earthquake sequence in detail.

Divergence in substrate specificity by the vOTU domain of various strains of highly-pathogenic PRRSV and the implications to pathogenicity

USDA-ARS's Scientific Manuscript database

Porcine reproductive and respiratory syndrome virus (PRRSV) is widespread, shows high variation in sequence and virulence among divergent strains, and causes an economically destructive disease. A viral ovarian tumor domain protease (vOTU) has been previously identified within the nonstructural protein...
Introductions of West Nile Virus Strains to Mexico

PubMed Central

Deardorff, Eleanor; Estrada-Franco, José G.; Brault, Aaron C.; Navarro-Lopez, Roberto; Campomanes-Cortes, Arturo; Paz-Ramirez, Pedro; Solis-Hernandez, Mario; Ramey, Wanichaya N.; Davis, C. Todd; Beasley, David W.C.; Tesh, Robert B.; Barrett, Alan D.T.

2006-01-01

Complete genome sequencing of 22 West Nile virus isolates suggested 2 independent introductions into Mexico. A previously identified mouse-attenuated glycosylation variant was introduced into southern Mexico through the southeastern United States, while a common US genotype appears to have been introduced incrementally into northern Mexico through the southwestern United States. PMID:16494762
PHYLOGENETIC AFFILIATION OF WATER DISTRIBUTION SYSTEM BACTERIAL ISOLATES USING 16S RDNA SEQUENCE ANALYSIS

EPA Science Inventory

In a previously described study, only 15% of the bacterial strains isolated from a water distribution system (WDS) grown on R2A agar were identifiable using fatty acid methyl esters (FAME) profiling. The lack of success was attributed to the use of fatty acid databases of bacter...
Integron-Associated DfrB4, a Previously Uncharacterized Member of the Trimethoprim-Resistant Dihydrofolate Reductase B Family, Is a Clinically Identified Emergent Source of Antibiotic Resistance.

PubMed

Toulouse, Jacynthe L; Edens, Thaddeus J; Alejaldre, Lorea; Manges, Amee R; Pelletier, Joelle N

2017-05-01

Whole-genome sequencing of trimethoprim-resistant Escherichia coli clinical isolates identified a member of the trimethoprim-resistant type II dihydrofolate reductase gene family (dfrB). The dfrB4 gene was located within a class I integron flanked by multiple resistance genes. This arrangement was previously reported in a 130.6-kb multiresistance plasmid. The DfrB4 protein conferred a >2,000-fold increased trimethoprim resistance on overexpression in E. coli. Our results are consistent with the finding that dfrB4 contributes to clinical trimethoprim resistance. Copyright © 2017 American Society for Microbiology.
Whole genome sequences of two octogenarians with sustained cognitive abilities

PubMed Central

Nickles, Dorothee; Madireddy, Lohith; Patel, Nihar; Isobe, Noriko; Miller, Bruce L.; Baranzini, Sergio E.; Kramer, Joel H.; Oksenberg, Jorge R.

2014-01-01

Although numerous genetic variants affecting aging and mortality have been identified, e.g. APOE ε4, the genetic component influencing cognitive aging has not been fully defined yet. A better knowledge of the genetics of aging will prove helpful in understanding the underlying biological processes. Here, we describe the whole genome sequences of two female octogenarians. We provide the repertoire of genomic variants that the two octogenarians have in common. We also describe the overlap with the previously reported genomes of two supercentenarians - individuals aged ≥ 110 years. We assessed the genetic disease propensities of the octogenarians and non-aged control genomes and could not find support for the hypothesis that long-lived healthy individuals might exhibit greater genetic fitness than the general population. Furthermore, there is no evidence for an accumulation of previously described variants promoting longevity in the two octogenarians. These findings suggest that genetic fitness, as currently defined, is not the sole factor enabling an increased lifespan. We identified a number of healthy-cognitive-aging candidate genetic loci awaiting confirmation in larger studies. PMID:25618617
Whole genome sequences of 2 octogenarians with sustained cognitive abilities.

PubMed

Nickles, Dorothee; Madireddy, Lohith; Patel, Nihar; Isobe, Noriko; Miller, Bruce L; Baranzini, Sergio E; Kramer, Joel H; Oksenberg, Jorge R

2015-03-01

Although numerous genetic variants affecting aging and mortality have been identified, for example, apolipoprotein E ε4, the genetic component influencing cognitive aging has not been fully defined yet. A better knowledge of the genetics of aging will prove helpful in understanding the underlying biological processes. Here, we describe the whole genome sequences of 2 female octogenarians. We provide the repertoire of genomic variants that the 2 octogenarians have in common. We also describe the overlap with the previously reported genomes of 2 supercentenarians—individuals aged ≥110 years. We assessed the genetic disease propensities of the octogenarians and non-aged control genomes and could not find support for the hypothesis that long-lived healthy individuals might exhibit greater genetic fitness than the general population. Furthermore, there is no evidence for an accumulation of previously described variants promoting longevity in the 2 octogenarians. These findings suggest that genetic fitness, as currently defined, is not the sole factor enabling an increased life span. We identified a number of healthy-cognitive-aging candidate genetic loci awaiting confirmation in larger studies. Copyright © 2015 Elsevier Inc. All rights reserved.
A New Primer to Amplify pmoA Gene From NC10 Bacteria in the Sediments of Dongchang Lake and Dongping Lake.

PubMed

Wang, Shenghui; Liu, Yanjun; Liu, Guofu; Huang, Yaru; Zhou, Yu

2017-08-01

Nitrite-dependent anaerobic methane oxidation (n-damo) is catalyzed by the NC10 phylum bacterium "Candidatus Methylomirabilis oxyfera" (M. oxyfera). Generally, the pmoA gene is applied as a functional marker to test for and identify NC10-like bacteria. However, it is difficult to detect NC10 bacteria in sediments of freshwater lakes (Dongchang Lake and Dongping Lake) with the previous pmoA gene primer sets. In this work, a new primer, cmo208, was designed and used to amplify the pmoA gene of NC10-like bacteria. A new nested PCR approach was performed using the new primer cmo208 and the previous primers cmo182, cmo682, and cmo568 to detect the NC10 bacteria. The obtained pmoA gene sequences exhibited 85-92% nucleotide identity and 95-97% amino acid sequence identity to the pmoA gene of M. oxyfera. The diversity of the obtained pmoA gene sequences coincided well with the diversity of 16S rRNA sequences. These results indicate that the newly designed pmoA primer cmo208 offers one more option for detecting NC10 bacteria in different environmental samples.
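The identity percentages reported above compare aligned sequences position by position. The abstract does not say which tool or gap convention was used, so the following is only a minimal sketch of one common definition: matches divided by the number of aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the study.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are
    ignored; other tools may count gaps differently.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (illustrative only): 9 of 10 positions match.
print(percent_identity("ATGGCTAAGC", "ATGGCAAAGC"))
```

The same function works for aligned amino acid sequences, which is how a nucleotide identity and an amino acid identity can differ for the same gene: synonymous substitutions lower the former but not the latter.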
Genome analysis of the foxtail millet pathogen Sclerospora graminicola reveals the complex effector repertoire of graminicolous downy mildews.

PubMed

Kobayashi, Michie; Hiraka, Yukie; Abe, Akira; Yaegashi, Hiroki; Natsume, Satoshi; Kikuchi, Hideko; Takagi, Hiroki; Saitoh, Hiromasa; Win, Joe; Kamoun, Sophien; Terauchi, Ryohei

2017-11-22

Downy mildew, caused by the oomycete pathogen Sclerospora graminicola, is an economically important disease of Gramineae crops including foxtail millet (Setaria italica). Plants infected with S. graminicola are generally stunted and often undergo a transformation of flower organs into leaves (phyllody or witches' broom), resulting in serious yield loss. To establish the molecular basis of downy mildew disease in foxtail millet, we carried out whole-genome sequencing and an RNA-seq analysis of S. graminicola. Sequence reads were generated from S. graminicola using an Illumina sequencing platform and assembled de novo into a draft genome sequence comprising approximately 360 Mbp. Of this sequence, 73% comprised repetitive elements, and a total of 16,736 genes were predicted from the RNA-seq data. The predicted genes included those encoding effector-like proteins with high sequence similarity to those previously identified in other oomycete pathogens. Genes encoding jacalin-like lectin-domain-containing secreted proteins were enriched in S. graminicola compared to other oomycetes. Of a total of 1220 genes encoding putative secreted proteins, 91 significantly changed their expression levels during the infection of plant tissues compared to the sporangia and zoospore stages of the S. graminicola lifecycle. We established the draft genome sequence of a downy mildew pathogen that infects Gramineae plants. Based on this sequence and our transcriptome analysis, we generated a catalog of in planta-induced candidate effector genes, providing a solid foundation from which to identify the effectors causing phyllody.
Observations of disconnection of open coronal magnetic structures

NASA Technical Reports Server (NTRS)

Mccomas, D. J.; Phillips, J. L.; Hundhausen, A. J.; Burkepile, J. T.

1991-01-01

The Solar Maximum Mission coronagraph/polarimeter observations are surveyed for evidence of magnetic disconnection of previously open magnetic structures, and several sequences of images consistent with this interpretation are identified. Such disconnection occurs when open field lines above helmet streamers reconnect, in contrast to previously suggested disconnections of CMEs into closed plasmoids. In this paper a clear example of open field disconnection is shown in detail. The event, on June 27, 1988, is preceded by compression of a preexisting helmet streamer and the open coronal field around it. The compressed helmet streamer and surrounding open field region detach in a large U-shaped structure which subsequently accelerates outward from the sun. The observed sequence of events is consistent with reconnection across the heliospheric current sheet and the creation of a detached U-shaped magnetic structure. Unlike CMEs, which may open new magnetic flux into interplanetary space, this process could serve to close off previously open flux, perhaps helping to maintain the roughly constant amount of open magnetic flux observed in interplanetary space.
High-throughput discovery of novel developmental phenotypes

PubMed Central

Dickinson, Mary E.; Flenniken, Ann M.; Ji, Xiao; Teboul, Lydia; Wong, Michael D.; White, Jacqueline K.; Meehan, Terrence F.; Weninger, Wolfgang J.; Westerberg, Henrik; Adissu, Hibret; Baker, Candice N.; Bower, Lynette; Brown, James M.; Caddle, L. Brianna; Chiani, Francesco; Clary, Dave; Cleak, James; Daly, Mark J.; Denegre, James M.; Doe, Brendan; Dolan, Mary E.; Edie, Sarah M.; Fuchs, Helmut; Gailus-Durner, Valerie; Galli, Antonella; Gambadoro, Alessia; Gallegos, Juan; Guo, Shiying; Horner, Neil R.; Hsu, Chih-wei; Johnson, Sara J.; Kalaga, Sowmya; Keith, Lance C.; Lanoue, Louise; Lawson, Thomas N.; Lek, Monkol; Mark, Manuel; Marschall, Susan; Mason, Jeremy; McElwee, Melissa L.; Newbigging, Susan; Nutter, Lauryl M.J.; Peterson, Kevin A.; Ramirez-Solis, Ramiro; Rowland, Douglas J.; Ryder, Edward; Samocha, Kaitlin E.; Seavitt, John R.; Selloum, Mohammed; Szoke-Kovacs, Zsombor; Tamura, Masaru; Trainor, Amanda G; Tudose, Ilinca; Wakana, Shigeharu; Warren, Jonathan; Wendling, Olivia; West, David B.; Wong, Leeyean; Yoshiki, Atsushi; MacArthur, Daniel G.; Tocchini-Valentini, Glauco P.; Gao, Xiang; Flicek, Paul; Bradley, Allan; Skarnes, William C.; Justice, Monica J.; Parkinson, Helen E.; Moore, Mark; Wells, Sara; Braun, Robert E.; Svenson, Karen L.; de Angelis, Martin Hrabe; Herault, Yann; Mohun, Tim; Mallon, Ann-Marie; Henkelman, R. Mark; Brown, Steve D.M.; Adams, David J.; Lloyd, K.C. Kent; McKerlie, Colin; Beaudet, Arthur L.; Bucan, Maja; Murray, Stephen A.

2016-01-01

Approximately one third of all mammalian genes are essential for life. Phenotypes resulting from mouse knockouts of these genes have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5000 knockout mouse lines, we have identified 410 lethal genes during the production of the first 1751 unique gene knockouts. Using a standardised phenotyping platform that incorporates high-resolution 3D imaging, we identified novel phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes identified in our screen, thus providing a novel dataset that facilitates prioritization and validation of mutations identified in clinical sequencing efforts. PMID:27626380
Human mitochondrial pyrophosphatase: cDNA cloning and analysis of the gene in patients with mtDNA depletion syndromes.

PubMed

Curbo, Sophie; Lagier-Tourenne, Clotilde; Carrozzo, Rosalba; Palenzuela, Lluis; Lucioli, Simona; Hirano, Michio; Santorelli, Filippo; Arenas, Joaquin; Karlsson, Anna; Johansson, Magnus

2006-03-01

Pyrophosphatases (PPases) catalyze the hydrolysis of inorganic pyrophosphate generated in several cellular enzymatic reactions. A novel human pyrophosphatase cDNA encoding a 334-amino-acid protein approximately 60% identical to the previously identified human cytosolic PPase was cloned and characterized. The novel enzyme, named PPase-2, was enzymatically active and catalyzed hydrolysis of pyrophosphate at a rate similar to that of the previously identified PPase-1. A functional mitochondrial import signal sequence was identified in the N-terminus of PPase-2, which targeted the enzyme to the mitochondrial matrix. The human pyrophosphatase 2 gene (PPA2) was mapped to chromosome 4q25 and the 1.4-kb mRNA was ubiquitously expressed in human tissues, with highest levels in muscle, liver, and kidney. The yeast homologue of the mitochondrial PPase-2 is required for mitochondrial DNA maintenance and yeast cells lacking the enzyme exhibit mitochondrial DNA depletion. We sequenced the PPA2 gene in 13 patients with mitochondrial DNA depletion syndromes (MDS) of unknown cause to determine if mutations in the PPA2 gene of these patients were associated with this disease. No pathogenic mutations were identified in the PPA2 gene of these patients and we found no evidence that PPA2 gene mutations are a common cause of MDS in humans.
A functional genomics investigation of allelochemical biosynthesis in Sorghum bicolor root hairs.

PubMed

Baerson, Scott R; Dayan, Franck E; Rimando, Agnes M; Nanayakkara, N P Dhammika; Liu, Chang-Jun; Schröder, Joachim; Fishbein, Mark; Pan, Zhiqiang; Kagan, Isabelle A; Pratt, Lee H; Cordonnier-Pratt, Marie-Michèle; Duke, Stephen O

2008-02-08

Sorghum is considered to be one of the more allelopathic crop species, producing phytotoxins such as the potent benzoquinone sorgoleone (2-hydroxy-5-methoxy-3-[(Z,Z)-8',11',14'-pentadecatriene]-p-benzoquinone) and its analogs. Sorgoleone likely accounts for much of the allelopathy of Sorghum spp., typically representing the predominant constituent of Sorghum bicolor root exudates. Previous and ongoing studies suggest that the biosynthetic pathway for this plant growth inhibitor occurs in root hair cells, involving a polyketide synthase activity that utilizes an atypical 16:3 fatty acyl-CoA starter unit, resulting in the formation of a pentadecatrienyl resorcinol intermediate. Subsequent modifications of this resorcinolic intermediate are likely to be mediated by S-adenosylmethionine-dependent O-methyltransferases and dihydroxylation by cytochrome P450 monooxygenases, although the precise sequence of reactions has not been determined previously. Analyses performed by gas chromatography-mass spectrometry with sorghum root extracts identified a 3-methyl ether derivative of the likely pentadecatrienyl resorcinol intermediate, indicating that dihydroxylation of the resorcinol ring is preceded by O-methylation at the 3'-position by a novel 5-n-alk(en)ylresorcinol-utilizing O-methyltransferase activity. An expressed sequence tag data set consisting of 5,468 sequences selected at random from an S. bicolor root hair-specific cDNA library was generated to identify candidate sequences potentially encoding enzymes involved in the sorgoleone biosynthetic pathway. Quantitative real time reverse transcription-PCR and recombinant enzyme studies with putative O-methyltransferase sequences obtained from the expressed sequence tag data set have led to the identification of a novel O-methyltransferase highly and predominantly expressed in root hairs (designated SbOMT3), which preferentially utilizes alk(en)ylresorcinols among a panel of benzene-derivative substrates tested. 
SbOMT3 is therefore proposed to be involved in the biosynthesis of the allelochemical sorgoleone.
Whole Exome Sequencing of Patients with Steroid-Resistant Nephrotic Syndrome.

PubMed

Warejko, Jillian K; Tan, Weizhen; Daga, Ankana; Schapiro, David; Lawson, Jennifer A; Shril, Shirlee; Lovric, Svjetlana; Ashraf, Shazia; Rao, Jia; Hermle, Tobias; Jobst-Schwan, Tilman; Widmeier, Eugen; Majmundar, Amar J; Schneider, Ronen; Gee, Heon Yung; Schmidt, J Magdalena; Vivante, Asaf; van der Ven, Amelie T; Ityel, Hadas; Chen, Jing; Sadowski, Carolin E; Kohl, Stefan; Pabst, Werner L; Nakayama, Makiko; Somers, Michael J G; Rodig, Nancy M; Daouk, Ghaleb; Baum, Michelle; Stein, Deborah R; Ferguson, Michael A; Traum, Avram Z; Soliman, Neveen A; Kari, Jameela A; El Desoky, Sherif; Fathy, Hanan; Zenker, Martin; Bakkaloglu, Sevcan A; Müller, Dominik; Noyan, Aytul; Ozaltin, Fatih; Cadnapaphornchai, Melissa A; Hashmi, Seema; Hopcian, Jeffrey; Kopp, Jeffrey B; Benador, Nadine; Bockenhauer, Detlef; Bogdanovic, Radovan; Stajić, Nataša; Chernin, Gil; Ettenger, Robert; Fehrenbach, Henry; Kemper, Markus; Munarriz, Reyner Loza; Podracka, Ludmila; Büscher, Rainer; Serdaroglu, Erkin; Tasic, Velibor; Mane, Shrikant; Lifton, Richard P; Braun, Daniela A; Hildebrandt, Friedhelm

2018-01-06

Steroid-resistant nephrotic syndrome overwhelmingly progresses to ESRD. More than 30 monogenic genes have been identified to cause steroid-resistant nephrotic syndrome. We previously detected causative mutations using targeted panel sequencing in 30% of patients with steroid-resistant nephrotic syndrome. Panel sequencing has a number of limitations when compared with whole exome sequencing. We employed whole exome sequencing to detect monogenic causes of steroid-resistant nephrotic syndrome in an international cohort of 300 families. Three hundred thirty-five individuals with steroid-resistant nephrotic syndrome from 300 families were recruited from April of 1998 to June of 2016. Age of onset was restricted to <25 years of age. Exome data were evaluated for 33 known monogenic steroid-resistant nephrotic syndrome genes. In 74 of 300 families (25%), we identified a causative mutation in one of 20 genes known to cause steroid-resistant nephrotic syndrome. In 11 families (3.7%), we detected a mutation in a gene that causes a phenocopy of steroid-resistant nephrotic syndrome. This is consistent with our previously published identification of mutations using a panel approach. We detected a causative mutation in a known steroid-resistant nephrotic syndrome gene in 38% of consanguineous families and in 13% of nonconsanguineous families, and 48% of children with congenital nephrotic syndrome. A total of 68 different mutations were detected in 20 of 33 steroid-resistant nephrotic syndrome genes. Fifteen of these mutations were novel. NPHS1, PLCE1, NPHS2, and SMARCAL1 were the most common genes in which we detected a mutation. In another 28% of families, we detected mutations in one or more candidate genes for steroid-resistant nephrotic syndrome. Whole exome sequencing is a sensitive approach toward diagnosis of monogenic causes of steroid-resistant nephrotic syndrome.
A molecular genetic diagnosis of steroid-resistant nephrotic syndrome may have important consequences for the management of treatment and kidney transplantation in steroid-resistant nephrotic syndrome. Copyright © 2018 by the American Society of Nephrology.
Molecular Identification of Commercialized Medicinal Plants in Southern Morocco

PubMed Central

Krüger, Åsa; Rydberg, Anders; Abbad, Abdelaziz; Björk, Lars; Martin, Gary

2012-01-01

Background Medicinal plant trade is important for local livelihoods. However, many medicinal plants are difficult to identify when they are sold as roots, powders or bark. DNA barcoding involves using a short, agreed-upon region of a genome as a unique identifier for species, ideally as a global standard. Research Question What is the functionality, efficacy and accuracy of the use of barcoding for identifying root material, using medicinal plant roots sold by herbalists in Marrakech, Morocco, as a test dataset? Methodology In total, 111 root samples were sequenced for four proposed barcode regions: rpoC1, psbA-trnH, matK and ITS. Sequences were searched against a tailored reference database of Moroccan medicinal plants and their closest relatives using BLAST and Blastclust, and through inference of RAxML phylograms of the aligned market and reference samples. Principal Findings Sequencing success was high for rpoC1, psbA-trnH, and ITS, but low for matK. Searches using rpoC1 alone resulted in a number of ambiguous identifications, indicating insufficient DNA variation for accurate species-level identification. Combining rpoC1, psbA-trnH and ITS allowed the majority of the market samples to be identified to genus level. For a minority of the market samples, the barcoding identification differed significantly from previous hypotheses based on the vernacular names. Conclusions/Significance Endemic plant species are commercialized in Marrakech. Adulteration is common and this may indicate that the products are becoming locally endangered. Nevertheless, the majority of the traded roots belong to species that are common and not known to be endangered. A significant conclusion from our results is that unknown samples are more difficult to identify than earlier suggested, especially if the reference sequences were obtained from different populations.
A global barcoding database should therefore contain sequences from different populations of the same species to assure the reference sequences characterize the species throughout its distributional range. PMID:22761800
Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients

PubMed Central

2014-01-01

Background Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. Methods We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Results Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients’ clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Conclusions Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology. 
PMID:24885126
Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients.

PubMed

Chang, Lixian; Yuan, Weiping; Zeng, Huimin; Zhou, Quanquan; Wei, Wei; Zhou, Jianfeng; Li, Miaomiao; Wang, Xiaomin; Xu, Mingjiang; Yang, Fengchun; Yang, Yungui; Cheng, Tao; Zhu, Xiaofan

2014-05-15

Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients' clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology.
In-depth proteomic analysis of a mollusc shell: acid-soluble and acid-insoluble matrix of the limpet Lottia gigantea

PubMed Central

2012-01-01

Background: Invertebrate biominerals are characterized by their extraordinary functionality and physical properties, such as strength, stiffness and toughness that by far exceed those of the pure mineral component of such composites. This is attributed to the organic matrix, secreted by specialized cells, which pervades and envelops the mineral crystals. Despite the obvious importance of the protein fraction of the organic matrix, only few in-depth proteomic studies have been performed due to the lack of comprehensive protein sequence databases. The recent public release of the gastropod Lottia gigantea genome sequence and the associated protein sequence database provides for the first time the opportunity to do a state-of-the-art proteomic in-depth analysis of the organic matrix of a mollusc shell. Results: Using three different sodium hypochlorite washing protocols before shell demineralization, a total of 569 proteins were identified in Lottia gigantea shell matrix. Of these, 311 were assembled in a consensus proteome comprising identifications contained in all proteomes irrespective of shell cleaning procedure. Some of these proteins were similar in amino acid sequence, amino acid composition, or domain structure to proteins identified previously in different bivalve or gastropod shells, such as BMSP, dermatopontin, nacrein, perlustrin, perlucin, or Pif. In addition there were dozens of previously uncharacterized proteins, many containing repeated short linear motifs or homorepeats. Such proteins may play a role in shell matrix construction or control of mineralization processes. Conclusions: The organic matrix of Lottia gigantea shells is a complex mixture of proteins comprising possible homologs of some previously characterized mollusc shell proteins, but also many novel proteins with a possible function in biomineralization as framework building blocks or as regulatory components.
We hope that this data set, the most comprehensive available at present, will provide a platform for the further exploration of biomineralization processes in molluscs. PMID:22540284
Identification and characterization of unrecognized viruses in stool samples of non-polio acute flaccid paralysis children by simplified VIDISCA.

PubMed

Shaukat, Shahzad; Angez, Mehar; Alam, Muhammad Masroor; Jebbink, Maarten F; Deijs, Martin; Canuti, Marta; Sharif, Salmaan; de Vries, Michel; Khurshid, Adnan; Mahmood, Tariq; van der Hoek, Lia; Zaidi, Syed Sohail Zahoor

2014-08-12

The use of sequence independent methods combined with next generation sequencing for identification purposes in clinical samples appears promising and exciting results have been achieved to understand unexplained infections. One sequence independent method, Virus Discovery based on cDNA Amplified Fragment Length Polymorphism (VIDISCA) is capable of identifying viruses that would have remained unidentified in standard diagnostics or cell cultures. VIDISCA is normally combined with next generation sequencing, however, we set up a simplified VIDISCA which can be used in case next generation sequencing is not possible. Stool samples of 10 patients with unexplained acute flaccid paralysis showing cytopathic effect in rhabdomyosarcoma cells and/or mouse cells were used to test the efficiency of this method. To further characterize the viruses, VIDISCA-positive samples were amplified and sequenced with gene specific primers. Simplified VIDISCA detected seven viruses (70%) and the proportion of eukaryotic viral sequences from each sample ranged from 8.3 to 45.8%. Human enterovirus EV-B97, EV-B100, echovirus-9 and echovirus-21, human parechovirus type-3, human astrovirus probably a type-3/5 recombinant, and tetnovirus-1 were identified. Phylogenetic analysis based on the VP1 region demonstrated that the human enteroviruses are more divergent isolates circulating in the community. Our data support that a simplified VIDISCA protocol can efficiently identify unrecognized viruses grown in cell culture with low cost, limited time without need of advanced technical expertise. Also complex data interpretation is avoided thus the method can be used as a powerful diagnostic tool in limited resources. Redesigning the routine diagnostics might lead to additional detection of previously undiagnosed viruses in clinical samples of patients.
Cultivable Anaerobic Microbiota of Severe Early Childhood Caries

PubMed Central

Tanner, A. C. R.; Mathney, J. M. J.; Kent, R. L.; Chalmers, N. I.; Hughes, C. V.; Loo, C. Y.; Pradhan, N.; Kanasi, E.; Hwang, J.; Dahlan, M. A.; Papadopolou, E.; Dewhirst, F. E.

2011-01-01

Severe early childhood caries (ECC), while strongly associated with Streptococcus mutans using selective detection (culture, PCR), has also been associated with a widely diverse microbiota using molecular cloning approaches. The aim of this study was to evaluate the microbiota of severe ECC using anaerobic culture. The microbial composition of dental plaque from 42 severe ECC children was compared with that of 40 caries-free children. Bacterial samples were cultured anaerobically on blood and acid (pH 5) agars. Isolates were purified, and partial sequences for the 16S rRNA gene were obtained from 5,608 isolates. Sequence-based analysis of the 16S rRNA isolate libraries from blood and acid agars of severe ECC and caries-free children had >90% population coverage, with greater diversity occurring in the blood isolate library. Isolate sequences were compared with taxon sequences in the Human Oral Microbiome Database (HOMD), and 198 HOMD taxa were identified, including 45 previously uncultivated taxa, 29 extended HOMD taxa, and 45 potential novel groups. The major species associated with severe ECC included Streptococcus mutans, Scardovia wiggsiae, Veillonella parvula, Streptococcus cristatus, and Actinomyces gerencseriae. S. wiggsiae was significantly associated with severe ECC children in the presence and absence of S. mutans detection. We conclude that anaerobic culture detected as wide a diversity of species in ECC as that observed using cloning approaches. Culture coupled with 16S rRNA identification identified over 74 isolates for human oral taxa without previously cultivated representatives. The major caries-associated species were S. mutans and S. wiggsiae, the latter of which is a candidate as a newly recognized caries pathogen. PMID:21289150
Sequence analysis of 497 mouse brain ESTs expressed in the substantia nigra

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stewart, G.J.; Savioz, A.; Davies, R.W.

1997-01-15

The use of subtracted, region-specific cDNA libraries combined with single-pass cDNA sequencing allows the discovery of novel genes and facilitates molecular description of the tissue or region involved. We report the sequence of 497 mouse expressed sequence tags (ESTs) from two subtracted libraries enriched for cDNAs expressed in the substantia nigra, a brain region with important roles in movement control and Parkinson disease. Of these, 238 ESTs give no database matches and therefore derive from novel genes. A further 115 ESTs show sequence similarity to ESTs from other organisms, which themselves do not yield any significant database matches to genes of known function. Fifty-six ESTs show sequence similarity to previously identified genes whose mouse homologues have not been reported. The total number of ESTs reported that are new for the mouse is 407, which, together with the 90 ESTs corresponding to known mouse genes or cDNAs, contributes to the molecular description of the substantia nigra. 21 refs., 4 tabs.

Characterization of Dermanyssus gallinae (Acarina: Dermanissydae) by sequence analysis of the ribosomal internal transcribed spacer regions.

PubMed

Potenza, L; Cafiero, M A; Camarda, A; La Salandra, G; Cucchiarini, L; Dachà, M

2009-10-01

In the present work, mites previously identified as Dermanyssus gallinae De Geer (Acari, Mesostigmata) using morphological keys were investigated by molecular tools. The complete internal transcribed spacer 1 (ITS1), 5.8S ribosomal DNA, and ITS2 region of the ribosomal DNA from mites were amplified and sequenced to examine the level of sequence variation and to explore the feasibility of using this region in the identification of this mite. Conserved primers located at the 3' end of the 18S and at the 5' start of the 28S rRNA genes were used first, and amplified fragments were sequenced. Sequence analyses showed no variation in the 5.8S and ITS2 regions, while slight intraspecific variations, involving substitutions as well as deletions, were concentrated in the ITS1 region. Based on the sequence analyses, a nested PCR of the ITS2 region followed by RFLP analyses has been set up in an attempt to provide a rapid molecular diagnostic tool for D. gallinae.
Leishmania species identification using FTA card sampling directly from patients' cutaneous lesions in the state of Lara, Venezuela.

PubMed

Kato, Hirotomo; Watanabe, Junko; Mendoza Nieto, Iraida; Korenaga, Masataka; Hashiguchi, Yoshihisa

2011-10-01

A molecular epidemiological study was performed using FTA card materials directly sampled from lesions of patients with cutaneous leishmaniasis (CL) in the state of Lara, Venezuela, where causative agents have been identified as Leishmania (Viannia) braziliensis and L. (Leishmania) venezuelensis in previous studies. Of the 17 patients diagnosed with CL, Leishmania spp. were successfully identified in 16 patients based on analysis of the cytochrome b gene and rRNA internal transcribed spacer sequences. Consistent with previous findings, seven of the patients were infected with L. (V.) braziliensis. However, parasites from the other nine patients were genetically identified as L. (L.) mexicana, which differed from results of previous enzymatic and antigenic analyses. These results strongly suggest that L. (L.) venezuelensis is a variant of L. (L.) mexicana and that the classification of L. (L.) venezuelensis should be reconsidered. Copyright © 2011 Royal Society of Tropical Medicine and Hygiene. Published by Elsevier Ltd. All rights reserved.
Identification of Candidate Genes Underlying an Iron Efficiency Quantitative Trait Locus in Soybean

PubMed Central

Peiffer, Gregory A.; King, Keith E.; Severin, Andrew J.; May, Gregory D.; Cianzio, Silvia R.; Lin, Shun Fu; Lauter, Nicholas C.; Shoemaker, Randy C.

2012-01-01

Prevalent on calcareous soils in the United States and abroad, iron deficiency is among the most common and severe nutritional stresses in plants. In soybean (Glycine max) commercial plantings, the identification and use of iron-efficient genotypes has proven to be the best form of managing this soil-related plant stress. Previous studies conducted in soybean identified a significant iron efficiency quantitative trait locus (QTL) explaining more than 70% of the phenotypic variation for the trait. In this research, we identified candidate genes underlying this QTL through molecular breeding, mapping, and transcriptome sequencing. Introgression mapping was performed using two related near-isogenic lines in which a region located on soybean chromosome 3 required for iron efficiency was identified. The region corresponds to the previously reported iron efficiency QTL. The location was further confirmed through QTL mapping conducted in this study. Transcriptome sequencing and quantitative real-time-polymerase chain reaction identified two genes encoding transcription factors within the region that were significantly induced in soybean roots under iron stress. The two induced transcription factors were identified as homologs of the subgroup Ib basic helix-loop-helix (bHLH) genes that are known to regulate the strategy I response in Arabidopsis (Arabidopsis thaliana). Resequencing of these differentially expressed genes unveiled a significant deletion within a predicted dimerization domain. We hypothesize that this deletion disrupts the Fe-DEFICIENCY-INDUCED TRANSCRIPTION FACTOR (FIT)/bHLH heterodimer that has been shown to induce known iron acquisition genes. PMID:22319075
Identifying transcription factor functions and targets by phenotypic activation

PubMed Central

Chua, Gordon; Morris, Quaid D.; Sopko, Richelle; Robinson, Mark D.; Ryan, Owen; Chan, Esther T.; Frey, Brendan J.; Andrews, Brenda J.; Boone, Charles; Hughes, Timothy R.

2006-01-01

Mapping transcriptional regulatory networks is difficult because many transcription factors (TFs) are activated only under specific conditions. We describe a generic strategy for identifying genes and pathways induced by individual TFs that does not require knowledge of their normal activation cues. Microarray analysis of 55 yeast TFs that caused a growth phenotype when overexpressed showed that the majority caused increased transcript levels of genes in specific physiological categories, suggesting a mechanism for growth inhibition. Induced genes typically included established targets and genes with consensus promoter motifs, if known, indicating that these data are useful for identifying potential new target genes and binding sites. We identified the sequence 5′-TCACGCAA as a binding sequence for Hms1p, a TF that positively regulates pseudohyphal growth and previously had no known motif. The general strategy outlined here presents a straightforward approach to discovery of TF activities and mapping targets that could be adapted to any organism with transgenic technology. PMID:16880382
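As a rough illustration of how a short binding sequence such as 5′-TCACGCAA could be located in promoter DNA, the sketch below scans both strands for exact matches. It is not the method used in the study: the function names and the example sequence are hypothetical, and real binding-site searches usually rely on position weight matrices rather than exact string matching.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="TCACGCAA"):
    """Return (0-based start, strand) for each exact motif match on either strand."""
    seq = seq.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        # Skip the minus-strand check for palindromic motifs to avoid double counting.
        if window == rc and rc != motif:
            hits.append((i, "-"))
    return hits


# Hypothetical promoter fragment, invented for illustration only.
example = "GGATCACGCAATTTTTGCGTGACC"
print(find_motif(example))
```

A hit on the "-" strand means the motif itself lies on the reverse strand, so the forward strand shows its reverse complement (TTGCGTGA) at that position.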
Distributed biotin–streptavidin transcription roadblocks for mapping cotranscriptional RNA folding

PubMed Central

Strobel, Eric J.; Nedialkov, Yuri; Artsimovitch, Irina

2017-01-01

RNA folding during transcription directs an order of folding that can determine RNA structure and function. However, the experimental study of cotranscriptional RNA folding has been limited by the lack of easily approachable methods that can interrogate nascent RNA structure at nucleotide resolution. To address this, we previously developed cotranscriptional selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) to simultaneously probe all intermediate RNA transcripts during transcription by stalling elongation complexes at catalytically dead EcoRI E111Q roadblocks. While effective, the distribution of elongation complexes using EcoRI E111Q requires laborious PCR using many different oligonucleotides for each sequence analyzed. Here, we improve the broad applicability of cotranscriptional SHAPE-Seq by developing a sequence-independent biotin–streptavidin (SAv) roadblocking strategy that simplifies the preparation of roadblocking DNA templates. We first determine the properties of biotin–SAv roadblocks. We then show that randomly distributed biotin–SAv roadblocks can be used in cotranscriptional SHAPE-Seq experiments to identify the same RNA structural transitions related to a riboswitch decision-making process that we previously identified using EcoRI E111Q. Lastly, we find that EcoRI E111Q maps nascent RNA structure to specific transcript lengths more precisely than biotin–SAv and propose guidelines to leverage the complementary strengths of each transcription roadblock in cotranscriptional SHAPE-Seq. PMID:28398514
A second gene for acyl-(acyl-carrier-protein): glycerol-3-phosphate acyltransferase in squash, Cucurbita moschata cv. Shirogikuza(*), codes for an oleate-selective isozyme: molecular cloning and protein purification studies.

PubMed

Nishida, I; Sugiura, M; Enju, A; Nakamura, M

2000-12-01

A new isogene for acyl-(acyl-carrier-protein):glycerol-3-phosphate acyltransferase (GPAT; EC 2.3.1.15) in squash has been cloned and the gene product was identified as oleate-selective GPAT. Using PCR primers that could hybridise with exons for a previously cloned squash GPAT, we obtained two PCR products of different size: one coded for a previously cloned squash GPAT corresponding to non-selective isoforms AT2 and AT3, and the other for a new isozyme, probably the oleate-selective isoform AT1. Full-length amino acid sequences of respective isozymes were deduced from the nucleotide sequences of genomic genes and cDNAs, which were cloned by a series of PCR-based methods. Thus, we designated the new gene CmATS1;1 and the other one CmATS1;2. Genome blot analysis revealed that the squash genome contained the two isogenes at non-allelic loci. AT1-active fractions were partially purified, and three polypeptide bands were identified as being AT1 polypeptides, which exhibited relative molecular masses of 39.5-40.5 kDa, pI values of 6.75-7.15, and oleate selectivity over palmitate. Partial amino-terminal sequences obtained from two of these bands verified that the new isogene codes for AT1 polypeptides.
The Plasmodium falciparum transcriptome in severe malaria reveals altered expression of genes involved in important processes including surface antigen–encoding var genes

PubMed Central

Tonkin-Hill, Gerry Q.; Trianty, Leily; Noviyanti, Rintis; Nguyen, Hanh H. T.; Sebayang, Boni F.; Lampah, Daniel A.; Marfurt, Jutta; Cobbold, Simon A.; Rambhatla, Janavi S.; McConville, Malcolm J.; Rogerson, Stephen J.; Brown, Graham V.; Day, Karen P.; Price, Ric N.; Anstey, Nicholas M.

2018-01-01

Within the human host, the malaria parasite Plasmodium falciparum is exposed to multiple selection pressures. The host environment changes dramatically in severe malaria, but the extent to which the parasite responds to—or is selected by—this environment remains unclear. From previous studies, the parasites that cause severe malaria appear to increase expression of a restricted but poorly defined subset of the PfEMP1 variant, surface antigens. PfEMP1s are major targets of protective immunity. Here, we used RNA sequencing (RNAseq) to analyse gene expression in 44 parasite isolates that caused severe and uncomplicated malaria in Papuan patients. The transcriptomes of 19 parasite isolates associated with severe malaria indicated that these parasites had decreased glycolysis without activation of compensatory pathways; altered chromatin structure and probably transcriptional regulation through decreased histone methylation; reduced surface expression of PfEMP1; and down-regulated expression of multiple chaperone proteins. Our RNAseq also identified novel associations between disease severity and PfEMP1 transcripts, domains, and smaller sequence segments and also confirmed all previously reported associations between expressed PfEMP1 sequences and severe disease. These findings will inform efforts to identify vaccine targets for severe malaria and also indicate how parasites adapt to—or are selected by—the host environment in severe malaria. PMID:29529020
Distributed biotin-streptavidin transcription roadblocks for mapping cotranscriptional RNA folding.

PubMed

Strobel, Eric J; Watters, Kyle E; Nedialkov, Yuri; Artsimovitch, Irina; Lucks, Julius B

2017-07-07

RNA folding during transcription directs an order of folding that can determine RNA structure and function. However, the experimental study of cotranscriptional RNA folding has been limited by the lack of easily approachable methods that can interrogate nascent RNA structure at nucleotide resolution. To address this, we previously developed cotranscriptional selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) to simultaneously probe all intermediate RNA transcripts during transcription by stalling elongation complexes at catalytically dead EcoRI E111Q roadblocks. While effective, the distribution of elongation complexes using EcoRI E111Q requires laborious PCR using many different oligonucleotides for each sequence analyzed. Here, we improve the broad applicability of cotranscriptional SHAPE-Seq by developing a sequence-independent biotin-streptavidin (SAv) roadblocking strategy that simplifies the preparation of roadblocking DNA templates. We first determine the properties of biotin-SAv roadblocks. We then show that randomly distributed biotin-SAv roadblocks can be used in cotranscriptional SHAPE-Seq experiments to identify the same RNA structural transitions related to a riboswitch decision-making process that we previously identified using EcoRI E111Q. Lastly, we find that EcoRI E111Q maps nascent RNA structure to specific transcript lengths more precisely than biotin-SAv and propose guidelines to leverage the complementary strengths of each transcription roadblock in cotranscriptional SHAPE-Seq. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Clinical characteristics of severe congenital neutropenia caused by novel ELANE gene mutations.

PubMed

Shu, Zhou; Li, Xiao-Hui; Bai, Xiao-Ming; Zhang, Zhi-Yong; Jiang, Li-ping; Tang, Xue-Mei; Zhao, Xiao-dong

2015-02-01

Mutations within the ELANE gene, which encodes human neutrophil elastase, are the most common genetic causes of severe congenital neutropenia (SCN). No cases of SCN have been previously described from a Chinese population. Herein, we describe the clinical, hematologic and molecular characteristics of 7 Chinese SCN cases with novel ELANE mutations. Seven Chinese pediatric patients (4 males and 3 females) with suspected SCN were enrolled in this study. Clinical data, peripheral blood, bone marrow and immune function were evaluated for SCN. ELANE genomic DNA and cDNA sequences from patients and potential carriers were analyzed using polymerase chain reaction (PCR) and direct sequencing. All 7 patients experienced recurrent infection (soft tissue, lung, oral cavity) during a period of 120 days. Noninfectious conditions such as anemia and osteopenia were found in most patients, and absolute peripheral neutrophil counts varied. DNA and cDNA sequencing demonstrated that the patients harbored a range of heterozygous ELANE gene mutations, including substitution, deletion, insertion and frame shift alterations. None of the mutations had been reported previously; however, no mutation carriers were identified among the parents or siblings, even in a family with 2 affected offspring. SCN cases were identified for the first time in China, and all patients carried novel ELANE mutations. Granulocyte-colony stimulating factor (G-CSF) was an effective treatment for most of the SCN patients and prevented life-threatening bacterial infections.
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Partial Shotgun Sequencing of the Boechera stricta Genome Reveals Extensive Microsynteny and Promoter Conservation with Arabidopsis

PubMed Central

Windsor, Aaron J.; Schranz, M. Eric; Formanová, Nataša; Gebauer-Jung, Steffi; Bishop, John G.; Schnabelrauch, Domenica; Kroymann, Juergen; Mitchell-Olds, Thomas

2006-01-01

Comparative genomics provides insight into the evolutionary dynamics that shape discrete sequences as well as whole genomes. To advance comparative genomics within the Brassicaceae, we have end sequenced 23,136 medium-sized insert clones from Boechera stricta, a wild relative of Arabidopsis (Arabidopsis thaliana). A significant proportion of these sequences, 18,797, are nonredundant and display highly significant similarity (BLASTn e-value ≤ 10^−30) to low copy number Arabidopsis genomic regions, including more than 9,000 annotated coding sequences. We have used this dataset to identify orthologous gene pairs in the two species and to perform a global comparison of DNA regions 5′ to annotated coding regions. On average, the 500 nucleotides upstream to coding sequences display 71.4% identity between the two species. In a similar analysis, 61.4% identity was observed between 5′ noncoding sequences of Brassica oleracea and Arabidopsis, indicating that regulatory regions are not as diverged among these lineages as previously anticipated. By mapping the B. stricta end sequences onto the Arabidopsis genome, we have identified nearly 2,000 conserved blocks of microsynteny (bracketing 26% of the Arabidopsis genome). A comparison of fully sequenced B. stricta inserts to their homologous Arabidopsis genomic regions indicates that indel polymorphisms >5 kb contribute substantially to the genome size difference observed between the two species. Further, we demonstrate that microsynteny inferred from end-sequence data can be applied to the rapid identification and cloning of genomic regions of interest from nonmodel species. These results suggest that among diploid relatives of Arabidopsis, small- to medium-scale shotgun sequencing approaches can provide rapid and cost-effective benefits to evolutionary and/or functional comparative genomic frameworks. PMID:16607030
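To make a figure such as "71.4% identity" concrete, the following minimal sketch computes percent identity between two pre-aligned sequences. It assumes one common convention (columns containing a gap in either sequence are ignored); the abstract does not state how gaps were treated, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment, invented for illustration only.
seq_a = "ACGTAC-GTA"
seq_b = "ACCTACTGTT"
print(round(percent_identity(seq_a, seq_b), 1))
```

Counting gapped columns as mismatches instead would lower the reported value, so the convention should be stated whenever identities from different analyses are compared.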
High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED1 in human complex I deficiency

PubMed Central

Calvo, Sarah E; Tucker, Elena J; Compton, Alison G; Kirby, Denise M; Crawford, Gabriel; Burtt, Noel P; Rivas, Manuel A; Guiducci, Candace; Bruno, Damien L; Goldberger, Olga A; Redman, Michelle C; Wiltshire, Esko; Wilson, Callum J; Altshuler, David; Gabriel, Stacey B; Daly, Mark J; Thorburn, David R; Mootha, Vamsi K

2010-01-01

Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing, and experimental validation to uncover the molecular basis of mitochondrial complex I (CI) disorders. We created five pools of DNA from a cohort of 103 patients and then performed deep sequencing of 103 candidate genes to spotlight 151 rare variants predicted to impact protein function. We used confirmatory experiments to establish genetic diagnoses in 22% of previously unsolved cases, and discovered that defects in NUBPL and FOXRED1 can cause CI deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can reveal novel disease-causing mutations in individual patients. PMID:20818383
High-resolution characterization of a hepatocellular carcinoma genome.

PubMed

Totoki, Yasushi; Tatsuno, Kenji; Yamamoto, Shogo; Arai, Yasuhito; Hosoda, Fumie; Ishikawa, Shumpei; Tsutsumi, Shuichi; Sonoda, Kohtaro; Totsuka, Hirohiko; Shirakihara, Takuya; Sakamoto, Hiromi; Wang, Linghua; Ojima, Hidenori; Shimada, Kazuaki; Kosuge, Tomoo; Okusaka, Takuji; Kato, Kazuto; Kusuda, Jun; Yoshida, Teruhiko; Aburatani, Hiroyuki; Shibata, Tatsuhiro

2011-05-01

Hepatocellular carcinoma, one of the most common virus-associated cancers, is the third most frequent cause of cancer-related death worldwide. By massively parallel sequencing of a primary hepatitis C virus-positive hepatocellular carcinoma (36× coverage) and matched lymphocytes (>28× coverage) from the same individual, we identified more than 11,000 somatic substitutions of the tumor genome that showed predominance of T>C/A>G transition and a decrease of the T>C substitution on the transcribed strand, suggesting preferential DNA repair. Gene annotation enrichment analysis of 63 validated non-synonymous substitutions revealed enrichment of phosphoproteins. We further validated 22 chromosomal rearrangements, generating four fusion transcripts that had altered transcriptional regulation (BCORL1-ELF4) or promoter activity. Whole-exome sequencing at a higher sequence depth (>76× coverage) revealed a TSC1 nonsense substitution in a subpopulation of the tumor cells. This first high-resolution characterization of a virus-associated cancer genome identified previously uncharacterized mutation patterns, intra-chromosomal rearrangements and fusion genes, as well as genetic heterogeneity within the tumor.
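The notation "T>C/A>G" reflects that a T>C change on one strand is an A>G change on the complementary strand. The sketch below shows one conventional way to collapse substitutions into strand-independent classes (reported with a pyrimidine reference base) and to flag transitions. It is only an illustration under that assumed convention; the function names and the toy variant list are hypothetical, and strand-bias analyses such as the transcribed-strand comparison in the abstract need the uncollapsed, strand-aware calls.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
PURINES = {"A", "G"}


def collapse_substitution(ref, alt):
    """Report a substitution with a pyrimidine (C or T) reference base, e.g. A>G becomes T>C."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("ref and alt must be different bases from A, C, G, T")
    if ref in PURINES:
        # Re-express the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError("ref and alt must be different bases from A, C, G, T")
    return (ref in PURINES) == (alt in PURINES)


# Toy somatic calls as (reference, alternate) pairs, invented for illustration only.
calls = [("A", "G"), ("T", "C"), ("G", "T"), ("C", "T"), ("A", "G")]
counts = Counter(collapse_substitution(r, a) for r, a in calls)
print(counts.most_common())
```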
In vivo insertion pool sequencing identifies virulence factors in a complex fungal–host interaction

PubMed Central

Uhse, Simon; Pflug, Florian G.; Stirnberg, Alexandra; Ehrlinger, Klaus; von Haeseler, Arndt

2018-01-01

Large-scale insertional mutagenesis screens can be powerful genome-wide tools if they are streamlined with efficient downstream analysis, which is a serious bottleneck in complex biological systems. A major impediment to the success of next-generation sequencing (NGS)-based screens for virulence factors is that the genetic material of pathogens is often underrepresented within the eukaryotic host, making detection extremely challenging. We therefore established insertion Pool-Sequencing (iPool-Seq) on maize infected with the biotrophic fungus U. maydis. iPool-Seq features tagmentation, unique molecular barcodes, and affinity purification of pathogen insertion mutant DNA from in vivo-infected tissues. In a proof of concept using iPool-Seq, we identified 28 virulence factors, including 23 that were previously uncharacterized, from an initial pool of 195 candidate effector mutants. Because of its sensitivity and quantitative nature, iPool-Seq can be applied to any insertional mutagenesis library and is especially suitable for genetically complex setups like pooled infections of eukaryotic hosts. PMID:29684023
Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation.

PubMed

Zhao, Shancen; Zheng, Pingping; Dong, Shanshan; Zhan, Xiangjiang; Wu, Qi; Guo, Xiaosen; Hu, Yibo; He, Weiming; Zhang, Shanning; Fan, Wei; Zhu, Lifeng; Li, Dong; Zhang, Xuemei; Chen, Quan; Zhang, Hemin; Zhang, Zhihe; Jin, Xuelin; Zhang, Jinguo; Yang, Huanming; Wang, Jian; Wang, Jun; Wei, Fuwen

2013-01-01

The panda lineage dates back to the late Miocene and ultimately leads to only one extant species, the giant panda (Ailuropoda melanoleuca). Although global climate change and anthropogenic disturbances are recognized to shape animal population demography, their contribution to panda population dynamics remains largely unknown. We sequenced the whole genomes of 34 pandas at an average 4.7-fold coverage and used this data set together with the previously deep-sequenced panda genome to reconstruct a continuous demographic history of pandas from their origin to the present. We identify two population expansions, two bottlenecks and two divergences. Evidence indicated that, whereas global changes in climate were the primary drivers of population fluctuation for millions of years, human activities likely underlie recent population divergence and serious decline. We identified three distinct panda populations that show genetic adaptation to their environments. However, in all three populations, anthropogenic activities have negatively affected pandas for 3,000 years.
Evaluation of an automated repetitive sequence-based PCR system for subtyping Enterobacter sakazakii.

PubMed

Healy, B; Mullane, N; Collin, V; Mailler, S; Iversen, C; Chatellier, S; Storrs, M; Fanning, S

2008-07-01

Enterobacter sakazakii is regarded as a ubiquitous organism that can be isolated from a wide range of foods and environments. Infection in at-risk infants has been epidemiologically linked to the consumption of contaminated powdered infant formula. Preventing the dissemination of this pathogen in a powdered infant formula manufacturing facility is an important step in ensuring consumer confidence in a given brand together with the protection of the health status of a vulnerable population. In this study we report the application of a repetitive sequence-based PCR typing method to subtype a previously well-characterized collection of E. sakazakii isolates of diverse origin. While both methods successfully discriminated between the collection of isolates, repetitive sequence-based PCR identified 65 types, whereas pulsed-field gel electrophoresis identified 110 types showing ≥95% similarity. The method was quick and easy to perform, and our data demonstrated the utility and value of this approach to monitor in-process contamination, which could potentially contribute to a reduction in the transmission of E. sakazakii.
A dual role for a polyketide synthase in dynemicin enediyne and anthraquinone biosynthesis

NASA Astrophysics Data System (ADS)

Cohen, Douglas R.; Townsend, Craig A.

2018-02-01

Dynemicin A is a member of a subfamily of enediyne antitumour antibiotics characterized by a 10-membered carbocycle fused to an anthraquinone, both of polyketide origin. Sequencing of the dynemicin biosynthetic gene cluster in Micromonospora chersina previously identified an enediyne polyketide synthase (PKS), but no anthraquinone PKS, suggesting gene(s) for biosynthesis of the latter were distant from the core dynemicin cluster. To identify these gene(s), we sequenced and analysed the genome of M. chersina. Sequencing produced a short list of putative PKS candidates, yet CRISPR-Cas9 mutants of each locus retained dynemicin production. Subsequently, deletion of two cytochromes P450 in the dynemicin cluster suggested that the dynemicin enediyne PKS, DynE8, may biosynthesize the anthraquinone. Together with 18O-labelling studies, we now present evidence that DynE8 produces the core scaffolds of both the enediyne and anthraquinone, and provide a working model to account for their formation from the programmed octaketide of the enediyne PKS.
Newborn screening for cystic fibrosis: Polish 4 years' experience with CFTR sequencing strategy.

PubMed

Sobczyńska-Tomaszewska, Agnieszka; Ołtarzewski, Mariusz; Czerska, Kamila; Wertheim-Tysarowska, Katarzyna; Sands, Dorota; Walkowiak, Jarosław; Bal, Jerzy; Mazurczak, Tadeusz

2013-04-01

Newborn screening for cystic fibrosis (NBS CF) in Poland was started in September 2006. A summary of 4 years' experience is presented in this study. The immunoreactive trypsin/DNA sequencing strategy was implemented. A group of 1,212,487 newborns was screened for cystic fibrosis during the programme. We identified a total of 221 CF cases during this period, including 4 CF cases reported to have been missed by NBS CF. Disease incidence in Poland based on the programme results was estimated as 1/4394 and carrier frequency as 1/33. The frequency of F508del (62%) was similar to population data previously reported. This strategy allowed us to identify 29 affected infants with rare genotypes. The frequency of some mutations (eg, 2184insA, K710X) was assessed in Poland for the first time. Thus, the sequencing assay seems to be an accurate method for a screening programme using blood spots in the Polish population.
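The relation between the reported incidence (1/4394) and carrier frequency (1/33) can be illustrated with the standard Hardy-Weinberg calculation for an autosomal recessive condition. The sketch below assumes Hardy-Weinberg equilibrium and treats all disease alleles as one class; the abstract does not state how the authors derived their figure, so the function name and approach are illustrative only.

```python
import math

def carrier_frequency(incidence: float) -> float:
    """Heterozygote frequency 2pq, taking disease incidence as q squared."""
    if not 0.0 < incidence < 1.0:
        raise ValueError("incidence must lie strictly between 0 and 1")
    q = math.sqrt(incidence)   # pooled disease-allele frequency
    p = 1.0 - q                # frequency of all other alleles
    return 2.0 * p * q

incidence = 1 / 4394           # incidence reported in the abstract
carriers = carrier_frequency(incidence)
print(f"estimated carrier frequency: about 1 in {1 / carriers:.1f}")
```

Under these assumptions the estimate falls between 1 in 33 and 1 in 34, in line with the reported 1/33.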
Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study.

PubMed

Dewey, Frederick E; Murray, Michael F; Overton, John D; Habegger, Lukas; Leader, Joseph B; Fetterolf, Samantha N; O'Dushlaine, Colm; Van Hout, Cristopher V; Staples, Jeffrey; Gonzaga-Jauregui, Claudia; Metpally, Raghu; Pendergrass, Sarah A; Giovanni, Monica A; Kirchner, H Lester; Balasubramanian, Suganthi; Abul-Husn, Noura S; Hartzel, Dustin N; Lavage, Daniel R; Kost, Korey A; Packer, Jonathan S; Lopez, Alexander E; Penn, John; Mukherjee, Semanti; Gosalia, Nehal; Kanagaraj, Manoj; Li, Alexander H; Mitnaul, Lyndon J; Adams, Lance J; Person, Thomas N; Praveen, Kavita; Marcketta, Anthony; Lebo, Matthew S; Austin-Tse, Christina A; Mason-Suares, Heather M; Bruse, Shannon; Mellis, Scott; Phillips, Robert; Stahl, Neil; Murphy, Andrew; Economides, Aris; Skelding, Kimberly A; Still, Christopher D; Elmore, James R; Borecki, Ingrid B; Yancopoulos, George D; Davis, F Daniel; Faucett, William A; Gottesman, Omri; Ritchie, Marylyn D; Shuldiner, Alan R; Reid, Jeffrey G; Ledbetter, David H; Baras, Aris; Carey, David J

2016-12-23

The DiscovEHR collaboration between the Regeneron Genetics Center and Geisinger Health System couples high-throughput sequencing to an integrated health care system using longitudinal electronic health records (EHRs). We sequenced the exomes of 50,726 adult participants in the DiscovEHR study to identify ~4.2 million rare single-nucleotide variants and insertion/deletion events, of which ~176,000 are predicted to result in a loss of gene function. Linking these data to EHR-derived clinical phenotypes, we find clinical associations supporting therapeutic targets, including genes encoding drug targets for lipid lowering, and identify previously unidentified rare alleles associated with lipid levels and other blood level traits. About 3.5% of individuals harbor deleterious variants in 76 clinically actionable genes. The DiscovEHR data set provides a blueprint for large-scale precision medicine initiatives and genomics-guided therapeutic discovery. Copyright © 2016, American Association for the Advancement of Science.
Deep sequencing of the small RNA transcriptome of normal and malignant human B cells identifies hundreds of novel microRNAs

PubMed Central

Jima, Dereje D.; Zhang, Jenny; Jacobs, Cassandra; Richards, Kristy L.; Dunphy, Cherie H.; Choi, William W. L.; Yan Au, Wing; Srivastava, Gopesh; Czader, Magdalena B.; Rizzieri, David A.; Lagoo, Anand S.; Lugar, Patricia L.; Mann, Karen P.; Flowers, Christopher R.; Bernal-Mizrachi, Leon; Naresh, Kikkeri N.; Evens, Andrew M.; Gordon, Leo I.; Luftig, Micah; Friedman, Daphne R.; Weinberg, J. Brice; Thompson, Michael A.; Gill, Javed I.; Liu, Qingquan; How, Tam; Grubor, Vladimir; Gao, Yuan; Patel, Amee; Wu, Han; Zhu, Jun; Blobe, Gerard C.; Lipsky, Peter E.; Chadburn, Amy

2010-01-01

A role for microRNA (miRNA) has been recognized in nearly every biologic system examined thus far. A complete delineation of their role must be preceded by the identification of all miRNAs present in any system. We elucidated the complete small RNA transcriptome of normal and malignant B cells through deep sequencing of 31 normal and malignant human B-cell samples that comprise the spectrum of B-cell differentiation and common malignant phenotypes. We identified the expression of 333 known miRNAs, which is more than twice the number previously recognized in any tissue type. We further identified the expression of 286 candidate novel miRNAs in normal and malignant B cells. These miRNAs were validated at a high rate (92%) using quantitative polymerase chain reaction, and we demonstrated their application in the distinction of clinically relevant subgroups of lymphoma. We further demonstrated that a novel miRNA cluster, previously annotated as a hypothetical gene LOC100130622, contains 6 novel miRNAs that regulate the transforming growth factor-β pathway. Thus, our work suggests that more than a third of the miRNAs present in most cellular types are currently unknown and that these miRNAs may regulate important cellular functions. PMID:20733160

Extended exome sequencing identifies BACH2 as a novel major risk locus for Addison's disease.

PubMed

Eriksson, D; Bianchi, M; Landegren, N; Nordin, J; Dalin, F; Mathioudaki, A; Eriksson, G N; Hultin-Rosenberg, L; Dahlqvist, J; Zetterqvist, H; Karlsson, Å; Hallgren, Å; Farias, F H G; Murén, E; Ahlgren, K M; Lobell, A; Andersson, G; Tandre, K; Dahlqvist, S R; Söderkvist, P; Rönnblom, L; Hulting, A-L; Wahlberg, J; Ekwall, O; Dahlqvist, P; Meadows, J R S; Bensing, S; Lindblad-Toh, K; Kämpe, O; Pielberg, G R

2016-12-01

Autoimmune disease is one of the leading causes of morbidity and mortality worldwide. In Addison's disease, the adrenal glands are targeted by destructive autoimmunity. Despite being the most common cause of primary adrenal failure, little is known about its aetiology. To understand the genetic background of Addison's disease, we utilized the extensively characterized patients of the Swedish Addison Registry. We developed an extended exome capture array comprising a selected set of 1853 genes and their potential regulatory elements, for the purpose of sequencing 479 patients with Addison's disease and 1394 controls. We identified BACH2 (rs62408233-A, OR = 2.01 (1.71-2.37), P = 1.66 × 10^-15, MAF 0.46/0.29 in cases/controls) as a novel gene associated with Addison's disease development. We also confirmed the previously known associations with the HLA complex. Whilst BACH2 has been previously reported to associate with organ-specific autoimmune diseases co-inherited with Addison's disease, we have identified BACH2 as a major risk locus in Addison's disease, independent of concomitant autoimmune diseases. Our results may enable future research towards preventive disease treatment. © 2016 The Authors. Journal of Internal Medicine published by John Wiley & Sons Ltd on behalf of Association for Publication of The Journal of Internal Medicine.
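As a rough illustration of how the allele frequencies quoted above relate to an odds ratio, the following sketch computes a crude allelic odds ratio from the two rounded frequencies (0.46 in cases, 0.29 in controls). It is an unadjusted back-of-envelope estimate, not the authors' association model, so it is not expected to reproduce the reported OR of 2.01 exactly; the function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Crude odds ratio for carrying the allele, from allele frequencies alone."""
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls

crude_or = allelic_odds_ratio(0.46, 0.29)   # frequencies quoted in the abstract
print(f"crude allelic OR: {crude_or:.2f}")
```

The crude value comes out near 2.1, which lies inside the reported 1.71-2.37 interval.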
Walking the interactome for candidate prioritization in exome sequencing studies of Mendelian diseases

DOE PAGES

Smedley, Damian; Kohler, Sebastian; Czeschik, Johanna Christina; ...

2014-07-30

Here, whole-exome sequencing (WES) has opened up previously unheard of possibilities for identifying novel disease genes in Mendelian disorders, only about half of which have been elucidated to date. However, interpretation of WES data remains challenging. As a result, we analyze protein–protein association (PPA) networks to identify candidate genes in the vicinity of genes previously implicated in a disease. The analysis, using a random-walk with restart (RWR) method, is adapted to the setting of WES by developing a composite variant-gene relevance score based on the rarity, location and predicted pathogenicity of variants and the RWR evaluation of genes harboring the variants. Benchmarking using known disease variants from 88 disease-gene families reveals that the correct gene is ranked among the top 10 candidates in ≥50% of cases, a figure which we confirmed using a prospective study of disease genes identified in 2012 and PPA data produced before that date. In conclusion, we implement our method in a freely available Web server, ExomeWalker, that displays a ranked list of candidates together with information on PPAs, frequency and predicted pathogenicity of the variants to allow quick and effective searches for candidates that are likely to reward closer investigation.
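The random-walk with restart (RWR) step named in the abstract can be sketched generically as follows. This is a textbook RWR on a toy undirected network, not ExomeWalker's implementation: the restart probability of 0.7, the five-node path graph, and the function name are all assumptions made for illustration, and the composite variant-gene score described in the abstract is not modeled.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.7, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at seed nodes."""
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid dividing by zero for isolated nodes
    W = A / col_sums                       # column-normalized transition matrix
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # start uniformly on the known disease genes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy network: genes 0-1-2-3-4 linked in a chain; gene 0 is the known disease gene.
adjacency = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
])
scores = random_walk_with_restart(adjacency, seeds=[0])
ranking = np.argsort(-scores)              # genes ordered by proximity to the seed
print(ranking, scores.round(4))
```

In this toy case the scores fall off with network distance from the seed, which is the property that lets an RWR score prioritize candidate genes near previously implicated ones.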
Pellagra-like condition is xeroderma pigmentosum/Cockayne syndrome complex and niacin confers clinical benefit.

PubMed

Hijazi, H; Salih, M A; Hamad, M H A; Hassan, H H; Salih, S B M; Mohamed, K A; Mukhtar, M M; Karrar, Z A; Ansari, S; Ibrahim, N; Alkuraya, F S

2015-01-01

An extremely rare pellagra-like condition has been described, which was partially responsive to niacin and associated with a multisystem involvement. The condition was proposed to represent a novel autosomal recessive entity but the underlying mutation remained unknown for almost three decades. The objective of this study was to identify the causal mutation in the pellagra-like condition and investigate the mechanism by which niacin confers clinical benefit. Autozygosity mapping and exome sequencing were used to identify the causal mutation, and comet assay on patient fibroblasts before and after niacin treatment to assess its effect on DNA damage. We identified a single disease locus that harbors a novel mutation in ERCC5, thus confirming that the condition is in fact xeroderma pigmentosum/Cockayne syndrome (XP/CS) complex. Importantly, we also show that the previously described dermatological response to niacin is consistent with a dramatic protective effect against ultraviolet-induced DNA damage in patient fibroblasts conferred by niacin treatment. Our findings show the power of exome sequencing in reassigning previously described novel clinical entities, and suggest a mechanism for the dermatological response to niacin in patients with XP/CS complex. This raises interesting possibilities about the potential therapeutic use of niacin in XP. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
[Genetic variants in miRNAs and its association with breast cancer].

PubMed

Méndez-Gómez, Susana; Ruiz Esparza-Garrido, Ruth; Velázquez-Flores, Miguel; Dolores-Vergara, Maria; Salamanca-Gómez, Fabio; Arenas-Aranda, Diego Julio

2014-01-01

In Mexico, breast cancer represents the first cause of cancer death in females. At the molecular level, non-coding RNAs and especially microRNAs have played an important role in the origin and development of this neoplasm. In the Anglo-Saxon population, diverse genetic variants in microRNA genes and in their targets are associated with the development of this disease. In the Mexican population it is not known if these or other variants exist. Identification of these or new variants in our population is fundamental in order to have a better understanding of cancer development and to help establish a better diagnostic strategy. DNA was isolated from mammary tumors, adjacent tissue and peripheral blood of Mexican females with or without cancer. From DNA, five microRNA genes and three of their targets were amplified and sequenced. Genetic variants associated with breast cancer in an Anglo-Saxon population have been previously identified in these sequences. In the samples studied we identified seven single nucleotide polymorphisms (SNPs). Two had not been previously described and were identified only in women with cancer. The new variants may be genetic predisposition factors for the development of breast cancer in our population. Further experiments are needed to determine the involvement of these variants in the development, establishment and progression of breast cancer.
A Predictive Model of the Oxygen and Heme Regulatory Network in Yeast

PubMed Central

Kundaje, Anshul; Xin, Xiantong; Lan, Changgui; Lianoglou, Steve; Zhou, Mei; Zhang, Li; Leslie, Christina

2008-01-01

Deciphering gene regulatory mechanisms through the analysis of high-throughput expression data is a challenging computational problem. Previous computational studies have used large expression datasets in order to resolve fine patterns of coexpression, producing clusters or modules of potentially coregulated genes. These methods typically examine promoter sequence information, such as DNA motifs or transcription factor occupancy data, in a separate step after clustering. We needed an alternative and more integrative approach to study the oxygen regulatory network in Saccharomyces cerevisiae using a small dataset of perturbation experiments. Mechanisms of oxygen sensing and regulation underlie many physiological and pathological processes, and only a handful of oxygen regulators have been identified in previous studies. We used a new machine learning algorithm called MEDUSA to uncover detailed information about the oxygen regulatory network using genome-wide expression changes in response to perturbations in the levels of oxygen, heme, Hap1, and Co2+. MEDUSA integrates mRNA expression, promoter sequence, and ChIP-chip occupancy data to learn a model that accurately predicts the differential expression of target genes in held-out data. We used a novel margin-based score to extract significant condition-specific regulators and assemble a global map of the oxygen sensing and regulatory network. This network includes both known oxygen and heme regulators, such as Hap1, Mga2, Hap4, and Upc2, as well as many new candidate regulators. MEDUSA also identified many DNA motifs that are consistent with previous experimentally identified transcription factor binding sites. Because MEDUSA's regulatory program associates regulators to target genes through their promoter sequences, we directly tested the predicted regulators for OLE1, a gene specifically induced under hypoxia, by experimental analysis of the activity of its promoter. 
In each case, deletion of the candidate regulator resulted in the predicted effect on promoter activity, confirming that several novel regulators identified by MEDUSA are indeed involved in oxygen regulation. MEDUSA can reveal important information from a small dataset and generate testable hypotheses for further experimental analysis. Supplemental data are included. PMID:19008939
Dominant Sequences of Human Major Histocompatibility Complex Conserved Extended Haplotypes from HLA-DQA2 to DAXX

PubMed Central

Larsen, Charles E.; Alford, Dennis R.; Trautwein, Michael R.; Jalloh, Yanoh K.; Tarnacki, Jennifer L.; Kunnenkeri, Sushruta K.; Fici, Dolores A.; Yunis, Edmond J.; Awdeh, Zuheir L.; Alper, Chester A.

2014-01-01

We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight “common” European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700
Isolation of laccase gene-specific sequences from white rot and brown rot fungi by PCR

DOE Office of Scientific and Technical Information (OSTI.GOV)

D'Souza, T.M.; Boominathan, K.; Reddy, C.A.

1996-10-01

Degenerate primers corresponding to the consensus sequences of the copper-binding regions in the N-terminal domains of known basidiomycete laccases were used to isolate laccase gene-specific sequences from strains representing nine genera of wood rot fungi. All except three gave the expected PCR product of about 200 bp. Computer searches of the databases identified the sequence of each of the PCR products analyzed as a laccase gene sequence, suggesting the specificity of the primers. PCR products of the white rot fungi Ganoderma lucidum, Phlebia brevispora, and Trametes versicolor showed 65 to 74% nucleotide sequence similarity to each other; the similarity in deduced amino acid sequences was 83 to 91%. The PCR products of Lentinula edodes and Lentinus tigrinus, on the other hand, showed relatively low nucleotide and amino acid similarities (58 to 64 and 62 to 81%, respectively); however, these similarities were still much higher than when compared with the corresponding regions in the laccases of the ascomycete fungi Aspergillus nidulans and Neurospora crassa. A few of the white rot fungi, as well as Gloeophyllum trabeum, a brown rot fungus, gave a 144-bp PCR fragment which had a nucleotide sequence similarity of 60 to 71%. Demonstration of laccase activity in G. trabeum and several other brown rot fungi was of particular interest because these organisms were not previously shown to produce laccases. 36 refs., 6 figs., 2 tabs.
Associations between novel single nucleotide polymorphisms in the Bos taurus growth hormone gene and performance traits in Holstein-Friesian dairy cattle.

PubMed

Mullen, M P; Berry, D P; Howard, D J; Diskin, M G; Lynch, C O; Berkowicz, E W; Magee, D A; MacHugh, D E; Waters, S M

2010-12-01

Growth hormone, produced in the anterior pituitary gland, stimulates the release of insulin-like growth factor-I from the liver and is of critical importance in the control of nutrient utilization and partitioning for lactogenesis, fertility, growth, and development in cattle. The aim of this study was to discover novel polymorphisms in the bovine growth hormone gene (GH1) and to quantify their association with performance using estimates of genetic merit on 848 Holstein-Friesian AI (artificial insemination) dairy sires. Associations with previously reported polymorphisms in the bovine GH1 gene were also undertaken. A total of 38 novel single nucleotide polymorphisms (SNP) were identified across a panel of 22 beef and dairy cattle by sequence analysis of the 5' promoter, intronic, exonic, and 3' regulatory regions, encompassing approximately 7 kb of the GH1 gene. Following multiple regression analysis on all SNP, associations were identified between 11 SNP (2 novel and 9 previously identified) and milk fat and protein yield, milk composition, somatic cell score, survival, body condition score, and body size. The G allele of a previously identified SNP in exon 5 at position 2141 of the GH1 sequence, resulting in a nonsynonymous substitution, was associated with decreased milk protein yield. The C allele of a novel SNP, GH32, was associated with inferior carcass conformation. In addition, the T allele of a previously characterized SNP, GH35, was associated with decreased survival. Both GH24 (novel) and GH35 were independently associated with somatic cell count, and 3 SNP, GH21, 2291, and GH35, were independently associated with body depth. Furthermore, 2 SNP, GH24 and GH63, were independently associated with carcass fat. Results of this study further demonstrate the multifaceted influences of GH1 on milk production, fertility, and growth-related traits in cattle. Copyright © 2010 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
High-resolution seismic sequence stratigraphy and history of relative sea level changes since the Late Miocene, northern continental margin, South China Sea

NASA Astrophysics Data System (ADS)

Zhong, G.; Wang, L.

2013-12-01

The northern South China Sea (SCS) margin is suggested as one of the ideal sites for documenting late Cenozoic sea level changes because of its rapid sedimentation and relatively stable structural subsidence since the Late Miocene. In this study, high-resolution seismic profiles acquired by the Guangzhou Marine Geological Survey, calibrated by well control from ODP sites 1146 and 1148, were utilized to construct a time-significant sequence stratigraphic framework, from which the history of relative sea level changes since the Late Miocene on the northern SCS margin was derived. Our study area is situated in the middle segment of the margin, between Hainan Island to the west and the Dongsha Islands to the east. This region is relatively far from the active structural zones and is suggested as the most stable region in the margin. A total of 4000 km of seismic profiles were used, covering an area of about 6×10^4 km^2. The seismic data have a vertical resolution of 5 to 15 m for the Upper Miocene to Quaternary interval. Three regional seismic sequence boundaries were identified. They subdivide the Late Miocene to Quaternary into three mega-sequences, which correspond to the Quaternary, Pliocene and Late Miocene, respectively, by tying to well control. The Late Miocene mega-sequence, including 13 component sequences, is characterized by a basal incised canyon-developed interval overlain by three sets of progradational sequences formed in deep-water slope environments. The Pliocene mega-sequence consists of four sets of progradational sequences. Each sequence set contains one to three component sequences. At least 7 component sequences can be identified. The Quaternary mega-sequence consists of five sets of progradational sequences, in which the lower two constitute a retrogressive sequence set and the upper three a progradational sequence set. At least 9 component sequences can be recognized. 
Most of the component sequences within the Pliocene and Quaternary mega-sequences occur adjacent to the modern shelf margin, and therefore were interpreted as shelf-marginal progradational deltaic sequences. A relative sea level curve since the Late Miocene was compiled by integrating the shift trajectory of onlap points, the stacking pattern of component sequences, and the chronostratigraphic diagrams. The curve contains about 29 cycles of relative sea level changes, showing a much higher resolution than the previous results in the region. These cycles constitute three large relative sea level rise and fall cycles. The general trend of sea level variations has been rising since the Late Miocene, which is opposite to the global sea level changes and is in accordance with previous regional studies. This deviation is ascribed to the combined effects of very rapid regional subsidence and relative deficiency of sediment supply. This research was funded by the National Natural Science Foundation of China (Grant Nos. 91028003 and 41076020).
Unveiling the Micronome of Cassava (Manihot esculenta Crantz)

PubMed Central

2016-01-01

MicroRNAs (miRNAs) are an important class of endogenous non-coding single-stranded small RNAs (21–24 nt in length), which serve as post-transcriptional negative regulators of gene expression in plants. Despite the economic importance of Manihot esculenta Crantz (cassava) only 153 putative cassava miRNAs (from multiple germplasm) are available to date in miRBase (Version 21), and identification of a number of miRNAs from the cassava EST database have been limited to comparisons with Arabidopsis. In this study, mature sequences of all known plant miRNAs were used as a query for homologous searches against cassava EST and GSS databases, and additional identification of novel and conserved miRNAs were gleaned from next generation sequencing (NGS) of two cassava landraces (T200 from southern Africa and TME3 from West Africa) at three different stages post explant transplantation and acclimatization. EST and GSS derived data revealed 259 and 32 miRNAs in cassava, and one of the miRNA families (miR2118) from previous studies has not been reported in cassava. NGS data collectively displayed expression of 289 conserved miRNAs in leaf tissue, of which 230 had not been reported previously. Of the 289 conserved miRNAs identified in T200 and TME3, 208 were isomiRs. Thirty-nine novel cassava-specific miRNAs of low abundance, belonging to 29 families, were identified. Thirty-eight (98.6%) of the putative new miRNAs identified by NGS have not been previously reported in cassava. Several miRNA targets were identified in T200 and TME3, highlighting differential temporal miRNA expression between the two cassava landraces. This study contributes to the expanding knowledge base of the micronome of this important crop. PMID:26799216
rpoB-Based Identification of Nonpigmented and Late-Pigmenting Rapidly Growing Mycobacteria

PubMed Central

Adékambi, Toïdi; Colson, Philippe; Drancourt, Michel

2003-01-01

Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor intensive work and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software which analyzes and graphically represents variability stretches of 60 bp along the nucleotide sequence, our analysis focused on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence yielded a high genetic heterogeneity within the clinical isolates. We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM. PMID:14662964
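The divergence thresholds reported in this abstract (<2% for a species-level match, >3% for isolates left unidentified) can be expressed as a short sketch. It assumes the partial rpoB sequences are already aligned to equal length without gaps; the function names and the "indeterminate" label for the 2-3% range are hypothetical, since the abstract does not say how such cases were handled.

```python
from typing import Dict, Optional, Tuple


def percent_divergence(a: str, b: str) -> float:
    """Percent of mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    mismatches = sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)
    return 100.0 * mismatches / len(a)


def classify_isolate(
    isolate: str,
    type_strains: Dict[str, str],
    species_cutoff: float = 2.0,   # "<2%" in the abstract
    reject_cutoff: float = 3.0,    # ">3%" in the abstract
) -> Tuple[Optional[str], float, str]:
    """Compare an isolate with every type strain and label the closest match."""
    name, div = min(
        ((n, percent_divergence(isolate, s)) for n, s in type_strains.items()),
        key=lambda item: item[1],
    )
    if div < species_cutoff:
        return name, div, "species-level"
    if div > reject_cutoff:
        return None, div, "not identified"
    # Assumption: the abstract gives no rule for 2-3% divergence.
    return None, div, "indeterminate"


# Toy 100-bp sequences, not real rpoB data.
toy_refs = {"species_a": "A" * 100, "species_b": "C" * 100}
close_isolate = "A" * 99 + "G"       # 1% divergence from species_a
distant_isolate = "A" * 95 + "G" * 5  # 5% divergence from species_a
```

With the toy data, `classify_isolate(close_isolate, toy_refs)` labels the isolate as a species-level match to `species_a`, while `distant_isolate` exceeds the rejection cutoff.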
Sialome of a Generalist Lepidopteran Herbivore: Identification of Transcripts and Proteins from Helicoverpa armigera Labial Salivary Glands

PubMed Central

Celorio-Mancera, Maria de la Paz; Courtiade, Juliette; Muck, Alexander; Heckel, David G.; Musser, Richard O.; Vogel, Heiko

2011-01-01

Although the importance of insect saliva in insect-host plant interactions has been acknowledged, there is very limited information on the nature and complexity of the salivary proteome in lepidopteran herbivores. We inspected the labial salivary transcriptome and proteome of Helicoverpa armigera, an important polyphagous pest species. To identify the majority of the salivary proteins we have randomly sequenced 19,389 expressed sequence tags (ESTs) from a normalized cDNA library of salivary glands. In parallel, a non-cytosolic enriched protein fraction was obtained from labial salivary glands and subjected to two-dimensional gel electrophoresis (2-DE) and de novo peptide sequencing. This procedure allowed comparison of peptides and EST sequences and enabled us to identify 65 protein spots from the secreted labial saliva 2DE proteome. The mass spectrometry analysis revealed ecdysone, glucose oxidase, fructosidase, carboxyl/cholinesterase and an uncharacterized protein previously detected in H. armigera midgut proteome. Consistently, their corresponding transcripts are among the most abundant in our cDNA library. We did find redundancy of sequence identification of saliva-secreted proteins suggesting multiple isoforms. As expected, we found several enzymes responsible for digestion and plant offense. In addition, we identified non-digestive proteins such as an arginine kinase and abundant proteins of unknown function. This identification of secreted salivary gland proteins allows a more comprehensive understanding of insect feeding and poses new challenges for the elucidation of protein function. PMID:22046331
Molecular testing for familial hypercholesterolaemia-associated mutations in a UK-based cohort: development of an NGS-based method and comparison with multiplex polymerase chain reaction and oligonucleotide arrays.

PubMed

Reiman, Anne; Pandey, Sarojini; Lloyd, Kate L; Dyer, Nigel; Khan, Mike; Crockard, Martin; Latten, Mark J; Watson, Tracey L; Cree, Ian A; Grammatopoulos, Dimitris K

2016-11-01

Background: Detection of disease-associated mutations in patients with familial hypercholesterolaemia is crucial for early interventions to reduce risk of cardiovascular disease. Screening for these mutations represents a methodological challenge since more than 1200 different causal mutations in the low-density lipoprotein receptor have been identified. A number of methodological approaches have been developed for screening by clinical diagnostic laboratories. Methods: Using primers targeting the low-density lipoprotein receptor, apolipoprotein B, and proprotein convertase subtilisin/kexin type 9, we developed a novel Ion Torrent-based targeted re-sequencing method. We validated this in a small West Midlands (UK) cohort of 58 patients screened in parallel with other mutation-targeting methods, such as multiplex polymerase chain reaction (Elucigene FH20), oligonucleotide arrays (Randox familial hypercholesterolaemia array) or the Illumina next-generation sequencing platform. Results: In this small cohort, the next-generation sequencing method achieved excellent analytical performance characteristics and showed 100% and 89% concordance with the Randox array and the Elucigene FH20 assay, respectively. Investigation of the discrepant results identified two cases of mutation misclassification by the Elucigene FH20 multiplex polymerase chain reaction assay. A number of novel mutations not previously reported were also identified by the next-generation sequencing method. Conclusions: Ion Torrent-based next-generation sequencing can deliver a suitable alternative for the molecular investigation of familial hypercholesterolaemia patients, especially when comprehensive mutation screening for rare or unknown mutations is required.
Genetic diversity and virulence profiles of Listeria monocytogenes recovered from bulk tank milk, milk filters, and milking equipment from dairies in the United States (2002 to 2014).

PubMed

Kim, Seon Woo; Haendiges, Julie; Keller, Eric N; Myers, Robert; Kim, Alexander; Lombard, Jason E; Karns, Jeffrey S; Van Kessel, Jo Ann S; Haley, Bradd J

2018-01-01

Unpasteurized dairy products are known to occasionally harbor Listeria monocytogenes and have been implicated in recent listeriosis outbreaks and numerous sporadic cases of listeriosis. However, the diversity and virulence profiles of L. monocytogenes isolates recovered from these products have not been fully described. Here we report a genomic analysis of 121 L. monocytogenes isolates recovered from milk, milk filters, and milking equipment collected from bovine dairy farms in 19 states over a 12-year period. In a multi-virulence-locus sequence typing (MVLST) analysis, 59 Virulence Types (VT) were identified, of which 25% were Epidemic Clones I, II, V, VI, VII, VIII, IX, or X, and 31 were novel VT. In a multi-locus sequence typing (MLST) analysis, 60 Sequence Types (ST) of 56 Clonal Complexes (CC) were identified. Within lineage I, CC5 and CC1 were among the most abundant, and within lineage II, CC7 and CC37 were the most abundant. Multiple CCs previously associated with central nervous system and maternal-neonatal infections were identified. A genomic analysis identified variable distribution of virulence markers, Listeria pathogenicity islands (LIPI) -1, -3, and -4, and stress survival island-1 (SSI-1). Of these, 14 virulence markers, including LIPI-3 and -4, were more frequently detected in one lineage (I or II) than the other. LIPI-3 and LIPI-4 were identified in 68% and 28% of lineage I CCs, respectively. Results of this analysis indicate that there is a high level of genetic diversity among the L. monocytogenes present in bulk tank milk in the United States, with some strains being more frequently detected than others, and some being similar to those that have been isolated from previous non-dairy related outbreaks. Results of this study also demonstrate that a significant number of strains isolated from dairy farms encode virulence markers associated with severe human disease.
Identifying micro-inversions using high-throughput sequencing reads.

PubMed

He, Feifei; Li, Yang; Tang, Yu-Hang; Ma, Jian; Zhu, Huaiqiu

2016-01-11

The identification of inversions of DNA segments shorter than read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. It is acknowledged that MIs are an important type of genomic variation and may play roles in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads. The algorithm of MID is designed based on a dynamic programming path-finding approach. What makes MID different from other variant detection tools is that MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability in low coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp. To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low coverage data, which makes it suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). Therefore our tool is expected to be useful to improve the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID.
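To make the idea of a micro-inversion concrete, here is a minimal sketch of what such a variant looks like inside a single read. This is not the MID algorithm, which the abstract describes as a dynamic programming path-finding approach over unmapped reads. It is a naive check that assumes the read is already placed against a reference window of equal length, with no gaps and a single inverted segment. The function names and the `min_len` parameter are hypothetical.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_micro_inversion(read: str, ref: str, min_len: int = 3) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the smallest segment such that the read equals
    the reference with ref[start:end] reverse-complemented, or None.

    Assumption: read and ref are the same length and gaplessly aligned.
    `min_len` keeps a single complementary substitution (e.g. A>T) from
    being reported as a 1-bp inversion.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have equal length")
    diffs = [i for i, (r, g) in enumerate(zip(read, ref)) if r != g]
    if not diffs:
        return None
    start, end = diffs[0], diffs[-1] + 1
    if end - start >= min_len and read[start:end] == reverse_complement(ref[start:end]):
        return start, end
    return None


# Toy example: the reference segment AACC (positions 4-7) is inverted to GGTT.
toy_ref = "GGGGAACCGGGG"
toy_read = "GGGGGGTTGGGG"
inversion = find_micro_inversion(toy_read, toy_ref)
```

Taking the first and last mismatching positions is sufficient here because mismatches inside a reverse-complemented segment are symmetric about its centre. Any bases at the edges that happen to match can be trimmed without changing the result.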
High-throughput sequencing of fecal DNA to identify insects consumed by wild Weddell's saddleback tamarins (Saguinus weddelli, Cebidae, Primates) in Bolivia.

PubMed

Mallott, E K; Malhi, R S; Garber, P A

2015-03-01

The genus Saguinus represents a successful radiation of over 20 species of small-bodied New World monkeys. Studies of the tamarin diet indicate that insects and small vertebrates account for ∼16-45% of total feeding and foraging time, and represent an important source of lipids, protein, and metabolizable energy. Although tamarins are reported to commonly consume large-bodied insects such as grasshoppers and walking sticks (Orthoptera), little is known concerning the degree to which smaller or less easily identifiable arthropod prey comprises an important component of their diet. To better understand tamarin arthropod feeding behavior, fecal samples from 20 wild Bolivian saddleback tamarins (members of five groups) were collected over a 3-week period in June 2012, and analyzed for the presence of arthropod DNA. DNA was extracted using a Qiagen stool extraction kit, and universal insect primers were created and used to amplify a ∼280 bp section of the COI mitochondrial gene. Amplicons were sequenced on the Roche 454 sequencing platform using high-throughput sequencing techniques. An analysis of these samples indicated the presence of 43 taxa of arthropods including 10 orders, 15 families, and 12 identified genera. Many of these taxa had not been previously identified in the tamarin diet. These results highlight molecular analysis of fecal DNA as an important research tool for identifying arthropod feeding patterns in primates, and reveal broad diversity in the taxa, foraging microhabitats, and size of arthropods consumed by tamarin monkeys. © 2014 Wiley Periodicals, Inc.
PCR amplification and DNA sequencing of Demodex injai from otic secretions of a dog.

PubMed

Milosevic, Milivoj A; Frank, Linda A; Brahmbhatt, Rupal A; Kania, Stephen A

2013-04-01

The identification of Demodex mites from dogs is usually based on morphology and location. Mites with uncharacteristic features or from unusual locations, hosts or disease manifestations could represent new species not previously described; however, this is difficult to determine based on morphology alone. The goal of this study was to identify and confirm Demodex injai in association with otitis externa in a dog using PCR amplification and DNA sequencing. Otic samples were obtained from a beagle in which a long-bodied Demodex mite was identified. For comparison, Demodex mite samples were collected from a swab and scraping of the dorsal skin of a wire-haired fox terrier and an otic sample from a dog with generalized and otic demodicosis. To identify the Demodex mite, DNA was extracted, and 16S rRNA was amplified by PCR, sequenced and compared with Demodex sequences available in public databases and from separate samples morphologically diagnosed as D. injai and Demodex canis. PCR amplification of the long-bodied mite rRNA DNA obtained from otic samples was approximately 330 bp and was identical to that from the mite morphologically identified as D. injai obtained from the dorsal skin of a dog. Furthermore, the examined mite did not have any significant homology to any of the reported genes from Demodex spp. These results confirmed that the Demodex mites in this case were D. injai. © 2013 The Authors. Veterinary Dermatology © 2013 ESVD and ACVD.
Discovery of common sequences absent in the human reference genome using pooled samples from next generation sequencing.

PubMed

Liu, Yu; Koyutürk, Mehmet; Maxwell, Sean; Xiang, Min; Veigl, Martina; Cooper, Richard S; Tayo, Bamidele O; Li, Li; LaFramboise, Thomas; Wang, Zhenghe; Zhu, Xiaofeng; Chance, Mark R

2014-08-16

Sequences up to several megabases in length have been found to be present in individual genomes but absent in the human reference genome. These sequences may be common in populations, and their absence in the reference genome may indicate rare variants in the genomes of individuals who served as donors for the human genome project. As the reference genome is used in probe design for microarray technology and mapping short reads in next generation sequencing (NGS), this missing sequence could be a source of bias in functional genomic studies and variant analysis. One End Anchor (OEA) and/or orphan reads from paired-end sequencing have been used to identify novel sequences that are absent in the reference genome. However, no study has investigated the distribution, evolution and functionality of those sequences in human populations. To systematically identify and study the missing common sequences (micSeqs), we extended the previous method by pooling OEA reads from a large number of individuals and applying strict filtering methods to remove false sequences. The pipeline was applied to data from phase 1 of the 1000 Genomes Project. We identified 309 micSeqs that are present in at least 1% of the human population, but absent in the reference genome. We confirmed 76% of these 309 micSeqs by comparison to other primate genomes, individual human genomes, and gene expression data. Furthermore, we randomly selected fifteen micSeqs and confirmed their presence using PCR validation in 38 additional individuals. Functional analysis using published RNA-seq and ChIP-seq data showed that eleven micSeqs are highly expressed in human brain and three micSeqs contain transcription factor (TF) binding regions, suggesting they are functional elements. In addition, the identified micSeqs are absent in non-primates and show dynamic acquisition during primate evolution culminating with most micSeqs being present in Africans, suggesting some micSeqs may be important sources of human diversity. 76% of micSeqs were confirmed by a comparative genomics approach. Fourteen micSeqs are expressed in human brain or contain TF binding regions. Some micSeqs are primate-specific, conserved and may play a role in the evolution of primates.
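The One End Anchor (OEA) reads mentioned in this abstract are read pairs in which one mate maps to the reference and the other does not. A minimal sketch of how such pairs can be recognised from standard SAM FLAG bits follows. The bit values (0x1 paired, 0x4 read unmapped, 0x8 mate unmapped) come from the SAM specification, but the function name and category labels are hypothetical, and the abstract does not describe the authors' actual filtering code.

```python
PAIRED = 0x1          # template has multiple segments
READ_UNMAPPED = 0x4   # this read is unmapped
MATE_UNMAPPED = 0x8   # the mate of this read is unmapped


def classify_flag(flag: int) -> str:
    """Label a SAM record by the mapping state of the read and its mate."""
    if not flag & PAIRED:
        return "unpaired"
    read_unmapped = bool(flag & READ_UNMAPPED)
    mate_unmapped = bool(flag & MATE_UNMAPPED)
    if read_unmapped and mate_unmapped:
        return "both_unmapped"
    if read_unmapped:
        return "oea_unmapped_mate"   # candidate novel sequence
    if mate_unmapped:
        return "oea_anchor"          # mapped end that anchors the pair
    return "both_mapped"


# Common FLAG values seen in paired-end alignments.
examples = {flag: classify_flag(flag) for flag in (73, 133, 77, 99)}
```

In a pooled analysis of the kind described, the unmapped mates of OEA pairs from many individuals would then be collected and assembled, using the anchored mates to place the resulting sequences.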
Molecular epidemiological analysis of paired pol/env sequences from Portuguese HIV type 1 patients.

PubMed

Abecasis, Ana B; Martins, Andreia; Costa, Inês; Carvalho, Ana P; Diogo, Isabel; Gomes, Perpétua; Camacho, Ricardo J

2011-07-01

The advent of new therapeutic approaches targeting env and the search for efficient anti-HIV-1 vaccines make it necessary to identify the number of recombinant forms using genomic regions that were previously not frequently sequenced. In this study, we have subtyped paired pol and env sequences from HIV-1 strains infecting 152 patients being clinically followed in Portugal. The percentage of strains in which we found discordant subtypes in pol and env was 25.7%. When the subtype in pol and env was concordant (65.1%), the most prevalent subtypes were subtype B (40.8%), followed by subtype C (17.8%) and subtype G (5.3%). The most prevalent recombinant form was CRF14_BGpol/Genv (7.2%).

Chorea-acanthocytosis

PubMed Central

Walker, Susan; Dad, Rubina; Thiruvahindrapuram, Bhooma; Ullah, Muhammed Ikram; Ahmad, Arsalan; Hassan, Muhammad Jawad; Scherer, Stephen W.

2018-01-01

Objective: To determine a molecular diagnosis for a large multigenerational family of South Asian ancestry with seizures, hyperactivity, and episodes of tongue biting. Methods: Two affected individuals from the family were analyzed by whole-genome sequencing on the Illumina HiSeq X platform, and rare variants were prioritized for interpretation with respect to the phenotype. Results: A previously undescribed, 1-kb homozygous deletion was identified in both individuals sequenced, which spanned 2 exons of the VPS13A gene, and was found to segregate in other family members. Conclusions: VPS13A is associated with autosomal recessive chorea-acanthocytosis, a diagnosis consistent with the phenotype observed in this family. Whole-genome sequencing presents a comprehensive and agnostic approach for detecting diagnostic mutations in families with rare neurologic disorders. PMID:29845114
Assessing the diversity of AM fungi in arid gypsophilous plant communities.

PubMed

Alguacil, M M; Roldán, A; Torres, M P

2009-10-01

In the present study, we used PCR-Single-Stranded Conformation Polymorphism (SSCP) techniques to analyse arbuscular mycorrhizal fungi (AMF) communities in four sites within a 10 km² gypsum area in Southern Spain. Four common plant species from these ecosystems were selected. The AM fungal small-subunit (SSU) rRNA genes were subjected to PCR, cloning, SSCP analysis, sequencing and phylogenetic analyses. A total of 1443 SSU rRNA sequences were analysed, for 21 AM fungal types: 19 belonged to the genus Glomus, 1 to the genus Diversispora and 1 to the genus Scutellospora. Four sequence groups were identified, which showed high similarity to sequences of known glomalean species or isolates: Glo G18 to Glomus constrictum, Glo G1 to Glomus intraradices, Glo G16 to Glomus clarum, Scut to Scutellospora dipurpurescens and Div to one new genus in the family Diversisporaceae identified recently as Otospora bareai. There were three sequence groups that received strong support in the phylogenetic analysis, and did not seem to be related to any sequences of AM fungi in culture or previously found in the database; thus, they could be novel taxa within the genus Glomus: Glo G4, Glo G2 and Glo G14. We have detected the presence of both generalist and potential specialist AMF in gypsum ecosystems. The AMF communities differed among the plant species studied, suggesting some degree of preference in the interactions between these symbionts.
Evaluation of Presumably Disease Causing SCN1A Variants in a Cohort of Common Epilepsy Syndromes.

PubMed

Lal, Dennis; Reinthaler, Eva M; Dejanovic, Borislav; May, Patrick; Thiele, Holger; Lehesjoki, Anna-Elina; Schwarz, Günter; Riesch, Erik; Ikram, M Arfan; van Duijn, Cornelia M; Uitterlinden, Andre G; Hofman, Albert; Steinböck, Hannelore; Gruber-Sedlmayr, Ursula; Neophytou, Birgit; Zara, Federico; Hahn, Andreas; Gormley, Padhraig; Becker, Felicitas; Weber, Yvonne G; Cilio, Maria Roberta; Kunz, Wolfram S; Krause, Roland; Zimprich, Fritz; Lemke, Johannes R; Nürnberg, Peter; Sander, Thomas; Lerche, Holger; Neubauer, Bernd A

2016-01-01

The SCN1A gene, coding for the voltage-gated Na+ channel alpha subunit NaV1.1, is the clinically most relevant epilepsy gene. With the advent of high-throughput next-generation sequencing, clinical laboratories are generating an ever-increasing catalogue of SCN1A variants. Variants are more likely to be classified as pathogenic if they have already been identified previously in a patient with epilepsy. Here, we critically re-evaluate the pathogenicity of this class of variants in a cohort of patients with common epilepsy syndromes and subsequently ask whether a significant fraction of benign variants have been misclassified as pathogenic. We screened a discovery cohort of 448 patients with a broad range of common genetic epilepsies and 734 controls for previously reported SCN1A mutations that were assumed to be disease causing. We re-evaluated the evidence for pathogenicity of the identified variants using in silico predictions, segregation, original reports, available functional data and assessment of allele frequencies in healthy individuals as well as in a follow up cohort of 777 patients. We identified 8 known missense mutations, previously reported as pathogenic, in a total of 17 unrelated epilepsy patients (17/448; 3.80%). Our re-evaluation indicates that 7 out of these 8 variants (p.R27T; p.R28C; p.R542Q; p.R604H; p.T1250M; p.E1308D; p.R1928G; NP_001159435.1) are not pathogenic. Only the p.T1174S mutation may be considered as a genetic risk factor for epilepsy of small effect size based on the enrichment in patients (P = 6.60 × 10^-4; OR = 0.32, Fisher's exact test), previous functional studies but incomplete penetrance. Thus, incorporation of previous studies in genetic counseling of SCN1A sequencing results is challenging and may produce incorrect conclusions.
Evaluation of Presumably Disease Causing SCN1A Variants in a Cohort of Common Epilepsy Syndromes

PubMed Central

May, Patrick; Thiele, Holger; Lehesjoki, Anna-Elina; Schwarz, Günter; Riesch, Erik; Ikram, M. Arfan; van Duijn, Cornelia M.; Uitterlinden, Andre G.; Hofman, Albert; Steinböck, Hannelore; Gruber-Sedlmayr, Ursula; Neophytou, Birgit; Zara, Federico; Hahn, Andreas; Gormley, Padhraig; Becker, Felicitas; Weber, Yvonne G.; Cilio, Maria Roberta; Kunz, Wolfram S.; Krause, Roland; Zimprich, Fritz; Lemke, Johannes R.; Nürnberg, Peter; Sander, Thomas; Lerche, Holger; Neubauer, Bernd A.

2016-01-01

Objective: The SCN1A gene, coding for the voltage-gated Na+ channel alpha subunit NaV1.1, is the clinically most relevant epilepsy gene. With the advent of high-throughput next-generation sequencing, clinical laboratories are generating an ever-increasing catalogue of SCN1A variants. Variants are more likely to be classified as pathogenic if they have already been identified previously in a patient with epilepsy. Here, we critically re-evaluate the pathogenicity of this class of variants in a cohort of patients with common epilepsy syndromes and subsequently ask whether a significant fraction of benign variants have been misclassified as pathogenic. Methods: We screened a discovery cohort of 448 patients with a broad range of common genetic epilepsies and 734 controls for previously reported SCN1A mutations that were assumed to be disease causing. We re-evaluated the evidence for pathogenicity of the identified variants using in silico predictions, segregation, original reports, available functional data and assessment of allele frequencies in healthy individuals as well as in a follow up cohort of 777 patients. Results and Interpretation: We identified 8 known missense mutations, previously reported as pathogenic, in a total of 17 unrelated epilepsy patients (17/448; 3.80%). Our re-evaluation indicates that 7 out of these 8 variants (p.R27T; p.R28C; p.R542Q; p.R604H; p.T1250M; p.E1308D; p.R1928G; NP_001159435.1) are not pathogenic. Only the p.T1174S mutation may be considered as a genetic risk factor for epilepsy of small effect size based on the enrichment in patients (P = 6.60 × 10^-4; OR = 0.32, Fisher's exact test), previous functional studies but incomplete penetrance. Thus, incorporation of previous studies in genetic counseling of SCN1A sequencing results is challenging and may produce incorrect conclusions. PMID:26990884
Enterocin TW21, a novel bacteriocin from dochi-isolated Enterococcus faecium D081821.

PubMed

Chang, S-Y; Chen, Y-S; Pan, S-F; Lee, Y-S; Chang, C-H; Chang, C-H; Yu, B; Wu, H-C

2013-09-01

Purification and characterization of a novel bacteriocin produced by strain Enterococcus faecium D081821. Enterococcus faecium D081821, isolated from the traditional Taiwanese fermented food dochi (fermented black beans), was previously found to produce a bacteriocin against Listeria monocytogenes and some Gram-positive bacteria. This bacteriocin, termed enterocin TW21, was purified from culture supernatant by ammonium sulfate precipitation, Sep-Pak C18 cartridge, ion-exchange and gel filtration chromatography. Mass spectrometry analysis showed the mass of the peptide to be approximately 5300.6 Da. The N-terminal amino acid sequencing yielded a partial sequence NH2-ATYYGNGVYxNTQK by Edman degradation, and it contains the consensus class IIa bacteriocin motif YGNGV in the N-terminal region. The open reading frame (ORF) encoding the bacteriocin was identified from the draft genome sequence of Enterococcus faecium D081821, and sequence analysis of this peptide indicated that enterocin TW21 is a novel bacteriocin. Enterococcus faecium D081821 produced a bacteriocin named enterocin TW21; its molecular weight and amino acid sequence both revealed it to be a novel bacteriocin. A new member of class IIa bacteriocin was identified. This bacteriocin shows great inhibitory ability against L. monocytogenes and could be applied as a natural food preservative. © 2013 The Society for Applied Microbiology.
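The motif check described in this abstract can be reproduced with a few lines of code. The sketch below searches a peptide for the class IIa consensus motif YGNGV exactly as quoted in the abstract. The function name is hypothetical, and the lower-case "x" in the partial sequence is treated as an ordinary character standing for a residue that was not determined.

```python
import re
from typing import Optional

# Consensus motif as quoted in the abstract.
CLASS_IIA_MOTIF = re.compile(r"YGNGV")


def find_class_iia_motif(peptide: str) -> Optional[int]:
    """Return the 0-based start of the YGNGV motif, or None if it is absent."""
    match = CLASS_IIA_MOTIF.search(peptide.upper())
    return match.start() if match else None


# Partial N-terminal sequence reported for enterocin TW21.
n_terminal = "ATYYGNGVYxNTQK"
motif_start = find_class_iia_motif(n_terminal)
```

For the reported partial sequence the motif begins at 0-based position 3, which is the fourth residue and so lies within the N-terminal region as the abstract states.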
Molecular Genetics of the Usher Syndrome in Lebanon: Identification of 11 Novel Protein Truncating Mutations by Whole Exome Sequencing

PubMed Central

Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

2014-01-01

Background: Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Methods: Whole exome sequencing followed by expanded familial validation by Sanger sequencing. Results: We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Conclusion: Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of the whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes. PMID:25211151
The Genome Sequence of Mannheimia haemolytica A1: Insights into Virulence, Natural Competence, and Pasteurellaceae Phylogeny

PubMed Central

Gioia, Jason; Qin, Xiang; Jiang, Huaiyang; Clinkenbeard, Kenneth; Lo, Reggie; Liu, Yamei; Fox, George E.; Yerrapragada, Shailaja; McLeod, Michael P.; McNeill, Thomas Z.; Hemphill, Lisa; Sodergren, Erica; Wang, Qiaoyan; Muzny, Donna M.; Homsi, Farah J.; Weinstock, George M.; Highlander, Sarah K.

2006-01-01

The draft genome sequence of Mannheimia haemolytica A1, the causative agent of bovine respiratory disease complex (BRDC), is presented. Strain ATCC BAA-410, isolated from the lung of a calf with BRDC, was the DNA source. The annotated genome includes 2,839 coding sequences, 1,966 of which were assigned a function and 436 of which are unique to M. haemolytica. Through genome annotation many features of interest were identified, including bacteriophages and genes related to virulence, natural competence, and transcriptional regulation. In addition to previously described virulence factors, M. haemolytica encodes adhesins, including the filamentous hemagglutinin FhaB and two trimeric autotransporter adhesins. Two dual-function immunoglobulin-protease/adhesins are also present, as is a third immunoglobulin protease. Genes related to iron acquisition and drug resistance were identified and are likely important for survival in the host and virulence. Analysis of the genome indicates that M. haemolytica is naturally competent, as genes for natural competence and DNA uptake signal sequences (USS) are present. Comparison of competence loci and USS in other species in the family Pasteurellaceae indicates that M. haemolytica, Actinobacillus pleuropneumoniae, and Haemophilus ducreyi form a lineage distinct from other Pasteurellaceae. This observation was supported by a phylogenetic analysis using sequences of predicted housekeeping genes. PMID:17015664
Increased complexity of circRNA expression during species evolution.

PubMed

Dong, Rui; Ma, Xu-Kai; Chen, Ling-Ling; Yang, Li

2017-08-03

Circular RNAs (circRNAs) are broadly identified from precursor mRNA (pre-mRNA) back-splicing across various species. Recent studies have suggested a cell-/tissue-specific manner of circRNA expression. However, the distinct expression pattern of circRNAs among species and its underlying mechanism still remain to be explored. Here, we systematically compared circRNA expression from human and mouse, and found that only a small portion of human circRNAs could be determined in parallel mouse samples. The conserved circRNA expression between human and mouse is correlated with the existence of orientation-opposite complementary sequences in introns that flank back-spliced exons in both species, but not the circRNA sequences themselves. Quantification of RNA pairing capacity of orientation-opposite complementary sequences across circRNA-flanking introns by Complementary Sequence Index (CSI) identifies that among all types of complementary sequences, SINEs, especially Alu elements in human, contribute the most to circRNA formation and that their diverse distribution across species leads to the increased complexity of circRNA expression during species evolution. Together, our integrated and comparative reference catalog of circRNAs in different species reveals a species-specific pattern of circRNA expression and suggests a previously under-appreciated impact of fast-evolved SINEs on the regulation of (circRNA) gene expression.
Molecular genetics of the Usher syndrome in Lebanon: identification of 11 novel protein truncating mutations by whole exome sequencing.

PubMed

Reddy, Ramesh; Fahiminiya, Somayyeh; El Zir, Elie; Mansour, Ahmad; Megarbane, Andre; Majewski, Jacek; Slim, Rima

2014-01-01

Usher syndrome (USH) is a genetically heterogeneous condition with ten disease-causing genes. The spectrum of genes and mutations causing USH in the Lebanese and Middle Eastern populations has not been described. Consequently, diagnostic approaches designed to screen for previously reported mutations were unlikely to identify the mutations in 11 unrelated families, eight of Lebanese and three of Middle Eastern origins. In addition, six of the ten USH genes consist of more than 20 exons each, which made mutational analysis by Sanger sequencing of PCR-amplified exons from genomic DNA tedious and costly. The study was aimed at the identification of USH causing genes and mutations in 11 unrelated families with USH type I or II. Whole exome sequencing was followed by expanded familial validation by Sanger sequencing. We identified disease-causing mutations in all the analyzed patients in four USH genes, MYO7A, USH2A, GPR98 and CDH23. Eleven of the mutations were novel and protein truncating, including a complex rearrangement in GPR98. Our data highlight the genetic diversity of Usher syndrome in the Lebanese population and the time and cost-effectiveness of the whole exome sequencing approach for mutation analysis of genetically heterogeneous conditions caused by large genes.
Complete Deletion of the Fucose Operon in Haemophilus influenzae Is Associated with a Cluster in Multilocus Sequence Analysis-Based Phylogenetic Group II Related to Haemophilus haemolyticus: Implications for Identification and Typing

PubMed Central

de Gier, Camilla; Kirkham, Lea-Ann S.

2015-01-01

Nonhemolytic variants of Haemophilus haemolyticus are difficult to differentiate from Haemophilus influenzae despite a wide difference in pathogenic potential. A previous investigation characterized a challenging set of 60 clinical strains using multiple PCRs for marker genes and described strains that could not be unequivocally identified as either species. We have analyzed the same set of strains by multilocus sequence analysis (MLSA) and near-full-length 16S rRNA gene sequencing. MLSA unambiguously allocated all study strains to either of the two species, while identification by 16S rRNA sequence was inconclusive for three strains. Notably, the two methods yielded conflicting identifications for two strains. Most of the “fuzzy species” strains were identified as H. influenzae that had undergone complete deletion of the fucose operon. Such strains, which are untypeable by the H. influenzae multilocus sequence type (MLST) scheme, have sporadically been reported and predominantly belong to a single branch of H. influenzae MLSA phylogenetic group II. We also found evidence of interspecies recombination between H. influenzae and H. haemolyticus within the 16S rRNA genes. Establishing an accurate method for rapid and inexpensive identification of H. influenzae is important for disease surveillance and treatment. PMID:26378279
Extraordinary Structured Noncoding RNAs Revealed by Bacterial Metagenome Analysis

PubMed Central

Weinberg, Zasha; Perreault, Jonathan; Meyer, Michelle M.; Breaker, Ronald R.

2012-01-01

Estimates of the total number of bacterial species [1-3] suggest that existing DNA sequence databases carry only a tiny fraction of the total amount of DNA sequence space represented by this division of life. Indeed, environmental DNA samples have been shown to encode many previously unknown classes of proteins [4] and RNAs [5]. Bioinformatics searches [6-10] of genomic DNA from bacteria commonly identify novel noncoding RNAs (ncRNAs) [10-12] such as riboswitches [13,14]. In rare instances, RNAs that exhibit more extensive sequence and structural conservation across a wide range of bacteria are encountered [15,16]. Given that large structured RNAs are known to carry out complex biochemical functions such as protein synthesis and RNA processing reactions, identifying more RNAs of great size and intricate structure is likely to reveal additional biochemical functions that can be achieved by RNA. We applied an updated computational pipeline [17] to discover ncRNAs that rival the known large ribozymes in size and structural complexity or that are among the most abundant RNAs in bacteria that encode them. These RNAs would have been difficult or impossible to detect without examining environmental DNA sequences, suggesting that numerous RNAs with extraordinary size, structural complexity, or other exceptional characteristics remain to be discovered in unexplored sequence space. PMID:19956260
Barcode ITS2: a useful tool for identifying Trachelospermum jasminoides and a good monitor for medicine market.

PubMed

Yu, Ning; Wei, Yu-Long; Zhang, Xin; Zhu, Ning; Wang, Yan-Li; Zhu, Yue; Zhang, Hai-Ping; Li, Fen-Mei; Yang, Lan; Sun, Jia-Qi; Sun, Ai-Dong

2017-07-11

Trachelospermum jasminoides is commonly used in traditional Chinese medicine. However, the use of the plant's local alternatives is frequent, causing potential clinical problems. The T. jasminoides sold in the medicine market is commonly dried and sliced, making traditional identification methods difficult. In this study, the ITS2 region was evaluated on 127 sequences representing T. jasminoides and its local alternatives according to PCR and sequencing rates, intra- and inter-specific divergences, secondary structure, and discrimination capacity. Results indicated 100% success rates for PCR and sequencing and the obvious presence of a barcoding gap. Results of the BLAST1, nearest distance, and neighbor-joining tree methods showed that barcode ITS2 could successfully identify all the tested samples. The secondary structures of the ITS2 region provided another dimension for species identification. Two-dimensional images were obtained for better and easier identification. Previous studies on DNA barcoding concentrated more on the same family, genus, or species. However, an ideal barcode should be variable enough to identify closely related species. Meanwhile, the barcodes should also be conservative in identifying distantly related species. This study highlights the application of barcode ITS2 in solving practical problems in the distantly related local alternatives of medicinal plants.
Bacterial Diversity in Human Subgingival Plaque

PubMed Central

Paster, Bruce J.; Boches, Susan K.; Galvin, Jamie L.; Ericson, Rebecca E.; Lau, Carol N.; Levanos, Valerie A.; Sahasrabudhe, Ashish; Dewhirst, Floyd E.

2001-01-01

The purpose of this study was to determine the bacterial diversity in the human subgingival plaque by using culture-independent molecular methods as part of an ongoing effort to obtain full 16S rRNA sequences for all cultivable and not-yet-cultivated species of human oral bacteria. Subgingival plaque was analyzed from healthy subjects and subjects with refractory periodontitis, adult periodontitis, human immunodeficiency virus periodontitis, and acute necrotizing ulcerative gingivitis. 16S ribosomal DNA (rDNA) bacterial genes from DNA isolated from subgingival plaque samples were PCR amplified with all-bacterial or selective primers and cloned into Escherichia coli. The sequences of cloned 16S rDNA inserts were used to determine species identity or closest relatives by comparison with sequences of known species. A total of 2,522 clones were analyzed. Nearly complete sequences of approximately 1,500 bases were obtained for putative new species. About 60% of the clones fell into 132 known species, 70 of which were identified from multiple subjects. About 40% of the clones were novel phylotypes. Of the 215 novel phylotypes, 75 were identified from multiple subjects. Known putative periodontal pathogens such as Porphyromonas gingivalis, Bacteroides forsythus, and Treponema denticola were identified from multiple subjects, but typically as a minor component of the plaque as seen in cultivable studies. Several phylotypes fell into two recently described phyla previously associated with extreme natural environments, for which there are no cultivable species. A number of species or phylotypes were found only in subjects with disease, and a few were found only in healthy subjects. The organisms identified only from diseased sites deserve further study as potential pathogens. Based on the sequence data in this study, the predominant subgingival microbial community consisted of 347 species or phylotypes that fall into 9 bacterial phyla. 
Based on the 347 species seen in our sample of 2,522 clones, we estimate that there are 68 additional unseen species, for a total estimate of 415 species in the subgingival plaque. When organisms found on other oral surfaces such as the cheek, tongue, and teeth are added to this number, the best estimate of the total species diversity in the oral cavity is approximately 500 species, as previously proposed. PMID:11371542
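The species counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the abstract; the variable names are illustrative, and the abstract does not say which estimator produced the 68 unseen species, so that number is taken as given rather than recomputed.

```python
# Figures quoted in the abstract; variable names are illustrative only.
known_species = 132      # known species among the clones
novel_phylotypes = 215   # novel phylotypes among the clones
estimated_unseen = 68    # the abstract's estimate; estimator not stated there

# Observed richness is the sum of known species and novel phylotypes.
observed = known_species + novel_phylotypes

# Estimated total richness adds the unseen species to the observed count.
estimated_total = observed + estimated_unseen

print(observed)         # 347
print(estimated_total)  # 415
```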
Unifying tephrostratigraphic approaches to redefine major Holocene marker tephras, Mt. Taranaki, New Zealand

NASA Astrophysics Data System (ADS)

Damaschke, M.; Cronin, S. J.; Torres-Orozco, R.; Wallace, R. C.

2017-05-01

In this study, geochemical fingerprinting of glass shards and titanomagnetite phenocrysts was used to match twenty complex pyroclastic deposits from the flanks of Mt. Taranaki to major tephra fall "marker beds" in medial and distal deposition sites. These correlations hinged upon identifying time-bound compositional changes (a chemostratigraphy) in distal Taranaki tephra-fall sequences preserved in lake and peat sediment records around the volcano. The current work shows that previous soil-stratigraphy based studies led to miscorrelations, because they relied upon radiocarbon dates, a "counting back" approach, and an underestimate of the number of eruptions that actually occurred in any time frame. The new tephrostratigraphy proposed at Mt. Taranaki resulted from stratigraphic rearranging of several earlier-defined units. Some tephra units are older than previously determined (e.g., Waipuku, Tariki, and Mangatoki; 6 to 9 cal ka BP), while one of the most prominent Taranaki marker tephra deposits, the Korito, is shown to lie stratigraphically above a widespread rhyolitic marker bed from Taupo volcano, the Stent Tephra (also known as unit Q; 4.3 cal ka BP). Pyroclastic tephra deposits previously dated between 6 and 4 cal ka BP at a key tephra section, c. 40 km NE of Mt. Taranaki's summit, were misidentified and are now shown to comprise new marker tephra deposits, including the Kokowai (4.7 cal ka BP), which is a prominent marker horizon on the eastern flanks of the volcano. A new local proximal stratigraphy for < 5 cal ka BP tephra units can be well correlated to tephra layers within distal lake and peat sequences, but the differences between the two records indicate that an overall larger number of eruptions have occurred at this volcano than previously thought. This study additionally demonstrates the utility of titanomagnetite chemistry for discrimination and correlation of groups or sequences of tephra deposits - even if unique compositions cannot be identified.
Barcoding snakeheads (Teleostei, Channidae) revisited: Discovering greater species diversity and resolving perpetuated taxonomic confusions

PubMed Central

Conte-Grand, Cecilia; Britz, Ralf; Dahanukar, Neelesh; Raghavan, Rajeev; Pethiyagoda, Rohan; Tan, Heok Hui; Hadiaty, Renny K.; Yaakob, Norsham S.

2017-01-01

Snakehead fishes of the family Channidae are predatory freshwater teleosts from Africa and Asia comprising 38 valid species. Snakeheads are important food fishes (aquaculture, live food trade) and have been introduced widely with several species becoming highly invasive. A channid barcode library was recently assembled by Serrao and co-workers to better detect and identify potential and established invasive snakehead species outside their native range. Comparing our own recent phylogenetic results of this taxonomically confusing group with those previously reported revealed several inconsistencies that prompted us to expand and improve on previous studies. By generating 343 novel snakehead coxI sequences and combining them with an additional 434 coxI sequences from GenBank we highlight several problems with previous efforts towards the assembly of a snakehead reference barcode library. We found that 16.3% of the channid coxI sequences deposited in GenBank are based on misidentifications. With the inclusion of our own data we were, however, able to solve these cases of perpetuated taxonomic confusion. Different species delimitation approaches we employed (BIN, GMYC, and PTP) were congruent in suggesting a potentially much higher species diversity within snakeheads than currently recognized. In total, 90 BINs were recovered and within a total of 15 currently recognized species multiple BINs were identified. This higher species diversity is mostly due to either the incorporation of undescribed, narrow range, endemics from the Eastern Himalaya biodiversity hotspot or the incorporation of several widespread species characterized by deep genetic splits between geographically well-defined lineages. In the latter case, over-lumping in the past has deflated the actual species numbers. Further integrative approaches are clearly needed for providing a better taxonomic understanding of snakehead diversity, new species descriptions and taxonomic revisions of the group. 
PMID:28931084
A cultivation-independent PCR-RFLP assay targeting oprF gene for detection and identification of Pseudomonas spp. in samples from fibrocystic pediatric patients.

PubMed

Lagares, Antonio; Agaras, Betina; Bettiol, Marisa P; Gatti, Blanca M; Valverde, Claudio

2015-07-01

Species-specific genetic markers are crucial to develop faithful and sensitive molecular methods for the detection and identification of Pseudomonas aeruginosa (Pa). We have previously set up a PCR-RFLP protocol targeting oprF, the gene encoding the genus-specific outer membrane porin F, whose strong conservation and marked sequence diversity allowed detection and differentiation of environmental isolates (Agaras et al., 2012). Here, we evaluated the ability of the PCR-RFLP assay to genotype clinical isolates previously identified as Pa by conventional microbiological methods within a collection of 62 presumptive Pa isolates from different pediatric clinical samples and different sections of the Hospital de Niños "Sor María Ludovica" from La Plata, Argentina. All isolates but one gave an oprF amplicon consistent with that from reference Pa strains. The sequence of the smaller-sized amplicon revealed that the isolate was in fact a Pseudomonas mendocina strain. The oprF RFLP pattern generated with TaqI or HaeIII nucleases matched those of reference Pa strains for 59 isolates (96%). The other two Pa isolates (4%) revealed a different RFLP pattern based on HaeIII digestion, although oprF sequencing confirmed that Pa identification was correct. We next tested the effectiveness of the PCR-RFLP to detect pseudomonads directly in clinical samples from pediatric fibrocystic patients, without sample cultivation. The expected amplicon and its cognate RFLP profile were obtained for all samples in which Pa was previously detected by cultivation-dependent methods. Altogether, these results provide the basis for the application of the oprF PCR-RFLP protocol to directly detect and identify Pa and other non-Pa pseudomonads in fibrocystic clinical samples.
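The RFLP comparison described in this abstract can be illustrated with a small in-silico digest. This is a minimal sketch, not the authors' protocol: the recognition sites for TaqI (T^CGA) and HaeIII (GG^CC) are standard, but the function names and both toy sequences are made up for illustration and are not real oprF amplicons.

```python
# Recognition site and the offset within the site where the top strand is cut.
ENZYMES = {
    "TaqI": ("TCGA", 1),    # T^CGA
    "HaeIII": ("GGCC", 2),  # GG^CC
}

def digest(seq, enzyme):
    """Return fragment lengths, in 5'->3' order, after a complete digest."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [bounds[i + 1] - bounds[i] for i in range(len(bounds) - 1)]

def same_pattern(a, b, enzyme):
    """Compare two amplicons by their sorted fragment sizes, as on a gel."""
    return sorted(digest(a, enzyme)) == sorted(digest(b, enzyme))

# Hypothetical toy amplicons (NOT real oprF sequences).
reference = "ATGAAACTGGCCAACACCTCGATTGCAGGCCTTAA"
variant = "ATGAAACTGGCAAACACCTCGATTGCAGGCCTTAA"  # one GGCC site lost

print(digest(reference, "HaeIII"))              # [10, 19, 6]
print(digest(variant, "HaeIII"))                # [29, 6]
print(same_pattern(reference, variant, "TaqI"))  # True
```

In this toy case a single base change removes one HaeIII site, so the variant shows a different HaeIII pattern while its TaqI pattern is unchanged. A real comparison would also need to allow for the size resolution of the gel.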
The long tail of oncogenic drivers in prostate cancer.

PubMed

Armenia, Joshua; Wankowicz, Stephanie A M; Liu, David; Gao, Jianjiong; Kundra, Ritika; Reznik, Ed; Chatila, Walid K; Chakravarty, Debyani; Han, G Celine; Coleman, Ilsa; Montgomery, Bruce; Pritchard, Colin; Morrissey, Colm; Barbieri, Christopher E; Beltran, Himisha; Sboner, Andrea; Zafeiriou, Zafeiris; Miranda, Susana; Bielski, Craig M; Penson, Alexander V; Tolonen, Charlotte; Huang, Franklin W; Robinson, Dan; Wu, Yi Mi; Lonigro, Robert; Garraway, Levi A; Demichelis, Francesca; Kantoff, Philip W; Taplin, Mary-Ellen; Abida, Wassim; Taylor, Barry S; Scher, Howard I; Nelson, Peter S; de Bono, Johann S; Rubin, Mark A; Sawyers, Charles L; Chinnaiyan, Arul M; Schultz, Nikolaus; Van Allen, Eliezer M

2018-05-01

Comprehensive genomic characterization of prostate cancer has identified recurrent alterations in genes involved in androgen signaling, DNA repair, and PI3K signaling, among others. However, larger and uniform genomic analysis may identify additional recurrently mutated genes at lower frequencies. Here we aggregate and uniformly analyze exome sequencing data from 1,013 prostate cancers. We identify and validate a new class of E26 transformation-specific (ETS)-fusion-negative tumors defined by mutations in epigenetic regulators, as well as alterations in pathways not previously implicated in prostate cancer, such as the spliceosome pathway. We find that the incidence of significantly mutated genes (SMGs) follows a long-tail distribution, with many genes mutated in less than 3% of cases. We identify a total of 97 SMGs, including 70 not previously implicated in prostate cancer, such as the ubiquitin ligase CUL3 and the transcription factor SPEN. Finally, comparing primary and metastatic prostate cancer identifies a set of genomic markers that may inform risk stratification.
Identification of Major Outer Surface Proteins of Streptococcus agalactiae

PubMed Central

Hughes, Martin J. G.; Moore, Joanne C.; Lane, Jonathan D.; Wilson, Rebecca; Pribul, Philippa K.; Younes, Zabin N.; Dobson, Richard J.; Everest, Paul; Reason, Andrew J.; Redfern, Joanne M.; Greer, Fiona M.; Paxton, Thanai; Panico, Maria; Morris, Howard R.; Feldman, Robert G.; Santangelo, Joseph D.

2002-01-01

To identify the major outer surface proteins of Streptococcus agalactiae (group B streptococcus), a proteomic analysis was undertaken. An extract of the outer surface proteins was separated by two-dimensional electrophoresis. The visualized spots were identified through a combination of peptide sequencing and reverse genetic methodologies. Of the 30 major spots identified as S. agalactiae specific, 27 have been identified. Six of these proteins, previously unidentified in S. agalactiae, were sequenced and cloned. These were ornithine carbamoyltransferase, phosphoglycerate kinase, nonphosphorylating glyceraldehyde-3-phosphate dehydrogenase, purine nucleoside phosphorylase, enolase, and glucose-6-phosphate isomerase. Using a gram-positive expression system, we have overexpressed two of these proteins in an in vitro system. These recombinant, purified proteins were used to raise antisera. The identification of these proteins as residing on the outer surface was confirmed by the ability of the antisera to react against whole, live bacteria. Further, in a neonatal-animal model system, we demonstrate that some of these sera are protective against lethal doses of bacteria. These studies demonstrate the successful application of proteomics as a technique for identifying vaccine candidates. PMID:11854208
Diagnostics for Yaws Eradication: Insights From Direct Next-Generation Sequencing of Cutaneous Strains of Treponema pallidum

PubMed Central

Marks, Michael; Fookes, Maria; Wagner, Josef; Butcher, Robert; Ghinai, Rosanna; Sokana, Oliver; Sarkodie, Yaw-Adu; Lukehart, Sheila A; Solomon, Anthony W; Mabey, David C W; Thomson, Nicholas

2018-01-01

Background: Yaws-like chronic ulcers can be caused by Treponema pallidum subspecies pertenue, Haemophilus ducreyi, or other, still-undefined bacteria. To permit accurate evaluation of yaws elimination efforts, programmatic use of molecular diagnostics is required. The accuracy and sensitivity of current tools remain unclear because our understanding of T. pallidum diversity is limited by the low number of sequenced genomes. Methods: We tested samples from patients with suspected yaws collected in the Solomon Islands and Ghana. All samples were from patients whose lesions had previously tested negative using the Centers for Disease Control and Prevention (CDC) diagnostic assay in widespread use. However, some of these patients had positive serological assays for yaws on blood. We used direct whole-genome sequencing to identify T. pallidum subsp pertenue strains missed by the current assay. Results: From 45 Solomon Islands and 27 Ghanaian samples, 11 were positive for T. pallidum DNA using the species-wide quantitative polymerase chain reaction (PCR) assay, from which we obtained 6 previously undetected T. pallidum subsp pertenue whole-genome sequences. These show that Solomon Islands sequences represent distinct T. pallidum subsp pertenue clades. These isolates were invisible to the CDC diagnostic PCR assay, due to sequence variation in the primer binding site. Conclusions: Our data double the number of published T. pallidum subsp pertenue genomes. We show that Solomon Islands strains are undetectable by the PCR used in many studies and by health ministries. This assay is therefore not adequate for the eradication program. Next-generation genome sequence data are essential for these efforts. PMID:29045605
Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

PubMed Central

Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

1998-01-01

Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902-D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR, SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C. 
The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA cDNA from an USH1C patient; however, no mutations were detected. [The sequence data described in this paper have been submitted to GenBank under accession numbers AC000406–AC000407.] PMID:9445488

Operating characteristics of the implicit learning system supporting serial interception sequence learning.

PubMed

Sanchez, Daniel J; Reber, Paul J

2012-04-01

The memory system that supports implicit perceptual-motor sequence learning relies on brain regions that operate separately from the explicit, medial temporal lobe memory system. The implicit learning system therefore likely has distinct operating characteristics and information processing constraints. To attempt to identify the limits of the implicit sequence learning mechanism, participants performed the serial interception sequence learning (SISL) task with covertly embedded repeating sequences that were much longer than those used in most previous studies: ranging from 30 to 60 (Experiment 1) and 60 to 90 (Experiment 2) items in length. Robust sequence-specific learning was observed for sequences up to 80 items in length, extending the known capacity of implicit sequence learning. In Experiment 3, 12-item repeating sequences were embedded among increasing amounts of irrelevant nonrepeating sequences (from 20 to 80% of training trials). Despite high levels of irrelevant trials, learning occurred across conditions. A comparison of learning rates across all three experiments found a surprising degree of constancy in the rate of learning regardless of sequence length or embedded noise. Sequence learning appears to be constant with the logarithm of the number of sequence repetitions practiced during training. The consistency in learning rate across experiments and conditions implies that the mechanisms supporting implicit sequence learning are neither capacity-constrained by very long sequences nor adversely affected by high rates of irrelevant sequences during training.
Recombinational hotspot specific to female meiosis in the mouse major histocompatibility complex.

PubMed

Shiroishi, T; Hanzawa, N; Sagai, T; Ishiura, M; Gojobori, T; Steinmetz, M; Moriwaki, K

1990-01-01

The wm7 haplotype of the major histocompatibility complex (MHC), derived from the Japanese wild mouse Mus musculus molossinus, enhances recombination specific to female meiosis in the K/A beta interval of the MHC. We have mapped crossover points of fifteen independent recombinants from genetic crosses of the wm7 and laboratory haplotypes. Most of them were confined to a short segment of approximately 1 kilobase (kb) of DNA between the A beta 3 and A beta 2 genes, indicating the presence of a female-specific recombinational hotspot. Its location overlaps with a sex-independent hotspot previously identified in the Mus musculus castaneus CAS3 haplotype. We have cloned and sequenced DNA fragments surrounding the hotspot from the wm7 haplotype and the corresponding regions from the hotspot-negative B10.A and C57BL/10 strains. There is no significant difference between the sequences of these three strains, or between these and the published sequences of the CAS3 and C57BL/6 strains. However, a comparison of this A beta 3/A beta 2 hotspot with a previously characterized hotspot in the E beta gene revealed that they have a very similar molecular organization. Each hotspot consists of two elements, the consensus sequence of the mouse middle repetitive MT family and the tetrameric repeated sequences, which are separated by 1 kb of DNA.
Overview of recurrent chromosomal losses in retinoblastoma detected by low coverage next generation sequencing

PubMed Central

García-Chequer, A.J.; Méndez-Tenorio, A.; Olguín-Ruiz, G.; Sánchez-Vallejo, C.; Isa, P.; Arias, C.F.; Torres, J.; Hernández-Angeles, A.; Ramírez-Ortiz, M.A.; Lara, C.; Cabrera-Muñoz, M.L.; Sadowinski-Pine, S.; Bravo-Ortiz, J.C.; Ramón-García, G.; Diegopérez-Ramírez, J.; Ramírez-Reyes, G.; Casarrubias-Islas, R.; Ramírez, J.; Orjuela, M.A.; Ponce-Castañeda, M.V.

2016-01-01

Genes are frequently lost or gained in malignant tumors and the analysis of these changes can be informative about the underlying tumor biology. Retinoblastoma is a pediatric intraocular malignancy, and since deletions in chromosome 13 have been described in this tumor, we performed genome wide sequencing with the Illumina platform to test whether recurrent losses could be detected in low coverage data from DNA pools of Rb cases. An in silico reference profile for each pool was created from the human genome sequence GRCh37p5; a chromosome integrity score and a graphical 40 Kb window analysis approach allowed us to identify, with high resolution, previously reported non-random recurrent losses in all chromosomes of these tumors. We also found a pattern of gains and losses associated with clear and dark cytogenetic bands, respectively. We further analyzed a pool of medulloblastoma and found a more stable genomic profile and previously reported losses in this tumor. This approach facilitates identification of recurrent deletions from many patients that may be biologically relevant for tumor development. PMID:26883451
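A windowed comparison of sample coverage against a reference profile, as described in this abstract, might look like the following sketch. Only the 40 kb window size comes from the abstract. The function names, the normalisation, the 0.5 ratio threshold, and the toy read positions are all assumptions for illustration, and the abstract's chromosome integrity score is not defined there, so it is not reproduced here.

```python
from collections import Counter

WINDOW = 40_000  # 40 kb windows, as stated in the abstract

def window_counts(read_starts, chrom_length, window=WINDOW):
    """Count reads per fixed-size window along one chromosome."""
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = Counter(pos // window for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_windows)]

def flag_losses(sample, reference, threshold=0.5):
    """Return indices of windows whose normalised sample/reference ratio
    falls below `threshold` (an arbitrary cut-off chosen for this sketch)."""
    total_s, total_r = sum(sample), sum(reference)
    flagged = []
    for i, (s, r) in enumerate(zip(sample, reference)):
        if r == 0:
            continue  # no reference signal in this window, so no ratio
        ratio = (s / total_s) / (r / total_r)
        if ratio < threshold:
            flagged.append(i)
    return flagged

# Toy data: five windows; the sample is depleted in window 2.
chrom_length = 200_000
reference_reads = [w * WINDOW + k * 1000 for w in range(5) for k in range(10)]
sample_reads = [w * WINDOW + k * 1000
                for w in range(5)
                for k in range(2 if w == 2 else 10)]

ref_profile = window_counts(reference_reads, chrom_length)
sample_profile = window_counts(sample_reads, chrom_length)
print(sample_profile)                            # [10, 10, 2, 10, 10]
print(flag_losses(sample_profile, ref_profile))  # [2]
```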
Sequence variants in four genes underlying Bardet-Biedl syndrome in consanguineous families

PubMed Central

Ullah, Asmat; Umair, Muhammad; Yousaf, Maryam; Khan, Sher Alam; Nazim-ud-din, Muhammad; Shah, Khadim; Ahmad, Farooq; Azeem, Zahid; Ali, Ghazanfar; Alhaddad, Bader; Rafique, Afzal; Jan, Abid; Haack, Tobias B.; Strom, Tim M.; Meitinger, Thomas; Ghous, Tahseen

2017-01-01

Purpose: To investigate the molecular basis of Bardet-Biedl syndrome (BBS) in five consanguineous families of Pakistani origin. Methods: Linkage in two families (A and B) was established to BBS7 on chromosome 4q27, in family C to BBS8 on chromosome 14q32.1, and in family D to BBS10 on chromosome 12q21.2. Family E was investigated directly with exome sequence analysis. Results: Sanger sequencing revealed two novel mutations and three previously reported mutations in the BBS genes. These mutations include two deletions (c.580_582delGCA, c.1592_1597delTTCCAG) in the BBS7 gene, a missense mutation (p.Gln449His) in the BBS8 gene, a frameshift mutation (c.271_272insT) in the BBS10 gene, and a nonsense mutation (p.Ser40*) in the MKKS (BBS6) gene. Conclusions: Two novel mutations and three previously reported variants, identified in the present study, further extend the body of evidence implicating BBS6, BBS7, BBS8, and BBS10 in causing BBS. PMID:28761321
Progress Report for DOE DE-FG03-98ER20317 ''Regulation of the floral homeotic gene AGAMOUS'' Current and Final Funding Period: September 1, 2002, to December 31, 2002

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weigel, D.

2003-03-11

OAK-B135 Results obtained during this funding period: (1) Phylogenetic footprinting of AG regulatory sequences Sequences necessary and sufficient for AGAMOUS (AG) expression in the center of Arabidopsis flowers are located in the second intron, which is about 3 kb in size. This intron contains binding sites for two transcription factors, LEAFY (LFY) and WUSCHEL (WUS), which are direct activators of AG. We used the new method of phylogenetic shadowing to identify new regulatory elements. Among 29 Brassicaceae, several other motifs, but not the LFY and WUS binding sites previously identified, are largely invariant. Using reporter gene analyses, we tested six of these motifs and found that they are all functionally important for activity of AG regulatory sequences in A. thaliana. (2) Repression of AG by MADS box genes A candidate for repressing AG in the shoot apical meristem has been the MADS box gene FUL, since it is expressed in the shoot apical meristem and since an activated version (FUL:VP16) leads to ectopic AG expression in the shoot apical meristem. However, there is no ectopic AG expression in ful single mutants. We therefore started to generate VP16 fusions of several other MADS box genes expressed in the shoot apical meristem, to determine which of these might be candidates for FUL redundant genes. We found that AGL6:VP16 has a similar phenotype to FUL:VP16, suggesting that AGL6 and FUL interact. We are now testing this hypothesis. (3) Two candidate AG regulators, WOW and ULA Because the phylogenetic footprinting project has identified several new candidate regulatory motifs, of which at least one (the CCAATCA motif) has rather strong effects, we had decided to put the analysis of WOW and ULA on hold, and to focus on using the newly identified motifs as tools. We conducted a yeast one-hybrid screen with two of the conserved motifs, and identified several classes of transcription factors that can interact with them. 
One of these is encoded by the PAN gene, previously known to be expressed in a domain that overlaps the AG domain, but not known before to regulate AG. (4) New genetic modifiers of AG This part of the project was concluded in the previous funding period.
THAP1/DYT6 sequence variants in non-DYT1 early-onset primary dystonia in China and their effects on RNA expression.

PubMed

Cheng, Fu Bo; Ozelius, Laurie J; Wan, Xin Hua; Feng, Jia Chun; Ma, Ling Yan; Yang, Ying Mai; Wang, Lin

2012-02-01

Mutations in the THAP1 gene were recently identified as the cause of DYT6 primary dystonia. More than 40 mutations in this gene have been described in different populations. However, no previous report has identified sequence variations that affect the transcription of the THAP1 gene. In addition, the mutation frequency in Chinese early-onset primary dystonia has not been well characterized. One hundred and two unrelated patients with non-DYT1 early-onset primary dystonia (age at onset <26 years), family members of participants with mutations, and 200 neurologically normal controls were screened for THAP1 gene mutations. The effects of the identified mutations on RNA expression were analyzed using semi-quantitative real-time PCR. Seven sequence variants (c.63_66del TTTC, c.161G>T, c.224A>T, c.267G>A, c.339T>C, c.449A>C, and c.539T>C) were identified in this group of patients (6.9%). In this cohort, 15 subjects (seven unrelated patients and eight family members) were found to carry THAP1 sequence variants. Among these 15 subjects, 11 were symptomatic (penetrance of DYT6 was 73.3%), and seven of those presented with craniocervical involvement (63.6%). However, one patient manifested paroxysmal headshake, and one presented with essential hand tremor. Semi-quantitative real-time PCR indicated that a novel silent mutation (c.267G>A) decreased the expression of THAP1 in human lymphocytes. Our findings indicated that THAP1 sequence variants are not common in non-DYT1 early-onset primary dystonia in China and that the clinical manifestation may vary. One silent mutation (c.267G>A) was shown to affect THAP1 expression.
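The percentages in this abstract follow directly from the reported counts. The short sketch below recomputes them; the counts come from the abstract, while the variable names and the `pct` helper are illustrative.

```python
# Counts quoted in the abstract; names are illustrative only.
patients_screened = 102
patients_with_variants = 7
carriers = 15        # seven unrelated patients plus eight family members
symptomatic = 11     # carriers who manifested dystonia
craniocervical = 7   # symptomatic carriers with craniocervical involvement

def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

variant_frequency = pct(patients_with_variants, patients_screened)
penetrance = pct(symptomatic, carriers)
craniocervical_share = pct(craniocervical, symptomatic)

print(variant_frequency)     # 6.9
print(penetrance)            # 73.3
print(craniocervical_share)  # 63.6
```

Note that the 63.6% figure uses the 11 symptomatic carriers as its denominator, not all 15 carriers.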
Sequence analysis of sub-genotype D hepatitis B surface antigens isolated from Jeddah, Saudi Arabia.

PubMed

El Hadad, Sahar; Alakilli, Saleha; Rabah, Samar; Sabir, Jamal

2018-05-01

Little is known about the prevalence of HBV genotypes/sub-genotypes in Jeddah province, although the hepatitis B virus (HBV) was identified as the most predominant type of hepatitis in Saudi Arabia. To characterize HBV genotypes/sub-genotypes, serum samples from 15 patients with chronic HBV were collected and subjected to HBsAg gene amplification and sequence analysis. Phylogenetic analysis of the HBsAg gene sequences revealed that 11 (48%) isolates belonged to HBV/D while 4 (18%) were associated with HBV/C. Notably, a HBV/D sub-genotype phylogenetic tree identified that eight current isolates (72%) belonged to HBV/D1, whereas three isolates (28%) appeared to be more closely related to HBV/D5, although they formed a novel cluster supported by a branch with 99% bootstrap value. Isolates belonging to D1 were grouped in one branch and seemed to be more closely related to various strains isolated from different countries. For further determination of whether the three current isolates belonged to HBV/D5 or represented a novel sub-genotype, HBV/DA, whole HBV genome sequences would be required. In the present study, we verified that HBV/D1 is the most prevalent HBV sub-genotype in Jeddah, and identified novel variant mutations suggesting that an additional sub-genotype designated HBV/DA should be proposed. Overall, the results of the present HBsAg sequence analyses provide us with insights regarding the nucleotide differences between the present HBsAg /D isolates identified in the populace of Jeddah, Saudi Arabia and those previously isolated worldwide. Additional studies with large numbers of subjects in other areas might lead to the discovery of the specific HBV strain genotypes or even additional new sub-genotypes that are circulating in Saudi Arabia.
Genes encoding calmodulin-binding proteins in the Arabidopsis genome

NASA Technical Reports Server (NTRS)

Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.

2002-01-01

Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.
The Essential Genome of Escherichia coli K-12

PubMed Central

2018-01-01

ABSTRACT Transposon-directed insertion site sequencing (TraDIS) is a high-throughput method coupling transposon mutagenesis with short-fragment DNA sequencing. It is commonly used to identify essential genes. Single gene deletion libraries are considered the gold standard for identifying essential genes. Currently, the TraDIS method has not been benchmarked against such libraries, and therefore, it remains unclear whether the two methodologies are comparable. To address this, a high-density transposon library was constructed in Escherichia coli K-12. Essential genes predicted from sequencing of this library were compared to existing essential gene databases. To decrease false-positive identification of essential genes, statistical data analysis included corrections for both gene length and genome length. Through this analysis, new essential genes and genes previously incorrectly designated essential were identified. We show that manual analysis of TraDIS data reveals novel features that would not have been detected by statistical analysis alone. Examples include short essential regions within genes, orientation-dependent effects, and fine-resolution identification of genome and protein features. Recognition of these insertion profiles in transposon mutagenesis data sets will assist genome annotation of less well characterized genomes and provides new insights into bacterial physiology and biochemistry. PMID:29463657
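The idea of correcting for gene length and genome length can be illustrated with a minimal Python sketch. It is not the statistical analysis used in the study: the `Gene` class, the function names, and the `ratio_cutoff` parameter are hypothetical, and the simple density ratio stands in for whatever model the authors applied.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int  # 1-based, inclusive
    end: int    # inclusive


def insertion_index(gene, insertion_sites):
    """Unique transposon insertion sites per base pair of the gene (gene-length correction)."""
    length = gene.end - gene.start + 1
    hits = sum(1 for pos in insertion_sites if gene.start <= pos <= gene.end)
    return hits / length


def candidate_essential(genes, insertion_sites, genome_length, ratio_cutoff=0.1):
    """Flag genes whose insertion index falls far below the genome-wide density.

    ratio_cutoff is an illustrative assumption, not a value from the abstract.
    """
    unique_sites = set(insertion_sites)
    # Genome-length correction: expected insertions per base pair overall.
    genome_density = len(unique_sites) / genome_length
    return [
        g.name
        for g in genes
        if insertion_index(g, unique_sites) < ratio_cutoff * genome_density
    ]
```

A gene with no insertions in a densely mutagenised genome is flagged, whereas a gene carrying insertions at roughly the genome-wide rate is not. Short essential regions within genes, which the abstract notes were found by manual inspection, would be missed by a per-gene summary of this kind.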
Targeted next generation sequencing identifies functionally deleterious germline mutations in novel genes in early-onset/familial prostate cancer.

PubMed

Paulo, Paula; Maia, Sofia; Pinto, Carla; Pinto, Pedro; Monteiro, Augusta; Peixoto, Ana; Teixeira, Manuel R

2018-04-01

Considering that mutations in known prostate cancer (PrCa) predisposition genes, including those responsible for hereditary breast/ovarian cancer and Lynch syndromes, explain less than 5% of early-onset/familial PrCa, we have sequenced 94 genes associated with cancer predisposition using next generation sequencing (NGS) in a series of 121 PrCa patients. We found monoallelic truncating/functionally deleterious mutations in seven genes, including ATM and CHEK2, which have previously been associated with PrCa predisposition, and five new candidate PrCa associated genes involved in cancer predisposing recessive disorders, namely RAD51C, FANCD2, FANCI, CEP57 and RECQL4. Furthermore, using in silico pathogenicity prediction of missense variants among 18 genes associated with breast/ovarian cancer and/or Lynch syndrome, followed by KASP genotyping in 710 healthy controls, we identified "likely pathogenic" missense variants in ATM, BRIP1, CHEK2 and TP53. In conclusion, this study has identified putative PrCa predisposing germline mutations in 14.9% of early-onset/familial PrCa patients. Further data will be necessary to confirm the genetic heterogeneity of inherited PrCa predisposition hinted in this study.
Burkholderia sp. induces functional nodules on the South African invasive legume Dipogon lignosus (Phaseoleae) in New Zealand soils.

PubMed

Liu, Wendy Y Y; Ridgway, Hayley J; James, Trevor K; James, Euan K; Chen, Wen-Ming; Sprent, Janet I; Young, J Peter W; Andrews, Mitchell

2014-10-01

The South African invasive legume Dipogon lignosus (Phaseoleae) produces nodules with both determinate and indeterminate characteristics in New Zealand (NZ) soils. Ten bacterial isolates produced functional nodules on D. lignosus. The 16S ribosomal RNA (rRNA) gene sequences identified one isolate as Bradyrhizobium sp., one isolate as Rhizobium sp. and eight isolates as Burkholderia sp. The Bradyrhizobium sp. and Rhizobium sp. 16S rRNA sequences were identical to those of strains previously isolated from crop plants and may have originated from inocula used on crops. Both 16S rRNA and DNA recombinase A (recA) gene sequences placed the eight Burkholderia isolates separate from previously described Burkholderia rhizobial species. However, the isolates showed a very close relationship to Burkholderia rhizobial strains isolated from South African plants with respect to their nitrogenase iron protein (nifH), N-acyltransferase nodulation protein A (nodA) and N-acetylglucosaminyl transferase nodulation protein C (nodC) gene sequences. Gene sequences and enterobacterial repetitive intergenic consensus (ERIC) PCR and repetitive element palindromic PCR (rep-PCR) banding patterns indicated that the eight Burkholderia isolates separated into five clones of one strain and three of another. One strain was tested and shown to produce functional nodules on a range of South African plants previously reported to be nodulated by Burkholderia tuberum STM678(T) which was isolated from the Cape Region. Thus, evidence is strong that the Burkholderia strains isolated here originated in South Africa and were somehow transported with the plants from their native habitat to NZ. It is possible that the strains are of a new species capable of nodulating legumes.
Whole-genome sequencing identifies EN1 as a determinant of bone density and fracture

PubMed Central

Zheng, Hou-Feng; Forgetta, Vincenzo; Hsu, Yi-Hsiang; Estrada, Karol; Rosello-Diez, Alberto; Leo, Paul J; Dahia, Chitra L; Park-Min, Kyung Hyun; Tobias, Jonathan H; Kooperberg, Charles; Kleinman, Aaron; Styrkarsdottir, Unnur; Liu, Ching-Ti; Uggla, Charlotta; Evans, Daniel S; Nielson, Carrie M; Walter, Klaudia; Pettersson-Kymmer, Ulrika; McCarthy, Shane; Eriksson, Joel; Kwan, Tony; Jhamai, Mila; Trajanoska, Katerina; Memari, Yasin; Min, Josine; Huang, Jie; Danecek, Petr; Wilmot, Beth; Li, Rui; Chou, Wen-Chi; Mokry, Lauren E; Moayyeri, Alireza; Claussnitzer, Melina; Cheng, Chia-Ho; Cheung, Warren; Medina-Gómez, Carolina; Ge, Bing; Chen, Shu-Huang; Choi, Kwangbom; Oei, Ling; Fraser, James; Kraaij, Robert; Hibbs, Matthew A; Gregson, Celia L; Paquette, Denis; Hofman, Albert; Wibom, Carl; Tranah, Gregory J; Marshall, Mhairi; Gardiner, Brooke B; Cremin, Katie; Auer, Paul; Hsu, Li; Ring, Sue; Tung, Joyce Y; Thorleifsson, Gudmar; Enneman, Anke W; van Schoor, Natasja M; de Groot, Lisette C.P.G.M.; van der Velde, Nathalie; Melin, Beatrice; Kemp, John P; Christiansen, Claus; Sayers, Adrian; Zhou, Yanhua; Calderari, Sophie; van Rooij, Jeroen; Carlson, Chris; Peters, Ulrike; Berlivet, Soizik; Dostie, Josée; Uitterlinden, Andre G; Williams, Stephen R.; Farber, Charles; Grinberg, Daniel; LaCroix, Andrea Z; Haessler, Jeff; Chasman, Daniel I; Giulianini, Franco; Rose, Lynda M; Ridker, Paul M; Eisman, John A; Nguyen, Tuan V; Center, Jacqueline R; Nogues, Xavier; Garcia-Giralt, Natalia; Launer, Lenore L; Gudnason, Vilmunder; Mellström, Dan; Vandenput, Liesbeth; Karlsson, Magnus K; Ljunggren, Östen; Svensson, Olle; Hallmans, Göran; Rousseau, François; Giroux, Sylvie; Bussière, Johanne; Arp, Pascal P; Koromani, Fjorda; Prince, Richard L; Lewis, Joshua R; Langdahl, Bente L; Hermann, A Pernille; Jensen, Jens-Erik B; Kaptoge, Stephen; Khaw, Kay-Tee; Reeve, Jonathan; Formosa, Melissa M; Xuereb-Anastasi, Angela; Åkesson, Kristina; McGuigan, Fiona E; Garg, Gaurav; Olmos, Jose M; Zarrabeitia, Maria T; Riancho, Jose A; Ralston, Stuart H; Alonso, Nerea; Jiang, Xi; Goltzman, David; Pastinen, Tomi; Grundberg, Elin; Gauguier, Dominique; Orwoll, Eric S; Karasik, David; Davey-Smith, George; Smith, Albert V; Siggeirsdottir, Kristin; Harris, Tamara B; Zillikens, M Carola; van Meurs, Joyce BJ; Thorsteinsdottir, Unnur; Maurano, Matthew T; Timpson, Nicholas J; Soranzo, Nicole; Durbin, Richard; Wilson, Scott G; Ntzani, Evangelia E; Brown, Matthew A; Stefansson, Kari; Hinds, David A; Spector, Tim; Cupples, L Adrienne; Ohlsson, Claes; Greenwood, Celia MT; Jackson, Rebecca D; Rowe, David W; Loomis, Cynthia A; Evans, David M; Ackert-Bicknell, Cheryl L; Joyner, Alexandra L; Duncan, Emma L; Kiel, Douglas P; Rivadeneira, Fernando; Richards, J Brent

2016-01-01

SUMMARY The extent to which low-frequency (minor allele frequency [MAF] between 1–5%) and rare (MAF ≤ 1%) variants contribute to complex traits and disease in the general population is largely unknown. Bone mineral density (BMD) is highly heritable, is a major predictor of osteoporotic fractures and has been previously associated with common genetic variants (refs. 1–8) and rare, population-specific, coding variants (ref. 9). Here we identify novel non-coding genetic variants with large effects on BMD (n_total = 53,236) and fracture (n_total = 508,253) in individuals of European ancestry from the general population. Associations for BMD were derived from whole-genome sequencing (n = 2,882 from UK10K), whole-exome sequencing (n = 3,549), deep imputation of genotyped samples using a combined UK10K/1000Genomes reference panel (n = 26,534), and de-novo replication genotyping (n = 20,271). We identified a low-frequency non-coding variant near a novel locus, EN1, with an effect size 4-fold larger than the mean of previously reported common variants for lumbar spine BMD (ref. 8) (rs11692564[T], MAF = 1.7%, replication effect size = +0.20 standard deviations [SD], P_meta = 2×10^−14), which was also associated with a decreased risk of fracture (OR = 0.85; P = 2×10^−11; n_cases = 98,742 and n_controls = 409,511). Using an En1Cre/flox mouse model, we observed that conditional loss of En1 results in low bone mass, likely as a consequence of high bone turn-over. We also identified a novel low-frequency non-coding variant with large effects on BMD near WNT16 (rs148771817[T], MAF = 1.1%, replication effect size = +0.39 SD, P_meta = 1×10^−11). In general, there was an excess of association signals arising from deleterious coding and conserved non-coding variants. These findings provide evidence that low-frequency non-coding variants have large effects on BMD and fracture, thereby providing rationale for whole-genome sequencing and improved imputation reference panels to study the genetic architecture of complex traits and disease in the general population. PMID:26367794
Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

PubMed Central

Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

2015-01-01

Sporisorium scitamineum is a biotrophic fungus responsible for sugarcane smut, a disease with worldwide distribution. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere, achieved by a combination of PacBio long reads and Illumina short reads, as well as a draft sequence of a second fungal strain. Comparison with previously available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA out of an estimated 130 copies. Chromosomal reorganizations were detected in comparison with sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. Fungal transcriptome profiling in vitro and during interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcription factors. The fungus also modulates transcription of genes related to survival against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected, and some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNA-seq data were also used to confirm gene predictions. PMID:26065709
SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

PubMed Central

2010-01-01

Background: High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing exacerbates this problem and necessitates customisable pre-processing algorithms. Results: SeqTrim has been implemented both as a Web application and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions: SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
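The kind of staged pre-processing described above can be sketched in a few lines of Python. This is not SeqTrim's code or its algorithms: the function names, the exact-match adaptor search, and the `min_q` and `min_len` defaults are illustrative assumptions. The per-stage log mirrors the idea of reporting what happened to a read at every step.

```python
def remove_adaptor(seq, quals, adaptor):
    """Cut the read at the first exact occurrence of a 3' adaptor (assumed exact match)."""
    i = seq.find(adaptor)
    if i == -1:
        return seq, quals
    return seq[:i], quals[:i]


def trim_low_quality(seq, quals, min_q=20):
    """Trim bases with quality below min_q from both ends of the read."""
    start, end = 0, len(seq)
    while start < end and quals[start] < min_q:
        start += 1
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


def preprocess(seq, quals, adaptor, min_q=20, min_len=10):
    """Run the stages in order and record the read length after each one."""
    log = []
    seq, quals = remove_adaptor(seq, quals, adaptor)
    log.append(("adaptor", len(seq)))
    seq, quals = trim_low_quality(seq, quals, min_q)
    log.append(("quality", len(seq)))
    keep = len(seq) >= min_len  # discard reads that end up too short
    return seq, keep, log
```

A production pre-processor would additionally tolerate mismatches in the adaptor, screen for vector and contaminant sequences, and detect chimeric reads, all of which are omitted here.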
Beyond barcoding: a mitochondrial genomics approach to molecular phylogenetics and diagnostics of blowflies (Diptera: Calliphoridae).

PubMed

Nelson, Leigh A; Lambkin, Christine L; Batterham, Philip; Wallman, James F; Dowton, Mark; Whiting, Michael F; Yeates, David K; Cameron, Stephen L

2012-12-15

Members of the Calliphoridae (blowflies) are significant for medical and veterinary management, due to the ability of some species to consume living flesh as larvae, and for forensic investigations, due to the ability of others to develop in corpses. Due to the difficulty of accurately identifying larval blowflies to species, there is a need for DNA-based diagnostics for this family; however, the widely used DNA-barcoding marker, cox1, has been shown to fail for several groups within this family. Additionally, many phylogenetic relationships within the Calliphoridae are still unresolved, particularly deeper-level relationships. Sequencing whole mt genomes has been demonstrated as an effective method both for identifying the most informative diagnostic markers and for resolving phylogenetic relationships. Twenty-seven complete, or nearly so, mt genomes were sequenced representing 13 species, seven genera and four calliphorid subfamilies and a member of the related family Tachinidae. PCR and sequencing primers developed for sequencing one calliphorid species could be reused to sequence related species within the same superfamily with success rates ranging from 61% to 100%, demonstrating the speed and efficiency with which an mt genome dataset can be assembled. Comparison of molecular divergences for each of the 13 protein-coding genes and 2 ribosomal RNA genes at a range of taxonomic scales identified novel targets for development as diagnostic markers, which were 117-200% more variable than the markers that have been used previously in calliphorids. Phylogenetic analysis of whole mt genome sequences resulted in much stronger support for family and subfamily-level relationships. The Calliphoridae are polyphyletic, with the Polleninae more closely related to the Tachinidae, and the Sarcophagidae are the sister group of the remaining calliphorids. Within the Calliphoridae, there was strong support for the monophyly of the Chrysomyinae and Luciliinae and for the sister-grouping of Luciliinae with Calliphorinae. Relationships within Chrysomya were not well resolved. Whole mt genome data supported the previously demonstrated paraphyly of Lucilia cuprina with respect to L. sericata and allowed us to conclude that it is due to hybrid introgression prior to the last common ancestor of modern sericata populations, rather than due to recent hybridisation, nuclear pseudogenes or incomplete lineage sorting. Copyright © 2012 Elsevier B.V. All rights reserved.
Identification of a pathogenic FTO mutation by next-generation sequencing in a newborn with growth retardation and developmental delay.

PubMed

Daoud, Hussein; Zhang, Dong; McMurray, Fiona; Yu, Andrea; Luco, Stephanie M; Vanstone, Jason; Jarinova, Olga; Carson, Nancy; Wickens, James; Shishodia, Shifali; Choi, Hwanho; McDonough, Michael A; Schofield, Christopher J; Harper, Mary-Ellen; Dyment, David A; Armour, Christine M

2016-03-01

A homozygous loss-of-function mutation p.(Arg316Gln) in the fat mass and obesity-associated (FTO) gene, which encodes for an iron and 2-oxoglutarate-dependent oxygenase, was previously identified in a large family in which nine affected individuals present with a lethal syndrome characterised by growth retardation and multiple malformations. To date, no other pathogenic mutation in FTO has been identified as a cause of multiple congenital malformations. We investigated a 21-month-old girl who presented distinctive facial features, failure to thrive, global developmental delay, left ventricular cardiac hypertrophy, reduced vision and bilateral hearing loss. We performed targeted next-generation sequencing of 4813 clinically relevant genes in the patient and her parents. We identified a novel FTO homozygous missense mutation (c.956C>T; p.(Ser319Phe)) in the affected individual. This mutation affects a highly conserved residue located in the same functional domain as the previously characterised mutation p.(Arg316Gln). Biochemical studies reveal that p.(Ser319Phe) FTO has reduced 2-oxoglutarate turnover and N-methyl-nucleoside demethylase activity. Our findings are consistent with previous reports that homozygous mutations in FTO can lead to rare growth retardation and developmental delay syndrome, and further support the proposal that FTO plays an important role in early development of human central nervous and cardiovascular systems. Published by the BMJ Publishing Group Limited.
Non-invasive fetal sex determination by maternal plasma sequencing and application in X-linked disorder counseling.

PubMed

Pan, Xiaoyu; Zhang, Chunlei; Li, Xuchao; Chen, Shengpei; Ge, Huijuan; Zhang, Yanyan; Chen, Fang; Jiang, Hui; Jiang, Fuman; Zhang, Hongyun; Wang, Wei; Zhang, Xiuqing

2014-12-01

The aim was to develop a fetal sex determination method based on maternal plasma sequencing (MPS) and to assess its performance and potential use in X-linked disorder counseling. MPS data from 900 cases in a previous study were reviewed, of which 100 and 800 cases were used as training and validation sets, respectively. The percentage of uniquely mapped sequencing reads on the Y chromosome was calculated and used to classify male and female cases. Eight pregnant women who were carriers of Duchenne muscular dystrophy (DMD) mutations were recruited, and their plasma was subjected to multiplex sequencing and fetal sex determination analysis. In the training set, the method reached a sensitivity of 96% and a false positive rate of 0% for detection of male cases. The blinded validation results showed that 421 of 423 male cases and 374 of 377 female cases were correctly identified, giving a sensitivity and specificity of 99.53% and 99.20% for fetal sex determination, from as early as 12 gestational weeks. Fetal sex was correctly identified for all eight DMD genetic counseling cases, as confirmed by amniocentesis. Based on MPS, high accuracy of non-invasive fetal sex determination can be achieved. This method can potentially be used for prenatal genetic counseling.
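The classification rule and the reported validation figures can be reproduced with a short Python sketch. The function names and the `cutoff_pct` parameter are hypothetical; the abstract does not state the cutoff that was learned from the training set. The counts passed in at the end are the validation counts given above, treating male as the positive class.

```python
def y_read_percentage(y_reads, total_unique_reads):
    """Percentage of uniquely mapped reads that fall on the Y chromosome."""
    return 100.0 * y_reads / total_unique_reads


def classify_sex(y_pct, cutoff_pct):
    """Call a case male when the Y-read percentage reaches the cutoff.

    cutoff_pct is assumed to come from a training set; no value is given in the abstract.
    """
    return "male" if y_pct >= cutoff_pct else "female"


def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Validation set: 421 of 423 males and 374 of 377 females called correctly.
sens, spec = sensitivity_specificity(tp=421, fn=2, tn=374, fp=3)
print(f"sensitivity={sens:.2%} specificity={spec:.2%}")
# prints "sensitivity=99.53% specificity=99.20%"
```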
E-RNAi: a web application for the multi-species design of RNAi reagents—2010 update

PubMed Central

Horn, Thomas; Boutros, Michael

2010-01-01

The design of RNA interference (RNAi) reagents is an essential step for performing loss-of-function studies in many experimental systems. The availability of sequenced and annotated genomes greatly facilitates RNAi experiments in an increasing number of organisms that were previously not genetically tractable. The E-RNAi web-service, accessible at http://www.e-rnai.org/, provides a computational resource for the optimized design and evaluation of RNAi reagents. The 2010 update of E-RNAi now covers 12 genomes, including Drosophila, Caenorhabditis elegans, human, emerging model organisms such as Schmidtea mediterranea and Acyrthosiphon pisum, as well as the medically relevant vectors Anopheles gambiae and Aedes aegypti. The web service calculates RNAi reagents based on the input of target sequences, sequence identifiers or by visual selection of target regions through a genome browser interface. It identifies optimized RNAi target-sites by ranking sequences according to their predicted specificity, efficiency and complexity. E-RNAi also facilitates the design of secondary RNAi reagents for validation experiments, evaluation of pooled siRNA reagents and batch design. Results are presented online, as a downloadable HTML report and as tab-delimited files. PMID:20444868
Epidemiological characterization of a nosocomial outbreak of extended spectrum β-lactamase Escherichia coli ST-131 confirms the clinical value of core genome multilocus sequence typing.

PubMed

Woksepp, Hanna; Ryberg, Anna; Berglind, Linda; Schön, Thomas; Söderman, Jan

2017-12-01

Enhanced precision of epidemiological typing in clinically suspected nosocomial outbreaks is crucial. Our aim was to investigate whether single nucleotide polymorphism (SNP) analysis and core genome (cg) multilocus sequence typing (MLST) of whole genome sequencing (WGS) data would more reliably identify a nosocomial outbreak, compared to earlier molecular typing methods. Sixteen isolates from a nosocomial outbreak of ESBL E. coli ST-131 in southeastern Sweden and three control strains were subjected to WGS. Sequences were explored by SNP analysis and cgMLST. cgMLST clearly differentiated between the outbreak isolates and the control isolates (>1400 differences). All clinically identified outbreak isolates showed close clustering (≤2 allele differences), except for two isolates (>50 allele differences). These data confirmed that the isolates with >50 differing genes did not belong to the nosocomial outbreak. The number of SNPs within the outbreak was ≤7, whereas the two discrepant isolates had >700 SNPs. Two of the ESBL E. coli ST-131 isolates did not belong to the clinically identified outbreak. Our results illustrate the power of WGS in terms of resolution, which may avoid overestimation of patients belonging to outbreaks as judged from epidemiological data and previously employed molecular methods with lower discriminatory ability. © 2017 APMIS. Published by John Wiley & Sons Ltd.
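Counting allele differences between cgMLST profiles, as used above to separate outbreak from non-outbreak isolates, can be sketched as follows in Python. The profile representation (a list of allele numbers per locus, with `None` for a missing locus), the function names, and the `max_diff` threshold are illustrative assumptions rather than the study's pipeline.

```python
def allele_differences(profile_a, profile_b):
    """Count loci whose allele numbers differ, skipping loci missing in either profile."""
    return sum(
        1
        for a, b in zip(profile_a, profile_b)
        if a is not None and b is not None and a != b
    )


def outbreak_members(profiles, reference, max_diff):
    """Names of isolates within max_diff allele differences of the reference isolate."""
    ref = profiles[reference]
    return sorted(
        name
        for name, profile in profiles.items()
        if allele_differences(ref, profile) <= max_diff
    )


# Toy profiles over four loci; real cgMLST schemes cover thousands of loci.
profiles = {
    "iso1": [1, 1, 1, 1],
    "iso2": [1, 1, 1, 2],
    "iso3": [2, 2, 2, 2],
}
members = outbreak_members(profiles, reference="iso1", max_diff=2)
```

Comparing every isolate with a single reference is a simplification; pairwise distances with single-linkage clustering would be one alternative design.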
Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.

PubMed

Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto

2015-10-01

CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.

Transcriptome sequencing and annotation of the microalgae Dunaliella tertiolecta: Pathway description and gene discovery for production of next-generation biofuels

PubMed Central

2011-01-01

Background: Biodiesel or ethanol derived from lipids or starch produced by microalgae may overcome many of the sustainability challenges previously ascribed to petroleum-based fuels and first generation plant-based biofuels. The paucity of microalgae genome sequences, however, limits gene-based biofuel feedstock optimization studies. Here we describe the sequencing and de novo transcriptome assembly for the non-model microalgal species, Dunaliella tertiolecta, and identify pathways and genes of importance related to biofuel production. Results: Next generation DNA pyrosequencing technology applied to D. tertiolecta transcripts produced 1,363,336 high quality reads with an average length of 400 bases. Following quality and size trimming, ~45% of the high quality reads were assembled into 33,307 isotigs with a 31-fold coverage and 376,482 singletons. Assembled sequences and singletons were subjected to BLAST similarity searches and annotated with Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology (KO) identifiers. These analyses identified the majority of lipid and starch biosynthesis and catabolism pathways in D. tertiolecta. Conclusions: The construction of metabolic pathways involved in the biosynthesis and catabolism of fatty acids, triacylglycerols, and starch in D. tertiolecta as well as the assembled transcriptome provide a foundation for the molecular genetics and functional genomics required to direct metabolic engineering efforts that seek to enhance the quantity and character of microalgae-based biofuel feedstock. PMID:21401935
Transcriptional profiling reveals the expression of novel genes in response to various stimuli in the human dermatophyte Trichophyton rubrum

PubMed Central

2010-01-01

Background: Cutaneous mycoses are common human infections among healthy and immunocompromised hosts, and the anthropophilic fungus Trichophyton rubrum is the most prevalent microorganism isolated from such clinical cases worldwide. The aim of this study was to determine the transcriptional profile of T. rubrum exposed to various stimuli in order to obtain insights into the responses of this pathogen to different environmental challenges. Therefore, we generated an expressed sequence tag (EST) collection by constructing one cDNA library and nine suppression subtractive hybridization libraries. Results: The 1388 unigenes identified in this study were functionally classified based on the Munich Information Center for Protein Sequences (MIPS) categories. The identified proteins were involved in transcriptional regulation, cellular defense and stress, protein degradation, signaling, transport, and secretion, among other functions. Analysis of these unigenes revealed 575 T. rubrum sequences that had not been previously deposited in public databases. Conclusion: In this study, we identified novel T. rubrum genes that will be useful for ORF prediction in genome sequencing and facilitating functional genome analysis. Annotation of these expressed genes revealed metabolic adaptations of T. rubrum to carbon sources, ambient pH shifts, and various antifungal drugs used in medical practice. Furthermore, challenging T. rubrum with cytotoxic drugs and ambient pH shifts extended our understanding of the molecular events possibly involved in the infectious process and resistance to antifungal drugs. PMID:20144196
Exome Sequencing Identifies a Novel CEACAM16 Mutation Associated with Autosomal Dominant Nonsyndromic Hearing Loss DFNA4B in a Chinese Family

PubMed Central

He, Chufeng; Li, Haibo; Qing, Jie; Grati, Mhamed; Hu, Zhengmao; Li, Jiada; Hu, Yiqiao; Xia, Kun; Mei, Lingyun; Wang, Xingwei; Yu, Jianjun; Chen, Hongsheng; Jiang, Lu; Liu, Yalan; Men, Meichao; Zhang, Hailin; Guan, Liping; Xiao, Jingjing; Zhang, Jianguo; Liu, Xuezhong; Feng, Yong

2014-01-01

Autosomal dominant nonsyndromic hearing loss (ADNSHL/DFNA) is a highly genetically heterogeneous disorder. Hitherto only about 30 ADNSHL-causing genes have been identified and many unknown genes remain to be discovered. In this research, genome-wide linkage analysis mapped the disease locus to a 4.3 Mb region on chromosome 19q13 in SY-026, a five-generation nonconsanguineous Chinese family affected by late-onset and progressive ADNSHL. This linkage region showed partial overlap with the previously reported DFNA4. Simultaneously, probands were analyzed using exome capture followed by next generation sequencing. Encouragingly, a heterozygous missense mutation, c.505G>A (p.G169R) in exon 3 of the CEACAM16 gene (carcinoembryonic antigen-related cell adhesion molecule 16), was identified via this combined strategy. Sanger sequencing verified that the mutation co-segregated with hearing loss in the family and that it was not present in 200 unrelated control subjects with matched ancestry. This is the second report in the literature of a family with ADNSHL caused by CEACAM16 mutation. Immunofluorescence staining and Western blots also prove CEACAM16 to be a secreted protein. Furthermore, our studies in transfected HEK293T cells show that the secretion efficacy of the mutant CEACAM16 is much lower than that of the wild-type, suggesting a deleterious effect of the sequence variant. PMID:25589040
Exome sequencing identifies a novel CEACAM16 mutation associated with autosomal dominant nonsyndromic hearing loss DFNA4B in a Chinese family.

PubMed

Wang, Honghan; Wang, Xinwei; He, Chufeng; Li, Haibo; Qing, Jie; Grati, Mhamed; Hu, Zhengmao; Li, Jiada; Hu, Yiqiao; Xia, Kun; Mei, Lingyun; Wang, Xingwei; Yu, Jianjun; Chen, Hongsheng; Jiang, Lu; Liu, Yalan; Men, Meichao; Zhang, Hailin; Guan, Liping; Xiao, Jingjing; Zhang, Jianguo; Liu, Xuezhong; Feng, Yong

2015-03-01

Autosomal dominant nonsyndromic hearing loss (ADNSHL/DFNA) is a highly genetically heterogeneous disorder. Hitherto only about 30 ADNSHL-causing genes have been identified and many unknown genes remain to be discovered. In this research, genome-wide linkage analysis mapped the disease locus to a 4.3 Mb region on chromosome 19q13 in SY-026, a five-generation nonconsanguineous Chinese family affected by late-onset and progressive ADNSHL. This linkage region showed partial overlap with the previously reported DFNA4. Simultaneously, probands were analyzed using exome capture followed by next-generation sequencing. Encouragingly, a heterozygous missense mutation, c.505G>A (p.G169R) in exon 3 of the CEACAM16 gene (carcinoembryonic antigen-related cell adhesion molecule 16), was identified via this combined strategy. Sanger sequencing verified that the mutation co-segregated with hearing loss in the family and that it was not present in 200 unrelated control subjects with matched ancestry. This is the second report in the literature of a family with ADNSHL caused by CEACAM16 mutation. Immunofluorescence staining and western blots also prove CEACAM16 to be a secreted protein. Furthermore, our studies in transfected HEK293T cells show that the secretion efficacy of the mutant CEACAM16 is much lower than that of the wild type, suggesting a deleterious effect of the sequence variant.
Analysis of xylem formation in pine by cDNA sequencing

NASA Technical Reports Server (NTRS)

Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.;

1998-01-01

Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

The global prevalence of HFE and non-HFE hemochromatosis estimated from analysis of next-generation sequencing data.

PubMed

Wallace, Daniel F; Subramaniam, V Nathan

2016-06-01

The prevalence of HFE-related hereditary hemochromatosis (HH) among European populations has been well studied. There are no prevalence data for atypical forms of HH caused by mutations in HFE2, HAMP, TFR2, or SLC40A1. The purpose of this study was to estimate the population prevalence of these non-HFE forms of HH. A list of HH pathogenic variants in publicly available next-generation sequence (NGS) databases was compiled and allele frequencies were determined. Of 161 variants previously associated with HH, 43 were represented among the NGS data sets; an additional 40 unreported functional variants also were identified. The predicted prevalence of HFE HH and the p.Cys282Tyr mutation closely matched previous estimates from similar populations. Of the non-HFE forms of iron overload, TFR2-, HFE2-, and HAMP-related forms are predicted to be rare, with pathogenic allele frequencies in the range of 0.00007 to 0.0005. Significantly, SLC40A1 variants that have been previously associated with autosomal-dominant ferroportin disease were identified in several populations (pathogenic allele frequency 0.0004), being most prevalent among Africans. We have, for the first time, estimated the population prevalence of non-HFE HH. This methodology could be applied to estimate the population prevalence of a wide variety of genetic disorders. Genet Med 18(6), 618-626.
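To make the step from allele frequency to predicted prevalence concrete, the following is a minimal sketch that assumes Hardy-Weinberg equilibrium and full penetrance. The abstract does not state which model the authors used, so both the model and the function names here are illustrative assumptions. The input frequencies are the ones quoted in the abstract.

```python
# Illustrative sketch only: Hardy-Weinberg equilibrium and full penetrance
# are assumptions, not the method reported in the abstract.

def recessive_genotype_frequency(q):
    """Expected fraction of people carrying two pathogenic alleles (q squared),
    where q is the combined pathogenic allele frequency for one gene."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return q ** 2


def dominant_carrier_frequency(q):
    """Expected fraction of people carrying at least one pathogenic allele,
    1 - (1 - q)^2, relevant to an autosomal-dominant condition."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 1.0 - (1.0 - q) ** 2


if __name__ == "__main__":
    # Range quoted for the rare recessive forms (TFR2, HFE2, HAMP).
    for q in (0.00007, 0.0005):
        freq = recessive_genotype_frequency(q)
        print(f"q = {q}: about 1 in {round(1 / freq):,} (recessive)")
    # Frequency quoted for SLC40A1 (autosomal dominant).
    q = 0.0004
    freq = dominant_carrier_frequency(q)
    print(f"q = {q}: about 1 in {round(1 / freq):,} carriers (dominant)")
```

Under these assumptions, a recessive form scales with the square of the allele frequency, which is why allele frequencies of this size translate into very rare disease, whereas a dominant form scales roughly with twice the allele frequency.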
Intriguing olfactory proteins from the yellow fever mosquito, Aedes aegypti

NASA Astrophysics Data System (ADS)

Ishida, Yuko; Chen, Angela M.; Tsuruda, Jennifer M.; Cornel, Anthon J.; Debboun, Mustapha; Leal, Walter S.

2004-09-01

Four antennae-specific proteins (AaegOBP1, AaegOBP2, AaegOBP3, and AaegASP1) were isolated from the yellow fever mosquito, Aedes aegypti, and their full-length cDNAs were cloned. RT-PCR indicated that they are expressed in female and, to a lesser extent, in male antennae, but not in control tissues (legs). AaegOBP1 and AaegOBP3 showed significant similarity to previously identified mosquito odorant-binding proteins (OBPs) in cysteine spacing pattern and sequence. Two of the isolated proteins have a total of eight cysteine residues. The similarity of the spacing pattern of the cysteine residues and amino acid sequence to those of previously identified olfactory proteins suggests that one of the cysteine-rich proteins (AaegOBP2) is an OBP. The other (AaegASP1) did not belong to any group of known OBPs. Structural analyses indicate that six of the cysteine residues in AaegOBP2 are linked in a similar pattern to the previously known cysteine pairing in OBPs, i.e., Cys-24–Cys-55, Cys-51–Cys-104, Cys-95–Cys-113. The additional disulfide bridge, Cys-38–Cys-125, knits the extended C-terminal segment of the protein to a predicted α2-helix. As indicated by circular dichroism (CD) spectra, the extra rigidity seems to prevent the predicted formation of a C-terminal α-helix at low pH.
Co-occurrence of anaerobic bacteria in colorectal carcinomas.

PubMed

Warren, René L; Freeman, Douglas J; Pleasance, Stephen; Watson, Peter; Moore, Richard A; Cochrane, Kyla; Allen-Vercoe, Emma; Holt, Robert A

2013-05-15

Numerous cancers have been linked to microorganisms. Given that colorectal cancer is a leading cause of cancer deaths and the colon is continuously exposed to a high diversity of microbes, the relationship between gut mucosal microbiome and colorectal cancer needs to be explored. Metagenomic studies have shown an association between Fusobacterium species and colorectal carcinoma. Here, we have extended these studies with deeper sequencing of a much larger number (n = 130) of colorectal carcinoma and matched normal control tissues. We analyzed these data using co-occurrence networks in order to identify microbe-microbe and host-microbe associations specific to tumors. We confirmed tumor over-representation of Fusobacterium species and observed significant co-occurrence within individual tumors of Fusobacterium, Leptotrichia and Campylobacter species. This polymicrobial signature was associated with over-expression of numerous host genes, including the gene encoding the pro-inflammatory chemokine Interleukin-8. The tumor-associated bacteria we have identified are all Gram-negative anaerobes, recognized previously as constituents of the oral microbiome, which are capable of causing infection. We isolated a novel strain of Campylobacter showae from a colorectal tumor specimen. This strain is substantially diverged from a previously sequenced oral Campylobacter showae isolate, carries potential virulence genes, and aggregates with a previously isolated tumor strain of Fusobacterium nucleatum. A polymicrobial signature of Gram-negative anaerobic bacteria is associated with colorectal carcinoma tissue.
Culture and Next-generation sequencing-based drug susceptibility testing unveil high levels of drug-resistant-TB in Djibouti: results from the first national survey.

PubMed

Tagliani, Elisa; Hassan, Mohamed Osman; Waberi, Yacine; De Filippo, Maria Rosaria; Falzon, Dennis; Dean, Anna; Zignol, Matteo; Supply, Philip; Abdoulkader, Mohamed Ali; Hassangue, Hawa; Cirillo, Daniela Maria

2017-12-15

Djibouti is a small country in the Horn of Africa with a high TB incidence (378/100,000 in 2015). Multidrug-resistant TB (MDR-TB) and resistance to second-line agents have been previously identified in the country but the extent of the problem has yet to be quantified. A national survey was conducted to estimate the proportion of MDR-TB among a representative sample of TB patients. Sputum was tested using Xpert MTB/RIF and samples positive for MTB and resistant to rifampicin underwent first-line phenotypic susceptibility testing. The TB supranational reference laboratory in Milan, Italy, undertook external quality assurance, genotypic testing based on whole genome and targeted-deep sequencing and phylogenetic studies. 301 new and 66 previously treated TB cases were enrolled. MDR-TB was detected in 34 patients: 4.7% of new and 31% of previously treated cases. Resistance to pyrazinamide, aminoglycosides and capreomycin was detected in 68%, 18% and 29% of MDR-TB strains respectively, while resistance to fluoroquinolones was not detected. Cluster analysis identified transmission of MDR-TB as a critical factor fostering drug resistance in the country. Levels of MDR-TB in Djibouti are among the highest on the African continent. The high prevalence of resistance to pyrazinamide and second-line injectable agents has important implications for treatment regimens.
Bacterial diversity of autotrophic enriched cultures from remote, glacial Antarctic, Alpine and Andean aerosol, snow and soil samples

NASA Astrophysics Data System (ADS)

González-Toril, E.; Amils, R.; Delmas, R. J.; Petit, J.-R.; Komárek, J.; Elster, J.

2009-01-01

Four different communities and one culture of autotrophic microbial assemblages were obtained by incubation of samples collected from high elevation snow in the Alps (Mt. Blanc area) and the Andes (Nevado Illimani summit, Bolivia), from Antarctic aerosol (French station Dumont d'Urville) and a maritime Antarctic soil (King George Island, South Shetlands, Uruguay Station Artigas), in a minimal mineral (oligotrophic) medium. Molecular analysis of more than 200 16S rRNA gene sequences showed that all cultured cells belong to the Bacteria domain. Phylogenetic comparison with the currently available rDNA database allowed sequences belonging to the Proteobacteria (Alpha-, Beta- and Gamma-proteobacteria), Actinobacteria and Bacteroidetes phyla to be identified. The Andes snow culture was the richest in bacterial diversity (eight microorganisms identified) and the marine Antarctic soil the poorest (only one). Snow samples from Col du Midi (Alps) and the Andes shared the highest number of identified microorganisms (Agrobacterium, Limnobacter, Aquiflexus and two uncultured Alphaproteobacteria clones). These two sampling sites also shared four sequences with the Antarctic aerosol sample (Limnobacter, Pseudonocardia and an uncultured Alphaproteobacteria clone). The only microorganism identified in the Antarctica soil (Brevundimonas sp.) was also detected in the Antarctic aerosol. Most of the identified microorganisms had been detected previously in cold environments, marine sediments, soils and rocks. Air current dispersal is the best model to explain the presence of very specific microorganisms, like those identified in this work, in environments very distant and very different from each other.
De novo assembly and characterization of the Trichuris trichiura adult worm transcriptome using Ion Torrent sequencing.

PubMed

Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C

2016-07-01

Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. Besides, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
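The N50 reported for this assembly is a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases. A minimal sketch of the usual calculation is given below; the function name and the toy contig lengths are illustrative and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the contig at which the running
    total first reaches half of the total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    for length in lengths:
        running_total += length
        if running_total >= half_total:
            return length


if __name__ == "__main__":
    # Toy example with made-up lengths (total 54, half 27).
    toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_lengths))
```

Because N50 depends on the total assembly size, values are only comparable between assemblies of similar total length.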
Polymorphism and selection in the major histocompatibility complex DRA and DQA genes in the family Equidae.

PubMed

Janova, Eva; Matiasovic, Jan; Vahala, Jiri; Vodicka, Roman; Van Dyk, Enette; Horin, Petr

2009-07-01

The major histocompatibility complex genes coding for antigen binding and presenting molecules are the most polymorphic genes in the vertebrate genome. We studied the DRA and DQA gene polymorphism of the family Equidae. In addition to 11 previously reported DRA and 24 DQA alleles, six new DRA sequences and 13 new DQA alleles were identified in the genus Equus. Phylogenetic analysis of both DRA and DQA sequences provided evidence for trans-species polymorphism in the family Equidae. The phylogenetic trees differed from species relationships defined by standard taxonomy of Equidae and from trees based on mitochondrial or neutral gene sequence data. Analysis of selection showed differences between the less variable DRA and more variable DQA genes. DRA alleles were more often shared by more species. The DQA sequences analysed showed strong amongst-species positive selection; the selected amino acid positions mostly corresponded to selected positions in rodent and human DQA genes.
SxtA gene sequence analysis of dinoflagellate Alexandrium minutum

NASA Astrophysics Data System (ADS)

Norshaha, Safida Anira; Latib, Norhidayu Abdul; Usup, Gires; Yusof, Nurul Yuziana Mohd

2015-09-01

The dinoflagellate Alexandrium minutum is typically known for the production of potent neurotoxins such as saxitoxin, affecting the health of human seafood consumers via paralytic shellfish poisoning (PSP). This phenomenon is related to harmful algal blooms (HABs), which are believed to be influenced by environmental and nutritional factors. A previous study revealed that sxtA is a starting gene involved in the saxitoxin production pathway. The aim of this study was to analyse the sequence of the sxtA gene in A. minutum. The dinoflagellate culture was grown at 26°C with a 16:8-hour light:dark photocycle. After the samples were harvested, RNA was extracted, and complementary DNA (cDNA) was synthesised and amplified by polymerase chain reaction (PCR). The PCR products were then purified and cloned before sequencing. The sxtA sequence obtained was then analyzed in order to identify the presence of the sxtA gene in Alexandrium minutum.
Cognitive Dissonance as an Instructional Tool for Understanding Chemical Representations

NASA Astrophysics Data System (ADS)

Corradi, David; Clarebout, Geraldine; Elen, Jan

2015-10-01

Previous research on multiple external representations (MER) indicates that sequencing representations (compared with presenting them as a whole) can, in some cases, increase conceptual understanding if there is interference between internal and external representations. We tested this mechanism by sequencing different combinations of scientific and abstract chemical representations and presenting them to 133 learners with low prior knowledge of the represented domain. The results provide insight into three separate mechanisms of learning with MER. (1) A memory effect (number of ideas reproduced) and (2) an accuracy effect (correctness of these ideas) occur when two representations are presented in a sequence. An accuracy effect and (3) a redundancy effect (number of redundant ideas remembered) occur when three representations are presented in a sequence. A necessary precondition for these effects is that descriptive formats are placed before depictive formats. The identified effects are analyzed in terms of the concept of cognitive dissonance.
Genetic discovery in Xylella fastidiosa through sequence analysis of selected randomly amplified polymorphic DNAs.

PubMed

Chen, Jianchi; Civerolo, Edwin L; Jarret, Robert L; Van Sluys, Marie-Anne; de Oliveira, Mariana C

2005-02-01

Xylella fastidiosa causes many important plant diseases including Pierce's disease (PD) in grape and almond leaf scorch disease (ALSD). DNA-based methodologies, such as randomly amplified polymorphic DNA (RAPD) analysis, have played key roles in collecting genetic information on the bacterium. This study further analyzed the nucleotide sequences of selected RAPDs from X. fastidiosa strains in conjunction with the available genome sequence databases and unveiled several previously unknown genetic traits. These include a sequence highly similar to those in the phage family of Podoviridae. Genome comparisons among X. fastidiosa strains suggested that the "phage" is currently active. Two other RAPDs were also related to horizontal gene transfer: one was part of a broadly distributed cryptic plasmid and the other was associated with conjugal transfer. One RAPD inferred a genomic rearrangement event among X. fastidiosa PD strains and another identified a single nucleotide polymorphism of evolutionary value.
Artificial selection increased body weight but induced increase of runs of homozygosity in Hanwoo cattle

PubMed Central

Kim, Kwondo; Jung, Jaehoon; Caetano-Anollés, Kelsey; Sung, Samsun; Yoo, DongAhn; Choi, Bong-Hwan; Kim, Hyung-Chul; Jeong, Jin-Young; Cho, Yong-Min; Park, Eung-Woo; Choi, Tae-Jeong; Park, Byoungho; Lim, Dajeong

2018-01-01

Artificial selection has been demonstrated to have a rapid and significant effect on the phenotype and genome of an organism. However, most previous studies on artificial selection have focused solely on genomic sequences modified by artificial selection or genomic sequences associated with a specific trait. In this study, we generated whole genome sequencing data of 126 cattle under artificial selection, and 24,973,862 single nucleotide variants, to investigate the relationship among artificial selection, genomic sequences and traits. Using runs of homozygosity detected from the variants, we showed an increase in inbreeding over decades and, at the same time, demonstrated little influence of recent inbreeding on body weight. We also identified a ~0.2 Mb run-of-homozygosity segment that may have been created by recent artificial selection. This approach may aid in the development of genetic markers directly influenced by artificial selection, and provide insight into the process of artificial selection. PMID:29561881
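As a rough illustration of how runs of homozygosity (ROH) can be derived from variant calls, the sketch below scans one chromosome for stretches of consecutive homozygous sites and summarizes them as the fraction of the genome covered, a commonly used ROH-based inbreeding estimate. It is a deliberately simplified toy, not the procedure used in the study: the thresholds, function names and data are hypothetical, and practical ROH callers also tolerate occasional heterozygous or missing calls.

```python
def find_runs_of_homozygosity(positions, is_homozygous,
                              min_snps=3, min_length_bp=100):
    """Return (start, end) coordinates of runs of consecutive homozygous sites.

    positions      -- sorted base-pair coordinates of variant sites
    is_homozygous  -- booleans, True where the individual is homozygous
    A run is kept only if it spans at least min_snps sites and
    at least min_length_bp base pairs.
    """
    runs = []
    current = []  # positions in the run currently being extended

    def close_run():
        if len(current) >= min_snps and current[-1] - current[0] >= min_length_bp:
            runs.append((current[0], current[-1]))

    for pos, hom in zip(positions, is_homozygous):
        if hom:
            current.append(pos)
        else:
            close_run()
            current = []
    close_run()  # a run may extend to the last site
    return runs


def roh_fraction(runs, genome_length_bp):
    """Fraction of the genome covered by the given runs."""
    return sum(end - start for start, end in runs) / genome_length_bp


if __name__ == "__main__":
    # Made-up toy data for one individual on one chromosome.
    positions = [100, 200, 300, 400, 500, 600, 700]
    calls = [True, True, True, False, True, True, False]
    runs = find_runs_of_homozygosity(positions, calls)
    print(runs, roh_fraction(runs, genome_length_bp=1000))
```

Longer runs generally point to more recent shared ancestry, which is why run length is often used to separate recent from older inbreeding.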
A Children's Oncology Group and TARGET initiative exploring the genetic landscape of Wilms tumor.

PubMed

Gadd, Samantha; Huff, Vicki; Walz, Amy L; Ooms, Ariadne H A G; Armstrong, Amy E; Gerhard, Daniela S; Smith, Malcolm A; Auvil, Jaime M Guidry; Meerzaman, Daoud; Chen, Qing-Rong; Hsu, Chih Hao; Yan, Chunhua; Nguyen, Cu; Hu, Ying; Hermida, Leandro C; Davidsen, Tanja; Gesuwan, Patee; Ma, Yussanne; Zong, Zusheng; Mungall, Andrew J; Moore, Richard A; Marra, Marco A; Dome, Jeffrey S; Mullighan, Charles G; Ma, Jing; Wheeler, David A; Hampton, Oliver A; Ross, Nicole; Gastier-Foster, Julie M; Arold, Stefan T; Perlman, Elizabeth J

2017-10-01

We performed genome-wide sequencing and analyzed mRNA and miRNA expression, DNA copy number, and DNA methylation in 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, AMER1, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), we identified mutations in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A. DNA copy number changes resulted in recurrent 1q gain, MYCN amplification, LIN28B gain, and MIRLET7A loss. Unexpected germline variants involved PALB2 and CHEK2. Integrated analyses support two major classes of genetic changes that preserve the progenitor state and/or interrupt normal development.
A Children's Oncology Group and TARGET Initiative Exploring the Genetic Landscape of Wilms Tumor

PubMed Central

Gadd, Samantha; Huff, Vicki; Walz, Amy L.; Ooms, Ariadne H.A.G.; Armstrong, Amy E.; Gerhard, Daniela S.; Smith, Malcolm A.; Guidry Auvil, Jaime M.; Meerzaman, Daoud; Chen, Qing-Rong; Hsu, Chih Hao; Yan, Chunhua; Nguyen, Cu; Hu, Ying; Hermida, Leandro C.; Davidsen, Tanja; Gesuwan, Patee; Ma, Yussanne; Zong, Zusheng; Mungall, Andrew J.; Moore, Richard A.; Marra, Marco A.; Dome, Jeffrey S.; Mullighan, Charles G.; Ma, Jing; Wheeler, David A.; Hampton, Oliver A.; Ross, Nicole; Gastier-Foster, Julie M.; Arold, Stefan T.; Perlman, Elizabeth J.

2017-01-01

Genome-wide sequencing, mRNA and miRNA expression, DNA copy number and methylation analyses were performed on 117 Wilms tumors, followed by targeted sequencing of 651 Wilms tumors. In addition to genes previously implicated in Wilms tumors (WT1, CTNNB1, FAM123B, DROSHA, DGCR8, XPO5, DICER1, SIX1, SIX2, MLLT1, MYCN, and TP53), mutations were identified in genes not previously recognized as recurrently involved in Wilms tumors, the most frequent being BCOR, BCORL1, NONO, MAX, COL6A3, ASXL1, MAP3K4, and ARID1A. DNA copy number changes resulted in recurrent 1q gain, MYCN amplification, LIN28B gain, and let-7a loss. Unexpected germline variants involved PALB2 and CHEK2. Integrated analyses support two major classes of genetic changes that preserve the progenitor state and/or interrupt normal development. PMID:28825729
Connecting the dots between genes, biochemistry, and disease susceptibility: systems biology modeling in human genetics.

PubMed

Moore, Jason H; Boczko, Erik M; Summar, Marshall L

2005-02-01

Understanding how DNA sequence variations impact human health through a hierarchy of biochemical and physiological systems is expected to improve the diagnosis, prevention, and treatment of common, complex human diseases. We have previously developed a hierarchical dynamic systems approach based on Petri nets for generating biochemical network models that are consistent with genetic models of disease susceptibility. This modeling approach uses an evolutionary computation approach called grammatical evolution as a search strategy for optimal Petri net models. We have previously demonstrated that this approach routinely identifies biochemical network models that are consistent with a variety of genetic models in which disease susceptibility is determined by nonlinear interactions between two or more DNA sequence variations. We review here this approach and then discuss how it can be used to model biochemical and metabolic data in the context of genetic studies of human disease susceptibility.
Multiple approaches to characterize the microbial community in a thermophilic anaerobic digester running on swine manure: a case study.

PubMed

Tuan, Nguyen Ngoc; Chang, Yi-Chia; Yu, Chang-Ping; Huang, Shir-Ly

2014-01-01

In this study, the first survey of the microbial community in a thermophilic anaerobic digester using swine manure as sole feedstock was performed by multiple approaches including denaturing gradient gel electrophoresis (DGGE), clone library and pyrosequencing techniques. The integrated analysis of 21 DGGE bands, 126 clones and 8506 pyrosequencing read sequences revealed that Clostridia from the phylum Firmicutes account for the most dominant Bacteria. In addition, our analysis also identified additional taxa that were missed by previous studies, including members of the bacterial phyla Synergistetes, Planctomycetes, Armatimonadetes, Chloroflexi and Nitrospira, which might also play a role in thermophilic anaerobic digesters. In contrast to previous studies, most archaeal 16S rRNA sequences could be assigned to the order Methanobacteriales instead of Methanomicrobiales. In addition, this study reports the first finding of a member of the genus Methanothermobacter in a thermophilic anaerobic digester. Copyright © 2014 Elsevier GmbH. All rights reserved.

Spatial constraints govern competition of mutant clones in human epidermis.

PubMed

Lynch, M D; Lynch, C N S; Craythorne, E; Liakath-Ali, K; Mallipeddi, R; Barker, J N; Watt, F M

2017-10-24

Deep sequencing can detect somatic DNA mutations in tissues permitting inference of clonal relationships. This has been applied to human epidermis, where sun exposure leads to the accumulation of mutations and an increased risk of skin cancer. However, previous studies have yielded conflicting conclusions about the relative importance of positive selection and neutral drift in clonal evolution. Here, we sequenced larger areas of skin than previously, focusing on cancer-prone skin spanning five decades of life. The mutant clones identified were too large to be accounted for solely by neutral drift. Rather, using mathematical modelling and computational lattice-based simulations, we show that observed clone size distributions can be explained by a combination of neutral drift and stochastic nucleation of mutations at the boundary of expanding mutant clones that have a competitive advantage. These findings demonstrate that spatial context and cell competition cooperate to determine the fate of a mutant stem cell.
Imputation of Exome Sequence Variants into Population-Based Samples and Blood-Cell-Trait-Associated Loci in African Americans: NHLBI GO Exome Sequencing Project

PubMed Central

Auer, Paul L.; Johnsen, Jill M.; Johnson, Andrew D.; Logsdon, Benjamin A.; Lange, Leslie A.; Nalls, Michael A.; Zhang, Guosheng; Franceschini, Nora; Fox, Keolu; Lange, Ethan M.; Rich, Stephen S.; O’Donnell, Christopher J.; Jackson, Rebecca D.; Wallace, Robert B.; Chen, Zhao; Graubert, Timothy A.; Wilson, James G.; Tang, Hua; Lettre, Guillaume; Reiner, Alex P.; Ganesh, Santhi K.; Li, Yun

2012-01-01

Researchers have successfully applied exome sequencing to discover causal variants in selected individuals with familial, highly penetrant disorders. We demonstrate the utility of exome sequencing followed by imputation for discovering low-frequency variants associated with complex quantitative traits. We performed exome sequencing in a reference panel of 761 African Americans and then imputed newly discovered variants into a larger sample of more than 13,000 African Americans for association testing with the blood cell traits hemoglobin, hematocrit, white blood count, and platelet count. First, we illustrate the feasibility of our approach by demonstrating genome-wide-significant associations for variants that are not covered by conventional genotyping arrays; for example, one such association is that between higher platelet count and an MPL c.117G>T (p.Lys39Asn) variant encoding a p.Lys39Asn amino acid substitution of the thrombopoietin receptor gene (p = 1.5 × 10⁻¹¹). Second, we identified an association between missense variants of LCT and higher white blood count (p = 4 × 10⁻¹³). Third, we identified low-frequency coding variants that might account for allelic heterogeneity at several known blood cell-associated loci: MPL c.754T>C (p.Tyr252His) was associated with higher platelet count; CD36 c.975T>G (p.Tyr325*) was associated with lower platelet count; and several missense variants at the α-globin gene locus were associated with lower hemoglobin. By identifying low-frequency missense variants associated with blood cell traits not previously reported by genome-wide association studies, we establish that exome sequencing followed by imputation is a powerful approach to dissecting complex, genetically heterogeneous traits in large population-based studies. PMID:23103231
Whole Genome Sequencing for Genomics-Guided Investigations of Escherichia coli O157:H7 Outbreaks

PubMed Central

Rusconi, Brigida; Sanjar, Fatemeh; Koenig, Sara S. K.; Mammel, Mark K.; Tarr, Phillip I.; Eppinger, Mark

2016-01-01

Multi-isolate whole genome sequencing (WGS) and typing for outbreak investigations has become a reality in the post-genomics era. We applied this technology to strains from Escherichia coli O157:H7 outbreaks. These include isolates from seven North American outbreaks, as well as multiple isolates from the same patient and from different infected individuals in the same household. Customized high-resolution bioinformatics sequence typing strategies were developed to assess the core genome and mobilome plasticity. Sequence typing was performed using an in-house single nucleotide polymorphism (SNP) discovery and validation pipeline. Discriminatory power is particularly important when investigating isolates from outbreaks in which macrogenomic techniques such as pulsed-field gel electrophoresis or multiple-locus variable-number tandem repeat analysis do not differentiate closely related organisms. We also characterized differences in the phage inventory, allowing us to identify plasticity among outbreak strains that is not detectable at the core genome level. Our comprehensive analysis of the mobilome identified multiple plasmids that have not previously been associated with this lineage. Applied phylogenomics approaches provide strong molecular evidence for exceptionally little heterogeneity of strains within outbreaks and demonstrate the value of intra-cluster comparisons, rather than basing the analysis on archetypal reference strains. Next generation sequencing and whole genome typing strategies provide the technological foundation for genomic epidemiology outbreak investigation, offering significantly higher sample throughput, cost efficiency, and phylogenetic relatedness accuracy. These phylogenomics approaches have major public health relevance in translating information from the sequence-based survey to support timely and informed countermeasures. Polymorphisms identified in this work offer robust phylogenetic signals that index both short- and long-term evolution and can complement currently employed typing schemes for outbreak inclusion and exclusion, diagnostics, surveillance, and forensic studies. PMID:27446025
Intron loss from the NADH dehydrogenase subunit 4 gene of lettuce mitochondrial DNA: evidence for homologous recombination of a cDNA intermediate.

PubMed

Geiss, K T; Abbas, G M; Makaroff, C A

1994-04-01

The mitochondrial gene coding for subunit 4 of the NADH dehydrogenase complex I (nad4) has been isolated and characterized from lettuce, Lactuca sativa. Analysis of nad4 genes in a number of plants by Southern hybridization had previously suggested that the intron content varied between species. Characterization of the lettuce gene confirms this observation. Lettuce nad4 contains two exons and one group IIA intron, whereas previously sequenced nad4 genes from turnip and wheat contain three group IIA introns. Northern analysis identified a transcript of 1600 nucleotides, which represents the mature nad4 mRNA and a primary transcript of 3200 nucleotides. Sequence analysis of lettuce and turnip nad4 cDNAs was used to confirm the intron/exon border sequences and to examine RNA editing patterns. Editing is observed at the 5' and 3' ends of the lettuce transcript, but is absent from sequences that correspond to exons two, three and the 5' end of exon four in turnip and wheat. In contrast, turnip transcripts are highly edited in this region, suggesting that homologous recombination of an edited and spliced cDNA intermediate was involved in the loss of introns two and three from an ancestral lettuce nad4 gene.
Deep-branching Novel Lineages and High Diversity of Haptophytes in the Skagerrak (Norway) Uncovered by 454 Pyrosequencing

PubMed Central

Egge, Elianne S; Eikrem, Wenche; Edvardsen, Bente

2015-01-01

Microalgae in the division Haptophyta may be difficult to identify to species by microscopy because they are small and fragile. Here, we used high-throughput sequencing to explore the diversity of haptophytes in outer Oslofjorden, Skagerrak, and supplemented this with electron microscopy. Nano- and picoplanktonic subsurface samples were collected monthly for 2 yr, and the haptophytes were targeted by amplification of RNA/cDNA with Haptophyta-specific 18S ribosomal DNA V4 primers. Pyrosequencing revealed higher species richness of haptophytes than previously observed in the Skagerrak by microscopy. From ca. 400,000 reads we obtained 156 haptophyte operational taxonomic units (OTUs) after rigorous filtering and 99.5% clustering. The majority (84%) of the OTUs matched environmental sequences not linked to a morphological species, most of which were affiliated with the order Prymnesiales. Phylogenetic analyses including Oslofjorden OTUs and available cultured and environmental haptophyte sequences showed that several of the OTUs matched sequences forming deep-branching lineages, potentially representing novel haptophyte classes. Pyrosequencing also retrieved cultured species not previously reported by microscopy in the Skagerrak. Electron microscopy revealed species not yet genetically characterised and some potentially novel taxa. This study contributes to linking genotype to phenotype within this ubiquitous and ecologically important protist group, and reveals great, unknown diversity. PMID:25099994
When Genomics Is Not Enough: Experimental Evidence for a Decrease in LINE-1 Activity During the Evolution of Australian Marsupials

PubMed Central

Gallus, Susanne; Lammers, Fritjof

2016-01-01

The autonomous transposable element LINE-1 is a highly abundant element that makes up between 15% and 20% of therian mammal genomes. Since their origin before the divergence of marsupials and placental mammals, LINE-1 elements have contributed actively to the genome landscape. A previous in silico screen of the Tasmanian devil genome revealed a lack of functional coding LINE-1 sequences. In this study we present the results of an in vitro analysis from a partial LINE-1 reverse transcriptase coding sequence in five marsupial species. Our experimental screen supports the in silico findings of the genome-wide degradation of LINE-1 sequences in the Tasmanian devil, and identifies a high frequency of degraded LINE-1 sequences in other Australian marsupials. The comparison between the experimentally obtained LINE-1 sequences and reference genome assemblies suggests that conclusions from in silico analyses of retrotransposition activity can be influenced by incomplete genome assemblies from short reads. PMID:27389686
PHASTpep: Analysis Software for Discovery of Cell-Selective Peptides via Phage Display and Next-Generation Sequencing

PubMed Central

Dasa, Siva Sai Krishna; Kelly, Kimberly A.

2016-01-01

Next-generation sequencing has enhanced the phage display process, allowing for the quantification of millions of sequences resulting from the biopanning process. In response, many valuable analysis programs focused on specificity and finding targeted motifs or consensus sequences were developed. For targeted drug delivery and molecular imaging, it is also necessary to find peptides that are selective—targeting only the cell type or tissue of interest. We present a new analysis strategy and accompanying software, PHage Analysis for Selective Targeted PEPtides (PHASTpep), which identifies highly specific and selective peptides. Using this process, we discovered and validated, both in vitro and in vivo in mice, two sequences (HTTIPKV and APPIMSV) targeted to pancreatic cancer-associated fibroblasts that escaped identification using previously existing software. Our selectivity analysis makes it possible to discover peptides that target a specific cell type and avoid other cell types, enhancing clinical translatability by circumventing complications with systemic use. PMID:27186887
Sequence analysis of infectious pancreatic necrosis virus isolated from Iranian reared rainbow trout (Oncorhynchus mykiss) in 2012.

PubMed

Dadar, Maryam; Peyghan, Rahim; Memari, Hamid Rajabi; Shapouri, Masod Reza Seifi Abad; Hasanzadeh, Reza; Goudarzi, Laleh Moazzami; Vakharia, Vikram N

2013-12-01

Infectious pancreatic necrosis virus (IPNV) is the causal agent of a highly contagious disease that affects many species of fish and shellfish. In Iran, this virus causes economically significant disease in farmed rainbow trout, Oncorhynchus mykiss (Walbaum), which is often associated with the transmission of pathogens from European resources. In this study, moribund rainbow trout fry samples were collected during an outbreak of IPNV in three different fish farms in the north and west provinces of Iran in 2012. We investigated the full genome sequence of Iranian IPNV and compared it with previously identified IPNV sequences. The sequences of different structural and nonstructural protein genes were compared to those of other aquatic birnaviruses sequenced to date. Our results show that the Iranian isolate falls within genogroup 5, serotype A2 strain SP, having 99% identity with the strain 1146 from Spain. These results suggest that the Iranian isolate may have originated from Europe.
XGlycScan: An Open-source Software For N-linked Glycosite Assignment, Quantification and Quality Assessment of Data from Mass Spectrometry-based Glycoproteomic Analysis.

PubMed

Aiyetan, Paul; Zhang, Bai; Zhang, Zhen; Zhang, Hui

2014-01-01

Mass spectrometry-based glycoproteomics has become a major means of identifying and characterizing previously N-linked glycan-attached loci (glycosites). In the bottom-up approach, several factors, which include but are not limited to sample preparation, mass spectrometry analyses, and protein sequence database searches, result in previously N-linked peptide spectrum matches (PSMs) of varying lengths. Given that multiple PSMs can map to a glycosite, we reason that identified PSMs are varying-length peptide species of a unique set of glycosites. Because associated spectra of these PSMs are typically summed separately, true glycosite-associated spectra counts are lost or complicated. Also, these varying-length peptide species complicate protein inference, as smaller peptide sequences are more likely to map to more proteins than larger peptides or actual glycosite sequences. Here, we present XGlycScan. XGlycScan maps varying-length peptide species to glycosites to facilitate an accurate quantification of glycosite-associated spectra counts. We observed that this reduced the variability in reported identifications of mass spectrometry technical replicates of our sample dataset. We also observed that mapping identified peptides to glycosites provided an assessment of search-engine identification. Inherently, XGlycScan-reported glycosites reduce the complexity in protein inference. We implemented XGlycScan in the platform-independent Java programming language and have made it available as open source. XGlycScan's source code is freely available at https://bitbucket.org/paiyetan/xglycscan/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/xglycscan/downloads. The source code and downloads for the graphical user interface version can be found at https://bitbucket.org/paiyetan/xglycscangui/src and https://bitbucket.org/paiyetan/xglycscangui/downloads, respectively.
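The idea of pooling varying-length peptides by the glycosite they cover can be illustrated with a minimal Python sketch. This is not XGlycScan's code (which is written in Java) and does not reproduce its actual algorithm; the function names, the use of the N-X-S/T sequon (X not proline) to locate candidate glycosites, and the toy protein and peptide sequences are all assumptions made for illustration. The sketch also assumes peptides are reported with unmodified residues and only considers the first occurrence of a peptide in the protein.

```python
import re
from collections import Counter

# Assumption: candidate N-linked glycosites are Asn residues in an
# N-X-S/T sequon where X is not proline. The lookahead lets
# overlapping sequons be found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def glycosites(protein_seq):
    """Return 1-based positions of Asn residues inside N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq)]


def count_spectra_per_glycosite(protein_seq, psms):
    """Pool spectral counts of peptides that cover the same glycosite.

    psms is an iterable of (peptide_sequence, spectral_count) pairs
    (a hypothetical input format). Peptides not found in the protein
    are skipped.
    """
    sites = glycosites(protein_seq)
    counts = Counter()
    for peptide, n_spectra in psms:
        start = protein_seq.find(peptide)  # 0-based, first occurrence only
        if start < 0:
            continue
        end = start + len(peptide)  # exclusive, 0-based
        for site in sites:
            # site is 1-based, so it is covered when start < site <= end
            if start < site <= end:
                counts[site] += n_spectra
    return dict(counts)


# Toy example: two peptides of different length cover the same site.
protein = "MKNGSAAANPTKLLNVTQR"
psms = [("KNGSAAA", 3), ("NGSAAANPTK", 2), ("LLNVTQR", 5), ("XXXX", 1)]
per_site = count_spectra_per_glycosite(protein, psms)
```

In this toy input the two overlapping peptides are summed under one site rather than being counted as separate identifications, which is the effect the abstract describes.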
Restriction site polymorphism-based candidate gene mapping for seedling drought tolerance in cowpea [Vigna unguiculata (L.) Walp.].

PubMed

Muchero, Wellington; Ehlers, Jeffrey D; Roberts, Philip A

2010-02-01

Quantitative trait loci (QTL) studies provide insight into the complexity of drought tolerance mechanisms. Molecular markers used in these studies also allow for marker-assisted selection (MAS) in breeding programs, enabling transfer of genetic factors between breeding lines without complete knowledge of their exact nature. However, the potential for recombination between markers and target genes limits the utility of MAS-based strategies. Candidate gene mapping offers an alternative solution to identify trait determinants underlying QTL of interest. Here, we used restriction site polymorphisms to investigate co-location of candidate genes with QTL for seedling drought stress-induced premature senescence identified previously in cowpea. Genomic DNA isolated from 113 F(2:8) RILs of drought-tolerant IT93K503-1 and drought-susceptible CB46 genotypes was digested with combinations of EcoRI and HpaII, MseI, or MspI restriction enzymes and amplified with primers designed from 13 drought-responsive cDNAs. JoinMap 3.0 and MapQTL 4.0 software were used to incorporate polymorphic markers onto the AFLP map and to analyze their association with the drought response QTL. Seven markers co-located with peaks of previously identified QTL. Isolation, sequencing, and BLAST analysis of these markers confirmed their significant homology with drought or other abiotic stress-induced expressed sequence tags (ESTs) from cowpea and other plant systems. Further, homology with coding sequences for a multidrug resistance protein 3 and a photosystem I assembly protein ycf3 was revealed in two of these candidates. These results provide a platform for the identification and characterization of genetic trait determinants underlying seedling drought tolerance in cowpea.
Identification of Inherited Retinal Disease-Associated Genetic Variants in 11 Candidate Genes.

PubMed

Astuti, Galuh D N; van den Born, L Ingeborgh; Khan, M Imran; Hamel, Christian P; Bocquet, Béatrice; Manes, Gaël; Quinodoz, Mathieu; Ali, Manir; Toomes, Carmel; McKibbin, Martin; El-Asrag, Mohammed E; Haer-Wigman, Lonneke; Inglehearn, Chris F; Black, Graeme C M; Hoyng, Carel B; Cremers, Frans P M; Roosing, Susanne

2018-01-10

Inherited retinal diseases (IRDs) display an enormous genetic heterogeneity. Whole exome sequencing (WES) recently identified genes that were mutated in a small proportion of IRD cases. Consequently, finding a second case or family carrying pathogenic variants in the same candidate gene is often challenging. In this study, we searched for novel candidate IRD gene-associated variants in isolated IRD families, assessed their causality, and searched for novel genotype-phenotype correlations. Whole exome sequencing was performed in 11 probands affected with IRDs. Homozygosity mapping data were available for five cases. Variants with minor allele frequencies ≤ 0.5% in public databases were selected as candidate disease-causing variants. These variants were ranked based on: (a) their presence in a gene that was previously implicated in IRD; (b) their minor allele frequency in the Exome Aggregation Consortium database (ExAC); (c) in silico pathogenicity assessment using the combined annotation dependent depletion (CADD) score; and (d) interaction of the corresponding protein with known IRD-associated proteins. Twelve unique variants were found in 11 different genes in 11 IRD probands. Novel autosomal recessive and dominant inheritance patterns were found for variants in Small Nuclear Ribonucleoprotein U5 Subunit 200 (SNRNP200) and Zinc Finger Protein 513 (ZNF513), respectively. Using our pathogenicity assessment, a variant in DEAH-Box Helicase 32 (DHX32) was the top ranked novel candidate gene to be associated with IRDs, followed by eight medium and lower ranked candidate genes. The identification of candidate disease-associated sequence variants in 11 single families underscores the notion that the previously identified IRD-associated genes collectively carry > 90% of the defects implicated in IRDs. To identify multiple patients or families with variants in the same gene and thereby provide extra proof for pathogenicity, worldwide data sharing is needed.
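The filter-then-rank step described in this abstract can be sketched in Python as follows. The abstract gives the 0.5% frequency cutoff and the four ranking criteria but not how they were weighted or combined, so the lexicographic ordering used here, the Variant record, its field names, and the placeholder gene names are all assumptions for illustration rather than the authors' procedure.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.005  # 0.5%, the cutoff stated in the abstract


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative only.
    gene: str
    maf: float                        # minor allele frequency (e.g. from ExAC)
    cadd: float                       # CADD score, higher = more deleterious
    known_ird_gene: bool              # criterion (a)
    interacts_with_ird_protein: bool  # criterion (d)


def rank_candidates(variants):
    """Keep rare variants, then order them by criteria (a) to (d).

    Assumption: criteria are applied lexicographically in the order
    listed in the abstract. The real study may have combined them
    differently.
    """
    rare = [v for v in variants if v.maf <= MAF_CUTOFF]
    return sorted(
        rare,
        key=lambda v: (
            not v.known_ird_gene,              # (a) known IRD genes first
            v.maf,                             # (b) rarer first
            -v.cadd,                           # (c) higher CADD first
            not v.interacts_with_ird_protein,  # (d) interactors first
        ),
    )


# Placeholder data, not taken from the study.
candidates = [
    Variant("GENE_A", 0.010, 30.0, True, True),   # too common, filtered out
    Variant("GENE_B", 0.001, 20.0, False, True),
    Variant("GENE_C", 0.005, 25.0, True, False),
    Variant("GENE_D", 0.001, 28.0, False, False),
]
ranked_genes = [v.gene for v in rank_candidates(candidates)]
```

A weighted score would be an equally plausible reading of the abstract; the lexicographic key is used here only because it is the simplest way to show all four criteria in one place.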
Anti-infective activity of apolipoprotein domain derived peptides in vitro: identification of novel antimicrobial peptides related to apolipoprotein B with anti-HIV activity

PubMed Central

2010-01-01

Background: Previous reports have shown that peptides derived from the apolipoprotein E receptor binding region and the amphipathic α-helical domains of apolipoprotein AI have broad anti-infective activity and antiviral activity, respectively. Lipoproteins and viruses share a similar cell biological niche, being of overlapping size and displaying similar interactions with mammalian cells and receptors, which may have led to other antiviral sequences arising within apolipoproteins, in addition to those previously reported. We therefore designed a series of peptides based around either apolipoprotein receptor binding regions or amphipathic α-helical domains, and tested these for antiviral and antibacterial activity. Results: Of the nineteen new peptides tested, seven showed some anti-infective activity, with two of these being derived from two apolipoproteins not previously used to derive anti-infective sequences. Apolipoprotein J (151-170), based on a predicted amphipathic α-helical domain from apolipoprotein J, had measurable anti-HSV1 activity, as did apolipoprotein B (3359-3367) dp (apoBdp), the latter being derived from the LDL receptor binding domain B of apolipoprotein B. The more active peptide, apoBdp, showed similarity to the previously reported apoE-derived anti-infective peptide, and further modification of the apoBdp sequence to align the charge distribution more closely to that of apoEdp or to introduce aromatic residues resulted in increased breadth and potency of activity. The most active peptide of this type showed potent anti-HIV activity, comparable to that we previously reported for the apoE-derived peptide apoEdpL-W. Conclusions: These data suggest that further antimicrobial peptides may be obtained using human apolipoprotein sequences, selecting regions with either amphipathic α-helical structure or those linked to receptor-binding regions. The finding that an amphipathic α-helical region of apolipoprotein J has antiviral activity comparable with that for the previously reported apolipoprotein AI-derived peptide 18A suggests that full-length apolipoprotein J may also have such activity, as has been reported for full-length apolipoprotein AI. Although the strength of the anti-infective activity of the sequences identified was limited, this could be increased substantially by developing related mutant peptides. Indeed, the apolipoprotein B-derived peptide mutants uncovered by the present study may have utility as HIV therapeutics or microbicides. PMID:20298574
Identification of the Quorum-Sensing Target DNA Sequence and N-Acyl Homoserine Lactone Responsiveness of the Brucella abortus virB promoter

PubMed Central

Arocena, Gastón M.; Sieira, Rodrigo; Comerci, Diego J.; Ugalde, Rodolfo A.

2010-01-01

VjbR is a LuxR-type quorum-sensing (QS) regulator that plays an essential role in the virulence of the intracellular facultative pathogen Brucella, the causative agent of brucellosis. It was previously described that VjbR regulates a diverse group of genes, including the virB operon. The latter codes for a type IV secretion system (T4SS) that is central for the pathogenesis of Brucella. Although the regulatory role of VjbR on the virB promoter (PvirB) was extensively studied by different groups, the VjbR-binding site had not been identified so far. Here, we identified the target DNA sequence of VjbR in PvirB by DNase I footprinting analyses. Surprisingly, we observed that VjbR specifically recognizes a sequence that is identical to a half-binding site of the QS-related regulator MrtR of Mesorhizobium tianshanense. As shown by DNase I footprinting and electrophoretic mobility shift assays, generation of a palindromic MrtR-like-binding site in PvirB increased both the affinity and the stability of the VjbR-DNA complex, which confirmed that the QS regulator of Brucella is highly related to that of M. tianshanense. The addition of N-dodecanoyl homoserine lactone dissociated VjbR from the promoter, which confirmed previous reports that indicated a negative effect of this signal on the VjbR-mediated activation of PvirB. Our results provide new molecular evidence for the structure of the virB promoter and reveal unusual features of the QS target DNA sequence of the main regulator of virulence in Brucella. PMID:20400542
Genetic Characterization of Human-Derived Hydatid Cysts of Echinococcus granulosus Sensu Lato in Heilongjiang Province and the First Report of G7 Genotype of E. canadensis in Humans in China

PubMed Central

Zeng, Zhaolin; Zhao, Wei; Liu, Aiqin; Piao, Daxun; Jiang, Tao; Cao, Jianping; Shen, Yujuan; Liu, Hua; Zhang, Weizhe

2014-01-01

Cystic echinococcosis (CE) caused by the larval stage of Echinococcus granulosus sensu lato (s.l.) is one of the most important zoonotic parasitic diseases worldwide and 10 genotypes (G1–G10) have been reported. In China, almost all the epidemiological and genotyping studies of E. granulosus s.l. are from the west and northwest pasturing areas. However, in Heilongjiang Province of northeastern China, no molecular information is available on E. granulosus s.l. To understand and to speculate on possible transmission patterns of E. granulosus s.l., we molecularly identified and genotyped 10 hydatid cysts from hepatic CE patients in Heilongjiang Province based on mitochondrial cytochrome c oxidase subunit I (cox1), cytochrome b (cytb) and NADH dehydrogenase subunit 1 (nad1) genes. Two genotypes were identified, G1 genotype (n = 6) and G7 genotype (n = 4). All six G1 genotype isolates were identical to each other at the cox1 locus; three and two different sequences were obtained at the cytb and nad1 loci, respectively, with two cytb gene sequences not having been described previously. G7 genotype isolates were identical to each other at the cox1, cytb and nad1 loci; however, the cytb gene sequence had not been described previously. This is the first report of G7 genotype in humans in China. Three new cytb gene sequences from G1 and G7 genotypes might reflect endemic genetic characterizations. Homology analysis suggests that pigs might be the main intermediate hosts of G7 genotype in our investigated area. The results will aid in making more effective control strategies for the prevention of transmission of E. granulosus s.l. PMID:25329820
Characterization of a Large Antibiotic Resistance Plasmid Found in Enteropathogenic Escherichia coli Strain B171 and Its Relatedness to Plasmids of Diverse E. coli and Shigella Strains.

PubMed

Hazen, Tracy H; Michalski, Jane; Nagaraj, Sushma; Okeke, Iruka N; Rasko, David A

2017-09-01

Enteropathogenic Escherichia coli (EPEC) is a leading cause of severe infantile diarrhea in developing countries. Previous research has focused on the diversity of the EPEC virulence plasmid, whereas less is known regarding the genetic content and distribution of antibiotic resistance plasmids carried by EPEC. A previous study demonstrated that in addition to the virulence plasmid, reference EPEC strain B171 harbors a second, larger plasmid that confers antibiotic resistance. To further understand the genetic diversity and dissemination of antibiotic resistance plasmids among EPEC strains, we describe the complete sequence of an antibiotic resistance plasmid from EPEC strain B171. The resistance plasmid, pB171_90, has a completed sequence length of 90,229 bp, a GC content of 54.55%, and carries protein-encoding genes involved in conjugative transfer, resistance to tetracycline (tetA), sulfonamides (sulI), and mercury, as well as several virulence-associated genes, including the transcriptional regulator hha and the putative calcium sequestration inhibitor (csi). In silico detection of the pB171_90 genes among 4,798 publicly available E. coli genome assemblies indicates that the unique genes of pB171_90 (csi and traI) are primarily restricted to genomes identified as EPEC or enterotoxigenic E. coli. However, conserved regions of the pB171_90 plasmid containing genes involved in replication, stability, and antibiotic resistance were identified among diverse E. coli pathotypes. Interestingly, pB171_90 also exhibited significant similarity with a sequenced plasmid from Shigella dysenteriae type I. Our findings demonstrate the mosaic nature of EPEC antibiotic resistance plasmids and highlight the need for additional sequence-based characterization of antibiotic resistance plasmids harbored by pathogenic E. coli. Copyright © 2017 American Society for Microbiology.
Identifying the pattern of molecular evolution for Zaire ebolavirus in the 2014 outbreak in West Africa.

PubMed

Liu, Si-Qing; Deng, Cheng-Lin; Yuan, Zhi-Ming; Rayner, Simon; Zhang, Bo

2015-06-01

The current Ebola virus disease (EVD) epidemic has killed more people than all previous Ebola outbreaks combined and, even as efforts appear to be bringing the outbreak under control, the threat of reemergence remains. The availability of new whole-genome sequences from the 2014 outbreak in West Africa, together with those from earlier outbreaks, provides an opportunity to investigate the genetic characteristics, epidemiological dynamics, and evolutionary history of Zaire ebolavirus (ZEBOV). To investigate the evolutionary properties of ZEBOV in this outbreak, we examined amino acid mutations, positive selection, and evolutionary rates on the basis of 123 ZEBOV genome sequences. The estimated phylogenetic relationships within ZEBOV revealed that viral sequences from the same period or location formed a distinct cluster. The West Africa viruses probably derived from Middle Africa, consistent with results from previous studies. Analysis of the seven protein regions of ZEBOV revealed evidence of positive selection acting on the GP and L genes. Interestingly, all putatively positively selected sites identified in the GP are located within the mucin-like domain of the solved structure of the protein, suggesting a possible role in the immune evasion properties of ZEBOV. Compared with earlier outbreaks, the evolutionary rate of the GP gene was estimated to have accelerated significantly in the 2014 outbreak, suggesting that more ZEBOV variants are generated for human-to-human transmission during this sweeping epidemic. However, a more balanced sample set and next generation sequencing datasets would help achieve a clearer understanding at the genetic level of how the virus is evolving and adapting to new conditions. Copyright © 2015 Elsevier B.V. All rights reserved.
CRISPR interference and priming varies with individual spacer sequences

PubMed Central

Xue, Chaoyou; Seetharam, Arun S.; Musharova, Olga; Severinov, Konstantin; Brouns, Stan J. J.; Severin, Andrew J.; Sashital, Dipali G.

2015-01-01

CRISPR–Cas (clustered regularly interspaced short palindromic repeats-CRISPR associated) systems allow bacteria to adapt to infection by acquiring ‘spacer’ sequences from invader DNA into genomic CRISPR loci. Cas proteins use RNAs derived from these loci to target cognate sequences for destruction through CRISPR interference. Mutations in the protospacer adjacent motif (PAM) and seed regions block interference but promote rapid ‘primed’ adaptation. Here, we use multiple spacer sequences to reexamine the PAM and seed sequence requirements for interference and priming in the Escherichia coli Type I-E CRISPR–Cas system. Surprisingly, CRISPR interference is far more tolerant of mutations in the seed and the PAM than previously reported, and this mutational tolerance, as well as priming activity, is highly dependent on spacer sequence. We identify a large number of functional PAMs that can promote interference, priming or both activities, depending on the associated spacer sequence. Functional PAMs are preferentially acquired during unprimed ‘naïve’ adaptation, leading to a rapid priming response following infection. Our results provide numerous insights into the importance of both spacer and target sequences for interference and priming, and reveal that priming is a major pathway for adaptation during initial infection. PMID:26586800
Palindromic Sequence Artifacts Generated during Next Generation Sequencing Library Preparation from Historic and Ancient DNA

PubMed Central

Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel

2014-01-01

Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′- and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single-stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
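The artifact described in this abstract, a read whose 3′-end is the reverse complement of its own 5′-end, can be screened for with a short Python sketch. This is a minimal illustration and not the detection method used in the study; the function names, the exact-match requirement, and the minimum length threshold are assumptions, and the example reads are made up.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def palindromic_tail_length(read, min_len=4):
    """Length of the longest 3' suffix that is the exact reverse
    complement of the 5' prefix of the same length.

    Returns 0 when no such suffix of at least min_len bases exists.
    Assumptions: exact matching only (no mismatches), and min_len=4
    is an arbitrary illustrative threshold.
    """
    max_k = len(read) // 2  # prefix and suffix must not overlap
    for k in range(max_k, min_len - 1, -1):
        if read[-k:] == reverse_complement(read[:k]):
            return k
    return 0


# Made-up read: 5' stem + loop + reverse complement of the stem.
stem = "ACGGTCA"
artifact_read = stem + "GGGGG" + reverse_complement(stem)
tail = palindromic_tail_length(artifact_read)
clean = palindromic_tail_length("AAAACCCCAAAA")
```

Under the hairpin model proposed in the abstract, a read flagged this way could be trimmed by the returned length so that only the well-aligning 5′ portion is kept; whether trimming or discarding is appropriate would depend on the pipeline.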
Influence of Molecular Resolution on Sequence-Based Discovery of Ecological Diversity among Synechococcus Populations in an Alkaline Siliceous Hot Spring Microbial Mat

PubMed Central

Melendrez, Melanie C.; Lange, Rachel K.; Cohan, Frederick M.; Ward, David M.

2011-01-01

Previous research has shown that sequences of 16S rRNA genes and 16S-23S rRNA internal transcribed spacer regions may not have enough genetic resolution to define all ecologically distinct Synechococcus populations (ecotypes) inhabiting alkaline, siliceous hot spring microbial mats. To achieve higher molecular resolution, we studied sequence variation in three protein-encoding loci sampled by PCR from 60°C and 65°C sites in the Mushroom Spring mat (Yellowstone National Park, WY). Sequences were analyzed using the ecotype simulation (ES) and AdaptML algorithms to identify putative ecotypes. Between 4 and 14 times more putative ecotypes were predicted from variation in protein-encoding locus sequences than from variation in 16S rRNA and 16S-23S rRNA internal transcribed spacer sequences. The number of putative ecotypes predicted depended on the number of sequences sampled and the molecular resolution of the locus. Chao estimates of diversity indicated that few rare ecotypes were missed. Many ecotypes hypothesized by sequence analyses were different in their habitat specificities, suggesting different adaptations to temperature or other parameters that vary along the flow channel. PMID:21169433
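The abstract does not say which Chao estimator was used. As a hedged illustration of how such an estimate gauges unsampled diversity, the sketch below computes the bias-corrected Chao1 richness estimate, S_obs + F1(F1 − 1) / (2(F2 + 1)), where F1 and F2 are the numbers of ecotypes observed exactly once and exactly twice. The function name and the example counts are hypothetical.

```python
def chao1_bias_corrected(abundances):
    """Bias-corrected Chao1 estimate of total richness.

    abundances: number of sequences assigned to each observed
    ecotype (zeros are ignored). Using Chao1 here is an assumption;
    the study may have used a different Chao estimator.
    """
    observed = [n for n in abundances if n > 0]
    s_obs = len(observed)
    f1 = sum(1 for n in observed if n == 1)  # singletons
    f2 = sum(1 for n in observed if n == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical counts: 7 observed ecotypes, 3 singletons, 2 doubletons.
estimate = chao1_bias_corrected([10, 5, 1, 1, 1, 2, 2])
```

When the estimate is close to the observed count, as it is for these made-up numbers, few rare types are expected to have been missed, which is the kind of conclusion the abstract reports.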
