previously published sequences: Topics by Science.gov

Sample records for previously published sequences

Partial gene sequences for the A subunit of methyl-coenzyme M reductase (mcrI) as a phylogenetic tool for the family Methanosarcinaceae

NASA Technical Reports Server (NTRS)

Springer, E.; Sachs, M. S.; Woese, C. R.; Boone, D. R.

1995-01-01

Representatives of the family Methanosarcinaceae were analyzed phylogenetically by comparing partial sequences of their methyl-coenzyme M reductase (mcrI) genes. A 490-bp fragment from the A subunit of the gene was selected, amplified by the PCR, cloned, and sequenced for each of 25 strains belonging to the Methanosarcinaceae. The sequences obtained were aligned with the corresponding portions of five previously published sequences, and all of the sequences were compared to determine phylogenetic distances by Fitch distance matrix methods. We prepared analogous trees based on 16S rRNA sequences; these trees corresponded closely to the mcrI trees, although the mcrI sequences of pairs of organisms had 3.01 +/- 0.541 times more changes than the respective pairs of 16S rRNA sequences, suggesting that the mcrI fragment evolved about three times more rapidly than the 16S rRNA gene. The qualitative similarity of the mcrI and 16S rRNA trees suggests that transfer of genetic information between dissimilar organisms has not significantly affected these sequences, although we found inconsistencies between some mcrI distances that we measured and previously published DNA reassociation data. It is unlikely that multiple mcrI isogenes were present in the organisms that we examined, because we found no major discrepancies in multiple determinations of mcrI sequences from the same organism. Our primers for the PCR also match analogous sites in the previously published mcrII sequences, but all of the sequences that we obtained from members of the Methanosarcinaceae were more closely related to mcrI sequences than to mcrII sequences, suggesting that members of the Methanosarcinaceae do not have distinct mcrII genes.
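The "3.01 +/- 0.541 times more changes" figure above is a mean and spread of per-pair distance ratios. A minimal Python sketch of that kind of summary follows; the function name `rate_ratio`, the strain labels, and the distance values are hypothetical and are not taken from the paper, and the abstract does not say whether the spread is a standard deviation, so the sample standard deviation used here is an assumption.

```python
from statistics import mean, stdev


def rate_ratio(mcr_dist, rrna_dist):
    """Return (mean, sample SD) of per-pair mcrI / 16S rRNA distance ratios.

    Both arguments map a pair of strain names to a pairwise distance.
    Pairs with a missing or zero 16S rRNA distance are skipped.
    """
    ratios = [
        mcr_dist[pair] / rrna_dist[pair]
        for pair in mcr_dist
        if rrna_dist.get(pair, 0) > 0
    ]
    return mean(ratios), stdev(ratios)


# Hypothetical pairwise distances for three strains (illustration only).
mcr = {("A", "B"): 0.30, ("A", "C"): 0.40, ("B", "C"): 0.28}
rrna = {("A", "B"): 0.10, ("A", "C"): 0.16, ("B", "C"): 0.08}

ratio_mean, ratio_sd = rate_ratio(mcr, rrna)
print(f"mcrI changes {ratio_mean:.2f} +/- {ratio_sd:.3f} times faster")
```

A ratio near 3 across most pairs is what would support the reading that the fragment evolves about three times faster than the 16S rRNA gene.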
HIV Sequence Compendium 2015

DOE Office of Scientific and Technical Information (OSTI.GOV)

Foley, Brian Thomas; Leitner, Thomas Kenneth; Apetrei, Cristian

This compendium is an annual printed summary of the data contained in the HIV sequence database. We try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2015. Hence, though it is published in 2015 and called the 2015 Compendium, its contents correspond to the 2014 curated alignments on our website. The number of sequences in the HIV database is still increasing. In total, at the end of 2014, there were 624,121 sequences in the HIV Sequence Database, an increase of 7% since the previous year. This is the first year that the number of new sequences added to the database has decreased compared to the previous year. The number of near complete genomes (>7000 nucleotides) increased to 5834 by end of 2014. However, as in previous years, the compendium alignments contain only a fraction of these. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.
Application of the major capsid protein as a marker of the phylogenetic diversity of Emiliania huxleyi viruses.

PubMed

Rowe, Janet M; Fabre, Marie-Françoise; Gobena, Daniel; Wilson, William H; Wilhelm, Steven W

2011-05-01

Studies of the Phycodnaviridae have traditionally relied on the DNA polymerase (pol) gene as a biomarker. However, recent investigations have suggested that the major capsid protein (MCP) gene may be a reliable phylogenetic biomarker. We used MCP gene amplicons gathered across the North Atlantic to assess the diversity of Emiliania huxleyi-infecting Phycodnaviridae. Nucleotide sequences were examined across >6000 km of open ocean, with comparisons between concentrates of the virus-size fraction of seawater and of lysates generated by exposing host strains to these same virus concentrates. Analyses revealed that many sequences were only sampled once, while several were over-represented. Analyses also revealed nucleotide sequences distinct from previous coastal isolates. Examination of lysed cultures revealed a new richness in phylogeny, as MCP sequences previously unrepresented within the existing collection of E. huxleyi viruses (EhV) were associated with viruses lysing cultures. Sequences were compared with previously described EhV MCP sequences from the North Sea and a Norwegian Fjord, as well as from the Gulf of Maine. Principal component analysis indicates that location-specific distinctions exist despite the presence of sequences common across these environments. Overall, this investigation provides new sequence data and an assessment on the use of the MCP gene. © 2011 Federation of European Microbiological Societies Published by Blackwell Publishing Ltd. All rights reserved.
The complete CDS of the prion protein (PRNP) gene of African lion (Panthera leo).

PubMed

Maj, Andrzej; Spellman, Garth M; Sarver, Shane K

2008-04-01

We provide the complete PRNP CDS sequence for the African lion, which is different from the previously published sequence and more similar to other carnivore sequences. The newly obtained prion protein sequence differs from the domestic cat sequence at three amino acid positions and contains only four octapeptide repeats. We recommend that this sequence be used as the reference sequence for future studies of the PRNP gene for this species.
Identification of viral and non-viral reverse transcribing elements in pineapple (Ananas comosus), including members of two new badnavirus species.

PubMed

Gambley, C F; Geering, A D W; Steele, V; Thomas, J E

2008-01-01

A previously published partial sequence of pineapple bacilliform virus was shown to be from a retrotransposon (family Metaviridae) and not from a badnavirus as previously thought. Two newly discovered sequence groups isolated from pineapple were associated with bacilliform virions and were transmitted by mealybugs. Phylogenetic analyses indicated that they were members of new badnavirus species. A third caulimovirid sequence was also amplified from pineapple, but available evidence suggests that this DNA is not encapsidated, but more likely derived from an endogenous virus.
Comparative genomic survey, exon-intron annotation and phylogenetic analysis of NAT-homologous sequences in archaea, protists, fungi, viruses, and invertebrates

USDA-ARS's Scientific Manuscript database

We have previously published extensive genomic surveys [1-3], reporting NAT-homologous sequences in hundreds of sequenced bacterial, fungal and vertebrate genomes. We present here the results of our latest search of 2445 genomes, representing 1532 (70 archaeal, 1210 bacterial, 43 protist, 97 fungal,...
FUNGAL-SPECIFIC PCR PRIMERS DEVELOPED FOR ANALYSIS OF THE ITS REGION OF ENVIRONMENTAL DNA EXTRACTS

EPA Science Inventory

Background The Internal Transcribed Spacer (ITS) regions of fungal ribosomal DNA (rDNA) are highly variable sequences of great importance in distinguishing fungal species by PCR analysis. Previously published PCR primers available for amplifying these sequences from environmenta...
Application of Genotyping during an Extensive Outbreak of Waterborne Giardiasis in Bergen, Norway, during Autumn and Winter 2004

PubMed Central

Robertson, L. J.; Hermansen, L.; Gjerde, B. K.; Strand, E.; Alvsvåg, J. O.; Langeland, N.

2006-01-01

During the autumn and winter of 2004 and 2005, an extensive outbreak of waterborne giardiasis occurred in Bergen, Norway. Over 1,500 patients were diagnosed with giardiasis. Analysis of water from the implicated source revealed low numbers of Giardia cysts, but the initial contamination event probably occurred up to 10 weeks previously. While sewage leakage from a residential area is now considered to be the probable source of contamination, during the episode waste from one particular septic tank was thought to be a possible source. Genotyping of cysts from the septic tank demonstrated that they were assemblage A cysts, although the sequences were not identical to any previously published sequences. For the β-giardin gene, the closest published subgenotype was subgenotype A3; for the gdh gene, the closest published subgenotype was subgenotype A2. Genotyping of cysts from 21 patient samples revealed that they were assemblage B cysts; thus, the septic tank was unlikely to be the contamination source. Sequencing of the β-giardin and gdh genes from patient samples and a comparison of the sequences gave complex results. For the β-giardin gene, three isolates had sequences identical to subgenotype B3 sequences. However, other isolates had between one and four single-nucleotide polymorphisms (SNPs). For the gdh gene, none of the sequences were identical to the sequence published for subgenotype B3, and the sequences had between one and three SNPs. One isolate, which was identical to subgenotype B3 at the β-giardin gene, was more similar to subgenotype B2 at the gdh gene. Grouping the isolates on the basis of SNPs resulted in different groups for the two genes. The results are discussed in relation to giardiasis in Norway and to other Giardia genotyping studies. PMID:16517674
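The comparison described above, counting SNPs against a published subgenotype and then grouping isolates by their SNP pattern, can be sketched in Python as follows. This is an illustrative sketch only: the function names, the 10-base reference, and the isolate sequences are hypothetical, and the sequences are assumed to be already aligned and of equal length.

```python
def snp_profile(reference, isolate):
    """Return the SNPs of an isolate as a tuple of (position, ref_base, alt_base).

    Positions are 1-based. Both sequences must be aligned and equally long.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    return tuple(
        (i + 1, r, q)
        for i, (r, q) in enumerate(zip(reference, isolate))
        if r != q
    )


def group_by_profile(reference, isolates):
    """Group isolate names by their SNP profile relative to the reference."""
    groups = {}
    for name, seq in isolates.items():
        groups.setdefault(snp_profile(reference, seq), []).append(name)
    return groups


# Hypothetical reference and patient isolates (illustration only).
ref = "ACGTACGTAC"
isolates = {"p1": "ACGTACGTAC", "p2": "ACGAACGTAC", "p3": "ACGAACGTAC"}

groups = group_by_profile(ref, isolates)
for profile, names in groups.items():
    print(len(profile), "SNP(s):", names)
```

Running the same grouping separately for each gene is one way to see the effect the abstract reports, where the groups obtained for one gene need not match those obtained for the other.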
Thermodynamic characterization of tandem mismatches found in naturally occurring RNA

PubMed Central

Christiansen, Martha E.; Znosko, Brent M.

2009-01-01

Although all sequence symmetric tandem mismatches and some sequence asymmetric tandem mismatches have been thermodynamically characterized and a model has been proposed to predict the stability of previously unmeasured sequence asymmetric tandem mismatches [Christiansen,M.E. and Znosko,B.M. (2008) Biochemistry, 47, 4329–4336], experimental thermodynamic data for frequently occurring tandem mismatches is lacking. Since experimental data is preferred over a predictive model, the thermodynamic parameters for 25 frequently occurring tandem mismatches were determined. These new experimental values, on average, are 1.0 kcal/mol different from the values predicted for these mismatches using the previous model. The data for the sequence asymmetric tandem mismatches reported here were then combined with the data for 72 sequence asymmetric tandem mismatches that were published previously, and the parameters used to predict the thermodynamics of previously unmeasured sequence asymmetric tandem mismatches were updated. The average absolute difference between the measured values and the values predicted using these updated parameters is 0.5 kcal/mol. This updated model improves the prediction for tandem mismatches that were predicted rather poorly by the previous model. This new experimental data and updated predictive model allow for more accurate calculations of the free energy of RNA duplexes containing tandem mismatches, and, furthermore, should allow for improved prediction of secondary structure from sequence. PMID:19509311
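The "average absolute difference" quoted above is the mean of |measured - predicted| over the mismatches. A short Python sketch of that calculation is given below; the function name and the free-energy values are hypothetical placeholders, not data from the paper.

```python
def mean_abs_diff(measured, predicted):
    """Mean absolute difference between paired measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


# Hypothetical free energies in kcal/mol (illustration only).
measured_dg = [-1.0, 0.5, 1.2]
predicted_dg = [-0.5, 0.0, 0.7]

print(f"{mean_abs_diff(measured_dg, predicted_dg):.1f} kcal/mol")
```

A smaller value for the updated parameters than for the previous model is what the abstract reports as the improvement (0.5 versus 1.0 kcal/mol).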
On the phylogenetic placement of human T cell leukemia virus type 1 sequences associated with an Andean mummy.

PubMed

Coulthart, Michael B; Posada, David; Crandall, Keith A; Dekaban, Gregory A

2006-03-01

Recently, the putative finding of ancient human T cell leukemia virus type 1 (HTLV-1) long terminal repeat (LTR) DNA sequences in association with a 1500-year-old Chilean mummy has stirred vigorous debate. The debate is based partly on the inherent uncertainties associated with phylogenetic reconstruction when only short sequences of closely related genotypes are available. However, a full analysis of what phylogenetic information is present in the mummy data has not previously been published, leaving open the question of what precisely is the range of admissible interpretation. To fulfill this need, we re-analyzed the mummy data in a new way. We first performed phylogenetic analysis of 188 published LTR DNA sequences from extant strains belonging to the HTLV-1 Cosmopolitan clade, using the method of statistical parsimony which is designed both to optimize phylogenetic resolution among sequences with little evolutionary divergence, and to permit precise mapping of individual sequence mutations onto branches of a divergence network. We then deduced possible phylogenetic positions for the two main categories of published Chilean mummy sequences, based on their published 157-nucleotide LTR sequences. The possible phylogenetic placements for one of the mummy sequence categories are consistent with a modern origin. However, one of these placements for the other mummy sequence category falls very close to the root of the Cosmopolitan clade, consistent with an ancient origin for both this mummy sequence and the Cosmopolitan clade.
Genome Sequences of Multidrug-Resistant Salmonella enterica subsp. enterica Serovar Infantis Strains from Broiler Chicks in Hungary

PubMed Central

Wilk, Tímea; Szabó, Móni; Szmolka, Ama; Kiss, János; Barta, Endre; Nagy, Tibor

2016-01-01

Three strains of Salmonella enterica serovar Infantis isolated from healthy broiler chickens from 2012 to 2013 have been sequenced. Comparison of these and previously published S. Infantis genome sequences of broiler origin in 1996 and 2004 will provide new insight into the genome evolution and recent spread of S. Infantis in poultry. PMID:27979950
HIV-1 pol mutation frequency by subtype and treatment experience: extension of the HIVseq program to seven non-B subtypes.

PubMed

Rhee, Soo-Yon; Kantor, Rami; Katzenstein, David A; Camacho, Ricardo; Morris, Lynn; Sirivichayakul, Sunee; Jorgensen, Louise; Brigido, Luis F; Schapiro, Jonathan M; Shafer, Robert W

2006-03-21

HIVseq was developed in 2000 to make published data on the frequency of HIV-1 group M protease and reverse transcriptase (RT) mutations available in real time to laboratories and researchers sequencing these genes. Because most published protease and RT sequences belonged to subtype B, the initial version of HIVseq was based on this subtype. As additional non-B sequences from persons with well-characterized antiretroviral treatment histories have become available, the program has been extended to subtypes A, C, D, F, G, CRF01, and CRF02. The latest frequency of each protease and RT mutation according to subtype and drug-class exposure was calculated using published sequences in the Stanford HIV RT and Protease Sequence Database. Each mutation was hyperlinked to published reports of viruses containing the mutation. As of September 2005, the mean number of protease sequences per non-B subtype was 534 from protease inhibitor-naive persons and 133 from protease inhibitor-treated persons, representing 13.2% and 2.3%, respectively, of the data available for subtype B. The mean number of RT sequences per non-B subtype was 373 from RT inhibitor-naive persons and 288 from RT inhibitor-treated persons, representing 17.9% and 3.8%, respectively, of the data available for subtype B. HIVseq allows users to examine protease and RT mutations within the context of previously published sequences of these genes. The publication of additional non-B protease and RT sequences from persons with well-characterized treatment histories, however, will be required to perform the same types of analysis possible with the much larger number of subtype B sequences.
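The core calculation described above, the frequency of each mutation by subtype and drug-class exposure, amounts to counting within groups. The Python sketch below illustrates this under stated assumptions: the record layout `(subtype, treated, mutations)` and the example records are hypothetical and do not reflect how HIVseq or the Stanford database actually stores data.

```python
from collections import defaultdict


def mutation_frequencies(records):
    """Return {(subtype, treated): {mutation: fraction of sequences carrying it}}.

    Each record is (subtype, treated, mutations), where `treated` is a bool
    and `mutations` is an iterable of mutation names found in one sequence.
    """
    totals = defaultdict(int)
    counts = defaultdict(lambda: defaultdict(int))
    for subtype, treated, mutations in records:
        group = (subtype, treated)
        totals[group] += 1
        for mutation in set(mutations):  # count each mutation once per sequence
            counts[group][mutation] += 1
    return {
        group: {m: c / totals[group] for m, c in counts[group].items()}
        for group in totals
    }


# Hypothetical RT sequences from subtype C (illustration only).
records = [
    ("C", False, ["M184V"]),
    ("C", False, []),
    ("C", True, ["M184V", "K103N"]),
    ("C", True, ["M184V"]),
]

freq = mutation_frequencies(records)
print(freq[("C", True)])
```

Keeping treated and untreated sequences in separate groups is what lets a mutation's frequency be read against drug exposure, which is the comparison the abstract describes.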
HIV-1 pol mutation frequency by subtype and treatment experience

PubMed Central

Rhee, Soo-Yon; Kantor, Rami; Katzenstein, David A.; Camacho, Ricardo; Morris, Lynn; Sirivichayakul, Sunee; Jorgensen, Louise; Brigido, Luis F.; Schapiro, Jonathan M.; Shafer, Robert W.

2008-01-01

Objective HIVseq was developed in 2000 to make published data on the frequency of HIV-1 group M protease and reverse transcriptase (RT) mutations available in real time to laboratories and researchers sequencing these genes. Because most published protease and RT sequences belonged to subtype B, the initial version of HIVseq was based on this subtype. As additional non-B sequences from persons with well-characterized antiretroviral treatment histories have become available, the program has been extended to subtypes A, C, D, F, G, CRF01, and CRF02. Methods The latest frequency of each protease and RT mutation according to subtype and drug-class exposure was calculated using published sequences in the Stanford HIV RT and Protease Sequence Database. Each mutation was hyperlinked to published reports of viruses containing the mutation. Results As of September 2005, the mean number of protease sequences per non-B subtype was 534 from protease inhibitor-naive persons and 133 from protease inhibitor-treated persons, representing 13.2% and 2.3%, respectively, of the data available for subtype B. The mean number of RT sequences per non-B subtype was 373 from RT inhibitor-naive persons and 288 from RT inhibitor-treated persons, representing 17.9% and 3.8%, respectively, of the data available for subtype B. Conclusions HIVseq allows users to examine protease and RT mutations within the context of previously published sequences of these genes. The publication of additional non-B protease and RT sequences from persons with well-characterized treatment histories, however, will be required to perform the same types of analysis possible with the much larger number of subtype B sequences. PMID:16514293
How many novel eukaryotic 'kingdoms'? Pitfalls and limitations of environmental DNA surveys

PubMed Central

Berney, Cédric; Fahrni, José; Pawlowski, Jan

2004-01-01

Background Over the past few years, the use of molecular techniques to detect cultivation-independent, eukaryotic diversity has proven to be a powerful approach. Based on small-subunit ribosomal RNA (SSU rRNA) gene analyses, these studies have revealed the existence of an unexpected variety of new phylotypes. Some of them represent novel diversity in known eukaryotic groups, mainly stramenopiles and alveolates. Others do not seem to be related to any molecularly described lineage, and have been proposed to represent novel eukaryotic kingdoms. In order to review the evolutionary importance of this novel high-level eukaryotic diversity critically, and to test the potential technical and analytical pitfalls and limitations of eukaryotic environmental DNA surveys (EES), we analysed 484 environmental SSU rRNA gene sequences, including 81 new sequences from sediments of the small river, the Seymaz (Geneva, Switzerland). Results Based on a detailed screening of an exhaustive alignment of eukaryotic SSU rRNA gene sequences and the phylogenetic re-analysis of previously published environmental sequences using Bayesian methods, our results suggest that the number of novel higher-level taxa revealed by previously published EES was overestimated. Three main sources of errors are responsible for this situation: (1) the presence of undetected chimeric sequences; (2) the misplacement of several fast-evolving sequences; and (3) the incomplete sampling of described, but yet unsequenced eukaryotes. Additionally, EES give a biased view of the diversity present in a given biotope because of the difficult amplification of SSU rRNA genes in some taxonomic groups. Conclusions Environmental DNA surveys undoubtedly contribute to reveal many novel eukaryotic lineages, but there is no clear evidence for a spectacular increase of the diversity at the kingdom level. After re-analysis of previously published data, we found only five candidate lineages of possible novel high-level eukaryotic taxa, two of which comprise several phylotypes that were found independently in different studies. To ascertain their taxonomic status, however, the organisms themselves have now to be identified. PMID:15176975
Arthropod phylogenetics in light of three novel millipede (Myriapoda: Diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

PubMed

Brewer, Michael S; Swafford, Lynn; Spruill, Chad L; Bond, Jason E

2013-01-01

Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect. As such, these data are likely inappropriate for investigating such ancient relationships.
Sequenza: allele-specific copy number and mutation profiles from tumor sequencing data.

PubMed

Favero, F; Joshi, T; Marquard, A M; Birkbak, N J; Krzystanek, M; Li, Q; Szallasi, Z; Eklund, A C

2015-01-01

Exome or whole-genome deep sequencing of tumor DNA along with paired normal DNA can potentially provide a detailed picture of the somatic mutations that characterize the tumor. However, analysis of such sequence data can be complicated by the presence of normal cells in the tumor specimen, by intratumor heterogeneity, and by the sheer size of the raw data. In particular, determination of copy number variations from exome sequencing data alone has proven difficult; thus, single nucleotide polymorphism (SNP) arrays have often been used for this task. Recently, algorithms to estimate absolute, but not allele-specific, copy number profiles from tumor sequencing data have been described. We developed Sequenza, a software package that uses paired tumor-normal DNA sequencing data to estimate tumor cellularity and ploidy, and to calculate allele-specific copy number profiles and mutation profiles. We applied Sequenza, as well as two previously published algorithms, to exome sequence data from 30 tumors from The Cancer Genome Atlas. We assessed the performance of these algorithms by comparing their results with those generated using matched SNP arrays and processed by the allele-specific copy number analysis of tumors (ASCAT) algorithm. Comparison between Sequenza/exome and SNP/ASCAT revealed strong correlation in cellularity (Pearson's r = 0.90) and ploidy estimates (r = 0.42, or r = 0.94 after manually inspecting alternative solutions). This performance was noticeably superior to previously published algorithms. In addition, in artificial data simulating normal-tumor admixtures, Sequenza detected the correct ploidy in samples with tumor content as low as 30%. The agreement between Sequenza and SNP array-based copy number profiles suggests that exome sequencing alone is sufficient not only for identifying small scale mutations but also for estimating cellularity and inferring DNA copy number aberrations. © The Author 2014. Published by Oxford University Press on behalf of the European Society for Medical Oncology.
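The agreement figures quoted above are Pearson correlation coefficients between two sets of per-tumor estimates. The following Python sketch shows how such a coefficient is computed from paired values; it does not use or reproduce the Sequenza package, and the cellularity values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical cellularity estimates for five tumors (illustration only).
exome_estimates = [0.35, 0.50, 0.62, 0.80, 0.91]
array_estimates = [0.40, 0.48, 0.65, 0.77, 0.95]

print(f"r = {pearson_r(exome_estimates, array_estimates):.2f}")
```

Because r measures only linear association, a high value shows that the two methods rank and scale tumors similarly, not that their estimates are identical.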
Comprehensive Rare Variant Analysis via Whole-Genome Sequencing to Determine the Molecular Pathology of Inherited Retinal Disease.

PubMed

Carss, Keren J; Arno, Gavin; Erwood, Marie; Stephens, Jonathan; Sanchis-Juan, Alba; Hull, Sarah; Megy, Karyn; Grozeva, Detelina; Dewhurst, Eleanor; Malka, Samantha; Plagnol, Vincent; Penkett, Christopher; Stirrups, Kathleen; Rizzo, Roberta; Wright, Genevieve; Josifova, Dragana; Bitner-Glindzicz, Maria; Scott, Richard H; Clement, Emma; Allen, Louise; Armstrong, Ruth; Brady, Angela F; Carmichael, Jenny; Chitre, Manali; Henderson, Robert H H; Hurst, Jane; MacLaren, Robert E; Murphy, Elaine; Paterson, Joan; Rosser, Elisabeth; Thompson, Dorothy A; Wakeling, Emma; Ouwehand, Willem H; Michaelides, Michel; Moore, Anthony T; Webster, Andrew R; Raymond, F Lucy

2017-01-05

Inherited retinal disease is a common cause of visual impairment and represents a highly heterogeneous group of conditions. Here, we present findings from a cohort of 722 individuals with inherited retinal disease, who have had whole-genome sequencing (n = 605), whole-exome sequencing (n = 72), or both (n = 45) performed, as part of the NIHR-BioResource Rare Diseases research study. We identified pathogenic variants (single-nucleotide variants, indels, or structural variants) for 404/722 (56%) individuals. Whole-genome sequencing gives unprecedented power to detect three categories of pathogenic variants in particular: structural variants, variants in GC-rich regions, which have significantly improved coverage compared to whole-exome sequencing, and variants in non-coding regulatory regions. In addition to previously reported pathogenic regulatory variants, we have identified a previously unreported pathogenic intronic variant in CHM in two males with choroideremia. We have also identified 19 genes not previously known to be associated with inherited retinal disease, which harbor biallelic predicted protein-truncating variants in unsolved cases. Whole-genome sequencing is an increasingly important comprehensive method with which to investigate the genetic causes of inherited retinal disease. Copyright © 2017. Published by Elsevier Inc.
Discovery of T Cell Receptor β Motifs Specific to HLA-B27-Positive Ankylosing Spondylitis by Deep Repertoire Sequence Analysis.

PubMed

Faham, Malek; Carlton, Victoria; Moorhead, Martin; Zheng, Jianbiao; Klinger, Mark; Pepin, Francois; Asbury, Thomas; Vignali, Marissa; Emerson, Ryan O; Robins, Harlan S; Ireland, James; Baechler-Gillespie, Emily; Inman, Robert D

2017-04-01

Ankylosing spondylitis (AS), a chronic inflammatory disorder, has a notable association with HLA-B27. One hypothesis suggests that a common antigen that binds to HLA-B27 is important for AS disease pathogenesis. This study was undertaken to determine sequences and motifs that are shared among HLA-B27-positive AS patients, using T cell repertoire next-generation sequencing. To identify motifs enriched among B27-positive AS patients, we performed T cell receptor β (TCRβ) repertoire sequencing on samples from 191 B27-positive AS patients, 43 B27-negative AS patients, and 227 controls, and we obtained >77 million TCRβ clonotype sequences. First, we assessed whether any of 50 previously published sequences were enriched in B27-positive AS patients. We then used training and test cohorts to identify discovered motifs that were enriched in B27-positive AS patients versus controls. Six previously published and 11 discovered motifs were enriched in the B27-positive AS samples as compared to controls. After combining motifs related by sequence, we identified a total of 15 independent motifs. Both the full set of 15 motifs and a set of 6 published motifs were enriched in the B27-positive AS patients as compared to B27-positive healthy individuals (P = 0.049 and P = 0.001, respectively). Using an independent cohort, we validated that at least some of these motifs were associated with AS, and not simply with B27-positive status. We identified TCRβ motifs that are enriched in B27-positive AS patients as compared to B27-positive healthy controls. This suggests that a common antigen, presented by HLA-B27 and detected by CD8+ T cells, may be associated with AS disease pathogenesis. © 2016, American College of Rheumatology.
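Testing whether a motif is enriched in one cohort relative to another is commonly framed as a 2x2 contingency test. The abstract does not name the statistical test that was used, so the one-sided Fisher's exact test in the Python sketch below is an assumption chosen for illustration, and the carrier counts are hypothetical; only the cohort sizes (191 patients, 227 controls) come from the text.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact P value for enrichment in the first row.

    Table layout: [[a, b], [c, d]] = [[patients with motif, patients without],
                                      [controls with motif, controls without]].
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # number of patients
    col1 = a + c          # number of motif carriers overall
    n = a + b + c + d     # total number of individuals
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Hypothetical carrier counts in 191 patients and 227 controls.
p_value = fisher_one_sided(30, 161, 10, 217)
print(f"P = {p_value:.2g}")
```

When many candidate motifs are screened this way, the resulting P values would normally need a multiple-testing correction, which is one reason to confirm hits in separate test cohorts as the study does.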
Diversity of virus-host systems in hypersaline Lake Retba, Senegal.

PubMed

Sime-Ngando, Télesphore; Lucas, Soizick; Robin, Agnès; Tucker, Kimberly Pause; Colombet, Jonathan; Bettarel, Yvan; Desmond, Elie; Gribaldo, Simonetta; Forterre, Patrick; Breitbart, Mya; Prangishvili, David

2011-08-01

Remarkable morphological diversity of virus-like particles was observed by transmission electron microscopy in a hypersaline water sample from Lake Retba, Senegal. The majority of particles morphologically resembled hyperthermophilic archaeal DNA viruses isolated from extreme geothermal environments. Some hypersaline viral morphotypes have not been previously observed in nature, and less than 1% of observed particles had a head-and-tail morphology, which is typical for bacterial DNA viruses. Culture-independent analysis of the microbial diversity in the sample suggested the dominance of extremely halophilic archaea. Few of the 16S sequences corresponded to known archaeal genera (Haloquadratum, Halorubrum and Natronomonas), whereas the majority represented novel archaeal clades. Three sequences corresponded to a new basal lineage of the haloarchaea. Bacteria belonged to four major phyla, consistent with the known diversity in saline environments. Metagenomic sequencing of DNA from the purified virus-like particles revealed very few similarities to the NCBI non-redundant database at either the nucleotide or amino acid level. Some of the identifiable virus sequences were most similar to previously described haloarchaeal viruses, but no sequence similarities were found to archaeal viruses from extreme geothermal environments. A large proportion of the sequences had similarity to previously sequenced viral metagenomes from solar salterns. © 2010 Society for Applied Microbiology and Blackwell Publishing Ltd.
Analysis of S-RNase alleles of almond (Prunus dulcis): characterization of new sequences, resolution of synonyms and evidence of intragenic recombination.

PubMed

Ortega, Encarnación; Bosković, Radovan I; Sargent, Daniel J; Tobutt, Kenneth R

2006-11-01

Cross-compatibility relationships in almond are controlled by a gametophytically expressed incompatibility system partly mediated by stylar RNases, of which 29 have been reported. To resolve possible synonyms and to provide data for phylogenetic analysis, 21 almond S-RNase alleles were cloned and sequenced from SP (signal peptide region) or C1 (first conserved region) to C5, except for the S29 allele, which could be cloned only from SP to C1. Nineteen sequences (S4, S6, S11-S22, S25-S29) were potentially new whereas S10 and S24 had previously been published but with different labels. The sequences for S16 and S17 were identical to that for S1, published previously; likewise, S15 was identical to S5. In addition, S4 and S20 were identical, as were S13 and S19. A revised version of the standard table of almond incompatibility genotypes is presented. Several alleles had AT or GA tandem repeats in their introns. Sequences of the 23 distinct newly cloned or already published alleles were aligned. Sliding windows analysis of Ka/Ks identified regions where positive selection may operate; in contrast to the Maloideae, most of the region from the beginning of C3 to the beginning of RC4 appeared not to be under positive selection. Phylogenetic analysis indicated four pairs of alleles had "bootstrap" support > 80%: S5/S10, S4/S8, S11/S24, and S3/S6. Various motifs up to 19 residues long occurred in at least two alleles, and their distributions were consistent with intragenic recombination, as were separate phylogenetic analyses of the 5' and 3' sections. Sequence comparison of phylogenetically related alleles indicated the significance of the region between RC4 and C5 in defining specificity.
Forensic massively parallel sequencing data analysis tool: Implementation of MyFLq as a standalone web- and Illumina BaseSpace®-application.

PubMed

Van Neste, Christophe; Gansemans, Yannick; De Coninck, Dieter; Van Hoofstat, David; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

2015-03-01

Routine use of massively parallel sequencing (MPS) for forensic genomics is on the horizon. In the last few years, several algorithms and workflows have been developed to analyze forensic MPS data. However, none have yet been tailored to the needs of the forensic analyst who does not possess an extensive bioinformatics background. We developed our previously published forensic MPS data analysis framework MyFLq (My-Forensic-Loci-queries) into an open-source, user-friendly, web-based application. It can be installed as a standalone web application, or run directly from the Illumina BaseSpace environment. In the former, laboratories can keep their data on-site, while in the latter, data from forensic samples that are sequenced on an Illumina sequencer can be uploaded to BaseSpace during acquisition, and can subsequently be analyzed using the published MyFLq BaseSpace application. Additional features were implemented such as an interactive graphical report of the results, an interactive threshold selection bar, and an allele length-based analysis in addition to the sequence-based analysis. Practical use of the application is demonstrated through the analysis of four 16-plex short tandem repeat (STR) samples, showing the complementarity between the sequence- and length-based analysis of the same MPS data. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity.

PubMed

Edger, Patrick P; VanBuren, Robert; Colle, Marivi; Poorten, Thomas J; Wai, Ching Man; Niederhuth, Chad E; Alger, Elizabeth I; Ou, Shujun; Acharya, Charlotte B; Wang, Jie; Callow, Pete; McKain, Michael R; Shi, Jinghua; Collier, Chad; Xiong, Zhiyong; Mower, Jeffrey P; Slovin, Janet P; Hytönen, Timo; Jiang, Ning; Childs, Kevin L; Knapp, Steven J

2018-02-01

Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology. Here we utilized a robust, cost-effective approach to produce high-quality reference genomes. We report a near-complete genome of diploid woodland strawberry (Fragaria vesca) using single-molecule real-time sequencing from Pacific Biosciences (PacBio). This assembly has a contig N50 length of ∼7.9 million base pairs (Mb), representing a ∼300-fold improvement of the previous version. The vast majority (>99.8%) of the assembly was anchored to 7 pseudomolecules using 2 sets of optical maps from Bionano Genomics. We obtained ∼24.96 Mb of sequence not present in the previous version of the F. vesca genome and produced an improved annotation that includes 1496 new genes. Comparative syntenic analyses uncovered numerous, large-scale scaffolding errors present in each chromosome in the previously published version of the F. vesca genome. Our results highlight the need to improve existing short-read based reference genomes. Furthermore, we demonstrate how genome quality impacts commonly used analyses for addressing both fundamental and applied biological questions. © The Authors 2017. Published by Oxford University Press.
Arbitrarily accurate twin composite π-pulse sequences

NASA Astrophysics Data System (ADS)

Torosov, Boyan T.; Vitanov, Nikolay V.

2018-04-01

We present three classes of symmetric broadband composite pulse sequences. The composite phases are given by analytic formulas (rational fractions of π) valid for any number of constituent pulses. The transition probability is expressed by simple analytic formulas and the order of pulse area error compensation grows linearly with the number of pulses. Therefore, any desired compensation order can be produced by an appropriate composite sequence; in this sense, they are arbitrarily accurate. These composite pulses perform equally well as or better than previously published ones. Moreover, the current sequences are more flexible as they allow total pulse areas of arbitrary integer multiples of π.
Hierarchical Traces for Reduced NSM Memory Requirements

NASA Astrophysics Data System (ADS)

Dahl, Torbjørn S.

This paper presents work on using hierarchical long term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.
Multiplexed microsatellite markers for seven Metarhizium species

USDA-ARS's Scientific Manuscript database

Cross-species transferability of 41 previously published simple sequence repeat (SSR) markers was assessed for 11 species of the entomopathogenic fungus Metarhizium. A collection of 65 Metarhizium isolates including all 54 used in a recent phylogenetic revision of the genus were characterized. Betwe...
Genome sequence of the mud-dwelling archaeon Methanoplanus limicola type strain (DSM 2279 T), reclassification of Methanoplanus petrolearius as Methanolacinia petrolearia and emended descriptions of the genera Methanoplanus and Methanolacinia

DOE PAGES

Goker, Markus; Lu, Megan; Fiebig, Anne; ...

2014-06-15

Methanoplanus limicola Wildgruber et al. 1984 is a mesophilic methanogen that was isolated from a swamp composed of drilling waste near Naples, Italy, shortly after the Archaea were recognized as a separate domain of life. Methanoplanus is the type genus in the family Methanoplanaceae, a taxon that fell into disuse after modern 16S rRNA gene sequence-based taxonomy was established. Methanoplanus is now placed within the Methanomicrobiaceae, a family that is so far poorly characterized at the genome level. The only other type strain of the genus with a sequenced genome, Methanoplanus petrolearius SEBR 4847 T, turned out to be misclassified and required reclassification to Methanolacinia. Both Methanoplanus and Methanolacinia needed taxonomic emendations due to a significant deviation of the G+C content of their genomes from previously published (pregenome-sequence era) values. Until now, genome sequences were published for only four of the 33 species with validly published names in the Methanomicrobiaceae. Here we describe the features of M. limicola, together with the improved-high-quality draft genome sequence and annotation of the type strain, M3 T. The 3,200,946 bp long chromosome (permanent draft sequence) with its 3,064 protein-coding and 65 RNA genes is a part of the Genomic Encyclopedia of Bacteria and Archaea project.
The past, present and future of mitochondrial genomics: have we sequenced enough mtDNAs?

PubMed

Smith, David Roy

2016-01-01

The year 2014 saw more than a thousand new mitochondrial genome sequences deposited in GenBank, an almost 15% increase from the previous year. Hundreds of peer-reviewed articles accompanied these genomes, making mitochondrial DNAs (mtDNAs) the most sequenced and reported type of eukaryotic chromosome. These mtDNA data have advanced a wide range of scientific fields, from forensics to anthropology to medicine to molecular evolution. But for many biological lineages, mtDNAs are so well sampled that newly published genomes are arguably no longer contributing significantly to the progression of science, and in some cases they are tying up valuable resources, particularly journal editors and referees. Is it time to acknowledge that as a research community we have published enough mitochondrial genome papers? Here, I address this question, exploring the history, milestones and impacts of mitochondrial genomics, the benefits and drawbacks of continuing to publish mtDNAs at a high rate and what the future may hold for such an important and popular genetic marker. I highlight groups for which mtDNAs are still poorly sampled, thus meriting further investigation, and recommend that more energy be spent characterizing aspects of mitochondrial genomes apart from the DNA sequence, such as their chromosomal and transcriptional architectures. Ultimately, one should be mindful before writing a mitochondrial genome paper. Consider perhaps sending the sequence directly to GenBank instead, and be sure to annotate it correctly before submission. © The Author 2015. Published by Oxford University Press.
Diagnostics for Yaws Eradication: Insights From Direct Next-Generation Sequencing of Cutaneous Strains of Treponema pallidum.

PubMed

Marks, Michael; Fookes, Maria; Wagner, Josef; Butcher, Robert; Ghinai, Rosanna; Sokana, Oliver; Sarkodie, Yaw-Adu; Lukehart, Sheila A; Solomon, Anthony W; Mabey, David C W; Thomson, Nicholas

2018-03-05

Yaws-like chronic ulcers can be caused by Treponema pallidum subspecies pertenue, Haemophilus ducreyi, or other, still-undefined bacteria. To permit accurate evaluation of yaws elimination efforts, programmatic use of molecular diagnostics is required. The accuracy and sensitivity of current tools remain unclear because our understanding of T. pallidum diversity is limited by the low number of sequenced genomes. We tested samples from patients with suspected yaws collected in the Solomon Islands and Ghana. All samples were from patients whose lesions had previously tested negative using the Centers for Disease Control and Prevention (CDC) diagnostic assay in widespread use. However, some of these patients had positive serological assays for yaws on blood. We used direct whole-genome sequencing to identify T. pallidum subsp pertenue strains missed by the current assay. From 45 Solomon Islands and 27 Ghanaian samples, 11 were positive for T. pallidum DNA using the species-wide quantitative polymerase chain reaction (PCR) assay, from which we obtained 6 previously undetected T. pallidum subsp pertenue whole-genome sequences. These show that Solomon Islands sequences represent distinct T. pallidum subsp pertenue clades. These isolates were invisible to the CDC diagnostic PCR assay, due to sequence variation in the primer binding site. Our data double the number of published T. pallidum subsp pertenue genomes. We show that Solomon Islands strains are undetectable by the PCR used in many studies and by health ministries. This assay is therefore not adequate for the eradication program. Next-generation genome sequence data are essential for these efforts. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America.
Disseminated Autochthonous Dermal Leishmaniasis Caused by Leishmania siamensis (PCM2 Trang) in a Patient from Central Thailand Infected with Human Immunodeficiency Virus.

PubMed

Supsrisunjai, Chavalit; Kootiratrakarn, Tanawatt; Puangpet, Pailin; Bunnag, Thareena; Chaowalit, Prapaipit; Wessagowit, Vesarat

2017-05-01

Several case reports of autochthonous leishmaniasis in Thailand have been published since 1996. Most of the previous cases presented with visceral leishmaniasis (VL) and were mostly reported in the southern part of Thailand. Recently, it has been evident that Leishmania martiniquensis is the main cause of Leishmania infection in Thailand. However, Leishmania siamensis (PCM2 Trang isolate) was found to be of a separate lineage with restricted distribution in southern Thailand and also a cause of disseminated dermal and visceral leishmaniasis in one published case. Here we report the first patient from central Thailand with human immunodeficiency virus infection presenting with disseminated dermal leishmaniasis. Polymerase chain reaction and DNA sequencing analysis (large subunit of RNA polymerase II and 18S ribosomal RNA internal transcribed spacer 1) from the tissue biopsy sample revealed the pathogen sequences to be highly homologous to the PCM2 Trang strain previously reported from southern Thailand.
Reanalysis and revision of the complete mitochondrial genome of Rachycentron canadum (Teleostei, Perciformes, Rachycentridae).

PubMed

Musika, Jidapa; Khongchatee, Adison; Phinchongsakuldit, Jaros

2014-08-01

The complete mitochondrial genome of cobia, Rachycentron canadum, was reanalyzed and revised. The genome is 18,008 bp in length, containing 13 protein-coding genes, 2 ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and a control region or displacement loop (D-loop). The gene arrangement is identical to that observed in most vertebrates. Base composition on the heavy strand is 30.14% A, 25.22% C, 15.80% G and 28.84% T. The D-loop region exhibits an A + T rich pattern, containing short tandem repeats of TATATACATGG, TATATGCACAA and TATATGCACGG. The mitochondrial genome studied differs from the previously published genome in two segments; the control region to 12S and ND5 to tRNA(Glu). The 12S sequence also differs from those published in the databases. Phylogeny analyses revealed that the differences could be due to errors in sequence assembly and/or sample misidentification of the previous studies.
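The base composition and tandem-repeat figures quoted in this abstract are simple counts over the sequence. The following is a minimal Python sketch of how such figures could be computed; the function names `base_composition` and `count_motif` and the toy sequence are illustrative assumptions, not the authors' pipeline, and the real 18,008 bp genome is not reproduced here.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    if not motif:
        raise ValueError("motif must not be empty")
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Toy sequence built from the repeat units named in the abstract
# (hypothetical arrangement, for illustration only).
toy = "TATATACATGG" * 2 + "TATATGCACAA" + "TATATGCACGG"
print(base_composition(toy))
print(count_motif(toy, "TATATACATGG"))
```

The four percentages returned by `base_composition` sum to 100 up to floating-point rounding, as do the values reported in the abstract (30.14 + 25.22 + 15.80 + 28.84).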
Disseminated Autochthonous Dermal Leishmaniasis Caused by Leishmania siamensis (PCM2 Trang) in a Patient from Central Thailand Infected with Human Immunodeficiency Virus

PubMed Central

Supsrisunjai, Chavalit; Kootiratrakarn, Tanawatt; Puangpet, Pailin; Bunnag, Thareena; Chaowalit, Prapaipit; Wessagowit, Vesarat

2017-01-01

Several case reports of autochthonous leishmaniasis in Thailand have been published since 1996. Most of the previous cases presented with visceral leishmaniasis (VL) and were mostly reported in the southern part of Thailand. Recently, it has been evident that Leishmania martiniquensis is the main cause of Leishmania infection in Thailand. However, Leishmania siamensis (PCM2 Trang isolate) was found to be of a separate lineage with restricted distribution in southern Thailand and also a cause of disseminated dermal and visceral leishmaniasis in one published case. Here we report the first patient from central Thailand with human immunodeficiency virus infection presenting with disseminated dermal leishmaniasis. Polymerase chain reaction and DNA sequencing analysis (large subunit of RNA polymerase II and 18S ribosomal RNA internal transcribed spacer 1) from the tissue biopsy sample revealed the pathogen sequences to be highly homologous to the PCM2 Trang strain previously reported from southern Thailand. PMID:28138050
Characterization of regionally associated feline immunodeficiency virus (FIV) in bobcats (Lynx rufus).

PubMed

Lagana, Danielle M; Lee, Justin S; Lewis, Jesse S; Bevins, Sarah N; Carver, Scott; Sweanor, Linda L; McBride, Roy; McBride, Caleb; Crooks, Kevin R; VandeWoude, Sue

2013-07-01

Feline immunodeficiency virus (FIV) classically infects felid species with highly divergent species-specific FIVs. However, recent studies have detected an FIV strain infecting both bobcats (Lynx rufus) and pumas (Puma concolor) in California and Florida. To further investigate this observation, we evaluated FIV from bobcats in Florida (n=25) and Colorado (n=80) between 2008 and 2011. Partial viral sequences from five Florida bobcats cluster with previously published sequences from Florida panthers. We did not detect FIV in Colorado bobcats.
Resolving the Origin of Rabbit Hemorrhagic Disease Virus: Insights from an Investigation of the Viral Stocks Released in Australia

PubMed Central

Eden, John-Sebastian; Read, Andrew J.; Duckworth, Janine A.; Strive, Tanja

2015-01-01

To resolve the evolutionary history of rabbit hemorrhagic disease virus (RHDV), we performed a genomic analysis of the viral stocks imported and released as a biocontrol measure in Australia, as well as a global phylogenetic analysis. Importantly, conflicts were identified between the sequences determined here and those previously published that may have affected evolutionary rate estimates. By removing likely erroneous sequences, we show that RHDV emerged only shortly before its initial description in China. PMID:26378178
Sma3s: a three-step modular annotator for large sequence datasets.

PubMed

Muñoz-Mérida, Antonio; Viguera, Enrique; Claros, M Gonzalo; Trelles, Oswaldo; Pérez-Pulido, Antonio J

2014-08-01

Automatic sequence annotation is an essential component of modern 'omics' studies, which aim to extract information from large collections of sequence data. Most existing tools use sequence homology to establish evolutionary relationships and assign putative functions to sequences. However, it can be difficult to define a similarity threshold that achieves sufficient coverage without sacrificing annotation quality. Defining the correct configuration is critical and can be challenging for non-specialist users. Thus, the development of robust automatic annotation techniques that generate high-quality annotations without needing expert knowledge would be very valuable for the research community. We present Sma3s, a tool for automatically annotating very large collections of biological sequences from any kind of gene library or genome. Sma3s is composed of three modules that progressively annotate query sequences using either: (i) very similar homologues, (ii) orthologous sequences or (iii) terms enriched in groups of homologous sequences. We trained the system using several random sets of known sequences, demonstrating average sensitivity and specificity values of ~85%. In conclusion, Sma3s is a versatile tool for high-throughput annotation of a wide variety of sequence datasets that outperforms the accuracy of other well-established annotation algorithms, and it can enrich existing database annotations and uncover previously hidden features. Importantly, Sma3s has already been used in the functional annotation of two published transcriptomes. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Greenbook Abstract and Catalog--1.

ERIC Educational Resources Information Center

Coole, Walter A.; Reitan, Henry M.

This catalog is intended to be issued periodically as a means of extending and updating teaching materials which have been previously published through ERIC by these authors. The materials include the Alpha and Gamma levels of the "Greenbook System," pre-professional and entry in-service professional levels of an integrated sequence of…
Lineage and genogroup-defining single nucleotide polymorphisms of Escherichia coli O157:H7

USDA-ARS's Scientific Manuscript database

Escherichia coli O157:H7 is a zoonotic human pathogen for which cattle are an important reservoir host. Using both previously published and new sequencing data, a 48-locus single nucleotide polymorphism (SNP) based typing panel was developed that redundantly identified eleven genogroups that span ...
The Awesome Power of Yeast Evolutionary Genetics: New Genome Sequences and Strain Resources for the Saccharomyces sensu stricto Genus

PubMed Central

Scannell, Devin R.; Zill, Oliver A.; Rokas, Antonis; Payen, Celia; Dunham, Maitreya J.; Eisen, Michael B.; Rine, Jasper; Johnston, Mark; Hittinger, Chris Todd

2011-01-01

High-quality, well-annotated genome sequences and standardized laboratory strains fuel experimental and evolutionary research. We present improved genome sequences of three species of Saccharomyces sensu stricto yeasts: S. bayanus var. uvarum (CBS 7001), S. kudriavzevii (IFO 1802T and ZP 591), and S. mikatae (IFO 1815T), and describe their comparison to the genomes of S. cerevisiae and S. paradoxus. The new sequences, derived by assembling millions of short DNA sequence reads together with previously published Sanger shotgun reads, have vastly greater long-range continuity and far fewer gaps than the previously available genome sequences. New gene predictions defined a set of 5261 protein-coding orthologs across the five most commonly studied Saccharomyces yeasts, enabling a re-examination of the tempo and mode of yeast gene evolution and improved inferences of species-specific gains and losses. To facilitate experimental investigations, we generated genetically marked, stable haploid strains for all three of these Saccharomyces species. These nearly complete genome sequences and the collection of genetically marked strains provide a valuable toolset for comparative studies of gene function, metabolism, and evolution, and render Saccharomyces sensu stricto the most experimentally tractable model genus. These resources are freely available and accessible through www.SaccharomycesSensuStricto.org. PMID:22384314
Gene discovery by chemical mutagenesis and whole-genome sequencing in Dictyostelium.

PubMed

Li, Cheng-Lin Frank; Santhanam, Balaji; Webb, Amanda Nicole; Zupan, Blaž; Shaulsky, Gad

2016-09-01

Whole-genome sequencing is a useful approach for identification of chemical-induced lesions, but previous applications involved tedious genetic mapping to pinpoint the causative mutations. We propose that saturation mutagenesis under low mutagenic loads, followed by whole-genome sequencing, should allow direct implication of genes by identifying multiple independent alleles of each relevant gene. We tested the hypothesis by performing three genetic screens with chemical mutagenesis in the social soil amoeba Dictyostelium discoideum. Through genome sequencing, we successfully identified mutant genes with multiple alleles in near-saturation screens, including resistance to intense illumination and strong suppressors of defects in an allorecognition pathway. We tested the causality of the mutations by comparison to published data and by direct complementation tests, finding both dominant and recessive causative mutations. Therefore, our strategy provides a cost- and time-efficient approach to gene discovery by integrating chemical mutagenesis and whole-genome sequencing. The method should be applicable to many microbial systems, and it is expected to revolutionize the field of functional genomics in Dictyostelium by greatly expanding the mutation spectrum relative to other common mutagenesis methods. © 2016 Li et al.; Published by Cold Spring Harbor Laboratory Press.
Listening Comprehension Strategies: A Review of the Literature

ERIC Educational Resources Information Center

Berne, Jane E.

2004-01-01

Numerous studies related to listening comprehension strategies have been published in the past two decades. The present study seeks to build upon two previous reviews of listening comprehension strategies research. Of particular interest in this review are studies dealing with the types of cues used by listeners, the sequence of listening,…
An extended sequence specificity for UV-induced DNA damage.

PubMed

Chung, Long H; Murray, Vincent

2018-01-01

The sequence specificity of UV-induced DNA damage was determined with a higher precision and accuracy than previously reported. UV light induces two major damage adducts: cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). Employing capillary electrophoresis with laser-induced fluorescence and taking advantage of the distinct properties of the CPDs and 6-4PPs, we studied the sequence specificity of UV-induced DNA damage in a purified DNA sequence using two approaches: end-labelling and a polymerase stop/linear amplification assay. A mitochondrial DNA sequence that contained a random nucleotide composition was employed as the target DNA sequence. With previous methodology, the UV sequence specificity was determined at a dinucleotide or trinucleotide level; however, in this paper, we have extended the UV sequence specificity to a hexanucleotide level. With the end-labelling technique (for 6-4PPs), the consensus sequence was found to be 5'-GCTC*AC (where C* is the breakage site); while with the linear amplification procedure, it was 5'-TCTT*AC. With end-labelling, the dinucleotide frequency of occurrence was highest for 5'-TC*, 5'-TT* and 5'-CC*; whereas it was 5'-TT* for linear amplification. The influence of neighbouring nucleotides on the degree of UV-induced DNA damage was also examined. The core sequences consisted of pyrimidine nucleotides 5'-CTC* and 5'-CTT* while an A at position "1" and C at position "2" enhanced UV-induced DNA damage. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing.

PubMed

Zimin, Aleksey V; Stevens, Kristian A; Crepeau, Marc W; Puiu, Daniela; Wegrzyn, Jill L; Yorke, James A; Langley, Charles H; Neale, David B; Salzberg, Steven L

2017-01-01

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly. © The Author 2017. Published by Oxford University Press.
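The abstract describes N50 as a weighted average contig size. Under the usual definition, N50 is the length L such that contigs of length L or longer together contain at least half of the total assembled bases. A minimal Python sketch of that calculation follows; the function name `n50` and the toy contig lengths are assumptions for illustration and are unrelated to the P. taeda data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


# Toy assembly: total length 54, so half is 27; 10 + 9 + 8 = 27.
contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(contigs))  # prints 8
```

Because the statistic is weighted by length, a handful of long contigs can raise N50 substantially even when most contigs remain short, which is why it is reported alongside the contig count.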
Erratum to: An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing.

PubMed

Zimin, Aleksey V; Stevens, Kristian A; Crepeau, Marc W; Puiu, Daniela; Wegrzyn, Jill L; Yorke, James A; Langley, Charles H; Neale, David B; Salzberg, Steven L

2017-10-01

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly. © The Authors 2017. Published by Oxford University Press.
Use of conserved key amino acid positions to morph protein folds.

PubMed

Reddy, Boojala V B; Li, Wilfred W; Bourne, Philip E

2002-07-15

By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs) we propose a theoretical method to design mutations that can be used to morph the protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences from the Protein Data Bank (PDB) identifiers 1ROP, and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutation to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence theoretically improving the challenge results. From secondary structure prediction experiments of the proposed mutant sequence structures, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to CKAAPs of a given structure are of value in protein engineering experiments. Copyright 2002 Wiley Periodicals, Inc.
Muroid rodent phylogenetics: 900-species tree reveals increasing diversification rates

PubMed Central

Schenk, John J.

2017-01-01

We combined new sequence data for more than 300 muroid rodent species with our previously published sequences for up to five nuclear and one mitochondrial genes to generate the most widely and densely sampled hypothesis of evolutionary relationships across Muroidea. An exhaustive screening procedure for publicly available sequences was implemented to avoid the propagation of taxonomic errors that are common to supermatrix studies. The combined data set of carefully screened sequences derived from all available sequences on GenBank with our new data resulted in a robust maximum likelihood phylogeny for 900 of the approximately 1,620 muroids. Several regions that were equivocally resolved in previous studies are now more decisively resolved, and we estimated a chronogram using 28 fossil calibrations for the most integrated age and topological estimates to date. The results were used to update muroid classification and highlight questions needing additional data. We also compared the results of multigene supermatrix studies like this one with the principal published supertrees and concluded that the latter are unreliable for any comparative study in muroids. In addition, we explored diversification patterns as an explanation for why muroid rodents represent one of the most species-rich groups of mammals by detecting evidence for increasing net diversification rates through time across the muroid tree. We suggest the observation of increasing rates may be due to a combination of parallel increases in rate across clades and high average extinction rates. Five increased diversification-rate-shifts were inferred, suggesting that multiple, but perhaps not independent, events have led to the remarkable species diversity in the superfamily. Our results provide a phylogenetic framework for comparative studies that is not highly dependent upon the signal from any one gene. PMID:28813483
Evolutionary History of Ascomyceteous Yeasts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Haridas, Sajeet; Riley, Robert; Salamov, Asaf

2014-06-06

Yeasts are important for many industrial and biotechnological processes and show remarkable diversity despite morphological similarities. We have sequenced the genomes of 16 ascomycete yeasts of taxonomic and industrial importance including members of Saccharomycotina and Taphrinomycotina. A comparison of these with several other previously published yeast genomes has added increased confidence to the phylogenetic positions of previously poorly placed species including Saitoella complicata, Babjeviella inositovora and Metschnikowia bicuspidata. Phylogenetic analysis also showed that yeasts with alternative nuclear codon usage where CUG encodes serine instead of leucine are monophyletic within the Saccharomycotina. Most of the yeasts have compact genomes with a large fraction of single exon genes with Lipomyces starkeyi and the previously published Pneumocystis jirovecii being notable exceptions. Intron analysis suggests that early diverging species have more introns. We also observed a large number of unclassified lineage specific non-simple repeats in these genomes.
Chlamydia pecorum Infection in Free-ranging Koalas (Phascolarctos cinereus) on French Island, Victoria, Australia.

PubMed

Legione, Alistair R; Amery-Gale, Jemima; Lynch, Michael; Haynes, Leesa; Gilkerson, James R; Sansom, Fiona M; Devlin, Joanne M

2016-04-28

We detected Chlamydia pecorum in two koalas (Phascolarctos cinereus) from a closed island population in Victoria, Australia, previously free of Chlamydia infection. The ompA and multilocus sequence type were most closely related to published isolates of livestock rather than koala origin, suggesting potential cross-species transmission of C. pecorum.
Learning a weighted sequence model of the nucleosome core and linker yields more accurate predictions in Saccharomyces cerevisiae and Homo sapiens.

PubMed

Reynolds, Sheila M; Bilmes, Jeff A; Noble, William Stafford

2010-07-08

DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model.
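As a rough illustration of the kind of input the abstract above describes, the following sketch extracts mono-, di- and tri-nucleotide frequencies from a 301-bp window centered on a candidate position. It is a simplification under stated assumptions: it computes only the bulk composition of the window, not the position-dependent oscillation pattern or the discriminative weighting used by the authors, and the function names are hypothetical.

```python
from collections import Counter
from itertools import product

WINDOW = 301  # window length quoted in the abstract


def kmer_frequencies(seq, center, k):
    """Relative frequency of every k-mer in the WINDOW-bp window centered at `center`.

    Hypothetical helper: k-mers containing characters other than A/C/G/T
    are simply not reported.
    """
    half = WINDOW // 2
    if center - half < 0 or center + half >= len(seq):
        raise ValueError("window extends past the ends of the sequence")
    window = seq[center - half:center + half + 1].upper()
    total = len(window) - k + 1
    counts = Counter(window[i:i + k] for i in range(total))
    kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: counts.get(kmer, 0) / total for kmer in kmers}


def window_features(seq, center):
    """Concatenate mono-, di- and tri-nucleotide frequencies (4 + 16 + 64 values)."""
    features = {}
    for k in (1, 2, 3):
        features.update(kmer_frequencies(seq, center, k))
    return features
```

A feature vector of this form could be fed to any classifier that learns per-feature weights; which classifier the authors used is not specified beyond the description in the abstract.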
Complete Sequence of the mitochondrial genome of the tapeworm Hymenolepis diminuta: Gene arrangements indicate that platyhelminths are eutrochozoans

DOE Office of Scientific and Technical Information (OSTI.GOV)

von Nickisch-Rosenegk, Markus; Brown, Wesley M.; Boore, Jeffrey L.

2001-01-01

Using "long-PCR" we have amplified in overlapping fragments the complete mitochondrial genome of the tapeworm Hymenolepis diminuta (Platyhelminthes: Cestoda) and determined its 13,900 nucleotide sequence. The gene content is the same as that typically found for animal mitochondrial DNA (mtDNA) except that atp8 appears to be lacking, a condition found previously for several other animals. Despite the small size of this mtDNA, there are two large non-coding regions, one of which contains 13 repeats of a 31 nucleotide sequence and a potential stem-loop structure of 25 base pairs with an 11-member loop. Large potential secondary structures are identified also for the non-coding regions of two other cestode mtDNAs. Comparison of the mitochondrial gene arrangement of H. diminuta with those previously published supports a phylogenetic position of flatworms as members of the Eutrochozoa, rather than being basal to either a clade of protostomes or a clade of coelomates.
A Bacterial Analysis Platform: An Integrated System for Analysing Bacterial Whole Genome Sequencing Data for Clinical Diagnostics and Surveillance.

PubMed

Thomsen, Martin Christen Frølund; Ahrenfeldt, Johanne; Cisneros, Jose Luis Bellod; Jurtz, Vanessa; Larsen, Mette Voldby; Hasman, Henrik; Aarestrup, Frank Møller; Lund, Ole

2016-01-01

Recent advances in whole genome sequencing have made the technology available for routine use in microbiological laboratories. However, a major obstacle for using this technology is the availability of simple and automatic bioinformatics tools. Based on previously published and already available web-based tools we developed a single pipeline for batch uploading of whole genome sequencing data from multiple bacterial isolates. The pipeline will automatically identify the bacterial species and, if applicable, assemble the genome, identify the multilocus sequence type, plasmids, virulence genes and antimicrobial resistance genes. A short printable report for each sample will be provided and an Excel spreadsheet containing all the metadata and a summary of the results for all submitted samples can be downloaded. The pipeline was benchmarked using datasets previously used to test the individual services. The reported results enable a rapid overview of the major results, and comparing that to the previously found results showed that the platform is reliable and able to correctly predict the species and find most of the expected genes automatically. In conclusion, a combined bioinformatics platform was developed and made publicly available, providing easy-to-use automated analysis of bacterial whole genome sequencing data. The platform may be of immediate relevance as a guide for investigators using whole genome sequencing for clinical diagnostics and surveillance. The platform is freely available at: https://cge.cbs.dtu.dk/services/CGEpipeline-1.1 and it is the intention that it will continue to be expanded with new features as these become available.
Leveraging long read sequencing from a single individual to provide a comprehensive resource for benchmarking variant calling methods

PubMed Central

Mu, John C.; Tootoonchi Afshar, Pegah; Mohiyuddin, Marghoob; Chen, Xi; Li, Jian; Bani Asadi, Narges; Gerstein, Mark B.; Wong, Wing H.; Lam, Hugo Y. K.

2015-01-01

A high-confidence, comprehensive human variant set is critical in assessing accuracy of sequencing algorithms, which are crucial in precision medicine based on high-throughput sequencing. Although recent works have attempted to provide such a resource, they still do not encompass all major types of variants including structural variants (SVs). Thus, we leveraged the massive high-quality Sanger sequences from the HuRef genome to construct by far the most comprehensive gold set of a single individual, which was cross validated with deep Illumina sequencing, population datasets, and well-established algorithms. It was a necessary effort to completely reanalyze the HuRef genome as its previously published variants were mostly reported five years ago, suffering from compatibility, organization, and accuracy issues that prevent their direct use in benchmarking. Our extensive analysis and validation resulted in a gold set with high specificity and sensitivity. In contrast to the current gold sets of the NA12878 or HS1011 genomes, our gold set is the first that includes small variants, deletion SVs and insertion SVs up to a hundred thousand base-pairs. We demonstrate the utility of our HuRef gold set to benchmark several published SV detection tools. PMID:26412485
Novel species including Mycobacterium fukienense sp. is found from tuberculosis patients in Fujian Province, China, using phylogenetic analysis of Mycobacterium chelonae/abscessus complex.

PubMed

Zhang, Yuan Yuan; Li, Yan Bing; Huang, Ming Xiang; Zhao, Xiu Qin; Zhang, Li Shui; Liu, Wen En; Wan, Kang Lin

2013-11-01

To identify the novel species 'Mycobacterium fukienense' sp. nov of Mycobacterium chelonae/abscessus complex from tuberculosis patients in Fujian Province, China. Five of 27 clinical Mycobacterium isolates (Cls) were previously identified as M. chelonae/abscessus complex by sequencing the hsp65, rpoB, 16S-23S rRNA internal transcribed spacer region (its), recA and sodA house-keeping genes commonly used to describe the molecular characteristics of Mycobacterium. Clinical Mycobacterium isolates were classified according to the gene sequence using a clustering analysis program. Sequence similarity within clusters and diversity between clusters were analyzed. The 5 isolates were identified with distinct sequences exhibiting 99.8% homology in the hsp65 gene. However, a complete lack of homology was observed among the sequences of the rpoB, 16S-23S rRNA internal transcribed spacer region (its), sodA, and recA genes as compared with M. abscessus. Furthermore, no match for rpoB, sodA, and recA genes was identified among the published sequences. The novel species, Mycobacterium fukienense, is identified from tuberculosis patients in Fujian Province, China, and does not belong to any existing subspecies of M. chelonae/abscessus complex. Copyright © 2013 The Editorial Board of Biomedical and Environmental Sciences. Published by China CDC. All rights reserved.
Exome sequencing identifies a DNAJB6 mutation in a family with dominantly-inherited limb-girdle muscular dystrophy.

PubMed

Couthouis, Julien; Raphael, Alya R; Siskind, Carly; Findlay, Andrew R; Buenrostro, Jason D; Greenleaf, William J; Vogel, Hannes; Day, John W; Flanigan, Kevin M; Gitler, Aaron D

2014-05-01

Limb-girdle muscular dystrophy primarily affects the muscles of the hips and shoulders (the "limb-girdle" muscles), although it is a heterogeneous disorder that can present with varying symptoms. There is currently no cure. We sought to identify the genetic basis of limb-girdle muscular dystrophy type 1 in an American family of Northern European descent using exome sequencing. Exome sequencing was performed on DNA samples from two affected siblings and one unaffected sibling and resulted in the identification of eleven candidate mutations that co-segregated with the disease. Notably, this list included a previously reported mutation in DNAJB6, p.Phe89Ile, which was recently identified as a cause of limb-girdle muscular dystrophy type 1D. Additional family members were Sanger sequenced and the mutation in DNAJB6 was only found in affected individuals. Subsequent haplotype analysis indicated that this DNAJB6 p.Phe89Ile mutation likely arose independently of the previously reported mutation. Since other published mutations are located close by in the G/F domain of DNAJB6, this suggests that the area may represent a mutational hotspot. Exome sequencing provided an unbiased and effective method for identifying the genetic etiology of limb-girdle muscular dystrophy type 1 in a previously genetically uncharacterized family. This work further confirms the causative role of DNAJB6 mutations in limb-girdle muscular dystrophy type 1D. Copyright © 2014 Elsevier B.V. All rights reserved.
Genome-wide comparative analysis of four Indian Drosophila species.

PubMed

Mohanty, Sujata; Khanna, Radhika

2017-12-01

Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring the genomic changes with an ecology and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta of Indian origin using Next Generation Sequencing technology on an Illumina platform along with their detailed assembly statistics. The comparative genomics analysis, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome wide SNP distribution were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project of Indian Drosophila species.
How close is close: 16S rRNA sequence identity may not be sufficient to guarantee species identity

NASA Technical Reports Server (NTRS)

Fox, G. E.; Wisotzkey, J. D.; Jurtshuk, P. Jr

1992-01-01

16S rRNA (genes coding for rRNA) sequence comparisons were conducted with the following three psychrophilic strains: Bacillus globisporus W25T (T = type strain) and Bacillus psychrophilus W16AT, and W5. These strains exhibited more than 99.5% sequence identity and within experimental uncertainty could be regarded as identical. Their close taxonomic relationship was further documented by phenotypic similarities. In contrast, previously published DNA-DNA hybridization results have convincingly established that these strains do not belong to the same species if current standards are used. These results emphasize the important point that effective identity of 16S rRNA sequences is not necessarily a sufficient criterion to guarantee species identity. Thus, although 16S rRNA sequences can be used routinely to distinguish and establish relationships between genera and well-resolved species, very recently diverged species may not be recognizable.
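The identity figures quoted in abstracts like the one above are pairwise percentages over aligned positions. A minimal sketch of that calculation follows, assuming two sequences that have already been aligned to equal length and assuming that gapped columns are excluded from the comparison (one convention among several; the authors' exact procedure is not stated). The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

As the abstract stresses, a value above 99.5% computed this way does not by itself establish that two strains belong to the same species.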
Early-stage chunking of finger tapping sequences by persons who stutter and fluent speakers.

PubMed

Smits-Bandstra, Sarah; De Nil, Luc F

2013-01-01

This research note explored the hypothesis that chunking differences underlie the slow finger-tap sequencing performance reported in the literature for persons who stutter (PWS) relative to fluent speakers (PNS). Early-stage chunking was defined as an immediate and spontaneous tendency to organize a long sequence into pauses, for motor planning, and chunks of fluent motor performance. A previously published study in which 12 PWS and 12 matched PNS practised a 10-item finger tapping sequence 30 times was examined. Both groups significantly decreased the duration of between-chunk intervals (BCIs) and within-chunk intervals (WCIs) over practice. PNS had significantly shorter WCIs relative to PWS, but minimal differences between groups were found for the number of, or duration of, BCI. Results imply that sequencing differences found between PNS and PWS may be due to differences in automatizing movements within chunks or retrieving chunks from memory rather than chunking per se.
The organization and expression of the mdm2 gene.

PubMed

de Oca Luna, R M; Tabor, A D; Eberspaecher, H; Hulboy, D L; Worth, L L; Colman, M S; Finlay, C A; Lozano, G

1996-05-01

The mdm2 gene encodes a zinc finger protein that negatively regulates p53 function by binding and masking the p53 transcriptional activation domain. Two different promoters control expression of mdm2, one of which is also transactivated by p53. We cloned and characterized the mdm2 gene from a murine 129 library. It contained at least 12 exons and spanned approximately 25 kb of DNA. Sequencing of the mdm2 gene revealed three nucleotide differences that resulted in amino acid substitutions in the previously published mdm2 sequence. Sequencing of normal BalbC/J DNA and the original cosmid clone isolated from the 3T3DM cell line revealed that they are identical, suggesting that the published sequence is in error at these three positions. In addition, we analyzed the expression pattern of mdm2 and found ubiquitous low-level expression throughout embryo development and in adult tissues. Analysis of mRNA from numerous tissues for several mdm2 spliced variants that had been identified in the transformed 3T3DM cell line revealed that these variants could not be detected in the developing embryo or in adult tissues.
SIBIS: a Bayesian model for inconsistent protein sequence estimation.

PubMed

Khenoussi, Walyd; Vanhoutrève, Renaud; Poch, Olivier; Thompson, Julie D

2014-09-01

The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. We have developed a new method, called SIBIS, to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments. We evaluated the performance of SIBIS on a reference set of protein sequences with experimentally validated errors and showed that the sensitivity is significantly higher than previous methods, with only a small loss of specificity. We also assessed a large set of human sequences from the UniProt database and found evidence of inconsistency in 48% of the previously uncharacterized sequences. We conclude that the integration of quality control methods like SIBIS in automatic analysis pipelines will be critical for the robust inference of structural, functional and phylogenetic information from these sequences. Source code, implemented in C on a Linux system, and the datasets of protein sequences are freely available for download at http://www.lbgi.fr/~julie/SIBIS. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Late Holocene volcanic activity and environmental change in Highland Guatemala

NASA Astrophysics Data System (ADS)

Lohse, Jon C.; Hamilton, W. Derek; Brenner, Mark; Curtis, Jason; Inomata, Takeshi; Morgan, Molly; Cardona, Karla; Aoyama, Kazuo; Yonenobu, Hitoshi

2018-07-01

We present a record of late Holocene volcanic eruptions with elemental data for a sequence of sampled tephras from Lake Amatitlan in Highland Guatemala. Our tephrochronology is anchored by a Bayesian P_Sequence age-depth model based on multiple AMS radiocarbon dates. We compare our record against a previously published study from the same area to understand the record of volcanism and environmental changes. This work has implications for understanding the effects of climate and other environmental changes that may be related to the emission of volcanic aerosols at local, regional and global scales.
Paralogues of nuclear ribosomal genes conceal phylogenetic signals within the invasive Asian fish tapeworm lineage: evidence from next generation sequencing data.

PubMed

Brabec, Jan; Kuchta, Roman; Scholz, Tomáš; Littlewood, D Timothy J

2016-08-01

Complete mitochondrial genomes and nuclear rRNA operons of eight geographically distinct isolates of the Asian fish tapeworm Schyzocotyle acheilognathi (syn. Bothriocephalus acheilognathi), representing the parasite's global diversity spanning four continents, were fully characterised using an Illumina sequencing platform. This cestode species represents an extreme example of a highly invasive, globally distributed pathogen of veterinary importance with exceptionally low host specificity unseen elsewhere within the parasitic flatworms. In addition to eight specimens of S. acheilognathi, we fully characterised its closest known relative and the only congeneric species, Schyzocotyle nayarensis, from cyprinids in the Indian subcontinent. Since previous nucleotide sequence data on the Asian fish tapeworm were restricted to a single molecular locus of questionable phylogenetic utility (the internal transcribed spacers separating the nuclear rRNA genes), the mitogenomic data presented here offer a unique opportunity to gain the first detailed insights into both the intraspecific phylogenetic relationships and population genetic structure of the parasite, providing key baseline information for future research in the field. Additionally, we identify a previously unnoticed source of error and demonstrate the limited utility of the nuclear rRNA sequences, including the internal transcribed spacers, which have likely misled most of the previous molecular phylogenetic and population genetic estimates on the Asian fish tapeworm. Copyright © 2016 Australian Society for Parasitology. Published by Elsevier Ltd. All rights reserved.
Discovery and validation of information theory-based transcription factor and cofactor binding site motifs.

PubMed

Lu, Ruipeng; Mucaki, Eliseos J; Rogan, Peter K

2017-03-17

Data from ChIP-seq experiments can derive the genome-wide binding specificities of transcription factors (TFs) and other regulatory proteins. We analyzed 765 ENCODE ChIP-seq peak datasets of 207 human TFs with a novel motif discovery pipeline based on recursive, thresholded entropy minimization. This approach, while obviating the need to compensate for skewed nucleotide composition, distinguishes true binding motifs from noise, quantifies the strengths of individual binding sites based on computed affinity and detects adjacent cofactor binding sites that coordinate with the targets of primary, immunoprecipitated TFs. We obtained contiguous and bipartite information theory-based position weight matrices (iPWMs) for 93 sequence-specific TFs, discovered 23 cofactor motifs for 127 TFs and revealed six high-confidence novel motifs. The reliability and accuracy of these iPWMs were determined via four independent validation methods, including the detection of experimentally proven binding sites, explanation of effects of characterized SNPs, comparison with previously published motifs and statistical analyses. We also predict previously unreported TF coregulatory interactions (e.g. TF complexes). These iPWMs constitute a powerful tool for predicting the effects of sequence variants in known binding sites, performing mutation analysis on regulatory SNPs and predicting previously unrecognized binding sites and target genes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
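For readers unfamiliar with information-theory-based weight matrices, the sketch below computes the per-position information content of a DNA motif from base counts, using the standard relation R = 2 - H bits for a four-letter alphabet. It omits small-sample corrections and does not implement the recursive, thresholded entropy minimization pipeline described in the abstract; the function names and the input layout (one dict of base counts per motif position) are assumptions made for illustration.

```python
import math


def column_information(counts):
    """Information content (bits) of one motif position: 2 - Shannon entropy.

    `counts` maps each of A, C, G, T to the number of binding sites carrying
    that base at this position. No small-sample correction is applied.
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("column has no observations")
    entropy = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            entropy -= p * math.log2(p)
    return 2.0 - entropy


def motif_information(columns):
    """Per-position information for a whole motif (list of count dicts)."""
    return [column_information(col) for col in columns]
```

A fully conserved position yields 2 bits and a position with uniform base usage yields 0 bits; summing the values over positions gives the average information of the motif.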

Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses.

PubMed

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A; Janke, Axel

2015-05-27

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
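The read-depth approach in the abstract above rests on a simple expectation: a Y-linked scaffold should be covered at roughly half the autosomal depth in a male and close to zero in a female. A minimal sketch of that logic follows. The thresholds, labels and function name are hypothetical choices for illustration, not values taken from the study.

```python
def classify_scaffold(male_depth, female_depth, male_autosomal, female_autosomal,
                      y_max_ratio=0.3, x_min_ratio=1.5, min_male_cov=0.1):
    """Label a scaffold from its mean read depth in one male and one female.

    Depths are first normalized by each individual's typical autosomal depth,
    then the female/male ratio is compared with assumed thresholds:
    about 0 for Y-linked, about 1 for autosomal, about 2 for X-linked scaffolds.
    """
    male_norm = male_depth / male_autosomal
    female_norm = female_depth / female_autosomal
    if male_norm < min_male_cov:
        return "unassigned"  # too little male coverage to decide
    ratio = female_norm / male_norm
    if ratio <= y_max_ratio:
        return "Y-candidate"
    if ratio >= x_min_ratio:
        return "X-candidate"
    return "autosomal-like"
```

In practice candidates flagged this way would still need independent confirmation, such as the male-specific in vitro amplification the authors report.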
Isolation of a gammaherpesvirus similar to asinine herpesvirus-2 (AHV-2) from a mule and a survey of mules and donkeys for AHV-2 infection by real-time PCR.

PubMed

Bell, Stephanie A; Pusterla, Nicola; Balasuriya, Udeni B R; Mapes, Samantha M; Nyberg, Nicole L; MacLachlan, N James

2008-07-27

Equids are commonly infected by herpesviruses, but isolation of herpesviruses from mules has apparently not been previously reported. Furthermore, the genomic relationships among the various equid herpesviruses are poorly characterized. We describe the isolation and preliminary characterization of a mule gammaherpesvirus tentatively identified as asinine herpesvirus-2 (AHV-2; also designated equid herpesvirus-7 (EHV-7)) from the nasal secretions (NS) of a healthy mule in northern California. The virus was initially identified by transmission electron microscopic examination of lysates of cell culture inoculated with NS collected from the mule. A 913 nucleotide sequence of the DNA polymerase gene was amplified using degenerate primers, and comparison of this sequence with those of various other herpesviruses showed that the mule herpesvirus was most closely related to EHV-2 (AHV-2 sequences were not available for comparison). The sequence of a shorter portion (166 nucleotides) of the mule herpesvirus DNA polymerase gene was identical to that of the published sequence of an asinine gammaherpesvirus, previously designated as AHV-4-3 (AY054992). AHV-2 was detected by real-time polymerase chain reaction assay in the NS of approximately 8% of a cohort of 114 healthy mules and 13 donkeys.
Strongylus asini (Nematoda, Strongyloidea): genetic relationships with other Strongylus species determined by ribosomal DNA.

PubMed

Hung, G C; Jacobs, D E; Krecek, R C; Gasser, R B; Chilton, N B

1996-12-01

Genomic DNA was isolated from adult Strongylus asini collected from zebra. The second ribosomal transcribed spacer (ITS-2) was amplified and sequenced using polymerase chain reaction (PCR) based techniques. The DNA sequence was compared with previously published data for 3 related Strongylus species. A PCR-linked restriction fragment length polymorphism method allowed the 4 species to be differentiated unequivocally. The ITS-2 sequence of S. asini was found to be more similar to those of S. edentatus (87.1%) and S. equinus (95.3%) than to that of S. vulgaris (73.9%). This result confirms that S. asini and S. vulgaris represent separate species and supports the retention of the 4 species within 1 genus.
Genome sequence of Prevotella intermedia SUNY aB G8-9K-3, a biofilm forming strain with drug-resistance.

PubMed

Moon, Ji-Hoi; Kim, Minjung; Lee, Jae-Hyung

Prevotella intermedia has long been known as the principal etiologic agent of periodontal diseases and is associated with various systemic diseases. Previous studies showed that intra-species differences exist in the capacity for biofilm formation, antibiotic resistance, and serological reaction among P. intermedia strains. Here we report the genome sequence of P. intermedia SUNY aB G8-9K-3 (designated ATCC49046), which displays relatively high antimicrobial resistance and biofilm-forming capacity. Genome sequencing information provides important clues in understanding the genetic bases of phenotypic differences among P. intermedia strains. Copyright © 2016 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
Genetic composition and connectivity of the Antillean manatee (Trichechus manatus manatus) in Panama

USGS Publications Warehouse

Díaz-Ferguson, Edgardo; Hunter, Margaret; Guzmán, Héctor M.

2017-01-01

Genetic diversity and haplotype composition of the West Indian manatee (Trichechus manatus) population from the San San Pond Sak wetland in Bocas del Toro, Panama was studied using a segment of mitochondrial DNA (D-loop). No genetic information has been published to date for Panamanian populations. Due to the secretive behavior and small population size of the species in the area, DNA extraction was conducted from opportunistically collected fecal (N=20), carcass tissue (N=4) and bone (N=4) samples. However, after DNA processing only 10 samples provided good quality DNA for sequencing (3 fecal, 4 tissue and 3 bone samples). We found three haplotypes in total; two of these haplotypes are reported for the first time, J02 (N=3) and J03 (N=4), and one, J01 (N=3), was previously published. Genetic diversity showed similar values to previous studies conducted in other Caribbean regions with moderate values of nucleotide diversity (π= 0.00152) and haplotypic diversity (Hd= 0.57). Connectivity assessment was based on sequence similarity, genetic distance and genetic differentiation between the San San population and other manatee populations previously studied. The J01 haplotype found in the Panamanian population is shared with populations in the Caribbean mainland and the Gulf of Mexico, showing reduced differentiation, corroborated by an Fst value of 0.0094 between HSSPS and this region. In contrast, comparisons between our sequences and populations in the Eastern Caribbean (South American populations) and North Western Caribbean showed fewer similarities (Fst =0.049 and 0.058, respectively). These results corroborate previous phylogeographic patterns already established for manatee populations and situate Panamanian populations into the Belize and Mexico cluster. In addition, these findings will be a baseline for future studies and comparisons with manatees in other areas of Panama and Central America. These results should be considered to inform management decisions regarding conservation of genetic diversity, future controlled introductions, connectivity and effective population size of the West Indian manatee along the Central American corridor.
Next-generation sequencing of the BRCA1 and BRCA2 genes for the genetic diagnostics of hereditary breast and/or ovarian cancer.

PubMed

Trujillano, Daniel; Weiss, Maximilian E R; Schneider, Juliane; Köster, Julia; Papachristos, Efstathios B; Saviouk, Viatcheslav; Zakharkina, Tetyana; Nahavandi, Nahid; Kovacevic, Lejla; Rolfs, Arndt

2015-03-01

Genetic testing for hereditary breast and/or ovarian cancer mostly relies on laborious molecular tools that use Sanger sequencing to scan for mutations in the BRCA1 and BRCA2 genes. We explored a more efficient genetic screening strategy based on next-generation sequencing of the BRCA1 and BRCA2 genes in 210 hereditary breast and/or ovarian cancer patients. We first validated this approach in a cohort of 115 samples with previously known BRCA1 and BRCA2 mutations and polymorphisms. Genomic DNA was amplified using the Ion AmpliSeq BRCA1 and BRCA2 panel. The DNA libraries were pooled, barcoded, and sequenced using an Ion Torrent Personal Genome Machine sequencer. The combination of different robust bioinformatics tools allowed detection of all previously known pathogenic mutations and polymorphisms in the 115 samples, without detecting spurious pathogenic calls. We then used the same assay in a discovery cohort of 95 uncharacterized hereditary breast and/or ovarian cancer patients for BRCA1 and BRCA2. In addition, we describe the allelic frequencies across 210 hereditary breast and/or ovarian cancer patients of 74 unique definitely and likely pathogenic and uncertain BRCA1 and BRCA2 variants, some of which have not been previously annotated in the public databases. Targeted next-generation sequencing is ready to substitute classic molecular methods to perform genetic testing on the BRCA1 and BRCA2 genes and provides a greater opportunity for more comprehensive testing of at-risk patients. Copyright © 2015 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
Successful enrichment and recovery of whole mitochondrial genomes from ancient human dental calculus.

PubMed

Ozga, Andrew T; Nieves-Colón, Maria A; Honap, Tanvi P; Sankaranarayanan, Krithivasan; Hofman, Courtney A; Milner, George R; Lewis, Cecil M; Stone, Anne C; Warinner, Christina

2016-06-01

Archaeological dental calculus is a rich source of host-associated biomolecules. Importantly, however, dental calculus is more accurately described as a calcified microbial biofilm than a host tissue. As such, concerns regarding destructive analysis of human remains may not apply as strongly to dental calculus, opening the possibility of obtaining human health and ancestry information from dental calculus in cases where destructive analysis of conventional skeletal remains is not permitted. Here we investigate the preservation of human mitochondrial DNA (mtDNA) in archaeological dental calculus and its potential for full mitochondrial genome (mitogenome) reconstruction in maternal lineage ancestry analysis. Extracted DNA from six individuals at the 700-year-old Norris Farms #36 cemetery in Illinois was enriched for mtDNA using in-solution capture techniques, followed by Illumina high-throughput sequencing. Full mitogenomes (7-34×) were successfully reconstructed from dental calculus for all six individuals, including three individuals who had previously tested negative for DNA preservation in bone using conventional PCR techniques. Mitochondrial haplogroup assignments were consistent with previously published findings, and additional comparative analysis of paired dental calculus and dentine from two individuals yielded equivalent haplotype results. All dental calculus samples exhibited damage patterns consistent with ancient DNA, and mitochondrial sequences were estimated to be 92-100% endogenous. DNA polymerase choice was found to impact error rates in downstream sequence analysis, but these effects can be mitigated by greater sequencing depth. Dental calculus is a viable alternative source of human DNA that can be used to reconstruct full mitogenomes from archaeological remains. Am J Phys Anthropol 160:220-228, 2016. © 2016 The Authors American Journal of Physical Anthropology Published by Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Successful enrichment and recovery of whole mitochondrial genomes from ancient human dental calculus

PubMed Central

Ozga, Andrew T.; Nieves‐Colón, Maria A.; Honap, Tanvi P.; Sankaranarayanan, Krithivasan; Hofman, Courtney A.; Milner, George R.; Lewis, Cecil M.; Stone, Anne C.

2016-01-01

ABSTRACT Objectives: Archaeological dental calculus is a rich source of host‐associated biomolecules. Importantly, however, dental calculus is more accurately described as a calcified microbial biofilm than a host tissue. As such, concerns regarding destructive analysis of human remains may not apply as strongly to dental calculus, opening the possibility of obtaining human health and ancestry information from dental calculus in cases where destructive analysis of conventional skeletal remains is not permitted. Here we investigate the preservation of human mitochondrial DNA (mtDNA) in archaeological dental calculus and its potential for full mitochondrial genome (mitogenome) reconstruction in maternal lineage ancestry analysis. Materials and Methods: Extracted DNA from six individuals at the 700‐year‐old Norris Farms #36 cemetery in Illinois was enriched for mtDNA using in‐solution capture techniques, followed by Illumina high‐throughput sequencing. Results: Full mitogenomes (7–34×) were successfully reconstructed from dental calculus for all six individuals, including three individuals who had previously tested negative for DNA preservation in bone using conventional PCR techniques. Mitochondrial haplogroup assignments were consistent with previously published findings, and additional comparative analysis of paired dental calculus and dentine from two individuals yielded equivalent haplotype results. All dental calculus samples exhibited damage patterns consistent with ancient DNA, and mitochondrial sequences were estimated to be 92–100% endogenous. DNA polymerase choice was found to impact error rates in downstream sequence analysis, but these effects can be mitigated by greater sequencing depth. Discussion: Dental calculus is a viable alternative source of human DNA that can be used to reconstruct full mitogenomes from archaeological remains. Am J Phys Anthropol 160:220–228, 2016. © 2016 The Authors American Journal of Physical Anthropology Published by Wiley Periodicals, Inc. PMID:26989998
Comparative pathogenomics of Clostridium tetani.

PubMed

Cohen, Jonathan E; Wang, Rong; Shen, Rong-Fong; Wu, Wells W; Keller, James E

2017-01-01

Clostridium tetani and Clostridium botulinum produce two of the most potent neurotoxins known, tetanus neurotoxin and botulinum neurotoxin, respectively. Extensive biochemical and genetic investigation has been devoted to identifying and characterizing various C. botulinum strains. Less effort has been focused on studying C. tetani likely because recently sequenced strains of C. tetani show much less genetic diversity than C. botulinum strains and because widespread vaccination efforts have reduced the public health threat from tetanus. Our aim was to acquire genomic data on the U.S. vaccine strain of C. tetani to better understand its genetic relationship to previously published genomic data from European vaccine strains. We performed high throughput genomic sequence analysis on two wild-type and two vaccine C. tetani strains. Comparative genomic analysis was performed using these and previously published genomic data for seven other C. tetani strains. Our analysis focused on single nucleotide polymorphisms (SNP) and four distinct constituents of the mobile genome (mobilome): a hypervariable flagellar glycosylation island region, five conserved bacteriophage insertion regions, variations in three CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated) systems, and a single plasmid. Intact type IA and IB CRISPR/Cas systems were within 10 of 11 strains. A type IIIA CRISPR/Cas system was present in two strains. Phage infection histories derived from CRISPR-Cas sequences indicate C. tetani encounters phages common among commensal gut bacteria and soil-borne organisms consistent with C. tetani distribution in nature. All vaccine strains form a clade distinct from currently sequenced wild type strains when considering variations in these mobile elements. SNP, flagellar glycosylation island, prophage content and CRISPR/Cas phylogenic histories provide tentative evidence suggesting vaccine and wild type strains share a common ancestor.
Median network analysis of defectively sequenced entire mitochondrial genomes from early and contemporary disease studies.

PubMed

Bandelt, Hans-Jürgen; Yao, Yong-Gang; Bravi, Claudio M; Salas, Antonio; Kivisild, Toomas

2009-03-01

Sequence analysis of the mitochondrial genome has become a routine method in the study of mitochondrial diseases. Quite often, the sequencing efforts in search of pathogenic or disease-associated mutations are affected by technical and interpretive problems, caused by sample mix-up, contamination, biochemical problems, incomplete sequencing, misdocumentation and insufficient reference to previously published data. To assess data quality in case studies of mitochondrial diseases, it is recommended to compare any mtDNA sequence under consideration to its phylogenetically closest lineages available on the Web. The median network method has proven useful for visualizing potential problems with the data. We contrast some early reports of complete mtDNA sequences to more recent total mtDNA sequencing efforts in studies of various mitochondrial diseases. We conclude that the quality of complete mtDNA sequences generated in the medical field in the past few years is somewhat unsatisfactory and may even fall behind that of pioneer manual sequencing in the early nineties. Our study provides a paradigm for an a posteriori evaluation of sequence quality and for detection of potential problems with inferring a pathogenic status of a particular mutation.
Mapping-by-sequencing in complex polyploid genomes using genic sequence capture: a case study to map yellow rust resistance in hexaploid wheat.

PubMed

Gardiner, Laura-Jayne; Bansept-Basler, Pauline; Olohan, Lisa; Joynson, Ryan; Brenchley, Rachel; Hall, Neil; O'Sullivan, Donal M; Hall, Anthony

2016-08-01

Previously we extended the utility of mapping-by-sequencing by combining it with sequence capture and mapping sequence data to pseudo-chromosomes that were organized using wheat-Brachypodium synteny. This, with a bespoke haplotyping algorithm, enabled us to map the flowering time locus in the diploid wheat Triticum monococcum L., identifying a set of deleted genes (Gardiner et al., 2014). Here, we develop this combination of gene enrichment and sliding window mapping-by-synteny analysis to map the Yr6 locus for yellow stripe rust resistance in hexaploid wheat. A 110 MB NimbleGen capture probe set was used to enrich and sequence a doubled haploid mapping population of hexaploid wheat derived from an Avalon and Cadenza cross. The Yr6 locus was identified by mapping to the POPSEQ chromosomal pseudomolecules using a bespoke pipeline and algorithm (Chapman et al., 2015). Furthermore, the same locus was identified using newly developed pseudo-chromosome sequences as a mapping reference that are based on the genic sequence used for sequence enrichment. The pseudo-chromosomes allow us to demonstrate the application of mapping-by-sequencing to even poorly defined polyploid genomes where chromosomes are incomplete and sub-genome assemblies are collapsed. This analysis uniquely enabled us to: compare wheat genome annotations; identify the Yr6 locus, defining a smaller genic region than was previously possible; associate the interval with one wheat sub-genome; and increase the density of SNP markers associated. Finally, we built the pipeline in iPlant, making it a user-friendly community resource for phenotype mapping. © 2016 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.
Ichnology applied to sequence stratigraphic analysis of Siluro-Devonian mud-dominated shelf deposits, Paraná Basin, Brazil

NASA Astrophysics Data System (ADS)

Sedorko, Daniel; Netto, Renata G.; Savrda, Charles E.

2018-04-01

Previous studies of the Paraná Supersequence (Furnas and Ponta Grossa formations) of the Paraná Basin in southern Brazil have yielded disparate sequence stratigraphic interpretations. An integrated sedimentological, paleontological, and ichnological model was created to establish a refined sequence stratigraphic framework for this succession, focusing on the Ponta Grossa Formation. Twenty-nine ichnotaxa are recognized in the Ponta Grossa Formation, recurring assemblages of which define five trace fossil suites that represent various expressions of the Skolithos, Glossifungites and Cruziana ichnofacies. Physical sedimentologic characteristics and associated softground ichnofacies provide the basis for recognizing seven facies that reflect a passive relationship to bathymetric gradients from shallow marine (shoreface) to offshore deposition. The vertical distribution of facies provides the basis for dividing the Ponta Grossa Formation into three major (3rd-order) depositional sequences (Siluro-Devonian and Devonian I and II), each containing a record of three to seven higher-order relative sea-level cycles. Major sequence boundaries, commonly coinciding with hiatuses recognized from previously published biostratigraphic data, are locally marked by firmground Glossifungites Ichnofacies associated with submarine erosion. Maximum transgressive horizons are prominently marked by unbioturbated or weakly bioturbated black shales. By integrating observations of the Ponta Grossa Formation with those recently made on the underlying marginal- to shallow-marine Furnas Formation, the entire Paraná Supersequence can be divided into four disconformity-bound sequences: a Lower Silurian (Llandovery-Wenlock) sequence, corresponding to lower and middle units of the Furnas; a Siluro-Devonian sequence (?Pridoli-Early Emsian), and Devonian sequences I (Late Emsian-Late Eifelian) and II (Late Eifelian-Early Givetian). Stratigraphic positions of sequence boundaries generally coincide with regressive phases on established global sea-level curves for the Silurian-Devonian.
Preferential amino acid sequences in alumina-catalyzed peptide bond formation.

PubMed

Bujdák, J; Rode, B M

2002-05-21

The catalytic effect of activated alumina on amino acid condensation was investigated. The readiness of amino acids to form peptide sequences was estimated on the basis of the yield of dipeptides and was found to decrease in the order glycine (Gly), alanine (Ala), leucine (Leu), valine (Val), proline (Pro). For example, approximately 15% Gly was converted to the dipeptide (Gly(2)), 5% to cyclic anhydride (cyc(Gly(2))) and small amounts of tri- (Gly(3)) and tetrapeptide (Gly(4)) were formed after 28 days. On the other hand, only trace amounts of Pro(2) were formed from proline under the same conditions. Preferential formation of certain sequences was observed in the mixed reaction systems containing two amino acids. For example, almost ten times more Gly-Val than Val-Gly was formed in the Gly+Val reaction system. The preferred sequences can be explained on the basis of an inductive effect that side groups have on the nucleophilicity and electrophilicity, respectively, of the amino and carboxyl groups. A comparison with published data of amino acid reactions in other reaction systems revealed that the main trends of preferential sequence formation were the same as those described for the salt-induced peptide formation (SIPF) reaction. The results of this work and other previously published papers show that alumina and related mineral surfaces might have played a crucial role in the prebiotic formation of the first peptides on the primitive earth.
SANSparallel: interactive homology search against Uniprot.

PubMed

Somervuo, Panu; Holm, Liisa

2015-07-01

Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
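The abstract names the suffix array neighborhood search (SANS) method but does not describe its internals. The toy sketch below only illustrates the general idea suggested by the name: sort all database suffixes, locate each query suffix in that order, and credit the database sequences whose suffixes sit nearby. It is not the SANSparallel implementation; the function names, the window size, and the vote-counting score are assumptions made for illustration.

```python
import bisect
from collections import Counter

def build_suffix_array(db):
    """Return a sorted list of (suffix, seq_id) over every database sequence.

    Storing whole suffixes is wasteful and only acceptable for a toy example.
    """
    suffixes = []
    for seq_id, seq in db.items():
        for start in range(len(seq)):
            suffixes.append((seq[start:], seq_id))
    suffixes.sort()
    return suffixes

def neighborhood_search(query, suffixes, window=2):
    """Score database sequences by how often their suffixes neighbor query suffixes.

    For each query suffix, find its insertion point in the sorted suffix list
    and give one vote to every sequence owning a suffix within `window`
    positions on either side of that point.
    """
    keys = [suffix for suffix, _ in suffixes]
    votes = Counter()
    for start in range(len(query)):
        pos = bisect.bisect_left(keys, query[start:])
        lo = max(0, pos - window)
        hi = min(len(suffixes), pos + window)
        for _, seq_id in suffixes[lo:hi]:
            votes[seq_id] += 1
    return votes.most_common()

# Made-up miniature "database": C differs from A by one residue, B is unrelated.
database = {"A": "MKTAYIAKQR", "B": "GGGSSGGGSS", "C": "MKTAYLAKQR"}
hits = neighborhood_search("MKTAYIAKQR", build_suffix_array(database))
```

In this toy setting the identical sequence and its one-residue variant collect far more votes than the unrelated sequence, which is the behavior a neighborhood-based ranking is meant to show.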
Limited Genetic Diversity Preceded Extinction of the Tasmanian Tiger

PubMed Central

Menzies, Brandon R.; Renfree, Marilyn B.; Heider, Thomas; Mayer, Frieder; Hildebrandt, Thomas B.; Pask, Andrew J.

2012-01-01

The Tasmanian tiger or thylacine was the largest carnivorous marsupial when Europeans first reached Australia. Sadly, the last known thylacine died in captivity in 1936. A recent analysis of the genome of the closely related and extant Tasmanian devil demonstrated limited genetic diversity between individuals. While a similar lack of diversity has been reported for the thylacine, this analysis was based on just two individuals. Here we report the sequencing of an additional 12 museum-archived specimens collected between 102 and 159 years ago. We examined a portion of the mitochondrial DNA hyper-variable control region and determined that all sequences were on average 99.5% identical at the nucleotide level. As a measure of accuracy we also sequenced mitochondrial DNA from a mother and two offspring. As expected, these samples were found to be 100% identical, validating our methods. We also used 454 sequencing to reconstruct 2.1 kilobases of the mitochondrial genome, which shared 99.91% identity with the two complete thylacine mitochondrial genomes published previously. Our thylacine genomic data also contained three highly divergent putative nuclear mitochondrial sequences, which grouped phylogenetically with the published thylacine mitochondrial homologs but contained 100-fold more polymorphisms than the conserved fragments. Together, our data suggest that the thylacine population in Tasmania had limited genetic diversity prior to its extinction, possibly as a result of their geographic isolation from mainland Australia approximately 10,000 years ago. PMID:22530022
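The identity percentages quoted in this abstract are the kind of figure produced by counting matching positions between aligned sequences. The following is a minimal sketch of that calculation, assuming the sequences are already aligned to equal length with "-" marking gaps; the function names and the gap-skipping convention are assumptions, not the authors' procedure.

```python
from itertools import combinations

def percent_identity(a, b):
    """Percent of gap-free aligned positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def mean_pairwise_identity(seqs):
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)

# Made-up aligned fragments, purely illustrative.
example = ["ACGT", "ACGT", "ACGA"]
average = mean_pairwise_identity(example)
```

As a rough sense of scale, 99.91% identity over 2.1 kilobases corresponds to about two mismatched positions (2100 × 0.0009 ≈ 1.9).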
Temporal lobe dual pathology in malignant migrating partial seizures in infancy.

PubMed

Coppola, Giangennaro; Operto, Francesca Felicia; Auricchio, Gianfranca; D'Amico, Alessandra; Fortunato, Delia; Pascotto, Antonio

2007-06-01

A child had the characteristic clinical and EEG pattern of migrating partial seizures in infancy with left temporal lobe atrophy, hippocampal sclerosis and cortical-subcortical blurring. Seizures were drug-resistant, with recurring episodes of status epilepticus. The child developed microcephaly with arrest of psychomotor development. Focal brain lesions, in the context of migrating partial seizures, have not been previously reported. [Published with video sequences].
Reality check: Prior exposure facilitates picture book imitation by 15-month-old infants.

PubMed

Simcock, Gabrielle; Heron-Delaney, Michelle

2016-11-01

We examined whether 15-month-olds could imitate a novel action sequence from a picture book, and whether or not pre-exposure to the objects before reading the book would facilitate imitation. We found that infants only imitated from a picture book above baseline when they had previously interacted with the objects. Copyright © 2016. Published by Elsevier Inc.
Improving the efficiency of a user-driven learning system with reconfigurable hardware. Application to DNA splicing.

PubMed

Lemoine, E; Merceron, D; Sallantin, J; Nguifo, E M

1999-01-01

This paper describes a new approach to problem solving by splitting up problem component parts between software and hardware. Our main idea arises from the combination of two previously published works. The first one proposed a conceptual environment of concept modelling in which the machine and the human expert interact. The second one reported an algorithm based on a reconfigurable hardware system which outperforms any kind of previously published genetic database scanning hardware or algorithms. Here we show how efficient the interaction between the machine and the expert is when the concept modelling is based on a reconfigurable hardware system. Their cooperation is thus achieved at a real-time interaction speed. The designed system has been partially applied to the recognition of primate splice junction sites in genetic sequences.
Evolution and spread of Ebola virus in Liberia, 2014–2015

PubMed Central

Ladner, Jason T.; Wiley, Michael R.; Mate, Suzanne; Dudas, Gytis; Prieto, Karla; Lovett, Sean; Nagle, Elyse R.; Beitzel, Brett; Gilbert, Merle L.; Fakoli, Lawrence; Diclaro, Joseph W.; Schoepp, Randal J.; Fair, Joseph; Kuhn, Jens H.; Hensley, Lisa E.; Park, Daniel J.; Sabeti, Pardis C.; Rambaut, Andrew; Sanchez-Lockhart, Mariano; Bolay, Fatorma K.; Kugelman, Jeffrey R.; Palacios, Gustavo

2015-01-01

SUMMARY The 2013–present Western African Ebola virus disease (EVD) outbreak is the largest ever recorded with >28,000 reported cases. Ebola virus (EBOV) genome sequencing has played an important role throughout this outbreak; however, relatively few sequences have been determined from patients in Liberia, the second worst-affected country. Here, we report 140 EBOV genome sequences from the second wave of the Liberian outbreak and analyze them in combination with 782 previously published sequences from throughout the Western African outbreak. While multiple early introductions of EBOV to Liberia are evident, the majority of Liberian EVD cases are consistent with a single introduction, followed by spread and diversification within the country. Movement of the virus within Liberia was widespread and reintroductions from Liberia served as an important source for the continuation of the already ongoing EVD outbreak in Guinea. Overall, little evidence was found for incremental adaptation of EBOV to the human host. PMID:26651942
Deep sequencing reveals exceptional diversity and modes of transmission for bacterial sponge symbionts.

PubMed

Webster, Nicole S; Taylor, Michael W; Behnam, Faris; Lücker, Sebastian; Rattei, Thomas; Whalan, Stephen; Horn, Matthias; Wagner, Michael

2010-08-01

Marine sponges contain complex bacterial communities of considerable ecological and biotechnological importance, with many of these organisms postulated to be specific to sponge hosts. Testing this hypothesis in light of the recent discovery of the rare microbial biosphere, we investigated three Australian sponges by massively parallel 16S rRNA gene tag pyrosequencing. Here we show bacterial diversity that is unparalleled in an invertebrate host, with more than 250,000 sponge-derived sequence tags being assigned to 23 bacterial phyla and revealing up to 2996 operational taxonomic units (95% sequence similarity) per sponge species. Of the 33 previously described 'sponge-specific' clusters that were detected in this study, 48% were found exclusively in adults and larvae - implying vertical transmission of these groups. The remaining taxa, including 'Poribacteria', were also found at very low abundance among the 135,000 tags retrieved from surrounding seawater. Thus, members of the rare seawater biosphere may serve as seed organisms for widely occurring symbiont populations in sponges and their host association might have evolved much more recently than previously thought. © 2009 Society for Applied Microbiology and Blackwell Publishing Ltd.

AMPLISAS: a web server for multilocus genotyping using next-generation amplicon sequencing data.

PubMed

Sebastian, Alvaro; Herdegen, Magdalena; Migalska, Magdalena; Radwan, Jacek

2016-03-01

Next-generation sequencing (NGS) technologies are revolutionizing the fields of biology and medicine as powerful tools for amplicon sequencing (AS). Using combinations of primers and barcodes, it is possible to sequence targeted genomic regions with deep coverage for hundreds, even thousands, of individuals in a single experiment. This is extremely valuable for the genotyping of gene families in which locus-specific primers are often difficult to design, such as the major histocompatibility complex (MHC). The utility of AS is, however, limited by the high intrinsic sequencing error rates of NGS technologies and other sources of error such as polymerase amplification or chimera formation. Correcting these errors requires extensive bioinformatic post-processing of NGS data. Amplicon Sequence Assignment (AMPLISAS) is a tool that performs analysis of AS results in a simple and efficient way, while offering customization options for advanced users. AMPLISAS is designed as a three-step pipeline consisting of (i) read demultiplexing, (ii) unique sequence clustering and (iii) erroneous sequence filtering. Allele sequences and frequencies are retrieved in excel spreadsheet format, making them easy to interpret. AMPLISAS performance has been successfully benchmarked against previously published genotyped MHC data sets obtained with various NGS technologies. © 2015 John Wiley & Sons Ltd.
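The three-step structure described in this abstract (read demultiplexing, unique sequence clustering, erroneous sequence filtering) can be sketched in a few lines. This is a deliberately simplified illustration, not the AMPLISAS algorithm: it assumes fixed-length barcodes at the start of each read, treats only exactly identical sequences as one cluster, and filters errors with a single frequency cut-off. All function names, parameters, and thresholds are hypothetical.

```python
from collections import Counter, defaultdict

def demultiplex(reads, barcodes):
    """Step (i): assign reads to samples by barcode prefix and strip the barcode.

    barcodes maps barcode -> sample name; it must be non-empty and all
    barcodes are assumed to share one length. Unrecognized reads are dropped.
    """
    barcode_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_length])
        if sample is not None:
            by_sample[sample].append(read[barcode_length:])
    return dict(by_sample)

def cluster_unique(sequences):
    """Step (ii): collapse identical sequences and count their depth."""
    return Counter(sequences)

def filter_low_frequency(counts, min_fraction):
    """Step (iii): keep sequences whose within-sample frequency passes a cut-off."""
    total = sum(counts.values())
    return {
        seq: n / total
        for seq, n in counts.items()
        if n / total >= min_fraction
    }

def genotype(reads, barcodes, min_fraction=0.2):
    """Run the three steps and return sample -> {sequence: frequency}."""
    return {
        sample: filter_low_frequency(cluster_unique(seqs), min_fraction)
        for sample, seqs in demultiplex(reads, barcodes).items()
    }

# Made-up reads: barcode (3 nt) followed by the amplicon sequence.
reads = (
    ["AAACGTACGT"] * 6      # sample s1, putative allele 1
    + ["AAACGTTCGT"] * 3    # sample s1, putative allele 2
    + ["AAACGTACGA"] * 1    # sample s1, low-frequency variant (likely error)
    + ["TTTCCCC"] * 2       # sample s2
    + ["GGGACGT"]           # unknown barcode, discarded
)
result = genotype(reads, {"AAA": "s1", "TTT": "s2"})
```

A single frequency threshold is the crudest possible filter; the abstract notes that real errors also arise from polymerase amplification and chimera formation, which a cut-off alone does not address.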
"The devil's in the detail": Release of an expanded, enhanced and dynamically revised forensic STR Sequence Guide.

PubMed

Phillips, C; Gettings, K Butler; King, J L; Ballard, D; Bodner, M; Borsuk, L; Parson, W

2018-05-01

The STR sequence template file published in 2016 as part of the considerations from the DNA Commission of the International Society for Forensic Genetics on minimal STR sequence nomenclature requirements, has been comprehensively revised and audited using the latest GRCh38 genome assembly. The list of forensic STRs characterized was expanded by including supplementary autosomal, X- and Y-chromosome microsatellites in less common use for routine DNA profiling, but some likely to be adopted in future massively parallel sequencing (MPS) STR panels. We outline several aspects of sequence alignment and annotation that required care and attention to detail when comparing sequences to GRCh37 and GRCh38 assemblies, as well as the necessary matching of MPS-based allele descriptions to previously established repeat region structures described in initial sequencing studies of the less well known forensic STRs. The revised sequence guide is now available in a dynamically updated FTP format from the STRidER website with a date-stamped change log to allow users to explore their own MPS data with the most up-to-date forensic STR sequence information compiled in a simple guide. Copyright © 2018 Elsevier B.V. All rights reserved.
SeqSIMLA2_exact: simulate multiple disease sites in large pedigrees with given disease status for diseases with low prevalence.

PubMed

Yao, Po-Ju; Chung, Ren-Hua

2016-02-15

It is difficult for current simulation tools to simulate sequence data in a pre-specified pedigree structure and pre-specified affection status. Previously, we developed a flexible tool, SeqSIMLA2, for simulating sequence data in either unrelated case-control or family samples with different disease and quantitative trait models. Here we extended the tool to efficiently simulate sequences with multiple disease sites in large pedigrees with a given disease status for each pedigree member, assuming that the disease prevalence is low. SeqSIMLA2_exact is implemented with C++ and is available at http://seqsimla.sourceforge.net. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Performance comparison of the Prophecy (forecasting) Algorithm in FFT form for unseen feature and time-series prediction

NASA Astrophysics Data System (ADS)

Jaenisch, Holger; Handley, James

2013-06-01

We introduce a generalized numerical prediction and forecasting algorithm. We have previously published it for malware byte sequence feature prediction and generalized distribution modeling for disparate test article analysis. We show how non-trivial non-periodic extrapolation of a numerical sequence (forecast and backcast) from the starting data is possible. Our ancestor-progeny prediction can yield new options for evolutionary programming. Our equations enable analytical integrals and derivatives to any order. Interpolation is controllable from smooth continuous to fractal structure estimation. We show how our generalized trigonometric polynomial can be derived using a Fourier transform.
Microbial community analysis of the hypersaline water of the Dead Sea using high-throughput amplicon sequencing.

PubMed

Jacob, Jacob H; Hussein, Emad I; Shakhatreh, Muhamad Ali K; Cornelison, Christopher T

2017-10-01

Amplicon sequencing using next-generation technology (bTEFAP®) has been utilized in describing the diversity of Dead Sea microbiota. The investigated area is a well-known salt lake in the western part of Jordan found in the lowest geographical location in the world (more than 420 m below sea level) and characterized by extreme salinity (approximately 34%) in addition to other extreme conditions (low pH, unique ionic composition different from sea water). DNA was extracted from Dead Sea water. A total of 314,310 small subunit RNA (SSU rRNA) sequences were parsed, and 288,452 sequences were then clustered. For alpha diversity analysis, the sample was rarefied to 3,000 sequences. The Shannon-Wiener index curve plot reached a plateau at approximately 3,000 sequences, indicating that sequencing depth was sufficient to capture the full scope of microbial diversity. Archaea dominated the sequences (52%), whereas Bacteria constituted 45% of the sequences. Altogether, prokaryotic sequences (which constitute 97% of all sequences) were found to predominate. The findings expand on previous studies by using high-throughput amplicon sequencing to describe the microbial community in an environment which in recent years has been shown to hide some interesting diversity. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
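Two calculations named in this abstract, rarefaction to a fixed depth and the Shannon-Wiener index, are compact enough to sketch. The version below assumes the natural logarithm and rarefaction by random subsampling without replacement; the abstract does not state which conventions the authors used, and the function names and example counts are hypothetical.

```python
import math
import random
from collections import Counter

def shannon_index(counts):
    """Shannon-Wiener index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences")
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total
            h -= p * math.log(p)
    return h

def rarefy(counts, depth, seed=0):
    """Subsample `depth` sequences without replacement and recount taxa."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return Counter(rng.sample(pool, depth))

# Made-up taxon counts; not data from the study.
community = {"taxon_a": 520, "taxon_b": 450, "taxon_c": 30}
subsample = rarefy(community, depth=300)
h_full = shannon_index(community)
h_rarefied = shannon_index(subsample)
```

Repeating the subsampling at increasing depths and plotting the index against depth gives the kind of curve whose plateau the abstract uses to argue that sequencing depth was sufficient.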
HIV Sequence Compendium 2010

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuiken, Carla; Foley, Brian; Leitner, Thomas

This compendium is an annual printed summary of the data contained in the HIV sequence database. In these compendia we try to present a judicious selection of the data in such a way that it is of maximum utility to HIV researchers. Each of the alignments attempts to display the genetic variability within the different species, groups and subtypes of the virus. This compendium contains sequences published before January 1, 2010. Hence, though it is called the 2010 Compendium, its contents correspond to the 2009 curated alignments on our website. The number of sequences in the HIV database is still increasing exponentially. In total, at the time of printing, there were 339,306 sequences in the HIV Sequence Database, an increase of 45% since last year. The number of near complete genomes (>7000 nucleotides) increased to 2576 by end of 2009, reflecting a smaller increase than in previous years. However, as in previous years, the compendium alignments contain only a small fraction of these. Included in the alignments are a small number of sequences representing each of the subtypes and the more prevalent circulating recombinant forms (CRFs) such as 01 and 02, as well as a few outgroup sequences (group O and N and SIV-CPZ). Of the rarer CRFs we included one representative each. A more complete version of all alignments is available on our website, http://www.hiv.lanl.gov/content/sequence/NEWALIGN/align.html. Reprints are available from our website in the form of both HTML and PDF files. As always, we are open to complaints and suggestions for improvement. Inquiries and comments regarding the compendium should be addressed to seq-info@lanl.gov.
Phylogenetic relationships among the major lineages of the birds-of-paradise (Paradisaeidae) using mitochondrial DNA gene sequences.

PubMed

Nunn, G B; Cracraft, J

1996-06-01

Complete mitochondrial cytochrome b gene sequences were determined from 12 species of the Australo-Papuan birds-of-paradise (Paradisaeidae) representing 9 genera. Phylogenetic analysis of these and 5 previously published sequences reveals a radiation of the main paradisaeinine lineages that took place over a relatively short evolutionary time scale. The core paradisaeinines are resolved as the monophyletic sister-group to the crow-like manucodines. The genus Parotia is basal to other paradisaeinines and is not closely related to the morphologically similar genera Ptiloris and Lophorina. Three major clades within the paradisaeinine ingroup include: (1) Cicinnurus and Diphyllodes, (2) Ptiloris and Lophorina, and (3) the genus Paradisaea. The monotypic genus Seleucidis is apparently closely related to clades (1) and (2). Cytochrome b sequences did not provide evidence for the monophyly of the sicklebill genera Epimachus and Drepanornis. The paradisaeid tree is characterized by short internodal distances. Thus, some clades cannot be strongly resolved by cytochrome b sequences alone.
New partial sequences of phosphoenolpyruvate carboxylase as molecular phylogenetic markers.

PubMed

Gehrig, H; Heute, V; Kluge, M

2001-08-01

To better understand the evolution of the enzyme phosphoenolpyruvate carboxylase (PEPC) and to test its versatility as a molecular character in phylogenetic and taxonomic studies, we have characterized and compared 70 new partial PEPC nucleotide and amino acid sequences (about 1100 bp of the 3' side of the gene) from 50 plant species (24 species of Bryophyta, 1 of Pteridophyta, and 25 of Spermatophyta). Together with previously published data, the new set of sequences allowed us to construct the most complete phylogenetic tree of PEPC to date, in which the PEPC sequences cluster according to both the taxonomic positions of the donor plants and the assumed specific function of the PEPC isoforms. Altogether, the study further strengthens the view that PEPC sequences can provide interesting information for the reconstruction of phylogenetic relations between organisms and metabolic pathways. To avoid confusion in future discussion, we propose a new nomenclature for the denotation of PEPC isoforms. Copyright 2001 Academic Press.
Recombinational hotspot specific to female meiosis in the mouse major histocompatibility complex.

PubMed

Shiroishi, T; Hanzawa, N; Sagai, T; Ishiura, M; Gojobori, T; Steinmetz, M; Moriwaki, K

1990-01-01

The wm7 haplotype of the major histocompatibility complex (MHC), derived from the Japanese wild mouse Mus musculus molossinus, enhances recombination specific to female meiosis in the K/A beta interval of the MHC. We have mapped crossover points of fifteen independent recombinants from genetic crosses of the wm7 and laboratory haplotypes. Most of them were confined to a short segment of approximately 1 kilobase (kb) of DNA between the A beta 3 and A beta 2 genes, indicating the presence of a female-specific recombinational hotspot. Its location overlaps with a sex-independent hotspot previously identified in the Mus musculus castaneus CAS3 haplotype. We have cloned and sequenced DNA fragments surrounding the hotspot from the wm7 haplotype and the corresponding regions from the hotspot-negative B10.A and C57BL/10 strains. There is no significant difference between the sequences of these three strains, or between these and the published sequences of the CAS3 and C57BL/6 strains. However, a comparison of this A beta 3/A beta 2 hotspot with a previously characterized hotspot in the E beta gene revealed that they have a very similar molecular organization. Each hotspot consists of two elements, the consensus sequence of the mouse middle repetitive MT family and the tetrameric repeated sequences, which are separated by 1 kb of DNA.
Long-range PCR facilitates the identification of PMS2-specific mutations.

PubMed

Clendenning, Mark; Hampel, Heather; LaJeunesse, Jennifer; Lindblom, Annika; Lockman, Jan; Nilbert, Mef; Senter, Leigha; Sotamaa, Kaisa; de la Chapelle, Albert

2006-05-01

Mutations within the DNA mismatch repair gene, "postmeiotic segregation increased 2" (PMS2), have been associated with a predisposition to hereditary nonpolyposis colorectal cancer (HNPCC; Lynch syndrome). The presence of a large family of highly homologous PMS2 pseudogenes has made previous attempts to sequence PMS2 very difficult. Here, we describe a novel method that utilizes long-range PCR as a way to preferentially amplify PMS2 and not the pseudogenes. A second, exon-specific, amplification from diluted long-range products enables us to obtain a clean sequence that shows no evidence of pseudogene contamination. This method has been used to screen a cohort of patients whose tumors were negative for the PMS2 protein by immunohistochemistry and had not shown any mutations within the MLH1 gene. Sequencing of the PMS2 gene from 30 colorectal and 11 endometrial cancer patients identified 10 novel sequence changes as well as 17 sequence changes that had previously been identified. In total, putative pathologic mutations were detected in 11 of the 41 families. Among these were five novel mutations, c.705+1G>T, c.736_741del6ins11, c.862_863del, c.1688G>T, and c.2007-1G>A. We conclude that PMS2 mutation detection in selected Lynch syndrome and Lynch syndrome-like patients is both feasible and desirable. Published 2006 Wiley-Liss, Inc.
The Applied Development of a Tiered Multilocus Sequence Typing (MLST) Scheme for Dichelobacter nodosus.

PubMed

Blanchard, Adam M; Jolley, Keith A; Maiden, Martin C J; Coffey, Tracey J; Maboni, Grazieli; Staley, Ceri E; Bollard, Nicola J; Warry, Andrew; Emes, Richard D; Davies, Peers L; Tötemeyer, Sabine

2018-01-01

Dichelobacter nodosus (D. nodosus) is the causative pathogen of ovine footrot, a disease that has a significant welfare and financial impact on the global sheep industry. Previous studies into the phylogenetics of D. nodosus have focused on Australia and Scandinavia, meaning the current diversity in the United Kingdom (U.K.) population and its relationship globally is poorly understood. Numerous epidemiological methods are available for bacterial typing; however, few account for whole genome diversity or provide the opportunity for future application of new computational techniques. Multilocus sequence typing (MLST) measures nucleotide variations within several loci with slow accumulation of variation to enable the designation of allele numbers to determine a sequence type. The usage of whole genome sequence data enables the application of MLST, but also core and whole genome MLST for higher levels of strain discrimination with a negligible increase in experimental cost. An MLST database was developed alongside a seven loci scheme using publicly available whole genome data from the sequence read archive. Sequence type designation and strain discrimination were compared to previously published data to ensure reproducibility. Multiple D. nodosus isolates from U.K. farms were directly compared to populations from other countries. The U.K. isolates define new clades within the global population of D. nodosus and predominantly consist of serogroups A, B and H; however, serogroups C, D, E, and I were also found. The scheme is publicly available at https://pubmlst.org/dnodosus/.
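The step from allele numbers to a sequence type described above can be sketched roughly as follows. This is a minimal illustration only, not the PubMLST implementation: the locus names, allele sequences, and sequence type numbers are hypothetical toy values, and a real scheme would use seven loci and curated allele tables.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical toy data: locus names, allele sequences and ST numbers are
# illustrative only and are NOT taken from the D. nodosus scheme.
ALLELE_DB: Dict[str, Dict[str, int]] = {
    "locusA": {"ACGT": 1, "ACGA": 2},
    "locusB": {"TTGA": 1, "TTGC": 2},
}
ST_TABLE: Dict[Tuple[int, ...], int] = {(1, 1): 1, (2, 1): 2}


def allele_profile(
    locus_seqs: Dict[str, str],
    allele_db: Dict[str, Dict[str, int]],
    loci: Sequence[str],
) -> Tuple[Optional[int], ...]:
    """Look up the allele number of each locus; None marks an unseen allele."""
    return tuple(
        allele_db.get(locus, {}).get(locus_seqs.get(locus, "")) for locus in loci
    )


def sequence_type(
    profile: Sequence[Optional[int]],
    st_table: Dict[Tuple[int, ...], int],
) -> Optional[int]:
    """Return the sequence type for a complete profile, or None if unknown."""
    if any(allele is None for allele in profile):
        return None  # at least one locus carries an allele not in the database
    return st_table.get(tuple(profile))
```

An isolate whose profile is absent from the table would, in a real database, be assigned a new sequence type; here it simply yields None.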
Mixed heterolobosean and novel gregarine lineage genes from culture ATCC 50646: Long-branch artefacts, not lateral gene transfer, distort α-tubulin phylogeny.

PubMed

Cavalier-Smith, Thomas

2015-04-01

Contradictory and confusing results can arise if sequenced 'monoprotist' samples really contain DNA of very different species. Eukaryote-wide phylogenetic analyses using five genes from the amoeboflagellate culture ATCC 50646 previously implied it was an undescribed percolozoan related to percolatean flagellates (Stephanopogon, Percolomonas). Contrastingly, three phylogenetic analyses of 18S rRNA alone, did not place it within Percolozoa, but as an isolated deep-branching excavate. I resolve that contradiction by sequence phylogenies for all five genes individually, using up to 652 taxa. Its 18S rRNA sequence (GQ377652) is near-identical to one from stained-glass windows, somewhat more distant from one from cooling-tower water, all three related to terrestrial actinocephalid gregarines Hoplorhynchus and Pyxinia. All four protein-gene sequences (Hsp90; α-tubulin; β-tubulin; actin) are from an amoeboflagellate heterolobosean percolozoan, not especially deeply branching. Contrary to previous conclusions from trees combining protein and rRNA sequences or rDNA trees including Eozoa only, this culture does not represent a major novel deep-branching eukaryote lineage distinct from Heterolobosea, and thus lacks special significance for deep eukaryote phylogeny, though the rDNA sequence is important for gregarine phylogeny. α-Tubulin trees for over 250 eukaryotes refute earlier suggestions of lateral gene transfer within eukaryotes, being largely congruent with morphology and other gene trees. Copyright © 2015. Published by Elsevier GmbH.
SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read

PubMed Central

2010-01-01

Background: High-throughput automated sequencing has enabled an exponential growth rate of sequencing data. This requires increasing sequence quality and reliability in order to avoid database contamination with artefactual sequences. The arrival of pyrosequencing enhances this problem and necessitates customisable pre-processing algorithms. Results: SeqTrim has been implemented both as a Web and as a standalone command line application. Already-published and newly-designed algorithms have been included to identify sequence inserts, to remove low quality, vector, adaptor, low complexity and contaminant sequences, and to detect chimeric reads. The availability of several input and output formats allows its inclusion in sequence processing workflows. Due to its specific algorithms, SeqTrim outperforms other pre-processors implemented as Web services or standalone applications. It performs equally well with sequences from EST libraries, SSH libraries, genomic DNA libraries and pyrosequencing reads and does not lead to over-trimming. Conclusions: SeqTrim is an efficient pipeline designed for pre-processing of any type of sequence read, including next-generation sequencing. It is easily configurable and provides a friendly interface that allows users to know what happened with sequences at every pre-processing stage, and to verify pre-processing of an individual sequence if desired. The recommended pipeline reveals more information about each sequence than previously described pre-processors and can discard more sequencing or experimental artefacts. PMID:20089148
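As a rough illustration of one pre-processing stage mentioned above (removal of low-quality sequence), the sketch below trims low-quality bases from both ends of a read. It is not SeqTrim's algorithm; the function name and the threshold of 20 are assumptions chosen for the example.

```python
from typing import List, Tuple


def trim_low_quality(
    seq: str, quals: List[int], min_q: int = 20
) -> Tuple[str, List[int]]:
    """Trim bases with quality below min_q from both ends of a read.

    Hypothetical helper: the threshold and end-trimming strategy are
    illustrative assumptions, not the method used by SeqTrim.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have equal length")
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1  # drop low-quality bases from the 3' end
    start = 0
    while start < end and quals[start] < min_q:
        start += 1  # drop low-quality bases from the 5' end
    return seq[start:end], quals[start:end]
```

Internal low-quality bases are left untouched in this sketch; a production pipeline would typically use a windowed criterion instead.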
Diagnostics for Yaws Eradication: Insights From Direct Next-Generation Sequencing of Cutaneous Strains of Treponema pallidum

PubMed Central

Marks, Michael; Fookes, Maria; Wagner, Josef; Butcher, Robert; Ghinai, Rosanna; Sokana, Oliver; Sarkodie, Yaw-Adu; Lukehart, Sheila A; Solomon, Anthony W; Mabey, David C W; Thomson, Nicholas

2018-01-01

Background: Yaws-like chronic ulcers can be caused by Treponema pallidum subspecies pertenue, Haemophilus ducreyi, or other, still-undefined bacteria. To permit accurate evaluation of yaws elimination efforts, programmatic use of molecular diagnostics is required. The accuracy and sensitivity of current tools remain unclear because our understanding of T. pallidum diversity is limited by the low number of sequenced genomes. Methods: We tested samples from patients with suspected yaws collected in the Solomon Islands and Ghana. All samples were from patients whose lesions had previously tested negative using the Centers for Disease Control and Prevention (CDC) diagnostic assay in widespread use. However, some of these patients had positive serological assays for yaws on blood. We used direct whole-genome sequencing to identify T. pallidum subsp pertenue strains missed by the current assay. Results: From 45 Solomon Islands and 27 Ghanaian samples, 11 were positive for T. pallidum DNA using the species-wide quantitative polymerase chain reaction (PCR) assay, from which we obtained 6 previously undetected T. pallidum subsp pertenue whole-genome sequences. These show that Solomon Islands sequences represent distinct T. pallidum subsp pertenue clades. These isolates were invisible to the CDC diagnostic PCR assay, due to sequence variation in the primer binding site. Conclusions: Our data double the number of published T. pallidum subsp pertenue genomes. We show that Solomon Islands strains are undetectable by the PCR used in many studies and by health ministries. This assay is therefore not adequate for the eradication program. Next-generation genome sequence data are essential for these efforts. PMID:29045605
Palindromic Sequence Artifacts Generated during Next Generation Sequencing Library Preparation from Historic and Ancient DNA

PubMed Central

Star, Bastiaan; Nederbragt, Alexander J.; Hansen, Marianne H. S.; Skage, Morten; Gilfillan, Gregor D.; Bradbury, Ian R.; Pampoulie, Christophe; Stenseth, Nils Chr; Jakobsen, Kjetill S.; Jentoft, Sissel

2014-01-01

Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5′ and 3′-ends of sequencing reads. The palindromic sequences themselves have specific properties – the bases at the 5′-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3′-end. The terminal 3′ bases are artificial extensions likely caused by the occurrence of hairpin loops in single-stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3′-end of DNA strands, with the 5′-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias. PMID:24608104
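The interrupted palindromes described above (a 3′ end that is the reverse complement of the 5′ end of the same read) can be screened for with a simple check. The sketch below is an illustrative assumption rather than the authors' detection procedure; the function names and the minimum length of 4 bases are hypothetical, and exact matching is used where a real screen would probably tolerate mismatches.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def terminal_palindrome_length(read: str, min_len: int = 4) -> int:
    """Length of the longest 3' end that exactly reverse-complements the 5' end.

    Returns 0 when no terminal palindrome of at least min_len bases exists.
    Hypothetical helper for illustration; exact matching only.
    """
    best = 0
    for k in range(min_len, len(read) // 2 + 1):
        if read[-k:] == reverse_complement(read[:k]):
            best = k
    return best
```

For example, a read built as a 5-base prefix, a short spacer, and the reverse complement of that prefix is reported with a terminal palindrome length of 5.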
Accounting for biases in riboprofiling data indicates a major role for proline in stalling translation.

PubMed

Artieri, Carlo G; Fraser, Hunter B

2014-12-01

The recent advent of ribosome profiling (sequencing of short ribosome-bound fragments of mRNA) has offered an unprecedented opportunity to interrogate the sequence features responsible for modulating translational rates. Nevertheless, numerous analyses of the first riboprofiling data set have produced equivocal and often incompatible results. Here we analyze three independent yeast riboprofiling data sets, including two with much higher coverage than previously available, and find that all three show substantial technical sequence biases that confound interpretations of ribosomal occupancy. After accounting for these biases, we find no effect of previously implicated factors on ribosomal pausing. Rather, we find that incorporation of proline, whose unique side-chain stalls peptide synthesis in vitro, also slows the ribosome in vivo. We also reanalyze a method that implicated positively charged amino acids as the major determinant of ribosomal stalling and demonstrate that it produces false signals of stalling in low-coverage data. Our results suggest that any analysis of riboprofiling data should account for sequencing biases and sparse coverage. To this end, we establish a robust methodology that enables analysis of ribosome profiling data without prior assumptions regarding which positions spanned by the ribosome cause stalling. © 2014 Artieri and Fraser; Published by Cold Spring Harbor Laboratory Press.
Tracing the phylogeographic history of Southeast Asian long-tailed macaques through mitogenomes of museum specimens.

PubMed

Yao, Lu; Li, Hongjie; Martin, Robert D; Moreau, Corrie S; Malhi, Ripan S

2017-11-01

The biogeographical history of Southeast Asia is complicated due to the continuous emergences and disappearances of land bridges throughout the Pleistocene. Here, we use long-tailed macaques (Macaca fascicularis), which are widely distributed throughout the mainland and islands of Southeast Asia, as a model for better understanding the biogeographical patterns of diversification in this geographically complex region. A reliable intraspecific phylogeny including individuals from localities on oceanic islands, continental islands, and the mainland is needed to trace relatedness along with the pattern and timing of colonization in this region. We used high-throughput sequencing techniques to sequence mitochondrial genomes (mitogenomes) from 95 Southeast Asian M. fascicularis specimens housed at natural history museums around the world. To achieve a comprehensive picture, we more than tripled the mitogenome sample size for M. fascicularis from previous studies, and for the first time included documented samples from the Philippines and several small Indonesian islands. Confirming the result from a previous, recent intraspecific phylogeny for M. fascicularis, the newly reconstructed phylogeny of 135 specimens divides the samples into two major clades: Clade A includes haplotypes from the mainland and some from northern Sumatra, while Clade B includes all insular haplotypes along with lineages from southern Sumatra. This study resolves a previous disparity by revealing a disjunction in the origin of Sumatran macaques, with separate lineages originating within the two major clades, suggesting that at least two major migrations to Sumatra occurred. However, our dated phylogeny reveals that the two major clades split ∼1.88Ma, which is earlier than in previously published phylogenies. Our new data reveal that most Philippine macaque lineages diverged from the Borneo stock within the last ∼0.06-0.43Ma. Finally, our study provides insight into successful sequencing of DNA across museums and shotgun sequencing of DNA specimens as a method to sequence the mitogenome. Copyright © 2017 Elsevier Inc. All rights reserved.
Similar Ratios of Introns to Intergenic Sequence across Animal Genomes.

PubMed

Francis, Warren R; Wörheide, Gert

2017-06-01

One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
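The central quantity in this abstract, the ratio of intronic to intergenic sequence, can be computed from an annotation roughly as sketched below. This is an assumption-laden illustration, not the authors' pipeline: intervals are taken to be half-open (start, end) coordinates on a single sequence, exons are assumed to lie inside annotated genes, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates


def merged_length(intervals: Iterable[Interval]) -> int:
    """Total number of bases covered by a set of possibly overlapping intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)  # extend the current merged block
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def intron_intergenic_ratio(
    genome_size: int, genes: List[Interval], exons: List[Interval]
) -> float:
    """Intronic bases divided by intergenic bases.

    Assumes every exon lies within an annotated gene, so that
    intronic = genic - exonic and intergenic = genome - genic.
    """
    genic = merged_length(genes)
    exonic = merged_length(exons)
    intergenic = genome_size - genic
    if intergenic <= 0:
        raise ValueError("no intergenic sequence in this annotation")
    return (genic - exonic) / intergenic
```

Under this definition, missing gene annotations inflate the intergenic total and lower the ratio, which matches the abstract's reading of low ratios as a sign of underannotation.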
Transcriptome-wide investigation of genomic imprinting in chicken.

PubMed

Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

2014-04-01

Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken and did not demonstrate any evidence of imprinting in this species. However, several published observations such as imprinted-like QTL in poultry or reciprocal effects keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting from mouse data already published. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results come together, without a priori, with previous statements and phylogenetic considerations assessing the absence of genomic imprinting in chicken.
Learning a Weighted Sequence Model of the Nucleosome Core and Linker Yields More Accurate Predictions in Saccharomyces cerevisiae and Homo sapiens

PubMed Central

Reynolds, Sheila M.; Bilmes, Jeff A.; Noble, William Stafford

2010-01-01

DNA in eukaryotes is packaged into a chromatin complex, the most basic element of which is the nucleosome. The precise positioning of the nucleosome cores allows for selective access to the DNA, and the mechanisms that control this positioning are important pieces of the gene expression puzzle. We describe a large-scale nucleosome pattern that jointly characterizes the nucleosome core and the adjacent linkers and is predominantly characterized by long-range oscillations in the mono-, di- and tri-nucleotide content of the DNA sequence, and we show that this pattern can be used to predict nucleosome positions in both Homo sapiens and Saccharomyces cerevisiae more accurately than previously published methods. Surprisingly, in both H. sapiens and S. cerevisiae, the most informative individual features are the mono-nucleotide patterns, although the inclusion of di- and tri-nucleotide features results in improved performance. Our approach combines a much longer pattern than has been previously used to predict nucleosome positioning from sequence (301 base pairs, centered at the position to be scored) with a novel discriminative classification approach that selectively weights the contributions from each of the input features. The resulting scores are relatively insensitive to local AT-content and can be used to accurately discriminate putative dyad positions from adjacent linker regions without requiring an additional dynamic programming step and without the attendant edge effects and assumptions about linker length modeling and overall nucleosome density. Our approach produces the best dyad-linker classification results published to date in H. sapiens, and outperforms two recently published models on a large set of S. cerevisiae nucleosome positions. Our results suggest that in both genomes, a comparable and relatively small fraction of nucleosomes are well-positioned and that these positions are predictable based on sequence alone. We believe that the bulk of the remaining nucleosomes follow a statistical positioning model. PMID:20628623
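To make the 301-base-pair scoring window concrete, the sketch below extracts a window centered on a candidate position and summarizes its mono-nucleotide content. It illustrates the input only and is not the authors' weighted model: the function names are hypothetical, and the published approach uses position-dependent mono-, di- and tri-nucleotide features with learned weights rather than the simple fractions computed here.

```python
from collections import Counter
from typing import Dict, Optional


def centered_window(seq: str, center: int, width: int = 301) -> Optional[str]:
    """Return the width-long window centered at index `center`, or None near edges.

    `width` is assumed to be odd so that the window is symmetric.
    """
    half = width // 2
    if center - half < 0 or center + half >= len(seq):
        return None  # window would run off the sequence
    return seq[center - half : center + half + 1]


def mononucleotide_fractions(window: str) -> Dict[str, float]:
    """Fraction of A, C, G and T in a window (a crude stand-in for real features)."""
    counts = Counter(window)
    n = len(window)
    return {base: counts.get(base, 0) / n for base in "ACGT"}
```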

Rapid microsatellite marker development for African mahogany (Khaya senegalensis, Meliaceae) using next-generation sequencing and assessment of its intra-specific genetic diversity.

PubMed

Karan, M; Evans, D S; Reilly, D; Schulte, K; Wright, C; Innes, D; Holton, T A; Nikles, D G; Dickinson, G R

2012-03-01

Khaya senegalensis (African mahogany or dry-zone mahogany) is a high-value hardwood timber species with great potential for forest plantations in northern Australia. The species is distributed across the sub-Saharan belt from Senegal to Sudan and Uganda. Because of heavy exploitation and constraints on natural regeneration and sustainable planting, it is now classified as a vulnerable species. Here, we describe the development of microsatellite markers for K. senegalensis using next-generation sequencing to assess its intra-specific diversity across its natural range, which is a key for successful breeding programs and effective conservation management of the species. Next-generation sequencing yielded 93,943 sequences with an average read length of 234 bp. The assembled sequences contained 1030 simple sequence repeats, with primers designed for 522 microsatellite loci. Twenty-one microsatellite loci were tested with 11 showing reliable amplification and polymorphism in K. senegalensis. The 11 novel microsatellites, together with one previously published, were used to assess 73 accessions belonging to the Australian K. senegalensis domestication program, sampled from across the natural range of the species. STRUCTURE analysis shows two major clusters, one comprising mainly accessions from west Africa (Senegal to Benin) and the second based in the far eastern limits of the range in Sudan and Uganda. Higher levels of genetic diversity were found in material from western Africa. This suggests that new seed collections from this region may yield more diverse genotypes than those originating from Sudan and Uganda in eastern Africa. © 2011 Blackwell Publishing Ltd.
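Locating simple sequence repeats in assembled sequences, as described above, can be approximated with a regular expression. The sketch below is illustrative only and is not the software used in the study; the unit-length range (2 to 4 bases) and the minimum of 5 repeats are assumptions, and homopolymer runs are deliberately skipped.

```python
import re
from typing import List, Tuple


def find_ssrs(
    seq: str, min_unit: int = 2, max_unit: int = 4, min_repeats: int = 5
) -> List[Tuple[int, str, int]]:
    """Find perfect tandem repeats; returns (start, unit, repeat_count) tuples.

    Hypothetical helper: thresholds are illustrative assumptions.
    """
    # Lazy unit match tries the shortest repeat unit first; the backreference
    # then requires at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits
```

Primer design around each hit would be a separate step that this sketch does not attempt.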
Searching for evidence of selection in avian DNA barcodes.

PubMed

Kerr, Kevin C R

2011-11-01

The barcode of life project has assembled a tremendous number of mitochondrial cytochrome c oxidase I (COI) sequences. Although these sequences were gathered to develop a DNA-based system for species identification, it has been suggested that further biological inferences may also be derived from this wealth of data. Recurrent selective sweeps have been invoked as an evolutionary mechanism to explain limited intraspecific COI diversity, particularly in birds, but this hypothesis has not been formally tested. In this study, I collated COI sequences from previous barcoding studies on birds and tested them for evidence of selection. Using this expanded data set, I re-examined the relationships between intraspecific diversity and interspecific divergence and sampling effort, respectively. I employed the McDonald-Kreitman test to test for neutrality in sequence evolution between closely related pairs of species. Because amino acid sequences were generally constrained between closely related pairs, I also included broader intra-order comparisons to quantify patterns of protein variation in avian COI sequences. Lastly, using 22 published whole mitochondrial genomes, I compared the evolutionary rate of COI against the other 12 protein-coding mitochondrial genes to assess intragenomic variability. I found no conclusive evidence of selective sweeps. Most evidence pointed to an overall trend of strong purifying selection and functional constraint. The COI protein did vary across the class Aves, but to a very limited extent. COI was the least variable gene in the mitochondrial genome, suggesting that other genes might be more informative for probing factors constraining mitochondrial variation within species. © 2011 Blackwell Publishing Ltd.
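The McDonald-Kreitman test mentioned above compares the ratio of nonsynonymous to synonymous changes within species (polymorphism) with the same ratio between species (divergence). A minimal sketch of the standard neutrality index follows; the counts in the usage note are made up for illustration and are not data from this study, and a significance test on the 2x2 table (for example Fisher's exact test) is omitted.

```python
def neutrality_index(dn: int, ds: int, pn: int, ps: int) -> float:
    """Neutrality index NI = (Pn / Ps) / (Dn / Ds).

    dn, ds: nonsynonymous and synonymous fixed differences between species.
    pn, ps: nonsynonymous and synonymous polymorphisms within species.
    NI near 1 is consistent with neutrality; NI > 1 indicates an excess of
    nonsynonymous polymorphism, as expected under purifying selection;
    NI < 1 indicates an excess of nonsynonymous divergence.
    """
    if ds == 0 or ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Dn, Ds or Ps is zero")
    return (pn / ps) / (dn / ds)


def alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Estimated proportion of adaptive substitutions, alpha = 1 - NI."""
    return 1.0 - neutrality_index(dn, ds, pn, ps)
```

With hypothetical counts Dn=4, Ds=20, Pn=8, Ps=10 the index is 4.0, the pattern that would point toward purifying selection.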
Lineage-Specific Biology Revealed by a Finished Genome Assembly of the Mouse

PubMed Central

Hillier, LaDeana W.; Zody, Michael C.; Goldstein, Steve; She, Xinwe; Bult, Carol J.; Agarwala, Richa; Cherry, Joshua L.; DiCuccio, Michael; Hlavina, Wratko; Kapustin, Yuri; Meric, Peter; Maglott, Donna; Birtle, Zoë; Marques, Ana C.; Graves, Tina; Zhou, Shiguo; Teague, Brian; Potamousis, Konstantinos; Churas, Christopher; Place, Michael; Herschleb, Jill; Runnheim, Ron; Forrest, Daniel; Amos-Landgraf, James; Schwartz, David C.; Cheng, Ze; Lindblad-Toh, Kerstin; Eichler, Evan E.; Ponting, Chris P.

2009-01-01

The mouse (Mus musculus) is the premier animal model for understanding human disease and development. Here we show that a comprehensive understanding of mouse biology is only possible with the availability of a finished, high-quality genome assembly. The finished clone-based assembly of the mouse strain C57BL/6J reported here has over 175,000 fewer gaps and over 139 Mb more of novel sequence, compared with the earlier MGSCv3 draft genome assembly. In a comprehensive analysis of this revised genome sequence, we are now able to define 20,210 protein-coding genes, over a thousand more than predicted in the human genome (19,042 genes). In addition, we identified 439 long, non–protein-coding RNAs with evidence for transcribed orthologs in human. We analyzed the complex and repetitive landscape of 267 Mb of sequence that was missing or misassembled in the previously published assembly, and we provide insights into the reasons for its resistance to sequencing and assembly by whole-genome shotgun approaches. Duplicated regions within newly assembled sequence tend to be of more recent ancestry than duplicates in the published draft, correcting our initial understanding of recent evolution on the mouse lineage. These duplicates appear to be largely composed of sequence regions containing transposable elements and duplicated protein-coding genes; of these, some may be fixed in the mouse population, but at least 40% of segmentally duplicated sequences are copy number variable even among laboratory mouse strains. Mouse lineage-specific regions contain 3,767 genes drawn mainly from rapidly-changing gene families associated with reproductive functions. The finished mouse genome assembly, therefore, greatly improves our understanding of rodent-specific biology and allows the delineation of ancestral biological functions that are shared with human from derived functions that are not. PMID:19468303
Idiopathic slow transit constipation and megacolon are not associated with neurturin mutations.

PubMed

Chen, B; Knowles, C H; Scott, M; Anand, P; Williams, N S; Milbrandt, J; Tam, P K H

2002-10-01

Chronic idiopathic slow-transit constipation (ISTC) and idiopathic megacolon (IMC) are early-onset gastrointestinal motility disorders of unknown aetiology. The gene encoding the neurotrophic factor neurturin may be a candidate for these disorders, as neurturin-deficient mice have a similar enteric phenotype. In the present study, we tested this hypothesis. Genomic DNA from 26 cases of chronic idiopathic STC [with a family history of constipation in 15 (58%) and Hirschsprung's disease in two (8%)], and five cases of IMC [two familial (40%)] was screened by direct DNA sequencing using the fluorescent dideoxy terminator method. Results were compared with published sequence data and 24 control DNAs. Our results revealed several previously unreported common sequence polymorphisms, but overall frequencies were comparable between patients and controls. We conclude that mutation of neurturin is not a frequent cause of ISTC or IMC.
Bloodmeal Identification in Field-Collected Sand Flies From Casa Branca, Brazil, Using the Cytochrome b PCR Method.

PubMed

Carvalho, G M L; Rêgo, F D; Tanure, A; Silva, A C P; Dias, T A; Paz, G F; Andrade Filho, J D

2017-07-01

PCR-based identification of vertebrate host bloodmeals has been performed on several vector species with success. In the present study, we used a previously published PCR protocol followed by DNA sequencing based on primers designed from multiple alignments of the mitochondrial cytochrome b gene used to identify avian and mammalian hosts of various hematophagous vectors. The amplification of a fragment encoding a 359 bp sequence of the Cyt b gene yielded recognized amplification products in 192 female sand flies (53%), from a total of 362 females analyzed. In the study area of Casa Branca, Brazil, blood-engorged female sand flies such as Lutzomyia longipalpis (Lutz & Neiva, 1912), Migonemyia migonei (França, 1924), and Nyssomyia whitmani (Antunes & Coutinho, 1939) were analyzed for bloodmeal sources. The PCR-based method identified human, dog, chicken, and domestic rat blood sources. © The Authors 2017. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Toward a Tree-of-Life for the boas and pythons: multilocus species-level phylogeny with unprecedented taxon sampling.

PubMed

Graham Reynolds, R; Niemiller, Matthew L; Revell, Liam J

2014-02-01

Snakes in the families Boidae and Pythonidae constitute some of the most spectacular reptiles and comprise an enormous diversity of morphology, behavior, and ecology. While many species of boas and pythons are familiar, taxonomy and evolutionary relationships within these families remain contentious and fluid. A major effort in evolutionary and conservation biology is to assemble a comprehensive Tree-of-Life, or a macro-scale phylogenetic hypothesis, for all known life on Earth. No previously published study has produced a species-level molecular phylogeny for more than 61% of boa species or 65% of python species. Using both novel and previously published sequence data, we have produced a species-level phylogeny for 84.5% of boid species and 82.5% of pythonid species, contextualized within a larger phylogeny of henophidian snakes. We obtained new sequence data for three boid, one pythonid, and two tropidophiid taxa which have never previously been included in a molecular study, in addition to generating novel sequences for seven genes across an additional 12 taxa. We compiled an 11-gene dataset for 127 taxa, consisting of the mitochondrial genes CYTB, 12S, and 16S, and the nuclear genes bdnf, bmp2, c-mos, gpr35, rag1, ntf3, odc, and slc30a1, totaling up to 7561 base pairs per taxon. We analyzed this dataset using both maximum likelihood and Bayesian inference and recovered a well-supported phylogeny for these species. We found significant evidence of discordance between taxonomy and evolutionary relationships in the genera Tropidophis, Morelia, Liasis, and Leiopython, and we found support for elevating two previously suggested boid species. We suggest a revised taxonomy for the boas (13 genera, 58 species) and pythons (8 genera, 40 species), review relationships between our study and the many other molecular phylogenetic studies of henophidian snakes, and present a taxonomic database and alignment which may be easily used and built upon by other researchers. 
Copyright © 2013 Elsevier Inc. All rights reserved.
Thermal buckling optimisation of composite plates using firefly algorithm

NASA Astrophysics Data System (ADS)

Kamarian, S.; Shakeri, M.; Yas, M. H.

2017-07-01

Composite plates play a very important role in engineering applications, especially in the aerospace industry. Thermal buckling of such components is of great importance and must be known to achieve an appropriate design. This paper deals with stacking sequence optimisation of laminated composite plates for maximising the critical buckling temperature using a powerful meta-heuristic algorithm called the firefly algorithm (FA), which is based on the flashing behaviour of fireflies. The main objective of the present work was to show the ability of FA in the optimisation of composite structures. The performance of FA is compared with results reported in previously published works using other algorithms, and the comparison shows the efficiency of FA in stacking sequence optimisation of laminated composite structures.
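As an illustration of the firefly algorithm mentioned in this abstract, the following is a minimal sketch of the standard continuous form of the method, not the authors' implementation. The toy objective `sphere`, the parameter values (`alpha`, `beta0`, `gamma`), and all function names are assumptions made for illustration. The abstract's problem is a maximisation over stacking sequences; maximising a quantity is equivalent to minimising its negative, and a real stacking-sequence search would additionally need an encoding of discrete ply angles, which is not shown here.

```python
import math
import random


def firefly_minimise(objective, dim, bounds, n_fireflies=15, n_iter=100,
                     alpha=0.2, beta0=1.0, gamma=1.0, seed=0):
    """Minimise `objective` over a box using a basic firefly algorithm."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial population inside the box.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_fireflies)]
    cost = [objective(x) for x in pop]
    best_cost = min(cost)
    best_x = list(pop[cost.index(best_cost)])

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if cost[j] < cost[i]:  # firefly j is "brighter" (lower cost)
                    r2 = sum((a - b) ** 2 for a, b in zip(pop[i], pop[j]))
                    # Attractiveness decays with squared distance.
                    beta = beta0 * math.exp(-gamma * r2)
                    # Move i towards j, add a small random step, clip to the box.
                    pop[i] = [
                        min(hi, max(lo, a + beta * (b - a) + alpha * (rng.random() - 0.5)))
                        for a, b in zip(pop[i], pop[j])
                    ]
                    cost[i] = objective(pop[i])
                    if cost[i] < best_cost:
                        best_cost, best_x = cost[i], list(pop[i])
        alpha *= 0.97  # gradually reduce the random step size
    return best_x, best_cost


def sphere(x):
    """Toy objective (hypothetical stand-in for a buckling model)."""
    return sum(v * v for v in x)


best_x, best_cost = firefly_minimise(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_x, best_cost)
```

The sketch tracks the best solution seen so far separately from the population, so the returned cost never gets worse than the best initial candidate.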
Reappraisal of Hydatigera taeniaeformis (Batsch, 1786) (Cestoda: Taeniidae) sensu lato with description of Hydatigera kamiyai n. sp.

PubMed

Lavikainen, Antti; Iwaki, Takashi; Haukisalmi, Voitto; Konyaev, Sergey V; Casiraghi, Maurizio; Dokuchaev, Nikolai E; Galimberti, Andrea; Halajian, Ali; Henttonen, Heikki; Ichikawa-Seki, Madoka; Itagaki, Tadashi; Krivopalov, Anton V; Meri, Seppo; Morand, Serge; Näreaho, Anu; Olsson, Gert E; Ribas, Alexis; Terefe, Yitagele; Nakao, Minoru

2016-05-01

The common cat tapeworm Hydatigera taeniaeformis is a complex of three morphologically cryptic entities, which can be differentiated genetically. To clarify the biogeography and the host spectrum of the cryptic lineages, 150 specimens of H. taeniaeformis in various definitive and intermediate hosts from Eurasia, Africa and Australia were identified with DNA barcoding using partial mitochondrial cytochrome c oxidase subunit 1 gene sequences and compared with previously published data. Additional phylogenetic analyses of selected isolates were performed using nuclear DNA and mitochondrial genome sequences. Based on molecular data and morphological analysis, Hydatigera kamiyai n. sp. Iwaki is proposed for a cryptic lineage, which is predominantly northern Eurasian and uses mainly arvicoline rodents (voles) and mice of the genus Apodemus as intermediate hosts. Hydatigera taeniaeformis sensu stricto (s.s.) is restricted to murine rodents (rats and mice) as intermediate hosts. It probably originates from Asia but has spread worldwide. Despite remarkable genetic divergence between H. taeniaeformis s.s. and H. kamiyai, interspecific morphological differences are evident only in dimensions of rostellar hooks. The third cryptic lineage is closely related to H. kamiyai, but its taxonomic status remains unresolved due to limited morphological, molecular, biogeographical and ecological data. This Hydatigera sp. is confined to the Mediterranean and its intermediate hosts are unknown. Further studies are needed to classify Hydatigera sp. either as a distinct species or a variant of H. kamiyai. According to previously published limited data, all three entities occur in the Americas, probably due to human-mediated introductions. Copyright © 2016 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.
MultiGeMS: detection of SNVs from multiple samples using model selection on high-throughput sequencing data.

PubMed

Murillo, Gabriel H; You, Na; Su, Xiaoquan; Cui, Wei; Reilly, Muredach P; Li, Mingyao; Ning, Kang; Cui, Xinping

2016-05-15

Single nucleotide variant (SNV) detection procedures are being utilized as never before to analyze the recent abundance of high-throughput DNA sequencing data, both on single and multiple sample datasets. Building on previously published work with the single sample SNV caller genotype model selection (GeMS), a multiple sample version of GeMS (MultiGeMS) is introduced. Unlike other popular multiple sample SNV callers, the MultiGeMS statistical model accounts for enzymatic substitution sequencing errors. It also addresses the multiple testing problem endemic to multiple sample SNV calling and utilizes high performance computing (HPC) techniques. A simulation study demonstrates that MultiGeMS ranks highest in precision among a selection of popular multiple sample SNV callers, while showing exceptional recall in calling common SNVs. Further, both simulation studies and real data analyses indicate that MultiGeMS is robust to low-quality data. We also demonstrate that accounting for enzymatic substitution sequencing errors not only improves SNV call precision at low mapping quality regions, but also improves recall at reference allele-dominated sites with high mapping quality. The MultiGeMS package can be downloaded from https://github.com/cui-lab/multigems. Contact: xinping.cui@ucr.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Construction of an Integrated High Density Simple Sequence Repeat Linkage Map in Cultivated Strawberry (Fragaria × ananassa) and its Applicability

PubMed Central

Isobe, Sachiko N.; Hirakawa, Hideki; Sato, Shusei; Maeda, Fumi; Ishikawa, Masami; Mori, Toshiki; Yamamoto, Yuko; Shirasawa, Kenta; Kimura, Mitsuhiro; Fukami, Masanobu; Hashizume, Fujio; Tsuji, Tomoko; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Tsuruoka, Hisano; Minami, Chiharu; Takahashi, Chika; Wada, Tsuyuko; Ono, Akiko; Kawashima, Kumiko; Nakazaki, Naomi; Kishida, Yoshie; Kohara, Mitsuyo; Nakayama, Shinobu; Yamada, Manabu; Fujishiro, Tsunakazu; Watanabe, Akiko; Tabata, Satoshi

2013-01-01

The cultivated strawberry (Fragaria × ananassa) is an octoploid (2n = 8x = 56) of the Rosaceae family whose genomic architecture is still controversial. Several recent studies support the AAA′A′BBB′B′ model, but its complexity has hindered genetic and genomic analysis of this important crop. To overcome this difficulty and to assist genome-wide analysis of F. × ananassa, we constructed an integrated linkage map by organizing a total of 4474 simple sequence repeat (SSR) markers collected from published Fragaria sequences, including 3746 SSR markers [Fragaria vesca expressed sequence tag (EST)-derived SSR markers] derived from F. vesca ESTs, 603 markers (F. × ananassa EST-derived SSR markers) from F. × ananassa ESTs, and 125 markers (F. × ananassa transcriptome-derived SSR markers) from F. × ananassa transcripts. Along with the previously published SSR markers, these markers were mapped onto five parent-specific linkage maps derived from three mapping populations, which were then assembled into an integrated linkage map. The constructed map consists of 1856 loci in 28 linkage groups (LGs) that total 2364.1 cM in length. Macrosynteny at the chromosome level was observed between the LGs of F. × ananassa and the genome of F. vesca. Variety distinction on 129 F. × ananassa lines was demonstrated using 45 selected SSR markers. PMID:23248204
Molecular insights into the colonization and chromosomal diversification of Madeiran house mice.

PubMed

Förster, D W; Gündüz, I; Nunes, A C; Gabriel, S; Ramalhinho, M G; Mathias, M L; Britton-Davidian, J; Searle, J B

2009-11-01

The colonization history of Madeiran house mice was investigated by analysing the complete mitochondrial (mt) D-loop sequences of 156 mice from the island of Madeira and mainland Portugal, extending previous studies. The numbers of mtDNA haplotypes from Madeira and mainland Portugal were substantially increased (17 and 14 new haplotypes respectively), and phylogenetic analysis confirmed the previously reported link between the Madeiran archipelago and northern Europe. Sequence analysis revealed the presence of four mtDNA lineages in mainland Portugal, of which one was particularly common and widespread (termed the 'Portugal Main Clade'). There was no support for population bottlenecks during the formation of the six Robertsonian chromosome races on the island of Madeira, and D-loop sequence variation was not found to be structured according to karyotype. The colonization time of the Madeiran archipelago by Mus musculus domesticus was approached using two molecular dating methods (mismatch distribution and Bayesian skyline plot). Time estimates based on D-loop sequence variation at mainland sites (including previously published data from France and Turkey) were evaluated in the context of the zooarchaeological record of M. m. domesticus. A range of values for mutation rate (mu) and number of mouse generations per year was considered in these analyses because of the uncertainty surrounding these two parameters. The colonization of Portugal and Madeira by house mice is discussed in the context of the best-supported parameter values. In keeping with recent studies, our results suggest that mutation rate estimates based on interspecific divergence lead to gross overestimates concerning the timing of recent within-species events.
Complete Chloroplast Genome Sequences of Four Meliaceae Species and Comparative Analyses

PubMed Central

Mader, Malte; Pakull, Birte; Blanc-Jolivet, Céline; Paulini-Drewes, Maike; Bouda, Zoéwindé Henri-Noël; Degen, Bernd; Small, Ian

2018-01-01

The Meliaceae family mainly consists of trees and shrubs with a pantropical distribution. In this study, the complete chloroplast genomes of four Meliaceae species were sequenced and compared with each other and with the previously published Azadirachta indica plastome. The five plastomes are circular and exhibit a quadripartite structure with high conservation of gene content and order. They include 130 genes encoding 85 proteins, 37 tRNAs and 8 rRNAs. Inverted repeat expansion resulted in a duplication of rps19 in the five Meliaceae species, which is consistent with that in many other Sapindales, but different from many other rosids. Compared to Azadirachta indica, the four newly sequenced Meliaceae individuals share several large deletions, which mainly contribute to the decreased genome sizes. A whole-plastome phylogeny supports previous findings that the four species form a monophyletic sister clade to Azadirachta indica within the Meliaceae. SNPs and indels identified in all complete Meliaceae plastomes might be suitable targets for the future development of genetic markers at different taxonomic levels. The extended analysis of SNPs in the matK gene led to the identification of four potential Meliaceae-specific SNPs as a basis for future validation and marker development. PMID:29494509
A phylogenetic framework facilitates Y-STR variant discovery and classification via massively parallel sequencing.

PubMed

Huszar, Tunde I; Jobling, Mark A; Wetton, Jon H

2018-04-12

Short tandem repeats on the male-specific region of the Y chromosome (Y-STRs) are permanently linked as haplotypes, and therefore Y-STR sequence diversity can be considered within the robust framework of a phylogeny of haplogroups defined by single nucleotide polymorphisms (SNPs). Here we use massively parallel sequencing (MPS) to analyse the 23 Y-STRs in Promega's prototype PowerSeq™ Auto/Mito/Y System kit (containing the markers of the PowerPlex® Y23 [PPY23] System) in a set of 100 diverse Y chromosomes whose phylogenetic relationships are known from previous megabase-scale resequencing. Including allele duplications and alleles resulting from likely somatic mutation, we characterised 2311 alleles, demonstrating 99.83% concordance with capillary electrophoresis (CE) data on the same sample set. The set contains 267 distinct sequence-based alleles (an increase of 58% compared to the 169 detectable by CE), including 60 novel Y-STR variants phased with their flanking sequences which have not been reported previously to our knowledge. Variation includes 46 distinct alleles containing non-reference variants of SNPs/indels in both repeat and flanking regions, and 145 distinct alleles containing repeat pattern variants (RPV). For DYS385a,b, DYS481 and DYS390 we observed repeat count variation in short flanking segments previously considered invariable, and suggest new MPS-based structural designations based on these. We considered the observed variation in the context of the Y phylogeny: several specific haplogroup associations were observed for SNPs and indels, reflecting the low mutation rates of such variant types; however, RPVs showed less phylogenetic coherence and more recurrence, reflecting their relatively high mutation rates. 
In conclusion, our study reveals considerable additional diversity at the Y-STRs of the PPY23 set via MPS analysis, demonstrates high concordance with CE data, facilitates nomenclature standardisation, and places Y-STR sequence variants in their phylogenetic context. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
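The headline percentages in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above (169 alleles detectable by CE, 267 sequence-based alleles, 2311 characterised alleles, 99.83% concordance); the variable names are illustrative, and the discordant-allele count is an estimate implied by the rounded percentage, not a number stated in the abstract.

```python
ce_alleles = 169        # distinct alleles detectable by capillary electrophoresis
mps_alleles = 267       # distinct sequence-based alleles found by MPS
total_alleles = 2311    # alleles characterised in total
concordance = 0.9983    # reported concordance with CE data

# Relative increase in distinct alleles when moving from CE to MPS.
increase = (mps_alleles - ce_alleles) / ce_alleles
increase_pct = round(100 * increase)

# Approximate number of discordant calls implied by the rounded concordance.
implied_discordant = round(total_alleles * (1 - concordance))

print(f"increase: {increase_pct}%")
print(f"implied discordant alleles: about {implied_discordant}")
```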
SHARAKU: an algorithm for aligning and clustering read mapping profiles of deep sequencing in non-coding RNA processing.

PubMed

Tsuchiya, Mariko; Amano, Kojiro; Abe, Masaya; Seki, Misato; Hase, Sumitaka; Sato, Kengo; Sakakibara, Yasubumi

2016-06-15

Deep sequencing of the transcripts of regulatory non-coding RNA generates footprints of post-transcriptional processes. After obtaining sequence reads, the short reads are mapped to a reference genome, and specific mapping patterns, called read mapping profiles, can be detected; these are distinct from random non-functional degradation patterns. These patterns reflect the maturation processes that lead to the production of shorter RNA sequences. Recent next-generation sequencing studies have revealed not only the typical maturation process of miRNAs but also the various processing mechanisms of small RNAs derived from tRNAs and snoRNAs. We developed an algorithm termed SHARAKU to align two read mapping profiles of next-generation sequencing outputs for non-coding RNAs. In contrast with previous work, SHARAKU incorporates the primary and secondary sequence structures into an alignment of read mapping profiles to allow for the detection of common processing patterns. Using a benchmark simulated dataset, SHARAKU exhibited superior performance to previous methods for correctly clustering the read mapping profiles with respect to 5'-end processing and 3'-end processing from degradation patterns and in detecting similar processing patterns in deriving the shorter RNAs. Further, using experimental data of small RNA sequencing for the common marmoset brain, SHARAKU succeeded in identifying the significant clusters of read mapping profiles for similar processing patterns of small derived RNA families expressed in the brain. The source code of our program SHARAKU is available at http://www.dna.bio.keio.ac.jp/sharaku/, and the simulated dataset used in this work is available at the same link. Accession code: The sequence data from the whole RNA transcripts in the hippocampus of the left brain used in this work is available from the DNA DataBank of Japan (DDBJ) Sequence Read Archive (DRA) under the accession number DRA004502. Contact: yasu@bio.keio.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
The World Health Organization Global Programme on AIDS proposal for standardization of HIV sequence nomenclature. WHO Network for HIV Isolation and Characterization.

PubMed

Korber, B T; Osmanov, S; Esparza, J; Myers, G

1994-11-01

The World Health Organization Global Programme on AIDS (WHO/GPA) is conducting a large-scale collaborative study of human immunodeficiency virus type 1 (HIV-1) variation, based in four potential vaccine-trial site countries: Brazil, Rwanda, Thailand, and Uganda. Through the course of this study, it was crucial to keep track of certain attributes of the samples from which the viral nucleotide sequences were derived (e.g., country of origin and viral culture characterization), so that meaningful sequence comparisons could be made. Here we describe a system developed in the context of the WHO/GPA study that summarizes such critical attributes by representing them as standardized characters directly incorporated into sequence names. This nomenclature allows linkage of clinical, phenotypic, and geographic information with molecular data. We propose that other investigators involved in human immunodeficiency virus (HIV) nucleotide sequencing efforts adopt a similar standardized sequence nomenclature to facilitate cross-study sequence comparison. HIV sequence data are being generated at an ever-increasing rate; directly coupled to this increase is our deepening understanding of biological parameters that influence or result from sequence variability. A standardized sequence nomenclature that includes relevant biological information would enable researchers to better utilize the growing body of sequence data, and enhance their ability to interpret the biological implications of their own data through facilitating comparisons with previously published work.
Identification of a pathogenic FTO mutation by next-generation sequencing in a newborn with growth retardation and developmental delay.

PubMed

Daoud, Hussein; Zhang, Dong; McMurray, Fiona; Yu, Andrea; Luco, Stephanie M; Vanstone, Jason; Jarinova, Olga; Carson, Nancy; Wickens, James; Shishodia, Shifali; Choi, Hwanho; McDonough, Michael A; Schofield, Christopher J; Harper, Mary-Ellen; Dyment, David A; Armour, Christine M

2016-03-01

A homozygous loss-of-function mutation p.(Arg316Gln) in the fat mass and obesity-associated (FTO) gene, which encodes for an iron and 2-oxoglutarate-dependent oxygenase, was previously identified in a large family in which nine affected individuals present with a lethal syndrome characterised by growth retardation and multiple malformations. To date, no other pathogenic mutation in FTO has been identified as a cause of multiple congenital malformations. We investigated a 21-month-old girl who presented distinctive facial features, failure to thrive, global developmental delay, left ventricular cardiac hypertrophy, reduced vision and bilateral hearing loss. We performed targeted next-generation sequencing of 4813 clinically relevant genes in the patient and her parents. We identified a novel FTO homozygous missense mutation (c.956C>T; p.(Ser319Phe)) in the affected individual. This mutation affects a highly conserved residue located in the same functional domain as the previously characterised mutation p.(Arg316Gln). Biochemical studies reveal that p.(Ser319Phe) FTO has reduced 2-oxoglutarate turnover and N-methyl-nucleoside demethylase activity. Our findings are consistent with previous reports that homozygous mutations in FTO can lead to rare growth retardation and developmental delay syndrome, and further support the proposal that FTO plays an important role in early development of human central nervous and cardiovascular systems. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Evolution and Spread of Ebola Virus in Liberia, 2014-2015.

PubMed

Ladner, Jason T; Wiley, Michael R; Mate, Suzanne; Dudas, Gytis; Prieto, Karla; Lovett, Sean; Nagle, Elyse R; Beitzel, Brett; Gilbert, Merle L; Fakoli, Lawrence; Diclaro, Joseph W; Schoepp, Randal J; Fair, Joseph; Kuhn, Jens H; Hensley, Lisa E; Park, Daniel J; Sabeti, Pardis C; Rambaut, Andrew; Sanchez-Lockhart, Mariano; Bolay, Fatorma K; Kugelman, Jeffrey R; Palacios, Gustavo

2015-12-09

The 2013-present Western African Ebola virus disease (EVD) outbreak is the largest ever recorded with >28,000 reported cases. Ebola virus (EBOV) genome sequencing has played an important role throughout this outbreak; however, relatively few sequences have been determined from patients in Liberia, the second worst-affected country. Here, we report 140 EBOV genome sequences from the second wave of the Liberian outbreak and analyze them in combination with 782 previously published sequences from throughout the Western African outbreak. While multiple early introductions of EBOV to Liberia are evident, the majority of Liberian EVD cases are consistent with a single introduction, followed by spread and diversification within the country. Movement of the virus within Liberia was widespread, and reintroductions from Liberia served as an important source for the continuation of the already ongoing EVD outbreak in Guinea. Overall, little evidence was found for incremental adaptation of EBOV to the human host. Copyright © 2015 Elsevier Inc. All rights reserved.
The Large Subunit rDNA Sequence of Plasmodiophora brassicae Does not Contain Intra-species Polymorphism.

PubMed

Schwelm, Arne; Berney, Cédric; Dixelius, Christina; Bass, David; Neuhauser, Sigrid

2016-12-01

Clubroot disease caused by Plasmodiophora brassicae is one of the most important diseases of cultivated brassicas. P. brassicae occurs in pathotypes which differ in their aggressiveness towards their Brassica host plants. To date, no DNA-based method to distinguish these pathotypes has been described. In 2011, polymorphism within the 28S rDNA of P. brassicae was reported, which could potentially allow pathotypes to be distinguished without the need for time-consuming bioassays. However, isolates of P. brassicae from around the world analysed in this study do not show polymorphism in their LSU rDNA sequences. The previously described polymorphism most likely derived from soil-inhabiting Cercozoa, more specifically Neoheteromita-like glissomonads. Here we correct the LSU rDNA sequence of P. brassicae. By using FISH we demonstrate that our newly generated sequence belongs to the causal agent of clubroot disease. Copyright © 2016 The Authors. Published by Elsevier GmbH. All rights reserved.
Probabilistic learning and inference in schizophrenia.

PubMed

Averbeck, Bruno B; Evans, Simon; Chouhan, Viraj; Bristow, Eleanor; Shergill, Sukhwinder S

2011-04-01

Patients with schizophrenia make decisions on the basis of less evidence when required to collect information to make an inference, a behavior often called jumping to conclusions. The underlying basis for this behavior remains controversial. We examined the cognitive processes underpinning this finding by testing subjects on the beads task, which has been used previously to elicit jumping to conclusions behavior, and a stochastic sequence learning task, with a similar decision theoretic structure. During the sequence learning task, subjects had to learn a sequence of button presses, while receiving a noisy feedback on their choices. We fit a Bayesian decision making model to the sequence task and compared model parameters to the choice behavior in the beads task in both patients and healthy subjects. We found that patients did show a jumping to conclusions style; and those who picked early in the beads task tended to learn less from positive feedback in the sequence task. This favours the likelihood of patients selecting early because they have a low threshold for making decisions, and that they make choices on the basis of relatively little evidence. Published by Elsevier B.V.
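To make the idea of a decision threshold in a beads-style task concrete, the following is a minimal sketch of Bayesian evidence accumulation between two jars. It is not the model fitted in the study, whose details are not given in the abstract. The jar ratio `q = 0.85`, the equal prior, the thresholds, and the function names are all assumptions chosen for illustration.

```python
def posterior_jar_a(draws, q=0.85, prior=0.5):
    """Return P(jar A | beads so far) after each draw.

    Assumes jar A holds a fraction q of 'a' beads and jar B a fraction q
    of 'b' beads (hypothetical, symmetric jars).
    """
    p = prior
    history = []
    for bead in draws:
        like_a = q if bead == "a" else 1 - q      # P(bead | jar A)
        like_b = 1 - q if bead == "a" else q      # P(bead | jar B)
        p = like_a * p / (like_a * p + like_b * (1 - p))  # Bayes' rule
        history.append(p)
    return history


def draws_to_decide(draws, threshold, q=0.85):
    """Number of draws before the posterior for either jar reaches threshold."""
    for n, p in enumerate(posterior_jar_a(draws, q), start=1):
        if p >= threshold or p <= 1 - threshold:
            return n
    return None  # never confident enough on this sequence


draws = ["a", "a", "b", "a", "a"]
history = posterior_jar_a(draws)
print([round(p, 3) for p in history])
print(draws_to_decide(draws, threshold=0.80))  # low threshold: decides early
print(draws_to_decide(draws, threshold=0.99))  # high threshold: waits longer
```

Under these assumptions, a lower threshold leads to a decision after fewer beads, which is one way to express the "low threshold for making decisions" interpretation described in the abstract.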
Are commercial providers a viable option for clinical bacterial sequencing?

PubMed

Raven, Kathy; Blane, Beth; Churcher, Carol; Parkhill, Julian; Peacock, Sharon J

2018-04-05

Bacterial whole-genome sequencing in the clinical setting has the potential to bring major improvements to infection control and clinical practice. Sequencing instruments are not currently available in the majority of routine microbiology laboratories worldwide, but an alternative is to use external sequencing providers. To foster discussion around this we investigated whether send-out services were a viable option. Four providers offering MiSeq sequencing were selected based on cost and evaluated based on the service provided and sequence data quality. DNA was prepared from five methicillin-resistant Staphylococcus aureus (MRSA) isolates, four of which were investigated during a previously published outbreak in the UK together with a reference MRSA isolate (ST22 HO 5096 0412). Cost of sequencing per isolate ranged from £155 to £342 and turnaround times from DNA postage to arrival of sequence data ranged from 12 to 63 days. Comparison of commercially generated genomes against the original sequence data demonstrated very high concordance, with no more than one single nucleotide polymorphism (SNP) difference on core genome mapping between the original sequences and the new sequence for all four providers. Multilocus sequence type could not be assigned based on assembly for the two cheapest sequence providers due to fragmented assemblies probably caused by a lower output of sequence data per isolate. Our results indicate that external providers returned highly accurate genome data, but that improvements are required in turnaround time to make this a viable option for use in clinical practice.
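The concordance measure in this abstract is a count of SNP differences between an original sequence and a resequenced one. A minimal sketch of such a comparison is shown below, assuming the two sequences are already aligned to the same coordinates (for example after mapping to a common reference). The function name, the choice to skip ambiguous bases and gaps, and the short example sequences are all hypothetical and are not taken from the study.

```python
def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has an ambiguous base or gap
    (characters in `ignore`) are skipped rather than counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


# Hypothetical toy sequences: one true difference, one ambiguous site.
original = "ACGTACGTNACG"
resequenced = "ACGTACCTAACG"
print(snp_distance(original, resequenced))
```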

The giraffe (Giraffa camelopardalis) rumen microbiome.

PubMed

Roggenbuck, Michael; Sauer, Cathrine; Poulsen, Morten; Bertelsen, Mads F; Sørensen, Søren J

2014-10-01

Recent studies have shown that wild ruminants are sources of previously undescribed microorganisms, knowledge of which can improve our understanding of the complex microbial interactions in the foregut. Here, we investigated the microbial community of seven wild-caught giraffes (Giraffa camelopardalis), three of which were fed natural browse and four of which were fed Boskos pellets, leafy alfalfa hay, and cut savanna browse, by characterizing the 16S rRNA gene diversity using 454 FLX high-throughput sequencing. The microbial community composition varied according to diet, but differed little between the ruminal fluid and solid fraction. The giraffe rumen contained high levels of the phyla Firmicutes and Bacteroidetes independent of diet, while Prevotella, Succiniclasticum, and Methanobrevibacter were the most abundant taxonomically assigned genera. However, up to 21% of the generated sequences could not be assigned to any known bacterial phylum, and c. 70% could not be assigned to a genus, revealing that the giraffe rumen hosts a variety of previously undescribed bacteria. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Exome-wide Sequencing Shows Low Mutation Rates and Identifies Novel Mutated Genes in Seminomas.

PubMed

Cutcutache, Ioana; Suzuki, Yuka; Tan, Iain Beehuat; Ramgopal, Subhashini; Zhang, Shenli; Ramnarayanan, Kalpana; Gan, Anna; Lee, Heng Hong; Tay, Su Ting; Ooi, Aikseng; Ong, Choon Kiat; Bolthouse, Jonathan T; Lane, Brian R; Anema, John G; Kahnoski, Richard J; Tan, Patrick; Teh, Bin Tean; Rozen, Steven G

2015-07-01

Testicular germ cell tumors are the most common cancer diagnosed in young men, and seminomas are the most common type of these cancers. There have been no exome-wide examinations of genes mutated in seminomas or of overall rates of nonsilent somatic mutations in these tumors. The objective was to analyze somatic mutations in seminomas to determine which genes are affected and to determine rates of nonsilent mutations. Eight seminomas and matched normal samples were surgically obtained from eight patients. DNA was extracted from tissue samples and exome sequenced on massively parallel Illumina DNA sequencers. Single-nucleotide polymorphism chip-based copy number analysis was also performed to assess copy number alterations. The DNA sequencing read data were analyzed to detect somatic mutations including single-nucleotide substitutions and short insertions and deletions. The detected mutations were validated by independent sequencing and further checked for subclonality. The rate of nonsynonymous somatic mutations averaged 0.31 mutations/Mb. We detected nonsilent somatic mutations in 96 genes that were not previously known to be mutated in seminomas, of which some may be driver mutations. Many of the mutations appear to have been present in subclonal populations. In addition, two genes, KIT and KRAS, were affected in two tumors each with mutations that were previously observed in other cancers and are presumably oncogenic. Our study, the first report on exome sequencing of seminomas, detected somatic mutations in 96 new genes, several of which may be targetable drivers. Furthermore, our results show that seminoma mutation rates are five times higher than previously thought, but are nevertheless low compared to other common cancers. Similar low rates are seen in other cancers that also have excellent rates of remission achieved with chemotherapy. We examined the DNA sequences of seminomas, the most common type of testicular germ cell cancer. 
Our study identified 96 new genes in which mutations occurred during seminoma development, some of which might contribute to cancer development or progression. The study also showed that the rates of DNA mutations during seminoma development are higher than previously thought, but still lower than for other common solid-organ cancers. Such low rates are also observed among other cancers that, like seminomas, show excellent rates of disease remission after chemotherapy. Copyright © 2015 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Rhizobium favelukesii sp. nov., isolated from the root nodules of alfalfa (Medicago sativa L).

PubMed

Torres Tejerizo, Gonzalo; Rogel, Marco Antonio; Ormeño-Orrillo, Ernesto; Althabegoiti, María Julia; Nilsson, Juliet Fernanda; Niehaus, Karsten; Schlüter, Andreas; Pühler, Alfred; Del Papa, María Florencia; Lagares, Antonio; Martínez-Romero, Esperanza; Pistorio, Mariano

2016-11-01

Strains LPU83T and Or191 of the genus Rhizobium were isolated from the root nodules of alfalfa, grown in acid soils from Argentina and the USA. These two strains, which shared the same plasmid pattern, lipopolysaccharide profile, insertion-sequence fingerprint, 16S rRNA gene sequence and PCR-fingerprinting pattern, were different from reference strains representing species of the genus Rhizobium with validly published names. On the basis of previously reported data and from new DNA-DNA hybridization results, phenotypic characterization and phylogenetic analyses, strains LPU83T and Or191 can be considered to be representatives of a novel species of the genus Rhizobium, for which the name Rhizobium favelukesii sp. nov. is proposed. The type strain of this species is LPU83T (=CECT 9014T=LMG 29160T), for which an improved draft-genome sequence is available.
Discriminating between stabilizing and destabilizing protein design mutations via recombination and simulation.

PubMed

Johnson, Lucas B; Gintner, Lucas P; Park, Sehoo; Snow, Christopher D

2015-08-01

Accuracy of current computational protein design (CPD) methods is limited by inherent approximations in energy potentials and sampling. These limitations are often used to qualitatively explain design failures; however, relatively few studies provide specific examples or quantitative details that can be used to improve future CPD methods. Expanding the design method to include a library of sequences provides data that is well suited for discriminating between stabilizing and destabilizing design elements. Using thermophilic endoglucanase E1 from Acidothermus cellulolyticus as a model enzyme, we computationally designed a sequence with 60 mutations. The design sequence was rationally divided into structural blocks and recombined with the wild-type sequence. Resulting chimeras were assessed for activity and thermostability. Surprisingly, unlike previous chimera libraries, regression analysis based on one- and two-body effects was not sufficient for predicting chimera stability. Analysis of molecular dynamics simulations proved helpful in distinguishing stabilizing and destabilizing mutations. Reverting to the wild-type amino acid at destabilized sites partially regained design stability, and introducing predicted stabilizing mutations in wild-type E1 significantly enhanced thermostability. The ability to isolate stabilizing and destabilizing elements in computational design offers an opportunity to interpret previous design failures and improve future CPD methods. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Documentation for the machine-readable version of the Smithsonian Astrophysical Observatory Star catalogue (SAO) version 1984

NASA Technical Reports Server (NTRS)

Roman, N. G.; Warren, W. H., Jr.

1984-01-01

An updated, corrected and extended machine readable version of the Smithsonian Astrophysical Observatory star catalog (SAO) is described. Published and unpublished errors discovered in the previous version have been corrected, and multiple star and supplemental BD identifications added to stars where more than one SAO entry has the same Durchmusterung number. Henry Draper Extension (HDE) numbers have been added for stars found in both volumes of the extension. Data for duplicate SAO entries (those referring to the same star) have been blanked out, but the records themselves have been retained and flagged so that sequencing and record count are identical to the published catalog.
Widespread occurrence of organelle genome-encoded 5S rRNAs including permuted molecules.

PubMed

Valach, Matus; Burger, Gertraud; Gray, Michael W; Lang, B Franz

2014-12-16

5S Ribosomal RNA (5S rRNA) is a universal component of ribosomes, and the corresponding gene is easily identified in archaeal, bacterial and nuclear genome sequences. However, organelle gene homologs (rrn5) appear to be absent from most mitochondrial and several chloroplast genomes. Here, we re-examine the distribution of organelle rrn5 by building mitochondrion- and plastid-specific covariance models (CMs) with which we screened organelle genome sequences. We not only recover all organelle rrn5 genes annotated in GenBank records, but also identify more than 50 previously unrecognized homologs in mitochondrial genomes of various stramenopiles, red algae, cryptomonads, malawimonads and apusozoans, and surprisingly, in the apicoplast (highly derived plastid) genomes of the coccidian pathogens Toxoplasma gondii and Eimeria tenella. Comparative modeling of RNA secondary structure reveals that mitochondrial 5S rRNAs from brown algae adopt a permuted triskelion shape that has not been seen elsewhere. Expression of the newly predicted rrn5 genes is confirmed experimentally in 10 instances, based on our own and published RNA-Seq data. This study establishes that particularly mitochondrial 5S rRNA has a much broader taxonomic distribution and a much larger structural variability than previously thought. The newly developed CMs will be made available via the Rfam database and the MFannot organelle genome annotator. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
A novel homozygous truncating GNAT1 mutation implicated in retinal degeneration.

PubMed

Carrigan, Matthew; Duignan, Emma; Humphries, Pete; Palfi, Arpad; Kenna, Paul F; Farrar, G Jane

2016-04-01

The GNAT1 gene encodes the α subunit of the rod transducin protein, a key element in the rod phototransduction cascade. Variants in GNAT1 have been implicated in stationary night-blindness in the past, but unlike other proteins in the same pathway, it has not previously been implicated in retinitis pigmentosa. A panel of 182 retinopathy-associated genes was sequenced to locate disease-causing mutations in patients with inherited retinopathies. Sequencing revealed a novel homozygous truncating mutation in the GNAT1 gene in a patient with significant pigmentary disturbance and constriction of visual fields, a presentation consistent with retinitis pigmentosa. This is the first report of a patient homozygous for a complete loss-of-function GNAT1 mutation. The clinical data from this patient provide definitive evidence of retinitis pigmentosa with late onset in addition to the lifelong night-blindness that would be expected from a lack of transducin function. These data suggest that some truncating GNAT1 variants can indeed cause a recessive, mild, late-onset retinal degeneration in human beings rather than just stationary night-blindness as reported previously, with notable similarities to the phenotype of the Gnat1 knockout mouse. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Random Amplification and Pyrosequencing for Identification of Novel Viral Genome Sequences

PubMed Central

Hang, Jun; Forshey, Brett M.; Kochel, Tadeusz J.; Li, Tao; Solórzano, Víctor Fiestas; Halsey, Eric S.; Kuschner, Robert A.

2012-01-01

ssRNA viruses have high levels of genomic divergence, which can lead to difficulty in genomic characterization of new viruses using traditional PCR amplification and sequencing methods. In this study, random reverse transcription, anchored random PCR amplification, and high-throughput pyrosequencing were used to identify orthobunyavirus sequences from total RNA extracted from viral cultures of acute febrile illness specimens. Draft genome sequence for the orthobunyavirus L segment was assembled and sequentially extended using de novo assembly contigs from pyrosequencing reads and orthobunyavirus sequences in GenBank as guidance. Accuracy and continuous coverage were achieved by mapping all reads to the L segment draft sequence. Subsequently, RT-PCR and Sanger sequencing were used to complete the genome sequence. The complete L segment was found to be 6936 bases in length, encoding a 2248-aa putative RNA polymerase. The identified L segment was distinct from previously published South American orthobunyaviruses, sharing 63% and 54% identity at the nucleotide and amino acid level, respectively, with the complete Oropouche virus L segment and 73% and 81% identity at the nucleotide and amino acid level, respectively, with a partial Caraparu virus L segment. The result demonstrated the effectiveness of a sequence-independent amplification and next-generation sequencing approach for obtaining complete viral genomes from total nucleic acid extracts and its use in pathogen discovery. PMID:22468136
Breast cancer brain metastases show increased levels of genomic aberration based homologous recombination deficiency scores relative to their corresponding primary tumors.

PubMed

Diossy, M; Reiniger, L; Sztupinszki, Z; Krzystanek, M; Timms, K M; Neff, C; Solimeno, C; Pruss, D; Eklund, A C; Tóth, E; Kiss, O; Rusz, O; Cserni, G; Zombori, T; Székely, B; Tímár, J; Csabai, I; Szallasi, Z

2018-06-18

Based on its mechanism of action, PARP inhibitor therapy is expected to benefit mainly tumor cases with homologous recombination deficiency (HRD). Therefore, identification of tumor types with increased HRD is important for the optimal use of this class of therapeutic agents. HRD levels can be estimated using various mutational signatures from next generation sequencing data and we used this approach to determine whether breast cancer brain metastases show altered levels of HRD scores relative to their corresponding primary tumor. We used a previously published next generation sequencing dataset of twenty-one matched primary breast cancer/brain metastasis pairs to derive the various mutational signatures/HRD scores strongly associated with HRD. We also performed the myChoice HRD analysis on an independent cohort of seventeen breast cancer patients with matched primary/brain metastasis pairs. All of the mutational signatures indicative of HRD showed a significant increase in the brain metastases relative to their matched primary tumor in the previously published whole exome sequencing dataset. In the independent validation cohort the myChoice HRD assay showed an increased level in 87.5% of the brain metastases relative to the primary tumor, with 56% of brain metastases being HRD positive according to the myChoice criteria. The consistent observation that brain metastases of breast cancer tend to have higher HRD measures may raise the possibility that brain metastases may be more sensitive to PARP inhibitor treatment. This observation warrants further investigation to assess whether this increase is common to other metastatic sites as well, and whether clinical trials should adjust their strategy in the application of HRD measures for the prioritization of patients for PARP inhibitor therapy.
mtDNA and the Origin of the Icelanders: Deciphering Signals of Recent Population History

PubMed Central

Helgason, Agnar; Sigurðardóttir, Sigrún; Gulcher, Jeffrey R.; Ward, Ryk; Stefánsson, Kári

2000-01-01

Previous attempts to investigate the origin of the Icelanders have provided estimates of ancestry ranging from a 98% British Isles contribution to an 86% Scandinavian contribution. We generated mitochondrial sequence data for 401 Icelandic individuals and compared these data with >2,500 other European sequences from published sources, to determine the probable origins of women who contributed to Iceland’s settlement. Although the mean number of base-pair differences is high in the Icelandic sequences and they are widely distributed in the overall European mtDNA phylogeny, we find a smaller number of distinct mitochondrial lineages, compared with most other European populations. The frequencies of a number of mtDNA lineages in the Icelanders deviate noticeably from those in neighboring populations, suggesting that founder effects and genetic drift may have had a considerable influence on the Icelandic gene pool. This is in accordance with available demographic evidence about Icelandic population history. A comparison with published mtDNA lineages from European populations indicates that, whereas most founding females probably originated from Scandinavia and the British Isles, lesser contributions from other populations may also have taken place. We present a highly resolved phylogenetic network for the Icelandic data, identifying a number of previously unreported mtDNA lineage clusters and providing a detailed depiction of the evolutionary relationships between European mtDNA clusters. Our findings indicate that European populations contain a large number of closely related mitochondrial lineages, many of which have not yet been sampled in the current comparative data set. Consequently, substantial increases in sample sizes that use mtDNA data will be needed to obtain valid estimates of the diverse ancestral mixtures that ultimately gave rise to contemporary populations. PMID:10712214
High resolution identity testing of inactivated poliovirus vaccines.

PubMed

Mee, Edward T; Minor, Philip D; Martin, Javier

2015-07-09

Definitive identification of poliovirus strains in vaccines is essential for quality control, particularly where multiple wild-type and Sabin strains are produced in the same facility. Sequence-based identification provides the ultimate in identity testing and would offer several advantages over serological methods. We employed random RT-PCR and high throughput sequencing to recover full-length genome sequences from monovalent and trivalent poliovirus vaccine products at various stages of the manufacturing process. All expected strains were detected in previously characterised products and the method permitted identification of strains comprising as little as 0.1% of sequence reads. Highly similar Mahoney and Sabin 1 strains were readily discriminated on the basis of specific variant positions. Analysis of a product known to contain incorrect strains demonstrated that the method correctly identified the contaminants. Random RT-PCR and shotgun sequencing provided high resolution identification of vaccine components. In addition to the recovery of full-length genome sequences, the method could also be easily adapted to the characterisation of minor variant frequencies and distinction of closely related products on the basis of distinguishing consensus and low frequency polymorphisms. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
On the value of nuclear and mitochondrial gene sequences for reconstructing the phylogeny of vanilloid orchids (Vanilloideae, Orchidaceae)

PubMed Central

Cameron, Kenneth M.

2009-01-01

Background and Aims: Most molecular phylogenetic studies of Orchidaceae have relied heavily on DNA sequences from the plastid genome. Nuclear and mitochondrial loci have only been superficially examined for their systematic value. Since 40% of the genera within Vanilloideae are achlorophyllous mycoheterotrophs, this is an ideal group of orchids in which to evaluate non-plastid gene sequences. Methods: Phylogenetic reconstructions for Vanilloideae were produced using independent and combined data from the nuclear 18S, 5.8S and 26S rDNA genes and the mitochondrial atpA gene and nad1b-c intron. Key Results: These new data indicate placements for genera such as Lecanorchis and Galeola, for which plastid gene sequences have been mostly unavailable. Nuclear and mitochondrial parsimony jackknife trees are congruent with each other and with previously published trees based solely on plastid data. Because of high rates of sequence divergence among vanilloid orchids, even the short 5.8S rDNA gene provides impressive levels of resolution and support. Conclusions: Orchid systematists are encouraged to sequence nuclear and mitochondrial gene regions along with the growing number of plastid loci available. PMID:19251715
VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research

PubMed Central

Lai, Zhongwu; Markovets, Aleksandra; Ahdesmaki, Miika; Chapman, Brad; Hofmann, Oliver; McEwen, Robert; Johnson, Justin; Dougherty, Brian; Barrett, J. Carl; Dry, Jonathan R.

2016-01-01

Accurate variant calling in next generation sequencing (NGS) is critical to understand cancer genomes better. Here we present VarDict, a novel and versatile variant caller for both DNA- and RNA-sequencing data. VarDict simultaneously calls SNV, MNV, InDels, complex and structural variants, expanding the detected genetic driver landscape of tumors. It performs local realignments on the fly for more accurate allele frequency estimation. VarDict performance scales linearly to sequencing depth, enabling ultra-deep sequencing used to explore tumor evolution or detect tumor DNA circulating in blood. In addition, VarDict performs amplicon aware variant calling for polymerase chain reaction (PCR)-based targeted sequencing often used in diagnostic settings, and is able to detect PCR artifacts. Finally, VarDict also detects differences in somatic and loss of heterozygosity variants between paired samples. VarDict reprocessing of The Cancer Genome Atlas (TCGA) Lung Adenocarcinoma dataset called known driver mutations in KRAS, EGFR, BRAF, PIK3CA and MET in 16% more patients than previously published variant calls. We believe VarDict will greatly facilitate application of NGS in clinical cancer research. PMID:27060149
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.

PubMed

Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J

1999-01-01

Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
Recently published protein sequences. I.

NASA Technical Reports Server (NTRS)

Jukes, T. H.; Holmquist, R.

1972-01-01

Some polypeptide sequences that have been published in the 1972 scientific literature are listed. Only selected sequences are included. The compilation has two objectives: to assemble current information in the periods between publication of more comprehensive compilations, and to encourage the use of data that do not include arrangements of unsequenced peptides for 'maximum homology'.
Detection of aneurysmal subarachnoid hemorrhage 3 months after initial bleeding: evaluation of T2* and FLAIR MR sequences at 3 T in comparison with initial non-enhanced CT as a gold standard.

PubMed

Mulé, Sébastien; Soize, Sébastien; Benaissa, Azzedine; Portefaix, Christophe; Pierot, Laurent

2016-08-01

To investigate the ability of T2* and fluid-attenuated inversion recovery (FLAIR) MR sequences to detect hemosiderin deposition 3 months after aneurysmal subarachnoid hemorrhage (SAH) in comparison with early non-enhanced CT (NECT) as a gold standard. From September 2008 through May 2013, patients with aneurysmal SAH were included if a NECT less than 24 h after the onset of symptoms showed a SAH, and MRI, including T2* and FLAIR sequences, was performed 3 months later. All aneurysms were treated endovascularly. NECT and MR sequences were blindly analyzed for the presence of SAH (NECT) or hemosiderin deposition (MRI). When positive, details of the spatial distribution of SAH or hemosiderin deposits were noted. Sensitivities were calculated for each patient. Sensitivities, specificities, and positive predictive values (PPVs) were calculated for each location. Forty-nine patients (mean age 52.9 years) were included. Bleeding-related patterns were identified in 43 patients (87.8%) on T2* and 10 patients (20.4%) on FLAIR. T2* was highly predictive of the location of the initial hemorrhage, especially in the Sylvian cisterns (PPVs 95% and 100%) and the anterior interhemispheric fissure (PPV 90%). The T2* sequence can detect and localize a previous SAH a few months after aneurysmal bleeding. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
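The accuracy measures named in this abstract follow from simple ratios of confusion-matrix counts. The sketch below is illustrative only: the `Counts` class and the function names are assumptions introduced here, not taken from the study. The per-patient counts are derived from the reported 43 and 10 positive patients out of 49, on the assumption that the remaining patients count as false negatives; the per-location counts are hypothetical, because the abstract does not report them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for an index test against a reference standard."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    tn: int  # index test negative, reference negative
    fn: int  # index test negative, reference positive


def sensitivity(c: Counts) -> float:
    """Fraction of reference-positive cases that the index test detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    """Fraction of reference-negative cases that the index test clears."""
    return c.tn / (c.tn + c.fp)


def ppv(c: Counts) -> float:
    """Fraction of index-test positives that are reference positive."""
    return c.tp / (c.tp + c.fp)


# Per-patient counts derived from the abstract: all 49 included patients had
# SAH on the reference NECT, so only sensitivity is defined at this level.
t2_star = Counts(tp=43, fp=0, tn=0, fn=6)
flair = Counts(tp=10, fp=0, tn=0, fn=39)

# Hypothetical per-location counts, invented purely to exercise the functions.
example_location = Counts(tp=30, fp=10, tn=50, fn=10)

if __name__ == "__main__":
    print(f"T2* per-patient sensitivity: {sensitivity(t2_star):.1%}")
    print(f"FLAIR per-patient sensitivity: {sensitivity(flair):.1%}")
    print(f"Example location PPV: {ppv(example_location):.1%}")
```

Specificity and PPV are only meaningful per location, where some sites are negative on the reference CT, which is presumably why the abstract reports them at that level only.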
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library

PubMed Central

Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul

2005-01-01

The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm and compared to the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%), which aligned with 1011 unique sequences, had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system.
Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
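The clone counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constant names and the `percent` helper are assumptions introduced here, and every figure is copied from the abstract.

```python
# Counts copied from the abstract; the names are illustrative, not the authors'.
TOTAL_CLONES = 2400
POOR_QUALITY = 315
CHARACTERIZED = 918      # known, characterized sequences
UNCHARACTERIZED = 1141   # known but uncharacterized sequences
UNMATCHED = 26           # no match to described rat ESTs or mRNAs


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Sequences left after discarding the poor-quality clones.
analyzed = TOTAL_CLONES - POOR_QUALITY

if __name__ == "__main__":
    print("analyzed sequences:", analyzed)
    print("poor quality (% of all clones):", percent(POOR_QUALITY, TOTAL_CLONES))
    for label, count in [
        ("characterized", CHARACTERIZED),
        ("uncharacterized", UNCHARACTERIZED),
        ("unmatched", UNMATCHED),
    ]:
        print(f"{label} (% of analyzed):", percent(count, analyzed))
```

The three categories sum to the 2085 analyzed sequences, and the rounded percentages agree with the 13%, 44%, 55% and 1% given in the text.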
Approach for classification and taxonomy within family Rickettsiaceae based on the Formal Order Analysis.

PubMed

Shpynov, S; Pozdnichenko, N; Gumenuk, A

2015-01-01

Genome sequences of 36 Rickettsia and Orientia were analyzed using Formal Order Analysis (FOA). This approach takes into account the arrangement of nucleotides in each sequence. A numerical characteristic, the average distance (remoteness) "g", was used to compare genomes. Our results corroborated the previous separation of three groups within the genus Rickettsia, including the typhus group, the classic spotted fever group, and the ancestral group, with Orientia as a separate genus. Rickettsia felis URRWXCal2 and R. akari Hartford were not in the same group based on FOA; therefore designation of a so-called transitional Rickettsia group could not be confirmed with this approach. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Applications of next-generation sequencing to blood and marrow transplantation.

PubMed

Chapman, Michael; Warren, Edus H; Wu, Catherine J

2012-01-01

Since the advent of next-generation sequencing (NGS) in 2005, there has been an explosion of published studies employing the technology to tackle previously intractable questions in many disparate biological fields. This has been coupled with technology development that has occurred at a remarkable pace. This review discusses the potential impact of this new technology on the field of blood and marrow stem cell transplantation. Hematologic malignancies have been at the forefront of those cancers whose genomes have been the subject of NGS. Hence, these studies have opened novel areas of biology that can be exploited for prognostic, diagnostic, and therapeutic means. Because of the unprecedented depth, resolution and accuracy achievable by NGS, this technology is well-suited for providing detailed information on the diversity of receptors that govern antigen recognition; this approach has the potential to contribute important insights into understanding the biologic effects of transplantation. Finally, the ability to perform comprehensive tumor sequencing provides a systematic approach to the discovery of genetic alterations that can encode peptides with restricted tumor expression, and hence serve as potential target antigens of graft-versus-leukemia responses. Altogether, this increasingly affordable technology will undoubtedly impact the future practice and care of patients with hematologic malignancies. Copyright © 2012 American Society for Blood and Marrow Transplantation. Published by Elsevier Inc. All rights reserved.
Transcriptome-wide investigation of genomic imprinting in chicken

PubMed Central

Frésard, Laure; Leroux, Sophie; Servin, Bertrand; Gourichon, David; Dehais, Patrice; Cristobal, Magali San; Marsaud, Nathalie; Vignoles, Florence; Bed'hom, Bertrand; Coville, Jean-Luc; Hormozdiari, Farhad; Beaumont, Catherine; Zerjal, Tatiana; Vignal, Alain; Morisson, Mireille; Lagarrigue, Sandrine; Pitel, Frédérique

2014-01-01

Genomic imprinting is an epigenetic mechanism by which alleles of some specific genes are expressed in a parent-of-origin manner. It has been observed in mammals and marsupials, but not in birds. Until now, only a few genes orthologous to mammalian imprinted ones have been analyzed in chicken, and they did not show any evidence of imprinting in this species. However, several published observations, such as imprinted-like QTL in poultry or reciprocal effects, keep the question open. Our main objective was thus to screen the entire chicken genome for parental-allele-specific differential expression on whole embryonic transcriptomes, using high-throughput sequencing. To identify the parental origin of each observed haplotype, two chicken experimental populations were used, as inbred and as genetically distant as possible. Two families were produced from two reciprocal crosses. Transcripts from 20 embryos were sequenced using NGS technology, producing ∼200 Gb of sequences. This allowed the detection of 79 potentially imprinted SNPs, through an analysis method that we validated by detecting imprinting in previously published mouse data. However, out of 23 candidates tested by pyrosequencing, none could be confirmed. These results agree, without any a priori assumption, with previous reports and phylogenetic considerations supporting the absence of genomic imprinting in chicken. PMID:24452801

Dissection of the Octoploid Strawberry Genome by Deep Sequencing of the Genomes of Fragaria Species

PubMed Central

Hirakawa, Hideki; Shirasawa, Kenta; Kosugi, Shunichi; Tashiro, Kosuke; Nakayama, Shinobu; Yamada, Manabu; Kohara, Mistuyo; Watanabe, Akiko; Kishida, Yoshie; Fujishiro, Tsunakazu; Tsuruoka, Hisano; Minami, Chiharu; Sasamoto, Shigemi; Kato, Midori; Nanri, Keiko; Komaki, Akiko; Yanagi, Tomohiro; Guoxin, Qin; Maeda, Fumi; Ishikawa, Masami; Kuhara, Satoru; Sato, Shusei; Tabata, Satoshi; Isobe, Sachiko N.

2014-01-01

Cultivated strawberry (Fragaria x ananassa) is octoploid and shows allogamous behaviour. The present study aims at dissecting this octoploid genome through comparison with its wild relatives, F. iinumae, F. nipponica, F. nubicola, and F. orientalis, by de novo whole-genome sequencing on Illumina and Roche 454 platforms. The total length of the assembled Illumina genome sequences obtained was 698 Mb for F. x ananassa, and ∼200 Mb each for the four wild species. Subsequently, a virtual reference genome termed FANhybrid_r1.2 was constructed by integrating the sequences of the four homoeologous subgenomes of F. x ananassa, from which heterozygous regions in the Roche 454 and Illumina genome sequences were eliminated. The total length of FANhybrid_r1.2 thus created was 173.2 Mb with an N50 length of 5137 bp. The Illumina-assembled genome sequences of F. x ananassa and the four wild species were then mapped onto the reference genome, along with the previously published F. vesca genome sequence, to establish the subgenomic structure of F. x ananassa. The strategy adopted in this study has turned out to be successful in dissecting the genome of octoploid F. x ananassa and appears promising when applied to the analysis of other polyploid plant species. PMID:24282021
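The N50 length quoted for the reference genome is a standard assembly statistic: the length L such that sequences of length L or longer together contain at least half of all assembled bases. The sketch below shows one common way to compute it. The function name `n50` and the toy contig lengths are assumptions for illustration; the sketch does not reproduce the study's data or its assembly pipeline.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one sequence length")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return ordered[-1]  # not reached for positive lengths; kept as a guard


if __name__ == "__main__":
    # Toy contig lengths in bp, invented for this example.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print("N50 of toy contigs:", n50(toy_contigs))
```

An N50 of 5137 bp for a 173.2 Mb assembly indicates that the assembly consists of many short sequences.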
Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

PubMed

Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

2018-03-01

Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.

PubMed

Kelly, Steven; Maini, Philip K

2013-01-01

The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
Incidental germline variants in 1000 advanced cancers on a prospective somatic genomic profiling protocol.

PubMed

Meric-Bernstam, F; Brusco, L; Daniels, M; Wathoo, C; Bailey, A M; Strong, L; Shaw, K; Lu, K; Qi, Y; Zhao, H; Lara-Guerra, H; Litton, J; Arun, B; Eterovic, A K; Aytac, U; Routbort, M; Subbiah, V; Janku, F; Davies, M A; Kopetz, S; Mendelsohn, J; Mills, G B; Chen, K

2016-05-01

Next-generation sequencing in cancer research may reveal germline variants of clinical significance. We report patient preferences for return of results and the prevalence of incidental pathogenic germline variants (PGVs). Targeted exome sequencing of 202 genes was carried out in 1000 advanced cancers using tumor and normal DNA in a research laboratory. Pathogenic variants in 18 genes, recommended for return by The American College of Medical Genetics and Genomics, as well as PALB2, were considered actionable. Patient preferences for return of incidental germline results were collected. Return of results was initiated with genetic counseling and repeat CLIA testing. Of the 1000 patients who underwent sequencing, 43 had likely PGVs: APC (1), BRCA1 (11), BRCA2 (10), TP53 (10), MSH2 (1), MSH6 (4), PALB2 (2), PTEN (2), TSC2 (1), and RB1 (1). Twenty (47%) of 43 variants were previously known based on clinical genetic testing. Of the 1167 patients who consented for a germline testing protocol, 1157 (99%) desired to be informed of incidental results. Twenty-three previously unrecognized mutations identified in the research environment were confirmed with an orthogonal CLIA platform. All patients approached decided to proceed with formal genetic counseling; in all cases where formal genetic testing was carried out, the germline variant of concern validated with clinical genetic testing. In this series, 2.3% of patients had previously unrecognized pathogenic germline mutations in 19 cancer-related genes. Thus, genomic sequencing must be accompanied by a plan for return of germline results, in partnership with genetic counseling. © The Author 2016. Published by Oxford University Press on behalf of the European Society for Medical Oncology. All rights reserved.
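The percentages quoted in the abstract above can be checked directly from the counts it gives. The short sketch below only re-derives those figures; the variable names are ours, and no data beyond the numbers in the abstract are used.

```python
# Per-gene counts of likely pathogenic germline variants, as listed in the abstract.
pgv_by_gene = {
    "APC": 1, "BRCA1": 11, "BRCA2": 10, "TP53": 10, "MSH2": 1,
    "MSH6": 4, "PALB2": 2, "PTEN": 2, "TSC2": 1, "RB1": 1,
}

sequenced = 1000          # patients who underwent sequencing
previously_known = 20     # variants already known from clinical testing
consented = 1167          # patients who consented to the germline protocol
wanted_results = 1157     # of those, patients who wanted incidental results

likely_pgv = sum(pgv_by_gene.values())             # 43
newly_recognized = likely_pgv - previously_known   # 23

print(f"likely PGVs: {likely_pgv}")
print(f"previously known: {previously_known / likely_pgv:.0%}")       # 47%
print(f"newly recognized: {newly_recognized / sequenced:.1%}")        # 2.3%
print(f"wanted incidental results: {wanted_results / consented:.0%}") # 99%
```

The per-gene counts sum to the reported 43, and 43 minus the 20 already-known variants gives the 23 previously unrecognized mutations, which is the source of the 2.3% figure.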
Nursing theory and concept development: a theoretical model of clinical nurses' intentions to stay in their current positions.

PubMed

Cowden, Tracy L; Cummings, Greta G

2012-07-01

We describe a theoretical model of staff nurses' intentions to stay in their current positions. The global nursing shortage and high nursing turnover rate demand evidence-based retention strategies. Inconsistent study outcomes indicate a need for testable theoretical models of intent to stay that build on previously published models, are reflective of current empirical research and identify causal relationships between model concepts. Two systematic reviews of electronic databases of English language published articles between 1985-2011. This complex, testable model expands on previous models and includes nurses' affective and cognitive responses to work and their effects on nurses' intent to stay. The concepts of desire to stay, job satisfaction, joy at work, and moral distress are included in the model to capture the emotional response of nurses to their work environments. The influence of leadership is integrated within the model. A causal understanding of clinical nurses' intent to stay and the effects of leadership on the development of that intention will facilitate the development of effective retention strategies internationally. Testing theoretical models is necessary to confirm previous research outcomes and to identify plausible sequences of the development of behavioral intentions. Increased understanding of the causal influences on nurses' intent to stay should lead to strategies that may result in higher retention rates and numbers of nurses willing to work in the health sector. © 2012 Blackwell Publishing Ltd.
Paleomagnetic investigation of some volcanic rocks from the McMurdo volcanic province, Antarctica

USGS Publications Warehouse

Mankinen, E.A.; Cox, A.

1988-01-01

Paleomagnetic data for lava flows from sporadic but long-lived eruptions in the McMurdo Sound region are combined with previously published geologic and geochronologic data to determine the general eruptive sequence of the area. Lava flows in the Walcott Bay area were erupted during the Gauss Normal, Matuyama Reversed, and Brunhes Normal Polarity Chrons. The youngest flows on Black Island probably erupted near the boundary between the Gilbert and Gauss chrons. The most recent activity was concentrated on the volcanic edifices of Mounts Morning and Discovery and on Ross Island sampled during this study with those of eight flows that were published previously yields a mean paleomagnetic pole at 87.3°N, 317.3°E (α95 = 6.3°). The ancient geomagnetic field dispersion about this mean pole is 23.5°, with upper and lower limits of 95% confidence equal to 27.4° and 20.5°, respectively. This value probably is a reasonable estimate of secular variation for the Antarctic continent during Pliocene and Pleistocene time. -Authors
Whole exome sequencing frequently detects a monogenic cause in early onset nephrolithiasis and nephrocalcinosis.

PubMed

Daga, Ankana; Majmundar, Amar J; Braun, Daniela A; Gee, Heon Yung; Lawson, Jennifer A; Shril, Shirlee; Jobst-Schwan, Tilman; Vivante, Asaf; Schapiro, David; Tan, Weizhen; Warejko, Jillian K; Widmeier, Eugen; Nelson, Caleb P; Fathy, Hanan M; Gucev, Zoran; Soliman, Neveen A; Hashmi, Seema; Halbritter, Jan; Halty, Margarita; Kari, Jameela A; El-Desoky, Sherif; Ferguson, Michael A; Somers, Michael J G; Traum, Avram Z; Stein, Deborah R; Daouk, Ghaleb H; Rodig, Nancy M; Katz, Avi; Hanna, Christian; Schwaderer, Andrew L; Sayer, John A; Wassner, Ari J; Mane, Shrikant; Lifton, Richard P; Milosevic, Danko; Tasic, Velibor; Baum, Michelle A; Hildebrandt, Friedhelm

2018-01-01

The incidence of nephrolithiasis continues to rise. Previously, we showed that a monogenic cause could be detected in 11.4% of individuals with adult-onset nephrolithiasis or nephrocalcinosis and in 16.7-20.8% of individuals with onset before 18 years of age, using gene panel sequencing of 30 genes known to cause nephrolithiasis/nephrocalcinosis. To overcome the limitations of panel sequencing, we utilized whole exome sequencing in 51 families, who presented before age 25 years with at least one renal stone or with a renal ultrasound finding of nephrocalcinosis to identify the underlying molecular genetic cause of disease. In 15 of 51 families, we detected a monogenic causative mutation by whole exome sequencing. A mutation in seven recessive genes (AGXT, ATP6V1B1, CLDN16, CLDN19, GRHPR, SLC3A1, SLC12A1), in one dominant gene (SLC9A3R1), and in one gene (SLC34A1) with both recessive and dominant inheritance was detected. Seven of the 19 different mutations were not previously described as disease-causing. In one family, a causative mutation in one of 117 genes that may represent phenocopies of nephrolithiasis-causing genes was detected. In nine of 15 families, the genetic diagnosis may have specific implications for stone management and prevention. Several factors that correlated with the higher detection rate in our cohort were younger age at onset of nephrolithiasis/nephrocalcinosis, presence of multiple affected members in a family, and presence of consanguinity. Thus, we established whole exome sequencing as an efficient approach toward a molecular genetic diagnosis in individuals with nephrolithiasis/nephrocalcinosis who manifest before age 25 years. Copyright © 2017 International Society of Nephrology. Published by Elsevier Inc. All rights reserved.
High-throughput sequence-based analysis of the bacterial composition of kefir and an associated kefir grain.

PubMed

Dobson, Alleson; O'Sullivan, Orla; Cotter, Paul D; Ross, Paul; Hill, Colin

2011-07-01

Lacticin 3147 is a two-peptide broad spectrum lantibiotic produced by Lactococcus lactis DPC3147 shown to inhibit a number of clinically relevant Gram-positive pathogens. Initially isolated from an Irish kefir grain, lacticin 3147 is one of the most extensively studied lantibiotics to date. In this study, the bacterial diversity of the Irish kefir grain from which L. lactis DPC3147 was originally isolated was for the first time investigated using a high-throughput parallel sequencing strategy. A total of 17 416 unique V4 variable regions of the 16S rRNA gene were analysed from both the kefir starter grain and its derivative kefir-fermented milk. Firmicutes (which includes the lactic acid bacteria) was the dominant phylum accounting for > 92% of sequences. Within the Firmicutes, dramatic differences in abundance were observed when the starter grain and kefir milk fermentate were compared. The kefir grain-associated bacterial community was largely composed of the Lactobacillaceae family while Streptococcaceae (primarily Lactococcus spp.) was the dominant family within the kefir milk fermentate. Sequencing data confirmed previous findings that the microbiota of kefir milk and the starter grain are quite different while at the same time, establishing that the microbial diversity of the starter grain is not uniform with a greater level of diversity associated with the interior kefir starter grain compared with the exterior. © 2011 Teagasc Food Research Centre, Moorepark. FEMS Microbiology Letters © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd.
Species diversity and phylogeographical affinities of the Branchiopoda (Crustacea) of Churchill, Manitoba, Canada.

PubMed

Jeffery, Nicholas W; Elías-Gutiérrez, Manuel; Adamowicz, Sarah J

2011-01-01

The region of Churchill, Manitoba, contains a wide variety of habitats representative of both the boreal forest and arctic tundra and has been used as a model site for biodiversity studies for nearly seven decades within Canada. Much previous work has been done in Churchill to study the Daphnia pulex species complex in particular, but no study has completed a wide-scale survey on the crustacean species that inhabit Churchill's aquatic ecosystems using molecular markers. We have employed DNA barcoding to study the diversity of the Branchiopoda (Crustacea) in a wide variety of freshwater habitats and to determine the likely origins of the Churchill fauna following the last glaciation. The standard animal barcode marker (COI) was sequenced for 327 specimens, and a 3% divergence threshold was used to delineate potential species. We found 42 provisional and valid branchiopod species from this survey alone, including several cryptic lineages, in comparison with the 25 previously recorded from previous ecological works. Using published sequence data, we explored the phylogeographic affinities of Churchill's branchiopods, finding that the Churchill fauna apparently originated from all directions from multiple glacial refugia (including southern, Beringian, and high arctic regions). Overall, these microcrustaceans are very diverse in Churchill and contain multiple species complexes. The present study introduces among the first sequences for some understudied genera, for which further work is required to delineate species boundaries and develop a more complete understanding of branchiopod diversity over a larger spatial scale.
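The abstract above delineates provisional species with a 3% COI divergence threshold but does not describe the distance model or the clustering procedure. As a rough illustration only, the sketch below assumes uncorrected p-distances on pre-aligned sequences and single-linkage grouping; both choices, the function names, and the toy sequences are assumptions rather than the authors' pipeline.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cluster_by_threshold(seqs, threshold=0.03):
    """Group specimens by single linkage: any pair within the threshold is merged.

    seqs maps specimen name -> aligned sequence. Returns sorted lists of names.
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical 100-site toy alignment: s2 differs from s1 at 2% of sites.
toy = {
    "s1": "A" * 100,
    "s2": "A" * 98 + "CC",
    "s3": "G" * 100,
}
print(cluster_by_threshold(toy))  # [['s1', 's2'], ['s3']]
```

Single linkage is the simplest choice but can chain divergent specimens together through intermediates, so a real analysis would likely use a corrected distance and a dedicated barcode clustering tool.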
Species Diversity and Phylogeographical Affinities of the Branchiopoda (Crustacea) of Churchill, Manitoba, Canada

PubMed Central

Jeffery, Nicholas W.; Elías-Gutiérrez, Manuel; Adamowicz, Sarah J.

2011-01-01

The region of Churchill, Manitoba, contains a wide variety of habitats representative of both the boreal forest and arctic tundra and has been used as a model site for biodiversity studies for nearly seven decades within Canada. Much previous work has been done in Churchill to study the Daphnia pulex species complex in particular, but no study has completed a wide-scale survey on the crustacean species that inhabit Churchill's aquatic ecosystems using molecular markers. We have employed DNA barcoding to study the diversity of the Branchiopoda (Crustacea) in a wide variety of freshwater habitats and to determine the likely origins of the Churchill fauna following the last glaciation. The standard animal barcode marker (COI) was sequenced for 327 specimens, and a 3% divergence threshold was used to delineate potential species. We found 42 provisional and valid branchiopod species from this survey alone, including several cryptic lineages, in comparison with the 25 previously recorded from previous ecological works. Using published sequence data, we explored the phylogeographic affinities of Churchill's branchiopods, finding that the Churchill fauna apparently originated from all directions from multiple glacial refugia (including southern, Beringian, and high arctic regions). Overall, these microcrustaceans are very diverse in Churchill and contain multiple species complexes. The present study introduces among the first sequences for some understudied genera, for which further work is required to delineate species boundaries and develop a more complete understanding of branchiopod diversity over a larger spatial scale. PMID:21610864
Deglycosylated Filovirus Glycoproteins as Effective Vaccine Immunogens

DTIC Science & Technology

2015-11-01

pre-fusion EBOV GP1,2 ΔTM structure (PDB ID: 3CSY) that lacks the MLD was performed as previously described (22, 23). Briefly, the published... structure lacks four NGS in GP1 due to disordered regions missing from the structure (N204 and N296) or mutations that promoted crystallization... (N40 and N228) (20, 21). The EBOV GP sequence was submitted to the PHYRE2 protein fold recognition server (16), which provided a structure
Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity.

PubMed

Kim, Hui Kwon; Min, Seonwoo; Song, Myungjae; Jung, Soobin; Choi, Jae Woo; Kim, Younggwang; Lee, Sangeun; Yoon, Sungroh; Kim, Hyongbum Henry

2018-03-01

We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.
Novel sequence variants in the TMIE gene in families with autosomal recessive nonsyndromic hearing impairment

PubMed Central

Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim

2010-01-01

To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551
AMS 4.0: consensus prediction of post-translational modifications in protein sequences.

PubMed

Plewczynski, Dariusz; Basu, Subhadip; Saha, Indrajit

2012-08-01

We present here the 2011 update of the AutoMotif Service (AMS 4.0), which predicts a wide selection of 88 different types of single amino acid post-translational modifications (PTMs) in protein sequences. The selection of experimentally confirmed modifications is acquired from the latest UniProt and Phospho.ELM databases for training. The sequence vicinity of each modified residue is represented using amino acid physicochemical features encoded using high quality indices (HQI) obtained by automatic clustering of known indices extracted from the AAindex database. For each type of numerical representation, the method builds an ensemble of Multi-Layer Perceptron (MLP) pattern classifiers, each optimising a different objective during training (for example the recall, precision or area under the ROC curve (AUC)). The consensus is built using brainstorming technology, which combines multi-objective instances of machine learning algorithms and the data fusion of different training object representations, in order to boost the overall prediction accuracy of conserved short sequence motifs. The performance of AMS 4.0 is compared with the accuracy of previous versions, which were constructed using single machine learning methods (artificial neural networks, support vector machine). Our software improves the average AUC score of the earlier version by close to 7 % as calculated on the test datasets of all 88 PTM types. Moreover, for the selected most-difficult sequence motif types it is able to improve the prediction performance by almost 32 %, when compared with previously used single machine learning methods. In summary, the brainstorming consensus meta-learning methodology on average boosts the AUC score up to around 89 %, averaged over all 88 PTM types. Detailed results for single machine learning methods and the consensus methodology are also provided, together with the comparison to previously published methods and state-of-the-art software tools.
The source code and precompiled binaries of the brainstorming tool are available at http://code.google.com/p/automotifserver/ under Apache 2.0 licensing.
SINE_scan: an efficient tool to discover short interspersed nuclear elements (SINEs) in large-scale genomic datasets.

PubMed

Mao, Hongliang; Wang, Hao

2017-03-01

Short Interspersed Nuclear Elements (SINEs) are transposable elements (TEs) that amplify through a copy-and-paste mode via RNA intermediates. The computational identification of new SINEs is challenging because of their weak structural signals and rapid diversification in sequences. Here we report SINE_Scan, a highly efficient program to predict SINE elements in genomic DNA sequences. SINE_Scan integrates hallmarks of SINE transposition, copy number and structural signals to identify a SINE element. SINE_Scan outperforms the previously published de novo SINE discovery program. It shows high sensitivity and specificity in 19 plant and animal genome assemblies, whose sizes vary from 120 Mb to 3.5 Gb. It identifies numerous new families and substantially increases the estimation of the abundance of SINEs in these genomes. The code of SINE_Scan, implemented in PERL and supported on Linux, is freely available at http://github.com/maohlzj/SINE_Scan. Contact: wangh8@fudan.edu.cn. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Hepatitis E virus and fulminant hepatitis--a virus or host-specific pathology?

PubMed

Smith, Donald B; Simmonds, Peter

2015-04-01

Fulminant hepatitis is a rare outcome of infection with hepatitis E virus. Several recent reports suggest that virus variation is an important determinant of disease progression. To critically examine the evidence that virus-specific factors underlie the development of fulminant hepatitis following hepatitis E virus infection. Published sequence information of hepatitis E virus isolates from patients with and without fulminant hepatitis was collected and analysed using statistical tests to identify associations between virus polymorphisms and disease outcome. Fulminant hepatitis has been reported following infection with all four hepatitis E virus genotypes that infect humans comprising multiple phylogenetic lineages within genotypes 1, 3 and 4. Analysis of virus sequences from individuals infected by a common source did not detect any common substitutions associated with progression to fulminant hepatitis. Re-analysis of previously reported associations between virus substitutions and fulminant hepatitis suggests that these were probably the result of sampling biases. Host-specific factors rather than virus genotype, variants or specific substitutions appear to be responsible for the development of fulminant hepatitis. © 2014 The Authors. Liver International Published by John Wiley & Sons Ltd.
Improved modeling of side-chain–base interactions and plasticity in protein–DNA interface design.

PubMed

Thyme, Summer B; Baker, David; Bradley, Philip

2012-06-08

Combinatorial sequence optimization for protein design requires libraries of discrete side-chain conformations. The discreteness of these libraries is problematic, particularly for long, polar side chains, since favorable interactions can be missed. Previously, an approach to loop remodeling where protein backbone movement is directed by side-chain rotamers predicted to form interactions previously observed in native complexes (termed "motifs") was described. Here, we show how such motif libraries can be incorporated into combinatorial sequence optimization protocols and improve native complex recapitulation. Guided by the motif rotamer searches, we made improvements to the underlying energy function, increasing recapitulation of native interactions. To further test the methods, we carried out a comprehensive experimental scan of amino acid preferences in the I-AniI protein-DNA interface and found that many positions tolerated multiple amino acids. This sequence plasticity is not observed in the computational results because of the fixed-backbone approximation of the model. We improved modeling of this diversity by introducing DNA flexibility and reducing the convergence of the simulated annealing algorithm that drives the design process. In addition to serving as a benchmark, this extensive experimental data set provides insight into the types of interactions essential to maintain the function of this potential gene therapy reagent. Published by Elsevier Ltd.
Recent sequence variation in probe binding site affected detection of respiratory syncytial virus group B by real-time RT-PCR.

PubMed

Kamau, Everlyn; Agoti, Charles N; Lewa, Clement S; Oketch, John; Owor, Betty E; Otieno, Grieven P; Bett, Anne; Cane, Patricia A; Nokes, D James

2017-03-01

Direct immuno-fluorescence test (IFAT) and multiplex real-time RT-PCR have been central to RSV diagnosis in Kilifi, Kenya. Recently, these two methods showed discrepancies with an increasing number of PCR undetectable RSV-B viruses. Establish if mismatches in the primer and probe binding sites could have reduced real-time RT-PCR sensitivity. Nucleoprotein (N) and glycoprotein (G) genes were sequenced for real-time RT-PCR positive and negative samples. Primer and probe binding regions in N gene were checked for mismatches and phylogenetic analyses done to determine molecular epidemiology of these viruses. New primers and probe were designed and tested on the previously real-time RT-PCR negative samples. N gene sequences revealed 3 different mismatches in the probe target site of PCR negative, IFAT positive viruses. The primers target sites had no mismatches. Phylogenetic analysis of N and G genes showed that real-time RT-PCR positive and negative samples fell into distinct clades. Newly designed primers-probe pair improved detection and recovered previous PCR undetectable viruses. An emerging RSV-B variant is undetectable by a quite widely used real-time RT-PCR assay due to polymorphisms that influence probe hybridization affecting PCR accuracy. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
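The check described above, comparing primer and probe sequences against their binding regions to find mismatches, can be illustrated with a short sketch. It assumes the oligo and the target site are already aligned, on the same strand, and of equal length; the function name and the example sequences are hypothetical and are not the assay's actual oligos.

```python
def mismatch_positions(oligo, target_site):
    """Return 0-based positions where an oligo differs from its aligned target site.

    Both sequences are assumed to be written 5'->3' on the same strand.
    """
    if len(oligo) != len(target_site):
        raise ValueError("oligo and target site must have the same length")
    pairs = zip(oligo.upper(), target_site.upper())
    return [i for i, (o, t) in enumerate(pairs) if o != t]


# Hypothetical probe and two hypothetical viral target sites.
probe = "ACGTACGTAC"
detected_site = "ACGTACGTAC"    # perfect match
undetected_site = "ACGAACGTTC"  # differs at two positions

print(mismatch_positions(probe, detected_site))    # []
print(mismatch_positions(probe, undetected_site))  # [3, 8]
```

A real analysis would also need to handle degenerate bases and reverse-complement orientation, which this sketch leaves out.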
Genetic heterogeneity of diffuse large B-cell lymphoma.

PubMed

Zhang, Jenny; Grubor, Vladimir; Love, Cassandra L; Banerjee, Anjishnu; Richards, Kristy L; Mieczkowski, Piotr A; Dunphy, Cherie; Choi, William; Au, Wing Yan; Srivastava, Gopesh; Lugar, Patricia L; Rizzieri, David A; Lagoo, Anand S; Bernal-Mizrachi, Leon; Mann, Karen P; Flowers, Christopher; Naresh, Kikkeri; Evens, Andrew; Gordon, Leo I; Czader, Magdalena; Gill, Javed I; Hsi, Eric D; Liu, Qingquan; Fan, Alice; Walsh, Katherine; Jima, Dereje; Smith, Lisa L; Johnson, Amy J; Byrd, John C; Luftig, Micah A; Ni, Ting; Zhu, Jun; Chadburn, Amy; Levy, Shawn; Dunson, David; Dave, Sandeep S

2013-01-22

Diffuse large B-cell lymphoma (DLBCL) is the most common form of lymphoma in adults. The disease exhibits a striking heterogeneity in gene expression profiles and clinical outcomes, but its genetic causes remain to be fully defined. Through whole genome and exome sequencing, we characterized the genetic diversity of DLBCL. In all, we sequenced 73 DLBCL primary tumors (34 with matched normal DNA). Separately, we sequenced the exomes of 21 DLBCL cell lines. We identified 322 DLBCL cancer genes that were recurrently mutated in primary DLBCLs. We identified recurrent mutations implicating a number of known and not previously identified genes and pathways in DLBCL including those related to chromatin modification (ARID1A and MEF2B), NF-κB (CARD11 and TNFAIP3), PI3 kinase (PIK3CD, PIK3R1, and MTOR), B-cell lineage (IRF8, POU2F2, and GNA13), and WNT signaling (WIF1). We also experimentally validated a mutation in PIK3CD, a gene not previously implicated in lymphomas. The patterns of mutation demonstrated a classic long tail distribution with substantial variation of mutated genes from patient to patient and also between published studies. Thus, our study reveals the tremendous genetic heterogeneity that underlies lymphomas and highlights the need for personalized medicine approaches to treating these patients.
Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes.

PubMed

Fredlake, Christopher P; Hert, Daniel G; Kan, Cheuk-Wai; Chiesl, Thomas N; Root, Brian E; Forster, Ryan E; Barron, Annelise E

2008-01-15

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require approximately 70 min to deliver approximately 650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered "hybrid" mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs.
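The capillary and microchip figures quoted above can be turned into per-minute read rates with simple arithmetic. The sketch below uses only the numbers in the abstract (the capillary values are stated there as approximate); the variable names are ours. The "two-thirds reduction" refers to earlier chip results whose run times are not given in the abstract, so it is not recomputed here.

```python
# Figures from the abstract (capillary values are approximate).
capillary_bases, capillary_minutes = 650, 70
chip_bases, chip_minutes = 600, 6.5

capillary_rate = capillary_bases / capillary_minutes  # bases per minute
chip_rate = chip_bases / chip_minutes                 # bases per minute

print(f"capillary: {capillary_rate:.1f} bases/min")   # 9.3
print(f"microchip: {chip_rate:.1f} bases/min")        # 92.3
print(f"speed-up:  {chip_rate / capillary_rate:.1f}x")  # 9.9x
```

On these figures a single microchip lane delivers bases roughly ten times faster than a capillary, at a slightly shorter read length.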

Ultrafast DNA sequencing on a microchip by a hybrid separation mechanism that gives 600 bases in 6.5 minutes

PubMed Central

Fredlake, Christopher P.; Hert, Daniel G.; Kan, Cheuk-Wai; Chiesl, Thomas N.; Root, Brian E.; Forster, Ryan E.; Barron, Annelise E.

2008-01-01

To realize the immense potential of large-scale genomic sequencing after the completion of the second human genome (Venter's), the costs for the complete sequencing of additional genomes must be dramatically reduced. Among the technologies being developed to reduce sequencing costs, microchip electrophoresis is the only new technology ready to produce the long reads most suitable for the de novo sequencing and assembly of large and complex genomes. Compared with the current paradigm of capillary electrophoresis, microchip systems promise to reduce sequencing costs dramatically by increasing throughput, reducing reagent consumption, and integrating the many steps of the sequencing pipeline onto a single platform. Although capillary-based systems require ≈70 min to deliver ≈650 bases of contiguous sequence, we report sequencing up to 600 bases in just 6.5 min by microchip electrophoresis with a unique polymer matrix/adsorbed polymer wall coating combination. This represents a two-thirds reduction in sequencing time over any previously published chip sequencing result, with comparable read length and sequence quality. We hypothesize that these ultrafast long reads on chips can be achieved because the combined polymer system engenders a recently discovered “hybrid” mechanism of DNA electromigration, in which DNA molecules alternate rapidly between reptating through the intact polymer network and disrupting network entanglements to drag polymers through the solution, similar to dsDNA dynamics we observe in single-molecule DNA imaging studies. Most importantly, these results reveal the surprisingly powerful ability of microchip electrophoresis to provide ultrafast Sanger sequencing, which will translate to increased system throughput and reduced costs. PMID:18184818
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

PubMed Central

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

2013-01-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
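The k-mer sequence features mentioned above are, at their simplest, counts of every length-k substring in a sequence. The sketch below shows one plausible way to build such a feature vector. It is an illustration under our own assumptions, not the kmer-SVM implementation: the function names are hypothetical, and details the abstract does not describe (such as reverse-complement handling or the exact normalization) are left out.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_vector(seq, k):
    """Return k-mer frequencies in a fixed order (AA..., AC..., ..., TT...).

    The fixed ordering gives every sequence a vector of length 4**k,
    which is the kind of input a classifier such as an SVM expects.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values()) or 1  # avoid dividing by zero for short input
    return [counts["".join(p)] / total for p in product("ACGT", repeat=k)]


# Hypothetical toy sequence.
vec = kmer_vector("ACGTAC", 2)
print(len(vec))  # 16
print(vec[1])    # frequency of "AC": 0.4
```

Vectors like these, computed for bound and unbound regions, could then be passed to any standard SVM library for training.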
kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets.

PubMed

Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S; Beer, Michael A

2013-07-01

Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, Chromatin immunoprecipitation followed by sequence assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167-80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org.
Detection of microRNAs in color space.

PubMed

Marco, Antonio; Griffiths-Jones, Sam

2012-02-01

Deep sequencing provides inexpensive opportunities to characterize the transcriptional diversity of known genomes. The AB SOLiD technology generates millions of short sequencing reads in color-space; that is, the raw data is a sequence of colors, where each color represents 2 nt and each nucleotide is represented by two consecutive colors. This strategy is purported to have several advantages, including increased ability to distinguish sequencing errors from polymorphisms. Several programs have been developed to map short reads to genomes in color space. However, a number of previously unexplored technical issues arise when using SOLiD technology to characterize microRNAs. Here we explore these technical difficulties. First, since the sequenced reads are longer than the biological sequences, every read is expected to contain linker fragments. The color-calling error rate increases toward the 3' end of the read such that recognizing the linker sequence for removal becomes problematic. Second, mapping in color space may lead to the loss of the first nucleotide of each read. We propose a sequential trimming and mapping approach to map small RNAs. Using our strategy, we reanalyze three published insect small RNA deep sequencing datasets and characterize 22 new microRNAs. A bash shell script to perform the sequential trimming and mapping procedure, called SeqTrimMap, is available at http://www.mirbase.org/tools/seqtrimmap/ (contact: antonio.marco@manchester.ac.uk). Supplementary data are available at Bioinformatics online.
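
The color-space representation described above can be sketched in a few lines. This is an illustration of the standard two-base encoding (each color is the XOR of the 2-bit codes of adjacent bases), not the SeqTrimMap script itself. The function names are hypothetical, and the input is assumed to be a non-empty string over A, C, G, T.

```python
# 2-bit codes chosen so that XOR of adjacent bases gives the color 0..3.
_BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
_BITS_TO_BASE = "ACGT"


def encode_colors(seq: str) -> tuple[str, list[int]]:
    """Return the first base plus one color per adjacent pair of bases."""
    bits = [_BASE_TO_BITS[b] for b in seq.upper()]
    colors = [a ^ b for a, b in zip(bits, bits[1:])]
    return seq[0].upper(), colors


def decode_colors(first_base: str, colors: list[int]) -> str:
    """Rebuild the nucleotide sequence from a known first base and colors."""
    current = _BASE_TO_BITS[first_base.upper()]
    out = [_BITS_TO_BASE[current]]
    for c in colors:
        current ^= c  # each color maps the previous base to the next one
        out.append(_BITS_TO_BASE[current])
    return "".join(out)


if __name__ == "__main__":
    first, colors = encode_colors("ATGGC")
    print(first, colors)
    print(decode_colors(first, colors))
    print(decode_colors("C", colors))  # wrong first base, different sequence
```

Because decoding depends on the first base, the same colors decode to a different sequence when that base is unknown or wrong, which is one way to see why the first nucleotide of a read is easily lost when mapping in color space.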
Species composition of the genus Saprolegnia in fin fish aquaculture environments, as determined by nucleotide sequence analysis of the nuclear rDNA ITS regions.

PubMed

de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E

2015-01-01

The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Sequence and pattern of expression of a bovine homologue of a human mitochondrial transport protein associated with Grave's disease.

PubMed

Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F

1992-01-01

A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Grave's disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Grave's disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.
Prediction of β-turns in proteins from multiple alignment using neural network

PubMed Central

Kaur, Harpreet; Raghava, Gajendra Pal Singh

2003-01-01

A neural network-based method has been developed for the prediction of β-turns in proteins by using multiple sequence alignment. Two feed-forward back-propagation networks with a single hidden layer are used where the first-sequence structure network is trained with the multiple sequence alignment in the form of PSI-BLAST–generated position-specific scoring matrices. The initial predictions from the first network and PSIPRED-predicted secondary structure are used as input to the second structure-structure network to refine the predictions obtained from the first net. A significant improvement in prediction accuracy has been achieved by using evolutionary information contained in the multiple sequence alignment. The final network yields an overall prediction accuracy of 75.5% when tested by sevenfold cross-validation on a set of 426 nonhomologous protein chains. The corresponding Qpred, Qobs, and Matthews correlation coefficient values are 49.8%, 72.3%, and 0.43, respectively, and are the best among all the previously published β-turn prediction methods. The Web server BetaTPred2 (http://www.imtech.res.in/raghava/betatpred2/) has been developed based on this approach. PMID:12592033
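
The reported accuracy measures can be computed from a per-residue confusion matrix. The sketch below assumes the definitions commonly used for β-turn prediction (overall accuracy, Qpred as the percentage of predicted turns that are correct, Qobs as the percentage of observed turns that are predicted, and the Matthews correlation coefficient). The function name and the example counts are hypothetical and are not taken from the paper.

```python
from math import sqrt


def turn_metrics(tp: int, tn: int, fp: int, fn: int) -> dict[str, float]:
    """Compute Qtotal, Qpred, Qobs (percent) and MCC from confusion counts.

    tp: turn residues predicted as turn      tn: non-turn predicted non-turn
    fp: non-turn residues predicted as turn  fn: turn residues missed
    """
    total = tp + tn + fp + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Qtotal": 100.0 * (tp + tn) / total,
        "Qpred": 100.0 * tp / (tp + fp),
        "Qobs": 100.0 * tp / (tp + fn),
        # MCC is left as 0.0 when a marginal total is zero (a convention).
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    print(turn_metrics(tp=50, tn=30, fp=10, fn=10))
```

Because β-turn residues are a minority class, Qtotal alone can look high for a predictor that rarely calls turns, which is why Qpred, Qobs, and MCC are reported alongside it.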
Lineage-specific genomics: Frequent birth and death in the human genome: The human genome contains many lineage-specific elements created by both sequence and functional turnover.

PubMed

Young, Robert S

2016-07-01

Frequent evolutionary birth and death events have created a large quantity of biologically important, lineage-specific DNA within mammalian genomes. The birth and death of DNA sequences is so frequent that the total number of these insertions and deletions in the human population remains unknown, although there are differences between these groups, e.g. transposable elements contribute predominantly to sequence insertion. Functional turnover - where the activity of a locus is specific to one lineage, but the underlying DNA remains conserved - can also drive birth and death. However, this does not appear to be a major driver of divergent transcriptional regulation. Both sequence and functional turnover have contributed to the birth and death of thousands of functional promoters in the human and mouse genomes. These findings reveal the pervasive nature of evolutionary birth and death and suggest that lineage-specific regions may play an important but previously underappreciated role in human biology and disease. © 2016 The Authors BioEssays Published by WILEY Periodicals, Inc.
High-throughput sequencing of natively paired antibody chains provides evidence for original antigenic sin shaping the antibody response to influenza vaccination.

PubMed

Tan, Yann-Chong; Blum, Lisa K; Kongpachith, Sarah; Ju, Chia-Hsin; Cai, Xiaoyong; Lindstrom, Tamsin M; Sokolove, Jeremy; Robinson, William H

2014-03-01

We developed a DNA barcoding method to enable high-throughput sequencing of the cognate heavy- and light-chain pairs of the antibodies expressed by individual B cells. We used this approach to elucidate the plasmablast antibody response to influenza vaccination. We show that >75% of the rationally selected plasmablast antibodies bind and neutralize influenza, and that antibodies from clonal families, defined by sharing both heavy-chain VJ and light-chain VJ sequence usage, do so most effectively. Vaccine-induced heavy-chain VJ regions contained on average >20 nucleotide mutations as compared to their predicted germline gene sequences, and some vaccine-induced antibodies exhibited higher binding affinities for hemagglutinins derived from prior years' seasonal influenza as compared to their affinities for the immunization strains. Our results show that influenza vaccination induces the recall of memory B cells that express antibodies that previously underwent affinity maturation against prior years' seasonal influenza, suggesting that 'original antigenic sin' shapes the antibody response to influenza vaccination. Published by Elsevier Inc.
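
The clonal-family definition in the abstract (shared heavy-chain VJ and light-chain VJ usage) amounts to grouping paired sequences by a four-part key. A minimal sketch follows, assuming each B cell has already been assigned V and J genes for both chains. The record type, the function name, the gene labels in the example, and the rule that a family needs at least two members are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class PairedAntibody(NamedTuple):
    """One natively paired heavy/light chain from a single barcoded cell."""
    cell_id: str
    heavy_v: str
    heavy_j: str
    light_v: str
    light_j: str


def clonal_families(antibodies: list[PairedAntibody]) -> dict[tuple, list[str]]:
    """Group cells that share heavy-chain VJ and light-chain VJ usage."""
    groups: dict[tuple, list[str]] = defaultdict(list)
    for ab in antibodies:
        key = (ab.heavy_v, ab.heavy_j, ab.light_v, ab.light_j)
        groups[key].append(ab.cell_id)
    # Assumption: a "family" needs at least two member cells.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


if __name__ == "__main__":
    # Placeholder gene labels, not data from the study.
    cells = [
        PairedAntibody("c1", "HV-a", "HJ-a", "LV-a", "LJ-a"),
        PairedAntibody("c2", "HV-a", "HJ-a", "LV-a", "LJ-a"),
        PairedAntibody("c3", "HV-a", "HJ-a", "LV-b", "LJ-a"),
    ]
    print(clonal_families(cells))
```

Keeping the light chain in the key is what native pairing makes possible: cells c1 and c3 share heavy-chain usage but fall into different groups.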
Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

PubMed

Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor

2015-05-19

The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Effects of short read quality and quantity on a de novo vertebrate transcriptome assembly.

PubMed

Garcia, T I; Shen, Y; Catchen, J; Amores, A; Schartl, M; Postlethwait, J; Walter, R B

2012-01-01

For many researchers, next generation sequencing data holds the key to answering a category of questions previously unassailable. One of the important and challenging steps in achieving these goals is accurately assembling the massive quantity of short sequencing reads into full nucleic acid sequences. For research groups working with non-model or wild systems, short read assembly can pose a significant challenge due to the lack of pre-existing EST or genome reference libraries. While many publications describe the overall process of sequencing and assembly, few address the topic of how many and what types of reads are best for assembly. The goal of this project was to use real-world data to explore the effects of read quantity and short read quality scores on the resulting de novo assemblies. Using several samples of short reads of various sizes and qualities we produced many assemblies in an automated manner. We observe how the properties of read length, read quality, and read quantity affect the resulting assemblies and provide some general recommendations based on our real-world data set. Published by Elsevier Inc.
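
One simple way to vary read quality before assembly is to keep only reads whose mean quality score passes a cutoff. The sketch below assumes FASTQ-style quality strings in the Phred+33 encoding and a mean-quality criterion. Both are assumptions for illustration, as the abstract does not say how reads were selected, and the function names are hypothetical.

```python
def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 by default)."""
    if not quality:
        raise ValueError("empty quality string")
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def filter_reads(records: list[tuple[str, str]],
                 min_mean_quality: float) -> list[tuple[str, str]]:
    """Keep (sequence, quality) pairs whose mean Phred score meets the cutoff."""
    return [(seq, qual) for seq, qual in records
            if mean_phred(qual) >= min_mean_quality]


if __name__ == "__main__":
    reads = [("ACGT", "IIII"), ("ACGT", "5555"), ("ACGT", "!!!!")]
    print(filter_reads(reads, min_mean_quality=20))
```

Sweeping the cutoff and the number of retained reads produces a series of inputs whose assemblies can then be compared.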
Very low-depth sequencing in a founder population identifies a cardioprotective APOC3 signal missed by genome-wide imputation.

PubMed

Gilly, Arthur; Ritchie, Graham Rs; Southam, Lorraine; Farmaki, Aliki-Eleni; Tsafantakis, Emmanouil; Dedoussis, George; Zeggini, Eleftheria

2016-06-01

Cohort-wide very low-depth whole-genome sequencing (WGS) can comprehensively capture low-frequency sequence variation for the cost of a dense genome-wide genotyping array. Here, we analyse 1x sequence data across the APOC3 gene in a founder population from the island of Crete in Greece (n = 1239) and find significant evidence for association with blood triglyceride levels with the previously reported R19X cardioprotective null mutation (β = -1.09, σ = 0.163, P = 8.2 × 10⁻¹¹) and a second loss of function mutation, rs138326449 (β = -1.17, σ = 0.188, P = 1.14 × 10⁻⁹). The signal cannot be recapitulated by imputing genome-wide genotype data on a large reference panel of 5122 individuals including 249 with 4x WGS data from the same population. Gene-level meta-analysis with other studies reporting burden signals at APOC3 provides robust evidence for a replicable cardioprotective rare variant aggregation (P = 3.2 × 10⁻³¹, n = 13 480). © The Author 2016. Published by Oxford University Press.
Low-Pass Genome-Wide Sequencing and Variant Inference Using Identity-by-Descent in an Isolated Human Population

PubMed Central

Gusev, A.; Shah, M. J.; Kenny, E. E.; Ramachandran, A.; Lowe, J. K.; Salit, J.; Lee, C. C.; Levandowsky, E. C.; Weaver, T. N.; Doan, Q. C.; Peckham, H. E.; McLaughlin, S. F.; Lyons, M. R.; Sheth, V. N.; Stoffel, M.; De La Vega, F. M.; Friedman, J. M.; Breslow, J. L.

2012-01-01

Whole-genome sequencing in an isolated population with few founders directly ascertains variants from the population bottleneck that may be rare elsewhere. In such populations, shared haplotypes allow imputation of variants in unsequenced samples without resorting to complex statistical methods as in studies of outbred cohorts. We focus on an isolated population cohort from the Pacific Island of Kosrae, Micronesia, where we previously collected SNP array and rich phenotype data for the majority of the population. We report identification of long regions with haplotypes co-inherited between pairs of individuals and methodology to leverage such shared genetic content for imputation. Our estimates show that sequencing as few as 40 personal genomes allows for inference in up to 60% of the 3000-person cohort at the average locus. We ascertained a pilot data set of whole-genome sequences from seven Kosraean individuals, with average 5× coverage. This assay identified 5,735,306 unique sites of which 1,212,831 were previously unknown. Additionally, these variants are unusually enriched for alleles that are rare in other populations when compared to geographic neighbors (published Korean genome SJK). We used the presence of shared haplotypes between the seven Kosraean individuals to estimate expected imputation accuracy of known and novel homozygous variants at 99.6% and 97.3%, respectively. This study presents whole-genome analysis of a homogenous isolate population with emphasis on optimal rare variant inference. PMID:22135348
Discovery and Annotation of Plant Endogenous Target Mimicry Sequences from Public Transcriptome Libraries: A Case Study of Prunus persica.

PubMed

Karakülah, Gökhan

2017-06-28

Novel transcript discovery through RNA sequencing has substantially improved our understanding of the transcriptome dynamics of biological systems. Endogenous target mimicry (eTM) transcripts, a novel class of regulatory molecules, bind to their target microRNAs (miRNAs) by base pairing and block their biological activity. The objective of this study was to provide a computational analysis framework for the prediction of putative eTM sequences in plants, and as an example, to discover previously un-annotated eTMs in the Prunus persica (peach) transcriptome. To this end, two public peach transcriptome libraries downloaded from the Sequence Read Archive (SRA) and a previously published set of long non-coding RNAs (lncRNAs) were investigated with a multi-step analysis pipeline, and 44 putative eTMs were found. Additionally, an eTM-miRNA-mRNA regulatory network module associated with peach fruit organ development was built via integration of the miRNA target information and predicted eTM-miRNA interactions. My findings suggest that one of the most widely expressed miRNA families among diverse plant species, miR156, might be sponged by seven putative eTMs. In addition, the study indicates that eTMs potentially play roles in the regulation of developmental processes in peach fruit by targeting specific miRNAs. In conclusion, by following the step-by-step instructions provided in this study, novel eTMs can be identified and annotated effectively in public plant transcriptome libraries.
Excess of genomic defects in a woolly mammoth on Wrangel island

PubMed Central

Slatkin, Montgomery

2017-01-01

Woolly mammoths (Mammuthus primigenius) populated Siberia, Beringia, and North America during the Pleistocene and early Holocene. Recent breakthroughs in ancient DNA sequencing have allowed for complete genome sequencing for two specimens of woolly mammoths (Palkopoulou et al. 2015). One mammoth specimen is from a mainland population 45,000 years ago when mammoths were plentiful. The second, a 4300 yr old specimen, is derived from an isolated population on Wrangel island where mammoths subsisted with small effective population size more than 43-fold lower than previous populations. These extreme differences in effective population size offer a rare opportunity to test nearly neutral models of genome architecture evolution within a single species. Using these previously published mammoth sequences, we identify deletions, retrogenes, and non-functionalizing point mutations. In the Wrangel island mammoth, we identify a greater number of deletions, a larger proportion of deletions affecting gene sequences, a greater number of candidate retrogenes, and an increased number of premature stop codons. This accumulation of detrimental mutations is consistent with genomic meltdown in response to low effective population sizes in the dwindling mammoth population on Wrangel island. In addition, we observe high rates of loss of olfactory receptors and urinary proteins, either because these loci are non-essential or because they were favored by divergent selective pressures in island environments. Finally, at the locus of FOXQ1 we observe two independent loss-of-function mutations, which would confer a satin coat phenotype in this island woolly mammoth. PMID:28253255
Transcriptome Sequencing of Hevea brasiliensis for Development of Microsatellite Markers and Construction of a Genetic Linkage Map

PubMed Central

Triwitayakorn, Kanokporn; Chatkulkawin, Pornsupa; Kanjanawattanawong, Supanath; Sraphet, Supajit; Yoocha, Thippawan; Sangsrakru, Duangjai; Chanprasert, Juntima; Ngamphiw, Chumpol; Jomchai, Nukoon; Therawattanasuk, Kanikar; Tangphatsornruang, Sithichoke

2011-01-01

To obtain more information on the Hevea brasiliensis genome, we sequenced the transcriptome from the vegetative shoot apex yielding 2 311 497 reads. Clustering and assembly of the reads produced a total of 113 313 unique sequences, comprising 28 387 isotigs and 84 926 singletons. Also, 17 819 expressed sequence tag (EST)-simple sequence repeats (SSRs) were identified from the data set. To demonstrate the use of this EST resource for marker development, primers were designed for 430 of the EST-SSRs. Three hundred and twenty-three primer pairs were amplifiable in H. brasiliensis clones. Polymorphic information content values of 47 selected SSRs among 20 H. brasiliensis clones ranged from 0.13 to 0.71, with an average of 0.51. A dendrogram of genetic similarities between the 20 H. brasiliensis clones using these 47 EST-SSRs suggested two distinct groups that correlated well with clone pedigree. These novel EST-SSRs together with the published SSRs were used for the construction of an integrated parental linkage map of H. brasiliensis based on 81 lines of an F1 mapping population. The map consisted of 97 loci, comprising 37 novel EST-SSRs and 60 published SSRs, distributed on 23 linkage groups and covered 842.9 cM with a mean interval of 11.9 cM and ∼4 loci per linkage group. Although the number of linkage groups exceeds the haploid number (18), several markers shared between homologous linkage groups and the previous map indicated that the F1 map in this study is appropriate for further study in marker-assisted selection. PMID:22086998
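
Polymorphic information content (PIC) is computed from the allele frequencies at a marker. The abstract does not say which formula was used. The sketch below assumes the widely used definition of Botstein et al. (1980), PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², and the function name is hypothetical.

```python
from itertools import combinations


def pic(allele_freqs: list[float]) -> float:
    """Polymorphic information content for one marker.

    Assumes the Botstein et al. (1980) formula:
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2 * p * p * q * q
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    print(pic([0.5, 0.5]))   # two equally frequent alleles
    print(pic([1.0]))        # monomorphic marker
```

Under this formula a marker with two equally frequent alleles scores 0.375, so values above that require at least three alleles.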
Finding the Subcellular Location of Barley, Wheat, Rice and Maize Proteins: The Compendium of Crop Proteins with Annotated Locations (cropPAL).

PubMed

Hooper, Cornelia M; Castleden, Ian R; Aryamanesh, Nader; Jacoby, Richard P; Millar, A Harvey

2016-01-01

Barley, wheat, rice and maize provide the bulk of human nutrition and have extensive industrial use as agricultural products. The genomes of these crops each contain >40,000 genes encoding proteins; however, the major genome databases for these species lack annotation information of protein subcellular location for >80% of these gene products. We address this gap by constructing the compendium of crop protein subcellular locations called crop Proteins with Annotated Locations (cropPAL). Subcellular location is most commonly determined by fluorescent protein tagging of live cells or mass spectrometry detection in subcellular purifications, but can also be predicted from amino acid sequence or protein expression patterns. The cropPAL database collates 556 previously published studies from >300 research institutes in >30 countries, and compiles eight pre-computed subcellular predictions for all Hordeum vulgare, Triticum aestivum, Oryza sativa and Zea mays protein sequences. The data collection, including metadata for proteins and published studies, can be accessed through a search portal at http://crop-PAL.org. The subcellular localization information housed in cropPAL helps to depict plant cells as compartmentalized protein networks that can be investigated for improving crop yield and quality, and developing new biotechnological solutions to agricultural challenges. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Efficient error correction for next-generation sequencing of viral amplicons

PubMed Central

2012-01-01

Background: Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. Results: In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Conclusions: Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm PMID:22759430
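
The general idea behind a frequency threshold can be shown with a deliberately simplified sketch: collapse identical reads into candidate haplotypes, discard those below a cutoff, and renormalize the rest. This is not the authors' ET algorithm, which is calibrated to 454 error patterns such as homopolymers. The function name and the fixed `min_fraction` cutoff are assumptions for illustration only.

```python
from collections import Counter


def filter_haplotypes(reads: list[str], min_fraction: float) -> dict[str, float]:
    """Collapse identical reads into haplotypes, drop those whose share of
    all reads is below min_fraction, and return renormalized frequencies."""
    counts = Counter(reads)
    total = sum(counts.values())
    kept = {hap: n for hap, n in counts.items() if n / total >= min_fraction}
    kept_total = sum(kept.values())
    return {hap: n / kept_total for hap, n in kept.items()}


if __name__ == "__main__":
    # Toy amplicon reads: two frequent variants and one rare variant.
    reads = ["ACGT"] * 6 + ["ACGA"] * 3 + ["TCGT"] * 1
    print(filter_haplotypes(reads, min_fraction=0.2))
```

A fixed cutoff like this also removes genuine rare variants, which is why calibrating the threshold against amplicons with known sequences matters.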
Efficient error correction for next-generation sequencing of viral amplicons.

PubMed

Skums, Pavel; Dimitrova, Zoya; Campo, David S; Vaughan, Gilberto; Rossi, Livia; Forbi, Joseph C; Yokosawa, Jonny; Zelikovsky, Alex; Khudyakov, Yury

2012-06-25

Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm.
Comparative genomics provides new insights into the diversity, physiology, and sexuality of the only industrially exploited tremellomycete: Phaffia rhodozyma

DOE PAGES

Bellora, Nicolas; Moline, Martin; David-Palma, Marcia; ...

2016-11-09

The class Tremellomycete (Agaricomycotina) encompasses more than 380 fungi. Although there are a few edible Tremella spp., the only species with current biotechnological use is the astaxanthin-producing yeast Phaffia rhodozyma (Cystofilobasidiales). Besides astaxanthin, a carotenoid pigment with potent antioxidant activity and great value for aquaculture and pharmaceutical industries, P. rhodozyma possesses multiple exceptional traits of fundamental and applied interest. The aim of this study was to obtain and analyze two new genome sequences of representative strains from the northern (CBS 7918 T, the type strain) and southern hemispheres (CRUB 1149) and compare them to a previously published genome sequence (strain CBS 6938). Furthermore, photoprotection and antioxidant related genes, as well as genes involved in sexual reproduction, were analyzed.

Comparative genomics provides new insights into the diversity, physiology, and sexuality of the only industrially exploited tremellomycete: Phaffia rhodozyma

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bellora, Nicolas; Moline, Martin; David-Palma, Marcia

The class Tremellomycete (Agaricomycotina) encompasses more than 380 fungi. Although there are a few edible Tremella spp., the only species with current biotechnological use is the astaxanthin-producing yeast Phaffia rhodozyma (Cystofilobasidiales). Besides astaxanthin, a carotenoid pigment with potent antioxidant activity and great value for aquaculture and pharmaceutical industries, P. rhodozyma possesses multiple exceptional traits of fundamental and applied interest. The aim of this study was to obtain and analyze two new genome sequences of representative strains from the northern (CBS 7918 T, the type strain) and southern hemispheres (CRUB 1149) and compare them to a previously published genome sequence (strain CBS 6938). Furthermore, photoprotection and antioxidant related genes, as well as genes involved in sexual reproduction, were analyzed.
Uveal melanoma hepatic metastases mutation spectrum analysis using targeted next-generation sequencing of 400 cancer genes.

PubMed

Luscan, A; Just, P A; Briand, A; Burin des Roziers, C; Goussard, P; Nitschké, P; Vidaud, M; Avril, M F; Terris, B; Pasmant, E

2015-04-01

Uveal melanoma (UM) is the most common malignant tumour of the eye. Diagnosis often occurs late in the course of disease, and prognosis is generally poor. Recently, recurrent somatic mutations were described, unravelling additional specific altered pathways in UM. Targeted next-generation sequencing (NGS) can now be applied to the accurate and fast identification of somatic mutations in cancer. The aim of the present study was to characterise the mutation pattern of five UM hepatic metastases with well-defined clinical and pathological features. We analysed the UM mutation spectrum using targeted NGS on 409 cancer genes. Four previously reported genes were found to be recurrently mutated. All tumours presented mutually exclusive GNA11 or GNAQ missense mutations. BAP1 loss-of-function mutations were found in three UMs. SF3B1 missense mutations were found in the two UMs with no BAP1 mutations. We then searched for additional mutation targets. We identified the Arg505Cys mutation in the tumour suppressor FBXW7. The same mutation was previously described in different cancer types, and FBXW7 was recently reported to be mutated in UM exomes. Further studies are required to confirm the implication of FBXW7 in UM tumorigenesis. Elucidating the molecular mechanisms underlying UM tumorigenesis holds the promise for novel and effective targeted UM therapies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses.

PubMed

Paez-Espino, David; Chen, I-Min A; Palaniappan, Krishna; Ratner, Anna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Huang, Jinghua; Markowitz, Victor M; Nielsen, Torben; Huntemann, Marcel; K Reddy, T B; Pavlopoulos, Georgios A; Sullivan, Matthew B; Campbell, Barbara J; Chen, Feng; McMahon, Katherine; Hallam, Steve J; Denef, Vincent; Cavicchioli, Ricardo; Caffrey, Sean M; Streit, Wolfgang R; Webster, John; Handley, Kim M; Salekdeh, Ghasem H; Tsesmetzis, Nicolas; Setubal, Joao C; Pope, Phillip B; Liu, Wen-Tso; Rivers, Adam R; Ivanova, Natalia N; Kyrpides, Nikos C

2017-01-04

Viruses represent the most abundant life forms on the planet. Recent experimental and computational improvements have led to a dramatic increase in the number of viral genome sequences identified primarily from metagenomic samples. As a result of the expanding catalog of metagenomic viral sequences, there exists a need for a comprehensive computational platform integrating all these sequences with associated metadata and analytical tools. Here we present IMG/VR (https://img.jgi.doe.gov/vr/), the largest publicly available database of 3908 isolate reference DNA viruses with 264 413 computationally identified viral contigs from >6000 ecologically diverse metagenomic samples. Approximately half of the viral contigs are grouped into genetically distinct quasi-species clusters. Microbial hosts are predicted for 20 000 viral sequences, revealing nine microbial phyla previously unreported to be infected by viruses. Viral sequences can be queried using a variety of associated metadata, including habitat type and geographic location of the samples, or taxonomic classification according to hallmark viral genes. IMG/VR has a user-friendly interface that allows users to interrogate all integrated data and interact by comparing with external sequences, thus serving as an essential resource in the viral genomics community. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Identification in Marinomonas mediterranea of a novel quinoprotein with glycine oxidase activity.

PubMed

Campillo-Brocal, Jonatan Cristian; Lucas-Elio, Patricia; Sanchez-Amat, Antonio

2013-08-01

A novel enzyme with lysine-epsilon oxidase activity was previously described in the marine bacterium Marinomonas mediterranea. This enzyme differs from other l-amino acid oxidases in not being a flavoprotein but containing a quinone cofactor. It is encoded by an operon with two genes, lodA and lodB. The first one codes for the oxidase, while the second one encodes a protein required for the expression of the former. Genome sequencing of M. mediterranea has revealed that it contains two additional operons encoding proteins with sequence similarity to LodA. In this study, it is shown that one of these genes, Marme_1655, encodes a protein with glycine oxidase activity. This activity differs markedly in substrate range and sensitivity to inhibitors from that of previously described glycine oxidases, which are flavoproteins synthesized by Bacillus. The results presented in this study indicate that the products of the genes with different degrees of similarity to lodA detected in bacterial genomes could constitute a reservoir of different oxidases. © 2013 The Authors. Microbiology Open published by John Wiley & Sons Ltd.
New insights into replication origin characteristics in metazoans

PubMed Central

Puy, Aurore; Rialle, Stéphanie; Kaplan, Noam; Segal, Eran

2012-01-01

We recently reported the identification and characterization of DNA replication origins (Oris) in metazoan cell lines. Here, we describe additional bioinformatic analyses showing that the previously identified GC-rich sequence elements form origin G-rich repeated elements (OGREs) that are present in 67% to 90% of the DNA replication origins from Drosophila to human cells, respectively. Our analyses also show that initiation of DNA synthesis takes place precisely at 160 bp (Drosophila) and 280 bp (mouse) from the OGRE. We also found that in most CpG islands, an OGRE is positioned in opposite orientation on each of the two DNA strands and detected two sites of initiation of DNA synthesis upstream or downstream of each OGRE. Conversely, Oris not associated with CpG islands have a single initiation site. OGRE density along chromosomes correlated with previously published replication timing data. Ori sequences centered on the OGRE are also predicted to have high intrinsic nucleosome occupancy. Finally, OGREs predict G-quadruplex structures at Oris that might be structural elements controlling the choice or activation of replication origins. PMID:22373526
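The abstract does not say how G-quadruplex structures were predicted at OGREs. As a minimal sketch, assuming the widely used sequence heuristic of four runs of at least three guanines separated by loops of 1 to 7 nucleotides (an assumption, not the authors' stated method), candidate motifs on one strand could be located as follows. The function name is hypothetical.

```python
import re

# Assumed heuristic (not taken from the abstract): four G-runs of >= 3 bases,
# separated by three loops of 1-7 nucleotides each.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")

def find_g4_candidates(seq):
    """Return (start, end, motif) for non-overlapping candidate G4 motifs."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]
```

Only the given strand is scanned; a C-rich pattern or the reverse complement would be needed to cover the opposite strand.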
Next-generation sequencing sheds light on the natural history of hepatitis C infection in patients who fail treatment.

PubMed

Abdelrahman, Tamer; Hughes, Joseph; Main, Janice; McLauchlan, John; Thursz, Mark; Thomson, Emma

2015-01-01

High rates of sexually transmitted infection and reinfection with hepatitis C virus (HCV) have recently been reported in human immunodeficiency virus (HIV)-infected men who have sex with men and reinfection has also been described in monoinfected injecting drug users. The diagnosis of reinfection has traditionally been based on direct Sanger sequencing of samples pre- and posttreatment, but not on more sensitive deep sequencing techniques. We studied viral quasispecies dynamics in patients who failed standard of care therapy in a high-risk HIV-infected cohort of patients with early HCV infection to determine whether treatment failure was associated with reinfection or recrudescence of preexisting infection. Paired sequences (pre- and posttreatment) were analyzed. The HCV E2 hypervariable region-1 was amplified using nested reverse-transcription polymerase chain reaction (RT-PCR) with indexed genotype-specific primers and the same products were sequenced using both Sanger and 454 pyrosequencing approaches. Of 99 HIV-infected patients with acute HCV treated with 24-48 weeks of pegylated interferon alpha and ribavirin, 15 failed to achieve a sustained virological response (six relapsed, six had a null response, and three had a partial response). Using direct sequencing, 10/15 patients (66%) had evidence of a previously undetected strain posttreatment; in many studies, this is interpreted as reinfection. However, pyrosequencing revealed that 15/15 (100%) of patients had evidence of persisting infection; 6/15 (40%) patients had evidence of a previously undetected variant present in the posttreatment sample in addition to a variant that was detected at baseline. This could represent superinfection or a limitation of the sensitivity of pyrosequencing. In this high-risk group, the emergence of new viral strains following treatment failure is most commonly associated with emerging dominance of preexisting minority variants rather than reinfection. Superinfection may occur in this cohort but reinfection is overestimated by Sanger sequencing. © 2014 The Authors. Hepatology published by Wiley on behalf of the American Association for the Study of Liver Diseases.
Classification of "Pseudomonas azotocolligans" Anderson 1955, 132, in the genus Sphingomonas as Sphingomonas trueperi sp. nov.

PubMed

Kämpfer, P; Denner, E B; Meyer, S; Moore, E R; Busse, H J

1997-04-01

"Pseudomonas azotocolligans" ATCC 12417T (T = type strain), which was described as a diazotrophic bacterium, was reinvestigated to clarify its taxonomic position. 16S ribosomal DNA sequence comparisons demonstrated that this strain clusters phylogenetically with species of the genus Sphingomonas and represents a new species. The results of investigations of the fatty acid patterns, polar lipid profiles, and quinone system supported this placement. The substrate utilization profile and biochemical characteristics displayed no obvious similarity to the characteristics of any previously described species of the genus Sphingomonas. The new name Sphingomonas trueperi is proposed on the basis of these results and previously published data for the G + C content of the genomic DNA and the polyamine pattern.
High energy PIXE: A tool to characterize multi-layer thick samples

NASA Astrophysics Data System (ADS)

Subercaze, A.; Koumeir, C.; Métivier, V.; Servagent, N.; Guertin, A.; Haddad, F.

2018-02-01

High energy PIXE is a useful and non-destructive tool to characterize multi-layer thick samples such as cultural heritage objects. In a previous work, we demonstrated the possibility to perform quantitative analysis of simple multi-layer samples using high energy PIXE, without any assumption on their composition. In this work an in-depth study of the parameters involved in the method previously published is proposed. Its extension to more complex samples with a repeated layer is also presented. Experiments have been performed at the ARRONAX cyclotron using 68 MeV protons. The thicknesses and sequences of a multi-layer sample including two different layers of the same element have been determined. Performances and limits of this method are presented and discussed.
Fine-scale analysis of 16S rRNA sequences reveals a high level of taxonomic diversity among vaginal Atopobium spp.

PubMed Central

Mendes-Soares, Helena; Krishnan, Vandhana; Settles, Matthew L.; Ravel, Jacques; Brown, Celeste J.; Forney, Larry J.

2015-01-01

Although vaginal microbial communities of some healthy women have high proportions of Atopobium vaginae, the genus Atopobium is more commonly associated with bacterial vaginosis, a syndrome associated with an increased risk of adverse pregnancy outcomes and the transmission of sexually transmitted diseases. Genetic differences within Atopobium species may explain why single species can be associated with both health and disease. We used 16S rRNA gene sequences from previously published studies to explore the taxonomic diversity of the genus Atopobium in vaginal microbial communities of healthy women. Although A. vaginae was the species most commonly found, we also observed three other Atopobium species in the vaginal microbiota, one of which, A. parvulum, was not previously known to reside in the human vagina. Furthermore, we found several potential novel species of the genus Atopobium and multiple phylogenetic clades of A. vaginae. The diversity of Atopobium found in our study, which focused only on samples from healthy women, is greater than previously recognized, suggesting that analysis of samples from women with BV would yield even more diversity. Classification of microbes only to the genus level may thus obfuscate differences that might be important to better understand health or disease. PMID:25778779
The sequence specificity of UV-induced DNA damage in a systematically altered DNA sequence.

PubMed

Khoe, Clairine V; Chung, Long H; Murray, Vincent

2018-06-01

The sequence specificity of UV-induced DNA damage was investigated in a specifically designed DNA plasmid using two procedures: end-labelling and linear amplification. Absorption of UV photons by DNA leads to dimerisation of pyrimidine bases and produces two major photoproducts, cyclobutane pyrimidine dimers (CPDs) and pyrimidine(6-4)pyrimidone photoproducts (6-4PPs). A previous study had determined that two hexanucleotide sequences, 5'-GCTC*AC and 5'-TATT*AA, were high intensity UV-induced DNA damage sites. The UV clone plasmid was constructed by systematically altering each nucleotide of these two hexanucleotide sequences. One of the main goals of this study was to determine the influence of single nucleotide alterations on the intensity of UV-induced DNA damage. The sequence 5'-GCTC*AC was designed to examine the sequence specificity of 6-4PPs and the highest intensity 6-4PP damage sites were found at 5'-GTTC*CC nucleotides. The sequence 5'-TATT*AA was devised to investigate the sequence specificity of CPDs and the highest intensity CPD damage sites were found at 5'-TTTT*CG nucleotides. It was proposed that the tetranucleotide DNA sequence, 5'-YTC*Y (where Y is T or C), was the consensus sequence for the highest intensity UV-induced 6-4PP adduct sites; while it was 5'-YTT*C for the highest intensity UV-induced CPD damage sites. These consensus tetranucleotides are composed entirely of consecutive pyrimidines and must have a DNA conformation that is highly productive for the absorption of UV photons. Crown Copyright © 2018. Published by Elsevier B.V. All rights reserved.
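To make the two consensus tetranucleotides above concrete, the following minimal sketch scans a sequence for 5'-YTC*Y and 5'-YTT*C, where Y is T or C. The function name and dictionary keys are hypothetical; the code only locates motif occurrences and says nothing about damage intensity.

```python
import re

# Y = pyrimidine (T or C), as defined in the abstract.
# Lookahead groups let overlapping occurrences be reported.
MOTIFS = {
    "6-4PP": re.compile(r"(?=([TC]TC[TC]))"),  # 5'-YTC*Y
    "CPD": re.compile(r"(?=([TC]TTC))"),       # 5'-YTT*C
}

def find_consensus_sites(seq):
    """Return {photoproduct: [(start_index, tetranucleotide), ...]}."""
    seq = seq.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }
```

Applied to the highest-intensity sites reported above, GTTCCC contains the 6-4PP consensus (TTCC) and TTTTCG contains the CPD consensus (TTTC).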
Lucinidae/sulfur-oxidizing bacteria: ancestral heritage or opportunistic association? Further insights from the Bohol Sea (the Philippines).

PubMed

Brissac, Terry; Merçot, Hervé; Gros, Olivier

2011-01-01

The first studies of the 16S rRNA gene diversity of the bacterial symbionts found in lucinid clams did not clarify how symbiotic associations had evolved in this group. Indeed, although species-specific associations deriving from a putative ancestral symbiotic association have been described (coevolution scenario), associations between the same bacterial species and various host species (opportunistic scenario) have also been described. Here, we carried out a comparative molecular analysis of hosts, based on 18S and 28S rRNA gene sequences, and of symbionts, based on 16S rRNA gene sequences, to determine as to which evolutionary scenario led to modern lucinid/symbiont associations. For all sequences analyzed, we found only three bacterial symbiont species, two of which are harbored by lucinids colonizing mangrove swamps. The last symbiont is the most common and was found to be independent of biotope or depth. Another interesting feature is the similarity of ctenidial organization of lucinids from the Philippines to those described previously, with the exception that two bacterial morphotypes were observed in two different species (Gloverina rectangularis and Myrtea flabelliformis). Thus, there is apparently no specific association between Lucinidae and their symbionts, the association taking place according to which bacterial species is present in the environment. FEMS Microbiology Ecology © 2010 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. No claim to original French government works.
Assembly and analysis of a male sterile rubber tree mitochondrial genome reveals DNA rearrangement events and a novel transcript.

PubMed

Shearman, Jeremy R; Sangsrakru, Duangjai; Ruang-Areerate, Panthita; Sonthirod, Chutima; Uthaipaisanwong, Pichahpuk; Yoocha, Thippawan; Poopear, Supannee; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

2014-02-10

The rubber tree, Hevea brasiliensis, is an important plant species that is commercially grown to produce latex rubber in many countries. The rubber tree variety BPM 24 exhibits cytoplasmic male sterility, inherited from the variety GT 1. We constructed the rubber tree mitochondrial genome of a cytoplasmic male sterile variety, BPM 24, using 454 sequencing, including 8 kb paired-end libraries, plus Illumina paired-end sequencing. We annotated this mitochondrial genome with the aid of Illumina RNA-seq data and performed comparative analysis. We then compared the sequence of BPM 24 to the contigs of the published rubber tree, variety RRIM 600, and identified a rearrangement that is unique to BPM 24 resulting in a novel transcript containing a portion of atp9. The novel transcript is consistent with changes that cause cytoplasmic male sterility through a slight reduction to ATP production efficiency. The exhaustive nature of the search rules out alternative causes and supports previous findings of novel transcripts causing cytoplasmic male sterility.
Identification, isolation, and N-terminal sequencing of style glycoproteins associated with self-incompatibility in Nicotiana alata.

PubMed

Jahnen, W; Batterham, M P; Clarke, A E; Moritz, R L; Simpson, R J

1989-05-01

S-Gene-associated glycoproteins (S-glycoproteins) from styles of Nicotiana alata, identified by non-equilibrium two-dimensional electrophoresis, were purified by cation exchange fast protein liquid chromatography with yields of 0.5 to 8 micrograms of protein per style, depending on the S-genotype of the plant. The method relies on the highly basic nature of the S-glycoproteins. The elution profiles of the different S-glycoproteins from the fast protein liquid chromatography column were characteristic of each S-glycoprotein, and could be used to establish the S-genotype of plants in outbreeding populations. In all cases, the S-genotype predicted from the style protein profile corresponded to that predicted from DNA gel blot analysis using S-allele-specific DNA probes and to that established by conventional breeding tests. Amino-terminal sequences of five purified S-glycoproteins showed a high degree of homology with the previously published sequences of N. alata and Lycopersicon esculentum S-glycoproteins.
Comprehensive discovery of noncoding RNAs in acute myeloid leukemia cell transcriptomes.

PubMed

Zhang, Jin; Griffith, Malachi; Miller, Christopher A; Griffith, Obi L; Spencer, David H; Walker, Jason R; Magrini, Vincent; McGrath, Sean D; Ly, Amy; Helton, Nichole M; Trissal, Maria; Link, Daniel C; Dang, Ha X; Larson, David E; Kulkarni, Shashikant; Cordes, Matthew G; Fronick, Catrina C; Fulton, Robert S; Klco, Jeffery M; Mardis, Elaine R; Ley, Timothy J; Wilson, Richard K; Maher, Christopher A

2017-11-01

To detect diverse and novel RNA species comprehensively, we compared deep small RNA and RNA sequencing (RNA-seq) methods applied to a primary acute myeloid leukemia (AML) sample. We were able to discover previously unannotated small RNAs using deep sequencing of a library method using broader insert size selection. We analyzed the long noncoding RNA (lncRNA) landscape in AML by comparing deep sequencing from multiple RNA-seq library construction methods for the sample that we studied and then integrating RNA-seq data from 179 AML cases. This identified lncRNAs that are completely novel, differentially expressed, and associated with specific AML subtypes. Our study revealed the complexity of the noncoding RNA transcriptome through a combined strategy of strand-specific small RNA and total RNA-seq. This dataset will serve as an invaluable resource for future RNA-based analyses. Copyright © 2017 ISEH – Society for Hematology and Stem Cells. Published by Elsevier Inc. All rights reserved.
The mitochondrial genome of Pocillopora (Cnidaria: Scleractinia) contains two variable regions: the putative D-loop and a novel ORF of unknown function.

PubMed

Flot, Jean-François; Tillier, Simon

2007-10-15

The complete mitochondrial genomes of two individuals attributed to different morphospecies of the scleractinian coral genus Pocillopora have been sequenced. Both genomes, respectively 17,415 and 17,422 nt long, share the presence of a previously undescribed ORF encoding a putative protein made up of 302 amino acids and of unknown function. Surprisingly, this ORF turns out to be the second most variable region of the mitochondrial genome (1% nucleotide sequence difference between the two individuals) after the putative control region (1.5% sequence difference). Except for the presence of this ORF and for the location of the putative control region, the mitochondrial genome of Pocillopora is organized in a fashion similar to the other scleractinian coral genomes published to date. For the first time in a cnidarian, a putative second origin of replication is described based on its secondary structure similar to the stem-loop structure of O(L), the origin of L-strand replication in vertebrates.
Tracing the neural basis of auditory entrainment.

PubMed

Lehmann, Alexandre; Arias, Diana Jimena; Schönwiesner, Marc

2016-11-19

Neurons in the auditory cortex synchronize their responses to temporal regularities in sound input. This coupling or "entrainment" is thought to facilitate beat extraction and rhythm perception in temporally structured sounds, such as music. As a consequence of such entrainment, the auditory cortex responds to an omitted (silent) sound in a regular sequence. Although previous studies suggest that the auditory brainstem frequency-following response (FFR) exhibits some of the beat-related effects found in the cortex, it is unknown whether omissions of sounds evoke a brainstem response. We simultaneously recorded cortical and brainstem responses to isochronous and irregular sequences of consonant-vowel syllable /da/ that contained sporadic omissions. The auditory cortex responded strongly to omissions, but we found no evidence of evoked responses to omitted stimuli from the auditory brainstem. However, auditory brainstem responses in the isochronous sound sequence were more consistent across trials than in the irregular sequence. These results indicate that the auditory brainstem faithfully encodes short-term acoustic properties of a stimulus and is sensitive to sequence regularity, but does not entrain to isochronous sequences sufficiently to generate overt omission responses, even for sequences that evoke such responses in the cortex. These findings add to our understanding of the processing of sound regularities, which is an important aspect of human cognitive abilities like rhythm, music and speech perception. Copyright © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.
Virtual Genome Walking across the 32 Gb Ambystoma mexicanum genome; assembling gene models and intronic sequence.

PubMed

Evans, Teri; Johnson, Andrew D; Loose, Matthew

2018-01-12

Large repeat rich genomes present challenges for assembly using short read technologies. The 32 Gb axolotl genome is estimated to contain ~19 Gb of repetitive DNA making an assembly from short reads alone effectively impossible. Indeed, this model species has been sequenced to 20× coverage but the reads could not be conventionally assembled. Using an alternative strategy, we have assembled subsets of these reads into scaffolds describing over 19,000 gene models. We call this method Virtual Genome Walking as it locally assembles whole genome reads based on a reference transcriptome, identifying exons and iteratively extending them into surrounding genomic sequence. These assemblies are then linked and refined to generate gene models including upstream and downstream genomic, and intronic, sequence. Our assemblies are validated by comparison with previously published axolotl bacterial artificial chromosome (BAC) sequences. Our analyses of axolotl intron length, intron-exon structure, repeat content and synteny provide novel insights into the genic structure of this model species. This resource will enable new experimental approaches in axolotl, such as ChIP-Seq and CRISPR and aid in future whole genome sequencing efforts. The assembled sequences and annotations presented here are freely available for download from https://tinyurl.com/y8gydc6n . The software pipeline is available from https://github.com/LooseLab/iterassemble .
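The idea of iteratively extending an exon into surrounding genomic sequence can be illustrated with a toy greedy extension. This is a simplified sketch under stated assumptions (error-free reads, unique k-mers, rightward extension only) and is not the authors' iterassemble pipeline; the function name and parameters are hypothetical.

```python
def extend_right(seed, reads, k=8, max_rounds=100):
    """Greedily extend `seed` to the right using reads that contain its last k bases.

    Toy model: assumes error-free reads and that each k-mer tail is unique.
    `max_rounds` bounds the loop so repeats cannot cause endless extension.
    """
    contig = seed
    for _ in range(max_rounds):
        tail = contig[-k:]
        best = ""
        for read in reads:
            pos = read.find(tail)
            if pos == -1:
                continue
            overhang = read[pos + k:]  # bases of the read beyond the contig end
            if len(overhang) > len(best):
                best = overhang
        if not best:
            break  # no read extends the contig any further
        contig += best
    return contig
```

A real pipeline would also extend leftward, tolerate sequencing errors, and handle repeats, which are the main difficulty in a genome of this size.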
Population diversity of Diaphorina citri (Hemiptera: Liviidae) in China based on whole mitochondrial genome sequences.

PubMed

Wu, Fengnian; Jiang, Hongyan; Beattie, G Andrew C; Holford, Paul; Chen, Jianchi; Wallis, Christopher M; Zheng, Zheng; Deng, Xiaoling; Cen, Yijing

2018-04-24

Diaphorina citri (Asian citrus psyllid; ACP) transmits 'Candidatus Liberibacter asiaticus' associated with citrus Huanglongbing (HLB). ACP has been reported in 11 provinces/regions in China, yet its population diversity remains unclear. In this study, we evaluated ACP population diversity in China using representative whole mitochondrial genome (mitogenome) sequences. Additional mitogenome sequences outside China were also acquired and evaluated. The sizes of the 27 ACP mitogenome sequences ranged from 14 986 to 15 030 bp. Along with three previously published mitogenome sequences, the 30 sequences formed three major mitochondrial groups (MGs): MG1, present in southwestern China and occurring at elevations above 1000 m; MG2, present in southeastern China and Southeast Asia (Cambodia, Indonesia, Malaysia, and Vietnam) and occurring at elevations below 180 m; and MG3, present in the USA and Pakistan. Single nucleotide polymorphisms in five genes (cox2, atp8, nad3, nad1 and rrnL) contributed most to ACP diversity. Among these genes, rrnL had the most variation. Mitogenome sequence analyses revealed two major phylogenetic groups of ACP present in China as well as a possible unique group present currently in Pakistan and the USA. The information could have significant implications for current ACP control and HLB management. © 2018 Society of Chemical Industry.
Sensitivity to sequencing depth in single-cell cancer genomics.

PubMed

Alves, João M; Posada, David

2018-04-16

Querying cancer genomes at single-cell resolution is expected to provide a powerful framework to understand in detail the dynamics of cancer evolution. However, given the high costs currently associated with single-cell sequencing, together with the inevitable technical noise arising from single-cell genome amplification, cost-effective strategies that maximize the quality of single-cell data are critically needed. Taking advantage of previously published single-cell whole-genome and whole-exome cancer datasets, we studied the impact of sequencing depth and sampling effort towards single-cell variant detection. Five single-cell whole-genome and whole-exome cancer datasets were independently downscaled to 25, 10, 5, and 1× sequencing depth. For each depth level, ten technical replicates were generated, resulting in a total of 6280 single-cell BAM files. The sensitivity of variant detection, including structural and driver mutations, genotyping, clonal inference, and phylogenetic reconstruction to sequencing depth was evaluated using recent tools specifically designed for single-cell data. Altogether, our results suggest that for relatively large sample sizes (25 or more cells) sequencing single tumor cells at depths > 5× does not drastically improve somatic variant discovery, characterization of clonal genotypes, or estimation of single-cell phylogenies. We suggest that sequencing multiple individual tumor cells at a modest depth represents an effective alternative to explore the mutational landscape and clonal evolutionary patterns of cancer genomes.
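The file count quoted above can be checked with simple arithmetic. The sketch below assumes that every cell contributes exactly one BAM file per depth level and replicate (an assumption, since the abstract gives only the total); under that assumption 6280 files correspond to 157 cells across the five datasets. The helper names are hypothetical, and the second function shows the read fraction a generic downsampling step would keep.

```python
DEPTHS = (25, 10, 5, 1)   # target sequencing depths (x) from the abstract
REPLICATES = 10           # technical replicates per depth level
TOTAL_BAMS = 6280         # total single-cell BAM files reported

def implied_cell_count(total_bams, n_depths, n_replicates):
    """Cells implied if each cell yields one BAM per depth and replicate."""
    per_cell = n_depths * n_replicates
    if total_bams % per_cell:
        raise ValueError("total is not a multiple of depths x replicates")
    return total_bams // per_cell

def subsample_fraction(original_depth, target_depth):
    """Fraction of reads to keep to go from original_depth to target_depth."""
    if target_depth > original_depth:
        raise ValueError("cannot downsample to a higher depth")
    return target_depth / original_depth

print(implied_cell_count(TOTAL_BAMS, len(DEPTHS), REPLICATES))
```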
Sequence variation of functional HTLV-II tax alleles among isolates from an endemic population: lack of evidence for oncogenic determinant in tax.

PubMed

Hjelle, B; Chaney, R

1992-02-01

Human T-cell leukemia-lymphoma virus type II (HTLV-II) has been isolated from patients with hairy cell leukemia (HCL). We previously described a population with longstanding endemic HTLV-II infection, and showed that there is no increased risk for HCL in the affected groups. We thus have direct evidence that the endemic form(s) of HTLV-II cause HCL infrequently, if at all. By comparison, there is reason to suspect that the viruses isolated from patients with HCL had an etiologic role in the disease in those patients. One way to reconcile these conflicting observations is to consider that isolates of HTLV-II might differ in oncogenic potential. To determine whether the structure of the putative oncogenic determinant of HTLV-II, tax2, might differ in the new isolates compared to the tax of the prototype HCL isolate, MO, four new functional tax cDNAs were cloned from new isolates. Sequence analysis showed only minor (0.9-2.0%) amino acid variation compared to the published sequence of MO tax2. Some codons were consistently different from published sequences of the MO virus, but in most cases, such variations were also found in each of two tax2 clones we isolated from the MO T-cell line. These variations rendered the new clones more similar to the tax1 of the pathogenic virus HTLV-I. Thus we find no evidence that pathologic determinants of HTLV-II can be assigned to the tax gene.

Whole-exome sequencing for mutation detection in pediatric disorders of insulin secretion: Maturity onset diabetes of the young and congenital hyperinsulinism.

PubMed

Johnson, S R; Leo, P J; McInerney-Leo, A M; Anderson, L K; Marshall, M; McGown, I; Newell, F; Brown, M A; Conwell, L S; Harris, M; Duncan, E L

2018-06-01

To assess the utility of whole-exome sequencing (WES) for mutation detection in maturity-onset diabetes of the young (MODY) and congenital hyperinsulinism (CHI). MODY and CHI are the two commonest monogenic disorders of glucose-regulated insulin secretion in childhood, with 13 causative genes known for MODY and 10 causative genes identified for CHI. The large number of potential genes makes comprehensive screening using traditional methods expensive and time-consuming. Ten subjects with MODY and five with CHI with known mutations underwent WES using two different exome capture kits (Nimblegen SeqCap EZ Human v3.0 Exome Enrichment Kit, Nextera Rapid Capture Exome Kit). Analysis was blinded to previously identified mutations, and included assessment for large deletions. The target capture of five exome capture technologies was also analyzed using sequencing data from >2800 unrelated samples. Four of five MODY mutations were identified using Nimblegen (including a large deletion in HNF1B). Although targeted, one mutation (in INS) had insufficient coverage for detection. Eleven of eleven mutations (six MODY, five CHI) were identified using Nextera Rapid (including the previously missed mutation). On reconciliation, all mutations concorded with previous data and no additional variants in MODY genes were detected. There were marked differences in the performance of the capture technologies. WES can be useful for screening for MODY/CHI mutations, detecting both point mutations and large deletions. However, capture technologies require careful selection. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Identification of a novel LMF1 nonsense mutation responsible for severe hypertriglyceridemia by targeted next-generation sequencing.

PubMed

Cefalù, Angelo B; Spina, Rossella; Noto, Davide; Ingrassia, Valeria; Valenti, Vincenza; Giammanco, Antonina; Fayer, Francesca; Misiano, Gabriella; Cocorullo, Gianfranco; Scrimali, Chiara; Palesano, Ornella; Altieri, Grazia I; Ganci, Antonina; Barbagallo, Carlo M; Averna, Maurizio R

Severe hypertriglyceridemia (HTG) may result from mutations in genes affecting the intravascular lipolysis of triglyceride (TG)-rich lipoproteins. The aim of this study was to develop a targeted next-generation sequencing panel for the molecular diagnosis of disorders characterized by severe HTG. We developed a targeted customized panel for next-generation sequencing on the Ion Torrent Personal Genome Machine to capture the coding exons and intron/exon boundaries of 18 genes affecting the main pathways of TG synthesis and metabolism. We sequenced 11 samples of patients with severe HTG (TG>885 mg/dL-10 mmol/L): 4 positive controls in whom pathogenic mutations had previously been identified by Sanger sequencing and 7 patients in whom the molecular defect was still unknown. The customized panel was accurate, and it confirmed the genetic variants previously identified in all positive controls with primary severe HTG. Only 1 patient of 7 with HTG was found to be a carrier of a homozygous pathogenic mutation, the third novel mutation of the LMF1 gene (c.1380C>G-p.Y460X). The clinical and molecular familial cascade screening allowed the identification of 2 additional affected siblings and 7 heterozygous carriers of the mutation. We showed that our targeted resequencing approach for genetic diagnosis of severe HTG appears to be accurate, less time consuming, and more economical compared with traditional Sanger resequencing. The identification of pathogenic mutations in candidate genes remains challenging and clinical resequencing should be mainly intended for patients with strong clinical criteria for monogenic severe HTG. Copyright © 2017 National Lipid Association. Published by Elsevier Inc. All rights reserved.
Next-generation sequencing can reveal in vitro-generated PCR crossover products: some artifactual sequences correspond to HLA alleles in the IMGT/HLA database.

PubMed

Holcomb, C L; Rastrou, M; Williams, T C; Goodridge, D; Lazaro, A M; Tilanus, M; Erlich, H A

2014-01-01

The high-resolution human leukocyte antigen (HLA) genotyping assay that we developed using 454 sequencing and Conexio software uses generic polymerase chain reaction (PCR) primers for DRB exon 2. Occasionally, we observed low abundance DRB amplicon sequences that resulted from in vitro PCR 'crossing over' between DRB1 and DRB3/4/5. These hybrid sequences, revealed by the clonal sequencing property of the 454 system, were generally observed at a read depth of 5%-10% of the true alleles. They usually contained at least one mismatch with the IMGT/HLA database, and consequently, were easily recognizable and did not cause a problem for HLA genotyping. Sometimes, however, these artifactual sequences matched a rare allele and the automatic genotype assignment was incorrect. These observations raised two issues: (1) could PCR conditions be modified to reduce such artifacts? and (2) could some of the rare alleles listed in the IMGT/HLA database be artifacts rather than true alleles? Because PCR crossing over occurs during late cycles of PCR, we compared DRB genotypes resulting from 28 and (our standard) 35 cycles of PCR. For all 21 cell line DNAs amplified for 35 cycles, crossover products were detected. In 33% of the cases, these hybrid sequences corresponded to named alleles. With amplification for only 28 cycles, these artifactual sequences were not detectable. To investigate whether some rare alleles in the IMGT/HLA database might be due to PCR artifacts, we analyzed four samples obtained from the investigators who submitted the sequences. In three cases, the sequences were generated from true alleles. In one case, our 454 sequencing revealed an error in the previously submitted sequence. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
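The crossover artifact described in this record can be illustrated with a minimal sketch. It assumes the candidate read and the two parental alleles are already aligned to the same length, and it models a hybrid as a prefix of one allele joined to a suffix of the other. The function name and the toy sequences are hypothetical and are not taken from the assay or from the Conexio software.

```python
def crossover_breakpoints(read, parent_a, parent_b):
    """Return every index i where read == parent_a[:i] + parent_b[i:].

    Hypothetical helper: assumes all three sequences are pre-aligned and of
    equal length. A read identical to either parent is not a hybrid.
    """
    if not (len(read) == len(parent_a) == len(parent_b)):
        raise ValueError("sequences must be aligned to the same length")
    if read in (parent_a, parent_b):
        return []
    return [
        i
        for i in range(1, len(read))
        if read[:i] == parent_a[:i] and read[i:] == parent_b[i:]
    ]


# Toy alleles that differ at positions 1 and 6 (0-based).
allele_a = "ACGTACGT"
allele_b = "ATGTACCT"
hybrid = allele_a[:4] + allele_b[4:]
breakpoints = crossover_breakpoints(hybrid, allele_a, allele_b)
```

Any breakpoint between the two differing positions explains the hybrid equally well, so the sketch reports a range of indices rather than a single point.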
AD-LIBS: inferring ancestry across hybrid genomes using low-coverage sequence data.

PubMed

Schaefer, Nathan K; Shapiro, Beth; Green, Richard E

2017-04-04

Inferring the ancestry of each region of admixed individuals' genomes is useful in studies ranging from disease gene mapping to speciation genetics. Current methods require high-coverage genotype data and phased reference panels, and are therefore inappropriate for many data sets. We present a software application, AD-LIBS, that uses a hidden Markov model to infer ancestry across hybrid genomes without requiring variant calling or phasing. This approach is useful for non-model organisms and in cases of low-coverage data, such as ancient DNA. We demonstrate the utility of AD-LIBS with synthetic data. We then use AD-LIBS to infer ancestry in two published data sets: European human genomes with Neanderthal ancestry and brown bear genomes with polar bear ancestry. AD-LIBS correctly infers 87-91% of ancestry in simulations and produces ancestry maps that agree with published results and global ancestry estimates in humans. In brown bears, we find more polar bear ancestry than has been published previously, using both AD-LIBS and an existing software application for local ancestry inference, HAPMIX. We validate AD-LIBS polar bear ancestry maps by recovering a geographic signal within bears that mirrors what is seen in SNP data. Finally, we demonstrate that AD-LIBS is more effective than HAPMIX at inferring ancestry when preexisting phased reference data are unavailable and genomes are sequenced to low coverage. AD-LIBS is an effective tool for ancestry inference that can be used even when few individuals are available for comparison or when genomes are sequenced to low coverage. AD-LIBS is therefore likely to be useful in studies of non-model or ancient organisms that lack large amounts of genomic DNA. AD-LIBS can therefore expand the range of studies in which admixture mapping is a viable tool.
Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses

PubMed Central

Bidon, Tobias; Schreck, Nancy; Hailer, Frank; Nilsson, Maria A.; Janke, Axel

2015-01-01

The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identified 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears. PMID:26019166
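The average read depth approach in this record can be sketched as follows, assuming per-scaffold mean depths have already been computed from the male and female mappings. The function name, the ratio cutoff, and the minimum-depth cutoff are hypothetical illustration values, not the thresholds used in the study.

```python
def y_linked_candidates(male_depth, female_depth, max_ratio=0.3, min_male_depth=1.0):
    """Flag scaffolds covered in the male but (nearly) uncovered in the female.

    male_depth / female_depth: dicts mapping scaffold name -> mean read depth.
    max_ratio and min_male_depth are assumed cutoffs for illustration only.
    """
    candidates = []
    for scaffold, male in male_depth.items():
        female = female_depth.get(scaffold, 0.0)
        # Skip scaffolds with too little male coverage to judge.
        if male >= min_male_depth and female / male <= max_ratio:
            candidates.append(scaffold)
    return sorted(candidates)


male = {"scaf1": 30.0, "scaf2": 15.0, "scaf3": 14.0, "scaf4": 0.2}
female = {"scaf1": 31.0, "scaf2": 30.0, "scaf3": 0.5}
candidates = y_linked_candidates(male, female)
```

In this toy input, `scaf1` behaves like an autosomal scaffold (equal depth), `scaf2` like an X-linked scaffold (female depth about twice the male depth), and only `scaf3` is flagged.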
The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

PubMed

Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

2017-07-03

BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated to a hidden Markov model that allows building template-target alignment suitable for structural modeling. Some other 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
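The pairwise criteria quoted in this record (sequence identity ≥40% with alignment coverage ≥90%) can be turned into clusters in several ways. The sketch below assumes the simplest one, connected components of the similarity graph computed with a union-find structure. The abstract only says "graph-based clustering", so the choice of connected components, the function name, and the toy hits are assumptions.

```python
def cluster_sequences(ids, hits, min_identity=40.0, min_coverage=90.0):
    """Group sequences into connected components of the similarity graph.

    hits: iterable of (id_a, id_b, percent_identity, percent_coverage).
    Connected components are an assumed stand-in for the server's procedure.
    """
    parent = {seq_id: seq_id for seq_id in ids}

    def find(x):
        # Path-halving find.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in hits:
        if identity >= min_identity and coverage >= min_coverage:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for seq_id in ids:
        clusters.setdefault(find(seq_id), []).append(seq_id)
    return sorted(sorted(members) for members in clusters.values())


ids = ["P1", "P2", "P3", "P4"]
hits = [
    ("P1", "P2", 55.0, 95.0),  # passes both thresholds
    ("P2", "P3", 42.0, 91.0),  # passes both thresholds
    ("P3", "P4", 80.0, 50.0),  # fails the coverage threshold
]
clusters = cluster_sequences(ids, hits)
```

Here `P4` ends up alone, which corresponds to what the record calls a singleton.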
Genetic diversity of the DBLalpha region in Plasmodium falciparum var genes among Asia-Pacific isolates.

PubMed

Fowler, Elizabeth V; Peters, Jennifer M; Gatton, Michelle L; Chen, Nanhua; Cheng, Qin

2002-03-01

In Plasmodium falciparum a highly polymorphic multi-copy gene family, var, encodes the variant surface antigen P. falciparum erythrocyte membrane protein 1 (PfEMP1), which has an important role in cytoadherence and immune evasion. Using previously described universal PCR primers for the first Duffy binding-like domain (DBLalpha) of var we analysed the DBLalpha repertoires of Dd2 (originally from Thailand) and eight isolates from the Solomon Islands (n=4), Philippines (n=2), Papua New Guinea (n=1) and Africa (n=1). We found 15-32 unique DBLalpha sequence types among these isolates and estimated detectable DBLalpha repertoire sizes ranging from 33-38 to 52-57 copies per genome. Our data suggest that var gene repertoires generally consist of 40-50 copies per genome. Eighteen DBLalpha sequences appeared in more than one Asia-Pacific isolate with the number of sequences shared between any two isolates ranging from 0 to 6 (mean=2.0 +/-1.6). At the amino acid level DBLalpha sequence similarity within isolates ranged from 45.2 +/- 7.1 to 50.2 +/- 6.9%, and was not significantly different from the DBLalpha amino acid sequence similarity among isolates (P>0.1). Comparisons with published sequences also revealed little overlap among DBLalpha sequences from different regions. High DBLalpha sequence diversity and minimal overlap among these isolates suggest that the global var gene repertoire is immense, and may potentially be selected for by the host's protective immune response to the var gene products, PfEMP1.
Auditory and visual sequence learning in humans and monkeys using an artificial grammar learning paradigm.

PubMed

Milne, Alice E; Petkov, Christopher I; Wilson, Benjamin

2017-07-05

Language flexibly supports the human ability to communicate using different sensory modalities, such as writing and reading in the visual modality and speaking and listening in the auditory domain. Although it has been argued that nonhuman primate communication abilities are inherently multisensory, direct behavioural comparisons between human and nonhuman primates are scant. Artificial grammar learning (AGL) tasks and statistical learning experiments can be used to emulate ordering relationships between words in a sentence. However, previous comparative work using such paradigms has primarily investigated sequence learning within a single sensory modality. We used an AGL paradigm to evaluate how humans and macaque monkeys learn and respond to identically structured sequences of either auditory or visual stimuli. In the auditory and visual experiments, we found that both species were sensitive to the ordering relationships between elements in the sequences. Moreover, the humans and monkeys produced largely similar response patterns to the visual and auditory sequences, indicating that the sequences are processed in comparable ways across the sensory modalities. These results provide evidence that human sequence processing abilities stem from an evolutionarily conserved capacity that appears to operate comparably across the sensory modalities in both human and nonhuman primates. The findings set the stage for future neurobiological studies to investigate the multisensory nature of these sequencing operations in nonhuman primates and how they compare to related processes in humans. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Phylogenomics from Whole Genome Sequences Using aTRAM.

PubMed

Allen, Julie M; Boyd, Bret; Nguyen, Nam-Phuong; Vachaspati, Pranjal; Warnow, Tandy; Huang, Daisie I; Grady, Patrick G S; Bell, Kayce C; Cronk, Quentin C B; Mugisha, Lawrence; Pittendrigh, Barry R; Leonardi, M Soledad; Reed, David L; Johnson, Kevin P

2017-09-01

Novel sequencing technologies are rapidly expanding the size of data sets that can be applied to phylogenetic studies. Currently the most commonly used phylogenomic approaches involve some form of genome reduction. While these approaches make assembling phylogenomic data sets more economical for organisms with large genomes, they reduce the genomic coverage and thereby the long-term utility of the data. Currently, for organisms with moderate to small genomes (<1000 Mbp) it is feasible to sequence the entire genome at modest coverage (10-30×). Computational challenges for handling these large data sets can be alleviated by assembling targeted reads, rather than assembling the entire genome, to produce a phylogenomic data matrix. Here we demonstrate the use of automated Target Restricted Assembly Method (aTRAM) to assemble 1107 single-copy ortholog genes from whole genome sequencing of sucking lice (Anoplura) and out-groups. We developed a pipeline to extract exon sequences from the aTRAM assemblies by annotating them with respect to the original target protein. We aligned these protein sequences with the inferred amino acids and then performed phylogenetic analyses on both the concatenated matrix of genes and on each gene separately in a coalescent analysis. Finally, we tested the limits of successful assembly in aTRAM by assembling 100 genes from close- to distantly related taxa at high to low levels of coverage. Both the concatenated analysis and the coalescent-based analysis produced the same tree topology, which was consistent with previously published results and resolved weakly supported nodes. These results demonstrate that this approach is successful at developing phylogenomic data sets from raw genome sequencing reads. Further, we found that with coverages above 5-10×, aTRAM was successful at assembling 80-90% of the contigs for both close and distantly related taxa.
As sequencing costs continue to decline, we expect full genome sequencing will become more feasible for a wider array of organisms, and aTRAM will enable mining of these genomic data sets for an extensive variety of applications, including phylogenomics. [aTRAM; gene assembly; genome sequencing; phylogenomics.] © The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Distributed biotin-streptavidin transcription roadblocks for mapping cotranscriptional RNA folding.

PubMed

Strobel, Eric J; Watters, Kyle E; Nedialkov, Yuri; Artsimovitch, Irina; Lucks, Julius B

2017-07-07

RNA folding during transcription directs an order of folding that can determine RNA structure and function. However, the experimental study of cotranscriptional RNA folding has been limited by the lack of easily approachable methods that can interrogate nascent RNA structure at nucleotide resolution. To address this, we previously developed cotranscriptional selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq) to simultaneously probe all intermediate RNA transcripts during transcription by stalling elongation complexes at catalytically dead EcoRIE111Q roadblocks. While effective, the distribution of elongation complexes using EcoRIE111Q requires laborious PCR using many different oligonucleotides for each sequence analyzed. Here, we improve the broad applicability of cotranscriptional SHAPE-Seq by developing a sequence-independent biotin-streptavidin (SAv) roadblocking strategy that simplifies the preparation of roadblocking DNA templates. We first determine the properties of biotin-SAv roadblocks. We then show that randomly distributed biotin-SAv roadblocks can be used in cotranscriptional SHAPE-Seq experiments to identify the same RNA structural transitions related to a riboswitch decision-making process that we previously identified using EcoRIE111Q. Lastly, we find that EcoRIE111Q maps nascent RNA structure to specific transcript lengths more precisely than biotin-SAv and propose guidelines to leverage the complementary strengths of each transcription roadblock in cotranscriptional SHAPE-Seq. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Historian: accurate reconstruction of ancestral sequences and evolutionary rates.

PubMed

Holmes, Ian H

2017-04-15

Reconstruction of ancestral sequence histories, and estimation of parameters like indel rates, are improved by using explicit evolutionary models and summing over uncertain alignments. The previous best tool for this purpose (according to simulation benchmarks) was ProtPal, but this tool was too slow for practical use. Historian combines an efficient reimplementation of the ProtPal algorithm with performance-improving heuristics from other alignment tools. Simulation results on fidelity of rate estimation via ancestral reconstruction, along with evaluations on the structurally informed alignment dataset BAliBase 3.0, recommend Historian over other alignment tools for evolutionary applications. Historian is available at https://github.com/evoldoers/historian under the Creative Commons Attribution 3.0 US license. Contact: ihholmes+historian@gmail.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Induction of surfactin production in Bacillus subtilis by gsp, a gene located upstream of the gramicidin S operon in Bacillus brevis.

PubMed Central

Borchert, S; Stachelhaus, T; Marahiel, M A

1994-01-01

The deduced amino acid sequence of the gsp gene, located upstream of the 5' end of the gramicidin S operon (grs operon) in Bacillus brevis, showed a high degree of similarity to the sfp gene product, which is located downstream of the srfA operon in B. subtilis. The gsp gene complemented in trans a defect in the sfp gene (sfpO) and promoted production of the lipopeptide antibiotic surfactin. The functional homology of Gsp and Sfp and the sequence similarity of these two proteins to EntD suggest that the three proteins represent a new class of proteins involved in peptide secretion, in support of a hypothesis published previously (T. H. Grossman, M. Tuckman, S. Ellestad, and M. S. Osburne, J. Bacteriol. 175:6203-6211, 1993). PMID:7512553
Redescription and phylogenetic relationships of Euparyphium capitaneum Dietz, 1909, the type-species of Euparyphium Dietz, 1909 (Digenea: Echinostomatidae).

PubMed

Kudlai, Olena; Tkach, Vasyl V; Pulis, Eric E; Kostadinova, Aneta

2015-01-01

Euparyphium capitaneum Dietz, 1909, the type-species of the genus Euparyphium Dietz, 1909, is described on the basis of material collected from the type-host Anhinga anhinga (L.) from Pascagoula River, which drains into the northern coast of the Gulf of Mexico. Combination of light and scanning electron microscopy observations of freshly collected and properly fixed specimens in our study has allowed us to provide novel information on the morphology and topology of the reproductive systems and other morphological features of the species. A Bayesian inference analysis based on the newly-obtained partial sequence of the nuclear 28S rRNA gene for E. capitaneum and 24 previously published sequences from the superfamily Echinostomatoidea Looss, 1899 provided evidence supporting the distinct status of the genera Euparyphium and Isthmiophora Lühe, 1909.
A Case of KCNQ2-Associated Movement Disorder Triggered by Fever.

PubMed

Dhamija, Radhika; Goodkin, Howard P; Bailey, Russell; Chambers, Chelsea; Brenton, J Nicholas

2017-12-01

The differential diagnosis of fever-induced movement disorders in childhood is broad. Whole exome sequencing has yielded new insights into those cases with a suspected genetic basis. We report the case of an 8-year-old boy with a history of neonatal seizures who presented with near-continuous hyperkinetic movements of his limbs during a febrile illness. Initial diagnostic testing did not explain his abnormalities; however, given the suspicion for a channelopathy, whole exome sequencing was performed and it demonstrated a de novo pathogenic heterozygous variant in KCNQ2. There is an expanding phenotypic spectrum of heterozygous alterations in KCNQ2; however, this report provides the first description of a fever-induced hyperkinetic movement disorder in childhood associated with a pathogenic KCNQ2 variant. We also review the literature of cases previously published with the same pathogenic variant.
Chitayat syndrome: hyperphalangism, characteristic facies, hallux valgus and bronchomalacia results from a recurrent c.266A>G p.(Tyr89Cys) variant in the ERF gene.

PubMed

Balasubramanian, M; Lord, H; Levesque, S; Guturu, H; Thuriot, F; Sillon, G; Wenger, A M; Sureka, D L; Lester, T; Johnson, D S; Bowen, J; Calhoun, A R; Viskochil, D H; Bejerano, G; Bernstein, J A; Chitayat, D

2017-03-01

In 1993, Chitayat et al. reported a newborn with hyperphalangism, facial anomalies, and bronchomalacia. We identified three additional families with similar findings. Features include bilateral accessory phalanx resulting in shortened index fingers; hallux valgus; distinctive face; respiratory compromise. To identify the genetic aetiology of Chitayat syndrome and identify a unifying cause for this specific form of hyperphalangism. Through ongoing collaboration, we had collected patients with strikingly-similar phenotype. Trio-based exome sequencing was first performed in Patient 2 through Deciphering Developmental Disorders study. Proband-only exome sequencing had previously been independently performed in Patient 4. Following identification of a candidate gene variant in Patient 2, the same variant was subsequently confirmed from exome data in Patient 4. Sanger sequencing was used to validate this variant in Patients 1, 3; confirm paternal inheritance in Patient 5. A recurrent, novel variant NM_006494.2:c.266A>G p.(Tyr89Cys) in ERF was identified in five affected individuals: de novo (patient 1, 2 and 3) and inherited from an affected father (patient 4 and 5). p.Tyr89Cys is an aromatic polar neutral to polar neutral amino acid substitution, at a highly conserved position and lies within the functionally important ETS-domain of the protein. The recurrent ERF c.266A>G p.(Tyr89Cys) variant causes Chitayat syndrome. ERF variants have previously been associated with complex craniosynostosis. In contrast, none of the patients with the c.266A>G p.(Tyr89Cys) variant have craniosynostosis. We report the molecular aetiology of Chitayat syndrome and discuss potential mechanisms for this distinctive phenotype associated with the p.Tyr89Cys substitution in ERF. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Neoerysiphe kerribeeensis sp. nov. (Ascomycota: Erysiphales), a new species of Neoerysiphe on native and introduced species of Senecio (Asteraceae) in Australia.

PubMed

Beilharz, Vyrna; Cunnington, James H; Pascoe, Ian G

2010-04-01

Anamorphic powdery mildew fungi on introduced taxa of Senecio and Pericallis × hybrida in Australia have previously been identified as Neoerysiphe cumminsiana on the basis of a combination of Euoidium-type conidiophores and lobed mycelial and germ tube appressoria. However, two specimens with chasmothecia on the indigenous Senecio glossanthus did not agree with published descriptions of N. cumminsiana. The teleomorph of the S. glossanthus mildew differed from that of N. cumminsiana in the morphology of its peridial cells, the pigmentation of its appendages, and the morphology and pigmentation of some secondary hyphae. Ribosomal DNA ITS sequences from the two S. glossanthus mildew specimens and five other specimens of Senecio mildews from south-eastern Australia demonstrated that all Australian Senecio mildews are conspecific and distinct from the northern hemisphere Senecio mildew (N. cumminsiana) and from other Neoerysiphe taxa. Based on morphological characters and rDNA sequence data, the Australian Senecio mildew is described as a new species, Neoerysiphe kerribeeensis. This is the first native teleomorphic powdery mildew described from Australia. Copyright © 2010 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Deep sequencing analysis of viral infection and evolution allows rapid and detailed characterization of viral mutant spectrum.

PubMed

Isakov, Ofer; Bordería, Antonio V; Golan, David; Hamenahem, Amir; Celniker, Gershon; Yoffe, Liron; Blanc, Hervé; Vignuzzi, Marco; Shomron, Noam

2015-07-01

The study of RNA virus populations is a challenging task. Each population of RNA virus is composed of a collection of different, yet related genomes often referred to as mutant spectra or quasispecies. Virologists using deep sequencing technologies face major obstacles when studying virus population dynamics, both experimentally and in natural settings, due to the relatively high error rates of these technologies and the lack of high performance pipelines. In order to overcome these hurdles we developed a computational pipeline, termed ViVan (Viral Variance Analysis). ViVan is a complete pipeline facilitating the identification, characterization and comparison of sequence variance in deep sequenced virus populations. Applying ViVan on deep sequenced data obtained from samples that were previously characterized by more classical approaches, we uncovered novel and potentially crucial aspects of virus populations. With our experimental work, we illustrate how ViVan can be used for studies ranging from the more practical, detection of resistant mutations and effects of antiviral treatments, to the more theoretical temporal characterization of the population in evolutionary studies. Freely available on the web at http://www.vivanbioinfo.org. Contact: nshomron@post.tau.ac.il. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kerr, J.M.; Fisher, L.W.; Termine, J.D.

The authors have isolated and partially sequenced the human bone sialoprotein gene (IBSP). IBSP has been sublocalized by in situ hybridization to chromosome 4q28-q31 and is composed of six small exons (51 to 159 bp) and 1 large exon (~2.6 kb). The intron/exon junctions defined by sequence analysis are of class 0, retaining an intact coding triplet. Sequence analysis of the 5′ upstream region revealed a TATAA (nucleotides -30 to -25 from the transcriptional start point) and a CCAAT (nucleotides -56 to -52) box, both in the reverse orientation. Intron 1 contains interesting structural elements composed of polypyrimidine repeats followed by a poly(AC)n tract. Both types of structural elements have been detected in promoter regions of other genes and have been implicated in transcriptional regulation. Several differences between the previously published cDNA sequence and the authors' sequence have been identified, most of which are contained within the untranslated exon 1. Three base revisions in the coding region include a G to T (Gly to Val, amino acid 195), T to C (Val to Ala, amino acid 268), and T to A (Glu to Asp, amino acid 270). In conclusion, the genomic organization and potential regulatory elements of human IBSP have been elucidated. 42 refs., 4 figs., 1 tab.
Exome Sequence Analysis of 14 Families With High Myopia.

PubMed

Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

2017-04-01

To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.
SANSparallel: interactive homology search against Uniprot

PubMed Central

Somervuo, Panu; Holm, Liisa

2015-01-01

Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. PMID:25855811

Evolutionary relationships of lactate dehydrogenases (LDHs) from mammals, birds, an amphibian, fish, barley, and bacteria: LDH cDNA sequences from Xenopus, pig, and rat.

PubMed Central

Tsuji, S; Qureshi, M A; Hou, E W; Fitch, W M; Li, S S

1994-01-01

The nucleotide sequences of the cDNAs encoding LDH (EC 1.1.1.27) subunits LDH-A (muscle), LDH-B (liver), and LDH-C (oocyte) from Xenopus laevis, LDH-A (muscle) and LDH-B (heart) from pig, and LDH-B (heart) and LDH-C (testis) from rat were determined. These seven newly deduced amino acid sequences and 22 other published LDH sequences, and three unpublished fish LDH-A sequences kindly provided by G. N. Somero and D. A. Powers, were used to construct the most parsimonious phylogenetic tree of these 32 LDH subunits from mammals, birds, an amphibian, fish, barley, and bacteria. There have been at least six LDH gene duplications among the vertebrates. The Xenopus LDH-A, LDH-B, and LDH-C subunits are most closely related to each other and then are more closely related to vertebrate LDH-B than LDH-A. Three fish LDH-As, as well as a single LDH of lamprey, also seem to be more related to vertebrate LDH-B than to land vertebrate LDH-A. The mammalian LDH-C (testis) subunit appears to have diverged very early, prior to the divergence of vertebrate LDH-A and LDH-B subunits, as reported previously. PMID:7937776
Genome assembly and transcriptome resource for river buffalo, Bubalus bubalis (2n = 50).

PubMed

Williams, John L; Iamartino, Daniela; Pruitt, Kim D; Sonstegard, Tad; Smith, Timothy P L; Low, Wai Yee; Biagini, Tommaso; Bomba, Lorenzo; Capomaccio, Stefano; Castiglioni, Bianca; Coletta, Angelo; Corrado, Federica; Ferré, Fabrizio; Iannuzzi, Leopoldo; Lawley, Cynthia; Macciotta, Nicolò; McClure, Matthew; Mancini, Giordano; Matassino, Donato; Mazza, Raffaele; Milanesi, Marco; Moioli, Bianca; Morandi, Nicola; Ramunno, Luigi; Peretti, Vincenzo; Pilla, Fabio; Ramelli, Paola; Schroeder, Steven; Strozzi, Francesco; Thibaud-Nissen, Francoise; Zicarelli, Luigi; Ajmone-Marsan, Paolo; Valentini, Alessio; Chillemi, Giovanni; Zimin, Aleksey

2017-10-01

Water buffalo is a globally important species for agriculture and local economies. A de novo assembled, well-annotated reference sequence for the water buffalo is an important prerequisite for studying the biology of this species, and is necessary to manage genetic diversity and to use modern breeding and genomic selection techniques. However, no such genome assembly has been previously reported. There are 2 species of domestic water buffalo, the river (2n = 50) and the swamp (2n = 48) buffalo. Here we describe a draft quality reference sequence for the river buffalo created from Illumina GA and Roche 454 short read sequences using the MaSuRCA assembler. The assembled sequence is 2.83 Gb, consisting of 366,983 scaffolds with a scaffold N50 of 1.41 Mb and contig N50 of 21,398 bp. Annotation of the genome was supported by transcriptome data from 30 tissues and identified 21,711 predicted protein coding genes. Searches for complete mammalian BUSCO gene groups found 98.6% of curated single copy orthologs present among predicted genes, which suggests a high level of completeness of the genome. The annotated sequence is available from NCBI at accession GCA_000471725.1. © The Author 2017. Published by Oxford University Press.
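The scaffold and contig N50 values quoted above summarize assembly contiguity: N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size. A minimal Python sketch of that definition follows; the function name and the toy lengths are illustrative assumptions, not data from the buffalo assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy scaffold lengths: total 300 bp, so half is 150 bp.
toy_lengths = [100, 80, 50, 40, 30]
print(n50(toy_lengths))  # 80, because 100 + 80 is the first running sum >= 150
```

The same function applies to contigs or scaffolds; only the input list changes.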
The spatial alignment of time: Differences in alignment of deictic and sequence time along the sagittal and lateral axes.

PubMed

Walker, Esther J; Bergen, Benjamin K; Núñez, Rafael

2017-04-01

People use space in a variety of ways to structure their thoughts about time. The present report focuses on the different ways that space is employed when reasoning about deictic (past/future relationships) and sequence (earlier/later relationships) time. In the first study, we show that deictic and sequence time are aligned along the lateral axis in a manner consistent with previous work, with past and earlier events associated with left space and future and later events associated with right space. However, the alignment of time with space is different along the sagittal axis. Participants associated future events and earlier events (not later events) with the space in front of their body and past and later events with the space behind, consistent with the sagittal spatial terms (e.g., ahead, in front of) that we use to talk about deictic and sequence time. In the second study, we show that these associations between sequence time and sagittal space are sensitive to person-perspective. This suggests that the particular space-time associations observed in English speakers are influenced by a variety of different spatial properties, including spatial location and perspective. Copyright © 2016. Published by Elsevier B.V.
Toward allotetraploid cotton genome assembly: integration of a high-density molecular genetic linkage map with DNA sequence information

PubMed Central

2012-01-01

Background: Cotton is the world’s most important natural textile fiber and a significant oilseed crop. Decoding cotton genomes will provide the ultimate reference and resource for research and utilization of the species. Integration of high-density genetic maps with genomic sequence information will largely accelerate the process of whole-genome assembly in cotton. Results: In this paper, we update a high-density interspecific genetic linkage map of allotetraploid cultivated cotton. An additional 1,167 marker loci have been added to our previously published map of 2,247 loci. Three new marker types, InDel (insertion-deletion) and SNP (single nucleotide polymorphism) developed from gene information, and REMAP (retrotransposon-microsatellite amplified polymorphism), were used to increase map density. The updated map consists of 3,414 loci in 26 linkage groups covering 3,667.62 cM with an average inter-locus distance of 1.08 cM. Furthermore, genome-wide sequence analysis was finished using 3,324 informative sequence-based markers and publicly-available Gossypium DNA sequence information. A total of 413,113 EST and 195 BAC sequences were physically anchored and clustered by 3,324 sequence-based markers. Of these, 14,243 ESTs and 188 BACs from different species of Gossypium were clustered and specifically anchored to the high-density genetic map. A total of 2,748 candidate unigenes from 2,111 ESTs clusters and 63 BACs were mined for functional annotation and classification. The 337 ESTs/genes related to fiber quality traits were integrated with 132 previously reported cotton fiber quality quantitative trait loci, which demonstrated the important roles in fiber quality of these genes. Higher-level sequence conservation between different cotton species and between the A- and D-subgenomes in tetraploid cotton was found, indicating a common evolutionary origin for orthologous and paralogous loci in Gossypium.
Conclusion: This study will serve as a valuable genomic resource for tetraploid cotton genome assembly, for cloning genes related to superior agronomic traits, and for further comparative genomic analyses in Gossypium. PMID:23046547
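The map statistics in this abstract can be cross-checked with simple arithmetic. The sketch below assumes, and this is an assumption rather than something the abstract states, that the average inter-locus distance is the total map length divided by the number of intervals, with each linkage group of n loci contributing n - 1 intervals. The function names are illustrative.

```python
def total_loci(previous_loci, added_loci):
    """Loci on the updated map: the earlier map plus the newly added markers."""
    return previous_loci + added_loci


def mean_interlocus_distance(map_length_cm, loci, linkage_groups):
    """Average interval in cM, assuming (loci - linkage_groups) intervals overall."""
    intervals = loci - linkage_groups
    return map_length_cm / intervals


loci = total_loci(2247, 1167)
print(loci)  # 3414, matching the abstract
print(round(mean_interlocus_distance(3667.62, loci, 26), 2))  # 1.08
```

Under this assumption the figures reported in the abstract are mutually consistent.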
Interordinal gene capture, the phylogenetic position of Steller's sea cow based on molecular and morphological data, and the macroevolutionary history of Sirenia.

PubMed

Springer, Mark S; Signore, Anthony V; Paijmans, Johanna L A; Vélez-Juarbe, Jorge; Domning, Daryl P; Bauer, Cameron E; He, Kai; Crerar, Lorelei; Campos, Paula F; Murphy, William J; Meredith, Robert W; Gatesy, John; Willerslev, Eske; MacPhee, Ross D E; Hofreiter, Michael; Campbell, Kevin L

2015-10-01

The recently extinct (ca. 1768) Steller's sea cow (Hydrodamalis gigas) was a large, edentulous North Pacific sirenian. The phylogenetic affinities of this taxon to other members of this clade, living and extinct, are uncertain based on previous morphological and molecular studies. We employed hybridization capture methods and second generation sequencing technology to obtain >30kb of exon sequences from 26 nuclear genes for both H. gigas and Dugong dugon. We also obtained complete coding sequences for the tooth-related enamelin (ENAM) gene. Hybridization probes designed using dugong and manatee sequences were both highly effective in retrieving sequences from H. gigas (mean=98.8% coverage), as were more divergent probes for regions of ENAM (99.0% coverage) that were designed exclusively from a proboscidean (African elephant) and a hyracoid (Cape hyrax). New sequences were combined with available sequences for representatives of all other afrotherian orders. We also expanded a previously published morphological matrix for living and fossil Sirenia by adding both new taxa and nine new postcranial characters. Maximum likelihood and parsimony analyses of the molecular data provide robust support for an association of H. gigas and D. dugon to the exclusion of living trichechids (manatees). Parsimony analyses of the morphological data also support the inclusion of H. gigas in Dugongidae with D. dugon and fossil dugongids. Timetree analyses based on calibration density approaches with hard- and soft-bounded constraints suggest that H. gigas and D. dugon diverged in the Oligocene and that crown sirenians last shared a common ancestor in the Eocene. The coding sequence for the ENAM gene in H. gigas does not contain frameshift mutations or stop codons, but there is a transversion mutation (AG to CG) in the acceptor splice site of intron 2. 
This disruption in the edentulous Steller's sea cow is consistent with previous studies that have documented inactivating mutations in tooth-specific loci of a variety of edentulous and enamelless vertebrates including birds, turtles, aardvarks, pangolins, xenarthrans, and baleen whales. Further, branch-site dN/dS analyses provide evidence for positive selection in ENAM on the stem dugongid branch where extensive tooth reduction occurred, followed by neutral evolution on the Hydrodamalis branch. Finally, we present a synthetic evolutionary tree for living and fossil sirenians showing several key innovations in the history of this clade including character state changes that parallel those that occurred in the evolutionary history of cetaceans. Copyright © 2015 Elsevier Inc. All rights reserved.
pyPaSWAS: Python-based multi-core CPU and GPU sequence alignment.

PubMed

Warris, Sven; Timal, N Roshan N; Kempenaar, Marcel; Poortinga, Arne M; van de Geest, Henri; Varbanescu, Ana L; Nap, Jan-Peter

2018-01-01

Our previously published CUDA-only application PaSWAS for Smith-Waterman (SW) sequence alignment of any type of sequence on NVIDIA-based GPUs is platform-specific and therefore adopted less widely than it could be. The OpenCL language is supported more widely and allows use on a variety of hardware platforms. Moreover, there is a need to promote the adoption of parallel computing in bioinformatics by making its use and extension simpler through more and better application of high-level languages commonly used in bioinformatics, such as Python. The novel application pyPaSWAS presents the parallel SW sequence alignment code fully packed in Python. It is a generic SW implementation running on several hardware platforms with multi-core systems and/or GPUs that provides accurate sequence alignments that can also be inspected for alignment details. Additionally, pyPaSWAS supports the affine gap penalty. Python libraries are used for automated system configuration, I/O and logging. This way, the Python environment will stimulate further extension and use of pyPaSWAS. pyPaSWAS presents an easy Python-based environment for accurate and retrievable parallel SW sequence alignments on GPUs and multi-core systems. The strategy of integrating Python with high-performance parallel compute languages to create a developer- and user-friendly environment should be considered for other computationally intensive bioinformatics algorithms.
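For readers unfamiliar with the affine gap penalty mentioned above, the following is a minimal single-threaded sketch of Smith-Waterman local alignment scoring with affine gaps (the Gotoh recurrences). It is not pyPaSWAS code and does not use its API; the function name, the scoring values, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend are all assumptions chosen for illustration.

```python
def sw_affine_score(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    cols = len(b) + 1
    # prev_h: best score of an alignment ending at the previous row, column j.
    # prev_f: best score ending in a vertical gap (characters of a unmatched).
    prev_h = [0] * cols
    prev_f = [neg_inf] * cols
    best = 0
    for i in range(1, len(a) + 1):
        cur_h = [0] * cols
        cur_f = [neg_inf] * cols
        e = neg_inf  # best score ending in a horizontal gap (characters of b unmatched)
        for j in range(1, cols):
            # Either open a new gap from H or extend the gap already running.
            e = max(cur_h[j - 1] + gap_open, e + gap_extend)
            cur_f[j] = max(prev_h[j] + gap_open, prev_f[j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Local alignment: scores never drop below zero.
            cur_h[j] = max(0, prev_h[j - 1] + sub, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


print(sw_affine_score("ACGT", "ACGT"))
print(sw_affine_score("ACGTACGT", "ACGTTTACGT"))
```

In the second call the best alignment bridges the two extra bases with one gap of length two, which the affine scheme penalizes less than two separately opened gaps. A production tool would also keep traceback information so the alignment itself can be reported.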
Two new miniature inverted-repeat transposable elements in the genome of the clam Donax trunculus.

PubMed

Šatović, Eva; Plohl, Miroslav

2017-10-01

Repetitive sequences are important components of eukaryotic genomes that drive their evolution. Among them are different types of mobile elements that share the ability to spread throughout the genome and form interspersed repeats. To broaden the generally scarce knowledge on bivalves at the genome level, in the clam Donax trunculus we described two new non-autonomous DNA transposons, miniature inverted-repeat transposable elements (MITEs), named DTC M1 and DTC M2. Like other MITEs, they are characterized by their small size, their A + T richness, and the presence of terminal inverted repeats (TIRs). DTC M1 and DTC M2 are 261 and 286 bp long, respectively, and in addition to TIRs, both of them contain a long imperfect palindrome sequence in their central parts. These elements are present in complete and truncated versions within the genome of the clam D. trunculus. The two new MITEs share only structural similarity and lack any nucleotide sequence similarity to each other. In a search for related elements in databases, a BLAST search revealed within the Crassostrea gigas genome a larger element sharing sequence similarity to DTC M1 only in its TIR sequences. The lack of sequence similarity with any previously published mobile elements indicates that DTC M1 and DTC M2 elements may be unique to D. trunculus.
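A terminal inverted repeat means that the start of an element reads the same as the reverse complement of its end. The sketch below measures the length of a perfect TIR on a made-up sequence; the function names and the toy element are illustrative assumptions, and real MITE detection would normally tolerate mismatches, which this example does not.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def perfect_tir_length(element):
    """Length of the longest perfect terminal inverted repeat of an element."""
    left = element.upper()
    right_rc = reverse_complement(element)
    limit = len(element) // 2  # the two arms cannot overlap
    n = 0
    while n < limit and left[n] == right_rc[n]:
        n += 1
    return n


# Hypothetical element: a 5-bp arm, an A+T-rich spacer, and the inverted arm.
toy_element = "GGCAT" + "T" * 8 + "ATGCC"
print(perfect_tir_length(toy_element))  # 5
```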
Identification of novel mutations in the α-galactosidase A gene in patients with Fabry disease: pitfalls of mutation analyses in patients with low α-galactosidase A activity.

PubMed

Yoshimitsu, Makoto; Higuchi, Koji; Miyata, Masaaki; Devine, Sean; Mattman, Andre; Sirrs, Sandra; Medin, Jeffrey A; Tei, Chuwa; Takenaka, Toshihiro

2011-05-01

Fabry disease is an X-linked lysosomal storage disorder caused by mutations of the α-galactosidase A (GLA) gene, and the disease is a relatively prevalent cause of left ventricular hypertrophy followed by conduction abnormalities and arrhythmias. Mutation analysis of the GLA gene is a valuable tool for accurate diagnosis of affected families. In this study, we carried out molecular studies of 10 unrelated families diagnosed with Fabry disease. Genetic analysis of the GLA gene using conventional genomic sequencing was performed in 9 hemizygous males and 6 heterozygous females. In patients with no mutations in coding DNA sequence, multiplex ligation-dependent probe amplification (MLPA) and/or cDNA sequencing were performed. We identified a novel exon 2 deletion (IVS1_IVS2) in a heterozygous female by MLPA, which was undetectable by conventional sequencing methods. In addition, the g.9331G>A mutation that has previously been found only in patients with cardiac Fabry disease was found in 3 unrelated, newly-diagnosed, cardiac Fabry patients by sequencing GLA genomic DNA and cDNA. Two other novel mutations, g.8319A>G and 832delA were also found in addition to 4 previously reported mutations (R112C, C142Y, M296I, and G373D) in 6 other families. We could identify GLA gene mutations in all hemizygotes and heterozygotes from 10 families with Fabry disease. Mutations in 4 out of 10 families could not be identified by classical genomic analysis, which focuses on exons and the flanking region. Instead, these data suggest that MLPA analysis and cDNA sequence should be considered in genetic testing surveys of patients with Fabry disease. Copyright © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
'Candidatus Phytoplasma solani', a novel taxon associated with stolbur- and bois noir-related diseases of plants.

PubMed

Quaglino, Fabio; Zhao, Yan; Casati, Paola; Bulgari, Daniela; Bianco, Piero Attilio; Wei, Wei; Davis, Robert Edward

2013-08-01

Phytoplasmas classified in group 16SrXII infect a wide range of plants and are transmitted by polyphagous planthoppers of the family Cixiidae. Based on 16S rRNA gene sequence identity and biological properties, group 16SrXII encompasses several species, including 'Candidatus Phytoplasma australiense', 'Candidatus Phytoplasma japonicum' and 'Candidatus Phytoplasma fragariae'. Other group 16SrXII phytoplasma strains are associated with stolbur disease in wild and cultivated herbaceous and woody plants and with bois noir disease in grapevines (Vitis vinifera L.). Such latter strains have been informally proposed to represent a separate species, 'Candidatus Phytoplasma solani', but a formal description of this taxon has not previously been published. In the present work, stolbur disease strain STOL11 (STOL) was distinguished from reference strains of previously described species of the 'Candidatus Phytoplasma' genus based on 16S rRNA gene sequence similarity and a unique signature sequence in the 16S rRNA gene. Other stolbur- and bois noir-associated ('Ca. Phytoplasma solani') strains shared >99 % 16S rRNA gene sequence similarity with strain STOL11 and contained the signature sequence. 'Ca. Phytoplasma solani' is the only phytoplasma known to be transmitted by Hyalesthes obsoletus. Insect vectorship and molecular characteristics are consistent with the concept that diverse 'Ca. Phytoplasma solani' strains share common properties and represent an ecologically distinct gene pool. Phylogenetic analyses of 16S rRNA, tuf, secY and rplV-rpsC gene sequences supported this view and yielded congruent trees in which 'Ca. Phytoplasma solani' strains formed, within the group 16SrXII clade, a monophyletic subclade that was most closely related to, but distinct from, that of 'Ca. Phytoplasma australiense'-related strains. Based on distinct molecular and biological properties, stolbur- and bois noir-associated strains are proposed to represent a novel species level taxon, 'Ca. Phytoplasma solani'; STOL11 is designated the reference strain.
Whole Exome Sequencing of Patients with Steroid-Resistant Nephrotic Syndrome.

PubMed

Warejko, Jillian K; Tan, Weizhen; Daga, Ankana; Schapiro, David; Lawson, Jennifer A; Shril, Shirlee; Lovric, Svjetlana; Ashraf, Shazia; Rao, Jia; Hermle, Tobias; Jobst-Schwan, Tilman; Widmeier, Eugen; Majmundar, Amar J; Schneider, Ronen; Gee, Heon Yung; Schmidt, J Magdalena; Vivante, Asaf; van der Ven, Amelie T; Ityel, Hadas; Chen, Jing; Sadowski, Carolin E; Kohl, Stefan; Pabst, Werner L; Nakayama, Makiko; Somers, Michael J G; Rodig, Nancy M; Daouk, Ghaleb; Baum, Michelle; Stein, Deborah R; Ferguson, Michael A; Traum, Avram Z; Soliman, Neveen A; Kari, Jameela A; El Desoky, Sherif; Fathy, Hanan; Zenker, Martin; Bakkaloglu, Sevcan A; Müller, Dominik; Noyan, Aytul; Ozaltin, Fatih; Cadnapaphornchai, Melissa A; Hashmi, Seema; Hopcian, Jeffrey; Kopp, Jeffrey B; Benador, Nadine; Bockenhauer, Detlef; Bogdanovic, Radovan; Stajić, Nataša; Chernin, Gil; Ettenger, Robert; Fehrenbach, Henry; Kemper, Markus; Munarriz, Reyner Loza; Podracka, Ludmila; Büscher, Rainer; Serdaroglu, Erkin; Tasic, Velibor; Mane, Shrikant; Lifton, Richard P; Braun, Daniela A; Hildebrandt, Friedhelm

2018-01-06

Steroid-resistant nephrotic syndrome overwhelmingly progresses to ESRD. More than 30 monogenic genes have been identified to cause steroid-resistant nephrotic syndrome. We previously detected causative mutations using targeted panel sequencing in 30% of patients with steroid-resistant nephrotic syndrome. Panel sequencing has a number of limitations when compared with whole exome sequencing. We employed whole exome sequencing to detect monogenic causes of steroid-resistant nephrotic syndrome in an international cohort of 300 families. Three hundred thirty-five individuals with steroid-resistant nephrotic syndrome from 300 families were recruited from April of 1998 to June of 2016. Age of onset was restricted to <25 years of age. Exome data were evaluated for 33 known monogenic steroid-resistant nephrotic syndrome genes. In 74 of 300 families (25%), we identified a causative mutation in one of 20 genes known to cause steroid-resistant nephrotic syndrome. In 11 families (3.7%), we detected a mutation in a gene that causes a phenocopy of steroid-resistant nephrotic syndrome. This is consistent with our previously published identification of mutations using a panel approach. We detected a causative mutation in a known steroid-resistant nephrotic syndrome gene in 38% of consanguineous families and in 13% of nonconsanguineous families, and 48% of children with congenital nephrotic syndrome. A total of 68 different mutations were detected in 20 of 33 steroid-resistant nephrotic syndrome genes. Fifteen of these mutations were novel. NPHS1, PLCE1, NPHS2, and SMARCAL1 were the most common genes in which we detected a mutation. In another 28% of families, we detected mutations in one or more candidate genes for steroid-resistant nephrotic syndrome. Whole exome sequencing is a sensitive approach toward diagnosis of monogenic causes of steroid-resistant nephrotic syndrome.
A molecular genetic diagnosis of steroid-resistant nephrotic syndrome may have important consequences for the management of treatment and kidney transplantation in steroid-resistant nephrotic syndrome. Copyright © 2018 by the American Society of Nephrology.
Sequencing of emerging canine distemper virus strain reveals new distinct genetic lineage in the United States associated with disease in wildlife and domestic canine populations.

PubMed

Riley, Matthew C; Wilkes, Rebecca P

2015-12-18

Recent outbreaks of canine distemper have prompted examination of strains from clinical samples submitted to the University of Tennessee College of Veterinary Medicine (UTCVM) Clinical Virology Lab. We previously described a new strain of CDV that significantly diverged from all genotypes reported to date including America 2, the genotype proposed to be the main lineage currently circulating in the US. The aim of this study was to determine when this new strain appeared and how widespread it is in animal populations, given that it has also been detected in fully vaccinated adult dogs. Additionally, we sequenced complete viral genomes to characterize the strain and determine if variation is confined to known variable regions of the genome or if the changes are also present in more conserved regions. Archived clinical samples were genotyped using real-time RT-PCR amplification and sequencing. The genomes of two unrelated viruses, one from a dog and one from a fox, each from a different state, were sequenced and aligned with previously published genomes. Phylogenetic analysis was performed using coding, non-coding and genome-length sequences. Virus neutralization assays were used to evaluate potential antigenic differences between this strain and a vaccine strain, and a mixed ANOVA test was used to compare the titers. Genotyping revealed this strain first appeared in 2011 and was detected in dogs from multiple states in the Southeast region of the United States. It was the main strain detected among the clinical samples that were typed from 2011-2013, including wildlife submissions. Genome sequencing demonstrated that it is highly conserved within a new lineage and preliminary serologic testing showed significant differences in neutralizing antibody titers between this strain and the strain commonly used in vaccines.
This new strain represents an emerging CDV in domestic dogs in the US, may be associated with a stable reservoir in the wildlife population, and could facilitate vaccine escape.
Expanding the 2011 Prague, OK Event Catalog: Detections, Relocations, and Stress Drop Estimates

NASA Astrophysics Data System (ADS)

Clerc, F.; Cochran, E. S.; Dougherty, S. L.; Keranen, K. M.; Harrington, R. M.

2016-12-01

The Mw 5.6 earthquake occurring on 6 Nov. 2011, near Prague, OK, is thought to have been triggered by a Mw 4.8 foreshock, which was likely induced by fluid injection into local wastewater disposal wells [Keranen et al., 2013; Sumy et al., 2014]. Previous stress drop estimates for the sequence have suggested values lower than those for most Central and Eastern U.S. tectonic events of similar magnitudes [Hough, 2014; Sun & Hartzell, 2014; Sumy & Neighbors et al., 2016]. Better stress drop estimates allow more realistic assessment of seismic hazard and more effective regulation of wastewater injection. More reliable estimates of source properties may help to differentiate induced events from natural ones. Using data from local and regional networks, we perform event detections, relocations, and stress drop calculations of the Prague aftershock sequence. We use the Match & Locate method, a variation on the matched-filter method which detects events of lower magnitudes by stacking cross-correlograms from different stations [Zhang & Wen, 2013; 2015], in order to create a more complete catalog from 6 Nov to 31 Dec 2011. We then relocate the detected events using the HypoDD double-difference algorithm. Using our enhanced catalog and relocations, we examine the seismicity distribution for evidence of migration and investigate implications for triggering mechanisms. To account for path and site effects, we calculate stress drops using the Empirical Green's Function (EGF) spectral ratio method, beginning with 2730 previously relocated events. We determine whether there is a correlation between the stress drop magnitudes and the spatial and temporal distribution of events, including depth, position relative to existing faults, and proximity to injection wells. Finally, we consider the range of stress drop values and scaling with respect to event magnitudes within the context of previously published work for the Prague sequence as well as other induced and natural sequences.
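The stacking idea behind matched-filter detection can be illustrated with a much-simplified sketch: correlate a template waveform against continuous data at each station, average the resulting correlograms, and flag samples where the stack exceeds a threshold. This is not the Match & Locate implementation cited in the abstract; it ignores travel-time moveout and any search over candidate locations, and the function names, waveforms, and threshold are hypothetical.

```python
import math


def normalized_cc(template, trace):
    """Sliding normalized cross-correlation of a template against a trace."""
    n = len(template)
    t_mean = sum(template) / n
    t_dev = [x - t_mean for x in template]
    t_norm = math.sqrt(sum(x * x for x in t_dev))
    out = []
    for start in range(len(trace) - n + 1):
        window = trace[start:start + n]
        w_mean = sum(window) / n
        w_dev = [x - w_mean for x in window]
        w_norm = math.sqrt(sum(x * x for x in w_dev))
        denom = t_norm * w_norm
        # Flat windows carry no waveform information, so score them as zero.
        cc = sum(p * q for p, q in zip(t_dev, w_dev)) / denom if denom > 0 else 0.0
        out.append(cc)
    return out


def stacked_detections(templates, traces, threshold=0.8):
    """Average per-station correlograms; return detection offsets and the stack."""
    correlograms = [normalized_cc(tp, tr) for tp, tr in zip(templates, traces)]
    length = min(len(c) for c in correlograms)
    count = len(correlograms)
    stack = [sum(c[i] for c in correlograms) / count for i in range(length)]
    return [i for i, value in enumerate(stack) if value >= threshold], stack


# Two hypothetical stations; a scaled copy of each template is buried at sample 5.
template_a = [0.0, 1.0, -1.0, 0.5]
template_b = [0.0, -0.5, 1.0, -1.0]
trace_a = [0.0] * 5 + [0.3 * x for x in template_a] + [0.0] * 5
trace_b = [0.0] * 5 + [2.0 * x for x in template_b] + [0.0] * 5
detections, stack = stacked_detections([template_a, template_b], [trace_a, trace_b])
print(detections)
```

Because the correlation is normalized, the buried copies are detected regardless of their amplitude, which is what lets stacking pick out events smaller than the template event.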
Determination of Trichuris skrjabini by sequencing of the ITS1-5.8S-ITS2 segment of the ribosomal DNA: comparative molecular study of different species of trichurids.

PubMed

Cutillas, C; Oliveros, R; de Rojas, M; Guevara, D C

2004-06-01

Adults of Trichuris skrjabini have been isolated from the cecum of caprine hosts (Capra hircus), Trichuris ovis and Trichuris globulosa from Ovis aries (sheep) and C. hircus (goats), and Trichuris leporis from Lepus europaeus (rabbits) in Spain. Genomic DNA was isolated and the ITS1-5.8S-ITS2 segment from the ribosomal DNA (rDNA) was amplified and sequenced by polymerase chain reaction (PCR) techniques. The ITS1 of T. skrjabini, T. ovis, T. globulosa, and T. leporis was 495, 757, 757, and 536 nucleotides in length, respectively, and had G + C contents of 59.6, 58.7, 58.7, and 60.8%, respectively. Intraindividual variation was detected in the ITS1 sequences of the 4 species. Furthermore, the 5.8S sequences of T. skrjabini, T. ovis, T. globulosa, and T. leporis were compared. The 5.8S sequences of these 4 species were 157, 152, 153, and 157 nucleotides in length, respectively. There were no sequence differences of ITS1 and 5.8S products between T. ovis and T. globulosa. Nevertheless, clear differences were detected between the ITS1 sequences of T. skrjabini, T. ovis, T. leporis, Trichuris muris, and T. arvicolae. The ITS2 fragment from the rDNA of T. skrjabini was sequenced. A comparative study of the ITS2 sequence of T. skrjabini with the previously published ITS2 sequence data of T. ovis, T. leporis, T. muris, and T. arvicolae suggested that the combined use of sequence data from both spacers would be useful in the molecular characterization of trichurid parasites.
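The G + C contents reported above are the percentage of G and C bases in each spacer sequence. A minimal sketch of that calculation is shown here on a made-up sequence; the function name is illustrative, the input is not a Trichuris sequence, and skipping ambiguous bases such as N is an assumption of this example.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a DNA string."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical 10-base sequence with 6 G/C bases.
print(gc_content("GGCCATGCAT"))  # 60.0
```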
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Phylogeny and Divergence Times of Lemurs Inferred with Recent and Ancient Fossils in the Tree.

PubMed

Herrera, James P; Dávalos, Liliana M

2016-09-01

Paleontological and neontological systematics seek to answer evolutionary questions with different data sets. Phylogenies inferred for combined extant and extinct taxa provide novel insights into the evolutionary history of life. Primates have an extensive, diverse fossil record and molecular data for living and extinct taxa are rapidly becoming available. We used two models to infer the phylogeny and divergence times for living and fossil primates, the tip-dating (TD) and fossilized birth-death process (FBD). We collected new morphological data, especially on the living and extinct endemic lemurs of Madagascar. We combined the morphological data with published DNA sequences to infer near-complete (88% of lemurs) time-calibrated phylogenies. The results suggest that primates originated around the Cretaceous-Tertiary boundary, slightly earlier than indicated by the fossil record and later than previously inferred from molecular data alone. We infer novel relationships among extinct lemurs, and strong support for relationships that were previously unresolved. Dates inferred with TD were significantly older than those inferred with FBD, most likely related to an assumption of a uniform branching process in the TD compared with a birth-death process assumed in the FBD. This is the first study to combine morphological and DNA sequence data from extinct and extant primates to infer evolutionary relationships and divergence times, and our results shed new light on the tempo of lemur evolution and the efficacy of combined phylogenetic analyses. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
Large-scale deletions of the ABCA1 gene in patients with hypoalphalipoproteinemia.

PubMed

Dron, Jacqueline S; Wang, Jian; Berberich, Amanda J; Iacocca, Michael A; Cao, Henian; Yang, Ping; Knoll, Joan; Tremblay, Karine; Brisson, Diane; Netzer, Christian; Gouni-Berthold, Ioanna; Gaudet, Daniel; Hegele, Robert A

2018-06-04

Copy-number variations (CNVs) have been studied in the context of familial hypercholesterolemia but have not yet been evaluated in patients with extremes of high-density lipoprotein (HDL) cholesterol levels. We evaluated targeted next-generation sequencing data from patients with very low HDL cholesterol (i.e. hypoalphalipoproteinemia) using the VarSeq-CNV caller algorithm to screen for CNVs disrupting the ABCA1, LCAT or APOA1 genes. In four individuals, we found three unique deletions in ABCA1: a heterozygous deletion of exon 4, a heterozygous deletion spanning exons 8 to 31, and a heterozygous deletion of the entire ABCA1 gene. Breakpoints were identified using Sanger sequencing, and the full-gene deletion was also confirmed using exome sequencing and the Affymetrix CytoScan™ HD Array. Before now, large-scale deletions in candidate HDL genes have not been associated with hypoalphalipoproteinemia; our findings indicate that CNVs in ABCA1 may be a previously unappreciated genetic determinant of low HDL cholesterol levels. By coupling bioinformatic analyses with next-generation sequencing data, we can successfully assess the spectrum of genetic determinants of many dyslipidemias, now including hypoalphalipoproteinemia. Published under license by The American Society for Biochemistry and Molecular Biology, Inc.
Architecture of a Species: Phylogenomics of Staphylococcus aureus.

PubMed

Planet, Paul J; Narechania, Apurva; Chen, Liang; Mathema, Barun; Boundy, Sam; Archer, Gordon; Kreiswirth, Barry

2017-02-01

A deluge of whole-genome sequencing has begun to give insights into the patterns and processes of microbial evolution, but genome sequences have accrued in a haphazard manner, with biased sampling of natural variation that is driven largely by medical and epidemiological priorities. For instance, there is a strong bias for sequencing epidemic lineages of methicillin-resistant Staphylococcus aureus (MRSA) over sensitive isolates (methicillin-sensitive S. aureus: MSSA). As more diverse genomes are sequenced the emerging picture is of a highly subdivided species with a handful of relatively clonal groups (complexes) that, at any given moment, dominate in particular geographical regions. The establishment of hegemony of particular clones appears to be a dynamic process of successive waves of replacement of the previously dominant clone. Here we review the phylogenomic structure of a diverse range of S. aureus, including both MRSA and MSSA. We consider the utility of the concept of the 'core' genome and the impact of recombination and horizontal transfer. We argue that whole-genome surveillance of S. aureus populations could lead to better forecasting of antibiotic resistance and virulence of emerging clones, and a better understanding of the elusive biological factors that determine repeated strain replacement. Copyright © 2016. Published by Elsevier Ltd.
Interspecific Plastome Recombination Reflects Ancient Reticulate Evolution in Picea (Pinaceae).

PubMed

Sullivan, Alexis R; Schiffthaler, Bastian; Thompson, Stacey Lee; Street, Nathaniel R; Wang, Xiao-Ru

2017-07-01

Plastid sequences are a cornerstone in plant systematic studies and key aspects of their evolution, such as uniparental inheritance and absent recombination, are often treated as axioms. While exceptions to these assumptions can profoundly influence evolutionary inference, detecting them can require extensive sampling, abundant sequence data, and detailed testing. Using advancements in high-throughput sequencing, we analyzed the whole plastomes of 65 accessions of Picea, a genus of ∼35 coniferous forest tree species, to test for deviations from canonical plastome evolution. Using complementary hypothesis and data-driven tests, we found evidence for chimeric plastomes generated by interspecific hybridization and recombination in the clade comprising Norway spruce (P. abies) and 10 other species. Support for interspecific recombination remained after controlling for sequence saturation, positive selection, and potential alignment artifacts. These results reconcile previous conflicting plastid-based phylogenies and strengthen the mounting evidence of reticulate evolution in Picea. Given the relatively high frequency of hybridization and biparental plastid inheritance in plants, we suggest interspecific plastome recombination may be more widespread than currently appreciated and could underlie reported cases of discordant plastid phylogenies. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Using populations of human and microbial genomes for organism detection in metagenomes.

PubMed

Ames, Sasha K; Gardner, Shea N; Marti, Jose Manuel; Slezak, Tom R; Gokhale, Maya B; Allen, Jonathan E

2015-07-01

Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected. © 2015 Ames et al.; Published by Cold Spring Harbor Laboratory Press.
Human IgG repertoire of malaria antigen-immunized human immune system (HIS) mice.

PubMed

Nogueira, Raquel Tayar; Sahi, Vincent; Huang, Jing; Tsuji, Moriya

2017-08-01

Humanized mouse models present an important tool for preclinical evaluation of new vaccines and therapeutics. Here we show the human variable repertoire of antibody sequences cloned from a previously described human immune system (HIS) mouse model that possesses functional human CD4+ T cells and B cells, namely HIS-CD4/B mice. We sequenced variable IgG genes from single memory B cells and plasma cells sorted from splenocytes or whole blood lymphocytes of HIS-CD4/B mice that were vaccinated with a human plasmodial antigen, a recombinant Plasmodium falciparum circumsporozoite protein (rPfCSP). We demonstrate that rPfCSP immunization triggers a diverse B-cell IgG repertoire composed of various human VH family genes and distinct V(D)J recombinations that constitute diverse CDR3 sequences similar to those of humans, although the sequences generated showed low levels of hypermutation. These results demonstrate the substantial genetic diversity of responding human B cells of HIS-CD4/B mice and their capacity to mount a human IgG class-switched antibody response upon vaccination. Copyright © 2017 European Federation of Immunological Societies. Published by Elsevier B.V. All rights reserved.

Colony-PCR Is a Rapid Method for DNA Amplification of Hyphomycetes

PubMed Central

Walch, Georg; Knapp, Maria; Rainer, Georg; Peintner, Ursula

2016-01-01

Fungal pure cultures identified with both classical morphological methods and through barcoding sequences are a basic requirement for reliable reference sequences in public databases. Improved techniques for an accelerated DNA barcode reference library construction will result in considerably improved sequence databases covering a wider taxonomic range. Fast, cheap, and reliable methods for obtaining DNA sequences from fungal isolates are, therefore, a valuable tool for the scientific community. Direct colony PCR was already successfully established for yeasts, but has not been evaluated for a wide range of anamorphic soil fungi up to now, and a direct amplification protocol for hyphomycetes without tissue pre-treatment has not been published so far. Here, we present a colony PCR technique directly from fungal hyphae without previous DNA extraction or other prior manipulation. Seven hundred eighty-eight fungal strains from 48 genera were tested with a success rate of 86%. PCR success varied considerably: DNA of fungi belonging to the genera Cladosporium, Geomyces, Fusarium, and Mortierella could be amplified with high success. DNA of soil-borne yeasts was always successfully amplified. Absidia, Mucor, Trichoderma, and Penicillium isolates had noticeably lower PCR success. PMID:29376929
Genetic characterisation of Taenia multiceps cysts from ruminants in Greece.

PubMed

Al-Riyami, Shumoos; Ioannidou, Evi; Koehler, Anson V; Hussain, Muhammad H; Al-Rawahi, Abdulmajeed H; Giadinis, Nektarios D; Lafi, Shawkat Q; Papadopoulos, Elias; Jabbar, Abdul

2016-03-01

This study was designed to genetically characterise the larval stage (coenurus) of Taenia multiceps from ruminants in Greece, utilising partial DNA regions within the cytochrome c oxidase subunit 1 (pcox1) and NADH dehydrogenase 1 (pnad1) mitochondrial (mt) genes. A molecular-phylogenetic approach was used to analyse the pcox1 and pnad1 amplicons derived from genomic DNA samples from individual cysts (n=105) from cattle (n=3), goats (n=5) and sheep (n=97). Results revealed five and six distinct electrophoretic profiles for pcox1 and pnad1, respectively, using single-strand conformation polymorphism. Direct sequencing of selected amplicons representing each of these profiles defined five haplotypes each for pcox1 and pnad1, among all 105 isolates. Phylogenetic analysis of individual sequence data for each locus, including a range of well-defined reference sequences, inferred that all isolates of T. multiceps cysts from ruminants in Greece clustered with previously published sequences from different continents. The present study provides a foundation for future large-scale studies on the epidemiology of T. multiceps in ruminants as well as dogs in Greece. Copyright © 2015 Elsevier B.V. All rights reserved.
Nucleotide sequence of a resistance breaking mutant of southern bean mosaic virus.

PubMed

Lee, L; Anderson, E J

1998-01-01

SBMV-S is a resistance-breaking mutant of an Arkansas isolate of the bean strain of southern bean mosaic virus (SBMV-BARK) that is able to move systemically in Phaseolus vulgaris cvs. Pinto and Great Northern, whereas the wild-type SBMV-BARK causes local necrotic lesions and is restricted to the inoculated leaves of these hosts. Sequence analysis of the 4136 nucleotide genomes of SBMV-BARK and SBMV-S revealed seven nucleotide differences, but only four deduced amino acid changes. A single amino acid change occurred in the C-terminal region of the putative RNA-dependent RNA polymerase and three differences were identified in the N-terminal portion of the virus coat protein. SBMV-BARK and SBMV-S were compared with other sobemoviruses and were found to contain a high level of nucleotide sequence identity (91.3%) to SBMV-B. Unlike SBMV-B however, SBMV-BARK and SBMV-S contained four putative overlapping open reading frames, making them more similar in genome organization to the cowpea strain, SBMV-C. The possibility exists that mutations or even errors, that resulted in mis-identification of open reading frames, occurred in previously published information on nucleotide sequence and genomic organization for SBMV-B.
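The nucleotide-difference count and percent-identity figures quoted in the abstract above can be illustrated with a minimal sketch. It assumes two sequences that are already aligned to the same length; the function names are hypothetical, and the abstract does not say which software or gap treatment the authors used.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count mismatched positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare case-insensitively, position by position.
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that are identical."""
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = count_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)
```

In this sketch a gap character is treated like any other symbol, so a gap opposite a base counts as a difference; that is an assumption and other conventions exist.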
Implementation of Nationwide Real-time Whole-genome Sequencing to Enhance Listeriosis Outbreak Detection and Investigation.

PubMed

Jackson, Brendan R; Tarr, Cheryl; Strain, Errol; Jackson, Kelly A; Conrad, Amanda; Carleton, Heather; Katz, Lee S; Stroika, Steven; Gould, L Hannah; Mody, Rajal K; Silk, Benjamin J; Beal, Jennifer; Chen, Yi; Timme, Ruth; Doyle, Matthew; Fields, Angela; Wise, Matthew; Tillman, Glenn; Defibaugh-Chavez, Stephanie; Kucerova, Zuzana; Sabol, Ashley; Roache, Katie; Trees, Eija; Simmons, Mustafa; Wasilenko, Jamie; Kubota, Kristy; Pouseele, Hannes; Klimke, William; Besser, John; Brown, Eric; Allard, Marc; Gerner-Smidt, Peter

2016-08-01

Listeria monocytogenes (Lm) causes severe foodborne illness (listeriosis). Previous molecular subtyping methods, such as pulsed-field gel electrophoresis (PFGE), were critical in detecting outbreaks that led to food safety improvements and declining incidence, but PFGE provides limited genetic resolution. A multiagency collaboration began performing real-time, whole-genome sequencing (WGS) on all US Lm isolates from patients, food, and the environment in September 2013, posting sequencing data into a public repository. Compared with the year before the project began, WGS, combined with epidemiologic and product trace-back data, detected more listeriosis clusters and solved more outbreaks (2 outbreaks in pre-WGS year, 5 in WGS year 1, and 9 in year 2). Whole-genome multilocus sequence typing and single nucleotide polymorphism analyses provided equivalent phylogenetic relationships relevant to investigations; results were most useful when interpreted in context of epidemiological data. WGS has transformed listeriosis outbreak surveillance and is being implemented for other foodborne pathogens. Published by Oxford University Press for the Infectious Diseases Society of America 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Predicting DNA binding proteins using support vector machine with hybrid fractal features.

PubMed

Niu, Xiao-Hui; Hu, Xue-Hai; Shi, Feng; Xia, Jing-Bo

2014-02-21

DNA-binding proteins play a vitally important role in many biological processes. Predicting DNA-binding proteins from amino acid sequence is a significant but still unresolved scientific problem. Chaos game representation (CGR) investigates the patterns hidden in protein sequences and visually reveals previously unknown structure. Fractal dimensions (FD) are good tools for measuring the sizes of complex, highly irregular geometric objects. To extract the intrinsic correlation between protein sequences and the DNA-binding property, the CGR algorithm, fractal dimension, and amino acid composition are applied in this paper to formulate numerical features of protein samples. Seven groups of features are extracted, all of which can be computed directly from the primary sequence, and each group is evaluated by the 10-fold cross-validation test and the jackknife test. In the numerical experiments, the group combining amino acid composition and fractal dimension (a 21-dimensional vector) gives the best result, with an average accuracy of 81.82% and an average Matthews correlation coefficient (MCC) of 0.6017. The resulting predictor is also compared with the existing method DNA-Prot and shows better performance. © 2013 The Authors. Published by Elsevier Ltd. All rights reserved.
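The two evaluation measures reported in the abstract above, accuracy and the Matthews correlation coefficient, can be computed from confusion-matrix counts. The sketch below uses the standard definitions; the function names are illustrative, and returning 0.0 when the MCC denominator is zero is a common convention assumed here, not something the abstract specifies.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from confusion-matrix counts, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: report 0.0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Unlike accuracy, MCC stays near zero for a classifier that guesses at random even when the two classes are unbalanced, which may be why both figures are reported.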
Memory for sequences of events impaired in typical aging.

PubMed

Allen, Timothy A; Morris, Andrea M; Stark, Shauna M; Fortin, Norbert J; Stark, Craig E L

2015-03-01

Typical aging is associated with diminished episodic memory performance. To improve our understanding of the fundamental mechanisms underlying this age-related memory deficit, we previously developed an integrated, cross-species approach to link converging evidence from human and animal research. This novel approach focuses on the ability to remember sequences of events, an important feature of episodic memory. Unlike existing paradigms, this task is nonspatial, nonverbal, and can be used to isolate different cognitive processes that may be differentially affected in aging. Here, we used this task to make a comprehensive comparison of sequence memory performance between younger (18-22 yr) and older adults (62-86 yr). Specifically, participants viewed repeated sequences of six colored, fractal images and indicated whether each item was presented "in sequence" or "out of sequence." Several out of sequence probe trials were used to provide a detailed assessment of sequence memory, including: (i) repeating an item from earlier in the sequence ("Repeats"; e.g., AB A: DEF), (ii) skipping ahead in the sequence ("Skips"; e.g., AB D: DEF), and (iii) inserting an item from a different sequence into the same ordinal position ("Ordinal Transfers"; e.g., AB 3: DEF). We found that older adults performed as well as younger controls when tested on well-known and predictable sequences, but were severely impaired when tested using novel sequences. Importantly, overall sequence memory performance in older adults steadily declined with age, a decline not detected with other measures (RAVLT or BPS-O). We further characterized this deficit by showing that performance of older adults was severely impaired on specific probe trials that required detailed knowledge of the sequence (Skips and Ordinal Transfers), and was associated with a shift in their underlying mnemonic representation of the sequences. 
Collectively, these findings provide unambiguous evidence that the capacity to remember sequences of events is fundamentally affected by typical aging. © 2015 Allen et al.; Published by Cold Spring Harbor Laboratory Press.
A range-wide synthesis and timeline for phylogeographic events in the red fox (Vulpes vulpes).

PubMed

Kutschera, Verena E; Lecomte, Nicolas; Janke, Axel; Selva, Nuria; Sokolov, Alexander A; Haun, Timm; Steyer, Katharina; Nowak, Carsten; Hailer, Frank

2013-06-05

Many boreo-temperate mammals have a Pleistocene fossil record throughout Eurasia and North America, but only few have a contemporary distribution that spans this large area. Examples of Holarctic-distributed carnivores are the brown bear, grey wolf, and red fox, all three ecological generalists with large dispersal capacity and a high adaptive flexibility. While the two former have been examined extensively across their ranges, no phylogeographic study of the red fox has been conducted across its entire Holarctic range. Moreover, no study included samples from central Asia, leaving a large sampling gap in the middle of the Eurasian landmass. Here we provide the first mitochondrial DNA sequence data of red foxes from central Asia (Siberia), and new sequences from several European populations. In a range-wide synthesis of 729 red fox mitochondrial control region sequences, including 677 previously published and 52 newly obtained sequences, this manuscript describes the pattern and timing of major phylogeographic events in red foxes, using a Bayesian coalescence approach with multiple fossil tip and root calibration points. In a 335 bp alignment we found in total 175 unique haplotypes. All newly sequenced individuals belonged to the previously described Holarctic lineage. Our analyses confirmed the presence of three Nearctic- and two Japan-restricted lineages that were formed since the Mid/Late Pleistocene. The phylogeographic history of red foxes is highly similar to that previously described for grey wolves and brown bears, indicating that climatic fluctuations and habitat changes since the Pleistocene had similar effects on these highly mobile generalist species. All three species originally diversified in Eurasia and later colonized North America and Japan. North American lineages persisted through the last glacial maximum south of the ice sheets, meeting more recent colonizers from Beringia during postglacial expansion into the northern Nearctic. 
Both brown bears and red foxes colonized Japan's northern island Hokkaido at least three times, all lineages being most closely related to different mainland lineages. Red foxes, grey wolves, and brown bears thus represent an interesting case where species that occupy similar ecological niches also exhibit similar phylogeographic histories.
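Counting unique haplotypes in an alignment, as in the 175 haplotypes reported above for the 335 bp control region alignment, amounts to collapsing identical aligned sequences. The sketch below is a minimal illustration under the assumption that gaps and ambiguity codes are treated as ordinary characters; the function name is hypothetical and the authors' actual procedure is not described in the abstract.

```python
from collections import Counter


def count_haplotypes(aligned_seqs: list[str]) -> Counter:
    """Map each distinct aligned sequence (haplotype) to its frequency."""
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) > 1:
        raise ValueError("all sequences must have the same aligned length")
    # Upper-case so that 'acgt' and 'ACGT' collapse into one haplotype.
    return Counter(s.upper() for s in aligned_seqs)
```

The number of unique haplotypes is then `len(count_haplotypes(seqs))`, and the counter values give how many individuals carry each one.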
A range-wide synthesis and timeline for phylogeographic events in the red fox (Vulpes vulpes)

PubMed Central

2013-01-01

Background: Many boreo-temperate mammals have a Pleistocene fossil record throughout Eurasia and North America, but only few have a contemporary distribution that spans this large area. Examples of Holarctic-distributed carnivores are the brown bear, grey wolf, and red fox, all three ecological generalists with large dispersal capacity and a high adaptive flexibility. While the two former have been examined extensively across their ranges, no phylogeographic study of the red fox has been conducted across its entire Holarctic range. Moreover, no study included samples from central Asia, leaving a large sampling gap in the middle of the Eurasian landmass. Results: Here we provide the first mitochondrial DNA sequence data of red foxes from central Asia (Siberia), and new sequences from several European populations. In a range-wide synthesis of 729 red fox mitochondrial control region sequences, including 677 previously published and 52 newly obtained sequences, this manuscript describes the pattern and timing of major phylogeographic events in red foxes, using a Bayesian coalescence approach with multiple fossil tip and root calibration points. In a 335 bp alignment we found in total 175 unique haplotypes. All newly sequenced individuals belonged to the previously described Holarctic lineage. Our analyses confirmed the presence of three Nearctic- and two Japan-restricted lineages that were formed since the Mid/Late Pleistocene. Conclusions: The phylogeographic history of red foxes is highly similar to that previously described for grey wolves and brown bears, indicating that climatic fluctuations and habitat changes since the Pleistocene had similar effects on these highly mobile generalist species. All three species originally diversified in Eurasia and later colonized North America and Japan. 
North American lineages persisted through the last glacial maximum south of the ice sheets, meeting more recent colonizers from Beringia during postglacial expansion into the northern Nearctic. Both brown bears and red foxes colonized Japan’s northern island Hokkaido at least three times, all lineages being most closely related to different mainland lineages. Red foxes, grey wolves, and brown bears thus represent an interesting case where species that occupy similar ecological niches also exhibit similar phylogeographic histories. PMID:23738594
PCR Amplification Strategies towards full-length HIV-1 Genome sequencing.

PubMed

Liu, Chao Chun; Ji, Hezhao

2018-06-26

The advent of next generation sequencing has enabled greater resolution of viral diversity and improved feasibility of full viral genome sequencing, allowing routine HIV-1 full genome sequencing in both research and diagnostic settings. Regardless of the sequencing platform selected, successful PCR amplification of the HIV-1 genome is essential for sequencing template preparation. As such, full HIV-1 genome amplification is a crucial step that determines whether downstream sequencing is successful and reliable. Here we review existing PCR protocols leading to HIV-1 full genome sequencing. In addition to discussing basic considerations for the relevant PCR design, we review the advantages and pitfalls of published protocols. Copyright © Bentham Science Publishers; for any queries, please email epub@benthamscience.org.
A world of opportunities with nanopore sequencing.

PubMed

Leggett, Richard M; Clark, Matthew D

2017-11-28

Oxford Nanopore Technologies' MinION sequencer was launched in pre-release form in 2014 and represents an exciting new sequencing paradigm. The device offers multi-kilobase reads and a streamed mode of operation that allows processing of reads as they are generated. Crucially, it is an extremely compact device that is powered from the USB port of a laptop computer, enabling it to be taken out of the lab and facilitating previously impossible in-field sequencing experiments to be undertaken. Many of the initial publications concerning the platform focused on provision of tools to access and analyse the new sequence formats and then demonstrating the assembly of microbial genomes. More recently, as throughput and accuracy have increased, it has been possible to begin work involving more complex genomes and metagenomes. With the release of the high-throughput GridION X5 and PromethION platforms, the sequencing of large genomes will become more cost efficient, and enable the leveraging of extremely long (>100 kb) reads for resolution of complex genomic structures. This review provides a brief overview of nanopore sequencing technology, describes the growing range of nanopore bioinformatics tools, and highlights some of the most influential publications that have emerged over the last 2 years. Finally, we look to the future and the potential the platform has to disrupt work in human, microbiome, and plant genomics. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Feline hypersomatotropism and acromegaly tumorigenesis: a potential role for the AIP gene.

PubMed

Scudder, C J; Niessen, S J; Catchpole, B; Fowkes, R C; Church, D B; Forcada, Y

2017-04-01

Acromegaly in humans is usually sporadic; however, up to 20% of familial isolated pituitary adenomas are caused by germline sequence variants of the aryl-hydrocarbon-receptor interacting protein (AIP) gene. Feline acromegaly has similarities to human acromegalic families with AIP mutations. The aim of this study was to sequence the feline AIP gene, identify sequence variants and compare the AIP gene sequence between feline acromegalic and control cats, and in acromegalic siblings. The feline AIP gene was amplified through PCR using whole blood genomic DNA from 10 acromegalic and 10 control cats, and 3 sibling pairs affected by acromegaly. PCR products were sequenced and compared with the published predicted feline AIP gene. A single nonsynonymous SNP was identified in exon 1 (AIP:c.9T>G) of two acromegalic cats and none of the control cats, as well as both members of one sibling pair. The region of this SNP is considered essential for the interaction of the AIP protein with its receptor. This sequence variant has not previously been reported in humans. Two additional synonymous sequence variants were identified (AIP:c.481C>T and AIP:c.826C>T). This is the first molecular study to investigate a potential genetic cause of feline acromegaly, and it identified a nonsynonymous AIP single nucleotide polymorphism in 20% of the acromegalic cat population evaluated, as well as in one of the sibling pairs evaluated. Copyright © 2016 Elsevier Inc. All rights reserved.
Substrate sequence selectivity of APOBEC3A implicates intra-DNA interactions.

PubMed

Silvas, Tania V; Hou, Shurong; Myint, Wazo; Nalivaika, Ellen; Somasundaran, Mohan; Kelch, Brian A; Matsuo, Hiroshi; Kurt Yilmaz, Nese; Schiffer, Celia A

2018-05-14

The APOBEC3 (A3) family of human cytidine deaminases is renowned for providing a first line of defense against many exogenous and endogenous retroviruses. However, the ability of these proteins to deaminate deoxycytidines in ssDNA makes A3s a double-edged sword. When overexpressed, A3s can mutate endogenous genomic DNA, resulting in a variety of cancers. Although the sequence context for mutating DNA varies among A3s, the mechanism for substrate sequence specificity is not well understood. To characterize substrate specificity of A3A, a systematic approach was used to quantify the affinity for substrate as a function of sequence context, length, secondary structure, and solution pH. We identified the A3A ssDNA binding motif as (T/C)TC(A/G), which correlated with enzymatic activity. We also validated that A3A binds RNA in a sequence-specific manner. A3A bound more tightly to the substrate binding motif within a hairpin loop than to a linear oligonucleotide, suggesting that A3A affinity is modulated by substrate structure. Based on these findings and previously published A3A-ssDNA co-crystal structures, we propose a new model with intra-DNA interactions for the molecular mechanism underlying A3A sequence preference. Overall, the sequence and structural preferences identified for A3A lead to a new paradigm for identifying A3A's involvement in mutation of endogenous or exogenous DNA.
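The binding motif (T/C)TC(A/G) reported in the abstract above can be expressed as a regular expression and used to locate candidate sites on one strand of a DNA sequence. This is only an illustrative sketch: the function name is hypothetical, it scans the given strand only, and a sequence match says nothing about binding affinity, which the abstract reports also depends on secondary structure and pH.

```python
import re

# (T/C)TC(A/G): T or C, then T, then C, then A or G.
A3A_MOTIF = re.compile(r"[TC]TC[AG]")


def find_motif_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched tetramer) for each motif occurrence."""
    return [(m.start(), m.group(0)) for m in A3A_MOTIF.finditer(seq.upper())]
```

For example, `find_motif_sites("ttcactcg")` yields two sites, one for TTCA at position 0 and one for CTCG at position 4.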
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.

PubMed

Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A

2010-02-01

Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer

PubMed Central

Wojcik, Sylwia E.; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S.; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z.; Rai, Kanti R.; Kipps, Thomas J.; Keating, Michael J.

2010-01-01

Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas. PMID:19926640
DPTEdb, an integrative database of transposable elements in dioecious plants.

PubMed

Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gu, Lian-Feng; Gao, Wu-Jun

2016-01-01

Dioecious plants usually harbor 'young' sex chromosomes, providing an opportunity to study the early stages of sex chromosome evolution. Transposable elements (TEs) are mobile DNA elements frequently found in plants and are suggested to play important roles in plant sex chromosome evolution. The genomes of several dioecious plants have been sequenced, offering an opportunity to annotate and mine the TE data. However, comprehensive and unified annotation of TEs in these dioecious plants is still lacking. In this study, we constructed a dioecious plant transposable element database (DPTEdb). DPTEdb is a specific, comprehensive and unified relational database and web interface. We used a combination of de novo, structure-based and homology-based approaches to identify TEs from the genome assemblies of previously published data, as well as our own. The database currently integrates eight dioecious plant species and a total of 31 340 TEs along with classification information. DPTEdb provides user-friendly web interfaces to browse, search and download the TE sequences in the database. Users can also use tools, including BLAST, GetORF, HMMER, Cut sequence and JBrowse, to analyze TE data. Given the role of TEs in plant sex chromosome evolution, the database will contribute to the investigation of TEs in structural, functional and evolutionary dynamics of the genome of dioecious plants. In addition, the database will supplement the research of sex diversification and sex chromosome evolution of dioecious plants. Database URL: http://genedenovoweb.ticp.net:81/DPTEdb/index.php. © The Author(s) 2016. Published by Oxford University Press.
Accurate Filtering of Privacy-Sensitive Information in Raw Genomic Data.

PubMed

Decouchant, Jérémie; Fernandes, Maria; Völp, Marcus; Couto, Francisco M; Esteves-Veríssimo, Paulo

2018-04-13

Sequencing thousands of human genomes has enabled breakthroughs in many areas, among them precision medicine, the study of rare diseases, and forensics. However, mass collection of such sensitive data entails enormous risks if not protected to the highest standards. In this article, we take the position that post-alignment privacy is not enough and that data should be automatically protected as early as possible in the genomics workflow, ideally immediately after the data is produced. We show that a previous approach for filtering short reads cannot extend to long reads and present a novel filtering approach that classifies raw genomic data (i.e., data whose location and content are not yet determined) into privacy-sensitive (i.e., more affected by a successful privacy attack) and non-privacy-sensitive information. Such a classification allows the fine-grained and automated adjustment of protective measures to mitigate the possible consequences of exposure, in particular when relying on public clouds. We present the first filter that can be applied to reads of any length without distinction, making it usable with any recent or future sequencing technology. The filter is accurate, in the sense that it detects all known sensitive nucleotides except those located in highly variable regions (fewer than 10 nucleotides remain undetected per genome instead of 100,000 in previous works). It has far fewer false positives than previously known methods (10% instead of 60%) and can detect sensitive nucleotides despite sequencing errors (86% detected instead of 56% with 2% of mutations). Finally, practical experiments demonstrate high performance, both in terms of throughput and memory consumption. Copyright © 2018. Published by Elsevier Inc.
The Sex Determination Gene Shows No Founder Effect in the Giant Honey Bee, Apis dorsata

PubMed Central

Yan, Wei Yu; Wu, Xiao Bo; Zeng, Zhi Jiang; Huang, Zachary Y.

2012-01-01

Background: All honey bee species (Apis spp) share the same sex determination mechanism using the complementary sex determination (csd) gene. Only individuals heterozygous at the csd locus develop into females, and the homozygous develop into diploid males, which do not survive. The honeybees are therefore under selection pressure to generate new csd alleles. Previous studies have shown that the csd gene is under balancing selection. We hypothesize that, due to the long separation of Hainan Island, China, from the mainland, the giant honey bees (Apis dorsata) should show a founder effect for the csd gene, with many different alleles clustered together, and these would be absent on the mainland. Methodology/Principal Findings: We sampled A. dorsata workers from both Hainan and Guangxi Provinces and then cloned and sequenced region 3 of the csd gene and constructed phylogenetic trees. We failed to find any clustering of the csd alleles according to their geographical origin, i.e. the Hainan and Guangxi samples did not form separate clades. Further analysis by including previously published csd sequences also failed to show any clade-forming in both the Philippines and Malaysia. Conclusions/Significance: Results from this study and those from previous studies did not support the expectations of a founder effect. We conclude that because of the extremely high mating frequency of A. dorsata queens, a founder effect does not apply in this species. PMID:22511940
The sex determination gene shows no founder effect in the giant honey bee, Apis dorsata.

PubMed

Liu, Zhi Yong; Wang, Zi Long; Yan, Wei Yu; Wu, Xiao Bo; Zeng, Zhi Jiang; Huang, Zachary Y

2012-01-01

All honey bee species (Apis spp) share the same sex determination mechanism using the complementary sex determination (csd) gene. Only individuals heterozygous at the csd locus develop into females, and the homozygous develop into diploid males, which do not survive. The honeybees are therefore under selection pressure to generate new csd alleles. Previous studies have shown that the csd gene is under balancing selection. We hypothesize that, due to the long separation of Hainan Island, China, from the mainland, the giant honey bees (Apis dorsata) should show a founder effect for the csd gene, with many different alleles clustered together, and these would be absent on the mainland. We sampled A. dorsata workers from both Hainan and Guangxi Provinces and then cloned and sequenced region 3 of the csd gene and constructed phylogenetic trees. We failed to find any clustering of the csd alleles according to their geographical origin, i.e. the Hainan and Guangxi samples did not form separate clades. Further analysis by including previously published csd sequences also failed to show any clade-forming in both the Philippines and Malaysia. Results from this study and those from previous studies did not support the expectations of a founder effect. We conclude that because of the extremely high mating frequency of A. dorsata queens, a founder effect does not apply in this species.
RNA-Seq based phylogeny recapitulates previous phylogeny of the genus Flaveria (Asteraceae) with some modifications.

PubMed

Lyu, Ming-Ju Amy; Gowik, Udo; Kelly, Steve; Covshoff, Sarah; Mallmann, Julia; Westhoff, Peter; Hibberd, Julian M; Stata, Matt; Sage, Rowan F; Lu, Haorong; Wei, Xiaofeng; Wong, Gane Ka-Shu; Zhu, Xin-Guang

2015-06-18

The genus Flaveria has been extensively used as a model to study the evolution of C4 photosynthesis as it contains C3 and C4 species as well as a number of species that exhibit intermediate types of photosynthesis. The current phylogenetic tree of the genus Flaveria contains 21 of the 23 known Flaveria species and has been previously constructed using a combination of morphological data and three non-coding DNA sequences (nuclear encoded ETS, ITS and chloroplast encoded trnL-F). Here we developed a new strategy to update the phylogenetic tree of 16 Flaveria species based on RNA-Seq data. The updated phylogeny is largely congruent with the previously published tree but with some modifications. We propose that the data collection method provided in this study can be used as a generic method for phylogenetic tree reconstruction if the target species has no genomic information. We also showed that a "F. pringlei" genotype recently used in a number of labs may be a hybrid between F. pringlei (C3) and F. angustifolia (C3-C4). We propose that the new strategy of obtaining phylogenetic sequences outlined in this study can be used to construct robust trees in a larger number of taxa. The updated Flaveria phylogenetic tree also supports a hypothesis of stepwise and parallel evolution of C4 photosynthesis in the Flaveria clade.
MMASS: an optimized array-based method for assessing CpG island methylation.

PubMed

Ibrahim, Ashraf E K; Thorne, Natalie P; Baird, Katie; Barbosa-Morais, Nuno L; Tavaré, Simon; Collins, V Peter; Wyllie, Andrew H; Arends, Mark J; Brenton, James D

2006-01-01

We describe an optimized microarray method for identifying genome-wide CpG island methylation called microarray-based methylation assessment of single samples (MMASS) which directly compares methylated to unmethylated sequences within a single sample. To improve previous methods we used bioinformatic analysis to predict an optimized combination of methylation-sensitive enzymes that had the highest utility for CpG-island probes and different methods to produce unmethylated representations of test DNA for more sensitive detection of differential methylation by hybridization. Subtraction or methylation-dependent digestion with McrBC was used with optimized (MMASS-v2) or previously described (MMASS-v1, MMASS-sub) methylation-sensitive enzyme combinations and compared with a published McrBC method. Comparison was performed using DNA from the cell line HCT116. We show that the distribution of methylation microarray data is inherently skewed and requires exogenous spiked controls for normalization and that analysis of digestion of methylated and unmethylated control sequences together with linear fit models of replicate data showed superior statistical power for the MMASS-v2 method. Comparison with previous methylation data for HCT116 and validation of CpG islands from PXMP4, SFRP2, DCC, RARB and TSEN2 confirmed the accuracy of MMASS-v2 results. The MMASS-v2 method offers improved sensitivity and statistical power for high-throughput microarray identification of differential methylation.

GUIDANCE2: accurate detection of unreliable alignment regions accounting for the uncertainty of multiple parameters.

PubMed

Sela, Itamar; Ashkenazy, Haim; Katoh, Kazutaka; Pupko, Tal

2015-07-01

Inference of multiple sequence alignments (MSAs) is a critical part of phylogenetic and comparative genomics studies. However, from the same set of sequences different MSAs are often inferred, depending on the methodologies used and the assumed parameters. Much effort has recently been devoted to improving the ability to identify unreliable alignment regions. Detecting such unreliable regions was previously shown to be important for downstream analyses relying on MSAs, such as the detection of positive selection. Here we developed GUIDANCE2, a new integrative methodology that accounts for: (i) uncertainty in the process of indel formation, (ii) uncertainty in the assumed guide tree and (iii) co-optimal solutions in the pairwise alignments, used as building blocks in progressive alignment algorithms. We compared GUIDANCE2 with seven methodologies to detect unreliable MSA regions using extensive simulations and empirical benchmarks. We show that GUIDANCE2 outperforms all previously developed methodologies. Furthermore, GUIDANCE2 also provides a set of alternative MSAs which can be useful for downstream analyses. The novel algorithm is implemented as a web-server, available at: http://guidance.tau.ac.il. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Molecular evolution of lineage 2 West Nile virus.

PubMed

McMullen, Allison R; Albayrak, Harun; May, Fiona J; Davis, C Todd; Beasley, David W C; Barrett, Alan D T

2013-02-01

Since the 1990s West Nile virus (WNV) has become an increasingly important public health problem and the cause of outbreaks of neurological disease. Genetic analyses have identified multiple lineages with many studies focusing on lineage 1 due to its emergence in New York in 1999 and its neuroinvasive phenotype. Until recently, viruses in lineage 2 were not thought to be of public health importance due to few outbreaks of disease being associated with viruses in this lineage. However, recent epidemics of lineage 2 in Europe (Greece and Italy) and Russia have shown the increasing importance of this lineage. There are very few genetic studies examining isolates belonging to lineage 2. We have sequenced the full-length genomes of four older lineage 2 WNV isolates, compared them to 12 previously published genomic sequences and examined the evolution of this lineage. Our studies show that this lineage has evolved over the past 300-400 years and appears to correlate with a change from mouse attenuated to virulent phenotype based on previous studies by our group. This evolution mirrors that which is seen in lineage 1 isolates, which have also evolved to a virulent phenotype over the same period of time.
Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing

PubMed Central

Hykin, Sarah M.; Bi, Ke; McGuire, Jimmy A.

2015-01-01

For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due to the conditions of preservation, the amount of tissue used for extraction purposes, or both. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis. PMID:26505622
Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

PubMed

Hykin, Sarah M; Bi, Ke; McGuire, Jimmy A

2015-01-01

For 150 years or more, specimens were routinely collected and deposited in natural history collections without preserving fresh tissue samples for genetic analysis. In the case of most herpetological specimens (i.e. amphibians and reptiles), attempts to extract and sequence DNA from formalin-fixed, ethanol-preserved specimens, particularly for use in phylogenetic analyses, have been laborious and largely ineffective due to the highly fragmented nature of the DNA. As a result, tens of thousands of specimens in herpetological collections have not been available for sequence-based phylogenetic studies. Massively parallel High-Throughput Sequencing methods and the associated bioinformatics, however, are particularly suited to recovering meaningful genetic markers from severely degraded/fragmented DNA sequences such as DNA damaged by formalin-fixation. In this study, we compared previously published DNA extraction methods on three tissue types subsampled from formalin-fixed specimens of Anolis carolinensis, followed by sequencing. Sufficient quality DNA was recovered from liver tissue, making this technique minimally destructive to museum specimens. Sequencing was only successful for the more recently collected specimen (collected ~30 ybp). We suspect this could be due to the conditions of preservation, the amount of tissue used for extraction purposes, or both. For the successfully sequenced sample, we found a high rate of base misincorporation. After rigorous trimming, we successfully mapped 27.93% of the cleaned reads to the reference genome, were able to reconstruct the complete mitochondrial genome, and recovered an accurate phylogenetic placement for our specimen. We conclude that the amount of DNA available, which can vary depending on specimen age and preservation conditions, will determine if sequencing will be successful. The technique described here will greatly improve the value of museum collections by making many formalin-fixed specimens available for genetic analysis.
Overview of Next-generation Sequencing Platforms Used in Published Draft Plant Genomes in Light of Genotypization of Immortelle Plant (Helichrysium Arenarium)

PubMed Central

Hodzic, Jasin; Gurbeta, Lejla; Omanovic-Miklicanin, Enisa; Badnjevic, Almir

2017-01-01

Introduction: Major advancements in DNA sequencing methods introduced in the first decade of the new millennium initiated a rapid expansion of sequencing studies, which yielded a tremendous amount of DNA sequence data, including whole sequenced genomes of various species, among them plants. A set of novel sequencing platforms, often collectively named "next-generation sequencing" (NGS), completely transformed the life sciences by allowing extensive throughput while greatly reducing the necessary time, labor and cost of any sequencing endeavor. Purpose: The purpose of this paper is to present an overview of the NGS platforms used to produce the current compendium of published draft genomes of various plants, namely Roche/454, ABI/SOLiD, and Solexa/Illumina, and to determine the most frequently used platform for whole genome sequencing of plants in light of genotypization of the immortelle plant. Materials and methods: 45 papers were selected (presenting 47 plant genome draft sequences), and the sequencing techniques and NGS platforms (Roche/454, ABI/SOLiD and Illumina/Solexa) used in the selected papers were determined. Subsequently, the frequency of usage of each platform or combination of platforms was calculated. Results: Illumina/Solexa platforms are used either as the sole sequencing tool (40.42% of published genomes) or in combination with other platforms (an additional 48.94% of published genomes), followed by Roche/454 platforms, used in combination with the traditional Sanger sequencing method (10.64%) and never as a sole tool. ABI/SOLiD was only used in combination with Illumina/Solexa and Roche/454, in 4.25% of publications. Conclusions: Illumina/Solexa platforms are by far the most preferred by researchers, most probably because of their more affordable sequencing costs. Taking into consideration the current economic situation in the Balkans region, Illumina/Solexa is the best (if not the only) platform choice if the sequencing of the immortelle plant (Helichrysium arenarium) is to be performed by researchers in this region. PMID:28974852
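The usage frequencies in the preceding abstract can be reproduced with simple arithmetic. The sketch below is only an illustration: the abstract reports percentages of 47 genomes, not raw counts, so the counts (19, 23 and 5) are back-calculated assumptions, and the category labels are shorthand introduced here. The last digit of a share may differ from the abstract depending on whether values are rounded or truncated.

```python
# Hypothetical counts, back-calculated from the reported percentages of
# 47 draft genomes; the abstract itself does not state them.
TOTAL_GENOMES = 47
usage_counts = {
    "Illumina/Solexa only": 19,
    "Illumina/Solexa combined with other platforms": 23,
    "Roche/454 with Sanger": 5,
}


def usage_share(count, total=TOTAL_GENOMES):
    """Return the share of genomes sequenced with a platform, in percent."""
    return 100.0 * count / total


shares = {name: usage_share(n) for name, n in usage_counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

Under these assumed counts the three categories are mutually exclusive and add up to all 47 genomes, which is consistent with the three main percentages summing to 100%.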
Improved serial analysis of V1 ribosomal sequence tags (SARST-V1) provides a rapid, comprehensive, sequence-based characterization of bacterial diversity and community composition.

PubMed

Yu, Zhongtang; Yu, Marie; Morrison, Mark

2006-04-01

Serial analysis of ribosomal sequence tags (SARST) is a recently developed technology that can generate large 16S rRNA gene (rrs) sequence data sets from microbiomes, but there are numerous enzymatic and purification steps required to construct the ribosomal sequence tag (RST) clone libraries. We report here an improved SARST method, which still targets the V1 hypervariable region of rrs genes, but reduces the number of enzymes, oligonucleotides, reagents, and technical steps needed to produce the RST clone libraries. The new method, hereafter referred to as SARST-V1, was used to examine the eubacterial diversity present in community DNA recovered from the microbiome resident in the ovine rumen. The 190 sequenced clones contained 1055 RSTs and no fewer than 236 unique phylotypes (based on ≥95% sequence identity) that were assigned to eight different eubacterial phyla. Rarefaction and monomolecular curve analyses predicted that the complete RST clone library contains 99% of the 353 unique phylotypes predicted to exist in this microbiome. When compared with ribosomal intergenic spacer analysis (RISA) of the same community DNA sample, as well as a compilation of nine previously published conventional rrs clone libraries prepared from the same type of samples, the RST clone library provided a more comprehensive characterization of the eubacterial diversity present in rumen microbiomes. As such, SARST-V1 should be a useful tool applicable to comprehensive examination of diversity and composition in microbiomes and offers an affordable, sequence-based method for diversity analysis.
Questioning short-term memory and its measurement: Why digit span measures long-term associative learning.

PubMed

Jones, Gary; Macken, Bill

2015-11-01

Traditional accounts of verbal short-term memory explain differences in performance for different types of verbal material by reference to inherent characteristics of the verbal items making up memory sequences. The role of previous experience with sequences of different types is ostensibly controlled for either by deliberate exclusion or by presenting multiple trials constructed from different random permutations. We cast doubt on this general approach in a detailed analysis of the basis for the robust finding that short-term memory for digit sequences is superior to that for other sequences of verbal material. Specifically, we show across four experiments that this advantage is not due to inherent characteristics of digits as verbal items, nor are individual digits within sequences better remembered than other types of individual verbal items. Rather, the advantage for digit sequences stems from the increased frequency, compared to other verbal material, with which digits appear in random sequences in natural language, and furthermore, relatively frequent digit sequences support better short-term serial recall than less frequent ones. We also provide corpus-based computational support for the argument that performance in a short-term memory setting is a function of basic associative learning processes operating on the linguistic experience of the rememberer. The experimental and computational results raise questions not only about the role played by measurement of digit span in cognition generally, but also about the way in which long-term memory processes impact on short-term memory functioning. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Isolation and molecular characterization of partial FSH and LH receptor genes in Arabian camels (Camelus dromedarius)

PubMed Central

Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Bitaraf-Sani, Morteza

2015-01-01

Very little is known about LHR and FSHR genes of domestic dromedary camels. The main objective of this study was to determine and analyze partial genomic regions of FSHR and LHR genes in dromedary camels for the first time. To this end, a total of 50 DNA samples belonging to dromedary camels raised in Iran were sent for sequencing (25 samples of each gene). We compared the nucleotide sequences of Camelus dromedarius with corresponding sequences of previously published FSHR and LHR genes in bactrian camels and other species. According to the data, the same nucleotide variation was identified in both regions of the two camel species. The alignment of deduced protein sequences of the two different species revealed an amino acid variation at the FSHR region. No evidence of amino acid variation was observed, however, in LHR sequences. Phylogenetic analysis indicated that both camel species had a close relationship and clustered together in a separate branch. This was further confirmed by genetic distance values illustrating significant sequence identity between Camelus dromedarius and Camelus bactrianus. Interestingly, sequence comparisons revealed heterozygote patterns in FSHR sequences isolated from dromedary camels of Iran. In comparison to other species, this camel contains three amino acid substitutions at positions 5, 67, and 105 in the FSHR coding region. These substitutions are found exclusively in camels and can be considered species specific. The results of our study can be used for hormone functionality research (FSHR and LHR) as well as reproduction-linked polymorphisms and breeding programs. PMID:27844002
Isolation and molecular characterization of partial FSH and LH receptor genes in Arabian camels (Camelus dromedarius).

PubMed

Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Bitaraf-Sani, Morteza

2015-06-01

Very little is known about LHR and FSHR genes of domestic dromedary camels. The main objective of this study was to determine and analyze partial genomic regions of FSHR and LHR genes in dromedary camels for the first time. To this end, a total of 50 DNA samples belonging to dromedary camels raised in Iran were sent for sequencing (25 samples of each gene). We compared the nucleotide sequences of Camelus dromedarius with corresponding sequences of previously published FSHR and LHR genes in bactrian camels and other species. According to the data, the same nucleotide variation was identified in both regions of the two camel species. The alignment of deduced protein sequences of the two different species revealed an amino acid variation at the FSHR region. No evidence of amino acid variation was observed, however, in LHR sequences. Phylogenetic analysis indicated that both camel species had a close relationship and clustered together in a separate branch. This was further confirmed by genetic distance values illustrating significant sequence identity between Camelus dromedarius and Camelus bactrianus. Interestingly, sequence comparisons revealed heterozygote patterns in FSHR sequences isolated from dromedary camels of Iran. In comparison to other species, this camel contains three amino acid substitutions at positions 5, 67, and 105 in the FSHR coding region. These substitutions are found exclusively in camels and can be considered species specific. The results of our study can be used for hormone functionality research (FSHR and LHR) as well as reproduction-linked polymorphisms and breeding programs.
Genetics of intellectual disability in consanguineous families.

PubMed

Hu, Hao; Kahrizi, Kimia; Musante, Luciana; Fattahi, Zohreh; Herwig, Ralf; Hosseini, Masoumeh; Oppitz, Cornelia; Abedini, Seyedeh Sedigheh; Suckow, Vanessa; Larti, Farzaneh; Beheshtian, Maryam; Lipkowitz, Bettina; Akhtarkhavari, Tara; Mehvari, Sepideh; Otto, Sabine; Mohseni, Marzieh; Arzhangi, Sanaz; Jamali, Payman; Mojahedi, Faezeh; Taghdiri, Maryam; Papari, Elaheh; Soltani Banavandi, Mohammad Javad; Akbari, Saeide; Tonekaboni, Seyed Hassan; Dehghani, Hossein; Ebrahimpour, Mohammad Reza; Bader, Ingrid; Davarnia, Behzad; Cohen, Monika; Khodaei, Hossein; Albrecht, Beate; Azimi, Sarah; Zirn, Birgit; Bastami, Milad; Wieczorek, Dagmar; Bahrami, Gholamreza; Keleman, Krystyna; Vahid, Leila Nouri; Tzschach, Andreas; Gärtner, Jutta; Gillessen-Kaesbach, Gabriele; Varaghchi, Jamileh Rezazadeh; Timmermann, Bernd; Pourfatemi, Fatemeh; Jankhah, Aria; Chen, Wei; Nikuei, Pooneh; Kalscheuer, Vera M; Oladnabi, Morteza; Wienker, Thomas F; Ropers, Hans-Hilger; Najmabadi, Hossein

2018-01-04

Autosomal recessive (AR) gene defects are the leading genetic cause of intellectual disability (ID) in countries with frequent parental consanguinity, which account for about 1/7th of the world population. Yet, compared to autosomal dominant de novo mutations, which are the predominant cause of ID in Western countries, the identification of AR-ID genes has lagged behind. Here, we report on whole exome and whole genome sequencing in 404 consanguineous predominantly Iranian families with two or more affected offspring. In 219 of these, we found likely causative variants, involving 77 known and 77 novel AR-ID (candidate) genes, 21 X-linked genes, as well as 9 genes previously implicated in diseases other than ID. This study, the largest of its kind published to date, illustrates that high-throughput DNA sequencing in consanguineous families is a superior strategy for elucidating the thousands of hitherto unknown gene defects underlying AR-ID, and it sheds light on their prevalence.
Entamoeba struthionis n.sp. (Sarcomastigophora: Endamoebidae) from ostriches (Struthio camelus).

PubMed

Ponce Gordo, F; Martínez Díaz, R A; Herrera, S

2004-02-06

In the present work we identify the species of Entamoeba from ostriches (Struthio camelus). The complete sequence of the small subunit ribosomal RNA gene from this organism has been compared with those published for other species of the genus and clear differences have been found. These results confirm previous data which showed differences in parasite morphology and class of host from the other Entamoeba species. Taking all these data together, it can be concluded that the organism from ostriches is a new species whose proposed name is Entamoeba struthionis n.sp. This species probably infects rheas (Rhea americana), but genetic analysis of isolates from this host should be performed to confirm morphological data. Also, comparison of gene sequences with data from other authors on cysts recovered from human stool samples showed the possibility that this amoeba may affect humans. Further studies are needed to determine the risk of transmission of this new species to humans.
Structural impact of complete CpG methylation within target DNA on specific complex formation of the inducible transcription factor Egr-1.

PubMed

Zandarashvili, Levani; White, Mark A; Esadze, Alexandre; Iwahara, Junji

2015-07-08

The inducible transcription factor Egr-1 binds specifically to 9-bp target sequences containing two CpG sites that can potentially be methylated at four cytosine bases. Although it appears that complete CpG methylation would make an unfavorable steric clash in the previous crystal structures of the complexes with unmethylated or partially methylated DNA, our affinity data suggest that DNA recognition by Egr-1 is insensitive to CpG methylation. We have determined, at a 1.4-Å resolution, the crystal structure of the Egr-1 zinc-finger complex with completely methylated target DNA. Structural comparison of the three different methylation states reveals why Egr-1 can recognize the target sequences regardless of CpG methylation. Copyright © 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Novel transcriptional networks regulated by CLOCK in human neurons.

PubMed

Fontenot, Miles R; Berto, Stefano; Liu, Yuxiang; Werthmann, Gordon; Douglas, Connor; Usui, Noriyoshi; Gleason, Kelly; Tamminga, Carol A; Takahashi, Joseph S; Konopka, Genevieve

2017-11-01

The molecular mechanisms underlying human brain evolution are not fully understood; however, previous work suggested that expression of the transcription factor CLOCK in the human cortex might be relevant to human cognition and disease. In this study, we investigated this novel transcriptional role for CLOCK in human neurons by performing chromatin immunoprecipitation sequencing for endogenous CLOCK in adult neocortices and RNA sequencing following CLOCK knockdown in differentiated human neurons in vitro. These data suggested that CLOCK regulates the expression of genes involved in neuronal migration, and a functional assay showed that CLOCK knockdown increased neuronal migratory distance. Furthermore, dysregulation of CLOCK disrupts coexpressed networks of genes implicated in neuropsychiatric disorders, and the expression of these networks is driven by hub genes with human-specific patterns of expression. These data support a role for CLOCK-regulated transcriptional cascades involved in human brain evolution and function. © 2017 Fontenot et al.; Published by Cold Spring Harbor Laboratory Press.
Riverine effects on mitochondrial structure of Bornean orangutans (Pongo pygmaeus) at two spatial scales.

PubMed

Jalil, M F; Cable, J; Sinyor, J; Lackman-Ancrenaz, I; Ancrenaz, M; Bruford, M W; Goossens, B

2008-06-01

We examined mitochondrial DNA control region sequences of 73 Kinabatangan orangutans to test the hypothesis that the phylogeographical structure of the Bornean orangutan is influenced by riverine barriers. The Lower Kinabatangan Wildlife Sanctuary contains one of the most northern populations of orangutans (Pongo pygmaeus) on Borneo and is bisected by the Kinabatangan River, the longest river in Sabah. Orangutan samples on either side of the river were strongly differentiated with a high ΦST value of 0.404 (P < 0.001). Results also suggest an east-west gradient of genetic diversity and evidence for population expansion along the river, possibly reflecting a postglacial colonization of the Kinabatangan floodplain. We compared our data with previously published sequences of Bornean orangutans in the context of river catchment structure on the island and evaluated the general relevance of rivers as barriers to gene flow in this long-lived, solitary arboreal ape.
Genome-wide association mapping of virulence gene in rice blast fungus Magnaporthe oryzae using a genotyping by sequencing approach.

PubMed

Korinsak, Siripar; Tangphatsornruang, Sithichoke; Pootakham, Wirulda; Wanchana, Samart; Plabpla, Anucha; Jantasuriyarat, Chatchawan; Patarapuwadol, Sujin; Vanavichit, Apichart; Toojinda, Theerayut

2018-05-15

Magnaporthe oryzae is a fungal pathogen causing blast disease in many plant species. In this study, seventy-three isolates of M. oryzae collected from rice (Oryza sativa) in 1996-2014 were genotyped using a genotyping-by-sequencing approach to detect genetic variation. An association study was performed to identify single nucleotide polymorphisms (SNPs) associated with virulence genes using 831 selected SNPs and infection phenotypes on local and improved rice varieties. Population structure analysis revealed eight subpopulations. The division into eight groups was not related to the degree of virulence. Association mapping showed five SNPs associated with fungal virulence on chromosomes 1, 2, 3, 4 and 7. The SNP on chromosome 1 was associated with virulence against RD6-Pi7 and IRBL7-M, which might be linked to the previously reported AvrPi7. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Discovery of novel antimicrobial peptides with unusual cysteine motifs in dandelion Taraxacum officinale Wigg. flowers.

PubMed

Astafieva, A A; Rogozhin, E A; Odintsova, T I; Khadeeva, N V; Grishin, E V; Egorov, Ts A

2012-08-01

Three novel antimicrobial peptides designated ToAMP1, ToAMP2 and ToAMP3 were purified from Taraxacum officinale flowers. Their amino acid sequences were determined. The peptides are cationic and cysteine-rich and consist of 38, 44 and 42 amino acid residues for ToAMP1, ToAMP2 and ToAMP3, respectively. Importantly, according to cysteine motifs, the peptides are representatives of two novel previously unknown families of plant antimicrobial peptides. ToAMP1 and ToAMP2 share high sequence identity and belong to 6-Cys-containing antimicrobial peptides, while ToAMP3 is a member of a distinct 8-Cys family. The peptides were shown to display high antimicrobial activity both against fungal and bacterial pathogens, and therefore represent new promising molecules for biotechnological and medicinal applications. Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.
RNA motif search with data-driven element ordering.

PubMed

Rampášek, Ladislav; Jimenez, Randi M; Lupták, Andrej; Vinař, Tomáš; Brejová, Broňa

2016-05-18

In this paper, we study the problem of RNA motif search in long genomic sequences. This approach uses a combination of sequence and structure constraints to uncover new distant homologs of known functional RNAs. The problem is NP-hard and is traditionally solved by backtracking algorithms. We have designed a new algorithm for RNA motif search and implemented a new motif search tool RNArobo. The tool enhances the RNAbob descriptor language, allowing insertions in helices, which enables better characterization of ribozymes and aptamers. A typical RNA motif consists of multiple elements and the running time of the algorithm is highly dependent on their ordering. By approaching the element ordering problem in a principled way, we demonstrate more than 100-fold speedup of the search for complex motifs compared to previously published tools. We have developed a new method for RNA motif search that allows for a significant speedup of the search of complex motifs that include pseudoknots. Such speed improvements are crucial at a time when the rate of DNA sequencing outpaces growth in computing. RNArobo is available at http://compbio.fmph.uniba.sk/rnarobo .
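The claim that running time depends strongly on element ordering can be illustrated with a toy example. The sketch below is not RNArobo's algorithm or descriptor language; it assumes a hypothetical motif made of sequence elements at fixed offsets, written with a few IUPAC codes, and it counts character comparisons when elements are checked in their written order versus most-selective-first. All names (`find_motif`, `selectivity`, the example motif and sequence) are illustrative assumptions.

```python
# Toy illustration only: fixed-offset IUPAC elements, not RNArobo's descriptor language.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "N": "ACGU",
}

def element_matches(seq, pos, pattern, counter):
    """Check one element at seq[pos:], counting every character comparison."""
    if pos + len(pattern) > len(seq):
        return False
    for i, code in enumerate(pattern):
        counter[0] += 1
        if seq[pos + i] not in IUPAC[code]:
            return False
    return True

def find_motif(seq, elements, order):
    """Scan seq; at each start, test elements in the given order and stop at the first failure.

    elements: list of (offset, pattern). Returns (hit start positions, comparison count).
    """
    counter = [0]
    hits = []
    for start in range(len(seq)):
        if all(element_matches(seq, start + elements[k][0], elements[k][1], counter)
               for k in order):
            hits.append(start)
    return hits, counter[0]

def selectivity(pattern):
    """Chance that a uniformly random string matches the pattern (lower = more selective)."""
    p = 1.0
    for code in pattern:
        p *= len(IUPAC[code]) / 4
    return p

# Hypothetical three-element motif and test sequence.
elements = [(0, "NNNN"), (4, "RYN"), (7, "GGAC")]
seq = "ACGUACGUGGACUUAGCAUGGACAAAGCUGGACGUAC"

naive = list(range(len(elements)))
ordered = sorted(naive, key=lambda k: selectivity(elements[k][1]))

hits_naive, cost_naive = find_motif(seq, elements, naive)
hits_ordered, cost_ordered = find_motif(seq, elements, ordered)
```

Both orderings report the same hits, but checking the most selective element first rejects most start positions after a single comparison. Real motif descriptors also include helices and variable-length spacers, which make the ordering problem considerably harder than this sketch suggests.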
Similar Ratios of Introns to Intergenic Sequence across Animal Genomes

PubMed Central

Wörheide, Gert

2017-01-01

Abstract One central goal of genome biology is to understand how the usage of the genome differs between organisms. Our knowledge of genome composition, needed for downstream inferences, is critically dependent on gene annotations, yet problems associated with gene annotation and assembly errors are usually ignored in comparative genomics. Here, we analyze the genomes of 68 species across 12 animal phyla and some single-cell eukaryotes for general trends in genome composition and transcription, taking into account problems of gene annotation. We show that, regardless of genome size, the ratio of introns to intergenic sequence is comparable across essentially all animals, with nearly all deviations dominated by increased intergenic sequence. Genomes of model organisms have ratios much closer to 1:1, suggesting that the majority of published genomes of nonmodel organisms are underannotated and consequently omit substantial numbers of genes, with likely negative impact on evolutionary interpretations. Finally, our results also indicate that most animals transcribe half or more of their genomes arguing against differences in genome usage between animal groups, and also suggesting that the transcribed portion is more dependent on genome size than previously thought. PMID:28633296
Cultural studies coupled with DNA based sequence analyses and its implication on pigmentation as a phylogenetic marker in Pestalotiopsis taxonomy.

PubMed

Liu, Ai-Rong; Chen, Shuang-Chen; Wu, Shang-Ying; Xu, Tong; Guo, Liang-Dong; Jeewon, Rajesh; Wei, Ji-Guang

2010-11-01

Previous phylogenetic studies based on DNA sequence data have partially resolved taxonomic relationships among Pestalotiopsis species. There are still some morphological characters whose phylogenetic significance has not been assessed properly due to limited taxon sampling, in particular the degree of pigmentation of median cells. In this study, the stability of pigmentation of median cells of conidia in Pestalotiopsis species was evaluated in subculture, and a molecular phylogenetic analysis was conducted on 45 strains belonging to 26 species in order to reappraise the pigmentation of median cells for its significance in the taxonomy of Pestalotiopsis. Phylogenetic relationships were inferred from nucleotide sequences in ITS regions (ITS1, 5.8S and ITS2) and β-tubulin 2 gene (tub2). The results showed that pigmentation of median cells was stable and it could be a key character in the taxonomy of Pestalotiopsis species. Instead of "concolorous" and "versicolor" proposed by Steyaert (1949), "brown to olivaceous" and "umber to fuliginous" are described and proposed in this paper. Copyright © 2010. Published by Elsevier Inc.
Assembly and analysis of a male sterile rubber tree mitochondrial genome reveals DNA rearrangement events and a novel transcript

PubMed Central

2014-01-01

Background. The rubber tree, Hevea brasiliensis, is an important plant species that is commercially grown to produce latex rubber in many countries. The rubber tree variety BPM 24 exhibits cytoplasmic male sterility, inherited from the variety GT 1. Results. We constructed the rubber tree mitochondrial genome of a cytoplasmic male sterile variety, BPM 24, using 454 sequencing, including 8 kb paired-end libraries, plus Illumina paired-end sequencing. We annotated this mitochondrial genome with the aid of Illumina RNA-seq data and performed comparative analysis. We then compared the sequence of BPM 24 to the contigs of the published rubber tree, variety RRIM 600, and identified a rearrangement that is unique to BPM 24 resulting in a novel transcript containing a portion of atp9. Conclusions. The novel transcript is consistent with changes that cause cytoplasmic male sterility through a slight reduction to ATP production efficiency. The exhaustive nature of the search rules out alternative causes and supports previous findings of novel transcripts causing cytoplasmic male sterility. PMID:24512148

Detection of canine cytokine gene expression by reverse transcription-polymerase chain reaction.

PubMed

Pinelli, E; van der Kaaij, S Y; Slappendel, R; Fragio, C; Ruitenberg, E J; Bernadina, W; Rutten, V P

1999-08-02

Further characterization of the canine immune system will greatly benefit from the availability of tools to detect canine cytokines. Our interest is in the role of cytokines in canine visceral leishmaniasis. For this purpose, we have designed specific primers using previously published sequences for the detection of canine IL-2, IFN-gamma and IL-10 mRNA by reverse transcription-polymerase chain reaction (RT-PCR). For IL-4, we have cloned and sequenced this cytokine gene, and developed canine-specific primers. To control for sample-to-sample variation in the quantity of mRNA and variation in the RT and PCR reactions, the mRNA levels of glyceraldehyde-3-phosphate dehydrogenase (G3PDH), a housekeeping gene, were determined in parallel. Primers to amplify G3PDH were designed from consensus sequences obtained from the GenBank database. The mRNA levels of the cytokines mentioned here were detected from ConA-stimulated peripheral mononuclear cells derived from Leishmania-infected dogs. A different pattern of cytokine production among infected animals was found.
A new record of ponyfish Deveximentum megalolepis (Perciformes: Leiognathidae) in Beibu Gulf of China

NASA Astrophysics Data System (ADS)

Ju, Yuman; Song, Na; Chen, Guobao; Sun, Dianrong; Han, Zhiqiang; Gao, Tianxiang

2017-06-01

A new record ponyfish, Deveximentum megalolepis Mochizuki and Hayashi, 1989, was documented based on its morphological characteristics and DNA barcode. Fifty specimens were collected from Beibu Gulf of China and identified as D. megalolepis by morphological characterization. The coloration, meristic traits, and morphometric measurements were consistent with previously published records. In general, it is a silver-white, laterally compressed and deep bodied ponyfish with 6-9 rows of scales on cheek; scale rows above lateral line 6-8; scale rows below lateral line 14-17. Mitochondrial cytochrome oxidase I subunit (COI) gene fragment was sequenced for phylogenetic analysis. There is no sequence variation of COI gene between the specimens collected in this study. The genetic distances between D. megalolepis and other congeneric species range from 3.6% to 14.0%, which were greater than the threshold for fish species delimitation. The COI sequence analysis also supported the validity of D. megalolepis at genetic level. However, the genetic distance between Chinese and Philippine individuals was about 1.2% and they formed two lineages in gene tree, which may be caused by the geographical distance.
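The genetic distances quoted above are pairwise comparisons between aligned COI sequences. As a minimal sketch, the function below computes an uncorrected p-distance (the proportion of differing sites); this is an assumption for illustration, since the abstract does not state which distance model the authors used, and barcode studies often apply a corrected model instead. The function name `p_distance` and the short example sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

# Hypothetical 10-bp fragments, not real COI data.
d = p_distance("ACGTACGTAC", "ACGTACGTAT")
```

On real barcode data the same calculation would be run on full-length aligned COI fragments, and the result compared against whatever delimitation threshold the study adopts.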
Antarctic ice core samples: culturable bacterial diversity.

PubMed

Shivaji, Sisinthy; Begum, Zareena; Shiva Nageswara Rao, Singireesu Soma; Vishnu Vardhan Reddy, Puram V; Manasa, Poorna; Sailaja, Buddi; Prathiba, Mambatta S; Thamban, Meloth; Krishnan, Kottekkatu P; Singh, Shiv M; Srinivas, Tanuku N R

2013-01-01

Culturable bacterial abundance at 11 different depths of a 50.26 m ice core from the Tallaksenvarden Nunatak, Antarctica, varied from 0.02 to 5.8 × 10(3) CFU ml(-1) of the melt water. A total of 138 bacterial strains were recovered from the 11 different depths of the ice core. Based on 16S rRNA gene sequence analyses, the 138 isolates could be categorized into 25 phylotypes belonging to phyla Actinobacteria, Bacteroidetes, Firmicutes and Proteobacteria. All isolates had 16S rRNA sequences similar to previously determined sequences (97.2-100%). No correlation was observed in the distribution of the isolates at the various depths either at the phylum, genus or species level. The 25 phylotypes varied in growth temperature range, tolerance to NaCl, growth pH range and ability to produce eight different extracellular enzymes at either 4 or 18 °C. Iso-, anteiso-, unsaturated and saturated fatty acids together constituted a significant proportion of the total fatty acid composition. Copyright © 2012 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Quantitation of normal CFTR mRNA in CF patients with splice-site mutations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhou, Z.; Olsen, J.C.; Silverman, L.M.

Previously we identified two mutations in introns of the CFTR gene associated with partially active splice sites and unusual clinical phenotypes. One mutation in intron 19 (3849+10 kb C to T) is common in CF patients with normal sweat chloride values; an 84 bp sequence from intron 19, which contains a stop codon, is inserted between exon 19 and exon 20 in most nasal CFTR transcripts. The other mutation in intron 14B (2789+5 G to A) is associated with elevated sweat chloride levels, but mild pulmonary disease; exon 14B (38 bp) is spliced out of most nasal CFTR transcripts. The remaining CFTR cDNA sequences, other than the 84 bp insertion or exon 14B deletion, are identical to the published sequence. To correlate genotype and phenotype, we used quantitative RT-PCR to determine the levels of normally spliced CFTR mRNA in nasal epithelia from these patients. CFTR cDNA was amplified (25 cycles) by using primers specific for normally spliced species; γ-actin cDNA was amplified as a standard.
A molecular epidemiological investigation of avian paramyxovirus type 1 viruses isolated from game birds of the order Galliformes.

PubMed

Aldous, E W; Mynn, J K; Irvine, R M; Alexander, D J; Brown, I H

2010-12-01

The partial (370 nucleotides) fusion gene sequences of 55 avian paramyxovirus type 1 (APMV-1) isolates were obtained. Included were 41 published sequences, of which 16 were from strains of APMV-1 of previously determined lineages included as markers for the data analysed and 25 were from APMV-1 viruses isolated from game birds of the order Galliformes. In addition, we sequenced a further 14 game bird isolates obtained from the repository at the Veterinary Laboratories Agency. The game bird isolates had been obtained from 17 countries, and spanned four decades. Earlier studies have shown that class II APMV-1 viruses can be divided into at least 15 lineages and sub-lineages. Phylogenetic analysis revealed that the 39 game bird isolates were distributed across 12 of these sub-lineages. We conclude that no single lineage of Newcastle disease viruses appears to be prevalent in game birds, and the isolates obtained from these hosts reflected the prevailing, both geographically and temporally, viruses in poultry, pigeons or wild birds.
XLID-causing mutations and associated genes challenged in light of data from large-scale human exome sequencing.

PubMed

Piton, Amélie; Redin, Claire; Mandel, Jean-Louis

2013-08-08

Because of the unbalanced sex ratio (1.3-1.4 to 1) observed in intellectual disability (ID) and the identification of large ID-affected families showing X-linked segregation, much attention has been focused on the genetics of X-linked ID (XLID). Mutations causing monogenic XLID have now been reported in over 100 genes, most of which are commonly included in XLID diagnostic gene panels. Nonetheless, the boundary between true mutations and rare non-disease-causing variants often remains elusive. The sequencing of a large number of control X chromosomes, required for avoiding false-positive results, was not systematically possible in the past. Such information is now available thanks to large-scale sequencing projects such as the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project, which provides variation information on 10,563 X chromosomes from the general population. We used this NHLBI cohort to systematically reassess the implication of 106 genes proposed to be involved in monogenic forms of XLID. We particularly question the implication in XLID of ten of them (AGTR2, MAGT1, ZNF674, SRPX2, ATP6AP2, ARHGEF6, NXF5, ZCCHC12, ZNF41, and ZNF81), in which truncating variants or previously published mutations are observed at a relatively high frequency within this cohort. We also highlight 15 other genes (CCDC22, CLIC2, CNKSR2, FRMPD4, HCFC1, IGBP1, KIAA2022, KLF8, MAOA, NAA10, NLGN3, RPL10, SHROOM4, ZDHHC15, and ZNF261) for which replication studies are warranted. We propose that similar reassessment of reported mutations (and genes) with the use of data from large-scale human exome sequencing would be relevant for a wide range of other genetic diseases. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Mammalian genome projects reveal new growth hormone (GH) sequences. Characterization of the GH-encoding genes of armadillo (Dasypus novemcinctus), hedgehog (Erinaceus europaeus), bat (Myotis lucifugus), hyrax (Procavia capensis), shrew (Sorex araneus), ground squirrel (Spermophilus tridecemlineatus), elephant (Loxodonta africana), cat (Felis catus) and opossum (Monodelphis domestica).

PubMed

Wallis, Michael

2008-01-15

Mammalian growth hormone (GH) sequences have been shown previously to display episodic evolution: the sequence is generally strongly conserved but on at least two occasions during mammalian evolution (on lineages leading to higher primates and ruminants) bursts of rapid evolution occurred. However, the number of mammalian orders studied previously has been relatively limited, and the availability of sequence data via mammalian genome projects provides the potential for extending the range of GH gene sequences examined. Complete or nearly complete GH gene sequences for six mammalian species for which no data were previously available have been extracted from the genome databases: Dasypus novemcinctus (nine-banded armadillo), Erinaceus europaeus (western European hedgehog), Myotis lucifugus (little brown bat), Procavia capensis (cape rock hyrax), Sorex araneus (European shrew), Spermophilus tridecemlineatus (13-lined ground squirrel). In addition incomplete data for several other species have been extended. Examination of the data in detail and comparison with previously available sequences has allowed assessment of the reliability of deduced sequences. Several of the new sequences differ substantially from the consensus sequence previously determined for eutherian GHs, indicating greater variability than previously recognised, and confirming the episodic pattern of evolution. The episodic pattern is not seen for signal sequences, 5' upstream sequence or synonymous substitutions; it is specific to the mature protein sequence, suggesting that it relates to the hormonal function. The substitutions accumulated during the course of GH evolution have occurred mainly on the side of the hormone facing away from the receptor, in a non-random fashion, and it is suggested that this may reflect interaction of the receptor-bound hormone with other proteins or small ligands.
Comparison of Control of Clostridium difficile Infection in Six English Hospitals Using Whole-Genome Sequencing.

PubMed

Eyre, David W; Fawley, Warren N; Rajgopal, Anu; Settle, Christopher; Mortimer, Kalani; Goldenberg, Simon D; Dawson, Susan; Crook, Derrick W; Peto, Tim E A; Walker, A Sarah; Wilcox, Mark H

2017-08-01

Variation in Clostridium difficile infection (CDI) rates between healthcare institutions suggests overall incidence could be reduced if the lowest rates could be achieved more widely. We used whole-genome sequencing (WGS) of consecutive C. difficile isolates from 6 English hospitals over 1 year (2013-14) to compare infection control performance. Fecal samples with a positive initial screen for C. difficile were sequenced. Within each hospital, we estimated the proportion of cases plausibly acquired from previous cases. Overall, 851/971 (87.6%) sequenced samples contained toxin genes, and 451 (46.4%) were fecal-toxin-positive. Of 652 potentially toxigenic isolates >90 days after the study started, 128 (20%, 95% confidence interval [CI] 17-23%) were genetically linked (within ≤2 single nucleotide polymorphisms) to a prior patient's isolate from the previous 90 days. Hospital 2 had the fewest linked isolates, 7/105 (7%, 3-13%), hospital 1, 9/70 (13%, 6-23%), and hospitals 3-6 had similar proportions of linked isolates (22-26%) (P ≤ .002 comparing hospital 2 vs 3-6). Results were similar adjusting for locally circulating ribotypes. Adjusting for hospital, ribotype-027 had the highest proportion of linked isolates (57%, 95% CI 29-81%). Fecal-toxin-positive and toxin-negative patients were similarly likely to be a potential transmission donor, OR = 1.01 (0.68-1.49). There was no association between the estimated proportion of linked cases and testing rates. WGS can be used as a novel surveillance tool to identify varying rates of C. difficile transmission between institutions and therefore to allow targeted efforts to reduce CDI incidence. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America.
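The linkage rule described in this abstract (an isolate counts as linked if it lies within ≤2 SNPs of a prior patient's isolate from the previous 90 days) can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes isolates are already represented as equal-length aligned sequences, treats same-day isolates as not prior (an assumption), and uses hypothetical names (`linked_cases`, `snp_distance`) and made-up data.

```python
from datetime import date

def snp_distance(a, b):
    """Number of differing positions between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b))

def linked_cases(isolates, max_snps=2, window_days=90):
    """Return patient IDs whose isolate is linked to an earlier patient's isolate.

    isolates: list of (patient_id, sample_date, aligned_sequence).
    """
    ordered = sorted(isolates, key=lambda rec: rec[1])
    linked = []
    for i, (pid, day, seq) in enumerate(ordered):
        for prior_pid, prior_day, prior_seq in ordered[:i]:
            gap = (day - prior_day).days
            if (prior_pid != pid
                    and 0 < gap <= window_days
                    and snp_distance(seq, prior_seq) <= max_snps):
                linked.append(pid)
                break  # one plausible donor is enough to count the case as linked
    return linked

# Hypothetical isolates (not study data).
isolates = [
    ("P1", date(2013, 1, 1), "AAAAAAAAAA"),
    ("P2", date(2013, 2, 1), "AAAAAAAAAT"),  # 1 SNP from P1, 31 days later
    ("P3", date(2013, 3, 1), "CCCCCAAAAA"),  # too divergent from P1 and P2
    ("P4", date(2013, 8, 1), "AAAAAAAAAA"),  # identical to P1 but outside the window
]
result = linked_cases(isolates)
```

The proportion of linked cases per hospital would then be the number of linked patients divided by the number of eligible isolates from that hospital.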
Diversity, abundance, and consistency of microbial oxygenase expression and biodegradation in a shallow contaminated aquifer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yagi, J.M.; Madsen, E.L.

The diversity of Rieske dioxygenase genes and short-term temporal variability in the abundance of two selected dioxygenase gene sequences were examined in a naphthalene-rich, coal tar waste-contaminated subsurface study site. Using a previously published PCR-based approach (S. M. Ni Chadhain, R. S. Norman, K. V. Pesce, J. J. Kukor, and G. J. Zylstra, Appl. Environ. Microbiol. 72: 4078-4087, 2006) a broad suite of genes was detected, ranging from dioxygenase sequences associated with Rhodococcus and Sphingomonas to 32 previously uncharacterized Rieske gene sequence clone groups. The nag genes appeared frequently (20% of the total) in two groundwater monitoring wells characterized by low (~10² ppb; ~1 μM) ambient concentrations of naphthalene. A quantitative competitive PCR assay was used to show that abundances of nag genes (and archetypal nah genes) fluctuated substantially over a 9-month period. To contrast short-term variation with long-term community stability, in situ community gene expression (dioxygenase mRNA) and biodegradation potential (community metabolism of naphthalene in microcosms) were compared to measurements from 6 years earlier. cDNA sequences amplified from total RNA extracts revealed that nah- and nag-type genes were expressed in situ, corresponding well with structural gene abundances. Despite evidence for short-term (9-month) shifts in dioxygenase gene copy number, agreement in field gene expression (dioxygenase mRNA) and biodegradation potential was observed in comparisons to equivalent assays performed 6 years earlier. Thus, stability in community biodegradation characteristics at the hemidecadal time frame has been documented for these subsurface microbial communities.
Antibiotic Resistance Markers in Burkholderia pseudomallei Strain Bp1651 Identified by Genome Sequence Analysis

PubMed Central

Sue, David; Gee, Jay E.; Elrod, Mindy G.; Hoffmaster, Alex R.; Randall, Linnell B.; Chirakul, Sunisa; Tuanyok, Apichai; Schweizer, Herbert P.; Weigel, Linda M.

2017-01-01

ABSTRACT Burkholderia pseudomallei Bp1651 is resistant to several classes of antibiotics that are usually effective for treatment of melioidosis, including tetracyclines, sulfonamides, and β-lactams such as penicillins (amoxicillin-clavulanic acid), cephalosporins (ceftazidime), and carbapenems (imipenem and meropenem). We sequenced, assembled, and annotated the Bp1651 genome and analyzed the sequence using comparative genomic analyses with susceptible strains, keyword searches of the annotation, publicly available antimicrobial resistance prediction tools, and published reports. More than 100 genes in the Bp1651 sequence were identified as potentially contributing to antimicrobial resistance. Most notably, we identified three previously uncharacterized point mutations in penA, which codes for a class A β-lactamase and was previously implicated in resistance to β-lactam antibiotics. The mutations result in amino acid changes T147A, D240G, and V261I. When individually introduced into select agent-excluded B. pseudomallei strain Bp82, D240G was found to contribute to ceftazidime resistance and T147A contributed to amoxicillin-clavulanic acid and imipenem resistance. This study provides the first evidence that mutations in penA may alter susceptibility to carbapenems in B. pseudomallei. Another mutation of interest was a point mutation affecting the dihydrofolate reductase gene folA, which likely explains the trimethoprim resistance of this strain. Bp1651 was susceptible to aminoglycosides likely because of a frameshift in the amrB gene, the transporter subunit of the AmrAB-OprA efflux pump. These findings expand the role of penA to include resistance to carbapenems and may assist in the development of molecular diagnostics that predict antimicrobial resistance and provide guidance for treatment of melioidosis. PMID:28396541
Expanding the mutational spectrum in Johanson-Blizzard syndrome: identification of whole exon deletions and duplications in the UBR1 gene by multiplex ligation-dependent probe amplification analysis.

PubMed

Sukalo, Maja; Schäflein, Eva; Schanze, Ina; Everman, David B; Rezaei, Nima; Argente, Jesús; Lorda-Sanchez, Isabel; Deshpande, Charu; Takahashi, Tsutomu; Kleger, Alexander; Zenker, Martin

2017-11-01

Johanson-Blizzard syndrome (JBS, MIM #243800) is a very rare autosomal recessive disorder characterized by exocrine pancreatic insufficiency, nasal wing hypoplasia, hypodontia, and other abnormalities. JBS is caused by mutations of the UBR1 gene (MIM *605981), encoding a ubiquitin ligase of the N-end rule pathway. Molecular findings in a total of 65 unrelated patients with a clinical diagnosis of JBS who were previously screened for UBR1 mutations by Sanger sequencing were reviewed and cases lacking a disease-causing UBR1 mutation on either one or both alleles were included in this study. In order to discover mutations that are not detectable by Sanger sequencing, we designed a probe set for multiplex ligation-dependent probe amplification (MLPA) analysis of the UBR1 gene and analyzed the copy number status of all 47 UBR1 exons. Our previous studies using Sanger sequencing could detect mutations in 93.1% of 130 disease-associated UBR1 alleles. Six patients with a highly suggestive clinical diagnosis of JBS and unsolved genotype were included in this study. MLPA analysis detected six alleles harboring exon deletions/duplications, thereby raising the mutation detection rate in the entire cohort to 97.7% (127/130 alleles). We conclude that single or multi-exon deletions or duplications account for a substantial proportion of JBS-associated UBR1 mutations. © 2017 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.
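The detection rates quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes that the 93.1% Sanger figure corresponds to a whole number of alleles out of 130 (an inference, since the abstract gives only the percentage); variable names are illustrative.

```python
total_alleles = 130

# 93.1% of 130 alleles, rounded to a whole allele count (assumed to be how the figure arose).
sanger_detected = round(0.931 * total_alleles)

# MLPA found six additional alleles carrying exon deletions or duplications.
mlpa_detected = 6

combined = sanger_detected + mlpa_detected
combined_rate = 100 * combined / total_alleles
sanger_rate = 100 * sanger_detected / total_alleles
```

Under that assumption the counts agree with the abstract: 121 alleles by Sanger sequencing plus 6 by MLPA gives 127 of 130.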
Extensive Horizontal Transfer and Homologous Recombination Generate Highly Chimeric Mitochondrial Genomes in Yeast.

PubMed

Wu, Baojun; Buljic, Adnan; Hao, Weilong

2015-10-01

The frequency of horizontal gene transfer (HGT) in mitochondrial DNA varies substantially. In plants, HGT is relatively common, whereas in animals it appears to be quite rare. It is of considerable importance to understand mitochondrial HGT across the major groups of eukaryotes at a genome-wide level, but so far this has been well studied only in plants. In this study, we generated ten new mitochondrial genome sequences and analyzed 40 mitochondrial genomes from the Saccharomycetaceae to assess the magnitude and nature of mitochondrial HGT in yeasts. We provide evidence for extensive, homologous-recombination-mediated, mitochondrial-to-mitochondrial HGT occurring throughout yeast mitochondrial genomes, leading to genomes that are highly chimeric evolutionarily. This HGT has led to substantial intraspecific polymorphism in both sequence content and sequence divergence, which to our knowledge has not been previously documented in any mitochondrial genome. The unexpectedly high frequency of mitochondrial HGT in yeast may be driven by frequent mitochondrial fusion, relatively low mitochondrial substitution rates and pseudohyphal fusion to produce heterokaryons. These findings suggest that mitochondrial HGT may play an important role in genome evolution of a much broader spectrum of eukaryotes than previously appreciated and that there is a critical need to systematically study the frequency, extent, and importance of mitochondrial HGT across eukaryotes. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Comprehensive profiling of retroviral integration sites using target enrichment methods from historical koala samples without an assembled reference genome

PubMed Central

Alquezar-Planas, David E.; Ishida, Yasuko; Courtiol, Alexandre; Timms, Peter; Johnson, Rebecca N.; Lenz, Dorina; Helgen, Kristofer M.; Roca, Alfred L.; Hartman, Stefanie

2016-01-01

Background. Retroviral integration into the host germline results in permanent viral colonization of vertebrate genomes. The koala retrovirus (KoRV) is currently invading the germline of the koala (Phascolarctos cinereus) and provides a unique opportunity for studying retroviral endogenization. Previous analyses of KoRV integration patterns in modern koalas demonstrate that they share integration sites primarily if they are related, indicating that the process is currently driven by vertical transmission rather than infection. However, due to methodological challenges, KoRV integrations have not been comprehensively characterized. Results. To overcome these challenges, we applied and compared three target enrichment techniques coupled with next generation sequencing (NGS) and a newly customized sequence-clustering based computational pipeline to determine the integration sites for 10 museum Queensland and New South Wales (NSW) koala samples collected between the 1870s and late 1980s. A secondary aim of this study was to identify common integration sites across modern and historical specimens by comparing our dataset to previously published studies. Several million sequences were processed, and the KoRV integration sites in each koala were characterized. Conclusions. Although the three enrichment methods each exhibited bias in integration site retrieval, a combination of two methods, Primer Extension Capture and hybridization capture, is recommended for future studies on historical samples. Moreover, identification of integration sites shows that the proportion of integration sites shared between any two koalas is quite small. PMID:27069793
Joubert syndrome: genotyping a Northern European patient cohort.

PubMed

Kroes, Hester Y; Monroe, Glen R; van der Zwaag, Bert; Duran, Karen J; de Kovel, Carolien G; van Roosmalen, Mark J; Harakalova, Magdalena; Nijman, Ies J; Kloosterman, Wigard P; Giles, Rachel H; Knoers, Nine V A M; van Haaften, Gijs

2016-02-01

Joubert syndrome (JBS) is a rare neurodevelopmental disorder belonging to the group of ciliary diseases. JBS is genetically heterogeneous, with >20 causative genes identified to date. A molecular diagnosis of JBS is essential for prediction of disease progression and genetic counseling. We developed a targeted next-generation sequencing (NGS) approach for parallel sequencing of 22 known JBS genes plus 599 additional ciliary genes. This method was used to genotype a cohort of 51 well-phenotyped Northern European JBS cases (in some of the cases, Sanger sequencing of individual JBS genes had been performed previously). Altogether, 21 of the 51 cases (41%) harbored biallelic pathogenic mutations in known JBS genes, including 14 mutations not previously described. Mutations in C5orf42 (12%), TMEM67 (10%), and AHI1 (8%) were the most prevalent. C5orf42 mutations result in a purely neurological Joubert phenotype, in one case associated with postaxial polydactyly. Our study represents a population-based cohort of JBS patients not enriched for consanguinity, providing insight into the relative importance of the different JBS genes in a Northern European population. Mutations in C5orf42 are relatively frequent (possibly due to a Dutch founder mutation) and mutations in CEP290 are underrepresented compared with international cohorts. Furthermore, we report a case with heterozygous mutations in CC2D2A and B9D1, a gene associated with the more severe Meckel-Gruber syndrome that was recently published as a potential new JBS gene, and discuss the significance of this finding.
Public health surveillance of multidrug-resistant clones of Neisseria gonorrhoeae in Europe: a genomic survey.

PubMed

Harris, Simon R; Cole, Michelle J; Spiteri, Gianfranco; Sánchez-Busó, Leonor; Golparian, Daniel; Jacobsson, Susanne; Goater, Richard; Abudahab, Khalil; Yeats, Corin A; Bercot, Beatrice; Borrego, Maria José; Crowley, Brendan; Stefanelli, Paola; Tripodo, Francesco; Abad, Raquel; Aanensen, David M; Unemo, Magnus

2018-05-15

Traditional methods for molecular epidemiology of Neisseria gonorrhoeae are suboptimal. Whole-genome sequencing (WGS) offers ideal resolution to describe population dynamics and to predict and infer transmission of antimicrobial resistance, and can enhance infection control through linkage with epidemiological data. We used WGS, in conjunction with linked epidemiological and phenotypic data, to describe the gonococcal population in 20 European countries. We aimed to detail changes in phenotypic antimicrobial resistance levels (and the reasons for these changes) and strain distribution (with a focus on antimicrobial resistance strains in risk groups), and to predict antimicrobial resistance from WGS data. We carried out an observational study, in which we sequenced isolates taken from patients with gonorrhoea from the European Gonococcal Antimicrobial Surveillance Programme in 20 countries from September to November, 2013. We also developed a web platform that we used for automated antimicrobial resistance prediction, molecular typing (N gonorrhoeae multi-antigen sequence typing [NG-MAST] and multilocus sequence typing), and phylogenetic clustering in conjunction with epidemiological and phenotypic data. The multidrug-resistant NG-MAST genogroup G1407 was predominant and accounted for the most cephalosporin resistance, but the prevalence of this genogroup decreased from 248 (23%) of 1066 isolates in a previous study from 2009-10 to 174 (17%) of 1054 isolates in this survey in 2013. This genogroup previously showed an association with men who have sex with men, but changed to an association with heterosexual people (odds ratio=4·29). WGS provided substantially improved resolution and accuracy over NG-MAST and multilocus sequence typing, predicted antimicrobial resistance relatively well, and identified discrepant isolates, mixed infections or contaminants, and multidrug-resistant clades linked to risk groups. 
To our knowledge, we provide the first use of joint analysis of WGS and epidemiological data in an international programme for regional surveillance of sexually transmitted infections. WGS provided enhanced understanding of the distribution of antimicrobial resistance clones, including replacement with clones that were more susceptible to antimicrobials, in several risk groups nationally and regionally. We provide a framework for genomic surveillance of gonococci through standardised sampling, use of WGS, and a shared information architecture for interpretation and dissemination by use of open access software. Funding: The European Centre for Disease Prevention and Control, The Centre for Genomic Pathogen Surveillance, Örebro University Hospital, and Wellcome. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
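The change in G1407 prevalence quoted in this abstract can be checked with simple arithmetic. The sketch below recomputes the two proportions from the counts given in the abstract and, purely as an illustration, an odds ratio comparing the 2013 survey with the 2009-10 study. That between-survey odds ratio is an assumption of this example; it is not the odds ratio of 4·29 reported for the association with heterosexual people, which cannot be recomputed from the abstract. All function and variable names are hypothetical.

```python
# Counts taken from the abstract: G1407 isolates / all isolates typed.
g1407_2009, total_2009 = 248, 1066   # previous study, 2009-10
g1407_2013, total_2013 = 174, 1054   # this survey, 2013


def proportion(cases: int, total: int) -> float:
    """Share of isolates that belong to the genogroup."""
    return cases / total


def odds(cases: int, total: int) -> float:
    """Odds that an isolate belongs to the genogroup."""
    return cases / (total - cases)


prev_2009 = proportion(g1407_2009, total_2009)
prev_2013 = proportion(g1407_2013, total_2013)

# Illustrative only: odds of G1407 in 2013 relative to 2009-10.
survey_odds_ratio = odds(g1407_2013, total_2013) / odds(g1407_2009, total_2009)

print(f"2009-10: {prev_2009:.1%}, 2013: {prev_2013:.1%}")
print(f"between-survey odds ratio: {survey_odds_ratio:.2f}")
```

Rounded to whole percentages, the two proportions match the 23% and 17% stated in the abstract.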
Characterization and complete genome sequence of a previously uncharacterized panicovirus from Bermuda grass detected by high throughput sequencing

USDA-ARS's Scientific Manuscript database

Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high throughput sequencing (HTS). The nearly full genome sequence of a previously uncharacterized Panicovirus was identified from...
Leishmania species identification using FTA card sampling directly from patients' cutaneous lesions in the state of Lara, Venezuela.

PubMed

Kato, Hirotomo; Watanabe, Junko; Mendoza Nieto, Iraida; Korenaga, Masataka; Hashiguchi, Yoshihisa

2011-10-01

A molecular epidemiological study was performed using FTA card materials directly sampled from lesions of patients with cutaneous leishmaniasis (CL) in the state of Lara, Venezuela, where causative agents have been identified as Leishmania (Viannia) braziliensis and L. (Leishmania) venezuelensis in previous studies. Of the 17 patients diagnosed with CL, Leishmania spp. were successfully identified in 16 patients based on analysis of the cytochrome b gene and rRNA internal transcribed spacer sequences. Consistent with previous findings, seven of the patients were infected with L. (V.) braziliensis. However, parasites from the other nine patients were genetically identified as L. (L.) mexicana, which differed from results of previous enzymatic and antigenic analyses. These results strongly suggest that L. (L.) venezuelensis is a variant of L. (L.) mexicana and that the classification of L. (L.) venezuelensis should be reconsidered. Copyright © 2011 Royal Society of Tropical Medicine and Hygiene. Published by Elsevier Ltd. All rights reserved.
Structure and Evolution of Chlorate Reduction Composite Transposons

PubMed Central

Clark, Iain C.; Melnyk, Ryan A.; Engelbrektson, Anna; Coates, John D.

2013-01-01

The genes for chlorate reduction in six bacterial strains were analyzed in order to gain insight into the metabolism. A newly isolated chlorate-reducing bacterium (Shewanella algae ACDC) and three previously isolated strains (Ideonella dechloratans, Pseudomonas sp. strain PK, and Dechloromarinus chlorophilus NSS) were genome sequenced and compared to published sequences (Alicycliphilus denitrificans BC plasmid pALIDE01 and Pseudomonas chloritidismutans AW-1). De novo assembly of genomes failed to join regions adjacent to genes involved in chlorate reduction, suggesting the presence of repeat regions. Using a bioinformatics approach and finishing PCRs to connect fragmented contigs, we discovered that chlorate reduction genes are flanked by insertion sequences, forming composite transposons in all four newly sequenced strains. These insertion sequences delineate regions with the potential to move horizontally and define a set of genes that may be important for chlorate reduction. In addition to core metabolic components, we have highlighted several such genes through comparative analysis and visualization. Phylogenetic analysis places chlorate reductase within a functionally diverse clade of type II dimethyl sulfoxide (DMSO) reductases, part of a larger family of enzymes with reactivity toward chlorate. Nucleotide-level forensics of regions surrounding chlorite dismutase (cld), as well as its phylogenetic clustering in a betaproteobacterial Cld clade, indicate that cld has been mobilized at least once from a perchlorate reducer to build chlorate respiration. PMID:23919996
Next generation sequencing and its applications in forensic genetics.

PubMed

Børsting, Claus; Morling, Niels

2015-09-01

It has been almost a decade since the first next generation sequencing (NGS) technologies emerged and quickly changed the way genetic research is conducted. Today, full genomes are mapped and published almost weekly and with ever increasing speed and decreasing costs. NGS methods and platforms have matured during the last 10 years, and the quality of the sequences has reached a level where NGS is used in clinical diagnostics of humans. Forensic genetic laboratories have also explored NGS technologies and especially in the last year, there has been a small explosion in the number of scientific articles and presentations at conferences with forensic aspects of NGS. These contributions have demonstrated that NGS offers new possibilities for forensic genetic case work. More information may be obtained from unique samples in a single experiment by analyzing combinations of markers (STRs, SNPs, insertion/deletions, mRNA) that cannot be analyzed simultaneously with the standard PCR-CE methods used today. The true variation in core forensic STR loci has been uncovered, and previously unknown STR alleles have been discovered. The detailed sequence information may aid mixture interpretation and will increase the statistical weight of the evidence. In this review, we will give an introduction to NGS and single-molecule sequencing, and we will discuss the possible applications of NGS in forensic genetics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Molecular homogeneity of heat-stable enterotoxins produced by bovine enterotoxigenic Escherichia coli.

PubMed Central

Saeed, A M; Magnuson, N S; Sriranganathan, N; Burger, D; Cosand, W

1984-01-01

Heat-stable enterotoxins (STs) from four strains of bovine enterotoxigenic Escherichia coli representing four serogroups were purified to homogeneity by utilizing previously published purification schemata. Biochemical characterization of the purified STs showed that they met the basic criteria for the heat-stable enterotoxins of E. coli. Amino acid analysis of the purified STs revealed that they were peptides of identical amino acid composition. This composition consisted of 18 residues of 10 different amino acids, 6 of which were cysteine. The amino acid composition of the four ST peptides was identical to that reported for the STs of human and porcine E. coli. In addition, complete sequence analysis of two of the ST peptides and partial sequencing of several others revealed strong homology to the sequences of STs from human and porcine E. coli and to the sequence predicted from the last 18 codons of the transposon Tn1681. There was also substantial homology to the sequence predicted from the ST-coding genetic element of human E. coli, which may indicate the existence of an identical bioactive configuration among ST peptides of E. coli strains of various host origins. These data support the hypothesis that STs produced by human, bovine, and porcine E. coli are coded by a closely related genetic element which may have originated from a single, widely disseminated transposon. PMID:6376355

From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems.

PubMed

Garza, Daniel R; Dutilh, Bas E

2015-11-01

Microorganisms and the viruses that infect them are the most numerous biological entities on Earth and enclose its greatest biodiversity and genetic reservoir. With strength in their numbers, these microscopic organisms are major players in the cycles of energy and matter that sustain all life. Scientists have only scratched the surface of this vast microbial world through culture-dependent methods. Recent developments in generating metagenomes, large random samples of nucleic acid sequences isolated directly from the environment, are providing comprehensive portraits of the composition, structure, and functioning of microbial communities. Moreover, advances in metagenomic analysis have created the possibility of obtaining complete or nearly complete genome sequences from uncultured microorganisms, providing important means to study their biology, ecology, and evolution. Here we review some of the recent developments in the field of metagenomics, focusing on the discovery of genetic novelty and on methods for obtaining uncultured genome sequences, including through the recycling of previously published datasets. Moreover we discuss how metagenomics has become a core scientific tool to characterize eco-evolutionary patterns of microbial ecosystems, thus allowing us to simultaneously discover new microbes and study their natural communities. We conclude by discussing general guidelines and challenges for modeling the interactions between uncultured microorganisms and viruses based on the information contained in their genome sequences. These models will significantly advance our understanding of the functioning of microbial ecosystems and the roles of microbes in the environment.
Whole-exome sequencing identifies rare and low-frequency coding variants associated with LDL cholesterol.

PubMed

Lange, Leslie A; Hu, Youna; Zhang, He; Xue, Chenyi; Schmidt, Ellen M; Tang, Zheng-Zheng; Bizon, Chris; Lange, Ethan M; Smith, Joshua D; Turner, Emily H; Jun, Goo; Kang, Hyun Min; Peloso, Gina; Auer, Paul; Li, Kuo-Ping; Flannick, Jason; Zhang, Ji; Fuchsberger, Christian; Gaulton, Kyle; Lindgren, Cecilia; Locke, Adam; Manning, Alisa; Sim, Xueling; Rivas, Manuel A; Holmen, Oddgeir L; Gottesman, Omri; Lu, Yingchang; Ruderfer, Douglas; Stahl, Eli A; Duan, Qing; Li, Yun; Durda, Peter; Jiao, Shuo; Isaacs, Aaron; Hofman, Albert; Bis, Joshua C; Correa, Adolfo; Griswold, Michael E; Jakobsdottir, Johanna; Smith, Albert V; Schreiner, Pamela J; Feitosa, Mary F; Zhang, Qunyuan; Huffman, Jennifer E; Crosby, Jacy; Wassel, Christina L; Do, Ron; Franceschini, Nora; Martin, Lisa W; Robinson, Jennifer G; Assimes, Themistocles L; Crosslin, David R; Rosenthal, Elisabeth A; Tsai, Michael; Rieder, Mark J; Farlow, Deborah N; Folsom, Aaron R; Lumley, Thomas; Fox, Ervin R; Carlson, Christopher S; Peters, Ulrike; Jackson, Rebecca D; van Duijn, Cornelia M; Uitterlinden, André G; Levy, Daniel; Rotter, Jerome I; Taylor, Herman A; Gudnason, Vilmundur; Siscovick, David S; Fornage, Myriam; Borecki, Ingrid B; Hayward, Caroline; Rudan, Igor; Chen, Y Eugene; Bottinger, Erwin P; Loos, Ruth J F; Sætrom, Pål; Hveem, Kristian; Boehnke, Michael; Groop, Leif; McCarthy, Mark; Meitinger, Thomas; Ballantyne, Christie M; Gabriel, Stacey B; O'Donnell, Christopher J; Post, Wendy S; North, Kari E; Reiner, Alexander P; Boerwinkle, Eric; Psaty, Bruce M; Altshuler, David; Kathiresan, Sekar; Lin, Dan-Yu; Jarvik, Gail P; Cupples, L Adrienne; Kooperberg, Charles; Wilson, James G; Nickerson, Deborah A; Abecasis, Goncalo R; Rich, Stephen S; Tracy, Russell P; Willer, Cristen J

2014-02-06

Elevated low-density lipoprotein cholesterol (LDL-C) is a treatable, heritable risk factor for cardiovascular disease. Genome-wide association studies (GWASs) have identified 157 variants associated with lipid levels but are not well suited to assess the impact of rare and low-frequency variants. To determine whether rare or low-frequency coding variants are associated with LDL-C, we exome sequenced 2,005 individuals, including 554 individuals selected for extreme LDL-C (>98th or <2nd percentile). Follow-up analyses included sequencing of 1,302 additional individuals and genotype-based analysis of 52,221 individuals. We observed significant evidence of association between LDL-C and the burden of rare or low-frequency variants in PNPLA5, encoding a phospholipase-domain-containing protein, and both known and previously unidentified variants in PCSK9, LDLR and APOB, three known lipid-related genes. The effect sizes for the burden of rare variants for each associated gene were substantially higher than those observed for individual SNPs identified from GWASs. We replicated the PNPLA5 signal in an independent large-scale sequencing study of 2,084 individuals. In conclusion, this large whole-exome-sequencing study for LDL-C identified a gene not known to be implicated in LDL-C and provides unique insight into the design and analysis of similar experiments. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
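Selecting individuals from the tails of a trait distribution, as in the extreme LDL-C sampling described in this abstract, can be sketched as follows. The values are simulated (a hypothetical mean of 130 and standard deviation of 35, chosen only for illustration) and are not study data. The abstract does not say how the percentiles were computed, so the use of `statistics.quantiles` with its default method is an assumption of this example.

```python
import random
import statistics

# Simulated trait values; NOT data from the study.
rng = random.Random(0)
ldl_values = [rng.gauss(130.0, 35.0) for _ in range(1000)]

# With n=100, quantiles() returns 99 cut points; index 1 is the
# 2nd percentile and index 97 is the 98th percentile.
cuts = statistics.quantiles(ldl_values, n=100)
low_cut, high_cut = cuts[1], cuts[97]

# Keep only values below the 2nd or above the 98th percentile.
extremes = [v for v in ldl_values if v < low_cut or v > high_cut]

print(f"{len(extremes)} of {len(ldl_values)} simulated values fall in the tails")
```

Roughly 4% of the simulated values are retained, which is what a two-sided 2% cut-off implies.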
Cytochrome cd1-containing nitrite reductase encoding gene nirS as a new functional biomarker for detection of anaerobic ammonium oxidizing (Anammox) bacteria.

PubMed

Li, Meng; Ford, Tim; Li, Xiaoyan; Gu, Ji-Dong

2011-04-15

A newly designed primer set (AnnirS), together with a previously published primer set (ScnirS), was used to detect anammox bacterial nirS genes from sediments collected from three marine environments. Phylogenetic analysis demonstrated that all retrieved sequences were clearly different from typical denitrifiers' nirS, but grouped together with the known anammox bacterial nirS. Sequences targeted by ScnirS are closely related to Scalindua nirS genes recovered from the Peruvian oxygen minimum zone (OMZ), whereas sequences targeted by AnnirS are more closely affiliated with the nirS of Candidatus 'Kuenenia stuttgartiensis' and even form a new phylogenetic nirS clade, which might be related to other genera of the anammox bacteria. Analysis demonstrated that retrieved sequences had higher sequence identities (>60%) with known anammox bacterial nirS genes than with denitrifiers' nirS, on both nucleotide and amino acid levels. Compared to the 16S rRNA and hydrazine oxidoreductase (hzo) genes, the anammox bacterial nirS not only showed consistent phylogenetic relationships but also demonstrated more reliable quantification of anammox bacteria because of the single copy of the nirS gene in the anammox bacterial genome and the specificity of PCR primers for different genera of anammox bacteria, thus providing a suitable functional biomarker for investigation of anammox bacteria.
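The sequence identities mentioned in this abstract are pairwise measures. A minimal sketch of one common way to compute percent identity is shown below, assuming the two sequences are already aligned to equal length with `-` marking gaps, and counting only columns where neither sequence has a gap. The abstract does not state which convention was used, and the example sequences and the function name `percent_identity` are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns where both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented toy alignment: 12 ungapped columns, 10 of them identical.
query = "ATGGCT-AAGTTC"
reference = "ATGACTGAAGTAC"
identity = percent_identity(query, reference)
print(f"{identity:.1f}% identity")
```

Other conventions (for example, dividing by the full alignment length) give different values, so any threshold should be read together with the definition used.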
Brief Report: Cryopyrin-Associated Periodic Syndrome Caused by a Myeloid-Restricted Somatic NLRP3 Mutation.

PubMed

Zhou, Qing; Aksentijevich, Ivona; Wood, Geryl M; Walts, Avram D; Hoffmann, Patrycja; Remmers, Elaine F; Kastner, Daniel L; Ombrello, Amanda K

2015-09-01

To identify the cause of disease in an adult patient presenting with recent-onset fevers, chills, urticaria, fatigue, and profound myalgia, who was found to be negative for cryopyrin-associated periodic syndrome (CAPS) NLRP3 mutations by conventional Sanger DNA sequencing. We performed whole-exome sequencing and targeted deep sequencing using DNA from the patient's whole blood to identify a possible NLRP3 somatic mutation. We then screened for this mutation in subcloned NLRP3 amplicons from fibroblasts, buccal cells, granulocytes, negatively selected monocytes, and T and B lymphocytes and further confirmed the somatic mutation by targeted sequencing of exon 3. We identified a previously reported CAPS-associated mutation, p.Tyr570Cys, with a mutant allele frequency of 15% based on exome data. Targeted sequencing and subcloning of NLRP3 amplicons confirmed the presence of the somatic mutation in whole blood at a ratio similar to the exome data. The mutant allele frequency was in the range of 13.3-16.8% in monocytes and 15.2-18% in granulocytes. Notably, this mutation was either absent or present at a very low frequency in B and T lymphocytes, in buccal cells, and in the patient's cultured fibroblasts. Our findings indicate the possibility of myeloid-restricted somatic mosaicism in the pathogenesis of CAPS, underscoring the emerging role of massively parallel sequencing in clinical diagnosis. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
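A mutant allele frequency of the kind reported in this abstract is commonly calculated as the fraction of sequencing reads covering a site that carry the variant. The sketch below shows that calculation under that assumption; the read counts are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
def mutant_allele_frequency(variant_reads: int, reference_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    depth = variant_reads + reference_reads
    if depth == 0:
        raise ValueError("no reads cover this position")
    return variant_reads / depth


# Hypothetical (variant, reference) read counts per sample type.
hypothetical_counts = {
    "whole blood": (45, 255),
    "B lymphocytes": (1, 399),
}

for sample, (variant, reference) in hypothetical_counts.items():
    freq = mutant_allele_frequency(variant, reference)
    print(f"{sample}: {freq:.1%}")
```

In practice such estimates depend on read depth and sequencing error, which is one reason targeted deep sequencing is used to confirm low-frequency somatic variants.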
Commonly-occurring polymorphisms in the COMT, DRD1 and DRD2 genes influence different aspects of motor sequence learning in humans.

PubMed

Baetu, Irina; Burns, Nicholas R; Urry, Kristi; Barbante, Girolamo Giovanni; Pitcher, Julia B

2015-11-01

Performing sequences of movements is a ubiquitous skill that involves dopamine transmission. However, it is unclear which components of the dopamine system contribute to which aspects of motor sequence learning. Here we used a genetic approach to investigate the relationship between different components of the dopamine system and specific aspects of sequence learning in humans. In particular, we investigated variations in genes that code for the catechol-O-methyltransferase (COMT) enzyme, the dopamine transporter (DAT) and dopamine D1 and D2 receptors (DRD1 and DRD2). COMT and the DAT regulate dopamine availability in the prefrontal cortex and the striatum, respectively, two key regions recruited during learning, whereas dopamine D1 and D2 receptors are thought to be involved in long-term potentiation and depression, respectively. We show that polymorphisms in the COMT, DRD1 and DRD2 genes differentially affect behavioral performance on a sequence learning task in 161 Caucasian participants. The DRD1 polymorphism predicted the ability to learn new sequences, the DRD2 polymorphism predicted the ability to perform a previously learnt sequence after performing interfering random movements, whereas the COMT polymorphism predicted the ability to switch flexibly between two sequences. We used computer simulations to explore potential mechanisms underlying these effects, which revealed that the DRD1 and DRD2 effects are possibly related to neuroplasticity. Our prediction-error algorithm estimated faster rates of connection strengthening in genotype groups with presumably higher D1 receptor densities, and faster rates of connection weakening in genotype groups with presumably higher D2 receptor densities. Consistent with current dopamine theories, these simulations suggest that D1-mediated neuroplasticity contributes to learning to select appropriate actions, whereas D2-mediated neuroplasticity is involved in learning to inhibit incorrect action plans. 
However, the learning algorithm did not account for the COMT effect, suggesting that prefrontal dopamine availability might affect sequence switching via other, non-learning, mechanisms. These findings provide insight into the function of the dopamine system, which is relevant to the development of treatments for disorders such as Parkinson's disease. Our results suggest that treatments targeting dopamine D1 receptors may improve learning of novel sequences, whereas those targeting dopamine D2 receptors may improve the ability to initiate previously learned sequences of movements. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
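The abstract refers to a prediction-error algorithm with separate rates of connection strengthening and weakening but gives no equations. The sketch below is a generic delta-rule update with two learning rates, offered only to make the idea concrete; it is not the authors' model, and the names `rate_strengthen` and `rate_weaken` and all parameter values are hypothetical.

```python
def update_weight(weight: float, outcome: float,
                  rate_strengthen: float, rate_weaken: float) -> float:
    """One prediction-error update with asymmetric learning rates."""
    error = outcome - weight  # prediction error
    # Positive errors strengthen the connection, negative errors weaken it.
    rate = rate_strengthen if error > 0 else rate_weaken
    return weight + rate * error


def simulate(outcomes, rate_strengthen, rate_weaken, weight=0.0):
    """Apply the update over a series of trial outcomes."""
    history = []
    for outcome in outcomes:
        weight = update_weight(weight, outcome, rate_strengthen, rate_weaken)
        history.append(weight)
    return history


# Five reinforced trials followed by five non-reinforced trials.
trials = [1.0] * 5 + [0.0] * 5
fast_strengthen = simulate(trials, rate_strengthen=0.5, rate_weaken=0.1)
slow_strengthen = simulate(trials, rate_strengthen=0.2, rate_weaken=0.1)

print(f"after acquisition: {fast_strengthen[4]:.3f} vs {slow_strengthen[4]:.3f}")
```

A higher strengthening rate brings the weight closer to the outcome during the first five trials, which is the kind of between-group difference a fitted learning rate would capture.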
Evolution of the bamboos (Bambusoideae; Poaceae): a full plastome phylogenomic analysis.

PubMed

Wysocki, William P; Clark, Lynn G; Attigala, Lakshmi; Ruiz-Sanchez, Eduardo; Duvall, Melvin R

2015-03-18

Bambusoideae (Poaceae) comprise three distinct and well-supported lineages: tropical woody bamboos (Bambuseae), temperate woody bamboos (Arundinarieae) and herbaceous bamboos (Olyreae). Phylogenetic studies using chloroplast markers have generally supported a sister relationship between Bambuseae and Olyreae. This suggests either at least two origins of the woody bamboo syndrome in this subfamily or its loss in Olyreae. Here a full chloroplast genome (plastome) phylogenomic study is presented using the coding and noncoding regions of 13 complete plastomes from the Bambuseae, eight from Olyreae and 10 from Arundinarieae. Trees generated using full plastome sequences support the previously recovered monophyletic relationship between Bambuseae and Olyreae. In addition to these relationships, several unique plastome features are uncovered including the first mitogenome-to-plastome horizontal gene transfer observed in monocots. Phylogenomic agreement with previously published phylogenies reinforces the validity of these studies. Additionally, this study presents the first published plastomes from Neotropical woody bamboos and the first full plastome phylogenomic study performed within the herbaceous bamboos. Although the phylogenomic tree presented in this study is largely robust, additional studies using nuclear genes support monophyly in woody bamboos as well as hybridization among previous woody bamboo lineages. The evolutionary history of the Bambusoideae could be further clarified using transcriptomic techniques to increase sampling among nuclear orthologues and investigate the molecular genetics underlying the development of woody and floral tissues.
Higher Levels of Neanderthal Ancestry in East Asians than in Europeans

PubMed Central

Wall, Jeffrey D.; Yang, Melinda A.; Jay, Flora; Kim, Sung K.; Durand, Eric Y.; Stevison, Laurie S.; Gignoux, Christopher; Woerner, August; Hammer, Michael F.; Slatkin, Montgomery

2013-01-01

Neanderthals were a group of archaic hominins that occupied most of Europe and parts of Western Asia from ∼30,000 to 300,000 years ago (KYA). They coexisted with modern humans during part of this time. Previous genetic analyses that compared a draft sequence of the Neanderthal genome with genomes of several modern humans concluded that Neanderthals made a small (1–4%) contribution to the gene pools of all non-African populations. This observation was consistent with a single episode of admixture from Neanderthals into the ancestors of all non-Africans when the two groups coexisted in the Middle East 50–80 KYA. We examined the relationship between Neanderthals and modern humans in greater detail by applying two complementary methods to the published draft Neanderthal genome and an expanded set of high-coverage modern human genome sequences. We find that, consistent with the recent finding of Meyer et al. (2012), Neanderthals contributed more DNA to modern East Asians than to modern Europeans. Furthermore we find that the Maasai of East Africa have a small but significant fraction of Neanderthal DNA. Because our analysis is of several genomic samples from each modern human population considered, we are able to document the extent of variation in Neanderthal ancestry within and among populations. Our results combined with those previously published show that a more complex model of admixture between Neanderthals and modern humans is necessary to account for the different levels of Neanderthal ancestry among human populations. In particular, at least some Neanderthal–modern human admixture must postdate the separation of the ancestors of modern European and modern East Asian populations. PMID:23410836
PURA syndrome: clinical delineation and genotype-phenotype study in 32 individuals with review of published literature

PubMed Central

Reijnders, Margot R F; Janowski, Robert; Alvi, Mohsan; Self, Jay E; van Essen, Ton J; Vreeburg, Maaike; Rouhl, Rob P W; Stevens, Servi J C; Stegmann, Alexander P A; Schieving, Jolanda; Pfundt, Rolph; van Dijk, Katinke; Smeets, Eric; Stumpel, Connie T R M; Bok, Levinus A; Cobben, Jan Maarten; Engelen, Marc; Mansour, Sahar; Whiteford, Margo; Chandler, Kate E; Douzgou, Sofia; Cooper, Nicola S; Tan, Ene-Choo; Foo, Roger; Lai, Angeline H M; Rankin, Julia; Green, Andrew; Lönnqvist, Tuula; Isohanni, Pirjo; Williams, Shelley; Ruhoy, Ilene; Carvalho, Karen S; Dowling, James J; Lev, Dorit L; Sterbova, Katalin; Lassuthova, Petra; Neupauerová, Jana; Waugh, Jeff L; Keros, Sotirios; Clayton-Smith, Jill; Smithson, Sarah F; Brunner, Han G; van Hoeckel, Ceciel; Anderson, Mel; Clowes, Virginia E; Siu, Victoria Mok; DDD study, The; Selber, Paulo; Leventer, Richard J; Nellaker, Christoffer; Niessing, Dierk; Hunt, David; Baralle, Diana

2018-01-01

Background: De novo mutations in PURA have recently been described to cause PURA syndrome, a neurodevelopmental disorder characterised by severe intellectual disability (ID), epilepsy, feeding difficulties and neonatal hypotonia. Objectives: To delineate the clinical spectrum of PURA syndrome and study genotype-phenotype correlations. Methods: Diagnostic or research-based exome or Sanger sequencing was performed in individuals with ID. We systematically collected clinical and mutation data on newly ascertained PURA syndrome individuals, evaluated data of previously reported individuals and performed a computational analysis of photographs. We classified mutations based on predicted effect using 3D in silico models of crystal structures of Drosophila-derived Pur-alpha homologues. Finally, we explored genotype-phenotype correlations by analysis of both recurrent mutations as well as mutation classes. Results: We report mutations in PURA (purine-rich element binding protein A) in 32 individuals, the largest cohort described so far. Evaluation of clinical data, including 22 previously published cases, revealed that all have moderate to severe ID and neonatal-onset symptoms, including hypotonia (96%), respiratory problems (57%), feeding difficulties (77%), exaggerated startle response (44%), hypersomnolence (66%) and hypothermia (35%). Epilepsy (54%) and gastrointestinal (69%), ophthalmological (51%) and endocrine problems (42%) were observed frequently. Computational analysis of facial photographs showed subtle facial dysmorphism. No strong genotype-phenotype correlation was identified by subgrouping mutations into functional classes. Conclusion: We delineate the clinical spectrum of PURA syndrome with the identification of 32 additional individuals. The identification of one individual through targeted Sanger sequencing points towards the clinical recognisability of the syndrome. Genotype-phenotype analysis showed no significant correlation between mutation classes and disease severity. PMID:29097605
Investigation of taxa of the family Pasteurellaceae isolated from Syrian and European hamsters and proposal of Mesocricetibacter intestinalis gen. nov., sp. nov. and Cricetibacter osteomyelitidis gen. nov., sp. nov.

PubMed

Christensen, H; Nicklas, W; Bisgaard, M

2014-11-01

Eleven strains from hamster of Bisgaard taxa 23 and 24, also referred to as Krause's groups 2 and 1, respectively, were investigated by a polyphasic approach including data published previously. Strains showed small, regular and circular colonies with smooth and shiny appearance, typical of members of the family Pasteurellaceae. The strains formed two monophyletic groups based on 16S rRNA gene sequence comparison to other members of the family Pasteurellaceae. Partial rpoB sequencing as well as published data on DNA-DNA hybridization showed high genotypic relationships within both groups. Menaquinone 7 (MK7) was found in strains of both groups as well as an unknown ubiquinone with shorter chain length than previously reported for any other member of the family Pasteurellaceae. A new genus with one species, Mesocricetibacter intestinalis gen. nov., sp. nov., is proposed to accommodate members of taxon 24 of Bisgaard whereas members of taxon 23 of Bisgaard are proposed to represent Cricetibacter osteomyelitidis gen. nov., sp. nov. Major fatty acids of type strains of type species of both genera are C14:0, C14:0 3-OH/iso-C16:1 I, C16:1 ω7c and C16:0. The two genera are clearly separated by phenotype from each other and from existing genera of the family Pasteurellaceae. The type strain of Mesocricetibacter intestinalis is HIM 933/7(T) (= Kunstyr 246/85(T) = CCUG 28030(T) = DSM 28403(T)) while the type strain of Cricetibacter osteomyelitidis is HIM943/7(T) (= Kunstyr 507/85(T) = CCUG 36451(T) = DSM 28404(T)). © 2014 IUMS.
No evidence of persisting measles virus in peripheral blood mononuclear cells from children with autism spectrum disorder.

PubMed

D'Souza, Yasmin; Fombonne, Eric; Ward, Brian J

2006-10-01

Despite epidemiologic evidence to the contrary, claims of an association between measles-mumps-rubella vaccination and the development of autism have persisted. Such claims are based primarily on the identification of measles virus nucleic acids in tissues and body fluids by polymerase chain reaction. We sought to determine whether measles virus nucleic acids persist in children with autism spectrum disorder compared with control children. Peripheral blood mononuclear cells were isolated from 54 children with autism spectrum disorder and 34 developmentally normal children, and up to 4 real-time polymerase chain reaction assays and 2 nested polymerase chain reaction assays were performed. These assays targeted the nucleoprotein, fusion, and hemagglutinin genes of measles virus using previously published primer pairs with detection by SYBR green I. Our own real-time assay targeted the fusion gene using novel primers and an internal fluorescent probe. Positive reactions were evaluated rigorously, and amplicons were sequenced. Finally, anti-measles antibody titers were measured by enzyme immunoassay. The real-time assays based on previously published primers gave rise to a large number of positive reactions in both autism spectrum disorder and control samples. Almost all of the positive reactions in these assays were eliminated by evaluation of melting curves and amplicon band size. The amplicons for the remaining positive reactions were cloned and sequenced. No sample from either autism spectrum disorder or control groups was found to contain nucleic acids from any measles virus gene. In the nested polymerase chain reaction and in-house assays, none of the samples yielded positive results. Furthermore, there was no difference in anti-measles antibody titers between the autism and control groups. There is no evidence of measles virus persistence in the peripheral blood mononuclear cells of children with autism spectrum disorder.
Advances in Exercise, Fitness, and Performance Genomics in 2011

PubMed Central

Roth, Stephen M.; Rankinen, Tuomo; Hagberg, James M.; Loos, Ruth J. F.; Pérusse, Louis; Sarzynski, Mark A.; Wolfarth, Bernd; Bouchard, Claude

2014-01-01

This review of the exercise genomics literature emphasizes the highest quality papers published in 2011. Given this emphasis on the best publications, only a small number of published papers are reviewed. One study found that physical activity levels were significantly lower in patients with mitochondrial DNA mutations compared to controls. A two-stage fine mapping follow-up of a previous linkage peak found strong associations between sequence variation in the activin A receptor, type-1B (ACVR1B) gene and knee extensor strength, with rs2854464 emerging as the most promising candidate polymorphism. The association of higher muscular strength with the rs2854464 A-allele was confirmed in two separate cohorts. A study using a combination of transcriptomic and genomic data identified a comprehensive map of the transcriptomic features important for aerobic exercise training-induced improvements in maximal oxygen consumption, but no genetic variants derived from candidate transcripts were associated with trainability. A large-scale de novo meta-analysis confirmed that the effect of sequence variation in the fat mass and obesity-associated (FTO) gene on the risk of obesity differs between sedentary and physically active adults. Evidence for gene-physical activity interactions on type 2 diabetes risk was found in two separate studies. A large study of women found that physical activity modified the effect of polymorphisms in the lipoprotein lipase (LPL), hepatic lipase (LIPC), and cholesteryl ester transfer protein (CETP) genes, identified in previous genome-wide association study (GWAS) reports, on HDL-C. We conclude that a strong exercise genomics corpus of evidence would not only translate into powerful genomic predictors but would also have a major impact on exercise biology and exercise behavior research. PMID:22330029
Whole mitochondrial and plastid genome SNP analysis of nine date palm cultivars reveals plastid heteroplasmy and close phylogenetic relationships among cultivars.

PubMed

Sabir, Jamal S M; Arasappan, Dhivya; Bahieldin, Ahmed; Abo-Aba, Salah; Bafeel, Sameera; Zari, Talal A; Edris, Sherif; Shokry, Ahmed M; Gadalla, Nour O; Ramadan, Ahmed M; Atef, Ahmed; Al-Kordy, Magdy A; El-Domyati, Fotoh M; Jansen, Robert K

2014-01-01

Date palm is a very important crop in western Asia and northern Africa, and it is the oldest domesticated fruit tree with archaeological records dating back 5000 years. The huge economic value of this crop has generated considerable interest in breeding programs to enhance production of dates. One of the major limitations of these efforts is the uncertainty regarding the number of date palm cultivars, which are currently based on fruit shape, size, color, and taste. Whole mitochondrial and plastid genome sequences were utilized to examine single nucleotide polymorphisms (SNPs) of date palms to evaluate the efficacy of this approach for molecular characterization of cultivars. Mitochondrial and plastid genomes of nine Saudi Arabian cultivars were sequenced. For each species about 60 million 100 bp paired-end reads were generated from total genomic DNA using the Illumina HiSeq 2000 platform. For each cultivar, sequences were aligned separately to the published date palm plastid and mitochondrial reference genomes, and SNPs were identified. The results identified cultivar-specific SNPs for eight of the nine cultivars. Two previous SNP analyses of mitochondrial and plastid genomes identified substantial intra-cultivar (= intra-varietal) polymorphisms in organellar genomes but these studies did not properly take into account the fact that nearly half of the plastid genome has been integrated into the mitochondrial genome. Filtering all sequencing reads that mapped to both organellar genomes nearly eliminated mitochondrial heteroplasmy but all plastid SNPs remained heteroplasmic. This investigation provides valuable insights into how to deal with interorganellar DNA transfer in performing SNP analyses from total genomic DNA. The results confirm recent suggestions that plastid heteroplasmy is much more common than previously thought. Finally, low levels of sequence variation in plastid and mitochondrial genomes argue for using nuclear SNPs for molecular characterization of date palm cultivars.
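The filtering step mentioned in this abstract (discarding reads that map to both organellar genomes) can be illustrated with a minimal sketch. The abstract does not say which tools were used, so the function name `drop_shared_reads` and the toy read IDs below are hypothetical; in practice the ID lists would presumably be extracted from the two alignment outputs.

```python
def drop_shared_reads(plastid_hits, mito_hits):
    """Split read IDs into plastid-only, mitochondrion-only, and shared sets.

    plastid_hits / mito_hits: iterables of read IDs that aligned to the
    plastid and mitochondrial references (hypothetical inputs).
    """
    plastid_hits = set(plastid_hits)
    mito_hits = set(mito_hits)
    # Reads hitting both references are ambiguous (e.g. plastid DNA that
    # has been transferred into the mitochondrial genome).
    shared = plastid_hits & mito_hits
    return plastid_hits - shared, mito_hits - shared, shared


# Toy example with made-up read IDs.
plastid_only, mito_only, shared = drop_shared_reads(
    ["r1", "r2", "r3"], ["r3", "r4"]
)
print(sorted(plastid_only), sorted(mito_only), sorted(shared))
```

SNP calling would then be run on the plastid-only and mitochondrion-only reads, so that transferred plastid sequence does not mimic heteroplasmy.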
Phylogenetic relationships and timing of diversification in gonorynchiform fishes inferred using nuclear gene DNA sequences (Teleostei: Ostariophysi).

PubMed

Near, Thomas J; Dornburg, Alex; Friedman, Matt

2014-11-01

The Gonorynchiformes are the sister lineage of the species-rich Otophysi and provide important insights into the diversification of ostariophysan fishes. Phylogenies of gonorynchiforms inferred using morphological characters and mtDNA gene sequences provide differing resolutions with regard to the sister lineage of all other gonorynchiforms (Chanos vs. Gonorynchus) and support for monophyly of the two miniaturized lineages Cromeria and Grasseichthys. In this study the phylogeny and divergence times of gonorynchiforms are investigated with DNA sequences sampled from nine nuclear genes and a published morphological character matrix. Bayesian phylogenetic analyses reveal substantial congruence among individual gene trees with inferences from eight genes placing Gonorynchus as the sister lineage to all other gonorynchiforms. Seven gene trees resolve Cromeria and Grasseichthys as a clade, supporting previous inferences using morphological characters. Phylogenies resulting from either concatenating the nuclear genes, performing a multispecies coalescent species tree analysis, or combining the morphological and nuclear gene DNA sequences resolve Gonorynchus as the living sister lineage of all other gonorynchiforms, strongly support the monophyly of Cromeria and Grasseichthys, and resolve a clade containing Parakneria, Cromeria, and Grasseichthys. The morphological dataset, which includes 13 gonorynchiform fossil taxa that range in age from Early Cretaceous to Eocene, was analyzed in combination with DNA sequences from the nine nuclear genes and a relaxed molecular clock to estimate times of evolutionary divergence. This "tip dating" strategy accommodates uncertainty in the phylogenetic resolution of fossil taxa that provide calibration information in the relaxed molecular clock analysis. The estimated age of the most recent common ancestor (MRCA) of living gonorynchiforms is slightly older than estimates from previous node dating efforts, but the molecular tip dating estimated ages of Kneriinae (Kneria, Parakneria, Cromeria, and Grasseichthys) and the two paedomorphic lineages, Cromeria and Grasseichthys, are considerably younger. Copyright © 2014 Elsevier Inc. All rights reserved.
APOBEC3H haplotypes and HIV-1 pro-viral vif DNA sequence diversity in early untreated human immunodeficiency virus-1 infection.

PubMed

Gourraud, P A; Karaouni, A; Woo, J M; Schmidt, T; Oksenberg, J R; Hecht, F M; Liegler, T J; Barbour, J D

2011-03-01

We examined single nucleotide polymorphisms (SNP) in the APOBEC3 locus on chromosome 22, paired with population sequences of pro-viral human immunodeficiency virus-1 (HIV-1) vif from peripheral blood mononuclear cells, from 96 recently HIV-1-infected treatment-naive adults. We found evidence for the existence of an APOBEC3H linkage disequilibrium (LD) block associated with variation in GA → AA, or APOBEC3F/H signature, sequence changes in pro-viral HIV-1 vif sequence (top 10 significant SNPs with a significant p = 4.8 × 10⁻³). We identified a common five position risk haplotype distal to APOBEC3H (A3Hrh). These markers were in high LD (D' = 1; r² = 0.98) to a previously described A3H "RED" haplotype containing a variant (E121) with enhanced susceptibility to HIV-1 Vif. This association was confirmed by a haplotype analysis. Homozygote carriers of the A3Hrh had lower GA → AA (A3F/H) sequence editing upon pro-viral HIV-1 vif sequence (p = 0.01), and lower HIV-1 RNA levels over time during early, untreated HIV-1 infection (p = 0.015, mixed effects model). This effect may be due to enhanced susceptibility of A3H forms to HIV-1 Vif mediated viral suppression of sequence editing activity, slowing viral diversification and escape from immune responses. Copyright © 2011 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.
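For readers unfamiliar with the two linkage disequilibrium measures quoted above (D' and r²), the sketch below computes them from the standard textbook definitions for two biallelic loci. It is illustrative only: the function name `ld_stats` and the toy frequencies are assumptions, and nothing here reproduces the study's own data or software.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom <= 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    # D is scaled by the largest magnitude it can reach at these frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom
    return d_prime, r2


# Toy frequencies: allele A occurs only on B-bearing haplotypes,
# but B is more common than A.
d_prime, r2 = ld_stats(p_ab=0.3, p_a=0.3, p_b=0.5)
print(round(d_prime, 3), round(r2, 3))
```

In this toy case D' reaches 1 while r² stays well below 1, because the allele frequencies differ. A pair of values such as D' = 1 with r² = 0.98 therefore indicates both complete association and nearly matching allele frequencies.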
Whole-exome sequencing identified a variant in EFTUD2 gene in establishing a genetic diagnosis.

PubMed

Rengasamy Venugopalan, S; Farrow, E G; Lypka, M

2017-06-01

Craniofacial anomalies are complex and have an overlapping phenotype. Mandibulofacial Dysostosis and Oculo-Auriculo-Vertebral Spectrum are conditions that share common craniofacial phenotype and present a challenge in arriving at a diagnosis. In this report, we present a case of female proband who was given a differential diagnosis of Treacher Collins syndrome or Hemifacial Microsomia without certainty. Prior genetic testing reported negative for 22q deletion and FGFR screenings. The objective of this study was to demonstrate the critical role of whole-exome sequencing in establishing a genetic diagnosis of the proband. The participants were 14½-year-old affected female proband/parent trio. Proband/parent trio were enrolled in the study. Surgical tissue sample from the proband and parental blood samples were collected and prepared for whole-exome sequencing. Illumina HiSeq 2500 instrument was used for sequencing (125 nucleotide reads/84X coverage). Analyses of variants were performed using custom-developed software, RUNES and VIKING. Variant analyses following whole-exome sequencing identified a heterozygous de novo pathogenic variant, c.259C>T (p.Gln87*), in EFTUD2 (NM_004247.3) gene in the proband. Previous studies have reported that the variants in EFTUD2 gene were associated with Mandibulofacial Dysostosis with Microcephaly. Patients with facial asymmetry, micrognathia, choanal atresia and microcephaly should be analyzed for variants in EFTUD2 gene. Next-generation sequencing techniques, such as whole-exome sequencing offer great promise to improve the understanding of etiologies of sporadic genetic diseases. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Microbiome analysis of dairy cows fed pasture or total mixed ration diets.

PubMed

de Menezes, Alexandre B; Lewis, Eva; O'Donovan, Michael; O'Neill, Brendan F; Clipson, Nicholas; Doyle, Evelyn M

2011-11-01

Understanding rumen microbial ecology is essential for the development of feed systems designed to improve livestock productivity and health, and for methane mitigation strategies for cattle. Although rumen microbial communities have been studied previously, few studies have applied next-generation sequencing technologies to that ecosystem. The aim of this study was to characterize changes in microbial community structure arising from feeding dairy cows two widely used diets: pasture and total mixed ration (TMR). Bacterial, archaeal and protozoal communities were characterized by terminal restriction fragment length polymorphism of the amplified SSU rRNA gene and statistical analysis showed that bacterial and archaeal communities were significantly affected by diet, whereas no effect was observed for the protozoal community. Deep amplicon sequencing of the 16S rRNA gene revealed significant differences in the bacterial communities between the diets and between rumen solid and liquid content. At the family level, some important groups of rumen bacteria were clearly associated with specific diets, including the higher abundance of the Fibrobacteraceae in TMR solid samples and members of the propionate-producing Veillonellaceae in pasture samples. This study will be relevant to the study of rumen microbial ecology and livestock feed management. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.
Genome editing of bread wheat using biolistic delivery of CRISPR/Cas9 in vitro transcripts or ribonucleoproteins.

PubMed

Liang, Zhen; Chen, Kunling; Zhang, Yi; Liu, Jinxing; Yin, Kangquan; Qiu, Jin-Long; Gao, Caixia

2018-03-01

This protocol is an extension to: Nat. Protoc. 9, 2395-2410 (2014); doi:10.1038/nprot.2014.157; published online 18 September 2014. In recent years, CRISPR/Cas9 has emerged as a powerful tool for improving crop traits. Conventional plant genome editing mainly relies on plasmid-carrying cassettes delivered by Agrobacterium or particle bombardment. Here, we describe DNA-free editing of bread wheat by delivering in vitro transcripts (IVTs) or ribonucleoprotein complexes (RNPs) of CRISPR/Cas9 by particle bombardment. This protocol serves as an extension of our previously published protocol on genome editing in bread wheat using CRISPR/Cas9 plasmids delivered by particle bombardment. The methods we describe not only eliminate random integration of CRISPR/Cas9 into genomic DNA, but also reduce off-target effects. In this protocol extension article, we present detailed protocols for preparation of IVTs and RNPs; validation by PCR/restriction enzyme (RE) and next-generation sequencing; delivery by biolistics; and recovery of mutants and identification of mutants by pooling methods and Sanger sequencing. To use these protocols, researchers should have basic skills and experience in molecular biology and biolistic transformation. By using these protocols, plants edited without the use of any foreign DNA can be generated and identified within 9-11 weeks.
Less frequently mutated genes in colorectal cancer: evidences from next-generation sequencing of 653 routine cases.

PubMed

Malapelle, Umberto; Pisapia, Pasquale; Sgariglia, Roberta; Vigliar, Elena; Biglietto, Maria; Carlomagno, Chiara; Giuffrè, Giuseppe; Bellevicine, Claudio; Troncone, Giancarlo

2016-09-01

The incidence of RAS/RAF/PI3KA and TP53 gene mutations in colorectal cancer (CRC) is well established. Less information, however, is available on other components of the CRC genomic landscape, which are potential CRC prognostic/predictive markers. Following a previous validation study, ion-semiconductor next-generation sequencing (NGS) was employed to process 653 routine CRC samples by a multiplex PCR targeting 91 hotspot regions in 22 CRC significant genes. A total of 796 somatic mutations in 499 (76.4%) tumours were detected. Besides RAS/RAF/PI3KA and TP53, 12 other genes showed at least one mutation, including FBXW7 (6%), PTEN (2.8%), SMAD4 (2.1%), EGFR (1.2%), CTNNB1 (1.1%), AKT1 (0.9%), STK11 (0.8%), ERBB2 (0.6%), ERBB4 (0.6%), ALK (0.2%), MAP2K1 (0.2%) and NOTCH1 (0.2%). In a routine diagnostic setting, NGS had the potential to generate robust and comprehensive genetic information, including on less frequently mutated genes potentially relevant for prognostic assessments or for actionable treatments. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Evaluation of the PCR method for identification of Bifidobacterium species.

PubMed

Youn, S Y; Seo, J M; Ji, G E

2008-01-01

Bifidobacterium species are known for their beneficial effects on health and their wide use as probiotics. Although various polymerase chain reaction (PCR) methods for the identification of Bifidobacterium species have been published, the reliability of these methods remains open to question. In this study, we evaluated 37 previously reported PCR primer sets designed to amplify 16S rDNA, 23S rDNA, intergenic spacer regions, or repetitive DNA sequences of various Bifidobacterium species. Ten of 37 experimental primer sets showed specificity for B. adolescentis, B. angulatum, B. pseudocatenulatum, B. breve, B. bifidum, B. longum, B. longum biovar infantis and B. dentium. The results suggest that published Bifidobacterium primer sets should be re-evaluated for both reproducibility and specificity for the identification of Bifidobacterium species using PCR. Improvement of existing PCR methods will be needed to facilitate identification of other Bifidobacterium strains, such as B. animalis, B. catenulatum, B. thermophilum and B. subtile.
Harvesting of novel polyhydroxyalkanoate (PHA) synthase encoding genes from a soil metagenome library using phenotypic screening.

PubMed

Schallmey, Marcus; Ly, Anh; Wang, Chunxia; Meglei, Gabriela; Voget, Sonja; Streit, Wolfgang R; Driscoll, Brian T; Charles, Trevor C

2011-08-01

We previously reported the construction of metagenomic libraries in the IncP cosmid vector pRK7813, enabling heterologous expression of these broad-host-range libraries in multiple bacterial hosts. Expressing these libraries in Sinorhizobium meliloti, we have successfully complemented associated phenotypes of polyhydroxyalkanoate synthesis mutants. DNA sequence analysis of three clones indicates that the complementing genes are homologous to, but substantially different from, known polyhydroxyalkanoate synthase-encoding genes. Thus we have demonstrated the ability to isolate diverse genes for polyhydroxyalkanoate synthesis by functional complementation of defined mutants. Such genes might be of use in the engineering of more efficient systems for the industrial production of bioplastics. The use of functional complementation will also provide a vehicle to probe the genetics of polyhydroxyalkanoate metabolism and its relation to carbon availability in complex microbial assemblages. © 2011 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. All rights reserved.

Classification of European mtDNAs from an Analysis of Three European Populations

PubMed Central

Torroni, A.; Huoponen, K.; Francalacci, P.; Petrozzi, M.; Morelli, L.; Scozzari, R.; Obinu, D.; Savontaus, M. L.; Wallace, D. C.

1996-01-01

Mitochondrial DNA (mtDNA) sequence variation was examined in Finns, Swedes and Tuscans by PCR amplification and restriction analysis. About 99% of the mtDNAs were subsumed within 10 mtDNA haplogroups (H, I, J, K, M, T, U, V, W, and X), suggesting that the identified haplogroups could encompass virtually all European mtDNAs. Because both hypervariable segments of the mtDNA control region were previously sequenced in the Tuscan samples, the mtDNA haplogroups and control region sequences could be compared. Using a combination of haplogroup-specific restriction site changes and control region nucleotide substitutions, the distribution of the haplogroups was surveyed through the published restriction site polymorphism and control region sequence data of Caucasoids. This supported the conclusion that most haplogroups observed in Europe are Caucasoid-specific, and that at least some of them occur at varying frequencies in different Caucasoid populations. The classification of almost all European mtDNA variation in a number of well defined haplogroups could provide additional insights about the origin and relationships of Caucasoid populations and the process of human colonization of Europe, and is valuable for the definition of the role played by mtDNA backgrounds in the expression of pathological mtDNA mutations. PMID:8978068
Identification and characterization of mutant clones with enhanced propagation rates from phage-displayed peptide libraries.

PubMed

Nguyen, Kieu T H; Adamkiewicz, Marta A; Hebert, Lauren E; Zygiel, Emily M; Boyle, Holly R; Martone, Christina M; Meléndez-Ríos, Carola B; Noren, Karen A; Noren, Christopher J; Hall, Marilena Fitzsimons

2014-10-01

A target-unrelated peptide (TUP) can arise in phage display selection experiments as a result of a propagation advantage exhibited by the phage clone displaying the peptide. We previously characterized HAIYPRH, from the M13-based Ph.D.-7 phage display library, as a propagation-related TUP resulting from a G→A mutation in the Shine-Dalgarno sequence of gene II. This mutant was shown to propagate in Escherichia coli at a dramatically faster rate than phage bearing the wild-type Shine-Dalgarno sequence. We now report 27 additional fast-propagating clones displaying 24 different peptides and carrying 14 unique mutations. Most of these mutations are found either in or upstream of the gene II Shine-Dalgarno sequence, but still within the mRNA transcript of gene II. All 27 clones propagate at significantly higher rates than normal library phage, most within experimental error of wild-type M13 propagation, suggesting that mutations arise to compensate for the reduced virulence caused by the insertion of a lacZα cassette proximal to the replication origin of the phage used to construct the library. We also describe an efficient and convenient assay to diagnose propagation-related TUPS among peptide sequences selected by phage display. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
High frequency of hepatitis E virus infection in swine from South Brazil and close similarity to human HEV isolates.

PubMed

Passos-Castilho, Ana Maria; Granato, Celso Francisco Hernandes

Hepatitis E virus is responsible for acute and chronic liver infections worldwide. Swine hepatitis E virus has been isolated in Brazil, and a probable zoonotic transmission has been described, although data are still scarce. The aim of this study was to investigate the frequency of hepatitis E virus infection in pigs from a small-scale farm in the rural area of Paraná State, South Brazil. Fecal samples were collected from 170 pigs and screened for hepatitis E virus RNA using a duplex real-time RT-PCR targeting a highly conserved 70 nt long sequence within overlapping parts of ORF2 and ORF3 as well as a 113 nt sequence of ORF2. Positive samples with high viral loads were subjected to direct sequencing and phylogenetic analysis. Hepatitis E virus RNA was detected in 34 (20.0%) of the 170 pigs following positive results in at least one set of screening real-time RT-PCR primers and probes. The swine hepatitis E virus strains clustered with the genotype hepatitis E virus-3b reference sequences in the phylogenetic analysis and showed close similarity to human hepatitis E virus isolates previously reported in Brazil. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization.

PubMed

Bauer, Markus; Klau, Gunnar W; Reinert, Knut

2007-07-27

The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested: This is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from http://www.planet-lisa.net.
Piroplasms in brown hyaenas (Parahyaena brunnea) and spotted hyaenas (Crocuta crocuta) in Namibia and South Africa are closely related to Babesia lengau.

PubMed

Burroughs, Richard E J; Penzhorn, Barend L; Wiesel, Ingrid; Barker, Nancy; Vorster, Ilse; Oosthuizen, Marinda C

2017-02-01

The objective of our study was identification and molecular characterization of piroplasms and rickettsias occurring in brown (Parahyaena brunnea) and spotted hyaenas (Crocuta crocuta) from various localities in Namibia and South Africa. Whole blood (n = 59) and skin (n = 3) specimens from brown (n = 15) and spotted hyaenas (n = 47) were screened for the presence of Babesia, Theileria, Ehrlichia and Anaplasma species using the reverse line blot (RLB) hybridization technique. PCR products of 52/62 (83.9%) of the specimens hybridized only with the Theileria/Babesia genus-specific probes and not with any of the species-specific probes, suggesting the presence of a novel species or variant of a species. No Ehrlichia and/or Anaplasma species DNA could be detected. A parasite 18S ribosomal RNA gene of brown (n = 3) and spotted hyaena (n = 6) specimens was subsequently amplified and cloned, and the recombinants were sequenced. Homologous sequence searches of databases indicated that the obtained sequences were most closely related to Babesia lengau, originally described from cheetahs (Acinonyx jubatus). Observed sequence similarities were subsequently confirmed by phylogenetic analyses which showed that the obtained hyaena sequences formed a monophyletic group with B. lengau, Babesia conradae and sequences previously isolated from humans and wildlife in the western USA. Within the B. lengau clade, the obtained sequences and the published B. lengau sequences were grouped into six distinct groups, of which groups I to V represented novel B. lengau genotypes and/or gene variants. We suggest that these genotypes cannot be classified as new Babesia species, but rather as variants of B. lengau. This is the first report of occurrence of piroplasms in brown hyaenas.
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE Office of Scientific and Technical Information (OSTI.GOV)

McIlwain, Sean J.; Peris, Davis; Sardi, Maria

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.
Genome sequence and analysis of a stress-tolerant, wild-derived strain of Saccharomyces cerevisiae used in biofuels research

DOE PAGES

McIlwain, Sean J.; Peris, Davis; Sardi, Maria; ...

2016-04-20

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. Lastly, the Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.
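The contig N50 quoted in this abstract is a standard assembly statistic: the length L such that contigs of length L or longer together cover at least half of the assembled bases. A minimal sketch of the calculation follows; the function name `contig_n50` and the toy lengths are assumptions made for illustration and are unrelated to the Y22-3 assembly itself.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: 1500 bp in total; the two longest contigs (500 + 400)
# already cover more than half, so the N50 is 400.
print(contig_n50([100, 200, 300, 400, 500]))
```

A larger N50 means that more of the assembly sits in long contigs, which is why the statistic is commonly reported as a measure of contiguity.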
Autosomal dominant mutation in the signal peptide of renin in a kindred with anemia, hyperuricemia, and CKD.

PubMed

Beck, Bodo B; Trachtman, Howard; Gitman, Michael; Miller, Ilene; Sayer, John A; Pannes, Andrea; Baasner, Anne; Hildebrandt, Friedhelm; Wolf, Matthias T F

2011-11-01

Homozygous or compound heterozygous mutations in renin (REN) cause renal tubular dysgenesis, which is characterized by death in utero due to kidney failure and pulmonary hypoplasia. The phenotype resembles the fetopathy caused by angiotensin-converting enzyme inhibitor or angiotensin receptor blocker intake during pregnancy. Recently, heterozygous REN mutations were shown to result in early-onset hyperuricemia, anemia, and chronic kidney disease (CKD). To date, only 3 different heterozygous REN mutations have been published. We report mutation analysis of the REN gene in 39 kindreds with hyperuricemia and CKD who previously tested negative for mutations in the UMOD (uromodulin) and HNF1B (hepatocyte nuclear factor 1β) genes. We identified one kindred with a novel thymidine to cytosine mutation at position 28 in the REN complementary DNA, corresponding to a tryptophan to arginine substitution at amino acid 10, which is found within the signal sequence (c.28T>C; p.W10R). On this basis, we conclude that REN mutations are rare events in patients with CKD. Within the kindred, we found affected individuals over 4 generations who carried the novel REN mutation and were characterized by significant anemia, hyperuricemia, and CKD. Anemia was severe and disproportional to the degree of decreased kidney function. Because all heterozygous REN mutations that have been described are localized in the signal sequence, screening of the REN gene for patients with CKD with hyperuricemia and anemia may best be focused on sequencing of exon 1, which encodes the signal peptide. Published by Elsevier Inc.
Multiplexed SNP typing of ancient DNA clarifies the origin of Andaman mtDNA haplogroups amongst South Asian tribal populations.

PubMed

Endicott, Phillip; Metspalu, Mait; Stringer, Chris; Macaulay, Vincent; Cooper, Alan; Sanchez, Juan J

2006-12-20

The issue of errors in genetic data sets is of growing concern, particularly in population genetics where whole genome mtDNA sequence data is coming under increased scrutiny. Multiplexed PCR reactions, combined with SNP typing, are currently under-exploited in this context, but have the potential to genotype whole populations rapidly and accurately, significantly reducing the amount of errors appearing in published data sets. To show the sensitivity of this technique for screening mtDNA genomic sequence data, 20 historic samples of the enigmatic Andaman Islanders and 12 modern samples from three Indian tribal populations (Chenchu, Lambadi and Lodha) were genotyped for 20 coding region sites after provisional haplogroup assignment with control region sequences. The genotype data from the historic samples significantly revise the topologies for the Andaman M31 and M32 mtDNA lineages by rectifying conflicts in published data sets. The new Indian data extend the distribution of the M31a lineage to South Asia, challenging previous interpretations of mtDNA phylogeography. This genetic connection between the ancestors of the Andamanese and South Asian tribal groups approximately 30 kya has important implications for the debate concerning migration routes and settlement patterns of humans leaving Africa during the late Pleistocene, and indicates the need for more detailed genotyping strategies. The methodology serves as a low-cost, high-throughput model for the production and authentication of data from modern or ancient DNA, and demonstrates the value of museum collections as important records of human genetic diversity.
Multiplexed SNP Typing of Ancient DNA Clarifies the Origin of Andaman mtDNA Haplogroups amongst South Asian Tribal Populations

PubMed Central

Endicott, Phillip; Metspalu, Mait; Stringer, Chris; Macaulay, Vincent; Cooper, Alan; Sanchez, Juan J.

2006-01-01

The issue of errors in genetic data sets is of growing concern, particularly in population genetics where whole genome mtDNA sequence data is coming under increased scrutiny. Multiplexed PCR reactions, combined with SNP typing, are currently under-exploited in this context, but have the potential to genotype whole populations rapidly and accurately, significantly reducing the amount of errors appearing in published data sets. To show the sensitivity of this technique for screening mtDNA genomic sequence data, 20 historic samples of the enigmatic Andaman Islanders and 12 modern samples from three Indian tribal populations (Chenchu, Lambadi and Lodha) were genotyped for 20 coding region sites after provisional haplogroup assignment with control region sequences. The genotype data from the historic samples significantly revise the topologies for the Andaman M31 and M32 mtDNA lineages by rectifying conflicts in published data sets. The new Indian data extend the distribution of the M31a lineage to South Asia, challenging previous interpretations of mtDNA phylogeography. This genetic connection between the ancestors of the Andamanese and South Asian tribal groups ∼30 kya has important implications for the debate concerning migration routes and settlement patterns of humans leaving Africa during the late Pleistocene, and indicates the need for more detailed genotyping strategies. The methodology serves as a low-cost, high-throughput model for the production and authentication of data from modern or ancient DNA, and demonstrates the value of museum collections as important records of human genetic diversity. PMID:17218991
Reconstructing the Indian Origin and Dispersal of the European Roma: A Maternal Genetic Perspective

PubMed Central

Mendizabal, Isabel; Valente, Cristina; Gusmão, Alfredo; Alves, Cíntia; Gomes, Verónica; Goios, Ana; Parson, Walther; Calafell, Francesc; Alvarez, Luis; Amorim, António; Gusmão, Leonor

2011-01-01

Previous genetic, anthropological and linguistic studies have shown that Roma (Gypsies) constitute a founder population dispersed throughout Europe whose origins might be traced to the Indian subcontinent. Linguistic and anthropological evidence point to Indo-Aryan ethnic groups from North-western India as the ancestral parental population of Roma. Recently, a strong genetic hint supporting this theory came from a study of a private mutation causing primary congenital glaucoma. In the present study, complete mitochondrial control sequences of Iberian Roma and previously published maternal lineages of other European Roma were analyzed in order to establish the genetic affinities among Roma groups, determine the degree of admixture with neighbouring populations, infer the migration routes followed since the first arrival to Europe, and survey the origin of Roma within the Indian subcontinent. Our results show that the maternal lineage composition in the Roma groups follows a pattern of different migration routes, with several founder effects, and low effective population sizes along their dispersal. Our data allowed the confirmation of a North/West migration route shared by Polish, Lithuanian and Iberian Roma. Additionally, eleven Roma founder lineages were identified and degrees of admixture with host populations were estimated. Finally, the comparison with an extensive database of Indian sequences allowed us to identify the Punjab state, in North-western India, as the putative ancestral homeland of the European Roma, in agreement with previous linguistic and anthropological studies. PMID:21264345
Ultra-deep sequencing of ribosome-associated poly-adenylated RNA in early Drosophila embryos reveals hundreds of conserved translated sORFs.

PubMed

Li, Hongmei; Hu, Chuansheng; Bai, Ling; Li, Hua; Li, Mingfa; Zhao, Xiaodong; Czajkowsky, Daniel M; Shao, Zhifeng

2016-12-01

There is growing recognition that small open reading frames (sORFs) encoding peptides shorter than 100 amino acids are an important class of functional elements in the eukaryotic genome, with several already identified to play critical roles in growth, development, and disease. However, our understanding of their biological importance has been hindered owing to the significant technical challenges limiting their annotation. Here we combined ultra-deep sequencing of ribosome-associated poly-adenylated RNAs with rigorous conservation analysis to identify a comprehensive population of translated sORFs during early Drosophila embryogenesis. In total, we identify 399 sORFs, including those previously annotated but without evidence of translational capacity, those found within transcripts previously classified as non-coding, and those not previously known to be transcribed. Further, we find, for the first time, evidence for translation of many sORFs with different isoforms, suggesting their regulation is as complex as longer ORFs. Furthermore, many sORFs are found not associated with ribosomes in late-stage Drosophila S2 cells, suggesting that many of the translated sORFs may have stage-specific functions during embryogenesis. These results thus provide the first comprehensive annotation of the sORFs present during early Drosophila embryogenesis, a necessary basis for a detailed delineation of their function in embryogenesis and other biological processes. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Spinal motor neuron involvement in a patient with homozygous PRUNE mutation.

PubMed

Iacomino, Michele; Fiorillo, Chiara; Torella, Annalaura; Severino, Mariasavina; Broda, Paolo; Romano, Catia; Falsaperla, Raffaele; Pozzolini, Giulia; Minetti, Carlo; Striano, Pasquale; Nigro, Vincenzo; Zara, Federico

2018-05-01

In the last few years, whole exome sequencing (WES) allowed the identification of PRUNE mutations in patients featuring a complex neurological phenotype characterized by severe neurodevelopmental delay, microcephaly, epilepsy, optic atrophy, and brain or cerebellar atrophy. We describe an additional patient with a homozygous PRUNE mutation who presented with a spinal muscular atrophy phenotype, in addition to the already known brain developmental disorder. This novel feature expands the clinical consequences of PRUNE mutations and allows PRUNE syndrome to converge with previous descriptions of neurodevelopmental/neurodegenerative disorders linked to altered microtubule dynamics. Copyright © 2017 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.
Upward gaze and head deviation with frontal eye field stimulation.

PubMed

Kaiboriboon, Kitti; Lüders, Hans O; Miller, Jonathan P; Leigh, R John

2012-03-01

Using electrical stimulation to the deep, most caudal part of the right frontal eye field (FEF), we demonstrate a novel pattern of vertical (upward) eye movement that was previously only thought possible by stimulating both frontal eye fields simultaneously. If stimulation was started when the subject looked laterally, the initial eye movement was back to the midline, followed by upward deviation. Our finding challenges the current view of topological organisation in the human FEF and may have general implications for concepts of topological organisation of the motor cortex, since sustained stimulation also induced upward head movements as a component of the vertical gaze shift. [Published with video sequences].
Extended exome sequencing identifies BACH2 as a novel major risk locus for Addison's disease.

PubMed

Eriksson, D; Bianchi, M; Landegren, N; Nordin, J; Dalin, F; Mathioudaki, A; Eriksson, G N; Hultin-Rosenberg, L; Dahlqvist, J; Zetterqvist, H; Karlsson, Å; Hallgren, Å; Farias, F H G; Murén, E; Ahlgren, K M; Lobell, A; Andersson, G; Tandre, K; Dahlqvist, S R; Söderkvist, P; Rönnblom, L; Hulting, A-L; Wahlberg, J; Ekwall, O; Dahlqvist, P; Meadows, J R S; Bensing, S; Lindblad-Toh, K; Kämpe, O; Pielberg, G R

2016-12-01

Autoimmune disease is one of the leading causes of morbidity and mortality worldwide. In Addison's disease, the adrenal glands are targeted by destructive autoimmunity. Despite being the most common cause of primary adrenal failure, little is known about its aetiology. To understand the genetic background of Addison's disease, we utilized the extensively characterized patients of the Swedish Addison Registry. We developed an extended exome capture array comprising a selected set of 1853 genes and their potential regulatory elements, for the purpose of sequencing 479 patients with Addison's disease and 1394 controls. We identified BACH2 (rs62408233-A, OR = 2.01 (1.71-2.37), P = 1.66 × 10⁻¹⁵, MAF 0.46/0.29 in cases/controls) as a novel gene associated with Addison's disease development. We also confirmed the previously known associations with the HLA complex. Whilst BACH2 has been previously reported to associate with organ-specific autoimmune diseases co-inherited with Addison's disease, we have identified BACH2 as a major risk locus in Addison's disease, independent of concomitant autoimmune diseases. Our results may enable future research towards preventive disease treatment. © 2016 The Authors. Journal of Internal Medicine published by John Wiley & Sons Ltd on behalf of Association for Publication of The Journal of Internal Medicine.
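As a rough arithmetic check on the figures quoted in this abstract, an unadjusted allelic odds ratio can be computed from the two reported allele frequencies alone. The sketch below assumes the rounded frequencies 0.46 (cases) and 0.29 (controls) and ignores any covariates or model the authors may have used, so it should only land near the reported OR of 2.01, not reproduce it. The function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Unadjusted odds ratio for carrying the allele, cases versus controls."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Allele frequencies as quoted in the abstract (rounded to two decimals there).
crude_or = allelic_odds_ratio(0.46, 0.29)
print(f"crude allelic OR = {crude_or:.2f}")
```

The crude value comes out at about 2.09, which lies inside the reported interval of 1.71-2.37.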
Pellagra-like condition is xeroderma pigmentosum/Cockayne syndrome complex and niacin confers clinical benefit.

PubMed

Hijazi, H; Salih, M A; Hamad, M H A; Hassan, H H; Salih, S B M; Mohamed, K A; Mukhtar, M M; Karrar, Z A; Ansari, S; Ibrahim, N; Alkuraya, F S

2015-01-01

An extremely rare pellagra-like condition has been described, which was partially responsive to niacin and associated with a multisystem involvement. The condition was proposed to represent a novel autosomal recessive entity but the underlying mutation remained unknown for almost three decades. The objective of this study was to identify the causal mutation in the pellagra-like condition and investigate the mechanism by which niacin confers clinical benefit. Autozygosity mapping and exome sequencing were used to identify the causal mutation, and comet assay on patient fibroblasts before and after niacin treatment to assess its effect on DNA damage. We identified a single disease locus that harbors a novel mutation in ERCC5, thus confirming that the condition is in fact xeroderma pigmentosum/Cockayne syndrome (XP/CS) complex. Importantly, we also show that the previously described dermatological response to niacin is consistent with a dramatic protective effect against ultraviolet-induced DNA damage in patient fibroblasts conferred by niacin treatment. Our findings show the power of exome sequencing in reassigning previously described novel clinical entities, and suggest a mechanism for the dermatological response to niacin in patients with XP/CS complex. This raises interesting possibilities about the potential therapeutic use of niacin in XP. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Kinase gene fusions in defined subsets of melanoma.

PubMed

Turner, Jacqueline; Couts, Kasey; Sheren, Jamie; Saichaemchan, Siriwimon; Ariyawutyakorn, Witthawat; Avolio, Izabela; Cabral, Ethan; Glogowska, Magdelena; Amato, Carol; Robinson, Steven; Hintzsche, Jennifer; Applegate, Allison; Seelenfreund, Eric; Gonzalez, Rita; Wells, Keith; Bagby, Stacey; Tentler, John; Tan, Aik-Choon; Wisell, Joshua; Varella-Garcia, Marileila; Robinson, William

2017-01-01

Genomic rearrangements resulting in activating kinase fusions have been increasingly described in a number of cancers including malignant melanoma, but their frequency in specific melanoma subtypes has not been reported. We used break-apart fluorescence in situ hybridization (FISH) to identify genomic rearrangements in tissues from 59 patients with various types of malignant melanoma including acral lentiginous, mucosal, superficial spreading, and nodular. We identified four genomic rearrangements involving the genes BRAF, RET, and ROS1. Of these, three were confirmed by immunohistochemistry (IHC) or sequencing and one was found to be an ARMC10-BRAF fusion that has not been previously reported in melanoma. These fusions occurred in different subtypes of melanoma but all in tumors lacking known driver mutations. Our data suggest gene fusions are more common than previously thought and should be further explored, particularly in melanomas lacking known driver mutations. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
NEW MEMBERS OF THE SCORPIUS-CENTAURUS COMPLEX AND AGES OF ITS SUB-REGIONS

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, Inseok; Zuckerman, B.; Bessell, M. S.

2012-07-15

We have spectroscopically identified ∼100 G-, K-, and M-type members of the Scorpius-Centaurus complex. To deduce the age of these young stars we compare their Li λ6708 absorption line strengths against those of stars in the TW Hydrae association and β Pictoris moving group. These line strengths indicate that Sco-Cen stars are younger than β Pic stars whose ages of ∼12 Myr have previously been derived from a kinematic traceback analysis. Our derived age, ∼10 Myr, for stars in the Lower Centaurus Crux and Upper Centaurus Lupus subgroups of ScoCen is younger than previously published ages based on the moving cluster method and upper main-sequence fitting. The discrepant ages are likely due to an incorrect (or lack of) cross-calibration between model-dependent and model-independent age-dating methods.
SPOT-ligand 2: improving structure-based virtual screening by binding-homology search on an expanded structural template library.

PubMed

Litfin, Thomas; Zhou, Yaoqi; Yang, Yuedong

2017-04-15

The high cost of drug discovery motivates the development of accurate virtual screening tools. Binding-homology, which takes advantage of known protein-ligand binding pairs, has emerged as a powerful discrimination technique. In order to exploit all available binding data, modelled structures of ligand-binding sequences may be used to create an expanded structural binding template library. SPOT-Ligand 2 has demonstrated significantly improved screening performance over its previous version by expanding the template library 15 times over the previous one. It also performed better than or similar to other binding-homology approaches on the DUD and DUD-E benchmarks. The server is available online at http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au or yuedong.yang@griffith.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes.

PubMed

Kwok, Hin; Chiang, Alan Kwok Shing

2016-02-24

Genomic sequences of Epstein-Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV, which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.

Quantitative PCR analysis reveals a high incidence of large intragenic deletions in the FANCA gene in Spanish Fanconi anemia patients.

PubMed

Callén, E; Tischkowitz, M D; Creus, A; Marcos, R; Bueren, J A; Casado, J A; Mathew, C G; Surrallés, J

2004-01-01

Fanconi anaemia is an autosomal recessive disease characterized by chromosome fragility, multiple congenital abnormalities, progressive bone marrow failure and a high predisposition to develop malignancies. Most of the Fanconi anaemia patients belong to complementation group FA-A due to mutations in the FANCA gene. This gene contains 43 exons along a 4.3-kb coding sequence with a very heterogeneous mutational spectrum that makes the mutation screening of FANCA a difficult task. In addition, as the FANCA gene is rich in Alu sequences, it was reported that Alu-mediated recombination led to large intragenic deletions that cannot be detected in the heterozygous state by conventional PCR, SSCP analysis, or DNA sequencing. To overcome this problem, a method based on quantitative fluorescent multiplex PCR was proposed to detect intragenic deletions in FANCA involving the most frequently deleted exons (exons 5, 11, 17, 21 and 31). Here we apply the proposed method to detect intragenic deletions in 25 Spanish FA-A patients previously assigned to complementation group FA-A by FANCA cDNA retroviral transduction. A total of eight heterozygous deletions involving from one to more than 26 exons were detected. Thus, one third of the patients carried a large intragenic deletion that would not have been detected by conventional methods. These results are in agreement with previously published data and indicate that large intragenic deletions are one of the most frequent mutations leading to Fanconi anaemia. Consequently, this technology should be applied in future studies on FANCA to improve the mutation detection rate. Copyright 2003 S. Karger AG, Basel
Molecular epidemiology of pathogenic Leptospira spp. in the straw-colored fruit bat (Eidolon helvum) migrating to Zambia from the Democratic Republic of Congo.

PubMed

Ogawa, Hirohito; Koizumi, Nobuo; Ohnuma, Aiko; Mutemwa, Alisheke; Hang'ombe, Bernard M; Mweene, Aaron S; Takada, Ayato; Sugimoto, Chihiro; Suzuki, Yasuhiko; Kida, Hiroshi; Sawa, Hirofumi

2015-06-01

The role played by bats as a potential source of transmission of Leptospira spp. to humans is poorly understood, despite various pathogenic Leptospira spp. being identified in these mammals. Here, we investigated the prevalence and diversity of pathogenic Leptospira spp. that infect the straw-colored fruit bat (Eidolon helvum). We captured this bat species, which is widely distributed in Africa, in Zambia during 2008-2013. We detected the flagellin B gene (flaB) from pathogenic Leptospira spp. in kidney samples from 79 of 529 E. helvum (14.9%) bats. Phylogenetic analysis of 70 flaB fragments amplified from E. helvum samples and previously reported sequences revealed that 12 of the fragments grouped with Leptospira borgpetersenii and Leptospira kirschneri; however, the remaining 58 flaB fragments appeared not to be associated with any reported species. Additionally, the 16S ribosomal RNA gene (rrs) amplified from 27 randomly chosen flaB-positive samples was compared with previously reported sequences, including bat-derived Leptospira spp. All 27 rrs fragments clustered into a pathogenic group. Eight fragments were located in unique branches; the other 19 fragments were closely related to Leptospira spp. detected in bats. These results show that rrs sequences in bats are genetically related to each other without regional variation, suggesting that Leptospira are evolutionarily well-adapted to bats and have uniquely evolved in the bat population. Our study indicates that pathogenic Leptospira spp. in E. helvum in Zambia have unique genotypes. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Genetic signs of multiple colonization events in Baltic ciscoes with radiation into sympatric spring- and autumn-spawners confined to early postglacial arrival

PubMed Central

Delling, Bo; Palm, Stefan; Palkopoulou, Eleftheria; Prestegaard, Tore

2014-01-01

Presence of sympatric populations may reflect local diversification or secondary contact of already distinct forms. The Baltic cisco (Coregonus albula) normally spawns in late autumn, but in a few lakes in Northern Europe sympatric autumn and spring- or winter-spawners have been described. So far, the evolutionary relationships and taxonomic status of these main life history forms have remained largely unclear. With microsatellites and mtDNA sequences, we analyzed extant and extinct spring- and autumn-spawners from a total of 23 Swedish localities, including sympatric populations. Published sequences from Baltic ciscoes in Germany and Finland, and Coregonus sardinella from North America were also included together with novel mtDNA sequences from Siberian C. sardinella. A clear genetic structure within Sweden was found that included two population assemblages markedly differentiated at microsatellites and apparently fixed for mtDNA haplotypes from two distinct clades. All sympatric Swedish populations belonged to the same assemblage, suggesting parallel evolution of spring-spawning rather than secondary contact. The pattern observed further suggests that postglacial immigration to Northern Europe occurred from at least two different refugia. Previous results showing that mtDNA in Baltic cisco is paraphyletic with respect to North American C. sardinella were confirmed. However, the inclusion of Siberian C. sardinella revealed a more complicated pattern, as these novel haplotypes were found within one of the two main C. albula clades and were clearly distinct from those in North American C. sardinella. The evolutionary history of Northern Hemisphere ciscoes thus seems to be more complex than previously recognized. PMID:25540695
Genetic signs of multiple colonization events in Baltic ciscoes with radiation into sympatric spring- and autumn-spawners confined to early postglacial arrival.

PubMed

Delling, Bo; Palm, Stefan; Palkopoulou, Eleftheria; Prestegaard, Tore

2014-11-01

Presence of sympatric populations may reflect local diversification or secondary contact of already distinct forms. The Baltic cisco (Coregonus albula) normally spawns in late autumn, but in a few lakes in Northern Europe sympatric autumn and spring- or winter-spawners have been described. So far, the evolutionary relationships and taxonomic status of these main life history forms have remained largely unclear. With microsatellites and mtDNA sequences, we analyzed extant and extinct spring- and autumn-spawners from a total of 23 Swedish localities, including sympatric populations. Published sequences from Baltic ciscoes in Germany and Finland, and Coregonus sardinella from North America were also included together with novel mtDNA sequences from Siberian C. sardinella. A clear genetic structure within Sweden was found that included two population assemblages markedly differentiated at microsatellites and apparently fixed for mtDNA haplotypes from two distinct clades. All sympatric Swedish populations belonged to the same assemblage, suggesting parallel evolution of spring-spawning rather than secondary contact. The pattern observed further suggests that postglacial immigration to Northern Europe occurred from at least two different refugia. Previous results showing that mtDNA in Baltic cisco is paraphyletic with respect to North American C. sardinella were confirmed. However, the inclusion of Siberian C. sardinella revealed a more complicated pattern, as these novel haplotypes were found within one of the two main C. albula clades and were clearly distinct from those in North American C. sardinella. The evolutionary history of Northern Hemisphere ciscoes thus seems to be more complex than previously recognized.
Ensembl Genomes 2016: more genomes, more complexity.

PubMed

Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

2016-01-04

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Mitochondrial genomes reveal the extinct Hippidion as an outgroup to all living equids.

PubMed

Der Sarkissian, Clio; Vilstrup, Julia T; Schubert, Mikkel; Seguin-Orlando, Andaine; Eme, David; Weinstock, Jacobo; Alberdi, Maria Teresa; Martin, Fabiana; Lopez, Patricio M; Prado, Jose L; Prieto, Alfredo; Douady, Christophe J; Stafford, Tom W; Willerslev, Eske; Orlando, Ludovic

2015-03-01

Hippidions were equids with very distinctive anatomical features. They lived in South America from 2.5 million years ago (Ma) until their extinction approximately 10 000 years ago. The evolutionary origin of the three known Hippidion morphospecies is still disputed. Based on palaeontological data, Hippidion could have diverged from the lineage leading to modern equids before 10 Ma. In contrast, a much later divergence date, with Hippidion nesting within modern equids, was indicated by partial ancient mitochondrial DNA sequences. Here, we characterized eight Hippidion complete mitochondrial genomes at 3.4-386.3-fold coverage using target-enrichment capture and next-generation sequencing. Our dataset reveals that the two morphospecies sequenced (H. saldiasi and H. principale) formed a monophyletic clade, basal to extant and extinct Equus lineages. This contrasts with previous genetic analyses and supports Hippidion as a distinct genus, in agreement with palaeontological models. We date the Hippidion split from Equus at 5.6-6.5 Ma, suggesting an early divergence in North America prior to the colonization of South America, after the formation of the Panamanian Isthmus 3.5 Ma and the Great American Biotic Interchange. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
m6aViewer: software for the detection, analysis, and visualization of N6-methyladenosine peaks from m6A-seq/ME-RIP sequencing data.

PubMed

Antanaviciute, Agne; Baquero-Perez, Belinda; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Whitehouse, Adrian; Carr, Ian M

2017-10-01

Recent methods for transcriptome-wide N6-methyladenosine (m6A) profiling have facilitated investigations into the RNA methylome and established m6A as a dynamic modification that has critical regulatory roles in gene expression and may play a role in human disease. However, bioinformatics resources available for the analysis of m6A sequencing data are still limited. Here, we describe m6aViewer, a cross-platform application for analysis and visualization of m6A peaks from sequencing data. m6aViewer implements a novel m6A peak-calling algorithm that identifies high-confidence methylated residues with more precision than previously described approaches. The application enables data analysis through a graphical user interface, and thus, in contrast to other currently available tools, does not require the user to be skilled in computer programming. m6aViewer and test data can be downloaded here: http://dna2.leeds.ac.uk/m6a. © 2017 Antanaviciute et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
Wenchuan Event Detection And Localization Using Waveform Correlation Coupled With Double Difference

NASA Astrophysics Data System (ADS)

Slinkard, M.; Heck, S.; Schaff, D. P.; Young, C. J.; Richards, P. G.

2014-12-01

The well-studied Wenchuan aftershock sequence triggered by the May 12, 2008, Ms 8.0, mainshock offers an ideal test case for evaluating the effectiveness of using waveform correlation coupled with double difference relocation to detect and locate events in a large aftershock sequence. We use Sandia's SeisCorr detector to process 3 months of data recorded by permanent IRIS and temporary ASCENT stations using templates from events listed in a global catalog to find similar events in the raw data stream. Then we take the detections and relocate them using the double difference method. We explore both the performance that can be expected when using just a small number of stations and the benefits of reprocessing a well-studied sequence such as this one using waveform correlation to find even more events. We benchmark our results against previously published results describing relocations of regional catalog data. Before starting this project, we had examples where, with just a few stations at far-regional distances, waveform correlation combined with double difference did an impressive job of detecting and locating events with precision at the few hundred and even tens of meters level.
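The template-matching step described in this abstract can be illustrated with a minimal sketch of normalized cross-correlation against a continuous data stream. This is not the SeisCorr implementation, whose internals the abstract does not describe; the synthetic waveform, noise level, threshold of 0.8 and all function and variable names are assumptions made for illustration only.

```python
import numpy as np


def normalized_cross_correlation(stream, template):
    """Correlation coefficient of the template against every same-length window."""
    n = len(template)
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    # One row per candidate start sample in the stream.
    windows = np.lib.stride_tricks.sliding_window_view(stream, n)
    w = windows - windows.mean(axis=1, keepdims=True)
    w_norm = np.sqrt(np.sum(w ** 2, axis=1))
    denom = w_norm * t_norm
    cc = np.zeros(windows.shape[0])
    valid = denom > 0  # guard against flat (zero-variance) windows
    cc[valid] = (w[valid] @ t) / denom[valid]
    return cc


# Synthetic example: a decaying sinusoid stands in for a template event.
rng = np.random.default_rng(0)
k = np.arange(50)
template = np.exp(-k / 15.0) * np.sin(2 * np.pi * k / 10.0)

# Low-amplitude noise with a scaled copy of the template hidden at sample 300.
stream = 0.05 * rng.standard_normal(1000)
stream[300:350] += 0.5 * template

cc = normalized_cross_correlation(stream, template)
threshold = 0.8  # assumed detection threshold
picks = np.flatnonzero(cc >= threshold)
print("best offset:", int(np.argmax(cc)), "picks:", picks.tolist())
```

Because the coefficient is normalized, the half-amplitude copy of the template still correlates strongly, which is the property that lets a large catalog event serve as a template for smaller repeats. In a workflow like the one described, detections of this kind would then be passed to a separate double difference relocation step.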
The phylogenetic position of the Critically Endangered Saint Croix ground lizard Ameiva polops: revisiting molecular systematics of West Indian Ameiva.

PubMed

Hurtado, Luis A; Santamaria, Carlos A; Fitzgerald, Lee A

2014-05-06

The phylogenetic position of the critically endangered Saint Croix ground lizard Ameiva polops is presently unknown and several hypotheses have been proposed. We investigated the phylogenetic position of this species using molecular phylogenetic methods. We obtained sequences of DNA fragments of the mitochondrial ribosomal genes 12S rDNA and 16S rDNA for this species. We aligned these sequences with published sequences of other Ameiva species, which include most of the Ameiva species from the West Indies, three Ameiva species from Central America and South America, and one from the teiid lizard Tupinambis teguixin, which was used as the outgroup. We conducted Maximum Likelihood and Bayesian phylogenetic analyses. The phylogenetic reconstructions among the different methods were very similar, supporting the monophyly of West Indian Ameiva and showing, within this lineage, a basal polytomy of four clades that are separated geographically. Ameiva polops grouped in a cluster that included the other two Ameiva species found in the Puerto Rican Bank: A. wetmorei and A. exsul. A sister relationship between A. polops and A. wetmorei is suggested by our analyses. We compare our results with a previous study on molecular systematics of West Indian Ameiva.
NLSdb-major update for database of nuclear localization signals and nuclear export signals.

PubMed

Bernhofer, Michael; Goldberg, Tatyana; Wolf, Silvana; Ahmed, Mohamed; Zaugg, Julian; Boden, Mikael; Rost, Burkhard

2018-01-04

NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
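The two-step refinement described in this abstract trades sensitivity against specificity, and the bookkeeping behind that trade-off can be sketched briefly. The example below is not the NLSdb protocol itself: the regular expressions, the toy sequences and the function name are invented for illustration, and a motif is simply scored by the fraction of nuclear proteins it matches and the number of non-nuclear proteins it hits.

```python
import re


def evaluate_motif(pattern, nuclear, non_nuclear):
    """Return (coverage of nuclear proteins, count of non-nuclear matches)."""
    regex = re.compile(pattern)
    hits_nuclear = sum(1 for seq in nuclear if regex.search(seq))
    hits_non_nuclear = sum(1 for seq in non_nuclear if regex.search(seq))
    return hits_nuclear / len(nuclear), hits_non_nuclear


# Toy data, invented for this sketch.
nuclear = ["MAPKKKRKVLE", "MSTPRKRARLG", "MGGSLLTTAAQ"]
non_nuclear = ["MKTLLVAGLSS", "MAKRAEDLLPP"]

# A permissive starting motif, a stricter variant (step 1: raise specificity),
# and a relaxed variant (step 2: recover sensitivity without new false hits).
for label, pattern in [("start", "[KR]{2}"), ("step 1", "[KR]{4}"), ("step 2", "[KR]{3}")]:
    coverage, false_hits = evaluate_motif(pattern, nuclear, non_nuclear)
    print(f"{label}: {pattern} coverage={coverage:.2f} non-nuclear hits={false_hits}")
```

On the toy data the permissive motif hits one non-nuclear sequence, the strict motif removes that hit at the cost of coverage, and the relaxed motif regains coverage while still matching no non-nuclear sequence. That mirrors the criterion quoted above, where the final motif set matched 35% of nuclear proteins and none of the non-nuclear ones.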
Applications of next generation sequencing in molecular ecology of non-model organisms.

PubMed

Ekblom, R; Galindo, J

2011-07-01

As most biologists are probably aware, technological advances in molecular biology during the last few years have opened up possibilities to rapidly generate large-scale sequencing data from non-model organisms at a reasonable cost. In an era when virtually any study organism can 'go genomic', it is worthwhile to review how this may impact molecular ecology. The first studies to put next generation sequencing (NGS) to the test in ecologically well-characterized species without previous genome information were published in 2007 and the beginning of 2008. Since then several studies have followed in their footsteps, and a large number are undoubtedly under way. This review focuses on how NGS has been, and can be, applied to ecological, population genetic and conservation genetic studies of non-model species, for which there are no (or very limited) genomic resources. Our aim is to draw attention to the various possibilities that are opening up using the new technologies, but we also highlight some of the pitfalls and drawbacks of these methods. We will try to provide a snapshot of the current state of the art for this rapidly advancing and expanding field of research and give some likely directions for future developments.
DNA sequence analysis of ARS elements from chromosome III of Saccharomyces cerevisiae: identification of a new conserved sequence.

PubMed Central

Palzkill, T G; Oliver, S G; Newlon, C S

1986-01-01

Four fragments of Saccharomyces cerevisiae chromosome III DNA which carry ARS elements have been sequenced. Each fragment contains multiple copies of sequences that have at least 10 out of 11 bases of homology to a previously reported 11 bp core consensus sequence. A survey of these new ARS sequences and previously reported sequences revealed the presence of an additional 11 bp conserved element located on the 3' side of the T-rich strand of the core consensus. Subcloning analysis as well as deletion and transposon insertion mutagenesis of ARS fragments support a role for 3' conserved sequence in promoting ARS activity. PMID:3529036
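The "at least 10 out of 11 bases" criterion above amounts to scanning for windows with at most one mismatch against the core consensus. A minimal sketch follows. The abstract does not spell out the consensus, so the string `WTTTATRTTTW` used here is an assumption (written with IUPAC codes W = A/T, R = A/G), and the function name is hypothetical. Only the given strand is scanned; a real search would also cover the reverse complement.

```python
# IUPAC codes needed for the assumed consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "AT", "R": "AG", "Y": "CT", "N": "ACGT"}

# Assumed 11 bp core consensus (not stated in the abstract).
ASSUMED_CORE = "WTTTATRTTTW"

def near_matches(seq, consensus=ASSUMED_CORE, min_matches=10):
    """Return (position, window, score) for windows matching >= min_matches bases."""
    seq = seq.upper()
    k = len(consensus)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        score = sum(1 for base, code in zip(window, consensus)
                    if base in IUPAC[code])
        if score >= min_matches:
            hits.append((i, window, score))
    return hits

print(near_matches("GGCATTTATGTTTACC"))
```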
Lossy compression of quality scores in genomic data.

PubMed

Cánovas, Rodrigo; Moffat, Alistair; Turpin, Andrew

2014-08-01

Next-generation sequencing technologies are revolutionizing medicine. Data from sequencing technologies are typically represented as a string of bases, an associated sequence of per-base quality scores and other metadata, and in aggregate can require a large amount of space. The quality scores show how accurate the bases are with respect to the sequencing process, that is, how confident the sequencer is of having called them correctly, and are the largest component in datasets in which they are retained. Previous research has examined how to store sequences of bases effectively; here we add to that knowledge by examining methods for compressing quality scores. The quality values originate in a continuous domain, and so if a fidelity criterion is introduced, it is possible to introduce flexibility in the way these values are represented, allowing lossy compression over the quality score data. We present existing compression options for quality score data, and then introduce two new lossy techniques. Experiments measuring the trade-off between compression ratio and information loss are reported, including quantifying the effect of lossy representations on a downstream application that carries out single nucleotide polymorphism and insert/deletion detection. The new methods are demonstrably superior to other techniques when assessed against the spectrum of possible trade-offs between storage required and fidelity of representation. An implementation of the methods described here is available at https://github.com/rcanovas/libCSAM. Contact: rcanovas@student.unimelb.edu.au. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved.
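To make the idea of a fidelity criterion concrete, the sketch below applies fixed-width binning to Phred quality scores and reports the resulting error. This is a generic lossy scheme chosen for illustration; it is not claimed to be either of the two new techniques in the paper or part of libCSAM, and the function names and bin width are assumptions.

```python
def bin_quality(quals, bin_width=8):
    """Replace each Phred score by the midpoint of its fixed-width bin."""
    return [(q // bin_width) * bin_width + bin_width // 2 for q in quals]

def mean_abs_error(original, lossy):
    """Average absolute difference between original and lossy scores."""
    return sum(abs(a - b) for a, b in zip(original, lossy)) / len(original)

quals = [2, 10, 17, 33, 38, 40]          # hypothetical per-base Phred scores
lossy = bin_quality(quals, bin_width=8)

print(lossy)
print(len(set(quals)), "distinct symbols before,", len(set(lossy)), "after")
print(mean_abs_error(quals, lossy))
```

Fewer distinct symbols generally means a more compressible stream; wider bins save more space at the cost of a larger maximum error (here at most half the bin width).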
R3D-2-MSA: the RNA 3D structure-to-multiple sequence alignment server.

PubMed

Cannone, Jamie J; Sweeney, Blake A; Petrov, Anton I; Gutell, Robin R; Zirbel, Craig L; Leontis, Neocles

2015-07-01

The RNA 3D Structure-to-Multiple Sequence Alignment Server (R3D-2-MSA) is a new web service that seamlessly links RNA three-dimensional (3D) structures to high-quality RNA multiple sequence alignments (MSAs) from diverse biological sources. In this first release, R3D-2-MSA provides manual and programmatic access to curated, representative ribosomal RNA sequence alignments from bacterial, archaeal, eukaryal and organellar ribosomes, using nucleotide numbers from representative atomic-resolution 3D structures. A web-based front end is available for manual entry and an Application Program Interface for programmatic access. Users can specify up to five ranges of nucleotides and 50 nucleotide positions per range. The R3D-2-MSA server maps these ranges to the appropriate columns of the corresponding MSA and returns the contents of the columns, either for display in a web browser or in JSON format for subsequent programmatic use. The browser output page provides a 3D interactive display of the query, a full list of sequence variants with taxonomic information and a statistical summary of distinct sequence variants found. The output can be filtered and sorted in the browser. Previous user queries can be viewed at any time by resubmitting the output URL, which encodes the search and re-generates the results. The service is freely available with no login requirement at http://rna.bgsu.edu/r3d-2-msa. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Microeukaryote Community Patterns along an O2/H2S Gradient in a Supersulfidic Anoxic Fjord (Framvaren, Norway)†

PubMed Central

Behnke, Anke; Bunge, John; Barger, Kathryn; Breiner, Hans-Werner; Alla, Victoria; Stoeck, Thorsten

2006-01-01

To resolve the fine-scale architecture of anoxic protistan communities, we conducted a cultivation-independent 18S rRNA survey in the superanoxic Framvaren Fjord in Norway. We generated three clone libraries along the steep O2/H2S gradient, using the multiple-primer approach. Of 1,100 clones analyzed, 753 proved to be high-quality protistan target sequences. These sequences were grouped into 92 phylotypes, which displayed high protistan diversity in the fjord (17 major eukaryotic phyla). Only a few were closely related to known taxa. Several sequences were dissimilar to all previously described sequences and occupied a basal position in the inferred phylogenies, suggesting that the sequences recovered were derived from novel, deeply divergent eukaryotes. We detected sequence clades with evolutionary importance (for example, clades in the euglenozoa) and clades that seem to be specifically adapted to anoxic environments, challenging the hypothesis that the global dispersal of protists is uniform. Moreover, with the detection of clones affiliated with jakobid flagellates, we present evidence that primitive descendants of early eukaryotes are present in this anoxic environment. To estimate sample coverage and phylotype richness, we used parametric and nonparametric statistical methods. The results show that although our data set is one of the largest published inventories, our sample missed a substantial proportion of the protistan diversity. Nevertheless, statistical and phylogenetic analyses of the three libraries revealed the fine-scale architecture of anoxic protistan communities, which may exhibit adaptation to different environmental conditions along the O2/H2S gradient. PMID:16672511
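The abstract above mentions nonparametric estimation of phylotype richness without naming an estimator. As an illustration only, the sketch below computes the bias-corrected Chao1 estimator, a commonly used nonparametric choice, from per-phylotype clone counts. The counts are hypothetical, not the Framvaren data.

```python
def chao1(abundances):
    """Bias-corrected Chao1: S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of phylotypes seen exactly once and twice.
    """
    s_obs = sum(1 for n in abundances if n > 0)
    f1 = sum(1 for n in abundances if n == 1)
    f2 = sum(1 for n in abundances if n == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

# Hypothetical clone counts per phylotype.
counts = [10, 5, 2, 2, 1, 1, 1, 1]
print(chao1(counts))
```

Many singletons relative to doubletons push the estimate well above the observed count, which is the pattern one would expect when a sample misses a substantial share of the diversity.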
Characterization of the Campylobacter jejuni cryptic plasmid pTIW94 recovered from wild birds in the southeastern United States.

PubMed

Hiett, Kelli L; Rothrock, Michael J; Seal, Bruce S

2013-09-01

The complete nucleotide sequence was determined for a cryptic plasmid, pTIW94, recovered from several Campylobacter jejuni isolates from wild birds in the southeastern United States. pTIW94 is a circular molecule of 3860 nucleotides, with a G+C content (31.0%) similar to that of many Campylobacter spp. genomes. A typical origin of replication, with iteron sequences, was identified upstream of DNA sequences that demonstrated similarity to replication initiation proteins. A total of five open reading frames (ORFs) were identified; two of the five ORFs demonstrated significant similarity to plasmid pCC2228-2 found within Campylobacter coli. These two ORFs were similar to essential replication proteins RepA (100%; 26/26 aa identity) and RepB (95%; 327/346 aa identity). A third identified ORF demonstrated significant similarity (99%; 421/424 aa identity) to the MOB protein from C. coli 67-8, originally recovered from swine. The other two identified ORFs were either similar to hypothetical proteins from other Campylobacter spp., or exhibited no significant similarity to any DNA or protein sequence in the GenBank database. Promoter regions (-35 and -10 signal sites), ribosomal binding sites upstream of ORFs, and stem-loop structures were also identified within the plasmid. These results demonstrate that pTIW94 represents a previously un-reported small cryptic plasmid with unique sequences as well as highly similar sequences to other small plasmids found within Campylobacter spp., and that this cryptic plasmid is present among Campylobacter spp. recovered from different genera of wild birds. Copyright © 2013. Published by Elsevier Inc.
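The identity percentages quoted above follow directly from the identical/aligned residue counts, and G+C content is a simple base count. A minimal sketch follows; the helper names are hypothetical, and the short DNA string is a toy input, not the pTIW94 sequence.

```python
def percent_identity(identical, aligned):
    """Percentage of aligned positions that are identical."""
    return 100.0 * identical / aligned

def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)

# Counts as reported in the abstract (RepA, RepB, MOB).
for name, identical, aligned in [("RepA", 26, 26), ("RepB", 327, 346), ("MOB", 421, 424)]:
    print(name, round(percent_identity(identical, aligned)))

print(gc_content("ATGCATAT"))  # toy sequence
```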
Use of a molecular approach for the definitive diagnosis of proliferative larval mesocestoidiasis in a cat.

PubMed

Jabbar, Abdul; Papini, Roberto; Ferrini, Nadia; Gasser, Robin B

2012-10-01

A 9-year-old neutered male cat with a history of sudden onset of lethargy, anorexia and respiratory distress was presented at a veterinary practice in Lucca, Italy. A clinical examination revealed that the cat was severely dehydrated, and had pale mucous membranes and tachypnoea. No pain or discomfort was detected at the time of physical examination. The cat was administered fluids, antibiotics and supportive therapy, but died overnight. The owner of the cat requested a post-mortem examination. At necropsy, acephalic structures, consistent with proliferative tapeworm (cestode) larvae, were detected in the thoracic cavity on pleural surfaces. As these larvae could not be identified to genus or species by microscopy, a PCR-based sequencing-phylogenetic approach was used. Part of the cytochrome c oxidase subunit 1 gene was PCR-amplified from genomic DNAs from five individual larvae and sequenced; all five sequences obtained were identical. This consensus sequence was aligned (over 355 nucleotide positions) with homologous sequences representing a range of cestodes (including Echinococcus granulosus, Echinococcus multilocularis, Hymenolepis microstoma, Mesocestoides spp. and Taenia saginata) from previously published studies and then subjected to phylogenetic analysis. The sequence representing the larval cestode from the affected cat grouped, with strong statistical support, with those representing Mesocestoides corti and Mesocestoides lineatus. Therefore, a definitive diagnosis of pleural proliferative larval mesocestoidiasis could be made. This study illustrates the value of using molecular tools to directly assist clinical and pathological investigations of cestodiases of animals. Copyright © 2012 Elsevier B.V. All rights reserved.
An expanded mammal mitogenome dataset from Southeast Asia.

PubMed

Mohd Salleh, Faezah; Ramos-Madrigal, Jazmín; Peñaloza, Fernando; Liu, Shanlin; Mikkel-Holger, S Sinding; Riddhi, P Patel; Martins, Renata; Lenz, Dorina; Fickel, Jörns; Roos, Christian; Shamsir, Mohd Shahir; Azman, Mohammad Shahfiz; Burton, K Lim; Stephen, J Rossiter; Wilting, Andreas; Gilbert, M Thomas P

2017-08-01

Southeast (SE) Asia is one of the most biodiverse regions in the world, and it holds approximately 20% of all mammal species. Despite this, the majority of SE Asia's genetic diversity is still poorly characterized. The growing interest in using environmental DNA to assess and monitor SE Asian species, in particular threatened mammals, has created an urgent need to expand the available reference database of mitochondrial barcode and complete mitogenome sequences. We have partially addressed this need by generating 72 new mitogenome sequences reconstructed from DNA isolated from a range of historical and modern tissue samples. Approximately 55 gigabases of raw sequence were generated. From these data, we assembled 72 complete mitogenome sequences, with an average depth of coverage of 102.9× and 55.2× for modern samples and historical samples, respectively. This dataset represents 52 species, of which 30 species had no previous mitogenome data available. The mitogenomes were geotagged to their sampling location, where known, to display a detailed geographical distribution of the species. Our new database of 52 taxa will strongly enhance the utility of environmental DNA approaches for monitoring mammals in SE Asia, as it greatly increases the likelihood that metabarcoding sequencing reads can be assigned to reference sequences. This increases the confidence in species detections and thus allows more robust surveys and monitoring programmes of SE Asia's threatened mammal biodiversity. The extensive collections of historical samples from SE Asia in western and SE Asian museums should serve as additional valuable material to further enrich this reference database. © The Author 2017. Published by Oxford University Press.
An evolutionary conserved pattern of 18S rRNA sequence complementarity to mRNA 5′ UTRs and its implications for eukaryotic gene translation regulation

PubMed Central

Pánek, Josef; Kolář, Michal; Vohradský, Jiří; Shivaya Valášek, Leoš

2013-01-01

There are several key mechanisms regulating eukaryotic gene expression at the level of protein synthesis. Interestingly, the least explored mechanisms of translational control are those that involve the translating ribosome per se, mediated for example via predicted interactions between the ribosomal RNAs (rRNAs) and mRNAs. Here, we took advantage of robustly growing large-scale data sets of mRNA sequences for numerous organisms, solved ribosomal structures and computational power to computationally explore the mRNA–rRNA complementarity that is statistically significant across the species. Our predictions reveal highly specific sequence complementarity of 18S rRNA sequences with mRNA 5′ untranslated regions (UTRs) forming a well-defined 3D pattern on the rRNA sequence of the 40S subunit. Broader evolutionary conservation of this pattern may imply that 5′ UTRs of eukaryotic mRNAs, which have already emerged from the mRNA-binding channel, may contact several complementary spots on 18S rRNA situated near the exit of the mRNA binding channel and on the middle-to-lower body of the solvent-exposed 40S ribosome including its left foot. We discuss physiological significance of this structurally conserved pattern and, in the context of previously published experimental results, propose that it modulates scanning of the 40S subunit through 5′ UTRs of mRNAs. PMID:23804757
The benefits of analysing complete mitochondrial genomes: Deep insights into the phylogeny and population structure of Echinococcus granulosus sensu lato genotypes G6 and G7.

PubMed

Laurimäe, Teivi; Kinkar, Liina; Romig, Thomas; Omer, Rihab A; Casulli, Adriano; Umhang, Gérald; Gasser, Robin B; Jabbar, Abdul; Sharbatkhori, Mitra; Mirhendi, Hossein; Ponce-Gordo, Francisco; Lazzarini, Lorena E; Soriano, Silvia V; Varcasia, Antonio; Nejad, Mohammad Rostami; Andresiuk, Vanessa; Maravilla, Pablo; González, Luis Miguel; Dybicz, Monika; Gawor, Jakub; Šarkūnas, Mindaugas; Šnábel, Viliam; Kuzmina, Tetiana; Saarma, Urmas

2018-06-12

Cystic echinococcosis (CE) is a zoonotic disease caused by the larval stage of the species complex Echinococcus granulosus sensu lato. Within this complex, genotypes G6 and G7 have been frequently associated with human CE worldwide. Previous studies exploring the genetic variability and phylogeography of genotypes G6 and G7 have been based on relatively short mtDNA sequences, and the resolution of these studies has often been low. Moreover, using short sequences, the distinction between G6 and G7 has in some cases remained challenging. The aim here was to sequence complete mitochondrial genomes (mitogenomes) to obtain deeper insight into the genetic diversity, phylogeny and population structure of genotypes G6 and G7. We sequenced complete mitogenomes of 94 samples collected from 15 different countries worldwide. The results demonstrated that (i) genotypes G6 and G7 can be clearly distinguished when mitogenome sequences are used; (ii) G7 is represented by two major haplogroups, G7a and G7b, the latter being specific to the islands of Corsica and Sardinia; (iii) intensive animal trade, but also geographical isolation, have likely had the largest impact on shaping the genetic structure and distribution of genotypes G6 and G7. In addition, we found a phylogenetically highly divergent haplotype from Mongolia (Gmon), which had a higher affinity to G6. Copyright © 2017. Published by Elsevier B.V.

Spectra library assisted de novo peptide sequencing for HCD and ETD spectra pairs.

PubMed

Yan, Yan; Zhang, Kaizhong

2016-12-23

De novo peptide sequencing via tandem mass spectrometry (MS/MS) has developed rapidly in recent years. With the use of spectra pairs from the same peptide under different fragmentation modes, the performance of de novo sequencing is greatly improved. Currently, with large numbers of spectra sequenced every day, spectra libraries containing tens of thousands of annotated experimental MS/MS spectra have become available. These libraries provide information on spectra properties and thus have the potential to be used with de novo sequencing to improve its performance. In this study, an improved de novo sequencing method assisted by a spectra library is proposed. It uses spectra libraries as training datasets and introduces significance scores for the features used in our previous de novo sequencing method for HCD and ETD spectra pairs. Two pairs of HCD and ETD spectral datasets were used to test the performance of the proposed method and our previous method. The results show that the proposed method achieves better sequencing accuracy, with higher-ranked correct sequences and less computational time. This paper proposes an advanced de novo sequencing method for HCD and ETD spectra pairs that uses information from spectra libraries and significantly improves on previous similar methods.
PET Imaging Stability Measurements During Simultaneous Pulsing of Aggressive MR Sequences on the SIGNA PET/MR System.

PubMed

Deller, Timothy W; Khalighi, Mohammad Mehdi; Jansen, Floris P; Glover, Gary H

2018-01-01

The recent introduction of simultaneous whole-body PET/MR scanners has enabled new research taking advantage of the complementary information obtainable with PET and MRI. One such application is kinetic modeling, which requires high levels of PET quantitative stability. To accomplish the required PET stability levels, the PET subsystem must be sufficiently isolated from the effects of MR activity. Performance measurements have previously been published, demonstrating sufficient PET stability in the presence of MR pulsing for typical clinical use; however, PET stability during radiofrequency (RF)-intensive and gradient-intensive sequences has not previously been evaluated for a clinical whole-body scanner. In this work, PET stability of the GE SIGNA PET/MR was examined during simultaneous scanning of aggressive MR pulse sequences. Methods: PET performance tests were acquired with MR idle and during simultaneous MR pulsing. Recent system improvements mitigating RF interference and gain variation were used. A fast recovery fast spin echo MR sequence was selected for high RF power, and an echo planar imaging sequence was selected for its high heat-inducing gradients. Measurements were performed to determine PET stability under varying MR conditions using the following metrics: sensitivity, scatter fraction, contrast recovery, uniformity, count rate performance, and image quantitation. A final PET quantitative stability assessment for simultaneous PET scanning during functional MRI studies was performed with a spiral in-and-out gradient echo sequence. Results: Quantitation stability of a 68Ge flood phantom was demonstrated within 0.34%. Normalized sensitivity was stable during simultaneous scanning within 0.3%. Scatter fraction measured with a 68Ge line source in the scatter phantom was stable within the range of 40.4%-40.6%. Contrast recovery and uniformity were comparable for PET images acquired simultaneously with multiple MR conditions. Peak noise equivalent count rate was 224 kcps at an effective activity concentration of 18.6 kBq/mL, and the count rate curves and scatter fraction curve were consistent for the alternating MR pulsing states. A final test demonstrated quantitative stability during a spiral functional MRI sequence. Conclusion: PET stability metrics demonstrated that PET quantitation was not affected during simultaneous aggressive MRI. This stability enables demanding applications such as kinetic modeling. © 2018 by the Society of Nuclear Medicine and Molecular Imaging.
Thermal control of low-pressure fractionation processes. [in basaltic magma solidification]

NASA Technical Reports Server (NTRS)

Usselman, T. M.; Hodge, D. S.

1978-01-01

Thermal models detailing the solidification paths for shallow basaltic magma chambers (both open and closed systems) were calculated using finite-difference techniques. The total solidification times for closed chambers are comparable to previously published calculations; however, the temperature-time paths are not. These paths are dependent on the phase relations and the crystallinity of the system, because both affect the manner in which the latent heat of crystallization is distributed. In open systems, where a chamber would be periodically replenished with additional parental liquid, calculations indicate a strong possibility that a steady-state temperature interval is achieved near a major phase boundary. In these cases it is straightforward to analyze fractionation models of the basaltic liquid evolution and their corresponding cumulate sequences. This steady thermal fractionating state can be invoked to explain large amounts of erupted basalts of similar composition over long time periods from the same volcanic center and some rhythmically layered basic cumulate sequences.
Early divergent strains of Yersinia pestis in Eurasia 5,000 years ago.

PubMed

Rasmussen, Simon; Allentoft, Morten Erik; Nielsen, Kasper; Orlando, Ludovic; Sikora, Martin; Sjögren, Karl-Göran; Pedersen, Anders Gorm; Schubert, Mikkel; Van Dam, Alex; Kapel, Christian Moliin Outzen; Nielsen, Henrik Bjørn; Brunak, Søren; Avetisyan, Pavel; Epimakhov, Andrey; Khalyapin, Mikhail Viktorovich; Gnuni, Artak; Kriiska, Aivar; Lasak, Irena; Metspalu, Mait; Moiseyev, Vyacheslav; Gromov, Andrei; Pokutta, Dalia; Saag, Lehti; Varul, Liivi; Yepiskoposyan, Levon; Sicheritz-Pontén, Thomas; Foley, Robert A; Lahr, Marta Mirazón; Nielsen, Rasmus; Kristiansen, Kristian; Willerslev, Eske

2015-10-22

The bacterium Yersinia pestis is the etiological agent of plague and has caused human pandemics with millions of deaths in historic times. How and when it originated remains contentious. Here, we report the oldest direct evidence of Yersinia pestis identified by ancient DNA in human teeth from Asia and Europe dating from 2,800 to 5,000 years ago. By sequencing the genomes, we find that these ancient plague strains are basal to all known Yersinia pestis. We find the origins of the Yersinia pestis lineage to be at least two times older than previous estimates. We also identify a temporal sequence of genetic changes that led to increased virulence and the emergence of the bubonic plague. Our results show that plague infection was endemic in the human populations of Eurasia at least 3,000 years before any historical recordings of pandemics. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
PatternLab for proteomics 4.0: A one-stop shop for analyzing shotgun proteomic data

PubMed Central

Carvalho, Paulo C; Lima, Diogo B; Leprevost, Felipe V; Santos, Marlon D M; Fischer, Juliana S G; Aquino, Priscila F; Moresco, James J; Yates, John R; Barbosa, Valmir C

2017-01-01

PatternLab for proteomics is an integrated computational environment that unifies several previously published modules for analyzing shotgun proteomic data. PatternLab contains modules for formatting sequence databases, performing peptide spectrum matching, statistically filtering and organizing shotgun proteomic data, extracting quantitative information from label-free and chemically labeled data, performing statistics for differential proteomics, displaying results in a variety of graphical formats, performing similarity-driven studies with de novo sequencing data, analyzing time-course experiments, and helping with the understanding of the biological significance of data in the light of the Gene Ontology. Here we describe PatternLab for proteomics 4.0, which closely knits together all of these modules in a self-contained environment, covering the principal aspects of proteomic data analysis as a freely available and easily installable software package. All updates to PatternLab, as well as all new features added to it, have been tested over the years on millions of mass spectra. PMID:26658470
Fluorescent Amplified-Fragment Length Polymorphism Genotyping of Neisseria meningitidis Identifies Clones Associated with Invasive Disease

PubMed Central

Goulding, Jonathan N.; Hookey, John V.; Stanley, John; Olver, Will; Neal, Keith R.; Ala'Aldeen, Dlawer A. A.; Arnold, Catherine

2000-01-01

Fluorescent amplified-fragment length polymorphism (FAFLP), a genotyping technique with phylogenetic significance, was applied to 123 isolates of Neisseria meningitidis. Nine of these were from an outbreak in a British university; 9 were from a recent outbreak in Pontypridd, Glamorgan; 15 were from sporadic cases of meningococcal disease; 26 were from the National Collection of Type Cultures; 58 were carrier isolates from Ironville, Derbyshire; 1 was a disease isolate from Ironville; and five were representatives of invasive clones of N. meningitidis. FAFLP analysis results were compared with previously published multilocus sequence typing (MLST) and pulsed-field gel electrophoresis (PFGE) results. FAFLP was able to identify hypervirulent, hyperendemic lineages (invasive clones) of N. meningitidis as well as did MLST. PFGE did not discriminate between two strains from the outbreak that were classified as similar but distinct by FAFLP. The results suggest that high resolution of N. meningitidis for outbreak and other epidemiological analyses is more cost efficient by FAFLP than by sequencing procedures. PMID:11101599
Two new species of Brueelia Kéler, 1936 (Ischnocera, Philopteridae) parasitic on Neotropical trogons (Aves, Trogoniformes).

PubMed

Valim, Michel P; Weckstein, Jason D

2011-01-01

Two new species of Brueelia are described and illustrated. These new species and their type hosts are: Brueelia sueta ex Pharomachrus pavoninus (Spix, 1824), the Pavonine Quetzal, and Brueelia cicchinoi ex Trogon viridis Linnaeus, the White-tailed Trogon. Both new species differ from the only Brueelia described on Trogon mexicanus by many morphological features, including those present in the male genitalia and female vulvar margin. Partial sequences of the mitochondrial cytochrome oxidase I (COI) gene for these two new species differ from one another by 13.6% uncorrected p-distance, whereas Brueelia cicchinoi is only 0.3% divergent from previously published COI sequences identified as Brueelia sp. from the Mexican Trogon melanocephalus Gould, 1936 and Trogon massena Gould, 1938. We also found Brueelia cicchinoi on Trogon melanurus, Trogon collaris and Pharomachrus pavoninus. Thus Brueelia cicchinoi is found on multiple trogoniform hosts across an extremely large geographic distribution and has one of the largest numbers of host associations among Brueelia species.
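The uncorrected p-distance quoted above is simply the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming the sequences are already aligned to equal length and that sites with gaps or ambiguous bases are skipped. The function name and toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous bases (an assumption of this sketch).
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Toy aligned fragments, not real COI data.
print(p_distance("ACGTACGTAC", "ACGTACGTAA"))
```

On this scale, the 13.6% reported between the two new species corresponds to a p-distance of 0.136.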
Koolpinyah and Yata viruses: two newly recognised ephemeroviruses from tropical regions of Australia and Africa.

PubMed

Blasdell, Kim R; Widen, Steven G; Diviney, Sinéad M; Firth, Cadhla; Wood, Thomas G; Guzman, Hilda; Holmes, Edward C; Tesh, Robert B; Vasilakis, Nikos; Walker, Peter J

2014-12-05

Koolpinyah virus (KOOLV) isolated from healthy Australian cattle and Yata virus (YATV) isolated from a pool of Mansonia uniformis mosquitoes in the Central African Republic have been tentatively identified as rhabdoviruses. KOOLV was shown previously to be related antigenically to kotonkon virus, an ephemerovirus that has caused an ephemeral fever-like illness in cattle in Nigeria, but YATV failed to react antigenically with any other virus tested. Here we report the complete genome sequences of KOOLV (16,133 nt) and YATV (14,479 nt). Each has a complex genome organisation, with multiple genes, including a second non-structural glycoprotein (GNS) gene and a viroporin (α1) gene, between the G and L genes as is characteristic of ephemeroviruses. Based on an analysis of genome organisation, sequence identity and cross-neutralisation, we demonstrate that both KOOLV and YATV should be classified as two new species in the genus Ephemerovirus. Crown Copyright © 2014. Published by Elsevier B.V. All rights reserved.
A new phylogeny-based tribal classification of subfamily Detarioideae, an early branching clade of florally diverse tropical arborescent legumes.

PubMed

de la Estrella, Manuel; Forest, Félix; Klitgård, Bente; Lewis, Gwilym P; Mackinder, Barbara A; de Queiroz, Luciano P; Wieringa, Jan J; Bruneau, Anne

2018-05-02

Detarioideae (81 genera, c. 760 species) is one of the six Leguminosae subfamilies recently reinstated by the Legume Phylogeny Working Group. This subfamily displays high morphological variability and is one of the early branching clades in the evolution of legumes. Using previously published and newly generated sequences from four loci (matK-trnK, rpL16, trnG-trnG2G and ITS), we develop a new densely sampled phylogeny to assess generic relationships and tribal delimitations within Detarioideae. The ITS phylogenetic trees are poorly resolved, but the plastid data recover several strongly supported clades, which also are supported in a concatenated plastid + ITS sequence analysis. We propose a new phylogeny-based tribal classification for Detarioideae that includes six tribes: re-circumscribed Detarieae and Amherstieae, and the four new tribes Afzelieae, Barnebydendreae, Saraceae and Schotieae. An identification key and descriptions for each of the tribes are also provided.
Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome.

PubMed

Bickhart, Derek M; Rosen, Benjamin D; Koren, Sergey; Sayre, Brian L; Hastie, Alex R; Chan, Saki; Lee, Joyce; Lam, Ernest T; Liachko, Ivan; Sullivan, Shawn T; Burton, Joshua N; Huson, Heather J; Nystrom, John C; Kelley, Christy M; Hutchison, Jana L; Zhou, Yang; Sun, Jiajie; Crisà, Alessandra; Ponce de León, F Abel; Schwartz, John C; Hammond, John A; Waldbieser, Geoffrey C; Schroeder, Steven G; Liu, George E; Dunham, Maitreya J; Shendure, Jay; Sonstegard, Tad S; Phillippy, Adam M; Van Tassell, Curtis P; Smith, Timothy P L

2017-04-01

The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms has resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate current state of the art for de novo assembly using the domestic goat (Capra hircus) based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ∼400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.
Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome

PubMed Central

Bickhart, Derek M.; Rosen, Benjamin D.; Koren, Sergey; Sayre, Brian L.; Hastie, Alex R.; Chan, Saki; Lee, Joyce; Lam, Ernest T.; Liachko, Ivan; Sullivan, Shawn T.; Burton, Joshua N.; Huson, Heather J.; Nystrom, John C.; Kelley, Christy M.; Hutchison, Jana L.; Zhou, Yang; Sun, Jiajie; Crisà, Alessandra; de León, F. Abel Ponce; Schwartz, John C.; Hammond, John A.; Waldbieser, Geoffrey C.; Schroeder, Steven G.; Liu, George E.; Dunham, Maitreya J.; Shendure, Jay; Sonstegard, Tad S.; Phillippy, Adam M.; Van Tassell, Curtis P.; Smith, Timothy P.L.

2018-01-01

The decrease in sequencing cost and increased sophistication of assembly algorithms for short-read platforms has resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate current state of the art for de novo assembly using the domestic goat (Capra hircus), based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ~400-fold improvement in continuity due to properly assembled gaps compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex ever produced for an individual of a ruminant species. PMID:28263316
An unusual osteomyelitis caused by Moraxella osloensis: A case report.

PubMed

Alkhatib, Nidal J; Younis, Manaf H; Alobaidi, Ahmad S; Shaath, Nebal M

2017-01-01

Moraxella osloensis is a gram-negative coccobacillus that is saprophytic on skin and mucosa and rarely causes human infections. Reported cases of human infections usually occur in immunocompromised patients. We report the second case of M. osloensis osteomyelitis in the literature, occurring in a young healthy man. The organism was identified by sequencing analysis of the 16S ribosomal RNA gene. Our patient was treated successfully with surgical debridement and intravenous third-generation cephalosporins. M. osloensis has rarely been reported to cause local or invasive infections. Our case is the second in the literature and differs from the previously reported case in that our patient had no chronic medical problems and no history of trauma, and had a unique presentation and features on MRI and intraoperative findings. Proper diagnosis is essential for appropriate treatment of osteomyelitis. RNA gene sequence analysis is the primary method of M. osloensis diagnosis. M. osloensis is usually susceptible to simple antibiotics. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat.

PubMed

Atanur, Santosh S; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R; Kaisaki, Pamela J; Otto, Georg W; Ma, Man Chun John; Keane, Thomas M; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J

2013-08-01

Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and insulin resistance, along with their respective control strains. Altogether, we identified more than 13 million single-nucleotide variants, indels, and structural variants across these rat strains. Analysis of strain-specific selective sweeps and gene clusters implicated genes and pathways involved in cation transport, angiotensin production, and regulators of oxidative stress in the development of cardiovascular disease phenotypes in rats. Many of the rat loci that we identified overlap with previously mapped loci for related traits in humans, indicating the presence of shared pathways underlying these phenotypes in rats and humans. These data represent a step change in resources available for evolutionary analysis of complex traits in disease models. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
Tilted pillar array fabrication by the combination of proton beam writing and soft lithography for microfluidic cell capture Part 2: Image sequence analysis based evaluation and biological application.

PubMed

Járvás, Gábor; Varga, Tamás; Szigeti, Márton; Hajba, László; Fürjes, Péter; Rajta, István; Guttman, András

2018-02-01

As a continuation of our previously published work, this paper presents a detailed evaluation of a microfabricated cell capture device utilizing a doubly tilted micropillar array. The device was fabricated using a novel hybrid technology based on the combination of proton beam writing and conventional lithography techniques. Tilted pillars offer unique flow characteristics and support enhanced fluidic interaction for improved immunoaffinity based cell capture. The performance of the microdevice was evaluated by an image sequence analysis based in-house developed single-cell tracking system. Individual cell tracking allowed in-depth analysis of the cell-chip surface interaction mechanism from hydrodynamic point of view. Simulation results were validated by using the hybrid device and the optimized surface functionalization procedure. Finally, the cell capture capability of this new generation microdevice was demonstrated by efficiently arresting cells from a HT29 cell-line suspension. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Combined analysis of fourteen nuclear genes refines the Ursidae phylogeny.

PubMed

Pagès, Marie; Calvignac, Sébastien; Klein, Catherine; Paris, Mathilde; Hughes, Sandrine; Hänni, Catherine

2008-04-01

Despite numerous studies, questions remain about the evolutionary history of Ursidae and additional independent genetic markers were needed to elucidate these ambiguities. For this purpose, we sequenced ten nuclear genes for all the eight extant bear species. By combining these new sequences with those of four other recently published nuclear markers, we provide new insights into the phylogenetic relationships of the Ursidae family members. The hypothesis that the giant panda was the first species to diverge among ursids is definitively confirmed and the precise branching order within the Ursus genus is clarified for the first time. Moreover, our analyses indicate that the American and the Asiatic black bears do not cluster as sister taxa, as had been previously hypothesised. Sun and sloth bears clearly appear as the most basal ursine species but uncertainties about their exact relationships remain. Since our larger dataset did not enable us to clarify this last question, identifying rare genomic changes in bear genomes could be a promising solution for further studies.
Computer ranking of the sequence of appearance of 73 features of the brain and related structures in staged human embryos during the sixth week of development.

PubMed

O'Rahilly, R; Müller, F; Hutchins, G M; Moore, G W

1987-09-01

The sequence of events in the development of the brain in human embryos, already published for stages 8-15, is here continued for stages 16 and 17. With the aid of a computerized bubble-sort algorithm, 71 individual embryos were ranked in ascending order of the features present. Whereas these numbered 100 in the previous study, the increasing structural complexity gave 27 new features in the two stages now under investigation. The chief characteristics of stage 16 (approximately 37 postovulatory days) are protruding basal nuclei, the caudal olfactory elevation (olfactory tubercle), the tectobulbar tracts, and ascending fibers to the cerebellum. The main features of stage 17 (approximately 41 postovulatory days) are the cortical nucleus of the amygdaloid body, an intermediate layer in the tectum mesencephali, the posterior commissure, and the habenulo-interpeduncular tract. In addition, a typical feature at stage 17 is the crescentic shape of the lens cavity.
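The ranking step described in this abstract can be illustrated with a minimal Python sketch. It assumes that embryos are ordered by the number of recorded features present, which the abstract implies but does not state in detail; the embryo identifiers and feature sets below are hypothetical and are not data from the study.

```python
def bubble_sort_by_feature_count(embryos):
    """Return embryo IDs in ascending order of the number of features present.

    `embryos` maps a (hypothetical) embryo ID to the set of features observed.
    The sort key, feature count, is an assumption made for illustration.
    """
    ranked = list(embryos.items())
    n = len(ranked)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i entries are already in their final place.
        for j in range(n - 1 - i):
            if len(ranked[j][1]) > len(ranked[j + 1][1]):
                ranked[j], ranked[j + 1] = ranked[j + 1], ranked[j]
                swapped = True
        if not swapped:  # no swaps means the list is already sorted
            break
    return [embryo_id for embryo_id, _ in ranked]


# Hypothetical example records, using feature names mentioned in the abstract.
embryos = {
    "E1": {"posterior commissure", "tectobulbar tracts", "olfactory tubercle"},
    "E2": {"tectobulbar tracts"},
    "E3": {"tectobulbar tracts", "olfactory tubercle"},
}
ranking = bubble_sort_by_feature_count(embryos)
```

Bubble sort is stable, so embryos with the same feature count keep their input order in this sketch.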
Identification of novel isoprene synthases through genome mining and expression in Escherichia coli.

PubMed

Ilmén, Marja; Oja, Merja; Huuskonen, Anne; Lee, Sangmin; Ruohonen, Laura; Jung, Simon

2015-09-01

Isoprene is a naturally produced hydrocarbon emitted into the atmosphere by green plants. It is also a constituent of synthetic rubber and a potential biofuel. Microbial production of isoprene can become a sustainable alternative to the prevailing chemical production of isoprene from petroleum. In this work, sequence homology searches were conducted to find novel isoprene synthases. Candidate sequences were functionally expressed in Escherichia coli and the desired enzymes were identified based on an isoprene production assay. The activity of three enzymes was shown for the first time: expression of the candidate genes from Ipomoea batatas, Mangifera indica, and Elaeocarpus photiniifolius resulted in isoprene formation. The Ipomoea batatas isoprene synthase produced the highest amounts of isoprene in all experiments, exceeding the isoprene levels obtained by the previously known Populus alba and Pueraria montana isoprene synthases that were studied in parallel as controls. Copyright © 2015 International Metabolic Engineering Society. Published by Elsevier Inc. All rights reserved.
Koi herpesvirus represents a third cyprinid herpesvirus (CyHV-3) in the family Herpesviridae.

PubMed

Waltzek, Thomas B; Kelley, Garry O; Stone, David M; Way, Keith; Hanson, Larry; Fukuda, Hideo; Hirono, Ikuo; Aoki, Takashi; Davison, Andrew J; Hedrick, Ronald P

2005-06-01

The sequences of four complete genes were analysed in order to determine the relatedness of koi herpesvirus (KHV) to three fish viruses in the family Herpesviridae: carp pox herpesvirus (Cyprinid herpesvirus 1, CyHV-1), haematopoietic necrosis herpesvirus of goldfish (Cyprinid herpesvirus 2, CyHV-2) and channel catfish virus (Ictalurid herpesvirus 1, IcHV-1). The genes were predicted to encode a helicase, an intercapsomeric triplex protein, the DNA polymerase and the major capsid protein. The results showed that KHV is related closely to CyHV-1 and CyHV-2, and that the three cyprinid viruses are related, albeit more distantly, to IcHV-1. Twelve KHV isolates from four diverse geographical areas yielded identical sequences for a region of the DNA polymerase gene. These findings, with previously published morphological and biological data, indicate that KHV should join the group of related lower-vertebrate viruses in the family Herpesviridae under the formal designation Cyprinid herpesvirus 3 (CyHV-3).
Absolute dimensions and masses of eclipsing binaries. V. IQ Persei

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lacy, C.H.; Frueh, M.L.

1985-08-01

New photometric and spectroscopic observations of the 1.7 day eclipsing binary IQ Persei (B8 + A6) have been analyzed to yield very accurate fundamental properties of the system. Reticon spectroscopic observations obtained at McDonald Observatory were used to determine accurate radial velocities of both stars in this slightly eccentric large light-ratio binary. A new set of VR light curves obtained at McDonald Observatory were analyzed by synthesis techniques, and previously published UBV light curves were reanalyzed to yield accurate photometric orbits. Orbital parameters derived from both sets of photometric observations are in excellent agreement. The absolute dimensions, masses, luminosities, and apsidal motion period (140 yr) derived from these observations agree well with the predictions of theoretical stellar evolution models. The A6 secondary is still very close to the zero-age main sequence. The B8 primary is about one-third of the way through its main-sequence evolution. 27 references.
QSRA: a quality-value guided de novo short read assembler.

PubMed

Bryant, Douglas W; Wong, Weng-Keen; Mockler, Todd C

2009-02-24

New rapid high-throughput sequencing technologies have sparked the creation of a new class of assembler. Since all high-throughput sequencing platforms incorporate errors in their output, short-read assemblers must be designed to account for this error while utilizing all available data. We have designed and implemented an assembler, Quality-value guided Short Read Assembler, created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality. QSRA generally produced the highest genomic coverage, while being faster than VCAKE. QSRA is extremely competitive in its longest contig and N50/N80 contig lengths, producing results of similar quality to those of EDENA and VELVET. QSRA provides a step closer to the goal of de novo assembly of complex genomes, improving upon the original VCAKE algorithm by not only drastically reducing runtimes but also increasing the viability of the assembly algorithm through further error handling capabilities.
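The N50/N80 contig lengths mentioned in this abstract are standard assembly-contiguity statistics. A minimal Python sketch of the usual definition follows (the length of the contig at which the cumulative length of contigs, taken longest first, reaches the given percentage of the total assembly length). The function name and contig lengths are hypothetical and are not taken from QSRA.

```python
def nxx(contig_lengths, percent):
    """Return the Nxx contig length (for example, percent=50 gives N50).

    Contigs are visited from longest to shortest; the result is the length of
    the contig at which the running total first reaches `percent` percent of
    the summed contig length. Integer arithmetic avoids rounding surprises.
    """
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths (bp) for illustration only.
contigs = [100, 80, 60, 40, 20]
n50 = nxx(contigs, 50)
n80 = nxx(contigs, 80)
```

With these example lengths the total is 300 bp, so N50 is reached at the second-longest contig and N80 at the third.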

Using single nuclei for RNA-seq to capture the transcriptome of postmortem neurons

PubMed Central

Krishnaswami, Suguna Rani; Grindberg, Rashel V; Novotny, Mark; Venepally, Pratap; Lacar, Benjamin; Bhutani, Kunal; Linker, Sara B; Pham, Son; Erwin, Jennifer A; Miller, Jeremy A; Hodge, Rebecca; McCarthy, James K; Kelder, Martin; McCorrison, Jamison; Aevermann, Brian D; Fuertes, Francisco Diez; Scheuermann, Richard H; Lee, Jun; Lein, Ed S; Schork, Nicholas; McConnell, Michael J; Gage, Fred H; Lasken, Roger S

2016-01-01

A protocol is described for sequencing the transcriptome of a cell nucleus. Nuclei are isolated from specimens and sorted by FACS, cDNA libraries are constructed and RNA-seq is performed, followed by data analysis. Some steps follow published methods (Smart-seq2 for cDNA synthesis and Nextera XT barcoded library preparation) and are not described in detail here. Previous single-cell approaches for RNA-seq from tissues include cell dissociation using protease treatment at 30 °C, which is known to alter the transcriptome. We isolate nuclei at 4 °C from tissue homogenates, which cause minimal damage. Nuclear transcriptomes can be obtained from postmortem human brain tissue stored at −80 °C, making brain archives accessible for RNA-seq from individual neurons. The method also allows investigation of biological features unique to nuclei, such as enrichment of certain transcripts and precursors of some noncoding RNAs. By following this procedure, it takes about 4 d to construct cDNA libraries that are ready for sequencing. PMID:26890679
Bacteriophage P23-77 Capsid Protein Structures Reveal the Archetype of an Ancient Branch from a Major Virus Lineage

PubMed Central

Rissanen, Ilona; Grimes, Jonathan M.; Pawlowski, Alice; Mäntynen, Sari; Harlos, Karl; Bamford, Jaana K.H.; Stuart, David I.

2013-01-01

Summary: It has proved difficult to classify viruses unless they are closely related since their rapid evolution hinders detection of remote evolutionary relationships in their genetic sequences. However, structure varies more slowly than sequence, allowing deeper evolutionary relationships to be detected. Bacteriophage P23-77 is an example of a newly identified viral lineage, with members inhabiting extreme environments. We have solved multiple crystal structures of the major capsid proteins VP16 and VP17 of bacteriophage P23-77. They fit the 14 Å resolution cryo-electron microscopy reconstruction of the entire virus exquisitely well, allowing us to propose a model for both the capsid architecture and viral assembly, quite different from previously published models. The structures of the capsid proteins and their mode of association to form the viral capsid suggest that the P23-77-like and adeno-PRD1 lineages of viruses share an extremely ancient common ancestor. PMID:23623731
Evolution and Diversity of Transposable Elements in Vertebrate Genomes.

PubMed

Sotero-Caio, Cibele G; Platt, Roy N; Suh, Alexander; Ray, David A

2017-01-01

Transposable elements (TEs) are selfish genetic elements that mobilize in genomes via transposition or retrotransposition and often make up large fractions of vertebrate genomes. Here, we review the current understanding of vertebrate TE diversity and evolution in the context of recent advances in genome sequencing and assembly techniques. TEs make up 4-60% of assembled vertebrate genomes, and deeply branching lineages such as ray-finned fishes and amphibians generally exhibit a higher TE diversity than the more recent radiations of birds and mammals. Furthermore, the list of taxa with exceptional TE landscapes is growing. We emphasize that the current bottleneck in genome analyses lies in the proper annotation of TEs and provide examples where superficial analyses led to misleading conclusions about genome evolution. Finally, recent advances in long-read sequencing will soon permit access to TE-rich genomic regions that previously resisted assembly including the gigantic, TE-rich genomes of salamanders and lungfishes. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

PubMed

Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

2016-07-08

The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also gives the possibility to use transmembrane proteins (TMPs) reference databases to allow even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Recorded interactive seminars and follow-up discussions as an effective method for distance learning.

PubMed

Miller, Kenneth T; Hannum, Wallace M; Proffit, William R

2011-03-01

Previous studies have suggested that, although orthodontic residents prefer to be live and interactive in a seminar, they learn almost as much when watching a previously recorded interactive seminar and following up with live discussion. Our objective was to test the effectiveness and acceptability of using previously recorded interactive seminars and different types of live follow-up discussions. Residents at schools participating from a distance completed preseminar readings and at their convenience watched streaming video of some or all recordings of 4 interactive seminar sequences consisting of 6 seminars each. Afterward, distant residents participated in 1 of 4 types of interaction: local follow-up discussion, videoconference, teleconference, and no discussion. The effectiveness of the seminar sequences was tested by pretest and posttest scores. Acceptability was evaluated from ratings of aspects of the seminar and discussion experience. Open-ended questions allowed residents to express what they liked and to suggest changes in their experiences. In each seminar sequence, test scores of schools participating through recordings and follow-up discussions improved more than those participating live and interactive. After viewing, residents preferred local follow-up discussion, which was not statistically different from participating live and interactive both locally and from a distance. Videoconference and teleconference discussions were both more acceptable to residents than no follow-up discussion, which was found to be significantly below all methods tested. When residents are live and interactive in a seminar, there does not appear to be a significant difference between being local vs at a distance. Recorded interactive seminars with follow-up discussions are also an effective and acceptable method of distance learning. Residents preferred local follow-up discussion, but, at a distance, they preferred videoconference to both teleconference and no discussion. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
Whole Genome Complete Resequencing of Bacillus subtilis Natto by Combining Long Reads with High-Quality Short Reads

PubMed Central

Kamada, Mayumi; Hase, Sumitaka; Sato, Kengo; Toyoda, Atsushi; Fujiyama, Asao; Sakakibara, Yasubumi

2014-01-01

De novo microbial genome sequencing reached a turning point with third-generation sequencing (TGS) platforms, and several microbial genomes have been improved by TGS long reads. Bacillus subtilis natto is closely related to the laboratory standard strain B. subtilis Marburg 168, and it has a function in the production of the traditional Japanese fermented food “natto.” The B. subtilis natto BEST195 genome was previously sequenced with short reads, but it included some incomplete regions. We resequenced the BEST195 genome using a PacBio RS sequencer, and we successfully obtained a complete genome sequence from one scaffold without any gaps, and we also applied Illumina MiSeq short reads to enhance quality. Compared with the previous BEST195 draft genome and Marburg 168 genome, we found that incomplete regions in the previous genome sequence were attributed to GC-bias and repetitive sequences, and we also identified some novel genes that are found only in the new genome. PMID:25329997
A High-Coverage Yersinia pestis Genome from a Sixth-Century Justinianic Plague Victim.

PubMed

Feldman, Michal; Harbeck, Michaela; Keller, Marcel; Spyrou, Maria A; Rott, Andreas; Trautmann, Bernd; Scholz, Holger C; Päffgen, Bernd; Peters, Joris; McCormick, Michael; Bos, Kirsten; Herbig, Alexander; Krause, Johannes

2016-11-01

The Justinianic Plague, which started in the sixth century and lasted to the mid eighth century, is thought to be the first of three historically documented plague pandemics causing massive casualties. Historical accounts and molecular data suggest the bacterium Yersinia pestis as its etiological agent. Here we present a new high-coverage (17.9-fold) Y. pestis genome obtained from a sixth-century skeleton recovered from a southern German burial site close to Munich. The reconstructed genome enabled the detection of 30 unique substitutions as well as structural differences that have not been previously described. We report indels affecting a lacI family transcription regulator gene as well as nonsynonymous substitutions in the nrdE, fadJ, and pcp genes, that have been suggested as plague virulence determinants or have been shown to be upregulated in different models of plague infection. In addition, we identify 19 false positive substitutions in a previously published lower-coverage Y. pestis genome from another archaeological site of the same time period and geographical region that is otherwise genetically identical to the high-coverage genome sequence reported here, suggesting low-genetic diversity of the plague during the sixth century in rural southern Germany. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Epidemiological characterization of a nosocomial outbreak of extended spectrum β-lactamase Escherichia coli ST-131 confirms the clinical value of core genome multilocus sequence typing.

PubMed

Woksepp, Hanna; Ryberg, Anna; Berglind, Linda; Schön, Thomas; Söderman, Jan

2017-12-01

Enhanced precision of epidemiological typing in clinically suspected nosocomial outbreaks is crucial. Our aim was to investigate whether single nucleotide polymorphism (SNP) analysis and core genome (cg) multilocus sequence typing (MLST) of whole genome sequencing (WGS) data would more reliably identify a nosocomial outbreak, compared to earlier molecular typing methods. Sixteen isolates from a nosocomial outbreak of ESBL E. coli ST-131 in southeastern Sweden and three control strains were subjected to WGS. Sequences were explored by SNP analysis and cgMLST. cgMLST clearly differentiated between the outbreak isolates and the control isolates (>1400 differences). All clinically identified outbreak isolates showed close clustering (≤2 allele differences), except for two isolates (>50 allele differences). These data confirmed that the isolates with >50 differing genes did not belong to the nosocomial outbreak. The number of SNPs within the outbreak was ≤7, whereas the two discrepant isolates had >700 SNPs. Two of the ESBL E. coli ST-131 isolates did not belong to the clinically identified outbreak. Our results illustrate the power of WGS in terms of resolution, which may avoid overestimation of patients belonging to outbreaks as judged from epidemiological data and previously employed molecular methods with lower discriminatory ability. © 2017 APMIS. Published by John Wiley & Sons Ltd.
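The allele-difference counts quoted in this abstract compare cgMLST profiles locus by locus. The Python sketch below shows one common way to count such differences, assuming a profile is a mapping from locus name to allele number and that loci missing in either isolate are skipped. The locus names, allele numbers, and missing-data handling are hypothetical and are not taken from the study's pipeline.

```python
def allele_differences(profile_a, profile_b):
    """Count loci at which two cgMLST allele profiles differ.

    Each profile maps a locus name to an allele number, or to None when the
    locus could not be called. Loci absent or uncalled in either profile are
    ignored (an assumption; other tools treat missing data differently).
    """
    shared_loci = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared_loci
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


# Hypothetical four-locus profiles for illustration only.
outbreak_1 = {"locus1": 1, "locus2": 4, "locus3": 7, "locus4": 2}
outbreak_2 = {"locus1": 1, "locus2": 4, "locus3": 8, "locus4": 2}
control = {"locus1": 3, "locus2": 5, "locus3": 9, "locus4": None}

within_outbreak = allele_differences(outbreak_1, outbreak_2)
versus_control = allele_differences(outbreak_1, control)
```

A pairwise matrix of such counts is what a clustering threshold (for example, a small number of allele differences within an outbreak) would be applied to.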
Somatic mosaicism of a CDKL5 mutation identified by next-generation sequencing.

PubMed

Kato, Takeshi; Morisada, Naoya; Nagase, Hiroaki; Nishiyama, Masahiro; Toyoshima, Daisaku; Nakagawa, Taku; Maruyama, Azusa; Fu, Xue Jun; Nozu, Kandai; Wada, Hiroko; Takada, Satoshi; Iijima, Kazumoto

2015-10-01

CDKL5-related encephalopathy is an X-linked dominantly inherited disorder that is characterized by early infantile epileptic encephalopathy or atypical Rett syndrome. We describe a 5-year-old Japanese boy with intractable epilepsy, severe developmental delay, and Rett syndrome-like features. Onset was at 2 months, when his electroencephalogram showed sporadic single poly spikes and diffuse irregular poly spikes. We conducted a genetic analysis using an Illumina® TruSight™ One sequencing panel on a next-generation sequencer. We identified two epilepsy-associated single nucleotide variants in our case: CDKL5 p.Ala40Val and KCNQ2 p.Glu515Asp. CDKL5 p.Ala40Val has been previously reported to be responsible for early infantile epileptic encephalopathy. In our case, the CDKL5 heterozygous mutation showed somatic mosaicism because the boy's karyotype was 46,XY. The KCNQ2 variant p.Glu515Asp is known to cause benign familial neonatal seizures-1, and this variant showed paternal inheritance. Although we believe that the somatic mosaic CDKL5 mutation is mainly responsible for the neurological phenotype in the patient, the KCNQ2 variant might have some neurological effect. Genetic analysis by next-generation sequencing is capable of identifying multiple variants in a patient. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
IDEPI: rapid prediction of HIV-1 antibody epitopes and other phenotypic features from sequence data using a flexible machine learning platform.

PubMed

Hepler, N Lance; Scheffler, Konrad; Weaver, Steven; Murrell, Ben; Richman, Douglas D; Burton, Dennis R; Poignard, Pascal; Smith, Davey M; Kosakovsky Pond, Sergei L

2014-09-01

Since its identification in 1983, HIV-1 has been the focus of a research effort unprecedented in scope and difficulty, whose ultimate goals--a cure and a vaccine--remain elusive. One of the fundamental challenges in accomplishing these goals is the tremendous genetic variability of the virus, with some genes differing at as many as 40% of nucleotide positions among circulating strains. Because of this, the genetic bases of many viral phenotypes, most notably the susceptibility to neutralization by a particular antibody, are difficult to identify computationally. Drawing upon open-source general-purpose machine learning algorithms and libraries, we have developed a software package IDEPI (IDentify EPItopes) for learning genotype-to-phenotype predictive models from sequences with known phenotypes. IDEPI can apply learned models to classify sequences of unknown phenotypes, and also identify specific sequence features which contribute to a particular phenotype. We demonstrate that IDEPI achieves performance similar to or better than that of previously published approaches on four well-studied problems: finding the epitopes of broadly neutralizing antibodies (bNab), determining coreceptor tropism of the virus, identifying compartment-specific genetic signatures of the virus, and deducing drug-resistance associated mutations. The cross-platform Python source code (released under the GPL 3.0 license), documentation, issue tracking, and a pre-configured virtual machine for IDEPI can be found at https://github.com/veg/idepi.
Determination of disease phenotypes and pathogenic variants from exome sequence data in the CAGI 4 gene panel challenge.

PubMed

Kundu, Kunal; Pal, Lipika R; Yin, Yizhou; Moult, John

2017-09-01

The use of gene panel sequence for diagnostic and prognostic testing is now widespread, but there are so far few objective tests of methods to interpret these data. We describe the design and implementation of a gene panel sequencing data analysis pipeline (VarP) and its assessment in a CAGI4 community experiment. The method was applied to clinical gene panel sequencing data of 106 patients, with the goal of determining which of 14 disease classes each patient has and the corresponding causative variant(s). The disease class was correctly identified for 36 cases, including 10 where the original clinical pipeline did not find causative variants. For a further seven cases, we found strong evidence of an alternative disease to that tested. Many of the potentially causative variants are missense, with no previous association with disease, and these proved the hardest to correctly assign pathogenicity or otherwise. Post analysis showed that three-dimensional structure data could have helped for up to half of these cases. Over-reliance on HGMD annotation led to a number of incorrect disease assignments. We used a largely ad hoc method to assign probabilities of pathogenicity for each variant, and there is much work still to be done in this area. © 2017 The Authors. Human Mutation published by Wiley Periodicals, Inc.
A simple, rapid, high-fidelity and cost-effective PCR-based two-step DNA synthesis method for long gene sequences.

PubMed

Xiong, Ai-Sheng; Yao, Quan-Hong; Peng, Ri-He; Li, Xian; Fan, Hui-Qin; Cheng, Zong-Ming; Li, Yi

2004-07-07

Chemical synthesis of DNA sequences provides a powerful tool for modifying genes and for studying gene function, structure and expression. Here, we report a simple, high-fidelity and cost-effective PCR-based two-step DNA synthesis (PTDS) method for synthesis of long segments of DNA. The method involves two steps. (i) Synthesis of individual fragments of the DNA of interest: ten to twelve 60mer oligonucleotides with 20 bp overlap are mixed and a PCR reaction is carried out with high-fidelity DNA polymerase Pfu to produce DNA fragments that are approximately 500 bp in length. (ii) Synthesis of the entire sequence of the DNA of interest: five to ten PCR products from the first step are combined and used as the template for a second PCR reaction using high-fidelity DNA polymerase pyrobest, with the two outermost oligonucleotides as primers. Compared with the previously published methods, the PTDS method is rapid (5-7 days) and suitable for synthesizing long segments of DNA (5-6 kb) with high G + C contents, repetitive sequences or complex secondary structures. Thus, the PTDS method provides an alternative tool for synthesizing and assembling long genes with complex structures. Using the newly developed PTDS method, we have successfully obtained several genes of interest with sizes ranging from 1.0 to 5.4 kb.
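The first PTDS step, as described, tiles a fragment with 60mer oligonucleotides that overlap by 20 bp, so each additional oligonucleotide contributes 40 new bases and twelve of them span 60 + 11 × 40 = 500 bp. The Python sketch below illustrates only this tiling arithmetic. It ignores strand orientation, melting temperature, and any other design criteria the authors may have used, and the function name and example sequence are hypothetical.

```python
def tile_oligos(sequence, oligo_len=60, overlap=20):
    """Split `sequence` into overlapping oligonucleotides along one strand.

    Consecutive oligos share `overlap` bases. This is a simplified sketch:
    real assembly designs typically also choose strands and balance annealing
    temperatures, which is not modeled here.
    """
    step = oligo_len - overlap
    if step <= 0:
        raise ValueError("overlap must be shorter than the oligo length")
    oligos = []
    start = 0
    while True:
        oligos.append(sequence[start:start + oligo_len])
        if start + oligo_len >= len(sequence):
            break  # the last oligo reaches the end of the fragment
        start += step
    return oligos


# Hypothetical 500 bp fragment for illustration only.
fragment = "ACGT" * 125
oligos = tile_oligos(fragment)
```

For fragment lengths that do not fit the 40 bp step exactly, the last oligonucleotide in this sketch is simply shorter than 60 bases.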
Combination of cytochrome b heteroduplex-assay and sequencing for identification of triatomine blood meals.

PubMed

Buitrago, Rosio; Depickère, Stéphanie; Bosseno, Marie-France; Patzi, Edda Siñani; Waleckx, Etienne; Salas, Renata; Aliaga, Claudia; Brenière, Simone Frédérique

2012-01-01

The identification of blood meals in vectors contributes greatly to the understanding of interactions between vectors, microorganisms and hosts. The aim of the current work was to complement the validation of cytochrome b (Cytb) heteroduplex assay (HDA) previously described, and to add the sequencing of the Cytb gene of some samples for the identification of blood meals in triatomines. Experimental feedings of reared triatomines helped to clarify the sensitivity of the HDA. Moreover, the sequencing coupled with the HDA, allowed the assessment of the technique's taxonomic level of discrimination. The primers used to produce DNA fragments of Cytb genes for HDA had a very high sensitivity for vertebrate DNAs, rather similar for mammals, birds and reptiles. However, the formation of heteroduplex depended on blood meal's quality rather than its quantity; a correlation was observed between blood meals' color and the positivity of HDA. HDA electrophoresis profiles were reproducible, and allowed the discrimination of blood origins at the species level. However, in some cases, intraspecific variability of Cytb gene generated different HDA profiles. The HDA based on comparison of electrophoresis profiles is a very useful tool for screening large samples to determine blood origins; the subsequent sequencing of PCR products of Cytb corresponding to different HDA profiles allowed the identification of species whatever the biotope in which the vectors were captured. Copyright © 2011. Published by Elsevier B.V.
Molecular characterization of amino acid deletion in VP1 (1D) protein and novel amino acid substitutions in 3D polymerase protein of foot and mouth disease virus subtype A/Iran87.

PubMed

Esmaelizad, Majid; Jelokhani-Niaraki, Saber; Hashemnejad, Khadije; Kamalzadeh, Morteza; Lotfi, Mohsen

2011-12-01

The nucleotide sequence of the VP1 (1D) and partial 3D polymerase (3D(pol)) coding regions of the foot and mouth disease virus (FMDV) vaccine strain A/Iran87, a highly passaged isolate (~150 passages), was determined and aligned with previously published FMDV serotype A sequences. Overall analysis of the amino acid substitutions revealed that the partial 3D(pol) coding region contained four amino acid alterations. Amino acid sequence comparison of the VP1 coding region of the field isolates revealed deletions in the highly passaged Iranian isolate (A/Iran87). The prominent G-H loop of the FMDV VP1 protein contains the conserved arginine-glycine-aspartic acid (RGD) tripeptide, which is a well-known ligand for a specific cell surface integrin. Despite losing the RGD sequence of the VP1 protein and an Asp(26)→Glu substitution in a beta sheet located within a small groove of the 3D(pol) protein, the virus grew in BHK 21 suspension cell cultures. Since this strain has been used as a vaccine strain, it may be inferred that the RGD deletion has no critical role in virus attachment to the cell during the initiation of infection. It is probable that this FMDV subtype can utilize other pathways for cell attachment.
Privacy-Preserving Data Exploration in Genome-Wide Association Studies.

PubMed

Johnson, Aaron; Shmatikov, Vitaly

2013-08-01

Genome-wide association studies (GWAS) have become a popular method for analyzing sets of DNA sequences in order to discover the genetic basis of disease. Unfortunately, statistics published as the result of GWAS can be used to identify individuals participating in the study. To prevent privacy breaches, even previously published results have been removed from public databases, impeding researchers' access to the data and hindering collaborative research. Existing techniques for privacy-preserving GWAS focus on answering specific questions, such as correlations between a given pair of SNPs (DNA sequence variations). This does not fit the typical GWAS process, where the analyst may not know in advance which SNPs to consider and which statistical tests to use, how many SNPs are significant for a given dataset, etc. We present a set of practical, privacy-preserving data mining algorithms for GWAS datasets. Our framework supports exploratory data analysis, where the analyst does not know a priori how many and which SNPs to consider. We develop privacy-preserving algorithms for computing the number and location of SNPs that are significantly associated with the disease, the significance of any statistical test between a given SNP and the disease, any measure of correlation between SNPs, and the block structure of correlations. We evaluate our algorithms on real-world datasets and demonstrate that they produce significantly more accurate results than prior techniques while guaranteeing differential privacy.
A targeted sequencing panel identifies rare damaging variants in multiple genes in the cranial neural tube defect, anencephaly.

PubMed

Ishida, M; Cullup, T; Boustred, C; James, C; Docker, J; English, C; Lench, N; Copp, A J; Moore, G E; Greene, N D E; Stanier, P

2018-04-01

Neural tube defects (NTDs) affecting the brain (anencephaly) are lethal before or at birth, whereas lower spinal defects (spina bifida) may lead to lifelong neurological handicap. Collectively, NTDs rank among the most common birth defects worldwide. This study focuses on anencephaly, which despite having a similar frequency to spina bifida and being the most common type of NTD observed in mouse models, has had more limited inclusion in genetic studies. A genetic influence is strongly implicated in determining risk of NTDs and a molecular diagnosis is of fundamental importance to families both in terms of understanding the origin of the condition and for managing future pregnancies. Here we used a custom panel of 191 NTD candidate genes to screen 90 patients with cranial NTDs (n = 85 anencephaly and n = 5 craniorachischisis) with a targeted exome sequencing platform. After filtering and comparing to our in-house control exome database (N = 509), we identified 397 rare variants (minor allele frequency, MAF < 1%), 21 of which were previously unreported and predicted damaging. This included 1 frameshift (PDGFRA), 2 stop-gained (MAT1A; NOS2) and 18 missense variations. Together with evidence for oligogenic inheritance, this study provides new information on the possible genetic causation of anencephaly. © 2017 The Authors. Clinical Genetics published by John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Remapping of the RP15 Locus for X-Linked Cone-Rod Degeneration to Xp11.4-p21.1, and Identification of a De Novo Insertion in the RPGR Exon ORF15

PubMed Central

Mears, Alan J.; Hiriyanna, Suja; Vervoort, Raf; Yashar, Beverly; Gieser, Linn; Fahrner, Stacey; Daiger, Stephen P.; Heckenlively, John R.; Sieving, Paul A.; Wright, Alan F.; Swaroop, Anand

2000-01-01

X-linked forms of retinitis pigmentosa (XLRP) are among the most severe, because of their early onset, often leading to significant vision loss before the 4th decade. Previously, the RP15 locus was assigned to Xp22, by linkage analysis of a single pedigree with “X-linked dominant cone-rod degeneration.” After clinical reevaluation of a female in this pedigree identified her as affected, we remapped the disease to a 19.5-cM interval (DXS1219–DXS993) at Xp11.4-p21.1. This new interval overlapped both RP3 (RPGR) and COD1. Sequencing of the previously published exons of RPGR revealed no mutations, but a de novo insertion was detected in the new RPGR exon, ORF15. The identification of an RPGR mutation in a family with a severe form of cone and rod degeneration suggests that RPGR mutations may encompass a broader phenotypic spectrum than has previously been recognized in “typical” retinitis pigmentosa. PMID:10970770
Examining the Gender Gap in Introductory Physics

NASA Astrophysics Data System (ADS)

Kost, Lauren; Pollock, Steven; Finkelstein, Noah

2009-05-01

Our previous research[1] showed that despite the use of interactive engagement techniques in the introductory physics course, the gap in performance between males and females on a mechanics conceptual learning survey persisted from pre- to post-test, at our institution. Such findings were counter to previously published work[2]. Follow-up studies[3] identified correlations between student performance on the conceptual learning survey and students' prior physics and math knowledge and their incoming attitudes and beliefs about physics and learning physics. The results indicate that the gender gap at our institution is predominantly associated with differences in males' and females' previous physics and math knowledge, and attitudes and beliefs. Our current work extends these results in two ways: 1) we look at the gender gap in the second semester of the introductory sequence and find results similar to those in the first semester course and 2) we identify ways in which males and females differentially experience several aspects of the introductory course. [1] Pollock, et al, Phys Rev: ST: PER 3, 010107. [2] Lorenzo, et al, Am J Phys 74, 118. [3] Kost, et al, PERC Proceedings 2008.
Brown and polar bear Y chromosomes reveal extensive male-biased gene flow within brother lineages.

PubMed

Bidon, Tobias; Janke, Axel; Fain, Steven R; Eiken, Hans Geir; Hagen, Snorre B; Saarma, Urmas; Hallström, Björn M; Lecomte, Nicolas; Hailer, Frank

2014-06-01

Brown and polar bears have become prominent examples in phylogeography, but previous phylogeographic studies relied largely on maternally inherited mitochondrial DNA (mtDNA) or were geographically restricted. The male-specific Y chromosome, a natural counterpart to mtDNA, has remained underexplored. Although this paternally inherited chromosome is indispensable for comprehensive analyses of phylogeographic patterns, technical difficulties and low variability have hampered its application in most mammals. We developed 13 novel Y-chromosomal sequence and microsatellite markers from the polar bear genome and screened these in a broad geographic sample of 130 brown and polar bears. We also analyzed a 390-kb-long Y-chromosomal scaffold using sequencing data from published male ursine genomes. Y chromosome evidence supports the emerging understanding that brown and polar bears started to diverge no later than the Middle Pleistocene. Contrary to mtDNA patterns, we found 1) brown and polar bears to be reciprocally monophyletic sister (or rather brother) lineages, without signals of introgression, 2) male-biased gene flow across continents and on phylogeographic time scales, and 3) male dispersal that links the Alaskan ABC islands population to mainland brown bears. Due to female philopatry, mtDNA provides a highly structured estimate of population differentiation, while male-biased gene flow is a homogenizing force for nuclear genetic variation. Our findings highlight the importance of analyzing both maternally and paternally inherited loci for a comprehensive view of phylogeographic history, and that mtDNA-based phylogeographic studies of many mammals should be reevaluated. Recent advances in sequencing technology render the analysis of Y-chromosomal variation feasible, even in nonmodel organisms. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved.
Berriasian (Early Cretaceous) radiometric ages from the Grindstone Creek Section, Sacramento Valley, California

USGS Publications Warehouse

Bralower, T.J.; Ludwig, K. R.; Obradovich, J.D.

1990-01-01

The Grindstone Creek Section, Glenn County, Northern California is a sequence of hemipelagic mudstone, siltstone and sandstone interbedded with concretionary limestone and a few thin tuffs and bentonites. Two tuffs have been collected from a narrow interval of this sequence and subjected to mineralogical and isotopic analyses. U-Pb isotopic analyses of zircon fractions from these volcanic horizons indicate an age of 137.1 +1.6/-0.6 Ma. A detailed investigation has been conducted on the calcareous nannofossil stratigraphy of this section based on numerous samples with moderately preserved assemblages. The nannoflora is largely of Tethyan affinity, and allows direct correlation with the Berriasian stratotype section, with sections with published magnetostratigraphies and with a DSDP site drilled between known magnetic anomalies. The dated tuffs lie in the lower part of the upper Berriasian Cretarhabdus angustiforatus Zone (Assipetra infracretacea Subzone) and within the narrow range of Rhagodiscus nebulosus. At three different sections, this subzone can be correlated with M-sequence Polarity Zones M16 and M16n. An independent magnetostratigraphic correlation is provided at DSDP Site 387, drilled between anomalies M15 and M16, where basal sediments contain R. nebulosus. Buchia collected within a meter of the lower tuff lie within the B. uncitoides Zone which is Berriasian in age. The upper tuff level, which occurs 65 m above the lower tuff, is situated within the overlying B. pacifica Zone. This zone had previously been correlated with the early Valanginian, but is clearly also partly of Berriasian age based on nannofossil stratigraphy. Our results allow an estimate of the age of the Berriasian-Valanginian and Jurassic-Cretaceous boundaries of 135.1 Ma and 141.1 Ma, respectively, and these fall within the range of, but differ significantly from, several published time-scales. © 1990.

Eight further individuals with intellectual disability and epilepsy carrying bi-allelic CNTNAP2 aberrations allow delineation of the mutational and phenotypic spectrum.

PubMed

Smogavec, Mateja; Cleall, Alison; Hoyer, Juliane; Lederer, Damien; Nassogne, Marie-Cécile; Palmer, Elizabeth E; Deprez, Marie; Benoit, Valérie; Maystadt, Isabelle; Noakes, Charlotte; Leal, Alejandro; Shaw, Marie; Gecz, Jozef; Raymond, Lucy; Reis, André; Shears, Deborah; Brockmann, Knut; Zweier, Christiane

2016-12-01

Heterozygous copy number variants (CNVs) or sequence variants in the contactin-associated protein 2 gene CNTNAP2 have been discussed as risk factors for a wide spectrum of neurodevelopmental and neuropsychiatric disorders. Bi-allelic aberrations in this gene are causative for an autosomal-recessive disorder with epilepsy, severe intellectual disability (ID) and cortical dysplasia (CDFES). As the number of reported individuals is still limited, we aimed at a further characterisation of the full mutational and clinical spectrum. Targeted sequencing, chromosomal microarray analysis or multigene panel sequencing was performed in individuals with severe ID and epilepsy. We identified homozygous mutations, compound heterozygous CNVs or CNVs and mutations in CNTNAP2 in eight individuals from six unrelated families. All aberrations were inherited from healthy, heterozygous parents and are predicted to be deleterious for protein function. Epilepsy occurred in all affected individuals with onset in the first 3.5 years of life. Further common aspects were ID (severe in 6/8), regression of speech development (5/8) and behavioural anomalies (7/8). Interestingly, cognitive impairment in one of two affected brothers was, in comparison, relatively mild with good speech and simple writing abilities. Cortical dysplasia that was previously reported in CDFES was not present in MRIs of six individuals and only suspected in one. By identifying novel homozygous or compound heterozygous, deleterious CNVs and mutations in eight individuals from six unrelated families with moderate-to-severe ID, early onset epilepsy and behavioural anomalies, we considerably broaden the mutational and clinical spectrum associated with bi-allelic aberrations in CNTNAP2. Published by the BMJ Publishing Group Limited.
Genome sequencing of the redbanded stink bug (Piezodorus guildinii)

USDA-ARS's Scientific Manuscript database

We assembled a partial genome sequence from the redbanded stink bug, Piezodorus guildinii from Illumina MiSeq sequencing runs. The sequence has been submitted and published under NCBI GenBank Accession Number JTEQ01000000. The BioProject and BioSample Accession numbers are PRJNA263369 and SAMN030997...
Androgen Receptor Splice Variants and Resistance to Taxane Chemotherapy

DTIC Science & Technology

2016-10-01

sequence (MTAS) on AR. Milestone: Identify the sequence of AR that is involved in microtubule-binding. Publish 1 peer-reviewed paper. Major Task 4...joined the project and worked on the validation of the PAXgene assay. 6. Products Publications, conference papers, and presentations...Journal publications. The following paper was published: Xichun Liu, Elisa Ledet, Dongying Li, Ary Dotiwala, Allie Steinberger, Jianzhuo
The phylogenomic position of the grey nurse shark Carcharias taurus Rafinesque, 1810 (Lamniformes, Odontaspididae) inferred from the mitochondrial genome.

PubMed

Bowden, Deborah L; Vargas-Caro, Carolina; Ovenden, Jennifer R; Bennett, Michael B; Bustamante, Carlos

2016-11-01

The complete mitochondrial genome of the grey nurse shark Carcharias taurus is described from 25 963 828 sequences obtained using Illumina NGS technology. Total length of the mitogenome is 16 715 bp, consisting of 2 rRNAs, 13 protein-coding regions, 22 tRNA and 2 non-coding regions thus updating the previously published mitogenome for this species. The phylogenomic reconstruction inferred from the mitogenome of 15 species of Lamniform and Carcharhiniform sharks supports the inclusion of C. taurus in a clade with the Lamnidae and Cetorhinidae. This complete mitogenome contributes to ongoing investigation into the monophyly of the Family Odontaspididae.
Trident sign trumps Aquaporin-4-IgG ELISA in diagnostic value in a case of longitudinally extensive transverse myelitis.

PubMed

Jolliffe, Evan A; Keegan, B Mark; Flanagan, Eoin P

2018-04-21

Longitudinally-extensive T2-hyperintense spinal cord lesions (≥3 vertebral segments) are associated with neuromyelitis optica spectrum disorder but occur with other disorders including spinal cord sarcoidosis. When linear dorsal subpial enhancement is accompanied by central cord/canal enhancement the axial post-gadolinium sequences may reveal a "trident" pattern that has previously been shown to be strongly suggestive of spinal cord sarcoidosis. We report a case in which the patient was initially diagnosed with neuromyelitis optica spectrum disorder, but where the "trident" sign ultimately led to the correct diagnosis of spinal cord sarcoidosis. Copyright © 2018. Published by Elsevier B.V.
Persistence of Multiple Genetic Lineages within Intrahost Populations of Ross River Virus

PubMed Central

Liu, Wen J.; Rourke, Michelle F.; Holmes, Edward C.; Aaskov, John G.

2011-01-01

We examined the structure and extent of genetic diversity in intrahost populations of Ross River virus (RRV) in samples from six human patients, focusing on the nonstructural (nsP3) and structural (E2) protein genes. Strikingly, although the samples were collected from contrasting ecological settings 3,000 kilometers apart in Australia, we observed multiple viral lineages in four of the six individuals, which is indicative of widespread mixed infections. In addition, a comparison with previously published RRV sequences revealed that these distinct lineages have been in circulation for at least 5 years, and we were able to document their long-term persistence over extensive geographical distances. PMID:21430052
A novel genome signature based on inter-nucleotide distances profiles for visualization of metagenomic data

NASA Astrophysics Data System (ADS)

Xie, Xian-Hua; Yu, Zu-Guo; Ma, Yuan-Lin; Han, Guo-Sheng; Anh, Vo

2017-09-01

There has been growing interest in the visualization of metagenomic data. The present study focuses on visualizing metagenomic data using inter-nucleotide distances profiles. We first convert the fragment sequences into inter-nucleotide distances profiles. We then analyze these profiles by principal component analysis. Finally, the principal components are used to obtain a 2-D scatter plot of the fragments according to their source species. We call our method the inter-nucleotide distances profiles (INP) method. It is evaluated on three benchmark data sets used in previously published papers. Our results demonstrate that the INP method is a good and efficient alternative for visualization of metagenomic data.
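The first step of this approach, turning a fragment into a numeric profile, might look like the following sketch. The abstract does not define the profile, so this assumes it is a normalized histogram of the gaps between successive occurrences of the same nucleotide; the function name `inter_nucleotide_profile` and the `max_dist` cutoff are hypothetical choices, not the authors' settings.

```python
from collections import Counter


def inter_nucleotide_profile(seq, max_dist=8):
    """Return a normalized histogram of same-nucleotide gap lengths.

    The vector is laid out as A:1..max_dist, C:1..max_dist, G:..., T:...
    Gaps longer than max_dist are pooled into the last bin of their base.
    This layout is an assumption made for illustration.
    """
    last_seen = {}
    counts = Counter()
    for i, base in enumerate(seq.upper()):
        if base not in "ACGT":
            continue  # skip ambiguous symbols such as N
        if base in last_seen:
            gap = min(i - last_seen[base], max_dist)
            counts[(base, gap)] += 1
        last_seen[base] = i
    total = sum(counts.values())
    return [
        counts[(base, gap)] / total if total else 0.0
        for base in "ACGT"
        for gap in range(1, max_dist + 1)
    ]


if __name__ == "__main__":
    profile = inter_nucleotide_profile("ACGTACGTTTGA")
    print(len(profile), round(sum(profile), 6))
```

Each fragment then becomes one fixed-length row of a matrix, and principal component analysis of that matrix would supply the two coordinates for the scatter plot.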
Persistent natural infection of a Culex tritaeniorhynchus cell line with a novel Culex tritaeniorhynchus rhabdovirus strain.

PubMed

Gillich, Nadine; Kuwata, Ryusei; Isawa, Haruhiko; Horie, Masayuki

2015-09-01

Culex tritaeniorhynchus rhabdovirus (CTRV) is a mosquito virus that establishes persistent infection without any obvious cell death. Therefore, occult infection by CTRV can be present in mosquito cell lines. In this study, it is shown that NIID-CTR cells, which were derived from Cx. tritaeniorhynchus, are persistently infected with a novel strain of CTRV. Complete genome sequencing of the infecting strain revealed that it is genetically similar to but distinct from the previously isolated CTRV strain, excluding the possibility of contamination. These findings highlight the importance of further CTRV studies, such as screening of CTRV in other mosquito cell lines. © 2015 The Societies and Wiley Publishing Asia Pty Ltd.
A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm

PubMed Central

Guo, Xinyu; Wang, Hong; Devabhaktuni, Vijay

2012-01-01

A design of a systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm is proposed. BLAST is a heuristic biological sequence alignment algorithm which has been used by bioinformatics experts. In contrast to other designs that detect at most one hit in one clock cycle, our design applies a Multiple Hits Detection Module, which is a pipelined systolic array, to search for multiple hits in a single clock cycle. Further, we designed a Hits Combination Block which combines overlapping hits from the systolic array into one hit. These implementations completed the first and second steps of the BLAST architecture and achieved significant speedup compared with previously published architectures. PMID:25969747
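The idea of combining overlapping hits into one hit can be illustrated in software. The sketch below is not the hardware design described in the abstract. It assumes each hit is a hypothetical tuple `(diagonal, start, end)` with a half-open query range, and it merges hits on the same diagonal whose ranges overlap or touch.

```python
def combine_hits(hits):
    """Merge overlapping or touching hits that lie on the same diagonal.

    hits: iterable of (diagonal, start, end) tuples, with end exclusive.
    Returns a sorted list of merged (diagonal, start, end) tuples.
    """
    merged = []
    for diag, start, end in sorted(hits):
        if merged and merged[-1][0] == diag and start <= merged[-1][2]:
            prev_diag, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_diag, prev_start, max(prev_end, end))
        else:
            merged.append((diag, start, end))
    return merged


if __name__ == "__main__":
    # Three 3-letter word hits at consecutive positions collapse into one hit.
    word_hits = [(0, 0, 3), (0, 1, 4), (0, 2, 5), (1, 10, 13)]
    print(combine_hits(word_hits))
```

Sorting first keeps the merge to a single linear pass. A hardware block would do the equivalent on streaming data, which this sketch does not attempt to model.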
Unveiling the Biodiversity of Deep-Sea Nematodes through Metabarcoding: Are We Ready to Bypass the Classical Taxonomy?

PubMed Central

Dell'Anno, Antonio; Carugati, Laura; Corinaldesi, Cinzia; Riccioni, Giulia; Danovaro, Roberto

2015-01-01

Nematodes inhabiting benthic deep-sea ecosystems account for >90% of the total metazoan abundances and they have been hypothesised to be hyper-diverse, but their biodiversity is still largely unknown. Metabarcoding could facilitate the census of biodiversity, especially for those tiny metazoans for which morphological identification is difficult. We compared, for the first time, different DNA extraction procedures based on the use of two commercial kits and a previously published laboratory protocol and tested their suitability for sequencing analyses of 18S rDNA of marine nematodes. We also investigated the reliability of Roche 454 sequencing analyses for assessing the biodiversity of deep-sea nematode assemblages previously morphologically identified. Finally, intra-genomic variation in 18S rRNA gene repeats was investigated by Illumina MiSeq in different deep-sea nematode morphospecies to assess the influence of polymorphisms on nematode biodiversity estimates. Our results indicate that the two commercial kits should be preferred for the molecular analysis of biodiversity of deep-sea nematodes since they consistently provide amplifiable DNA suitable for sequencing. We report that the morphological identification of deep-sea nematodes matches the results obtained by metabarcoding analysis only at the order-family level and that a large portion of Operational Clustered Taxonomic Units (OCTUs) was not assigned. We also show that independently from the cut-off criteria and bioinformatic pipelines used, the number of OCTUs largely exceeds the number of individuals and that 18S rRNA gene of different morpho-species of nematodes displayed intra-genomic polymorphisms. Our results indicate that metabarcoding is an important tool to explore the diversity of deep-sea nematodes, but still fails in identifying most of the species due to limited number of sequences deposited in the public databases, and in providing quantitative data on the species encountered. 
These aspects should be carefully taken into account before using metabarcoding in quantitative ecological research and monitoring programmes of marine biodiversity. PMID:26701112
Integrating restriction site-associated DNA sequencing (RAD-seq) with morphological cladistic analysis clarifies evolutionary relationships among major species groups of bee orchids.

PubMed

Bateman, Richard M; Sramkó, Gábor; Paun, Ovidiu

2018-01-25

Bee orchids (Ophrys) have become the most popular model system for studying reproduction via insect-mediated pseudo-copulation and for exploring the consequent, putatively adaptive, evolutionary radiations. However, despite intensive past research, both the phylogenetic structure and species diversity within the genus remain highly contentious. Here, we integrate next-generation sequencing and morphological cladistic techniques to clarify the phylogeny of the genus. At least two accessions of each of the ten species groups previously circumscribed from large-scale cloned nuclear ribosomal internal transcribed spacer (nrITS) sequencing were subjected to restriction site-associated sequencing (RAD-seq). The resulting matrix of 4159 single nucleotide polymorphisms (SNPs) for 34 accessions was used to construct an unrooted network and a rooted maximum likelihood phylogeny. A parallel morphological cladistic matrix of 43 characters generated both polymorphic and non-polymorphic sets of parsimony trees before being mapped across the RAD-seq topology. RAD-seq data strongly support the monophyly of nine out of ten groups previously circumscribed using nrITS and resolve three major clades; in contrast, supposed microspecies are barely distinguishable. Strong incongruence separated the RAD-seq trees from both the morphological trees and traditional classifications; mapping of the morphological characters across the RAD-seq topology rendered them far more homoplastic. The comparatively high level of morphological homoplasy reflects extensive convergence, whereas the derived placement of the fusca group is attributed to paedomorphic simplification. The phenotype of the most recent common ancestor of the extant lineages is inferred, but it post-dates the majority of the character-state changes that typify the genus. 
RAD-seq may represent the high-water mark of the contribution of molecular phylogenetics to understanding evolution within Ophrys; further progress will require large-scale population-level studies that integrate phenotypic and genotypic data in a cogent conceptual framework. © The Author(s) 2018. Published by Oxford University Press on behalf of the Annals of Botany Company.
Drivers of cyanobacterial diversity and community composition in mangrove soils in south-east Brazil.

PubMed

Rigonato, Janaina; Kent, Angela D; Alvarenga, Danillo O; Andreote, Fernando D; Beirigo, Raphael M; Vidal-Torrado, Pablo; Fiore, Marli F

2013-04-01

Cyanobacteria act as primary producers of carbon and nitrogen in nutrient-poor ecosystems such as mangroves. This important group of microorganisms plays a critical role in sustaining the productivity of mangrove ecosystems, but the structure and function of cyanobacteria assemblages can be perturbed by anthropogenic influences. The aim of this work was to assess the community structure and ecological drivers that influence the cyanobacterial community harboured in two Brazilian mangrove soils, and examine the long-term effects of oil contamination on these keystone species. Community fingerprinting results showed that, although cyanobacterial communities are distinct between the two mangroves, the structure and diversity of the assemblages exhibit similar responses to environmental gradients. In each ecosystem, cyanobacteria occupying near-shore areas were similar in composition, indicating importance of marine influences for structuring the community. Analysis of 16S rRNA sequences revealed the presence of diverse cyanobacterial communities in mangrove sediments, with clear differences among mangrove habitats along a transect from shore to forest. While near-shore sites in both mangroves were mainly occupied by Prochlorococcus and Synechococcus genera, sequences retrieved from other mangrove niches were mainly affiliated with uncultured cyanobacterial 16S rRNA. The most intriguing finding was the large number of potentially novel cyanobacteria 16S rRNA sequences obtained from a previously oil-contaminated site. The abundance of cyanobacterial 16S rRNA sequences observed in sites with a history of oil contamination was significantly lower than in the unimpacted areas. This study emphasized the role of environmental drivers in determining the structure of cyanobacterial communities in mangrove soils, and suggests that anthropogenic impacts may also act as ecological filters that select cyanobacterial taxa. 
These results are an important contribution to our understanding of the composition and relative abundance of previously poorly described cyanobacterial assemblages in mangrove ecosystems. © 2012 Society for Applied Microbiology and Blackwell Publishing Ltd.
The Conduct and Reporting of Child Health Research: An Analysis of Randomized Controlled Trials Published in 2012 and Evaluation of Change over 5 Years.

PubMed

Gates, Allison; Hartling, Lisa; Vandermeer, Ben; Caldwell, Patrina; Contopoulos-Ioannidis, Despina G; Curtis, Sarah; Fernandes, Ricardo M; Klassen, Terry P; Williams, Katrina; Dyson, Michele P

2018-02-01

For child health randomized controlled trials (RCTs) published in 2012, we aimed to describe design and reporting characteristics and evaluate changes since 2007; assess the association between trial design and registration and risk of bias (RoB); and assess the association between RoB and effect size. For 300 RCTs, we extracted design and reporting characteristics and assessed RoB. We assessed 5-year changes in design and reporting (based on 300 RCTs we had previously analyzed) using the Fisher exact test. We tested for associations between design and reporting characteristics and overall RoB and registration using the Fisher exact, Cochran-Armitage, Kruskal-Wallis, and Jonckheere-Terpstra tests. We pooled effect sizes and tested for differences by RoB using the χ² test for subgroups in meta-analysis. The 2012 and 2007 RCTs differed with respect to many design and reporting characteristics. From 2007 to 2012, RoB did not change for random sequence generation and improved for allocation concealment (P < .001). Fewer 2012 RCTs were rated high overall RoB and more were rated unclear (P = .03). Only 7.3% of 2012 RCTs were rated low overall RoB. Trial registration doubled from 2007 to 2012 (23% to 46%) (P < .001) and was associated with lower RoB (P = .009). Effect size did not differ by RoB (P = .43). CONCLUSIONS: Random sequence generation and allocation concealment were not often reported, and selective reporting was prevalent. Measures to increase trialists' awareness and application of existing reporting guidance, and the prospective registration of RCTs, are needed to improve the trustworthiness of findings from this field. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
MS2PIP prediction server: compute and visualize MS2 peak intensity predictions for CID and HCD fragmentation.

PubMed

Degroeve, Sven; Maddelein, Davy; Martens, Lennart

2015-07-01

We present an MS(2) peak intensity prediction server that computes MS(2) charge 2+ and 3+ spectra from peptide sequences for the most common fragment ions. The server integrates the Unimod public domain post-translational modification database for modified peptides. The prediction model is an improvement of the previously published MS(2)PIP model for Orbitrap-LTQ CID spectra. Predicted MS(2) spectra can be downloaded as a spectrum file and can be visualized in the browser for comparisons with observations. In addition, we added prediction models for HCD fragmentation (Q-Exactive Orbitrap) and show that these models compute accurate intensity predictions on par with CID performance. We also show that training prediction models for CID and HCD separately improves the accuracy for each fragmentation method. The MS(2)PIP prediction server is accessible from http://iomics.ugent.be/ms2pip. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Draft Genome Sequence of a Rare Smut Relative, Tilletiaria anomala UBC 951

DOE PAGES

Toome, Merje; Kuo, Alan; Henrissat, Bernard; ...

2014-06-12

We present the draft genome sequence of the smut fungus Tilletiaria anomala UBC 951 (Basidiomycota, Ustilaginomycotina). The sequenced genome size is 18.7 Mb, consisting of 289 scaffolds and a total of 6,810 predicted genes. This is the first genome sequence published for a fungus in the order Georgefisheriales (Exobasidiomycetes).
Sequence of the toxic shock syndrome toxin gene (tstH) borne by strains of Staphylococcus aureus isolated from patients with Kawasaki syndrome.

PubMed Central

Deresiewicz, R L; Flaxenburg, J; Leng, K; Kasper, D L

1996-01-01

To explore whether a novel staphylococcal clone or structural variant of toxic shock syndrome toxin 1 is associated with Kawasaki syndrome, six toxigenic strains of Staphylococcus aureus from Kawasaki syndrome patients were studied. The strains were divisible into two groups based on phenotypic and genotypic characteristics and are therefore unequivocally not clonal. Portions of the tstH genes of each strain were sequenced. Three were sequenced in their entirety, while the remainder were sequenced from codon 66 to codon 137 of the mature protein only. Two of the former group differed slightly in the sequences of their signal peptides relative to the sequence published for the tstH signal peptide. Those differences did not affect toxin processing or secretion. The sequenced portions of the regions encoding mature toxic shock syndrome toxin 1 were identical in all six strains and corresponded exactly to the published sequence of tstH. No evidence was found for the existence of a structural variant of tstH uniquely associated with Kawasaki syndrome. PMID:8757881
Composite conserved promoter-terminator motifs (PeSLs) that mediate modular shuffling in the diverse T4-like myoviruses.

PubMed

Comeau, André M; Arbiol, Christine; Krisch, Henry M

2014-06-19

The diverse T4-like phages (Tquatrovirinae) infect a wide array of gram-negative bacterial hosts. The genome architecture of these phages is generally well conserved, most of the phylogenetically variable genes being grouped together in a series of hyperplastic regions (HPRs) that are interspersed among large blocks of conserved core genes. Recent evidence from a pair of closely related T4-like phages has suggested that small, composite terminator/promoter sequences (promoter early stem loop [PeSLs]) were implicated in mediating the high levels of genetic plasticity by indels occurring within the HPRs. Here, we present the genome sequence analysis of two T4-like phages, PST (168 kb, 272 open reading frames [ORFs]) and nt-1 (248 kb, 405 ORFs). These two phages were chosen for comparative sequence analysis because, although they are closely related to phages that have been previously sequenced (T4 and KVP40, respectively), they have different host ranges. In each case, one member of the pair infects a bacterial strain that is a human pathogen, whereas the other phage's host is a nonpathogen. Despite belonging to phylogenetically distant branches of the T4-likes, these pairs of phage have diverged from each other in part by a mechanism apparently involving PeSL-mediated recombination. This analysis confirms a role of PeSL sequences in the generation of genomic diversity by serving as a point of genetic exchange between otherwise unrelated sequences within the HPRs. Finally, the palette of divergent genes swapped by PeSL-mediated homologous recombination is discussed in the context of the PeSLs' potentially important role in facilitating phage adaption to new hosts and environments. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Molecular identification of Trichuris vulpis and Trichuris suis isolated from different hosts.

PubMed

Cutillas, Cristina; de Rojas, Manuel; Ariza, Concepción; Ubeda, José Manuel; Guevara, Diego

2007-01-01

Trichuris suis was isolated from the cecum of two different hosts (Sus scrofa domestica -- swine and Sus scrofa scrofa -- wild boar) and Trichuris vulpis from dogs in Sevilla, Spain. Genomic DNA was isolated and the internal transcribed spacer (ITS)1-5.8S-ITS2 segment from the ribosomal DNA (rDNA) was amplified and sequenced using polymerase chain reaction techniques. The sequence of T. suis from both hosts was 1,396 bp in length while that of T. vulpis was 1,044 bp. The ITS1 of both isolated populations of T. suis was 661 nucleotides in length, while the ITS2 was 534 nucleotides in length. Furthermore, the ITS1 of T. vulpis was 410 nucleotides in length, while the ITS2 was 433 nucleotides in length. One hundred fifty-four nucleotides were observed along the 5.8S gene of T. suis and T. vulpis. Intraindividual and intraspecific variations were detected in the rDNA of both species. The presence of microsatellites was observed in all the individuals assayed. Sequence analysis of the ITSs and the 5.8S gene has demonstrated no sequence differences between T. suis isolated from both hosts (S. scrofa domestica -- swine and S. scrofa scrofa -- wild boar). Nevertheless, clear differences were detected between the ITS1 and ITS2 of T. suis and T. vulpis. Furthermore, a comparative molecular analysis between both species and the previously published ITS1-5.8S-ITS2 sequence data of Trichuris ovis, Trichuris leporis, Trichuris muris, Trichuris arvicolae, and Trichuris skrjabini was carried out. A common homology zone was detected in the ITS1 sequence of all species of trichurids.
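The segment lengths reported in this abstract can be cross-checked with simple arithmetic. The sketch below sums the ITS1, 5.8S, and ITS2 lengths for each species and subtracts them from the reported total; the dictionary layout and the function name `unaccounted_bp` are illustrative choices, not part of the study.

```python
# Segment lengths (bp) exactly as reported in the abstract.
SEGMENTS = {
    "T. suis": {"ITS1": 661, "5.8S": 154, "ITS2": 534, "total": 1396},
    "T. vulpis": {"ITS1": 410, "5.8S": 154, "ITS2": 433, "total": 1044},
}


def unaccounted_bp(record):
    """Return the total length minus the summed ITS1, 5.8S and ITS2 lengths."""
    return record["total"] - (record["ITS1"] + record["5.8S"] + record["ITS2"])


# Hypothetical helper output: bases of the amplicon not assigned to a named segment.
remainders = {species: unaccounted_bp(rec) for species, rec in SEGMENTS.items()}

for species, rest in remainders.items():
    print(f"{species}: {rest} bp outside ITS1/5.8S/ITS2")
```

Both species leave the same remainder, which would be consistent with shared flanking sequence at the ends of the amplified segment, although the abstract does not state this.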
Analysis of selected genes associated with cardiomyopathy by next-generation sequencing.

PubMed

Szabadosova, Viktoria; Boronova, Iveta; Ferenc, Peter; Tothova, Iveta; Bernasovska, Jarmila; Zigova, Michaela; Kmec, Jan; Bernasovsky, Ivan

2018-02-01

As the leading cause of congestive heart failure, cardiomyopathy represents a heterogeneous group of heart muscle disorders. Despite considerable progress being made in the genetic diagnosis of cardiomyopathy by detection of the mutations in the most prevalent cardiomyopathy genes, the cause remains unsolved in many patients. High-throughput mutation screening in the disease genes for cardiomyopathy is now possible through target enrichment followed by next-generation sequencing. The aim of the study was to analyze a panel of genes associated with dilated or hypertrophic cardiomyopathy based on previously published results in order to identify the subjects at risk. Next-generation sequencing on the Illumina HiSeq 2500 platform was used to detect sequence variants in 16 individuals diagnosed with dilated or hypertrophic cardiomyopathy. Detected variants were filtered and the functional impact of amino acid changes was predicted by computational programs. DNA samples of the 16 patients were analyzed by whole exome sequencing. We identified six nonsynonymous variants that were predicted to be pathogenic by all of the prediction programs used: rs3744998 (EPG5), rs11551768 (MGME1), rs148374985 (MURC), rs78461695 (PLEC), rs17158558 (RET) and rs2295190 (SYNE1). Two of the analyzed sequence variants had minor allele frequency (MAF)<0.01: rs148374985 (MURC), rs34580776 (MYBPC3). Our data support the potential role of the detected variants in pathogenesis of dilated or hypertrophic cardiomyopathy; however, the possibility that these variants might not be true disease-causing variants but are susceptibility alleles that require additional mutations or injury to cause the clinical phenotype of disease must be considered. © 2017 Wiley Periodicals, Inc.

Improving taxonomic accuracy for fungi in public sequence databases: applying ‘one name one species’ in well-defined genera with Trichoderma/Hypocrea as a test case

PubMed Central

Strope, Pooja K; Chaverri, Priscila; Gazis, Romina; Ciufo, Stacy; Domrachev, Michael; Schoch, Conrad L

2017-01-01

The ITS (nuclear ribosomal internal transcribed spacer) RefSeq database at the National Center for Biotechnology Information (NCBI) is dedicated to the clear association between name, specimen and sequence data. This database is focused on sequences obtained from type material stored in public collections. While the initial ITS sequence curation effort together with numerous fungal taxonomy experts attempted to cover as many orders as possible, we extended our latest focus to the family and genus ranks. We focused on Trichoderma for several reasons, mainly because the asexual and sexual synonyms were well documented, and a list of proposed names and type material was recently published. In this case study the recent taxonomic information was applied to do a complete taxonomic audit for the genus Trichoderma in the NCBI Taxonomy database. A name status report is available here: https://www.ncbi.nlm.nih.gov/Taxonomy/TaxIdentifier/tax_identifier.cgi. As a result, the ITS RefSeq Targeted Loci database at NCBI has been augmented with more sequences from type and verified material from Trichoderma species. Additionally, to aid in the cross referencing of data from single loci and genomes we have collected a list of quality records of the RPB2 gene obtained from type material in GenBank that could help validate future submissions. During the process of curation misidentified genomes were discovered, and sequence records from type material were found hidden under previous classifications. Source metadata curation, although more cumbersome, proved to be useful as confirmation of the type material designation. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA177353 PMID:29220466
GTRAC: fast retrieval from compressed collections of genomic variants.

PubMed

Tatwawadi, Kedar; Hernaez, Mikel; Ochoa, Idoia; Weissman, Tsachy

2016-09-01

The dramatic decrease in the cost of sequencing has resulted in the generation of huge amounts of genomic data, as evidenced by projects such as the UK10K and the Million Veteran Project, with the number of sequenced genomes ranging in the order of 10 K to 1 M. Due to the large redundancies among genomic sequences of individuals from the same species, most of the medical research deals with the variants in the sequences as compared with a reference sequence, rather than with the complete genomic sequences. Consequently, millions of genomes represented as variants are stored in databases. These databases are constantly updated and queried to extract information such as the common variants among individuals or groups of individuals. Previous algorithms for compression of this type of database lack efficient random access capabilities, rendering querying the database for particular variants and/or individuals extremely inefficient, to the point where compression is often relinquished altogether. We present a new algorithm for this task, called GTRAC, that achieves significant compression ratios while allowing fast random access over the compressed database. For example, GTRAC is able to compress a Homo sapiens dataset containing 1092 samples in 1.1 GB (compression ratio of 160), while allowing for decompression of specific samples in less than a second and decompression of specific variants in 17 ms. GTRAC uses and adapts techniques from information theory, such as a specialized Lempel-Ziv compressor, and tailored succinct data structures. The GTRAC algorithm is available for download at: https://github.com/kedartatwawadi/GTRAC. Contact: kedart@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
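To put the reported compression figures in perspective, the following sketch works back from the compressed size and compression ratio quoted in the abstract. It is plain arithmetic rather than any part of GTRAC itself; the function names are hypothetical, and 1 GB is assumed to equal 1000 MB.

```python
def uncompressed_size_gb(compressed_gb, ratio):
    """Implied original size, given compressed size and compression ratio."""
    return compressed_gb * ratio


def compressed_mb_per_sample(compressed_gb, n_samples):
    """Average compressed footprint per sample (assumes 1 GB = 1000 MB)."""
    return compressed_gb * 1000 / n_samples


# Figures quoted in the abstract: 1092 samples, 1.1 GB, ratio of 160.
original_gb = uncompressed_size_gb(1.1, 160)
per_sample_mb = compressed_mb_per_sample(1.1, 1092)

print(f"implied uncompressed size: about {original_gb:.0f} GB")
print(f"average compressed size per sample: about {per_sample_mb:.2f} MB")
```

Under these assumptions the dataset would occupy roughly 176 GB uncompressed, and each sample would average about 1 MB once compressed.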
Mixed-complexity artificial grammar learning in humans and macaque monkeys: evaluating learning strategies.

PubMed

Wilson, Benjamin; Smith, Kenny; Petkov, Christopher I

2015-03-01

Artificial grammars (AG) can be used to generate rule-based sequences of stimuli. Some of these can be used to investigate sequence-processing computations in non-human animals that might be related to, but not unique to, human language. Previous AG learning studies in non-human animals have used different AGs to separately test for specific sequence-processing abilities. However, given that natural language and certain animal communication systems (in particular, song) have multiple levels of complexity, mixed-complexity AGs are needed to simultaneously evaluate sensitivity to the different features of the AG. Here, we tested humans and Rhesus macaques using a mixed-complexity auditory AG, containing both adjacent (local) and non-adjacent (longer-distance) relationships. Following exposure to exemplary sequences generated by the AG, humans and macaques were individually tested with sequences that were either consistent with the AG or violated specific adjacent or non-adjacent relationships. We observed a considerable level of cross-species correspondence in the sensitivity of both humans and macaques to the adjacent AG relationships and to the statistical properties of the sequences. We found no significant sensitivity to the non-adjacent AG relationships in the macaques. A subset of humans was sensitive to this non-adjacent relationship, revealing interesting between- and within-species differences in AG learning strategies. The results suggest that humans and macaques are largely comparably sensitive to the adjacent AG relationships and their statistical properties. However, in the presence of multiple cues to grammaticality, the non-adjacent relationships are less salient to the macaques and many of the humans. © 2015 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Origin and domestication of papaya Yh chromosome.

PubMed

VanBuren, Robert; Zeng, Fanchang; Chen, Cuixia; Zhang, Jisen; Wai, Ching Man; Han, Jennifer; Aryal, Rishi; Gschwend, Andrea R; Wang, Jianping; Na, Jong-Kuk; Huang, Lixian; Zhang, Lingmao; Miao, Wenjing; Gou, Jiqing; Arro, Jie; Guyot, Romain; Moore, Richard C; Wang, Ming-Li; Zee, Francis; Charlesworth, Deborah; Moore, Paul H; Yu, Qingyi; Ming, Ray

2015-04-01

Sex in papaya is controlled by a pair of nascent sex chromosomes. Females are XX, and two slightly different Y chromosomes distinguish males (XY) and hermaphrodites (XY(h)). The hermaphrodite-specific region of the Y(h) chromosome (HSY) and its X chromosome counterpart were sequenced and analyzed previously. We now report the sequence of the entire male-specific region of the Y (MSY). We used a BAC-by-BAC approach to sequence the MSY and resequence the Y regions of 24 wild males and the Y(h) regions of 12 cultivated hermaphrodites. The MSY and HSY regions have highly similar gene content and structure, and only 0.4% sequence divergence. The MSY sequences from wild males include three distinct haplotypes, associated with the populations' geographic locations, but gene flow is detected for other genomic regions. The Y(h) sequence is highly similar to one Y haplotype (MSY3) found only in wild dioecious populations from the north Pacific region of Costa Rica. The low MSY3-Y(h) divergence supports the hypothesis that hermaphrodite papaya is a product of human domestication. We estimate that Y(h) arose only ∼ 4000 yr ago, well after crop plant domestication in Mesoamerica >6200 yr ago but coinciding with the rise of the Maya civilization. The Y(h) chromosome has lower nucleotide diversity than the Y, or the genome regions that are not fully sex-linked, consistent with a domestication bottleneck. The identification of the ancestral MSY3 haplotype will expedite investigation of the mutation leading to the domestication of the hermaphrodite Y(h) chromosome. In turn, this mutation should identify the gene that was affected by the carpel-suppressing mutation that was involved in the evolution of males. © 2015 VanBuren et al.; Published by Cold Spring Harbor Laboratory Press.
Spectroscopic characterization of galaxy clusters in RCS-1: spectroscopic confirmation, redshift accuracy, and dynamical mass-richness relation

NASA Astrophysics Data System (ADS)

Gilbank, David G.; Barrientos, L. Felipe; Ellingson, Erica; Blindert, Kris; Yee, H. K. C.; Anguita, T.; Gladders, M. D.; Hall, P. B.; Hertling, G.; Infante, L.; Yan, R.; Carrasco, M.; Garcia-Vergara, Cristina; Dawson, K. S.; Lidman, C.; Morokuma, T.

2018-05-01

We present follow-up spectroscopic observations of galaxy clusters from the first Red-sequence Cluster Survey (RCS-1). This work focuses on two samples, a lower redshift sample of ∼30 clusters ranging in redshift from z ∼ 0.2-0.6 observed with multiobject spectroscopy (MOS) on 4-6.5-m class telescopes and a z ∼ 1 sample of ∼10 clusters with 8-m class telescope observations. We examine the detection efficiency and redshift accuracy of the now widely used red-sequence technique for selecting clusters via overdensities of red-sequence galaxies. Using both these data and extended samples including previously published RCS-1 spectroscopy and spectroscopic redshifts from SDSS, we find that the red-sequence redshift using simple two-filter cluster photometric redshifts is accurate to σz ≈ 0.035(1 + z) in RCS-1. This accuracy can potentially be improved with better survey photometric calibration. For the lower redshift sample, ∼5 per cent of clusters show some (minor) contamination from secondary systems with the same red-sequence intruding into the measurement aperture of the original cluster. At z ∼ 1, the rate rises to ∼20 per cent. Approximately ten per cent of projections are expected to be serious, where the two components contribute significant numbers of their red-sequence galaxies to another cluster. Finally, we present a preliminary study of the mass-richness calibration using velocity dispersions to probe the dynamical masses of the clusters. We find a relation broadly consistent with that seen in the local universe from the WINGS sample at z ∼ 0.05.
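The quoted accuracy, σz ≈ 0.035(1 + z), scales with redshift, so a short calculation makes the expected scatter concrete. The sketch below evaluates the relation at the redshifts bounding the two samples; the function name and the choice of evaluation points are illustrative assumptions.

```python
def red_sequence_sigma_z(z, coefficient=0.035):
    """Photometric redshift scatter from sigma_z = coefficient * (1 + z)."""
    return coefficient * (1.0 + z)


# Redshifts spanning the lower-redshift sample and the z ~ 1 sample.
sigmas = {z: red_sequence_sigma_z(z) for z in (0.2, 0.6, 1.0)}

for z, sigma in sigmas.items():
    print(f"z = {z:.1f}: sigma_z is about {sigma:.3f}")
```

Evaluated this way, the scatter grows from about 0.04 at z = 0.2 to about 0.07 at z = 1.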
Entire plastid phylogeny of the carrot genus (Daucus, Apiaceae): Concordance with nuclear data and mitochondrial and nuclear DNA insertions to the plastid.

PubMed

Spooner, David M; Ruess, Holly; Iorizzo, Massimo; Senalik, Douglas; Simon, Philipp

2017-02-01

We explored the phylogenetic utility of entire plastid DNA sequences in Daucus and compared the results with prior phylogenetic results using plastid and nuclear DNA sequences. We used Illumina sequencing to obtain full plastid sequences of 37 accessions of 20 Daucus taxa and outgroups, analyzed the data with phylogenetic methods, and examined evidence for mitochondrial DNA transfer to the plastid (DcMP). Our phylogenetic trees of the entire data set were highly resolved, with 100% bootstrap support for most of the external and many of the internal clades, except for the clade of D. carota and its most closely related species D. syrticus. Subsets of the data, including regions traditionally used as phylogenetically informative regions, provide various degrees of soft congruence with the entire data set. There are areas of hard incongruence, however, with phylogenies using nuclear data. We extended knowledge of a mitochondrial to plastid DNA insertion sequence previously named DcMP and identified the first instance in flowering plants of a sequence of potential nuclear genome origin inserted into the plastid genome. There is a relationship of inverted repeat junction classes and repeat DNA to phylogeny, but no such relationship with nonsynonymous mutations. Our data have allowed us to (1) produce a well-resolved plastid phylogeny of Daucus, (2) evaluate subsets of the entire plastid data for phylogeny, (3) examine evidence for plastid and nuclear DNA phylogenetic incongruence, and (4) examine mitochondrial and nuclear DNA insertion into the plastid. © 2017 Spooner et al. Published by the Botanical Society of America. This work is licensed under a Creative Commons public domain license (CC0 1.0).
Identification of transcribed sequences in Arabidopsis thaliana by using high-resolution genome tiling arrays

PubMed Central

Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Sethi, Himanshu; Liang, Shoudan; Nelson, David C.; Hegeman, Adrian; Nelson, Clark; Rancour, David; Bednarek, Sebastian; Ulrich, Eldon L.; Zhao, Qin; Wrobel, Russell L.; Newman, Craig S.; Fox, Brian G.; Phillips, George N.; Markley, John L.; Sussman, Michael R.

2005-01-01

Using a maskless photolithography method, we produced DNA oligonucleotide microarrays with probe sequences tiled throughout the genome of the plant Arabidopsis thaliana. RNA expression was determined for the complete nuclear, mitochondrial, and chloroplast genomes by tiling 5 million 36-mer probes. These probes were hybridized to labeled mRNA isolated from liquid grown T87 cells, an undifferentiated Arabidopsis cell culture line. Transcripts were detected from at least 60% of the nearly 26,330 annotated genes, which included 151 predicted genes that were not identified previously by a similar genome-wide hybridization study on four different cell lines. In comparison with previously published results with 25-mer tiling arrays produced by chromium masking-based photolithography technique, 36-mer oligonucleotide probes were found to be more useful in identifying intron–exon boundaries. Using two-dimensional HPLC tandem mass spectrometry, a small-scale proteomic analysis was performed with the same cells. A large amount of strongly hybridizing RNA was found in regions “antisense” to known genes. Similarity of antisense activities between the 25-mer and 36-mer data sets suggests that it is a reproducible and inherent property of the experiments. Transcription activities were also detected for many of the intergenic regions and the small RNAs, including tRNA, small nuclear RNA, small nucleolar RNA, and microRNA. Expression of tRNAs correlates with genome-wide amino acid usage. PMID:15755812
More Easily Cultivated Than Identified: Classical Isolation With Molecular Identification of Vaginal Bacteria.

PubMed

Srinivasan, Sujatha; Munch, Matthew M; Sizova, Maria V; Fiedler, Tina L; Kohler, Christina M; Hoffman, Noah G; Liu, Congzhou; Agnew, Kathy J; Marrazzo, Jeanne M; Epstein, Slava S; Fredricks, David N

2016-08-15

Women with bacterial vaginosis (BV) have complex communities of anaerobic bacteria. There are no cultivated isolates of several bacteria identified using molecular methods and associated with BV. It is unclear whether this is due to the inability to adequately propagate these bacteria or to correctly identify them in culture. Vaginal fluid from 15 women was plated on 6 different media using classical cultivation approaches. Individual isolates were identified by 16S ribosomal RNA (rRNA) gene sequencing and compared with validly described species. Bacterial community profiles in vaginal samples were determined using broad-range 16S rRNA gene polymerase chain reaction and pyrosequencing. We isolated and identified 101 distinct bacterial strains spanning 6 phyla including (1) novel strains with <98% 16S rRNA sequence identity to validly described species, (2) closely related species within a genus, (3) bacteria previously isolated from body sites other than the vagina, and (4) known bacteria formerly isolated from the vagina. Pyrosequencing showed that novel strains Peptoniphilaceae DNF01163 and Prevotellaceae DNF00733 were prevalent in women with BV. We isolated a diverse set of novel and clinically significant anaerobes from the human vagina using conventional approaches with systematic molecular identification. Several previously "uncultivated" bacteria are amenable to conventional cultivation. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.
Monogenetic origin of Ubehebe Crater maar volcano, Death Valley, California: Paleomagnetic and stratigraphic evidence

NASA Astrophysics Data System (ADS)

Champion, Duane E.; Cyr, Andy; Fierstein, Judy; Hildreth, Wes

2018-04-01

Paleomagnetic data for samples collected from outcrops of basaltic spatter at the Ubehebe Crater cluster, Death Valley National Park, California, record a single direction of remanent magnetization indicating that these materials were emplaced during a short duration, monogenetic eruption sequence 2100 years ago. This conclusion is supported by geochemical data encompassing a narrow range of oxide variation, by detailed stratigraphic studies of conformable phreatomagmatic tephra deposits showing no evidence of erosion between layers, by draping of sharp rimmed craters by later tephra falls, and by oxidation of later tephra layers by the remaining heat of earlier spatter. This model is also supported through a reinterpretation and recalculation of the published 10Be age results (Sasnett et al., 2012) from an innovative and bold exposure-age study on very young materials. Their conclusion of multiple and protracted eruptions at Ubehebe Crater cluster is here modified through the understanding that some of their quartz-bearing clasts inherited 10Be from previous exposure on the fan surface (too old), and that other clasts were only exposed at the surface by wind and/or water erosion centuries after their eruption (too young). Ubehebe Crater cluster is a well preserved example of young monogenetic maar type volcanism protected within a National Park, and it represents neither a protracted eruption sequence as previously thought, nor a continuing volcanic hazard near its location.
Homozygous PPT1 Splice Donor Mutation in a Cane Corso Dog With Neuronal Ceroid Lipofuscinosis.

PubMed

Kolicheski, A; Barnes Heller, H L; Arnold, S; Schnabel, R D; Taylor, J F; Knox, C A; Mhlanga-Mutangadura, T; O'Brien, D P; Johnson, G S; Dreyfus, J; Katz, M L

2017-01-01

A 10-month-old spayed female Cane Corso dog was evaluated after a 2-month history of progressive blindness, ataxia, and lethargy. Neurologic examination abnormalities indicated a multifocal lesion with primarily cerebral and cerebellar signs. Clinical worsening resulted in humane euthanasia. On necropsy, there was marked astrogliosis throughout white matter tracts of the cerebrum, most prominently in the corpus callosum. In the cerebral cortex and midbrain, most neurons contained large amounts of autofluorescent storage material in the perinuclear area of the cells. Cerebellar storage material was present in the Purkinje cells, granular cell layer, and perinuclear regions of neurons in the deep nuclei. Neuronal ceroid lipofuscinosis (NCL) was diagnosed. Whole genome sequencing identified a PPT1 c.124+1G>A splice donor mutation. This nonreference assembly allele was homozygous in the affected dog, has not previously been reported in dbSNP, and was absent from the whole genome sequences of 45 control dogs and 31 unaffected Cane Corsos. Our findings indicate a novel mutation causing the CLN1 form of NCL in a previously unreported dog breed. A canine model for CLN1 disease could provide an opportunity for therapeutic advancement, benefiting both humans and dogs with this disorder. Copyright © 2016 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
Genomic analysis of bluetongue virus episystems in Australia and Indonesia.

PubMed

Firth, Cadhla; Blasdell, Kim R; Amos-Ritchie, Rachel; Sendow, Indrawati; Agnihotri, Kalpana; Boyle, David B; Daniels, Peter; Kirkland, Peter D; Walker, Peter J

2017-11-23

The distribution of bluetongue viruses (BTV) in Australia is represented by two distinct and interconnected epidemiological systems (episystems): one distributed primarily in the north and one in the east. The northern episystem is characterised by substantially greater antigenic diversity than the eastern episystem; yet the forces that act to limit the diversity present in the east remain unclear. Previous work has indicated that the northern episystem is linked to that of island South East Asia and Melanesia, and that BTV present in Indonesia, Papua New Guinea and East Timor may act as source populations for new serotypes and genotypes of BTV to enter Australia's north. In this study, the genomes of 49 bluetongue viruses from the eastern episystem and 13 from Indonesia were sequenced and analysed along with 27 previously published genome sequences from the northern Australian episystem. The results of this analysis confirm that the Australian BTV population has its origins in the South East Asian/Melanesian episystem, and that incursions into northern Australia occur with some regularity. In addition, the presence of limited genetic diversity in the eastern episystem relative to that found in the north supports the presence of substantial, but not complete, barriers to gene flow between the northern and eastern Australian episystems. Genetic bottlenecks between each successive episystem are evident, and appear to be responsible for the reduction in BTV genetic diversity observed in the north to south-east direction.
Widespread occurrence of organelle genome-encoded 5S rRNAs including permuted molecules

PubMed Central

Valach, Matus; Burger, Gertraud; Gray, Michael W.; Lang, B. Franz

2014-01-01

5S Ribosomal RNA (5S rRNA) is a universal component of ribosomes, and the corresponding gene is easily identified in archaeal, bacterial and nuclear genome sequences. However, organelle gene homologs (rrn5) appear to be absent from most mitochondrial and several chloroplast genomes. Here, we re-examine the distribution of organelle rrn5 by building mitochondrion- and plastid-specific covariance models (CMs) with which we screened organelle genome sequences. We not only recover all organelle rrn5 genes annotated in GenBank records, but also identify more than 50 previously unrecognized homologs in mitochondrial genomes of various stramenopiles, red algae, cryptomonads, malawimonads and apusozoans, and surprisingly, in the apicoplast (highly derived plastid) genomes of the coccidian pathogens Toxoplasma gondii and Eimeria tenella. Comparative modeling of RNA secondary structure reveals that mitochondrial 5S rRNAs from brown algae adopt a permuted triskelion shape that has not been seen elsewhere. Expression of the newly predicted rrn5 genes is confirmed experimentally in 10 instances, based on our own and published RNA-Seq data. This study establishes that particularly mitochondrial 5S rRNA has a much broader taxonomic distribution and a much larger structural variability than previously thought. The newly developed CMs will be made available via the Rfam database and the MFannot organelle genome annotator. PMID:25429974
Identification of transcribed sequences in Arabidopsis thaliana by using high-resolution genome tiling arrays

NASA Technical Reports Server (NTRS)

Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Sethi, Himanshu; Liang, Shoudan; Nelson, David C.; Hegeman, Adrian; Nelson, Clark; Rancour, David; Bednarek, Sebastian;

2005-01-01

Using a maskless photolithography method, we produced DNA oligonucleotide microarrays with probe sequences tiled throughout the genome of the plant Arabidopsis thaliana. RNA expression was determined for the complete nuclear, mitochondrial, and chloroplast genomes by tiling 5 million 36-mer probes. These probes were hybridized to labeled mRNA isolated from liquid grown T87 cells, an undifferentiated Arabidopsis cell culture line. Transcripts were detected from at least 60% of the nearly 26,330 annotated genes, which included 151 predicted genes that were not identified previously by a similar genome-wide hybridization study on four different cell lines. In comparison with previously published results with 25-mer tiling arrays produced by chromium masking-based photolithography technique, 36-mer oligonucleotide probes were found to be more useful in identifying intron-exon boundaries. Using two-dimensional HPLC tandem mass spectrometry, a small-scale proteomic analysis was performed with the same cells. A large amount of strongly hybridizing RNA was found in regions "antisense" to known genes. Similarity of antisense activities between the 25-mer and 36-mer data sets suggests that it is a reproducible and inherent property of the experiments. Transcription activities were also detected for many of the intergenic regions and the small RNAs, including tRNA, small nuclear RNA, small nucleolar RNA, and microRNA. Expression of tRNAs correlates with genome-wide amino acid usage.

Enhanced arbovirus surveillance with deep sequencing: Identification of novel rhabdoviruses and bunyaviruses in Australian mosquitoes.

PubMed

Coffey, Lark L; Page, Brady L; Greninger, Alexander L; Herring, Belinda L; Russell, Richard C; Doggett, Stephen L; Haniotis, John; Wang, Chunlin; Deng, Xutao; Delwart, Eric L

2014-01-05

Viral metagenomics characterizes known and identifies unknown viruses based on sequence similarities to any previously sequenced viral genomes. A metagenomics approach was used to identify virus sequences in Australian mosquitoes causing cytopathic effects in inoculated mammalian cell cultures. Sequence comparisons revealed strains of Liao Ning virus (Reovirus, Seadornavirus), previously detected only in China, livestock-infecting Stretch Lagoon virus (Reovirus, Orbivirus), two novel dimarhabdoviruses, named Beaumont and North Creek viruses, and two novel orthobunyaviruses, named Murrumbidgee and Salt Ash viruses. The novel virus proteomes diverged by ≥ 50% relative to their closest previously genetically characterized viral relatives. Deep sequencing also generated genomes of Warrego and Wallal viruses, orbiviruses linked to kangaroo blindness, whose genomes had not been fully characterized. This study highlights viral metagenomics in concert with traditional arbovirus surveillance to characterize known and new arboviruses in field-collected mosquitoes. Follow-up epidemiological studies are required to determine whether the novel viruses infect humans. © 2013 Elsevier Inc. All rights reserved.
Novel Virus Discovery and Genome Reconstruction from Field RNA Samples Reveals Highly Divergent Viruses in Dipteran Hosts

PubMed Central

Bass, David; Moureau, Gregory; Tang, Shuoya; McAlister, Erica; Culverwell, C. Lorna; Glücksman, Edvard; Wang, Hui; Brown, T. David K.; Gould, Ernest A.; Harbach, Ralph E.; de Lamballerie, Xavier; Firth, Andrew E.

2013-01-01

We investigated whether small RNA (sRNA) sequenced from field-collected mosquitoes and chironomids (Diptera) can be used as a proxy signature of viral prevalence within a range of species and viral groups, using sRNAs sequenced from wild-caught specimens, to inform total RNA deep sequencing of samples of particular interest. Using this strategy, we sequenced from adult Anopheles maculipennis s.l. mosquitoes the apparently nearly complete genome of one previously undescribed virus related to chronic bee paralysis virus, and, from a pool of Ochlerotatus caspius and Oc. detritus mosquitoes, a nearly complete entomobirnavirus genome. We also reconstructed long sequences (1503-6557 nt) related to at least nine other viruses. Crucially, several of the sequences detected were reconstructed from host organisms highly divergent from those in which related viruses have been previously isolated or discovered. It is clear that viral transmission and maintenance cycles in nature are likely to be significantly more complex and taxonomically diverse than previously expected. PMID:24260463
Ordering the mob: Insights into replicon and MOB typing schemes from analysis of a curated dataset of publicly available plasmids.

PubMed

Orlek, Alex; Phan, Hang; Sheppard, Anna E; Doumith, Michel; Ellington, Matthew; Peto, Tim; Crook, Derrick; Walker, A Sarah; Woodford, Neil; Anjum, Muna F; Stoesser, Nicole

2017-05-01

Plasmid typing can provide insights into the epidemiology and transmission of plasmid-mediated antibiotic resistance. The principal plasmid typing schemes are replicon typing and MOB typing, which utilize variation in replication loci and relaxase proteins respectively. Previous studies investigating the proportion of plasmids assigned a type by these schemes ('typeability') have yielded conflicting results; moreover, thousands of plasmid sequences have been added to NCBI in recent years, without consistent annotation to indicate which sequences represent complete plasmids. Here, a curated dataset of complete Enterobacteriaceae plasmids from NCBI was compiled, and used to assess the typeability and concordance of in silico replicon and MOB typing schemes. Concordance was assessed at hierarchical replicon type resolutions, from replicon family-level to plasmid multilocus sequence type (pMLST)-level, where available. We found that 85% and 65% of the curated plasmids could be replicon and MOB typed, respectively. Overall, plasmid size and the number of resistance genes were significant independent predictors of replicon and MOB typing success. We found some degree of non-concordance between replicon families and MOB types, which was only partly resolved when partitioning plasmids into finer-resolution groups (replicon and pMLST types). In some cases, non-concordance was attributed to ambiguous boundaries between MOBP and MOBQ types; in other cases, backbone mosaicism was considered a more plausible explanation. β-lactamase resistance genes tended not to show fidelity to a particular plasmid type, though some previously reported associations were supported. Overall, replicon and MOB typing schemes are likely to continue playing an important role in plasmid analysis, but their performance is constrained by the diverse and dynamic nature of plasmid genomes. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
Viral Surveillance in Serum Samples From Patients With Acute Liver Failure By Metagenomic Next-Generation Sequencing.

PubMed

Somasekar, Sneha; Lee, Deanna; Rule, Jody; Naccache, Samia N; Stone, Mars; Busch, Michael P; Sanders, Corron; Lee, William M; Chiu, Charles Y

2017-10-16

Twelve percent of all acute liver failure (ALF) cases are of unknown origin, often termed indeterminate. A previously unrecognized hepatotropic virus has been suspected as a potential etiologic agent. We compared the performance of metagenomic next-generation sequencing (mNGS) with confirmatory nucleic acid testing (NAT) to routine clinical diagnostic testing in detection of known or novel viruses associated with ALF. Serum samples from 204 adult ALF patients collected from 1998 to 2010 as part of a nationwide registry were analyzed. One hundred eighty-seven patients (92%) were classified as indeterminate, while the remaining 17 patients (8%) served as controls, with infections by either hepatitis A virus or hepatitis B virus (HBV), or a noninfectious cause for their ALF. Eight cases of infection from previously unrecognized viral pathogens were detected by mNGS (4 cases of herpes simplex virus type 1, including 1 case of coinfection with HBV, and 1 case each of HBV, parvovirus B19, cytomegalovirus, and human herpesvirus 7). Several missed dual or triple infections were also identified, and assembled viral genomes provided additional information on genotyping and drug resistance mutations. Importantly, no sequences corresponding to novel viruses were detected. These results suggest that ALF patients should be screened for the presence of uncommon viruses and coinfections, and that most cases of indeterminate ALF in the United States do not appear to be caused by novel viral pathogens. In the future, mNGS testing may be useful for comprehensive diagnosis of viruses associated with ALF, or to exclude infectious etiologies. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America.
Mitochondrion-to-Chloroplast DNA Transfers and Intragenomic Proliferation of Chloroplast Group II Introns in Gloeotilopsis Green Algae (Ulotrichales, Ulvophyceae).

PubMed

Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-09-19

To probe organelle genome evolution in the Ulvales/Ulotrichales clade, the newly sequenced chloroplast and mitochondrial genomes of Gloeotilopsis planctonica and Gloeotilopsis sarcinoidea (Ulotrichales) were compared with those of Pseudendoclonium akinetum (Ulotrichales) and of the few other green algae previously sampled in the Ulvophyceae. At 105,236 bp, the G. planctonica mitochondrial DNA (mtDNA) is the largest mitochondrial genome reported so far among chlorophytes, whereas the 221,431-bp G. planctonica and 262,888-bp G. sarcinoidea chloroplast DNAs (cpDNAs) are the largest chloroplast genomes analyzed among the Ulvophyceae. Gains of non-coding sequences largely account for the expansion of these genomes. Both Gloeotilopsis cpDNAs lack the inverted repeat (IR) typically found in green plants, indicating that two independent IR losses occurred in the Ulvales/Ulotrichales. Our comparison of the Pseudendoclonium and Gloeotilopsis cpDNAs offered clues regarding the mechanism of IR loss in the Ulotrichales, suggesting that internal sequences from the rDNA operon were differentially lost from the two original IR copies during this process. Our analyses also unveiled a number of genetic novelties. Short mtDNA fragments were discovered in two distinct regions of the G. sarcinoidea cpDNA, providing the first evidence for intracellular inter-organelle gene migration in green algae. We identified, for the first time in green algal organelles, group II introns with LAGLIDADG ORFs as well as group II introns inserted into untranslated gene regions. We discovered many group II introns occupying sites not previously documented for the chloroplast genome and demonstrated that a number of them arose by intragenomic proliferation, most likely through retrohoming. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
The Structure of the Young Star Cluster NGC 6231. I. Stellar Population

NASA Astrophysics Data System (ADS)

Kuhn, Michael A.; Medina, Nicolás; Getman, Konstantin V.; Feigelson, Eric D.; Gromadzki, Mariusz; Borissova, Jordanka; Kurtev, Radostin

2017-09-01

NGC 6231 is a young cluster (age ~2-7 Myr) dominating the Sco OB1 association (distance ~1.59 kpc) with ~100 O and B stars and a large pre-main-sequence stellar population. We combine a reanalysis of archival Chandra X-ray data with multiepoch near-infrared (NIR) photometry from the VISTA Variables in the Vía Láctea (VVV) survey and published optical catalogs to obtain a catalog of 2148 probable cluster members. This catalog is 70% larger than previous censuses of probable cluster members in NGC 6231. It includes many low-mass stars detected in the NIR but not in the optical and some B stars without previously noted X-ray counterparts. In addition, we identify 295 NIR variables, about half of which are expected to be pre-main-sequence stars. With the more complete sample, we estimate a total population in the Chandra field of 5700-7500 cluster members down to 0.08 M⊙ (assuming a universal initial mass function) with a completeness limit at 0.5 M⊙. A decrease in stellar X-ray luminosities is noted relative to other younger clusters. However, within the cluster, there is little variation in the distribution of X-ray luminosities for ages less than 5 Myr. The X-ray spectral hardness for B stars may be useful for distinguishing between early-B stars with X-rays generated in stellar winds and B-star systems with X-rays from a pre-main-sequence companion (>35% of B stars). A small fraction of catalog members have unusually high X-ray median energies or reddened NIR colors, which might be explained by absorption from thick or edge-on disks or being background field stars.
Intact long-type dupA as a marker for gastroduodenal diseases in Okinawan subpopulation, Japan.

PubMed

Takahashi, Ayaka; Shiota, Seiji; Matsunari, Osamu; Watada, Masahide; Suzuki, Rumiko; Nakachi, Saori; Kinjo, Nagisa; Kinjo, Fukunori; Yamaoka, Yoshio

2013-02-01

Helicobacter pylori dupA can be divided into two types according to the presence or absence of the mutation. In addition, full-sequence data revealed that dupA has two types with different lengths depending on the presence or absence of approximately 600 bp in the putative 5' region (long-type when present, short-type when absent), which has not been taken into account in previous studies. A total of 319 strains isolated from Okinawa, the southern islands of Japan, were included. The status of dupA and cagA was determined by polymerase chain reaction. The presence of mutations in long-type dupA was determined by DNA sequencing. The prevalence of long-type dupA was 26.3% (84/319). Sequence analysis showed that there were only six cases (7.1%) with point mutations leading to a stop codon among the 84 long-type dupA strains studied. Interestingly, intact long-type dupA without frameshift mutation, but not short-type dupA, was significantly associated with gastric ulcer and gastric cancer compared with gastritis (p = .001 and p = .019, respectively). After adjustment for age, gender, and cagA, the presence of intact long-type dupA was significantly associated with gastric ulcer and gastric cancer compared with gastritis (odds ratio [OR] = 3.35, 95% confidence interval [CI] = 1.55-7.24 and OR = 4.14, 95% CI = 1.23-13.94, respectively). Intact long-type dupA is a real virulence marker for severe outcomes in Okinawa, Japan. The previous information gained from PCR-based methods without taking long-type dupA into account must be interpreted with caution. © 2012 Blackwell Publishing Ltd.

Whole-Exome Sequencing Reveals Mutations in Genes Linked to Hemophagocytic Lymphohistiocytosis and Macrophage Activation Syndrome in Fatal Cases of H1N1 Influenza.

PubMed

Schulert, Grant S; Zhang, Mingce; Fall, Ndate; Husami, Ammar; Kissell, Diane; Hanosh, Andrew; Zhang, Kejian; Davis, Kristina; Jentzen, Jeffrey M; Napolitano, Lena; Siddiqui, Javed; Smith, Lauren B; Harms, Paul W; Grom, Alexei A; Cron, Randy Q

2016-04-01

Severe H1N1 influenza can be lethal in otherwise healthy individuals and can have features of reactive hemophagocytic lymphohistiocytosis (HLH). HLH is associated with mutations in lymphocyte cytolytic pathway genes, which have not been previously explored in H1N1 influenza. Sixteen cases of fatal influenza A(H1N1) infection, 81% with histopathologic hemophagocytosis, were identified and analyzed for clinical and laboratory features of HLH, using modified HLH-2004 and macrophage activation syndrome (MAS) criteria. Fourteen specimens were subject to whole-exome sequencing. Sequence alignment and variant filtering detected HLH gene mutations and potential disease-causing variants. Cytolytic function of the PRF1 p.A91V mutation was tested in lentiviral-transduced NK-92 natural killer (NK) cells. Despite several lacking variables, cases of influenza A(H1N1) infection met 44% and 81% of modified HLH-2004 and MAS criteria, respectively. Five subjects (36%) carried one of 3 heterozygous LYST mutations, 2 of whom also possessed the p.A91V PRF1 mutation, which was shown to decrease NK cell cytolytic function. Several patients also carried rare variants in other genes previously observed in MAS. This cohort of fatal influenza A(H1N1) infections confirms the presence of hemophagocytosis and HLH pathology. Moreover, the high percentage of HLH gene mutations suggests they are risk factors for mortality among individuals with influenza A(H1N1) infection. © The Author 2015. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.
Toward a consistent model for strain accrual and release for the New Madrid Seismic Zone, central United States

USGS Publications Warehouse

Hough, S.E.; Page, M.

2011-01-01

At the heart of the conundrum of seismogenesis in the New Madrid Seismic Zone is the apparently substantial discrepancy between low strain rate and high recent seismic moment release. In this study we revisit the magnitudes of the four principal 1811–1812 earthquakes using intensity values determined from individual assessments from four experts. Using these values and the grid search method of Bakun and Wentworth (1997), we estimate magnitudes around 7.0 for all four events, values that are significantly lower than previously published magnitude estimates based on macroseismic intensities. We further show that the strain rate predicted from postglacial rebound is sufficient to produce a sequence with the moment release of one Mmax 6.8 every 500 years, a rate that is much lower than previous estimates of late Holocene moment release. However, Mw 6.8 is at the low end of the uncertainty range inferred from analysis of intensities for the largest 1811–1812 event. We show that Mw 6.8 is also a reasonable value for the largest main shock given a plausible rupture scenario. One can also construct a range of consistent models that permit a somewhat higher Mmax, with a longer average recurrence interval. It is thus possible to reconcile predicted strain and seismic moment release rates with alternative models: one in which 1811–1812 sequences occur every 500 years, with the largest events being Mmax ∼6.8, or one in which sequences occur, on average, less frequently, with Mmax of ∼7.0. Both models predict that the late Holocene rate of activity will continue for the next few to 10 thousand years.
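The trade-off between a larger maximum magnitude and less frequent sequences can be illustrated with the standard moment magnitude definition, Mw = (2/3)(log10 M0 - 9.1) with M0 in N·m. The following is a minimal sketch, not the authors' calculation: it assumes, purely for illustration, that the long-term moment release rate is fixed and dominated by the largest event, and the function name is hypothetical.

```python
def seismic_moment_nm(mw: float) -> float:
    """Seismic moment in N*m from moment magnitude (standard Mw definition)."""
    return 10 ** (1.5 * mw + 9.1)

# Moment released by an Mw 7.0 event relative to an Mw 6.8 event.
ratio = seismic_moment_nm(7.0) / seismic_moment_nm(6.8)

# Illustrative assumption: if the moment budget is fixed and one Mw 6.8 event
# per 500 years balances it, a larger event must recur proportionally less often.
interval_68_years = 500
interval_70_years = interval_68_years * ratio

print(f"moment ratio Mw 7.0 / Mw 6.8: {ratio:.2f}")
print(f"implied recurrence for Mw 7.0: about {interval_70_years:.0f} years")
```

Under these simplifying assumptions, a 0.2-unit increase in magnitude roughly doubles the moment per event, which is why the higher-Mmax model in the abstract goes with less frequent sequences.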
De novo assembly and characterization of the Trichuris trichiura adult worm transcriptome using Ion Torrent sequencing.

PubMed

Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C

2016-07-01

Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris, it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. Besides, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
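The N50 statistic quoted for the assembly is the contig length at which contigs of that length or longer contain at least half of the total assembled bases. As a minimal sketch of that definition (the function name is hypothetical and the contig lengths below are invented toy values, not data from the study):

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half_total:
            return length

# Toy lengths chosen only so the arithmetic is easy to follow.
toy_lengths = [2000, 1606, 1500, 900, 400]
print(n50(toy_lengths))
```

In the toy example the total is 6406 bp, so the running sum first reaches half of it (3203 bp) at the second-longest contig.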
Molecular Epidemiological Survey and Genetic Characterization of Anaplasma Species in Mongolian Livestock.

PubMed

Ochirkhuu, Nyamsuren; Konnai, Satoru; Odbileg, Raadan; Murata, Shiro; Ohashi, Kazuhiko

2017-08-01

Anaplasma species are obligate intracellular rickettsial pathogens that cause great economic loss to the animal industry. Few studies on Anaplasma infections in Mongolian livestock have been conducted. This study examined the prevalence of Anaplasma marginale, Anaplasma ovis, Anaplasma phagocytophilum, and Anaplasma bovis by polymerase chain reaction assay in 928 blood samples collected from native cattle and dairy cattle (Bos taurus), yaks (Bos grunniens), sheep (Ovis aries), and goats (Capra aegagrus hircus) in four provinces of Ulaanbaatar city in Mongolia. We genetically characterized positive samples through sequencing analysis based on the heat-shock protein groEL, major surface protein 4 (msp4), and 16S rRNA genes. Only A. ovis was detected in Mongolian livestock (cattle, yaks, sheep, and goats), with 413 animals (44.5%) positive for groEL and 308 animals (33.2%) positive for msp4 genes. In the phylogenetic tree, we separated A. ovis sequences into two distinct clusters based on the groEL gene. One cluster comprised sequences derived mainly from sheep and goats, which was similar to that in A. ovis isolates from other countries. The other divergent cluster comprised sequences derived from cattle and yaks and appeared to be newly branched from that in previously published single isolates in Mongolian cattle. In addition, analysis of the msp4 gene of A. ovis, using both the same samples as for the groEL gene and different ones, showed that all sequences derived from all animal species, except for three sequences derived from cattle and yak, clustered together and were identical or similar to those in isolates from other countries. We used 16S rRNA gene sequences to investigate the genetically divergent A. ovis and identified high homology of 99.3-100%. However, the sequences derived from cattle did not match those derived from sheep and goats. The results of this study on the prevalence and molecular characterization of A. ovis in Mongolian livestock can facilitate the control of infectious diseases in livestock.
Genetic characterization of hantaviruses associated with sigmodontine rodents in an endemic area for hantavirus pulmonary syndrome in southern Brazil.

PubMed

de Oliveira, Renata Carvalho; Padula, Paula J; Gomes, Raphael; Martinez, Valeria P; Bellomo, Carla; Bonvicino, Cibele R; Freire e Lima, Danúbia Inês; Bragagnolo, Camila; Caldas, Antônio C S; D'Andrea, Paulo S; de Lemos, Elba R S

2011-03-01

An ecological assessment of reservoir species was conducted in a rural area (Jaborá) in the mid-west of the state of Santa Catarina in southern Brazil, where hantavirus pulmonary syndrome is endemic, to evaluate the prevalence of hantavirus infection in wild rodents. Blood and tissue samples were collected from 507 rodents during seven field trips from March 2004 to April 2006. Some of the animals were karyotyped to confirm morphological identification. Phylogenetic reconstructions of rodent specimens, based on the mitochondrial DNA cytochrome b gene sequences, were also obtained. Hantavirus antibody was found in 22 (4.3%) of the 507 rodents: 5 Akodon montensis, 2 Akodon paranaensis, 14 Oligoryzomys nigripes, and 1 Sooretamys angouya. Viral RNAs detected in O. nigripes and A. montensis were amplified and sequenced. O. nigripes virus genome was 97.5% (nt) and 98.4% (nt) identical to sequences published for Araucaria (Juquitiba-like) virus based on N and G2 fragment sequences. Viral sequences from A. montensis strain showed 89% and 88% nucleotide identities in a 905-nt fragment of the nucleocapsid (N) protein-coding region of the S segment when it was compared with two other Akodontine rodent-associated viruses from Paraguay, A. montensis and Akodon cursor, respectively. Phylogenetic analysis showed the cocirculation of two genetic hantavirus lineages in the state of Santa Catarina, one from O. nigripes and the other from A. montensis, previously characterized in Brazil and Paraguay, respectively. The hantavirus associated with A. montensis, designated Jaborá virus, represents a distinct phylogenetic lineage among the Brazilian hantaviruses.
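Nucleotide identity figures such as those above are computed from aligned sequences as the share of compared positions that match. The sketch below is one common convention, assumed here for illustration (positions where either sequence has a gap are skipped); the abstract does not state which convention or software was used, and both the function name and the toy sequences are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns containing a gap ('-') in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared

# Toy 10-nt alignment with a single mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(f"{identity:.1f}% identical")
```

Other conventions count gap columns as mismatches or normalize by the shorter sequence, so identity values are only comparable when the same rule is applied throughout.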
Development and Verification of an RNA Sequencing (RNA-Seq) Assay for the Detection of Gene Fusions in Tumors.

PubMed

Winters, Jennifer L; Davila, Jaime I; McDonald, Amber M; Nair, Asha A; Fadra, Numrah; Wehrs, Rebecca N; Thomas, Brittany C; Balcom, Jessica R; Jin, Long; Wu, Xianglin; Voss, Jesse S; Klee, Eric W; Oliver, Gavin R; Graham, Rondell P; Neff, Jadee L; Rumilla, Kandelaria M; Aypar, Umut; Kipp, Benjamin R; Jenkins, Robert B; Jen, Jin; Halling, Kevin C

2018-06-13

We assessed the performance characteristics of an RNA sequencing (RNA-Seq) assay designed to detect gene fusions in 571 genes to help manage patients with cancer. Polyadenylated RNA was converted to cDNA, which was then used to prepare next-generation sequencing libraries that were sequenced on an Illumina HiSeq 2500 instrument and analyzed with an in-house developed bioinformatic pipeline. The assay identified 38 of 41 gene fusions detected by another method, such as fluorescence in situ hybridization or RT-PCR, for a sensitivity of 93%. No false-positive gene fusions were identified in 15 normal tissue specimens and 10 tumor specimens that were negative for fusions by RNA sequencing or Mate Pair NGS (100% specificity). The assay also identified 22 fusions in 17 tumor specimens that had not been detected by other methods. Eighteen of the 22 fusions had not previously been described. Good intra-assay and interassay reproducibility was observed with complete concordance for the presence or absence of gene fusions in replicates. The analytical sensitivity of the assay was tested by diluting RNA isolated from gene fusion-positive cases with fusion-negative RNA. Gene fusions were generally detectable down to 12.5% dilutions for most fusions and as little as 3% for some fusions. This assay can help identify fusions in patients with cancer; these patients may in turn benefit from both US Food and Drug Administration-approved and investigational targeted therapies. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
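The sensitivity and specificity reported above follow directly from the stated counts. A minimal sketch of the arithmetic, treating the 41 fusions found by another method as the reference positives and the 25 fusion-negative specimens (15 normal plus 10 tumor) as the reference negatives; the function names are hypothetical:

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of reference-positive cases that the assay detects."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of reference-negative cases that the assay calls negative."""
    return true_negatives / (true_negatives + false_positives)

# Counts taken from the abstract: 38 of 41 known fusions were detected,
# and none of the 25 fusion-negative specimens gave a false positive.
sens = sensitivity(true_positives=38, false_negatives=41 - 38)
spec = specificity(true_negatives=15 + 10, false_positives=0)

print(f"sensitivity: {sens:.1%}")
print(f"specificity: {spec:.0%}")
```

The 22 additional fusions found only by RNA-Seq are not part of this calculation, because no reference result exists to classify them as true or false positives.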
Reefgenomics.Org - a repository for marine genomics data.

PubMed

Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R

2016-01-01

Over the last decade, technological advancements have substantially decreased the cost and time of obtaining large amounts of sequencing data. Paired with the exponentially increased computing power, individual labs are now able to sequence genomes or transcriptomes to investigate biological questions of interest. This has led to a significant increase in available sequence data. Although the bulk of data published in articles are stored in public sequence databases, very often, only raw sequencing data are available; miscellaneous data such as assembled transcriptomes, genome annotations etc. are not easily obtainable through the same means. Here, we introduce our website (http://reefgenomics.org) that aims to centralize genomic and transcriptomic data from marine organisms. Besides providing convenient means to download sequences, we provide (where applicable) a genome browser to explore available genomic features, and a BLAST interface to search through the hosted sequences. Through the interface, multiple datasets can be queried simultaneously, allowing for the retrieval of matching sequences from organisms of interest. The minimalistic, no-frills interface reduces visual clutter, making it convenient for end-users to search and explore processed sequence data. DATABASE URL: http://reefgenomics.org. © The Author(s) 2016. Published by Oxford University Press.
Variable presence of the inverted repeat and plastome stability in Erodium.

PubMed

Blazier, John C; Jansen, Robert K; Mower, Jeffrey P; Govindu, Madhu; Zhang, Jin; Weng, Mao-Lun; Ruhlman, Tracey A

2016-06-01

Several unrelated lineages such as plastids, viruses and plasmids, have converged on quadripartite genomes of similar size with large and small single copy regions and a large inverted repeat (IR). Except for Erodium (Geraniaceae), saguaro cactus and some legumes, the plastomes of all photosynthetic angiosperms display this structure. The functional significance of the IR is not understood and Erodium provides a system to examine the role of the IR in the long-term stability of these genomes. We compared the degree of genomic rearrangement in plastomes of Erodium that differ in the presence and absence of the IR. We sequenced 17 new Erodium plastomes. Using 454, Illumina, PacBio and Sanger sequences, 16 genomes were assembled and categorized along with one incomplete and two previously published Erodium plastomes. We conducted phylogenetic analyses among these species using a dataset of 19 protein-coding genes and determined if significantly higher evolutionary rates had caused the long branch seen previously in phylogenetic reconstructions within the genus. Bioinformatic comparisons were also performed to evaluate plastome evolution across the genus. Erodium plastomes fell into four types (Type 1-4) that differ in their substitution rates, short dispersed repeat content and degree of genomic rearrangement, gene and intron content and GC content. Type 4 plastomes had significantly higher rates of synonymous substitutions (dS) for all genes and for 14 of the 19 genes non-synonymous substitutions (dN) were significantly accelerated. We evaluated the evidence for a single IR loss in Erodium and in doing so discovered that Type 4 plastomes contain a novel IR. The presence or absence of the IR does not affect plastome stability in Erodium. Rather, the overall repeat content shows a negative correlation with genome stability, a pattern in agreement with other angiosperm groups and recent findings on genome stability in bacterial endosymbionts. © The Author 2016. 
Published by Oxford University Press on behalf of the Annals of Botany Company. All rights reserved.
Multilocus sequence typing of Trichomonas vaginalis clinical samples from Amsterdam, the Netherlands.

PubMed

van der Veer, C; Himschoot, M; Bruisten, S M

2016-10-13

In this cross-sectional epidemiological study we aimed to identify molecular profiles for Trichomonas vaginalis and to determine how these molecular profiles were related to patient demographic and clinical characteristics. Molecular typing methods previously identified two genetically distinct subpopulations for T. vaginalis; however, few molecular epidemiological studies have been performed. We now increased the sensitivity of a previously described multilocus sequence typing (MLST) tool for T. vaginalis by using nested PCR. This enabled the typing of direct patient samples. From January to December 2014, we collected all T. vaginalis positive samples as detected by routine laboratory testing. Samples from patients came either from general practitioners' offices or from the sexually transmitted infections (STI) clinic in Amsterdam. Epidemiological data for the STI clinic patients were retrieved from electronic patient files. The primary outcome was the success rate of genotyping direct T. vaginalis positive samples. The secondary outcome was the relation between T. vaginalis genotypes and risk factors for STI. All 7 MLST loci were successfully typed for 71/87 clinical samples. The 71 typed samples came from 69 patients, the majority of whom were women (n=62; 90%) and half (n=34; 49%) were STI clinic patients. Samples segregated into a two-population structure for T. vaginalis representing genotypes I and II. Genotype I was most common (n=40; 59.7%). STI clinic patients infected with genotype II reported more sexual partners in the preceding 6 months than patients infected with genotype I (p=0.028). No other associations for gender, age, ethnicity, urogenital discharge or co-occurring STIs with T. vaginalis genotype were found. MLST with nested PCR is a sensitive typing method that allows typing of direct (uncultured) patient material. Genotype II is possibly more prevalent in high-risk sexual networks. Published by the BMJ Publishing Group Limited. 
Colorectal Cancer and the Human Gut Microbiome: Reproducibility with Whole-Genome Shotgun Sequencing

PubMed Central

Hua, Xing; Zeller, Georg; Sunagawa, Shinichi; Voigt, Anita Y.; Hercog, Rajna; Goedert, James J.; Shi, Jianxin; Bork, Peer; Sinha, Rashmi

2016-01-01

Accumulating evidence indicates that the gut microbiota affects colorectal cancer development, but previous studies have varied in population, technical methods, and associations with cancer. Understanding these variations is needed for comparisons and for potential pooling across studies. Therefore, we performed whole-genome shotgun sequencing on fecal samples from 52 pre-treatment colorectal cancer cases and 52 matched controls from Washington, DC. We compared findings from a previously published 16S rRNA study to the metagenomics-derived taxonomy within the same population. In addition, metagenome-predicted genes, modules, and pathways in the Washington, DC cases and controls were compared to cases and controls recruited in France whose specimens were processed using the same platform. Associations between the presence of fecal Fusobacteria, Fusobacterium, and Porphyromonas with colorectal cancer detected by 16S rRNA were reproduced by metagenomics, whereas higher relative abundance of Clostridia in cancer cases based on 16S rRNA was merely borderline based on metagenomics. This demonstrated that within the same sample set, most, but not all taxonomic associations were seen with both methods. Considering significant cancer associations with the relative abundance of genes, modules, and pathways in a recently published French metagenomics dataset, statistically significant associations in the Washington, DC population were detected for four out of 10 genes, three out of nine modules, and seven out of 17 pathways. In total, colorectal cancer status in the Washington, DC study was associated with 39% of the metagenome-predicted genes, modules, and pathways identified in the French study. More within and between population comparisons are needed to identify sources of variation and disease associations that can be reproduced despite these variations. 
Future studies should have larger sample sizes or pool data across studies to have sufficient power to detect associations that are reproducible and significant after correction for multiple testing. PMID:27171425
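The 39% figure quoted in this abstract is consistent with pooling the three feature classes it reports. The following is a minimal sketch of that arithmetic, using only the counts stated in the abstract; the dictionary layout and function name are illustrative assumptions, not the authors' code.

```python
# Counts taken from the abstract: (replicated in Washington, DC; significant in the French study)
replicated = {"genes": (4, 10), "modules": (3, 9), "pathways": (7, 17)}


def pooled_fraction(counts):
    """Return the share of features replicated, pooled across all classes."""
    hits = sum(h for h, _ in counts.values())
    total = sum(t for _, t in counts.values())
    return hits / total


overall = pooled_fraction(replicated)
print(f"{overall:.1%}")  # 14 of 36 features, which rounds to 39%
```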
Evolutionary dynamics of Hepatitis C virus in a chronic HIV co-infected patient and its correlation with the immune status.

PubMed

Culasso, Andrés Carlos Alberto; Monzani, María Cecilia; Baré, Patricia; Campos, Rodolfo Hector

2018-05-04

The HCV evolutionary dynamics play a key role in the infection onset, maintenance of chronicity, pathogenicity, and fixation of drug-resistance variants, and are thought to be one of the main obstacles to the development of an effective vaccine. Previous studies in HCV/HIV co-infected patients suggest that a decline in the immune status is related to increases in the HCV intra-host genetic diversity. However, these findings are based on single point sequence diversity measures or coalescence analyses in several virus-host interactions. In this work, we describe the molecular evolution of HCV-E2 region in a single HIV-co-infected patient with two clearly defined immune conditions. The phylogenetic analysis of the HCV-1a sequences from the studied patient showed that he was co-infected with three different viral lineages. These lineages were not evenly detected throughout time. The sequence diversity and coalescence analyses of these lineages suggested the action of different evolutionary patterns in different immune conditions: a slow rate, drift-like process in an immunocompromised condition (low levels of CD4+ T lymphocytes); and a fast rate, variant-switch process in an immunocompetent condition (high levels of CD4+ T lymphocytes). Copyright © 2017. Published by Elsevier B.V.
QuickNGS elevates Next-Generation Sequencing data analysis to a new level of automation.

PubMed

Wagle, Prerana; Nikolić, Miloš; Frommolt, Peter

2015-07-01

Next-Generation Sequencing (NGS) has emerged as a widely used tool in molecular biology. While time and cost for the sequencing itself are decreasing, the analysis of the massive amounts of data remains challenging. Since multiple algorithmic approaches for the basic data analysis have been developed, there is now an increasing need to efficiently use these tools to obtain results in reasonable time. We have developed QuickNGS, a new workflow system for laboratories with the need to analyze data from multiple NGS projects at a time. QuickNGS takes advantage of parallel computing resources, a comprehensive back-end database, and a careful selection of previously published algorithmic approaches to build fully automated data analysis workflows. We demonstrate the efficiency of our new software by a comprehensive analysis of 10 RNA-Seq samples which we can finish in only a few minutes of hands-on time. The approach we have taken is suitable to process even much larger numbers of samples and multiple projects at a time. Our approach considerably reduces the barriers that still limit the usability of the powerful NGS technology and finally decreases the time to be spent before proceeding to further downstream analysis and interpretation of the data.
WNT10A mutation results in severe tooth agenesis in a family of three sisters.

PubMed

Abid, M F; Simpson, M A; Barbosa, I A; Seppala, M; Irving, M; Sharpe, P T; Cobourne, M T

2018-06-21

To identify the genetic basis of severe tooth agenesis in a family of three affected sisters. A family of three sisters with severe tooth agenesis was recruited for whole-exome sequencing to identify potential genetic variation responsible for this penetrant phenotype. The unaffected father was tested for specific mutations using Sanger sequencing. Gene discovery was supplemented with in situ hybridization to localize gene expression during human tooth development. We report a nonsense heterozygous mutation in exon 2 of WNT10A c.321C>A[p.Cys107*] likely to be responsible for the severe tooth agenesis identified in this family through the creation of a premature stop codon, resulting in truncation of the amino acid sequence and therefore loss of protein function. In situ hybridization showed expression of WNT10A in odontogenic epithelium during the early and late stages of human primary tooth development. WNT10A has previously been associated with both syndromic and non-syndromic forms of tooth agenesis, and this report further expands our knowledge of genetic variation underlying non-syndromic forms of this condition. We also demonstrate expression of WNT10A in the epithelial compartment of human tooth germs during development. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
The Phyre2 web portal for protein modelling, prediction and analysis

PubMed Central

Kelley, Lawrence A; Mezulis, Stefans; Yates, Christopher M; Wass, Mark N; Sternberg, Michael JE

2017-01-01

Phyre2 is a suite of tools available on the web to predict and analyse protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a protocol. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites, and analyse the effect of amino-acid variants (e.g. nsSNPs) for a user’s protein sequence. Users are guided through results by a simple interface at a level of detail determined by them. This protocol will guide a user from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit a large number of sequences at once and to automatically run weekly searches for proteins difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 hours after submission. PMID:25950237
The velvet spiders: an atlas of the Eresidae (Arachnida, Araneae)

PubMed Central

Miller, Jeremy A.; Griswold, Charles E.; Scharff, Nikolaj; Řezáč, Milan; Szűts, Tamás; Marhabaie, Mohammad

2012-01-01

The family Eresidae C. L. Koch, 1850 is reviewed at the genus level. The family comprises nine genera including one new genus. They are: Adonea Simon, 1873, Dorceus C. L. Koch, 1846, Dresserus Simon, 1876, Eresus Walckenaer, 1805, Gandanameno Lehtinen, 1967, Loureedia gen. n., Paradonea Lawrence, 1968, Seothyra Purcell, 1903, and Stegodyphus Simon, 1873. A key to all genera and major lineages is provided along with corresponding diagnoses, as well as descriptions of selected species. These are documented with collections of photographs, scanning electron micrographs, and illustrations. A new phylogeny of Eresidae based on molecular sequence data expands on a previously published analysis. A species of the genus Paradonea Lawrence, 1968 is sequenced and placed phylogenetically for the first time. New sequences from twenty Gandanameno Lehtinen, 1967 specimens were added to investigate species limits within the genus. The genus Loureedia gen. n. is proposed to accommodate Eresus annulipes Lucas, 1857. Two species, Eresus semicanus Simon, 1908 and Eresus jerbae El-Hennawy, 2005, are synonymized with Loureedia annulipes comb. n. One new species, Paradonea presleyi sp. n. is described. Eresus algericus El-Hennawy, 2004 is transferred to Adonea Simon, 1873. The female of Dorceus fastuosus C. L. Koch, 1846 is described for the first time. The first figures depicting Paradonea splendens (Lawrence, 1936) are presented. PMID:22679386
Biomining active cellulases from a mining bioremediation system.

PubMed

Mewis, Keith; Armstrong, Zachary; Song, Young C; Baldwin, Susan A; Withers, Stephen G; Hallam, Steven J

2013-09-20

Functional metagenomics has emerged as a powerful method for gene model validation and enzyme discovery from natural and human engineered ecosystems. Here we report development of a high-throughput functional metagenomic screen incorporating bioinformatic and biochemical analyses features. A fosmid library containing 6144 clones sourced from a mining bioremediation system was screened for cellulase activity using 2,4-dinitrophenyl β-cellobioside, a previously proven cellulose model substrate. Fifteen active clones were recovered and fully sequenced revealing 9 unique clones with the ability to hydrolyse 1,4-β-D-glucosidic linkages. Transposon mutagenesis identified genes belonging to glycoside hydrolase (GH) 1, 3, or 5 as necessary for mediating this activity. Reference trees for GH 1, 3, and 5 families were generated from sequences in the CAZy database for automated phylogenetic analysis of fosmid end and active clone sequences revealing known and novel cellulase encoding genes. Active cellulase genes recovered in functional screens were subcloned into inducible high copy plasmids, expressed and purified to determine enzymatic properties including thermostability, pH optima, and substrate specificity. The workflow described here provides a general paradigm for recovery and characterization of microbially derived genes and gene products based on genetic logic and contemporary screening technologies developed for model organismal systems. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Livestock rabies outbreaks in Shanxi province, China.

PubMed

Feng, Ye; Shi, Yanyan; Yu, Mingyang; Xu, Weidi; Gong, Wenjie; Tu, Zhongzhong; Ding, Laixi; He, Biao; Guo, Huancheng; Tu, Changchun

2016-10-01

Dogs play an important role in rabies transmission throughout the world. In addition to the severe human rabies situation in China, spillover of rabies virus from dogs in recent years has caused rabies outbreaks in sheep, cattle and pigs, showing that there is an increasing threat to other domestic animals. Two livestock rabies outbreaks were caused by dogs in Shanxi province, China from April to October in 2015, resulting in the deaths of 60 sheep, 10 cattle and one donkey. Brain samples from one infected bovine and the donkey were determined to be rabies virus (RABV) positive by fluorescent antibody test (FAT) and reverse transcription polymerase chain reaction (RT-PCR). The complete RABV N genes of the two field strains, together with those of two previously confirmed Shanxi dog strains, were amplified, sequenced and compared phylogenetically with published sequences of the N gene of RABV strains from Shanxi and surrounding provinces. All of the strains from Shanxi province grouped closely, sharing 99.6 %-100 % sequence identity, indicating the wide distribution and transmission of dog-mediated rabies in these areas. This is the first description of donkey rabies symptoms with phylogenetic analysis of RABVs in Shanxi province and surrounding regions. The result emphasizes the need for mandatory dog rabies vaccination and improved public education to eradicate dog rabies transmission.
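The sequence identity range quoted in this abstract is a pairwise measure over aligned sequences. The abstract does not say how identity was computed, so the sketch below is only one common convention: count matching columns and ignore any column containing a gap. The function name and the two toy sequences are hypothetical and are not real RABV N gene data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, ungapped columns at which two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 20 columns, one mismatch
strain_1 = "ATGGATGCCGACAAGATTGT"
strain_2 = "ATGGATGCCGACAAAATTGT"
identity = percent_identity(strain_1, strain_2)
print(f"{identity:.1f} %")
```

Other conventions (for example, counting gaps as mismatches) give different values, so the convention should be stated whenever identities are compared across studies.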
Sequence stratigraphy of the Upper Cambrian (Furongian; Jiangshanian and Sunwaptan) Tunnel City Group, Upper Mississippi Valley: Transgressing assumptions of cratonic flooding

USGS Publications Warehouse

Eoff, Jennifer D.

2014-01-01

New data from detailed measured sections permit comprehensive analysis of the sequence framework of the Furongian (Upper Cambrian; Jiangshanian and Sunwaptan stages) Tunnel City Group (Lone Rock Formation and Mazomanie Formation) of Wisconsin and Minnesota. The sequence-stratigraphic architecture of the lower part of the Sunwaptan Stage at the base of the Tunnel City Group, at the contact between the Wonewoc Formation and Lone Rock Formation, records the first part of complex polyphase flooding (Sauk III) of the Laurentian craton, at a scale smaller than most events recorded by global sea-level curves. Flat-pebble conglomerate and glauconite document transgressive ravinement and development of a condensed section when creation of accommodation exceeded its consumption by sedimentation. Thinly-bedded, fossiliferous sandstone represents the most distal setting during earliest highstand. Subsequent deposition of sandstone characterized by hummocky or trough cross-stratification records progradational pulses of shallower, storm- and wave-dominated environments across the craton before final flooding of Sauk III commenced with carbonate deposition during the middle part of the Sunwaptan Stage. Comparison of early Sunwaptan flooding of the inner Laurentian craton to published interpretations from other parts of North America suggests that Sauk III was not a single, long-term accommodation event as previously proposed.
PCR amplification and sequences of cDNA clones for the small and large subunits of ADP-glucose pyrophosphorylase from barley tissues.

PubMed

Villand, P; Aalen, R; Olsen, O A; Lüthi, E; Lönneborg, A; Kleczkowski, L A

1992-06-01

Several cDNAs encoding the small and large subunits of ADP-glucose pyrophosphorylase (AGP) were isolated from total RNA of the starchy endosperm, roots and leaves of barley by polymerase chain reaction (PCR). Sets of degenerate oligonucleotide primers, based on previously published conserved amino acid sequences of plant AGP, were used for synthesis and amplification of the cDNAs. For each of the endosperm, roots and leaves, restriction analysis of the PCR products (ca. 550 nucleotides each) revealed heterogeneity, suggesting the presence of three transcripts for AGP in the endosperm and roots, and up to two AGP transcripts in the leaf tissue. Based on the derived amino acid sequences, two clones from the endosperm, beps and bepl, were identified as coding for the small and large subunit of AGP, respectively, while a leaf transcript (blpl) encoded the putative large subunit of AGP. There was about 50% identity between the endosperm clones, and both of them were about 60% identical to the leaf cDNA. Northern blot analysis has indicated that beps and bepl are expressed in both the endosperm and roots, while blpl is detectable only in leaves. Application of the PCR technique in studies on gene structure and gene expression of plant AGP is discussed.
Of mice and (Viking?) men: phylogeography of British and Irish house mice.

PubMed

Searle, Jeremy B; Jones, Catherine S; Gündüz, Islam; Scascitelli, Moira; Jones, Eleanor P; Herman, Jeremy S; Rambau, R Victor; Noble, Leslie R; Berry, R J; Giménez, Mabel D; Jóhannesdóttir, Fríða

2009-01-22

The west European subspecies of house mouse (Mus musculus domesticus) has gained much of its current widespread distribution through commensalism with humans. This means that the phylogeography of M. m. domesticus should reflect patterns of human movements. We studied restriction fragment length polymorphism (RFLP) and DNA sequence variations in mouse mitochondrial (mt) DNA throughout the British Isles (328 mice from 105 localities, including previously published data). There is a major mtDNA lineage revealed by both RFLP and sequence analyses, which is restricted to the northern and western peripheries of the British Isles, and also occurs in Norway. This distribution of the 'Orkney' lineage fits well with the sphere of influence of the Norwegian Vikings and was probably generated through inadvertent transport by them. To form viable populations, house mice would have required large human settlements such as the Norwegian Vikings founded. The other parts of the British Isles (essentially most of mainland Britain) are characterized by house mice with different mtDNA sequences, some of which are also found in Germany, and which probably reflect both Iron Age movements of people and mice and earlier development of large human settlements. MtDNA studies on house mice have the potential to reveal novel aspects of human history.

Sequence-based analysis of the microbial composition of water kefir from multiple sources.

PubMed

Marsh, Alan J; O'Sullivan, Orla; Hill, Colin; Ross, R Paul; Cotter, Paul D

2013-11-01

Water kefir is a water-sucrose-based beverage, fermented by a symbiosis of bacteria and yeast to produce a final product that is lightly carbonated, acidic and that has a low alcohol percentage. The microorganisms present in water kefir are introduced via water kefir grains, which consist of a polysaccharide matrix in which the microorganisms are embedded. We aimed to provide a comprehensive sequencing-based analysis of the bacterial population of water kefir beverages and grains, while providing an initial insight into the corresponding fungal population. To facilitate this objective, four water kefirs were sourced from the UK, Canada and the United States. Culture-independent, high-throughput, sequencing-based analyses revealed that the bacterial fraction of each water kefir and grain was dominated by Zymomonas, an ethanol-producing bacterium, which has not previously been detected at such a scale. The other genera detected were representatives of the lactic acid bacteria and acetic acid bacteria. Our analysis of the fungal component established that it was comprised of the genera Dekkera, Hanseniaspora, Saccharomyces, Zygosaccharomyces, Torulaspora and Lachancea. This information will assist in the ultimate identification of the microorganisms responsible for the potentially health-promoting attributes of these beverages. © 2013 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.
Of mice and (Viking?) men: phylogeography of British and Irish house mice

PubMed Central

Searle, Jeremy B.; Jones, Catherine S.; Gündüz, İslam; Scascitelli, Moira; Jones, Eleanor P.; Herman, Jeremy S.; Rambau, R. Victor; Noble, Leslie R.; Berry, R.J.; Giménez, Mabel D.; Jóhannesdóttir, Fríða

2008-01-01

The west European subspecies of house mouse (Mus musculus domesticus) has gained much of its current widespread distribution through commensalism with humans. This means that the phylogeography of M. m. domesticus should reflect patterns of human movements. We studied restriction fragment length polymorphism (RFLP) and DNA sequence variations in mouse mitochondrial (mt) DNA throughout the British Isles (328 mice from 105 localities, including previously published data). There is a major mtDNA lineage revealed by both RFLP and sequence analyses, which is restricted to the northern and western peripheries of the British Isles, and also occurs in Norway. This distribution of the ‘Orkney’ lineage fits well with the sphere of influence of the Norwegian Vikings and was probably generated through inadvertent transport by them. To form viable populations, house mice would have required large human settlements such as the Norwegian Vikings founded. The other parts of the British Isles (essentially most of mainland Britain) are characterized by house mice with different mtDNA sequences, some of which are also found in Germany, and which probably reflect both Iron Age movements of people and mice and earlier development of large human settlements. MtDNA studies on house mice have the potential to reveal novel aspects of human history. PMID:18826939
Mitogenome Sequencing in the Genus Camelus Reveals Evidence for Purifying Selection and Long-term Divergence between Wild and Domestic Bactrian Camels.

PubMed

Mohandesan, Elmira; Fitak, Robert R; Corander, Jukka; Yadamsuren, Adiya; Chuluunbat, Battsetseg; Abdelhadi, Omer; Raziq, Abdul; Nagy, Peter; Stalder, Gabrielle; Walzer, Chris; Faye, Bernard; Burger, Pamela A

2017-08-30

The genus Camelus is an interesting model to study adaptive evolution in the mitochondrial genome, as the three extant Old World camel species inhabit hot and low-altitude as well as cold and high-altitude deserts. We sequenced 24 camel mitogenomes and combined them with three previously published sequences to study the role of natural selection under different environmental pressures, and to advance our understanding of the evolutionary history of the genus Camelus. We confirmed the heterogeneity of divergence across different components of the electron transport system. Lineage-specific analysis of mitochondrial protein evolution revealed a significant effect of purifying selection in the concatenated protein-coding genes in domestic Bactrian camels. The estimated dN/dS < 1 in the concatenated protein-coding genes suggested purifying selection as the driving force shaping mitogenome diversity in camels. Additional analyses of the functional divergence in amino acid changes between species-specific lineages indicated fixed substitutions in various genes, with radical effects on the physicochemical properties of the protein products. The evolutionary time estimates revealed a divergence between domestic and wild Bactrian camels around 1.1 [0.58-1.8] million years ago (mya). This has major implications for the conservation and management of the critically endangered wild species, Camelus ferus.
Complete mitochondrial genome of the Asian paddle crab Charybdis japonica (Crustacea: Decapoda: Portunidae): gene rearrangement of the marine brachyurans and phylogenetic considerations of the decapods.

PubMed

Liu, Yuan; Cui, Zhaoxia

2010-06-01

Given the commercial and ecological importance of the Asian paddle crab, Charybdis japonica, there is a clear need for genetic and molecular research on this species. Here, we present the complete mitochondrial genome sequence of C. japonica, determined by the long-polymerase chain reaction and primer walking sequencing method. The entire genome is 15,738 bp in length, encoding a standard set of 13 protein-coding genes, two ribosomal RNA genes, and 22 transfer RNA genes, plus the putative control region, which is typical for metazoans. The total A+T content of the genome is 69.2%, lower than that of the other brachyuran crabs except Callinectes sapidus. The gene order is identical to that of the published marine brachyurans and differs from the ancestral pancrustacean order only in the position of the tRNA(His) gene. Phylogenetic analyses using the concatenated nucleotide and amino acid sequences of 13 protein-coding genes strongly support the monophyly of Dendrobranchiata and Pleocyemata, which is consistent with the previous taxonomic classification. However, the systematic status of Charybdis within subfamily Thalamitinae of family Portunidae is not supported. C. japonica, as the first species of Charybdis with complete mitochondrial genome available, will provide important information on both genomics and molecular ecology of the group.
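The A+T content reported in this abstract is a simple base-composition statistic. As a minimal sketch, assuming the percentage is taken over unambiguous bases only (the abstract does not say how ambiguous bases were handled), it could be computed as follows; the function name and the toy sequence are hypothetical and are not C. japonica data.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("A") + seq.count("T")) / counted


# Hypothetical 20-base toy sequence: 8 A, 8 T, 2 C, 2 G
toy = "ATTAAGCTATTTACGATAAT"
print(f"A+T content: {at_content(toy):.1f}%")
```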
The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

PubMed

Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

2017-01-04

The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data (previously only browseable through our FTP site) by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Deriving high-resolution protein backbone structure propensities from all crystal data using the information maximization device.

PubMed

Solis, Armando D

2014-01-01

The most informative probability distribution functions (PDFs) describing the Ramachandran phi-psi dihedral angle pair, a fundamental descriptor of backbone conformation of protein molecules, are derived from high-resolution X-ray crystal structures using an information-theoretic approach. The Information Maximization Device (IMD) is established, based on fundamental information-theoretic concepts, and then applied specifically to derive highly resolved phi-psi maps for all 20 single amino acid and all 8000 triplet sequences at an optimal resolution determined by the volume of current data. The paper shows that utilizing the latent information contained in all viable high-resolution crystal structures found in the Protein Data Bank (PDB), totaling more than 77,000 chains, permits the derivation of a large number of optimized sequence-dependent PDFs. This work demonstrates the effectiveness of the IMD and the superiority of the resulting PDFs by extensive fold recognition experiments and rigorous comparisons with previously published triplet PDFs. Because it automatically optimizes PDFs, IMD results in improved performance of knowledge-based potentials, which rely on such PDFs. Furthermore, it provides an easy computational recipe for empirically deriving other kinds of sequence-dependent structural PDFs with greater detail and precision. The high-resolution phi-psi maps derived in this work are available for download.
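The figure of 8000 triplet sequences in this abstract matches the number of ordered triplets over the 20 standard amino acids (20 × 20 × 20). The sketch below simply enumerates them; it assumes the standard one-letter codes and ordered triplets, and says nothing about how the IMD itself indexes its phi-psi maps.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered triplet, e.g. "AAA", "AAC", ..., "YYY"
triplets = ["".join(t) for t in product(AMINO_ACIDS, repeat=3)]

print(len(AMINO_ACIDS), len(triplets))
```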
Cryopreservation of Fish Spermatogonial Cells: The Future of Natural History Collections.

PubMed

Hagedorn, Mary M; Daly, Jonathan P; Carter, Virginia L; Cole, Kathleen S; Jaafar, Zeehan; Lager, Claire V A; Parenti, Lynne R

2018-04-18

As global biodiversity declines, the value of biological collections increases. Cryopreserved diploid spermatogonial cells meet two goals: to yield high-quality molecular sequence data; and to regenerate new individuals, hence potentially countering species extinction. Cryopreserved spermatogonial cells that allow for such mitigative measures are not currently in natural history museum collections because there are no standard protocols to collect them. Vertebrate specimens, especially fishes, are traditionally formalin-fixed and alcohol-preserved which makes them ideal for morphological studies and as museum vouchers, but inadequate for molecular sequence data. Molecular studies of fishes routinely use tissues preserved in ethanol; yet tissues preserved in this way may yield degraded sequences over time. As an alternative to tissue fixation methods, we assessed and compared previously published cryopreservation methods by gating and counting fish testicular cells with flow cytometry to identify presumptive spermatogonia A-type cells. Here we describe a protocol to cryopreserve tissues that yields a high percentage of viable spermatogonial cells from the testes of Asterropteryx semipunctata, a marine goby. Material cryopreserved using this protocol represents the first frozen and post-thaw viable spermatogonial cells of fishes archived in a natural history museum to provide better quality material for re-derivation of species and DNA preservation and analysis.
Phylogenetic Variation of Chelonid Alphaherpesvirus 5 (ChHV5) in Populations of Green Turtles Chelonia mydas along the Queensland Coast, Australia.

PubMed

Ariel, E; Nainu, F; Jones, K; Juntunen, K; Bell, I; Gaston, J; Scott, J; Trocini, S; Burgess, G W

2017-09-01

Sea turtle fibropapillomatosis (FP) is a disease marked by the proliferation of benign but debilitating cutaneous and occasional visceral tumors, likely to be caused by chelonid alphaherpesvirus 5 (ChHV5). This study presents a phylogeny of ChHV5 strains found on the east coast of Queensland, Australia, and a validation for previously unused primers. Two different primer sets (gB-1534 and gB-813) were designed to target a region including part of the UL27 glycoprotein B (gB) gene and part of UL28 of ChHV5. Sequences obtained from FP tumors found on juvenile green turtles Chelonia mydas (<65 cm curved carapace length) had substantial homology with published ChHV5 sequences, while a skin biopsy from a turtle without FP failed to react in the PCRs used in this study. The resulting sequences were used to generate a neighbor-joining tree from which three clusters of ChHV5 from Australian waters were identified: north Australian, north Queensland, and Queensland clusters. The clusters reflect the collection sites on the east coast of Queensland with a definitive north-south trend. Received October 22, 2016; accepted May 7, 2017.
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1987-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1990-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1988-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1989-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Patome: a database server for biological sequence annotation and analysis in issued patents and published patent applications.

PubMed

Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H; Lee, Doheon

2007-01-01

With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene-patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene-patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly.
Patome: a database server for biological sequence annotation and analysis in issued patents and published patent applications

PubMed Central

Lee, Byungwook; Kim, Taehyung; Kim, Seon-Kyu; Lee, Kwang H.; Lee, Doheon

2007-01-01

With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene–patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene–patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at http://www.patome.org/; the information is updated bimonthly. PMID:17085479
SAM syndrome is characterized by extensive phenotypic heterogeneity.

PubMed

Taiber, Shahar; Samuelov, Liat; Mohamad, Janan; Cohen Barak, Eran; Sarig, Ofer; Shalev, Stavit Allon; Lestringant, Gilles; Sprecher, Eli

2018-03-31

Severe skin dermatitis, multiple allergies and metabolic wasting (SAM) syndrome is a rare life-threatening inherited condition caused by bi-allelic mutations in DSG1 encoding desmoglein 1. The disease was initially reported to manifest with severe erythroderma, failure to thrive, atopic manifestations, recurrent infections, hypotrichosis and palmoplantar keratoderma. We present 3 new cases of SAM syndrome in 2 families and review the cases published so far. Whole exome and direct sequencing were used to identify SAM syndrome-causing mutations. Consistent with previous data, SAM syndrome was found in all 3 patients to result from homozygous mutations in DSG1 predicted to result in premature termination of translation. In contrast, as compared with patients previously reported, the present cases were found to display a wide range of clinical presentations of variable degrees of severity. The present data emphasize the fact that SAM syndrome is characterized by extensive phenotypic heterogeneity, suggesting the existence of potent modifier traits. This article is protected by copyright. All rights reserved.
An improved method of measuring heart rate using a webcam

NASA Astrophysics Data System (ADS)

Liu, Yi; Ouyang, Jianfei; Yan, Yonggang

2014-09-01

Measuring heart rate traditionally requires special equipment and physical contact with the subject. Reliable non-contact and low-cost measurements are highly desirable for convenient and comfortable physiological self-assessment. Previous work has shown that consumer-grade cameras can provide useful signals for remote heart rate measurements. In this paper, a simple and robust method of measuring heart rate using a low-cost webcam is proposed. The blood volume pulse is extracted by proper Region of Interest (ROI) and color channel selection from image sequences of human faces without complex computation. Heart rate is subsequently quantified by spectrum analysis. The method is successfully applied under natural lighting conditions. Results of experiments show that it takes less time, is much simpler, and has similar accuracy to the previously published and widely used method of Independent Component Analysis (ICA). Benefiting from its non-contact operation, convenience, and low cost, the method holds great promise for the popularization of home healthcare and can further be applied to biomedical research.
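The spectrum-analysis step described in this abstract can be illustrated with a short sketch. This is not the authors' implementation: the function name, the 0.7 to 4.0 Hz search band, and the synthetic input trace are all assumptions made for illustration, and the ROI and color-channel selection that produce the trace from video frames are not shown.

```python
import numpy as np

def estimate_heart_rate_bpm(signal, fps, low_hz=0.7, high_hz=4.0):
    """Return the dominant frequency of `signal` inside [low_hz, high_hz], in beats per minute.

    `signal` is assumed to be a per-frame mean intensity of one color channel
    over a face region; the band limits are illustrative assumptions.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                              # remove the constant (DC) component
    power = np.abs(np.fft.rfft(x)) ** 2           # one-sided power spectrum
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fps)  # frequency of each bin in Hz
    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("signal too short to resolve the requested band")
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Synthetic trace (hypothetical data): a 1.2 Hz pulse plus noise, 20 s at 30 frames/s.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
trace = 0.5 * np.sin(2 * np.pi * 1.2 * t) + 0.1 * rng.standard_normal(t.size)
bpm = estimate_heart_rate_bpm(trace, fps)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

The frequency resolution of such an estimate is the frame rate divided by the number of frames, so longer recordings give finer heart rate resolution.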
Simplified and economical 2D IR spectrometer design using a dual acousto-optic modulator

PubMed Central

Skoff, David R.; Laaser, Jennifer E.; Mukherjee, Sudipta S.; Middleton, Chris T.; Zanni, Martin T.

2012-01-01

Over the last decade two-dimensional infrared (2D IR) spectroscopy has proven to be a very useful extension of infrared spectroscopy, yet the technique remains restricted to a small group of specialized researchers because of its experimental complexity and high equipment cost. We report on a spectrometer that is compact, mechanically robust, and is much less expensive than previous designs because it uses a single pixel MCT detector rather than an array detector. Moreover, each axis of the spectrum can be collected in either the time or frequency domain via computer programming. We discuss pulse sequences for scanning the probe axis, which were not previously possible. We present spectra on metal carbonyl compounds at 5 µm and a model peptide at 6 µm. Data collection with a single pixel MCT takes longer than using an array detector, but publishable quality data are still achieved with only a few minutes of averaging. PMID:24659850
Mutations in PCYT1A cause spondylometaphyseal dysplasia with cone-rod dystrophy.

PubMed

Yamamoto, Guilherme L; Baratela, Wagner A R; Almeida, Tatiana F; Lazar, Monize; Afonso, Clara L; Oyamada, Maria K; Suzuki, Lisa; Oliveira, Luiz A N; Ramos, Ester S; Kim, Chong A; Passos-Bueno, Maria Rita; Bertola, Débora R

2014-01-02

Spondylometaphyseal dysplasia with cone-rod dystrophy is a rare autosomal-recessive disorder characterized by severe short stature, progressive lower-limb bowing, flattened vertebral bodies, metaphyseal involvement, and visual impairment caused by cone-rod dystrophy. Whole-exome sequencing of four individuals affected by this disorder from two Brazilian families identified two previously unreported homozygous mutations in PCYT1A. This gene encodes the alpha isoform of the phosphate cytidylyltransferase 1 choline enzyme, which is responsible for converting phosphocholine into cytidine diphosphate-choline, a key intermediate step in the phosphatidylcholine biosynthesis pathway. A different enzymatic defect in this pathway has been previously associated with a muscular dystrophy with mitochondrial structural abnormalities that does not have cartilage and/or bone or retinal involvement. Thus, the deregulation of the phosphatidylcholine pathway may play a role in multiple genetic diseases in humans, and further studies are necessary to uncover its precise pathogenic mechanisms and the entirety of its phenotypic spectrum. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Independent and combined analyses of sequences from all three genomic compartments converge on the root of flowering plant phylogeny

PubMed Central

Barkman, Todd J.; Chenery, Gordon; McNeal, Joel R.; Lyons-Weiler, James; Ellisens, Wayne J.; Moore, Gerry; Wolfe, Andrea D.; dePamphilis, Claude W.

2000-01-01

Plant phylogenetic estimates are most likely to be reliable when congruent evidence is obtained independently from the mitochondrial, plastid, and nuclear genomes with all methods of analysis. Here, results are presented from separate and combined genomic analyses of new and previously published data, including six and nine genes (8,911 bp and 12,010 bp, respectively) for different subsets of taxa that suggest Amborella + Nymphaeales (water lilies) are the first-branching angiosperm lineage. Before and after tree-independent noise reduction, most individual genomic compartments and methods of analysis estimated the Amborella + Nymphaeales basal topology with high support. Previous phylogenetic estimates placing Amborella alone as the first extant angiosperm branch may have been misled because of a series of specific problems with paralogy, suboptimal outgroups, long-branch taxa, and method dependence. Ancestral character state reconstructions differ between the two topologies and affect inferences about the features of early angiosperms. PMID:11069280
Resurgence of Integrated Behavioral Units

PubMed Central

Bachá-Méndez, Gustavo; Reid, Alliston K; Mendoza-Soylovna, Adela

2007-01-01

Two experiments with rats examined the dynamics of well-learned response sequences when reinforcement contingencies were changed. Both experiments contained four phases, each of which reinforced a 2-response sequence of lever presses until responding was stable. The contingencies then were shifted to a new reinforced sequence until responding was again stable. Extinction-induced resurgence of previously reinforced, and then extinguished, heterogeneous response sequences was observed in all subjects in both experiments. These sequences were demonstrated to be integrated behavioral units, controlled by processes acting at the level of the entire sequence. Response-level processes were also simultaneously operative. Errors in sequence production were strongly influenced by the terminal, not the initial, response in the currently reinforced sequence, but not by the previously reinforced sequence. These studies demonstrate that sequence-level and response-level processes can operate simultaneously in integrated behavioral units. Resurgence and the development of integrated behavioral units may be dissociated; thus the observation of one does not necessarily imply the other. PMID:17345948

Deep whole-genome sequencing of 90 Han Chinese genomes.

PubMed

Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

2017-09-01

Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency < 5%), including 5 813 503 single nucleotide polymorphisms, 1 169 199 InDels, and 17 927 structural variants. Using deep sequencing data, we have built a greatly expanded spectrum of genetic variation for the Han Chinese genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000 Genomes Project, as well as to other human genome projects. © The Authors 2017. Published by Oxford University Press.
NEP: web server for epitope prediction based on antibody neutralization of viral strains with diverse sequences.

PubMed

Chuang, Gwo-Yu; Liou, David; Kwong, Peter D; Georgiev, Ivelin S

2014-07-01

Delineation of the antigenic site, or epitope, recognized by an antibody can provide clues about functional vulnerabilities and resistance mechanisms, and can therefore guide antibody optimization and epitope-based vaccine design. Previously, we developed an algorithm for antibody-epitope prediction based on antibody neutralization of viral strains with diverse sequences and validated the algorithm on a set of broadly neutralizing HIV-1 antibodies. Here we describe the implementation of this algorithm, NEP (Neutralization-based Epitope Prediction), as a web-based server. The users must supply as input: (i) an alignment of antigen sequences of diverse viral strains; (ii) neutralization data for the antibody of interest against the same set of antigen sequences; and (iii) (optional) a structure of the unbound antigen, for enhanced prediction accuracy. The prediction results can be downloaded or viewed interactively on the antigen structure (if supplied) from the web browser using a JSmol applet. Since neutralization experiments are typically performed as one of the first steps in the characterization of an antibody to determine its breadth and potency, the NEP server can be used to predict antibody-epitope information at no additional experimental costs. NEP can be accessed on the internet at http://exon.niaid.nih.gov/nep. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Spherical: an iterative workflow for assembling metagenomic datasets.

PubMed

Hitch, Thomas C A; Creevey, Christopher J

2018-01-24

The consensus emerging from the study of microbiomes is that they are far more complex than previously thought, requiring better assemblies and increasingly deeper sequencing. However, current metagenomic assembly techniques regularly fail to incorporate all, or even the majority in some cases, of the sequence information generated for many microbiomes, negating this effort. This can especially bias the information gathered and the perceived importance of the minor taxa in a microbiome. We propose a simple but effective approach, implemented in Python, to address this problem. Based on an iterative methodology, our workflow (called Spherical) carries out successive rounds of assemblies with the sequencing reads not yet utilised. This approach also allows the user to reduce the resources required for very large datasets, by assembling random subsets of the whole in a "divide and conquer" manner. We demonstrate the accuracy of Spherical using simulated data based on completely sequenced genomes and the effectiveness of the workflow at retrieving lost information for taxa in three published metagenomics studies of varying sizes. Our results show that Spherical increased the number of reads utilized in the assembly by up to 109% compared to the base assembly. The additional contigs assembled by the Spherical workflow resulted in significant (P < 0.05) changes in the predicted taxonomic profile of all datasets analysed. Spherical is implemented in Python 2.7 and freely available for use under the MIT license. Source code and documentation are hosted publicly at: https://github.com/thh32/Spherical .
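The iterative idea in this abstract (assemble, set aside the reads that were used, then assemble again from what remains) can be sketched generically. The code below is not taken from Spherical and does not reflect its interface: every function name and parameter is hypothetical, and the toy "assembler" and "mapper" are stand-ins for real external tools. Reads are plain strings here, whereas real reads would be tracked by identifier.

```python
import random
from collections import Counter

def iterative_assembly(reads, assemble, find_used_reads,
                       subset_fraction=1.0, max_rounds=10, seed=0):
    """Repeatedly assemble reads that earlier rounds left unused.

    `assemble(subset)` returns a list of contigs; `find_used_reads(reads, contigs)`
    returns the set of reads accounted for by those contigs. Both are
    caller-supplied placeholders (hypothetical, for illustration only).
    """
    rng = random.Random(seed)
    unused = list(reads)
    contigs = []
    for _ in range(max_rounds):
        if not unused:
            break
        # Optionally assemble only a random subset to limit resource use.
        k = max(1, int(len(unused) * subset_fraction))
        subset = rng.sample(unused, k)
        new_contigs = assemble(subset)
        if not new_contigs:
            break
        contigs.extend(new_contigs)
        used = find_used_reads(unused, new_contigs)
        if not used:
            break  # no progress; stop rather than loop forever
        unused = [r for r in unused if r not in used]
    return contigs, unused

def toy_assemble(subset):
    """Toy stand-in: only the most abundant sequence in the subset 'assembles'."""
    read, _count = Counter(subset).most_common(1)[0]
    return [read]

def toy_find_used_reads(reads, contigs):
    """Toy stand-in for read mapping: a read is used if it occurs in a contig."""
    return {r for r in reads if any(r in c for c in contigs)}

# Hypothetical community: one abundant, one intermediate, and one rare "taxon".
reads = ["AAAA"] * 5 + ["CCCC"] * 3 + ["GGGG"]
contigs, leftover = iterative_assembly(reads, toy_assemble, toy_find_used_reads)
print(contigs, leftover)
```

In this toy run the rare sequence is recovered only in a later round, once the abundant ones have been removed, which is the effect on minor taxa that the abstract describes.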
Classification of Lactococcus lactis cell envelope proteinase based on gene sequencing, peptides formed after hydrolysis of milk, and computer modeling.

PubMed

Børsting, M W; Qvist, K B; Brockmann, E; Vindeløv, J; Pedersen, T L; Vogensen, F K; Ardö, Y

2015-01-01

Lactococcus lactis strains depend on a proteolytic system for growth in milk to release essential AA from casein. The cleavage specificities of the cell envelope proteinase (CEP) can vary between strains and environments and whether the enzyme is released or bound to the cell wall. Thirty-eight Lc. lactis strains were grouped according to their CEP AA sequences and according to identified peptides after hydrolysis of milk. Finally, AA positions in the substrate binding region were suggested by the use of a new CEP template based on Streptococcus C5a CEP. Aligning the CEP AA sequences of 38 strains of Lc. lactis showed that 21 strains, which were previously classified as group d, could be subdivided into 3 groups. Independently, similar subgroupings were found based on comparison of the Lc. lactis CEP AA sequences and based on normalized quantity of identified peptides released from αS1-casein and β-casein. A model structure of Lc. lactis CEP based on the crystal structure of Streptococcus C5a CEP was used to investigate the AA positions in the substrate-binding region. New AA positions were suggested, which could be relevant for the cleavage specificity of CEP; however, these could only explain 2 out of 3 found subgroups. The third subgroup could be explained by 1 to 5 AA positions located opposite the substrate binding region. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Whole exome sequencing for determination of tumor mutation load in liquid biopsy from advanced cancer patients.

PubMed

Koeppel, Florence; Blanchard, Steven; Jovelet, Cécile; Genin, Bérengère; Marcaillou, Charles; Martin, Emmanuel; Rouleau, Etienne; Solary, Eric; Soria, Jean-Charles; André, Fabrice; Lacroix, Ludovic

2017-01-01

Tumor mutation load (TML) has been proposed as a biomarker of patient response to immunotherapy in several studies. TML is usually determined by tumor biopsy DNA (tDNA) whole exome sequencing (WES); therefore, TML evaluation is limited by informative biopsy availability. Circulating cell free DNA (cfDNA) provided by liquid biopsy is a surrogate specimen to biopsy for molecular profiling. Nevertheless, performing WES on DNA from plasma is technically challenging, and the ability to determine tumor mutation load from liquid biopsies remains to be demonstrated. In the current study, WES was performed on cfDNA from 32 metastatic patients of various cancer types included into MOSCATO 01 (NCT01566019) and/or MATCHR (NCT02517892) molecular triage trials. Results from targeted gene sequencing (TGS) and WES performed on cfDNA were compared to results from tumor tissue biopsy. In cfDNA samples, WES mutation detection sensitivity was 92% compared to targeted sequencing (TGS). When comparing cfDNA-WES to tDNA-WES, mutation detection sensitivity was 53%, consistent with a previously published prospective study comparing cfDNA-TGS to tDNA-TGS. For samples in which presence of tumor DNA was confirmed in cfDNA, tumor mutation load from liquid biopsy was correlated with tumor biopsy. Taken together, this study demonstrated that liquid biopsy may be applied to determine tumor mutation load. Qualifying the liquid biopsy for interpretation is a crucial step in using cfDNA for mutational load estimation.
Whole exome sequencing for determination of tumor mutation load in liquid biopsy from advanced cancer patients

PubMed Central

Blanchard, Steven; Jovelet, Cécile; Genin, Bérengère; Marcaillou, Charles; Martin, Emmanuel; Rouleau, Etienne; Solary, Eric; Soria, Jean-Charles; André, Fabrice; Lacroix, Ludovic

2017-01-01

Tumor mutation load (TML) has been proposed as a biomarker of patient response to immunotherapy in several studies. TML is usually determined by tumor biopsy DNA (tDNA) whole exome sequencing (WES); therefore, TML evaluation is limited by informative biopsy availability. Circulating cell free DNA (cfDNA) provided by liquid biopsy is a surrogate specimen to biopsy for molecular profiling. Nevertheless, performing WES on DNA from plasma is technically challenging, and the ability to determine tumor mutation load from liquid biopsies remains to be demonstrated. In the current study, WES was performed on cfDNA from 32 metastatic patients of various cancer types included into MOSCATO 01 (NCT01566019) and/or MATCHR (NCT02517892) molecular triage trials. Results from targeted gene sequencing (TGS) and WES performed on cfDNA were compared to results from tumor tissue biopsy. In cfDNA samples, WES mutation detection sensitivity was 92% compared to targeted sequencing (TGS). When comparing cfDNA-WES to tDNA-WES, mutation detection sensitivity was 53%, consistent with a previously published prospective study comparing cfDNA-TGS to tDNA-TGS. For samples in which presence of tumor DNA was confirmed in cfDNA, tumor mutation load from liquid biopsy was correlated with tumor biopsy. Taken together, this study demonstrated that liquid biopsy may be applied to determine tumor mutation load. Qualifying the liquid biopsy for interpretation is a crucial step in using cfDNA for mutational load estimation. PMID:29161279
Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing.

PubMed

Crampton, Mollee; Sripathi, Venkateswara R; Hossain, Khwaja; Kalavacharla, Venu

2016-01-01

Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays methylation patterns similar to those of other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation; however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were higher in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar ("Sierra") using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation.
Stellar Diameters and Temperatures. III. Main-sequence A, F, G, and K Stars: Additional High-precision Measurements and Empirical Relations

NASA Astrophysics Data System (ADS)

Boyajian, Tabetha S.; von Braun, Kaspar; van Belle, Gerard; Farrington, Chris; Schaefer, Gail; Jones, Jeremy; White, Russel; McAlister, Harold A.; ten Brummelaar, Theo A.; Ridgway, Stephen; Gies, Douglas; Sturmann, Laszlo; Sturmann, Judit; Turner, Nils H.; Goldfinger, P. J.; Vargas, Norm

2013-07-01

Based on CHARA Array measurements, we present the angular diameters of 23 nearby, main-sequence stars, ranging from spectral types A7 to K0, 5 of which are exoplanet host stars. We derive linear radii, effective temperatures, and absolute luminosities of the stars using Hipparcos parallaxes and measured bolometric fluxes. The new data are combined with previously published values to create an Angular Diameter Anthology of measured angular diameters to main-sequence stars (luminosity classes V and IV). This compilation consists of 125 stars with diameter uncertainties of less than 5%, ranging in spectral types from A to M. The large quantity of empirical data is used to derive color-temperature relations to an assortment of color indices in the Johnson ($B V R_J I_J J H K$), Cousins ($R_C I_C$), Kron ($R_K I_K$), Sloan ($griz$), and WISE ($W3$ $W4$) photometric systems. These relations have an average standard deviation of ~3% and are valid for stars with spectral types A0-M4. To derive even more accurate relations for Sun-like stars, we also determined these temperature relations omitting early-type stars ($T_{\rm eff}$ > 6750 K) that may have biased luminosity estimates because of rapid rotation; for this subset the dispersion is only ~2.5%. We find effective temperatures in agreement within a couple of percent for the interferometrically characterized sample of main-sequence stars compared to those derived via the infrared flux method and spectroscopic analysis.
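The step from angular diameter, parallax, and bolometric flux to linear radius, effective temperature, and luminosity follows textbook relations: R = θd/2 with d = 1/parallax, F_bol = (θ/2)² σ T_eff⁴, and L = 4πd² F_bol. The sketch below illustrates them under stated assumptions; the function names are hypothetical, the constants are nominal values, and the check uses a synthetic Sun placed at 10 pc rather than any data from the paper, whose analysis may include refinements such as limb-darkening corrections.

```python
import math

# Nominal constants (SI units).
MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)  # milliarcseconds to radians
PARSEC_M = 3.0857e16       # metres per parsec
R_SUN_M = 6.957e8          # nominal solar radius
L_SUN_W = 3.828e26         # nominal solar luminosity
SIGMA_SB = 5.670374e-8     # Stefan-Boltzmann constant, W m^-2 K^-4

def distance_m(parallax_mas):
    """Distance in metres from a parallax in milliarcseconds (d[pc] = 1000 / parallax[mas])."""
    return PARSEC_M * 1000.0 / parallax_mas

def linear_radius_rsun(theta_mas, parallax_mas):
    """Linear radius in solar radii: R = (theta / 2) * d."""
    return 0.5 * theta_mas * MAS_TO_RAD * distance_m(parallax_mas) / R_SUN_M

def effective_temperature_k(theta_mas, fbol_w_m2):
    """T_eff from F_bol = (theta / 2)^2 * sigma * T_eff^4."""
    theta = theta_mas * MAS_TO_RAD
    return (4.0 * fbol_w_m2 / (SIGMA_SB * theta ** 2)) ** 0.25

def luminosity_lsun(fbol_w_m2, parallax_mas):
    """Luminosity in solar units: L = 4 * pi * d^2 * F_bol."""
    return 4.0 * math.pi * distance_m(parallax_mas) ** 2 * fbol_w_m2 / L_SUN_W

# Synthetic round-trip check: the Sun as it would appear from 10 pc (parallax 100 mas).
parallax_mas = 100.0
d = distance_m(parallax_mas)
theta_mas = 2.0 * R_SUN_M / d / MAS_TO_RAD        # angular diameter in mas
fbol = L_SUN_W / (4.0 * math.pi * d ** 2)         # bolometric flux at that distance

radius = linear_radius_rsun(theta_mas, parallax_mas)
teff = effective_temperature_k(theta_mas, fbol)
lum = luminosity_lsun(fbol, parallax_mas)
print(f"R = {radius:.3f} R_sun, T_eff = {teff:.0f} K, L = {lum:.3f} L_sun")
```

Note that the temperature depends only on the angular diameter and the bolometric flux, so it is independent of the parallax, whereas the radius and luminosity both inherit the distance uncertainty.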
Molecular detection of trophic links in a complex insect host-parasitoid food web.

PubMed

Hrcek, Jan; Miller, Scott E; Quicke, Donald L J; Smith, M Alex

2011-09-01

Previously, host-parasitoid links have been unveiled almost exclusively by time-intensive rearing, while molecular methods were used only in simple agricultural host-parasitoid systems in the form of species-specific primers. Here, we present a general method for the molecular detection of these links applied to a complex caterpillar-parasitoid food web from tropical rainforest of Papua New Guinea. We DNA barcoded hosts, parasitoids and their tissue remnants and matched the sequences to our extensive library of local species. We were thus able to match 87% of host sequences and 36% of parasitoid sequences to species and infer subfamily or family in almost all cases. Our analysis affirmed 93 hitherto unknown trophic links between 37 host species from a wide range of Lepidoptera families and 46 parasitoid species from Hymenoptera and Diptera by identifying DNA sequences for both the host and the parasitoid involved in the interaction. Molecular detection proved especially useful in cases where distinguishing host species in caterpillar stage was difficult morphologically, or when the caterpillar died during rearing. We have even detected a case of extreme parasitoid specialization in a pair of Choreutis species that do not differ in caterpillar morphology and ecology. Using the molecular approach outlined here leads to better understanding of parasitoid host specificity, opens new possibilities for rapid surveys of food web structure and allows inference of species associations not already anticipated. Published 2011. This article is a US Government work and is in the public domain in the USA.
Targeted sequencing-based analyses of candidate gene variants in ulcerative colitis-associated colorectal neoplasia.

PubMed

Chakrabarty, Sanjiban; Varghese, Vinay Koshy; Sahu, Pranoy; Jayaram, Pradyumna; Shivakumar, Bhadravathi M; Pai, Cannanore Ganesh; Satyamoorthy, Kapaettu

2017-06-27

Long-standing ulcerative colitis (UC) leading to colorectal cancer (CRC) is one of the most serious and life-threatening consequences acknowledged globally. Ulcerative colitis-associated colorectal carcinogenesis showed distinct molecular alterations when compared with sporadic colorectal carcinoma. Targeted sequencing of 409 genes in tissue samples of 18 long-standing UC subjects at high risk of colorectal carcinoma (UCHR) was performed to identify somatic driver mutations, which may be involved in the molecular changes during the transformation of non-dysplastic mucosa to high-grade dysplasia. Findings from the study are also compared with previously published genome wide and exome sequencing data in inflammatory bowel disease-associated and sporadic colorectal carcinoma. Next-generation sequencing analysis identified 1107 mutations in 275 genes in UCHR subjects. In addition to TP53 (17%) and KRAS (22%) mutations, recurrent mutations in APC (33%), ACVR2A (61%), ARID1A (44%), RAF1 (39%) and MTOR (61%) were observed in UCHR subjects. In addition, APC, FGFR3, FGFR2 and PIK3CA driver mutations were identified in UCHR subjects. Recurrent mutations in ARID1A (44%), SMARCA4 (17%), MLL2 (44%), MLL3 (67%), SETD2 (17%) and TET2 (50%) genes involved in histone modification and chromatin remodelling were identified in UCHR subjects. Our study identifies new oncogenic driver mutations which may be involved in the transition of non-dysplastic cells to dysplastic phenotype in the subjects with long-standing UC with high risk of progression into colorectal neoplasia.
The Delta Scuti star 38 Eri from the ground and from space

NASA Astrophysics Data System (ADS)

Paparó, M.; Kolláth, Z.; Shobbrook, R. R.; Matthews, J. M.; Antoci, V.; Benkő, J. M.; Park, N.-K.; Mirtorabi, M. T.; Luedeke, K.; Kusakin, A.; Bognár, Zs; Sódor, Á.; García-Hernández, A.; Peña, J. H.; Kuschnig, R.; Moffat, A. F. J.; Rowe, J.; Rucinski, S. M.; Sasselov, D.; Weiss, W. W.

2018-04-01

We present and discuss the pulsational characteristics of the Delta Scuti star 38 Eri from photometric data obtained at two widely spaced epochs, partly from the ground (1998) and partly from space (MOST, 2011). We found 18 frequencies, resolving the discrepancy among the previously published frequencies. Some of the frequencies appeared with different relative amplitudes at the two epochs; however, we investigated amplitude variability only for the MOST data. Amplitude variability was found for one of three frequencies that satisfy the necessary frequency criteria for linear-combination or resonant-mode coupling. Checking the criteria for beating and resonant-mode coupling, we excluded them as possible reasons for the amplitude variability. The two recently developed methods of rotational-splitting and sequence-search were applied to find regular spacings based only on frequencies. Doublets or incomplete multiplets with l = 1, 2 and 3 were found in the rotational splitting search. In the sequence search method we identified four sequences. The averaged spacing, probably a combination of the large separation and the rotational frequency, is 1.724 ± 0.092 d⁻¹. Using the spacing and the scaling relation, a mean density of $\bar{\rho}$ = [0.0394, 0.0554] g cm⁻³ was derived. The shift of the sequences proved to be an integer multiple of the rotational splitting spacing. Using the precise MOST frequencies and multi-colour photometry in a hybrid way, we identified four modes with l = 1, two modes with l = 2, two modes with l = 3, and two modes as l = 0 radial modes.
Genomic Sequence and Experimental Tractability of a New Decapod Shrimp Model, Neocaridina denticulata

PubMed Central

Kenny, Nathan J.; Sin, Yung Wa; Shen, Xin; Zhe, Qu; Wang, Wei; Chan, Ting Fung; Tobe, Stephen S.; Shimeld, Sebastian M.; Chu, Ka Hou; Hui, Jerome H. L.

2014-01-01

The speciose Crustacea is the largest subphylum of arthropods on the planet after the Insecta. To date, however, the only publicly available sequenced crustacean genome is that of the water flea, Daphnia pulex, a member of the Branchiopoda. While Daphnia is a well-established ecotoxicological model, a previous study showed that one-third of the genes contained in its genome are lineage-specific and could not be identified in any other metazoan genome. To better understand the genomic evolution of crustaceans and arthropods, we have sequenced the genome of a novel shrimp model, Neocaridina denticulata, and tested its experimental malleability. A library of 170-bp nominal fragment size was constructed from DNA of a starved single adult and sequenced using the Illumina HiSeq2000 platform. Core eukaryotic genes, the mitochondrial genome, developmental patterning genes (such as Hox) and microRNA processing pathway genes are all present in this animal, suggesting it has not undergone massive genomic loss. Comparison with the published genome of Daphnia pulex has allowed us to reveal 3750 genes that are indeed specific to the lineage containing malacostracans and branchiopods, rather than Daphnia-specific (E-value: 10⁻⁶). We also show the experimental tractability of N. denticulata, which, together with the genomic resources presented here, make it an ideal model for a wide range of further aquacultural, developmental, ecotoxicological, food safety, genetic, hormonal, physiological and reproductive research, allowing better understanding of the evolution of crustaceans and other arthropods. PMID:24619275
Analyses of Methylomes Derived from Meso-American Common Bean (Phaseolus vulgaris L.) Using MeDIP-Seq and Whole Genome Sodium Bisulfite-Sequencing

PubMed Central

Crampton, Mollee; Sripathi, Venkateswara R.; Hossain, Khwaja; Kalavacharla, Venu

2016-01-01

Common bean (Phaseolus vulgaris L.) is economically important for its high protein, fiber, and micronutrient contents, with a relatively small genome size of ∼587 Mb. Common bean is genetically diverse with two major gene pools, Meso-American and Andean. The phenotypic variability within common bean is partly attributed to the genetic diversity and epigenetic changes that are largely influenced by environmental factors. It is well established that an important epigenetic regulator of gene expression is DNA methylation. Here, we present results generated from two high-throughput sequencing technologies, methylated DNA immunoprecipitation-sequencing (MeDIP-seq) and whole genome bisulfite-sequencing (BS-Seq). Our analyses revealed that this Meso-American common bean displays methylation patterns similar to those of other previously published plant methylomes, with CG ∼50%, CHG ∼30%, and CHH ∼2.7% methylation; however, these differ from the common bean reference methylome of Andean origin. We identified higher CG methylation levels in both promoter and genic regions than in CHG and CHH contexts. Moreover, we found relatively higher CG methylation levels in genes than in promoters. Conversely, the CHG and CHH methylation levels were higher in promoters than in genes. This is the first genome-wide DNA methylation profiling study in a Meso-American common bean cultivar (“Sierra”) using NGS approaches. Our long-term goal is to generate genome-wide epigenomic maps in common bean focusing on chromatin accessibility, histone modifications, and DNA methylation. PMID:27199997
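The CG, CHG and CHH categories used in the abstract above are sequence contexts of a cytosine, where H stands for A, C or T. The following is a minimal sketch of how forward-strand cytosines could be sorted into those contexts; the function name `cytosine_contexts` and the toy sequence are hypothetical and are not taken from the study, which does not describe its pipeline at this level of detail.

```python
# Hypothetical helper (not from the study): count forward-strand cytosines
# by sequence context. H means any base other than G (A, C or T).
def cytosine_contexts(seq):
    seq = seq.upper()
    counts = {"CG": 0, "CHG": 0, "CHH": 0}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        nxt = seq[i + 1 : i + 3]  # up to two bases downstream of the cytosine
        if len(nxt) >= 1 and nxt[0] == "G":
            counts["CG"] += 1
        elif len(nxt) == 2 and nxt[0] in "ACT":
            if nxt[1] == "G":
                counts["CHG"] += 1
            elif nxt[1] in "ACT":
                counts["CHH"] += 1
        # Cytosines too close to the end, or followed by ambiguous bases
        # such as N, are left unclassified.
    return counts


if __name__ == "__main__":
    print(cytosine_contexts("CGCAGCTTC"))
```

A full analysis would also scan the reverse strand and combine these contexts with per-site read counts from bisulfite data to obtain methylation levels; this sketch covers only the context classification.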
Potential role of DNA methylation as a facilitator of target search processes for transcription factors through interplay with methyl-CpG-binding proteins.

PubMed

Kemme, Catherine A; Marquez, Rolando; Luu, Ross H; Iwahara, Junji

2017-07-27

Eukaryotic genomes contain numerous non-functional high-affinity sequences for transcription factors. These sequences potentially serve as natural decoys that sequester transcription factors. We have previously shown that the presence of sequences similar to the target sequence could substantially impede association of the transcription factor Egr-1 with its targets. In this study, using a stopped-flow fluorescence method, we examined the kinetic impact of DNA methylation of decoys on the search process of the Egr-1 zinc-finger protein. We analyzed its association with an unmethylated target site on fluorescence-labeled DNA in the presence of competitor DNA duplexes, including Egr-1 decoys. DNA methylation of decoys alone did not affect target search kinetics. In the presence of the MeCP2 methyl-CpG-binding domain (MBD), however, DNA methylation of decoys substantially (∼10-30-fold) accelerated the target search process of the Egr-1 zinc-finger protein. This acceleration did not occur when the target was also methylated. These results suggest that when decoys are methylated, MBD proteins can block them and thereby allow Egr-1 to avoid sequestration in non-functional locations. This effect may occur in vivo for DNA methylation outside CpG islands (CGIs) and could facilitate localization of some transcription factors within regulatory CGIs, where DNA methylation is rare. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Phylogenetic stratigraphy in the Guerrero Negro hypersaline microbial mat.

PubMed

Harris, J Kirk; Caporaso, J Gregory; Walker, Jeffrey J; Spear, John R; Gold, Nicholas J; Robertson, Charles E; Hugenholtz, Philip; Goodrich, Julia; McDonald, Daniel; Knights, Dan; Marshall, Paul; Tufo, Henry; Knight, Rob; Pace, Norman R

2013-01-01

The microbial mats of Guerrero Negro (GN), Baja California Sur, Mexico historically were considered a simple environment, dominated by cyanobacteria and sulfate-reducing bacteria. Culture-independent rRNA community profiling instead revealed these microbial mats as among the most phylogenetically diverse environments known. A preliminary molecular survey of the GN mat based on only ∼1500 small subunit rRNA gene sequences discovered several new phylum-level groups in the bacterial phylogenetic domain and many previously undetected lower-level taxa. We determined an additional ∼119,000 nearly full-length sequences and 28,000 >200 nucleotide 454 reads from a 10-layer depth profile of the GN mat. With this unprecedented coverage of long sequences from one environment, we confirm the mat is phylogenetically stratified, presumably corresponding to light and geochemical gradients throughout the depth of the mat. Previous shotgun metagenomic data from the same depth profile show the same stratified pattern and suggest that metagenome properties may be predictable from rRNA gene sequences. We verify previously identified novel lineages and identify new phylogenetic diversity at lower taxonomic levels, for example, thousands of operational taxonomic units at the family-genus levels differ considerably from known sequences. The new sequences populate parts of the bacterial phylogenetic tree that previously were poorly described, but indicate that any comprehensive survey of GN diversity has only begun. Finally, we show that taxonomic conclusions are generally congruent between Sanger and 454 sequencing technologies, with the taxonomic resolution achieved dependent on the abundance of reference sequences in the relevant region of the rRNA tree of life.
Rare and Coding Region Genetic Variants Associated With Risk of Ischemic Stroke: The NHLBI Exome Sequence Project.

PubMed

Auer, Paul L; Nalls, Mike; Meschia, James F; Worrall, Bradford B; Longstreth, W T; Seshadri, Sudha; Kooperberg, Charles; Burger, Kathleen M; Carlson, Christopher S; Carty, Cara L; Chen, Wei-Min; Cupples, L Adrienne; DeStefano, Anita L; Fornage, Myriam; Hardy, John; Hsu, Li; Jackson, Rebecca D; Jarvik, Gail P; Kim, Daniel S; Lakshminarayan, Kamakshi; Lange, Leslie A; Manichaikul, Ani; Quinlan, Aaron R; Singleton, Andrew B; Thornton, Timothy A; Nickerson, Deborah A; Peters, Ulrike; Rich, Stephen S

2015-07-01

Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk. The objective was to investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome. The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013. The main outcomes were discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis).
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10⁻⁸) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10⁻⁷) with a fatty acid metabolism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke). Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
Phylogeny and strain typing of Escherichia coli, inferred from variation at mononucleotide repeat loci.

PubMed

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M; Kashi, Yechezkel

2004-04-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria.
Phylogeny and Strain Typing of Escherichia coli, Inferred from Variation at Mononucleotide Repeat Loci

PubMed Central

Diamant, Eran; Palti, Yniv; Gur-Arie, Riva; Cohen, Helit; Hallerman, Eric M.; Kashi, Yechezkel

2004-01-01

Multilocus sequencing of housekeeping genes has been used previously for bacterial strain typing and for inferring evolutionary relationships among strains of Escherichia coli. In this study, we used shorter intergenic sequences that contained simple sequence repeats (SSRs) of repeating mononucleotide motifs (mononucleotide repeats [MNRs]) to infer the phylogeny of pathogenic and commensal E. coli strains. Seven noncoding loci (four MNRs and three non-SSRs) were sequenced in 27 strains, including enterohemorrhagic (six isolates of O157:H7), enteropathogenic, enterotoxigenic, B, and K-12 strains. The four MNRs were also sequenced in 20 representative strains of the E. coli reference (ECOR) collection. Sequence polymorphism was significantly higher at the MNR loci, including the flanking sequences, indicating a higher mutation rate in the sequences flanking the MNR tracts. The four MNR loci were amplifiable by PCR in the standard ECOR A, B1, and D groups, but only one (yaiN) in the B2 group was amplified, which is consistent with previous studies that suggested that B2 is the most ancient group. High sequence compatibility was found between the four MNR loci, indicating that they are in the same clonal frame. The phylogenetic trees that were constructed from the sequence data were in good agreement with those of previous studies that used multilocus enzyme electrophoresis. The results demonstrate that MNR loci are useful for inferring phylogenetic relationships and provide much higher sequence variation than housekeeping genes. Therefore, the use of MNR loci for multilocus sequence typing should prove efficient for clinical diagnostics, epidemiology, and evolutionary study of bacteria. PMID:15066845
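A mononucleotide repeat (MNR), as described in the abstract above, is a run of a single repeated base. The sketch below shows one simple way such tracts could be located in a sequence with a regular expression. The function name `find_mnr_tracts` and the default minimum length of 6 are assumptions made for illustration; the abstract does not state what threshold the authors used.

```python
import re


# Hypothetical helper (not from the study): report runs of one repeated base.
# min_len is an assumed threshold; the abstract does not specify one.
def find_mnr_tracts(seq, min_len=6):
    # ([ACGT]) captures one base; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))  # (0-based start, base, length)
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    example = "ACGT" + "A" * 7 + "GC" + "T" * 6 + "C"
    for start, base, length in find_mnr_tracts(example):
        print(start, base, length)
```

Comparing tract lengths and flanking sequences at the same locus across strains would then give the kind of polymorphism data the abstract refers to.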
The global catalogue of microorganisms 10K type strain sequencing project: closing the genomic gaps for the validly published prokaryotic and fungi species.

PubMed

Wu, Linhuan; McCluskey, Kevin; Desmeth, Philippe; Liu, Shuangjiang; Hideaki, Sugawara; Yin, Ye; Moriya, Ohkuma; Itoh, Takashi; Kim, Cha Young; Lee, Jung-Sook; Zhou, Yuguang; Kawasaki, Hiroko; Hazbón, Manzour Hernando; Robert, Vincent; Boekhout, Teun; Lima, Nelson; Evtushenko, Lyudmila; Boundy-Mills, Kyria; Bunk, Boyke; Moore, Edward R B; Eurwilaichitr, Lily; Ingsriswang, Supawadee; Shah, Heena; Yao, Su; Jin, Tao; Huang, Jinqun; Shi, Wenyu; Sun, Qinglan; Fan, Guomei; Li, Wei; Li, Xian; Kurtböke, Ipek; Ma, Juncai

2018-05-01

Genomic information is essential for taxonomic, phylogenetic, and functional studies to comprehensively decipher the characteristics of microorganisms, to explore microbiomes through metagenomics, and to answer fundamental questions of nature and human life. However, large gaps remain in the available genomic sequencing information published for bacterial and archaeal species, and the gaps are even larger for fungal type strains. The Global Catalogue of Microorganisms (GCM) leads an internationally coordinated effort to sequence type strains and close gaps in the genomic maps of microorganisms. Hence, the GCM aims to promote research by deep-mining genomic data.
HOCOMOCO: towards a complete collection of transcription factor binding models for human and mouse via large-scale ChIP-Seq analysis.

PubMed

Kulakovskiy, Ivan V; Vorontsov, Ilya E; Yevshin, Ivan S; Sharipov, Ruslan N; Fedorova, Alla D; Rumynskiy, Eugene I; Medvedeva, Yulia A; Magana-Mora, Arturo; Bajic, Vladimir B; Papatsenko, Dmitry A; Kolpakov, Fedor A; Makeev, Vsevolod J

2018-01-04

We present a major update of the HOCOMOCO collection that consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database. HOCOMOCO v11 contains binding models for 453 mouse and 680 human transcription factors and includes 1302 mononucleotide and 576 dinucleotide position weight matrices, which describe primary binding preferences of each transcription factor and reliable alternative binding specificities. An interactive interface and bulk downloads are available on the web: http://hocomoco.autosome.ru and http://www.cbrc.kaust.edu.sa/hocomoco11. In this release, we complement HOCOMOCO by MoLoTool (Motif Location Toolbox, http://molotool.autosome.ru) that applies HOCOMOCO models for visualization of binding sites in short DNA sequences. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
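The mononucleotide position weight matrices mentioned above assign each base a score at each motif position, so a short DNA sequence can be scanned for its best-scoring window. The following is a generic sketch of that idea under stated assumptions: the count matrix is a toy example, the pseudocount and uniform background of 0.25 are illustrative choices, and the function names `pwm_from_counts` and `best_hit` are hypothetical. It does not reproduce HOCOMOCO's file formats, thresholds or the MoLoTool implementation.

```python
import math


# Hypothetical helpers (not HOCOMOCO code): build a log-odds PWM from
# per-position base counts, then find the best-scoring window in a sequence.
def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    pwm = []
    for col in counts:
        total = sum(col.get(b, 0) for b in "ACGT") + 4 * pseudocount
        pwm.append(
            {
                b: math.log2(((col.get(b, 0) + pseudocount) / total) / background)
                for b in "ACGT"
            }
        )
    return pwm


def best_hit(seq, pwm):
    seq = seq.upper()
    width = len(pwm)
    best = None
    for i in range(len(seq) - width + 1):
        window = seq[i : i + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing ambiguous bases
        score = sum(pwm[j][b] for j, b in enumerate(window))
        if best is None or score > best[1]:
            best = (i, score, window)
    return best  # (0-based start, log-odds score, matched window) or None


if __name__ == "__main__":
    toy_counts = [{"T": 10}, {"G": 10}, {"A": 10}]  # toy motif favouring TGA
    print(best_hit("CCTGACC", pwm_from_counts(toy_counts)))
```

This scans only the forward strand; a practical scanner would also score the reverse complement and apply a score threshold rather than reporting a single best window.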
