Sample records for highly variable sequence

  1. A multiple-alignment based primer design algorithm for genetically highly variable DNA targets

    PubMed Central

    2013-01-01

    Background Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design. Results Design constraints include degenerate sites to maximize population coverage, matching of melting temperatures, optimizing de novo sequence length, finding optimal bio-barcodes to allow efficient downstream analyses, and minimizing risk of dimerization. To facilitate primer design addressing these and other constraints, we created a novel computer program (PrimerDesign) that automates this complex procedure. We show its powers and limitations and give examples of successful designs for the analysis of HIV-1 populations. Conclusions PrimerDesign is useful for researchers who want to design DNA primers and probes for analyzing highly variable DNA populations. It can be used to design primers for PCR, RT-PCR, Sanger sequencing, next-generation sequencing, and other experimental protocols targeting highly variable DNA samples. PMID:23965160
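
    One of the constraints named above, degenerate sites that maximize population coverage, can be illustrated with a minimal sketch. This is not the PrimerDesign algorithm; the function names and the `min_freq` threshold are hypothetical, and the input is assumed to be a gap-free alignment of equal-length sequences over A, C, G and T.

```python
from collections import Counter

# IUPAC nucleotide codes keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}
BASES_FOR_CODE = {code: bases for bases, code in IUPAC.items()}


def degenerate_consensus(aligned, min_freq=0.1):
    """Build an IUPAC consensus; keep a base if its column frequency >= min_freq.

    min_freq is a hypothetical threshold chosen for illustration only.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        counts = Counter(column)
        kept = {b for b, c in counts.items() if c / len(column) >= min_freq}
        if not kept:  # threshold too strict: fall back to the most common base
            kept = {counts.most_common(1)[0][0]}
        consensus.append(IUPAC[frozenset(kept)])
    return "".join(consensus)


def degeneracy(primer):
    """Number of distinct plain sequences a degenerate primer represents."""
    total = 1
    for code in primer:
        total *= len(BASES_FOR_CODE[code])
    return total


if __name__ == "__main__":
    alignment = ["ACGT", "ACAT", "ATGT"]  # toy data, not from the paper
    primer = degenerate_consensus(alignment)
    print(primer, degeneracy(primer))
```

    Raising `min_freq` trades population coverage for lower degeneracy, which is the tension the abstract describes.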

  2. Diversity of the P2 protein among nontypeable Haemophilus influenzae isolates.

    PubMed Central

    Bell, J; Grass, S; Jeanteur, D; Munson, R S

    1994-01-01

    The genes for outer membrane protein P2 of four nontypeable Haemophilus influenzae strains were cloned and sequenced. The derived amino acid sequences were compared with the outer membrane protein P2 sequence from H. influenzae type b MinnA and the sequences of P2 from three additional nontypeable H. influenzae strains. The sequences were 76 to 94% identical. The sequences had regions with considerable variability separated by regions which were highly conserved. The variable regions mapped to putative surface-exposed loops of the protein. PMID:8188390

  3. Length and sequence variability in mitochondrial control region of the milkfish, Chanos chanos.

    PubMed

    Ravago, Rachel G; Monje, Virginia D; Juinio-Meñez, Marie Antonette

    2002-01-01

    Extensive length variability was observed in the mitochondrial control region of the milkfish, Chanos chanos. The nucleotide sequence of the control region and flanking regions was determined. Length variability and heteroplasmy was due to the presence of varying numbers of a 41-bp tandemly repeated sequence and a 48-bp insertion/deletion (indel). The structure and organization of the milkfish control region is similar to that of other teleost fish and vertebrates. However, extensive variation in the copy number of tandem repeats (4-20 copies) and the presence of a relatively large (48-bp) indel, are apparently uncommon in teleost fish control region sequences reported to date. High sequence variability of control region peripheral domains indicates the potential utility of selected regions as markers for population-level studies.

  4. Improved imaging of cochlear nerve hypoplasia using a 3-Tesla variable flip-angle turbo spin-echo sequence and a 7-cm surface coil.

    PubMed

    Giesemann, Anja M; Raab, Peter; Lyutenski, Stefan; Dettmer, Sabine; Bültmann, Eva; Frömke, Cornelia; Lenarz, Thomas; Lanfermann, Heinrich; Goetz, Friedrich

    2014-03-01

    Magnetic resonance imaging of the temporal bone has an important role in decision making with regard to cochlea implantation, especially in children with cochlear nerve deficiency. The purpose of this study was to evaluate the usefulness of the combination of an advanced high-resolution T2-weighted sequence with a surface coil in a 3-Tesla magnetic resonance imaging scanner in cases of suspected cochlear nerve aplasia. Prospective study. Seven patients with cochlear nerve hypoplasia or aplasia were prospectively examined using a high-resolution three-dimensional variable flip-angle turbo spin-echo sequence using a surface coil, and the images were compared with the same sequence in standard resolution using a standard head coil. Three neuroradiologists evaluated the magnetic resonance images independently, rating the visibility of the nerves in diagnosing hypoplasia or aplasia. Eight ears in seven patients with hypoplasia or aplasia of the cochlear nerve were examined. The average age was 2.7 years (range, 9 months-5 years). Seven ears had accompanying malformations. The inter-rater reliability in diagnosing hypoplasia or aplasia was greater using the high-resolution three-dimensional variable flip-angle turbo spin-echo sequence (fixed-marginal kappa: 0.64) than with the same sequence in lower resolution (fixed-marginal kappa: 0.06). Examining cases of suspected cochlear nerve aplasia using the high-resolution three-dimensional variable flip-angle turbo spin-echo sequence in combination with a surface coil shows significant improvement over standard methods. © 2013 The American Laryngological, Rhinological and Otological Society, Inc.
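
    The fixed-marginal kappa values quoted above are multirater agreement statistics. A minimal sketch of Fleiss' fixed-marginal kappa follows; the rating tables are invented for illustration and are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' fixed-marginal multirater kappa.

    ratings[i][j] is the number of raters who put subject i in category j.
    Every subject must be rated by the same number (>= 2) of raters.
    Undefined (division by zero) when all ratings fall in a single category.
    """
    n_subjects = len(ratings)
    n_raters = sum(ratings[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in ratings):
        raise ValueError("each subject needs the same number (>= 2) of ratings")
    n_categories = len(ratings[0])
    total = n_subjects * n_raters

    # Overall proportion of ratings in each category (the fixed marginals).
    p_cat = [sum(row[j] for row in ratings) / total for j in range(n_categories)]
    # Observed agreement per subject, averaged over subjects.
    p_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_subject) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical: 3 raters, 4 ears, categories (hypoplasia, aplasia).
    table = [[3, 0], [0, 3], [2, 1], [3, 0]]
    print(round(fleiss_kappa(table), 3))
```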

  5. International interlaboratory study comparing single organism 16S rRNA gene sequencing data: Beyond consensus sequence comparisons

    PubMed Central

    Olson, Nathan D.; Lund, Steven P.; Zook, Justin M.; Rojas-Cornejo, Fabiola; Beck, Brian; Foy, Carole; Huggett, Jim; Whale, Alexandra S.; Sui, Zhiwei; Baoutina, Anna; Dobeson, Michael; Partis, Lina; Morrow, Jayne B.

    2015-01-01

    This study presents the results from an interlaboratory sequencing study for which we developed a novel high-resolution method for comparing data from different sequencing platforms for a multi-copy, paralogous gene. The combination of PCR amplification and 16S ribosomal RNA gene (16S rRNA) sequencing has revolutionized bacteriology by enabling rapid identification, frequently without the need for culture. To assess variability between laboratories in sequencing 16S rRNA, six laboratories sequenced the gene encoding the 16S rRNA from Escherichia coli O157:H7 strain EDL933 and Listeria monocytogenes serovar 4b strain NCTC11994. Participants performed sequencing methods and protocols available in their laboratories: Sanger sequencing, Roche 454 pyrosequencing®, or Ion Torrent PGM®. The sequencing data were evaluated on three levels: (1) identity of biologically conserved position, (2) ratio of 16S rRNA gene copies featuring identified variants, and (3) the collection of variant combinations in a set of 16S rRNA gene copies. The same set of biologically conserved positions was identified for each sequencing method. Analytical methods using Bayesian and maximum likelihood statistics were developed to estimate variant copy ratios, which describe the ratio of nucleotides at each identified biologically variable position, as well as the likely set of variant combinations present in 16S rRNA gene copies. Our results indicate that estimated variant copy ratios at biologically variable positions were only reproducible for high throughput sequencing methods. Furthermore, the likely variant combination set was only reproducible with increased sequencing depth and longer read lengths. We also demonstrate novel methods for evaluating variable positions when comparing multi-copy gene sequence data from multiple laboratories generated using multiple sequencing technologies. PMID:27077030
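
    To make the idea of a variant copy ratio concrete, here is a minimal maximum-likelihood sketch. It is not the authors' method: it assumes every gene copy contributes reads equally, a symmetric per-base error rate (`error`, a hypothetical value), and a known copy number (`n_copies`, set to 7 in the usage line purely for illustration).

```python
from math import log


def ml_variant_copies(variant_reads, total_reads, n_copies, error=0.01):
    """Most likely number of gene copies (0..n_copies) carrying a variant base.

    Binomial log-likelihood; the binomial coefficient is constant across
    candidates and is therefore omitted. Requires 0 < error < 1.
    """
    best_count, best_ll = 0, float("-inf")
    for count in range(n_copies + 1):
        frac = count / n_copies
        # Expected fraction of reads showing the variant, allowing for errors.
        p = frac * (1 - error) + (1 - frac) * error
        ll = variant_reads * log(p) + (total_reads - variant_reads) * log(1 - p)
        if ll > best_ll:
            best_count, best_ll = count, ll
    return best_count


if __name__ == "__main__":
    # Hypothetical position: 140 of 1000 reads show the variant base.
    copies = ml_variant_copies(140, 1000, n_copies=7)
    print(f"estimated ratio {copies}:{7 - copies}")
```

    With few reads the likelihoods of neighbouring copy counts are close, which is consistent with the abstract's finding that ratios were reproducible only for high-throughput methods.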

  6. Analysis of Variability in HIV-1 Subtype A Strains in Russia Suggests a Combination of Deep Sequencing and Multitarget RNA Interference for Silencing of the Virus.

    PubMed

    Kretova, Olga V; Chechetkin, Vladimir R; Fedoseeva, Daria M; Kravatsky, Yuri V; Sosin, Dmitri V; Alembekov, Ildar R; Gorbacheva, Maria A; Gashnikova, Natalya M; Tchurikov, Nickolai A

    2017-02-01

    Any method for silencing the activity of the HIV-1 retrovirus should tackle the extremely high variability of HIV-1 sequences and mutational escape. We studied sequence variability in the vicinity of selected RNA interference (RNAi) targets from isolates of HIV-1 subtype A in Russia, and we propose that using artificial RNAi is a potential alternative to traditional antiretroviral therapy. We prove that using multiple RNAi targets overcomes the variability in HIV-1 isolates. The optimal number of targets critically depends on the conservation of the target sequences. The total number of targets that are conserved with a probability of 0.7-0.8 should exceed at least 2. Combining deep sequencing and multitarget RNAi may provide an efficient approach to cure HIV/AIDS.
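
    The dependence of the required number of targets on target conservation can be explored with a simple binomial sketch. This is an illustrative assumption, not the authors' model: each target is treated as conserved independently with the same probability.

```python
from math import comb


def prob_at_least(n_targets, p_conserved, k=1):
    """Probability that at least k of n independent targets are conserved."""
    return sum(
        comb(n_targets, i) * p_conserved**i * (1 - p_conserved) ** (n_targets - i)
        for i in range(k, n_targets + 1)
    )


if __name__ == "__main__":
    for p in (0.7, 0.8):
        for n in (1, 2, 3, 4):
            print(f"p={p} n={n} P(at least one conserved)={prob_at_least(n, p):.3f}")
```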

  7. DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability

    PubMed Central

    Little, Damon P.

    2011-01-01

    For DNA barcoding to succeed as a scientific endeavor an accurate and expeditious query sequence identification method is needed. Although a global multiple–sequence alignment can be generated for some barcoding markers (e.g. COI, rbcL), not all barcoding markers are as structurally conserved (e.g. matK). Thus, algorithms that depend on global multiple–sequence alignments are not universally applicable. Some sequence identification methods that use local pairwise alignments (e.g. BLAST) are unable to accurately differentiate between highly similar sequences and are not designed to cope with hierarchic phylogenetic relationships or within taxon variability. Here, I present a novel alignment–free sequence identification algorithm–BRONX–that accounts for observed within taxon variability and hierarchic relationships among taxa. BRONX identifies short variable segments and corresponding invariant flanking regions in reference sequences. These flanking regions are used to score variable regions in the query sequence without the production of a global multiple–sequence alignment. By incorporating observed within taxon variability into the scoring procedure, misidentifications arising from shared alleles/haplotypes are minimized. An explicit treatment of more inclusive terminals allows for separate identifications to be made for each taxonomic level and/or for user–defined terminals. BRONX performs better than all other methods when there is imperfect overlap between query and reference sequences (e.g. mini–barcode queries against a full–length barcode database). BRONX consistently produced better identifications at the genus–level for all query types. PMID:21857897

  8. First Results on the Variability of Mid- and High-Latitude Ionospheric Electric Fields at 1-Second Time Scales

    NASA Astrophysics Data System (ADS)

    Ruohoniemi, J. M.; Greenwald, R. A.; Oksavik, K.; Baker, J. B.

    2007-12-01

    The electric fields at high latitudes are often modeled as a static pattern in the absence of variation in solar wind parameters or geomagnetic disturbance. However, temporal variability in the local electric fields on time scales of minutes for stable conditions has been reported and characterized statistically as an intrinsic property amounting to turbulence. We describe the results of applying a new technique to SuperDARN HF radar observations of ionospheric plasma convection at middle and high latitudes that gives views of the variability of the electric fields at sub-second time scales. We address the question of whether there is a limit to the temporal scale of the electric field variability and consider whether the turbulence on minute time scales is due to organized but unresolved behavior. The basis of the measurements is the ability to record raw samples from the individual multipulse sequences that are transmitted during the standard 3 or 6-second SuperDARN integration period; a backscattering volume is then effectively sampled at a cadence of 200 ms. The returns from the individual sequences are often sufficiently well-ordered to permit a sequence-by-sequence characterization of the electric field and backscattered power. We attempt a statistical characterization of the variability at these heretofore inaccessible time scales and consider how variability is influenced by solar wind and magnetospheric factors.

  9. The rDNA ITS region in the lessepsian marine angiosperm Halophila stipulacea (Forssk.) Aschers. (Hydrocharitaceae): intragenomic variability and putative pseudogenic sequences.

    PubMed

    Ruggiero, Maria Valeria; Procaccini, Gabriele

    2004-01-01

    Halophila stipulacea is a dioecious marine angiosperm, widely distributed along the western coasts of the Indian Ocean and the Red Sea. This species is thought to be a Lessepsian immigrant that entered the Mediterranean Sea from the Red Sea after the opening of the Suez Canal (1869). Previous studies have revealed both high phenotypic and genetic variability in Halophila stipulacea populations from the western Mediterranean basin. In order to test the hypothesis of a Lessepsian introduction, we compare genetic polymorphism between putative native (Red Sea) and introduced (Mediterranean) populations through rDNA ITS region (ITS1-5.8S-ITS2) sequence analysis. A high degree of intraindividual variability of ITS sequences was found. Most of the intragenomic polymorphism was due to pseudogenic sequences, present in almost all individuals. Features of ITS functional sequences and pseudogenes are described. Possible causes for the lack of homogenization of ITS paralogues within individuals are discussed.

  10. Variable Behavior and Repeated Learning in Two Mouse Strains: Developmental and Genetic Contributions.

    PubMed

    Arnold, Megan A; Newland, M Christopher

    2018-06-16

    Behavioral inflexibility is often assessed using reversal learning tasks, which require a relatively low degree of response variability. No studies have assessed sensitivity to reinforcement contingencies that specifically select highly variable response patterns in mice, let alone in models of neurodevelopmental disorders involving limited response variation. Operant variability and incremental repeated acquisition (IRA) were used to assess unique aspects of behavioral variability of two mouse strains: BALB/c, a model of some deficits in ASD, and C57Bl/6. On the operant variability task, BALB/c mice responded more repetitively during adolescence than C57Bl/6 mice when reinforcement did not require variability but responded more variably when reinforcement required variability. During IRA testing in adulthood, both strains acquired an unchanging, performance sequence equally well. Strain differences emerged, however, after novel learning sequences began alternating with the performance sequence: BALB/c mice substantially outperformed C57Bl/6 mice. Using litter-mate controls, it was found that adolescent experience with variability did not affect either learning or performance on the IRA task in adulthood. These findings constrain the use of BALB/c mice as a model of ASD, but once again reveal this strain is highly sensitive to reinforcement contingencies and they are fast and robust learners. Copyright © 2018. Published by Elsevier B.V.

  11. Intergenic Sequence Ribotyping using a region neighboring dkgB links genovar to Kauffman-White serotype of Salmonella enterica

    USDA-ARS's Scientific Manuscript database

    Previous research identified that the 5S ribosomal (rrn) gene and associated flanking sequences that are closely linked to the dkgB gene of Salmonella enterica were highly variable between serotypes, but not between subpopulations within the same serotype (PMID: 17005008). The degree of variability ...

  12. BASiCS: Bayesian Analysis of Single-Cell Sequencing Data

    PubMed Central

    Vallejos, Catalina A.; Marioni, John C.; Richardson, Sylvia

    2015-01-01

    Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell’s lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944

  13. BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.

    PubMed

    Vallejos, Catalina A; Marioni, John C; Richardson, Sylvia

    2015-06-01

    Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell's lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach.

  14. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
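
    The sliding-window analysis and the site counts it relies on can be sketched as follows. The window step is not stated in the abstract, so `step` is a hypothetical parameter, and "informative" is taken here to mean parsimony-informative (at least two states, each present in at least two sequences), which is an assumption.

```python
from collections import Counter


def sliding_windows(length, size, step):
    """Half-open (start, end) windows of `size` that fit within `length`."""
    return [(s, s + size) for s in range(0, length - size + 1, step)]


def variable_sites(aligned, start, end):
    """Count alignment columns in [start, end) with more than one state."""
    columns = zip(*(seq[start:end] for seq in aligned))
    return sum(1 for col in columns if len(set(col)) > 1)


def informative_sites(aligned, start, end):
    """Count columns where at least two states each occur in >= 2 sequences."""
    columns = zip(*(seq[start:end] for seq in aligned))
    return sum(
        1 for col in columns
        if sum(1 for n in Counter(col).values() if n >= 2) >= 2
    )


if __name__ == "__main__":
    toy = ["ACGTAC", "ACGAAC", "ATGAAT", "ATGTAT"]  # toy alignment
    for start, end in sliding_windows(len(toy[0]), size=4, step=2):
        print(start, end,
              variable_sites(toy, start, end),
              informative_sites(toy, start, end))
```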

  15. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2015-01-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745

  16. Towards the Rational Design of a Candidate Vaccine against Pregnancy Associated Malaria: Conserved Sequences of the DBL6ε Domain of VAR2CSA

    PubMed Central

    Badaut, Cyril; Bertin, Gwladys; Rustico, Tatiana; Fievet, Nadine; Massougbodji, Achille; Gaye, Alioune; Deloron, Philippe

    2010-01-01

    Background Placental malaria is a disease linked to the sequestration of Plasmodium falciparum infected red blood cells (IRBC) in the placenta, leading to reduced materno-fetal exchanges and to local inflammation. One of the virulence factors of P. falciparum involved in cytoadherence to chondroitin sulfate A, its placental receptor, is the adhesive protein VAR2CSA. Its localisation on the surface of IRBC makes it accessible to the immune system. VAR2CSA contains six DBL domains. The DBL6ε domain is the most variable. High variability constitutes a means for the parasite to evade the host immune response. The DBL6ε domain could constitute a very attractive basis for a vaccine candidate but its reported variability necessitates, for antigenic characterisations, identifying and classifying commonalities across isolates. Methodology/Principal Findings Local alignment analysis of the DBL6ε domain had revealed that it is not as variable as previously described. Variability is concentrated in seven regions present on the surface of the DBL6ε domain. The main goal of our work is to classify and group variable sequences that will simplify further research to determine dominant epitopes. Firstly, variable sequences were grouped following their average percent pairwise identity (APPI). Groups comprising many variable sequences sharing low variability were found. Secondly, ELISA experiments following the IgG recognition of a recombinant DBL6ε domain, and of peptides mimicking its seven variable blocks, allowed to determine an APPI cut-off and to isolate groups represented by a single consensus sequence. Conclusions/Significance A new sequence approach is used to compare variable regions in sequences that have extensive segmental gene relationship. Using this approach, the VAR2CSA DBL6 domain is composed of 7 variable blocks with limited polymorphism. Each variable block is composed of a limited number of consensus types. Based on peptide based ELISA, variable blocks with 85% or greater sequence identity are expected to be recognized equally well by antibody and can be considered the same consensus type. Therefore, the analysis of the antibody response against the classified small number of sequences should be helpful to determine epitopes. PMID:20585655
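
    The average percent pairwise identity (APPI) used for grouping can be sketched as below. The abstract does not say how gaps or unequal lengths were handled, so this sketch assumes equal-length, pre-aligned blocks; the peptide strings are invented, and only the 85% cut-off comes from the abstract.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percentage of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def appi(sequences):
    """Mean percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    block = ["KDLTNGAC", "KDLTNGAC", "KDLSNGAC"]  # hypothetical variable block
    score = appi(block)
    print(f"APPI = {score:.1f}%, same consensus type: {score >= 85.0}")
```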

  17. Alignment-free Transcriptomic and Metatranscriptomic Comparison Using Sequencing Signatures with Variable Length Markov Chains.

    PubMed

    Liao, Weinan; Ren, Jie; Wang, Kun; Wang, Shun; Zeng, Feng; Wang, Ying; Sun, Fengzhu

    2016-11-23

    The comparison between microbial sequencing data is critical to understand the dynamics of microbial communities. The alignment-based tools analyzing metagenomic datasets require reference sequences and read alignments. The available alignment-free dissimilarity approaches model the background sequences with Fixed Order Markov Chain (FOMC) yielding promising results for the comparison of microbial communities. However, in FOMC, the number of parameters grows exponentially with the increase of the order of Markov Chain (MC). Under a fixed high order of MC, the parameters might not be accurately estimated owing to the limitation of sequencing depth. In our study, we investigate an alternative to FOMC to model background sequences with the data-driven Variable Length Markov Chain (VLMC) in metatranscriptomic data. The VLMC originally designed for long sequences was extended to apply to high-throughput sequencing reads and the strategies to estimate the corresponding parameters were developed. The flexible number of parameters in VLMC avoids estimating the vast number of parameters of high-order MC under limited sequencing depth. Different from the manual selection in FOMC, VLMC determines the MC order adaptively. Several beta diversity measures based on VLMC were applied to compare the bacterial RNA-Seq and metatranscriptomic datasets. Experiments show that VLMC outperforms FOMC to model the background sequences in transcriptomic and metatranscriptomic samples. A software pipeline is available at https://d2vlmc.codeplex.com.
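
    The exponential growth in parameters that motivates the move away from fixed-order chains is easy to quantify. A minimal sketch, assuming a four-letter nucleotide alphabet; the function names are hypothetical and this is not part of the authors' pipeline.

```python
def fomc_free_parameters(order, alphabet_size=4):
    """Free transition probabilities in a fixed-order Markov chain.

    Each of the alphabet_size**order contexts has alphabet_size - 1 free
    probabilities, because each context's probabilities sum to one.
    """
    return alphabet_size**order * (alphabet_size - 1)


def observed_contexts(reads, order):
    """Number of distinct length-`order` contexts followed by a base in reads."""
    seen = set()
    for read in reads:
        for i in range(len(read) - order):
            seen.add(read[i:i + order])
    return len(seen)


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, fomc_free_parameters(k))
    # A single short read covers only a tiny share of possible contexts.
    print(observed_contexts(["ACGTAC"], 2), "of", 4**2, "contexts observed")
```

    Comparing the two numbers for a given sequencing depth shows why high-order fixed chains may be poorly estimated, which is the problem a variable-length chain is meant to avoid.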

  18. JANE: efficient mapping of prokaryotic ESTs and variable length sequence reads on related template genomes

    PubMed Central

    2009-01-01

    Background ESTs or variable sequence reads can be available in prokaryotic studies well before a complete genome is known. Use cases include (i) transcriptome studies or (ii) single cell sequencing of bacteria. Without suitable software their further analysis and mapping would have to await finalization of the corresponding genome. Results The tool JANE rapidly maps ESTs or variable sequence reads in prokaryotic sequencing and transcriptome efforts to related template genomes. It provides an easy-to-use graphics interface for information retrieval and a toolkit for EST or nucleotide sequence function prediction. Furthermore, we developed for rapid mapping an enhanced sequence alignment algorithm which reassembles and evaluates high scoring pairs provided from the BLAST algorithm. Rapid assembly on and replacement of the template genome by sequence reads or mapped ESTs is achieved. This is illustrated (i) by data from Staphylococci as well as from a Blattabacteria sequencing effort, (ii) mapping single cell sequencing reads is shown for poribacteria to sister phylum representative Rhodopirellula baltica SH1. The algorithm has been implemented in a web-server accessible at http://jane.bioapps.biozentrum.uni-wuerzburg.de. Conclusion Rapid prokaryotic EST mapping or mapping of sequence reads is achieved applying JANE even without knowing the cognate genome sequence. PMID:19943962

  19. Characterization of class II β chain major histocompatibility complex genes in a family of Hawaiian honeycreepers: 'amakihi (Hemignathus virens).

    PubMed

    Jarvi, Susan I; Bianchi, Kiara R; Farias, Margaret Em; Txakeeyang, Ann; McFarland, Thomas; Belcaid, Mahdi; Asano, Ashley

    2016-07-01

    Hawaiian honeycreepers (Drepanidinae) have evolved in the absence of mosquitoes for over five million years. Through human activity, mosquitoes were introduced to the Hawaiian archipelago less than 200 years ago. Mosquito-vectored diseases such as avian malaria caused by Plasmodium relictum and Avipoxviruses have greatly impacted these vulnerable species. Susceptibility to these diseases is variable among and within species. Due to their function in adaptive immunity, the role of major histocompatibility complex genes (Mhc) in disease susceptibility is under investigation. In this study, we evaluate gene organization and levels of diversity of Mhc class II β chain genes (exon 2) in a captive-reared family of Hawaii 'amakihi (Hemignathus virens). A total of 233 sequences (173 bp) were obtained by PCR+1 amplification and cloning, and 5720 sequences were generated by Roche 454 pyrosequencing. We report a total of 17 alleles originating from a minimum of 14 distinct loci. We detected three linkage groups that appear to represent three distinct haplotypes. Phylogenetic analysis revealed one variable cluster resembling classical Mhc sequences (DAB) and one highly conserved, low variability cluster resembling non-classical Mhc sequences (DBB). High net evolutionary divergence values between DAB and DBB resemble that seen between chicken BLB system and YLB system genes. High amino acid identity among non-classical alleles from 12 species of passerines (DBB) and four species of Galliformes (YLB) was found, suggesting that these non-classical passerine sequences may be related to the Galliforme YLB sequences.

  20. Calibration of high frequency pollen sequences and tree-ring records

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wigand, P.E.; Rose, M.R.

    This paper examines two radiocarbon dated paleoenvironmental records from southern Nevada for the possibility of correlating high-frequency pollen samples with nearby long-term tree-ring sequences. Variable deposition rates are discussed. Radiocarbon dating control of the pollen cores, as well as closer interval sampling is presented.

  1. IG and TR single chain fragment variable (scFv) sequence analysis: a new advanced functionality of IMGT/V-QUEST and IMGT/HighV-QUEST.

    PubMed

    Giudicelli, Véronique; Duroux, Patrice; Kossida, Sofia; Lefranc, Marie-Paule

    2017-06-26

    IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org), was created in 1989 in Montpellier, France (CNRS and Montpellier University) to manage the huge and complex diversity of the antigen receptors, and is at the origin of immunoinformatics, a science at the interface between immunogenetics and bioinformatics. Immunoglobulins (IG) or antibodies and T cell receptors (TR) are managed and described in the IMGT® databases and tools at the level of receptor, chain and domain. The analysis of the IG and TR variable (V) domain rearranged nucleotide sequences is performed by IMGT/V-QUEST (online since 1997, 50 sequences per batch) and, for next generation sequencing (NGS), by IMGT/HighV-QUEST, the high throughput version of IMGT/V-QUEST (portal begun in 2010, 500,000 sequences per batch). In vitro combinatorial libraries of engineered antibody single chain Fragment variable (scFv) which mimic the in vivo natural diversity of the immune adaptive responses are extensively screened for the discovery of novel antigen binding specificities. However the analysis of NGS full length scFv (~850 bp) represents a challenge as they contain two V domains connected by a linker and there is no tool for the analysis of two V domains in a single chain. The functionality "Analysis of single chain Fragment variable (scFv)" has been implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST for the analysis of the two V domains of IG and TR scFv. It proceeds in five steps: search for a first closest V-REGION, full characterization of the first V-(D)-J-REGION, then search for a second V-REGION and full characterization of the second V-(D)-J-REGION, and finally linker delimitation. For each sequence or NGS read, positions of the 5'V-DOMAIN, linker and 3'V-DOMAIN in the scFv are provided in the 'V-orientated' sense. Each V-DOMAIN is fully characterized (gene identification, sequence description, junction analysis, characterization of mutations and amino acid changes). The functionality is generic and can analyse any IG or TR single chain nucleotide sequence containing two V domains, provided that the corresponding species IMGT reference directory is available. The "Analysis of single chain Fragment variable (scFv)" implemented in IMGT/V-QUEST and, for NGS, in IMGT/HighV-QUEST provides the identification and full characterization of the two V domains of full-length scFv (~850 bp) nucleotide sequences from combinatorial libraries. The analysis can also be performed on concatenated paired chains of expressed antigen receptor IG or TR repertoires.

  2. Variability of Actinobacteria, a minor component of rumen microflora.

    PubMed

    Suľák, M; Sikorová, L; Jankuvová, J; Javorský, P; Pristaš, P

    2012-07-01

    Actinobacteria (Actinomycetes) are a significant and interesting group of gram-positive bacteria. They are regular, though infrequent, members of the microbial life in the rumen and represent up to 3 % of total rumen bacteria; there is a considerable lack of information about the ecology and biology of rumen actinobacteria. During the characterization of variability of rumen treponemas using a non-cultivation approach, we also noted the variability of rumen actinobacteria. By using Treponema-specific primers, a specific 16S rRNA gene library was prepared from cow and sheep rumen total DNA. About 10 % of recombinant clones contained actinobacteria-like sequences. Phylogenetic analyses of 11 clones obtained showed the high variability of actinobacteria in the ruminant digestive system. While some sequences are nearly identical to known sequences of actinobacteria, we detected completely new clusters of actinobacteria-like sequences, probably representing a new, as yet undiscovered, group of rumen Actinobacteria. Further research will be necessary for understanding their nature and functions in the rumen.

  3. The phosphotransferase system-dependent sucrose utilization regulon in enteropathogenic Escherichia coli strains is located in a variable chromosomal region containing iap sequences.

    PubMed

    Treviño-Quintanilla, Luis Gerardo; Escalante, Adelfo; Caro, Alma Delia; Martínez, Alfredo; González, Ricardo; Puente, José Luis; Bolívar, Francisco; Gosset, Guillermo

    2007-01-01

    The capacity to utilize sucrose as a carbon and energy source (Scr(+) phenotype) is a highly variable trait among Escherichia coli strains. In this study, seven enteropathogenic E. coli (EPEC) strains from different sources were studied for their capacity to grow using sucrose. Liquid media cultures showed that all analyzed strains have the Scr(+) phenotype and two distinct groups were defined: one of five and another of two strains displaying doubling times of 67 and 125 min, respectively. The genes conferring the Scr(+) phenotype in one of the fast-growing strains (T19) were cloned and sequenced. Comparative sequence analysis revealed that this strain possesses the scr regulon genes scrKYABR, encoding phosphoenolpyruvate:phosphotransferase system-dependent sucrose transport and utilization activities. Transcript level quantification revealed sucrose-dependent induction of scrK and scrR genes in fast-growing strains, whereas no transcripts were detected in slow-growing strains. Sequence comparison analysis revealed that the scr genes in strain T19 are almost identical to those present in the scr regulon of prototype EPEC E2348/69 and in both strains, the scr genes are inserted in the chromosomal intergenic region of hypothetical genes ygcE and ygcF. Comparison of the ygcE-ygcF intergenic region sequence of strains MG1655, enterohemorrhagic EDL933, uropathogenic ECFT073 and EPEC T19-E2348/69 revealed that the number of extragenic highly repeated iap sequences corresponded to nine, four, two and none, respectively. These results show that the iap sequence-containing chromosomal ygcE-ygcF intergenic region is highly variable in E. coli. Copyright (c) 2007 S. Karger AG, Basel.

  4. Highly conserved intragenic HSV-2 sequences: Results from next-generation sequencing of HSV-2 UL and US regions from genital swabs collected from 3 continents.

    PubMed

    Johnston, Christine; Magaret, Amalia; Roychoudhury, Pavitra; Greninger, Alexander L; Cheng, Anqi; Diem, Kurt; Fitzgibbon, Matthew P; Huang, Meei-Li; Selke, Stacy; Lingappa, Jairam R; Celum, Connie; Jerome, Keith R; Wald, Anna; Koelle, David M

    2017-10-01

    Understanding the variability in circulating herpes simplex virus type 2 (HSV-2) genomic sequences is critical to the development of HSV-2 vaccines. Genital lesion swabs containing ≥ 10^7 log10 copies HSV DNA collected from Africa, the USA, and South America underwent next-generation sequencing, followed by K-mer based filtering and de novo genomic assembly. Sites of heterogeneity within coding regions in unique long and unique short (UL_US) regions were identified. Phylogenetic trees were created using maximum likelihood reconstruction. Among 46 samples from 38 persons, 1468 intragenic base-pair substitutions were identified. The maximum nucleotide distance between strains for concatenated UL_US segments was 0.4%. Phylogeny did not reveal geographic clustering. The most variable proteins had non-synonymous mutations in < 3% of amino acids. Unenriched HSV-2 DNA can undergo next-generation sequencing to identify intragenic variability. The use of clinical swabs for sequencing expands the information that can be gathered directly from these specimens. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Optimized Next-Generation Sequencing Genotype-Haplotype Calling for Genome Variability Analysis

    PubMed Central

    Navarro, Javier; Nevado, Bruno; Hernández, Porfidio; Vera, Gonzalo; Ramos-Onsins, Sebastián E

    2017-01-01

    The accurate estimation of nucleotide variability using next-generation sequencing data is challenged by the high number of sequencing errors produced by new sequencing technologies, especially for nonmodel species, where reference sequences may not be available and the read depth may be low due to limited budgets. The most popular single-nucleotide polymorphism (SNP) callers are designed to obtain a high SNP recovery and low false discovery rate but are not designed to account appropriately for the frequency of the variants. Instead, algorithms designed to account for the frequency of SNPs give precise results for estimating the levels and the patterns of variability. These algorithms are focused on the unbiased estimation of the variability and not on the high recovery of SNPs. Here, we implemented a fast and optimized parallel algorithm that includes the method developed by Roesti et al. and Lynch, which estimates the genotype of each individual at each site, considering the possibility of calling both bases of the genotype, a single one, or none. This algorithm does not consider the reference and therefore is independent of biases related to the reference nucleotide specified. The pipeline starts from a BAM file converted to pileup or mpileup format and the software outputs a FASTA file. The new program not only reduces the running times but also, given the improved use of resources, allows its usage with smaller computers and large parallel computers, expanding its benefits to a wider range of researchers. The output file can be analyzed using software for population genetics analysis, such as the R library PopGenome, the software VariScan, and the program mstatspop for analysis considering positions with missing data. PMID:28894353

  6. Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data.

    PubMed

    Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun

    2013-08-28

    The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, including the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exons' expression level and expression variability. We also investigated the biological implications of an assembly of highly-expressed TE exons. Our analysis confirmed prior studies from the following four aspects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3' (5') untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs.
Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts is also true for exonized TEs. Furthermore, our study resulted in several novel findings. They include: (1) for the TE exons with non-zero expression and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to their lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3'UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
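
    The two quantities named in this abstract, RPKM and the coefficient of variation (CV), can be illustrated with a minimal sketch. It uses the standard textbook definitions only; the authors' "rescaled" RPKM is not specified in the abstract and is not reproduced here. The function names and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of exon per million mapped reads.

    This is the plain definition, not the rescaled variant used in the study.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def coefficient_of_variation(values):
    """CV = standard deviation / mean (population SD, an assumption)."""
    m = mean(values)
    if m == 0:
        return float("nan")  # CV is undefined when the mean is zero
    return pstdev(values) / m


# Hypothetical exon: 100 reads on a 1,000 bp exon in a library of 1e6 mapped reads
example_rpkm = rpkm(100, 1000, 1_000_000)
# Hypothetical expression values for one exon across eight samples
example_cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

    A low mean RPKM combined with a high CV across samples is the kind of pattern the abstract reports for most TE exons.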

  7. Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic

    PubMed Central

    Yebra, Gonzalo; Hodcroft, Emma B.; Ragonnet-Cronin, Manon L.; Pillay, Deenan; Brown, Andrew J. Leigh; Fraser, Christophe; Kellam, Paul; de Oliveira, Tulio; Dennis, Ann; Hoppe, Anne; Kityo, Cissy; Frampton, Dan; Ssemwanga, Deogratius; Tanser, Frank; Keshani, Jagoda; Lingappa, Jairam; Herbeck, Joshua; Wawer, Maria; Essex, Max; Cohen, Myron S.; Paton, Nicholas; Ratmann, Oliver; Kaleebu, Pontiano; Hayes, Richard; Fidler, Sarah; Quinn, Thomas; Novitsky, Vladimir; Haywards, Andrew; Nastouli, Eleni; Morris, Steven; Clark, Duncan; Kozlakidis, Zisis

    2016-01-01

    HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows the use of other genes. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences. PMID:28008945
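
    The sub-sampling design described here (four sampling depths, 100 replicates each, from 4662 sequences) can be sketched as follows. This is an illustrative sketch only: sampling without replacement, the rounding rule, the fixed seed, and the function name `subsample_replicates` are assumptions, and the abstract does not state how the authors drew their replicates.

```python
import random


def subsample_replicates(ids, depths=(1.0, 0.6, 0.2, 0.05), n_replicates=100, seed=0):
    """Return {depth: [replicate, ...]}, each replicate a sorted list of sequence IDs.

    Sampling is uniform and without replacement (an assumption).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = {}
    for depth in depths:
        k = max(1, round(len(ids) * depth))  # number of sequences kept at this depth
        replicates[depth] = [sorted(rng.sample(ids, k)) for _ in range(n_replicates)]
    return replicates


# Hypothetical identifiers standing in for the 4662 simulated sequences
sequence_ids = [f"seq{i:04d}" for i in range(4662)]
design = subsample_replicates(sequence_ids)
```

    Each replicate would then be used to extract the chosen gene region and build a tree; that step depends on external tools and is not shown.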

  8. Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic.

    PubMed

    Yebra, Gonzalo; Hodcroft, Emma B; Ragonnet-Cronin, Manon L; Pillay, Deenan; Brown, Andrew J Leigh

    2016-12-23

    HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows the use of other genes. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree's using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.

  9. Genome variability of foot-and-mouth disease virus during the short period of the 2010 epidemic in Japan.

    PubMed

    Nishi, Tatsuya; Yamada, Manabu; Fukai, Katsuhiko; Shimada, Nobuaki; Morioka, Kazuki; Yoshida, Kazuo; Sakamoto, Kenichi; Kanno, Toru; Yamakawa, Makoto

    2017-02-01

    Foot-and-mouth disease virus (FMDV) is highly contagious and has a high mutation rate, leading to extensive genetic variation. To investigate how FMDV genetically evolves over a short period of an epidemic after initial introduction into an FMD-free area, whole L-fragment sequences of 104 FMDVs isolated from the 2010 epidemic in Japan, which continued for less than three months, were determined and phylogenetically and comparatively analyzed. Phylogenetic analysis of whole L-fragment sequences showed that these isolates were classified into a single group, indicating that FMDV was introduced into Japan in the epidemic via a single introduction. Nucleotide sequences of 104 virus isolates showed more than 99.56% pairwise identity rates without any genetic deletion or insertion, although no sequences were completely identical with each other. These results indicate that genetic substitutions of FMDV occurred gradually and constantly during the epidemic and that generation of an extensive mutant virus could have been prevented by a rapid eradication strategy. From comparative analysis of variability of each FMDV protein coding region, VP4 and 2C regions showed the highest average identity rates and invariant rates, and were confirmed as highly conserved. In contrast, the protein coding regions VP2 and VP1 were confirmed to be highly variable regions with the lowest average identity rates and invariant rates, respectively. Our data demonstrate the importance of a rapid eradication strategy in an FMD epidemic and provide valuable information on the genome variability of FMDV during the short period of an epidemic. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
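
    A pairwise identity rate of the kind reported above can be computed from aligned sequences as sketched below. This is a minimal illustration under stated assumptions: the sequences are already aligned to equal length, columns containing a gap in either sequence are skipped, and the function names and example sequences are hypothetical. The abstract does not describe the authors' exact procedure.

```python
from itertools import combinations


def pairwise_identity(seq_a, seq_b):
    """Percent identity between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap columns (an assumption about gap handling)
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else float("nan")


def identity_summary(aligned):
    """Return (minimum, mean) percent identity over all pairs in {name: sequence}."""
    values = [
        pairwise_identity(aligned[p], aligned[q])
        for p, q in combinations(sorted(aligned), 2)
    ]
    return min(values), sum(values) / len(values)


# Hypothetical toy alignment; real L-fragment sequences are thousands of bases long
toy = {"isolate1": "AAAA", "isolate2": "AAAT", "isolate3": "AATT"}
lowest, average = identity_summary(toy)
```

    Applied per coding region, the same calculation gives average identity rates that can be compared across regions such as VP1, VP2, VP4 and 2C.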

  10. An Avian Basal Ganglia-Forebrain Circuit Contributes Differentially to Syllable Versus Sequence Variability of Adult Bengalese Finch Song

    PubMed Central

    Hampton, Cara M.; Sakata, Jon T.; Brainard, Michael S.

    2009-01-01

    Behavioral variability is important for motor skill learning but continues to be present and actively regulated even in well-learned behaviors. In adult songbirds, two types of song variability can persist and are modulated by social context: variability in syllable structure and variability in syllable sequencing. The degree to which the control of both types of adult variability is shared or distinct remains unknown. The output of a basal ganglia-forebrain circuit, LMAN (the lateral magnocellular nucleus of the anterior nidopallium), has been implicated in song variability. For example, in adult zebra finches, neurons in LMAN actively control the variability of syllable structure. It is unclear, however, whether LMAN contributes to variability in adult syllable sequencing because sequence variability in adult zebra finch song is minimal. In contrast, Bengalese finches retain variability in both syllable structure and syllable sequencing into adulthood. We analyzed the effects of LMAN lesions on the variability of syllable structure and sequencing and on the social modulation of these forms of variability in adult Bengalese finches. We found that lesions of LMAN significantly reduced the variability of syllable structure but not of syllable sequencing. We also found that LMAN lesions eliminated the social modulation of the variability of syllable structure but did not detect significant effects on the modulation of sequence variability. These results show that LMAN contributes differentially to syllable versus sequence variability of adult song and suggest that these forms of variability are regulated by distinct neural pathways. PMID:19357331

  11. Using high throughput sequencing to explore the biodiversity in oral bacterial communities.

    PubMed

    Diaz, P I; Dupuy, A K; Abusleme, L; Reese, B; Obergfell, C; Choquette, L; Dongari-Bagtzoglou, A; Peterson, D E; Terzi, E; Strausbaugh, L D

    2012-06-01

    High throughput sequencing of 16S ribosomal RNA gene amplicons is a cost-effective method for characterization of oral bacterial communities. However, before undertaking large-scale studies, it is necessary to understand the technique-associated limitations and intrinsic variability of the oral ecosystem. In this work we evaluated bias in species representation using an in vitro-assembled mock community of oral bacteria. We then characterized the bacterial communities in saliva and buccal mucosa of five healthy subjects to investigate the power of high throughput sequencing in revealing their diversity and biogeography patterns. Mock community analysis showed primer and DNA isolation biases and an overestimation of diversity that was reduced after eliminating singleton operational taxonomic units (OTUs). Sequencing of salivary and mucosal communities found a total of 455 OTUs (0.3% dissimilarity) with only 78 of these present in all subjects. We demonstrate that this variability was partly the result of incomplete richness coverage even at great sequencing depths, and so comparing communities by their structure was more effective than comparisons based solely on membership. With respect to oral biogeography, we found inter-subject variability in community structure was lower than site differences between salivary and mucosal communities within subjects. These differences were evident at very low sequencing depths and were mostly caused by the abundance of Streptococcus mitis and Gemella haemolysans in mucosa. In summary, we present an experimental and data analysis framework that will facilitate design and interpretation of pyrosequencing-based studies. Despite challenges associated with this technique, we demonstrate its power for evaluation of oral diversity and biogeography patterns. © 2012 John Wiley & Sons A/S.
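
    The contrast drawn above between comparing communities by membership and by structure, together with the removal of singleton OTUs, can be sketched as follows. The choice of Jaccard distance for membership and Bray-Curtis dissimilarity for structure is an assumption made for illustration, as is treating a singleton as an OTU with exactly one read in the whole dataset; the abstract names neither the metrics nor the exact singleton definition. All counts are hypothetical.

```python
def drop_singletons(otu_table):
    """Remove OTUs whose total read count across all samples is 1.

    otu_table maps OTU name -> {sample name: read count}.
    """
    return {otu: per for otu, per in otu_table.items() if sum(per.values()) > 1}


def jaccard_distance(sample_a, sample_b):
    """Membership-based distance: ignores abundances, uses presence/absence."""
    present_a = {otu for otu, n in sample_a.items() if n > 0}
    present_b = {otu for otu, n in sample_b.items() if n > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis(sample_a, sample_b):
    """Structure-based dissimilarity: sensitive to differences in abundance."""
    total = sum(sample_a.values()) + sum(sample_b.values())
    if total == 0:
        return 0.0
    otus = set(sample_a) | set(sample_b)
    diff = sum(abs(sample_a.get(o, 0) - sample_b.get(o, 0)) for o in otus)
    return diff / total


# Two hypothetical samples with identical membership but different structure
saliva = {"otu1": 90, "otu2": 10}
mucosa = {"otu1": 10, "otu2": 90}
membership_distance = jaccard_distance(saliva, mucosa)
structure_distance = bray_curtis(saliva, mucosa)
```

    In this toy case the membership distance is zero while the structure dissimilarity is large, which is the kind of difference a membership-only comparison cannot detect.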

  12. FUNGAL-SPECIFIC PCR PRIMERS DEVELOPED FOR ANALYSIS OF THE ITS REGION OF ENVIRONMENTAL DNA EXTRACTS

    EPA Science Inventory

    Background The Internal Transcribed Spacer (ITS) regions of fungal ribosomal DNA (rDNA) are highly variable sequences of great importance in distinguishing fungal species by PCR analysis. Previously published PCR primers available for amplifying these sequences from environmenta...

  13. High sequence variability among hemocyte-specific Kazal-type proteinase inhibitors in decapod crustaceans.

    PubMed

    Cerenius, Lage; Liu, Haipeng; Zhang, Yanjiao; Rimphanitchayakit, Vichien; Tassanakajon, Anchalee; Gunnar Andersson, M; Söderhäll, Kenneth; Söderhäll, Irene

    2010-01-01

    Crustacean hemocytes were found to produce a large number of transcripts coding for Kazal-type proteinase inhibitors (KPIs). A detailed study performed with the crayfish Pacifastacus leniusculus and the shrimp Penaeus monodon revealed the presence of at least 26 and 20 different Kazal domains from the hemocyte KPIs, respectively. Comparisons with KPIs from other taxa indicate that the sequences of these domains evolve rapidly. A few conserved positions, e.g. six invariant cysteines, were present in all domain sequences, whereas the position of the P1 amino acid, a determinant for substrate specificity, varied highly. A study with a single crayfish animal suggested that even at the individual level considerable sequence variability exists among the hemocyte KPIs produced. Expression analysis of four crayfish KPI transcripts in hematopoietic tissue cells and different hemocyte types suggests that some of these KPIs are likely to be involved in hematopoiesis or hemocyte release, as they were produced in particular hemocyte types or maturation stages only.

  14. Application of whole genome and RNA sequencing to investigate the genomic landscape of common variable immunodeficiency disorders.

    PubMed

    van Schouwenburg, Pauline A; Davenport, Emma E; Kienzler, Anne-Kathrin; Marwah, Ishita; Wright, Benjamin; Lucas, Mary; Malinauskas, Tomas; Martin, Hilary C; Lockstone, Helen E; Cazier, Jean-Baptiste; Chapel, Helen M; Knight, Julian C; Patel, Smita Y

    2015-10-01

    Common Variable Immunodeficiency Disorders (CVIDs) are the most prevalent cause of primary antibody failure. CVIDs are highly variable and a genetic cause has been identified in <5% of patients. Here, we performed whole genome sequencing (WGS) of 34 CVID patients (94% sporadic) and combined them with transcriptomic profiling (RNA-sequencing of B cells) from three patients and three healthy controls. We identified variants in CVID disease genes TNFRSF13B, TNFRSF13C, LRBA and NLRP12 and enrichment of variants in known and novel disease pathways. The pathways identified include B-cell receptor signalling, non-homologous end-joining, regulation of apoptosis, T cell regulation and ICOS signalling. Our data confirm the polygenic nature of CVID and suggest individual-specific aetiologies in many cases. Together our data show that WGS in combination with RNA-sequencing allows for a better understanding of CVIDs and the identification of novel disease associated pathways. Copyright © 2015. Published by Elsevier Inc.

  15. Modeling genome coverage in single-cell sequencing

    PubMed Central

    Daley, Timothy; Smith, Andrew D.

    2014-01-01

    Motivation: Single-cell DNA sequencing is necessary for examining genetic variation at the cellular level, which remains hidden in bulk sequencing experiments. But because such experiments begin with very small amounts of starting material, the amount of information that is obtained from a single-cell sequencing experiment is highly sensitive to the choice of protocol employed and variability in library preparation. In particular, the fraction of the genome represented in single-cell sequencing libraries exhibits extreme variability due to quantitative biases in amplification and loss of genetic material. Results: We propose a method to predict the genome coverage of a deep sequencing experiment using information from an initial shallow sequencing experiment mapped to a reference genome. The observed coverage statistics are used in a non-parametric empirical Bayes Poisson model to estimate the gain in coverage from deeper sequencing. This approach allows researchers to know statistical features of deep sequencing experiments without actually sequencing deeply, providing a basis for optimizing and comparing single-cell sequencing protocols or screening libraries. Availability and implementation: The method is available as part of the preseq software package. Source code is available at http://smithlabresearch.org/preseq. Contact: andrewds@usc.edu Supplementary information: Supplementary material is available at Bioinformatics online. PMID:25107873

  16. An efficient approach to BAC based assembly of complex genomes.

    PubMed

    Visendi, Paul; Berkman, Paul J; Hayashi, Satomi; Golicz, Agnieszka A; Bayer, Philipp E; Ruperao, Pradeep; Hurgobin, Bhavna; Montenegro, Juan; Chan, Chon-Kit Kenneth; Staňková, Helena; Batley, Jacqueline; Šimková, Hana; Doležel, Jaroslav; Edwards, David

    2016-01-01

    There has been an exponential growth in the number of genome sequencing projects since the introduction of next generation DNA sequencing technologies. Genome projects have increasingly involved assembly of whole genome data which produces inferior assemblies compared to traditional Sanger sequencing of genomic fragments cloned into bacterial artificial chromosomes (BACs). While whole genome shotgun sequencing using next generation sequencing (NGS) is relatively fast and inexpensive, this method is extremely challenging for highly complex genomes, where polyploidy or high repeat content confounds accurate assembly, or where a highly accurate 'gold' reference is required. Several attempts have been made to improve genome sequencing approaches by incorporating NGS methods, to variable success. We present the application of a novel BAC sequencing approach which combines indexed pools of BACs, Illumina paired read sequencing, a sequence assembler specifically designed for complex BAC assembly, and a custom bioinformatics pipeline. We demonstrate this method by sequencing and assembling BAC cloned fragments from bread wheat and sugarcane genomes. We demonstrate that our assembly approach is accurate, robust, cost effective and scalable, with applications for complete genome sequencing in large and complex genomes.

  17. Analysis of human papillomavirus 16 E6, E7 genes and Long Control Region in cervical samples from Uruguayan women.

    PubMed

    Ramas, Viviana; Mirazo, Santiago; Bonilla, Sylvia; Ruchansky, Dora; Arbiza, Juan

    2018-05-15

    This study aims to investigate the HPV16 variant distribution by sequence analyses of E6, E7 oncogenes and the Long Control Region (LCR), from cervical cells collected from Uruguayan women, and to reconstruct the phylogenetic relationships among variants. Forty-seven HPV16 variants, obtained from women with HSIL, LSIL, ASCUS and NILM cytological classes were analyzed for LCR and 12 were further studied for E6 and E7. Detailed sequence comparison, genetic heterogeneity analyses and phylogenetic reconstruction were performed. A high variability was observed among LCR sequences, which were distributed in 18 different variants. E6 and E7 sequences exhibited novel non-synonymous substitutions. Uruguayan sequences mainly belonged to the European lineage, and only 5 sequences clustered in non-European branches; 3 of them in the Asian-American and North-American lineage and 2 in an African branch. Additionally, 6 new variants from European and African clusters were identified. HPV16 isolates mainly belonged to the European lineage, though strains from African and Asian-American lineages were also identified. Herein is reported for the first time the distribution and molecular characterization of HPV16 variants from Uruguay, providing novel insights on the molecular epidemiology of this infectious disease in South America. The high variability observed among HPV16 isolates, which mainly belonged to the European lineage, provides an extensive sequence dataset from a country with a high burden of cervical cancer. Copyright © 2018 Elsevier B.V. All rights reserved.

  18. Analysis of sequence variability in the macronuclear DNA of Paramecium tetraurelia: A somatic view of the germline

    PubMed Central

    Duret, Laurent; Cohen, Jean; Jubin, Claire; Dessen, Philippe; Goût, Jean-François; Mousset, Sylvain; Aury, Jean-Marc; Jaillon, Olivier; Noël, Benjamin; Arnaiz, Olivier; Bétermier, Mireille; Wincker, Patrick; Meyer, Eric; Sperling, Linda

    2008-01-01

    Ciliates are the only unicellular eukaryotes known to separate germinal and somatic functions. Diploid but silent micronuclei transmit the genetic information to the next sexual generation. Polyploid macronuclei express the genetic information from a streamlined version of the genome but are replaced at each sexual generation. The macronuclear genome of Paramecium tetraurelia was recently sequenced by a shotgun approach, providing access to the gene repertoire. The 72-Mb assembly represents a consensus sequence for the somatic DNA, which is produced after sexual events by reproducible rearrangements of the zygotic genome involving elimination of repeated sequences, precise excision of unique-copy internal eliminated sequences (IES), and amplification of the cellular genes to high copy number. We report use of the shotgun sequencing data (>10^6 reads representing 13× coverage of a completely homozygous clone) to evaluate variability in the somatic DNA produced by these developmental genome rearrangements. Although DNA amplification appears uniform, both of the DNA elimination processes produce sequence heterogeneity. The variability that arises from IES excision allowed identification of hundreds of putative new IESs, compared to 42 that were previously known, and revealed cases of erroneous excision of segments of coding sequences. We demonstrate that IESs in coding regions are under selective pressure to introduce premature termination of translation in case of excision failure. PMID:18256234

  19. Recurrent interactions between the input and output of a songbird cortico-basal ganglia pathway are implicated in vocal sequence variability

    PubMed Central

    Hamaguchi, Kosuke; Mooney, Richard

    2012-01-01

    Complex brain functions, such as the capacity to learn and modulate vocal sequences, depend on activity propagation in highly distributed neural networks. To explore the synaptic basis of activity propagation in such networks, we made dual in vivo intracellular recordings in anesthetized zebra finches from the input (nucleus HVC) and output (lateral magnocellular nucleus of the anterior nidopallium (LMAN)) neurons of a songbird cortico-basal ganglia (BG) pathway necessary to the learning and modulation of vocal motor sequences. These recordings reveal evidence of bidirectional interactions, rather than only feedforward propagation of activity from HVC to LMAN, as had been previously supposed. A combination of dual and triple recording configurations and pharmacological manipulations was used to map out circuitry by which activity propagates from LMAN to HVC. These experiments indicate that activity travels to HVC through at least two independent ipsilateral pathways, one of which involves fast signaling through a midbrain dopaminergic cell group, reminiscent of recurrent mesocortical loops described in mammals. We then used in vivo pharmacological manipulations to establish that augmented LMAN activity is sufficient to restore high levels of sequence variability in adult birds, suggesting that recurrent interactions through highly distributed forebrain – midbrain pathways can modulate learned vocal sequences. PMID:22915110

  20. Epstein-Barr Virus Latent Membrane Protein 1 Genetic Variability in Peripheral Blood B Cells and Oropharyngeal Fluids

    PubMed Central

    Renzette, Nicholas; Somasundaran, Mohan; Brewster, Frank; Coderre, James; Weiss, Eric R.; McManus, Margaret; Greenough, Thomas; Tabak, Barbara; Garber, Manuel; Kowalik, Timothy F.

    2014-01-01

    ABSTRACT We report the diversity of latent membrane protein 1 (LMP1) gene founder sequences and the level of Epstein-Barr virus (EBV) genome variability over time and across anatomic compartments by using virus genomes amplified directly from oropharyngeal wash specimens and peripheral blood B cells during acute infection and convalescence. The intrahost nucleotide variability of the founder virus was 0.02% across the region sequences, and diversity increased significantly over time in the oropharyngeal compartment (P = 0.004). The LMP1 region showing the greatest level of variability in both compartments, and over time, was concentrated within the functional carboxyl-terminal activating regions 2 and 3 (CTAR2 and CTAR3). Interestingly, a deletion in a proline-rich repeat region (amino acids 274 to 289) of EBV commonly reported in EBV sequenced from cancer specimens was not observed in acute infectious mononucleosis (AIM) patients. Taken together, these data highlight the diversity in circulating EBV genomes and its potential importance in disease pathogenesis and vaccine design. IMPORTANCE This study is among the first to leverage an improved high-throughput deep-sequencing methodology to investigate directly from patient samples the degree of diversity in Epstein-Barr virus (EBV) populations and the extent to which viral genome diversity develops over time in the infected host. Significant variability of circulating EBV latent membrane protein 1 (LMP1) gene sequences was observed between cellular and oral wash samples, and this variability increased over time in oral wash samples. The significance of EBV genetic diversity in transmission and disease pathogenesis is discussed. PMID:24429365

  1. Epstein-Barr virus latent membrane protein 1 genetic variability in peripheral blood B cells and oropharyngeal fluids.

    PubMed

    Renzette, Nicholas; Somasundaran, Mohan; Brewster, Frank; Coderre, James; Weiss, Eric R; McManus, Margaret; Greenough, Thomas; Tabak, Barbara; Garber, Manuel; Kowalik, Timothy F; Luzuriaga, Katherine

    2014-04-01

    We report the diversity of latent membrane protein 1 (LMP1) gene founder sequences and the level of Epstein-Barr virus (EBV) genome variability over time and across anatomic compartments by using virus genomes amplified directly from oropharyngeal wash specimens and peripheral blood B cells during acute infection and convalescence. The intrahost nucleotide variability of the founder virus was 0.02% across the region sequences, and diversity increased significantly over time in the oropharyngeal compartment (P = 0.004). The LMP1 region showing the greatest level of variability in both compartments, and over time, was concentrated within the functional carboxyl-terminal activating regions 2 and 3 (CTAR2 and CTAR3). Interestingly, a deletion in a proline-rich repeat region (amino acids 274 to 289) of EBV commonly reported in EBV sequenced from cancer specimens was not observed in acute infectious mononucleosis (AIM) patients. Taken together, these data highlight the diversity in circulating EBV genomes and its potential importance in disease pathogenesis and vaccine design. This study is among the first to leverage an improved high-throughput deep-sequencing methodology to investigate directly from patient samples the degree of diversity in Epstein-Barr virus (EBV) populations and the extent to which viral genome diversity develops over time in the infected host. Significant variability of circulating EBV latent membrane protein 1 (LMP1) gene sequences was observed between cellular and oral wash samples, and this variability increased over time in oral wash samples. The significance of EBV genetic diversity in transmission and disease pathogenesis is discussed.

  2. Comparative and Evolutionary Analyses of Meloidogyne spp. Based on Mitochondrial Genome Sequences

    PubMed Central

    García, Laura Evangelina; Sánchez-Puerta, M. Virginia

    2015-01-01

    Molecular taxonomy and evolution of nematodes have been recently the focus of several studies. Mitochondrial sequences were proposed as an alternative for precise identification of Meloidogyne species, to study intraspecific variability and to follow maternal lineages. We characterized the mitochondrial genomes (mtDNAs) of the root knot nematodes M. floridensis, M. hapla and M. incognita. These were AT rich (81–83%) and highly compact, encoding 12 proteins, 2 rRNAs, and 22 tRNAs. Comparisons with published mtDNAs of M. chitwoodi, M. incognita (another strain) and M. graminicola revealed that they share protein and rRNA gene order but differ in the order of tRNAs. The mtDNAs of M. floridensis and M. incognita were strikingly similar (97–100% identity for all coding regions). In contrast, M. floridensis, M. chitwoodi, M. hapla and M. graminicola showed 65–84% nucleotide identity for coding regions. Variable mitochondrial sequences are potentially useful for evolutionary and taxonomic studies. We developed a molecular taxonomic marker by sequencing a highly-variable ~2 kb mitochondrial region, nad5-cox1, from 36 populations of root-knot nematodes to elucidate relationships within the genus Meloidogyne. Isolates of five species formed monophyletic groups and showed little intraspecific variability. We also present a thorough analysis of the mitochondrial region cox2-rrnS. Phylogenies based on either mitochondrial region had good discrimination power but could not discriminate between M. arenaria, M. incognita and M. floridensis. PMID:25799071
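
    The AT richness (81–83%) and nucleotide identity ranges (97–100%, 65–84%) reported in this record are simple per-base summaries. The sketch below shows one common way such figures can be computed from sequences that are already aligned. It is illustrative only: the function names `at_content` and `percent_identity` are hypothetical, and skipping gap columns in the identity calculation is an assumption, since the record does not say how the authors handled gaps.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T among unambiguous bases (A, C, G, T) in a sequence."""
    counted = [b for b in seq.upper() if b in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in counted) / len(counted)


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gap columns ('-') are ignored rather than counted as mismatches.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    print(at_content("ATATATATGC"))
    print(percent_identity("ATGC-A", "ATGAAA"))
```

    Counting gaps as mismatches instead would lower the identity values, so the convention should be stated whenever such percentages are compared across studies.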

  3. The impact of sampling, PCR, and sequencing replication on discerning changes in drinking water bacterial community over diurnal time-scales.

    PubMed

    Bautista-de Los Santos, Quyen Melina; Schroeder, Joanna L; Blakemore, Oliver; Moses, Jonathan; Haffey, Mark; Sloan, William; Pinto, Ameet J

    2016-03-01

    High-throughput and deep DNA sequencing, particularly amplicon sequencing, is being increasingly utilized to reveal spatial and temporal dynamics of bacterial communities in drinking water systems. Whilst the sampling and methodological biases associated with PCR and sequencing have been studied in other environments, they have not been quantified for drinking water. These biases are likely to have the greatest effect on the ability to characterize subtle spatio-temporal patterns influenced by process/environmental conditions. In such cases, intra-sample variability may swamp any underlying small, systematic variation. To evaluate this, we undertook a study with replication at multiple levels including sampling sites, sample collection, PCR amplification, and high throughput sequencing of 16S rRNA amplicons. The variability inherent to the PCR amplification and sequencing steps is significant enough to mask differences between bacterial communities from replicate samples. This was largely driven by greater variability in detection of rare bacteria (relative abundance <0.01%) across PCR/sequencing replicates as compared to replicate samples. Despite this, we captured significant changes in bacterial community over diurnal time-scales and find that the extent and pattern of diurnal changes is specific to each sampling location. Further, we find diurnal changes in bacterial community arise due to differences in the presence/absence of the low abundance bacteria and changes in the relative abundance of dominant bacteria. Finally, we show that bacterial community composition is significantly different across sampling sites for time-periods during which there are typically rapid changes in water use. This suggests hydraulic changes (driven by changes in water demand) contribute to shaping the bacterial community in bulk drinking water over diurnal time-scales. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Global sequence variation in the histidine-rich proteins 2 and 3 of Plasmodium falciparum: implications for the performance of malaria rapid diagnostic tests

    PubMed Central

    2010-01-01

    Background Accurate diagnosis is essential for prompt and appropriate treatment of malaria. While rapid diagnostic tests (RDTs) offer great potential to improve malaria diagnosis, the sensitivity of RDTs has been reported to be highly variable. One possible factor contributing to variable test performance is the diversity of parasite antigens. This is of particular concern for Plasmodium falciparum histidine-rich protein 2 (PfHRP2)-detecting RDTs since PfHRP2 has been reported to be highly variable in isolates of the Asia-Pacific region. Methods The pfhrp2 exon 2 fragment from 458 isolates of P. falciparum collected from 38 countries was amplified and sequenced. For a subset of 80 isolates, the exon 2 fragment of histidine-rich protein 3 (pfhrp3) was also amplified and sequenced. DNA sequence and statistical analysis of the variation observed in these genes was conducted. The potential impact of the pfhrp2 variation on RDT detection rates was examined by analysing the relationship between sequence characteristics of this gene and the results of the WHO product testing of malaria RDTs: Round 1 (2008), for 34 PfHRP2-detecting RDTs. Results Sequence analysis revealed extensive variations in the number and arrangement of various repeats encoded by the genes in parasite populations world-wide. However, no statistically robust correlation between gene structure and RDT detection rate for P. falciparum parasites at 200 parasites per microlitre was identified. Conclusions The results suggest that despite extreme sequence variation, diversity of PfHRP2 does not appear to be a major cause of RDT sensitivity variation. PMID:20470441

  5. Evaluating Quality of Aged Archival Formalin-Fixed Paraffin-Embedded Samples for RNA-Sequencing

    EPA Science Inventory

    Archival formalin-fixed paraffin-embedded (FFPE) samples offer a vast, untapped source of genomic data for biomarker discovery. However, the quality of FFPE samples is often highly variable, and conventional methods to assess RNA quality for RNA-sequencing (RNA-seq) are not infor...

  6. The complete genome sequences of 65 Campylobacter jejuni and C. coli strains

    USDA-ARS's Scientific Manuscript database

    Campylobacter jejuni (Cj) and C. coli (Cc) are genetically highly diverse based on various molecular methods including MLST, microarray-based comparisons and the whole genome sequences of a few strains. Cj and Cc diversity is also exhibited by variable capsular polysaccharides (CPS) that are the maj...

  7. Development of genomic microsatellites in Gleditsia triacanthos (Fabaceae) using illumina sequencing

    Treesearch

    Sandra A. Owusu; Margaret Staton; Tara N. Jennings; Scott Schlarbaum; Mark V. Coggeshall; Jeanne Romero-Severson; John E. Carlson; Oliver Gailing

    2013-01-01

    Premise of the study: Fourteen genomic microsatellite markers were developed and characterized in honey locust, Gleditsia triacanthos, using Illumina sequencing. Due to their high variability, these markers can be applied in analyses of genetic diversity and structure, and in mating system and gene flow studies.

  8. Fine Analysis of Genetic Diversity of the tpr Gene Family among Treponemal Species, Subspecies and Strains

    PubMed Central

    Centurion-Lara, Arturo; Giacani, Lorenzo; Godornes, Charmie; Molini, Barbara J.; Brinck Reid, Tara; Lukehart, Sheila A.

    2013-01-01

    Background The pathogenic non-cultivable treponemes include three subspecies of Treponema pallidum (pallidum, pertenue, endemicum), T. carateum, T. paraluiscuniculi, and the unclassified Fribourg-Blanc treponeme (Simian isolate). These treponemes are morphologically indistinguishable and antigenically and genetically highly similar, yet cross-immunity is variable or non-existent. Although all of these organisms cause chronic, multistage skin and systemic disease, they have historically been classified by mode of transmission, clinical presentations and host ranges. Whole genome studies underscore the high degree of sequence identity among species, subspecies and strains, pinpointing a limited number of genomic regions for variation. Many of these “hot spots” include members of the tpr gene family, composed of 12 paralogs encoding candidate virulence factors. We hypothesize that the distinct clinical presentations, host specificity, and variable cross-immunity might reside on virulence factors such as the tpr genes. Methodology/Principal Findings Sequence analysis of 11 tpr loci (excluding tprK) from 12 strains demonstrated an impressive heterogeneity, including SNPs, indels, chimeric genes, truncated gene products and large deletions. Comparative analyses of sequences and 3D models of predicted proteins in Subfamily I highlight the striking co-localization of discrete variable regions with predicted surface-exposed loops. A hallmark of Subfamily II is the presence of chimeric genes in the tprG and J loci. Diversity in Subfamily III is limited to tprA and tprL. Conclusions/Significance An impressive sequence variability was found in tpr sequences among the Treponema isolates examined in this study, with most of the variation being consistent within subspecies or species, or between syphilis vs. non-syphilis strains. Variability was seen in the pallidum subspecies, which can be divided into 5 genogroups. These findings support a genetic basis for the classification of these organisms into their respective subspecies and species. Future functional studies will determine whether the identified genetic differences relate to cross-immunity, clinical differences, or host ranges. PMID:23696912

  9. Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.

    PubMed

    Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong

    2015-06-09

    Differentiation of metazoan cells requires execution of different gene expression programs but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than solely molecular noise.

  10. A Bioinformatic Pipeline for Monitoring of the Mutational Stability of Viral Drug Targets with Deep-Sequencing Technology.

    PubMed

    Kravatsky, Yuri; Chechetkin, Vladimir; Fedoseeva, Daria; Gorbacheva, Maria; Kravatskaya, Galina; Kretova, Olga; Tchurikov, Nickolai

    2017-11-23

    The efficient development of antiviral drugs, including efficient antiviral small interfering RNAs (siRNAs), requires continuous monitoring of the strict correspondence between a drug and the related highly variable viral DNA/RNA target(s). Deep sequencing is able to provide an assessment of both the general target conservation and the frequency of particular mutations in the different target sites. The aim of this study was to develop a reliable bioinformatic pipeline for the analysis of millions of short, deep sequencing reads corresponding to selected highly variable viral sequences that are drug target(s). The suggested bioinformatic pipeline combines the available programs and the ad hoc scripts based on an original algorithm of the search for the conserved targets in the deep sequencing data. We also present the statistical criteria for the threshold of reliable mutation detection and for the assessment of variations between corresponding data sets. These criteria are robust against the possible sequencing errors in the reads. As an example, the bioinformatic pipeline is applied to the study of the conservation of RNA interference (RNAi) targets in human immunodeficiency virus 1 (HIV-1) subtype A. The developed pipeline is freely available to download at the website http://virmut.eimb.ru/. Brief comments and comparisons between VirMut and other pipelines are also presented.
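
    The record describes assessing both overall target conservation and the frequency of particular mutations at each target site from deep-sequencing reads. A minimal sketch of that kind of summary is given below. It is not the VirMut algorithm, which the abstract does not detail. It assumes the reads have already been aligned and trimmed to exactly span the target site, and the function name `target_conservation` is hypothetical.

```python
def target_conservation(reference: str, reads: list[str]):
    """Summarise how well a short target site is conserved across aligned reads.

    Assumption: every read is already aligned and trimmed to the same length
    as `reference`, so position i in a read corresponds to position i in the
    target. Returns (fraction of reads identical to the reference,
    per-position mismatch frequencies).
    """
    if not reads:
        raise ValueError("no reads supplied")
    if any(len(r) != len(reference) for r in reads):
        raise ValueError("reads must match the reference length")

    ref = reference.upper()
    mismatches = [0] * len(ref)
    perfect = 0
    for read in reads:
        read = read.upper()
        if read == ref:
            perfect += 1
            continue
        for i, (r, q) in enumerate(zip(ref, read)):
            if r != q:
                mismatches[i] += 1

    n = len(reads)
    return perfect / n, [m / n for m in mismatches]


if __name__ == "__main__":
    frac, per_site = target_conservation("ACGT", ["ACGT", "ACGT", "ACTT", "TCGT"])
    print(frac, per_site)
```

    A real pipeline would additionally need a threshold that separates true low-frequency mutations from sequencing errors, which is the role of the statistical criteria mentioned in the abstract.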

  11. Genetic variability of Echinococcus granulosus from the Tibetan plateau inferred by mitochondrial DNA sequences.

    PubMed

    Yan, Ning; Nie, Hua-Ming; Jiang, Zhong-Rong; Yang, Ai-Guo; Deng, Shi-Jin; Guo, Li; Yu, Hua; Yan, Yu-Bao; Tsering, Dawa; Kong, Wei-Shu; Wang, Ning; Wang, Jia-Hai; Xie, Yue; Fu, Yan; Yang, De-Ying; Wang, Shu-Xian; Gu, Xiao-Bin; Peng, Xue-Rong; Yang, Guang-You

    2013-09-01

    To analyse genetic variability and population structure, 84 isolates of Echinococcus granulosus (Cestoda: Taeniidae) collected from various host species at different sites of the Tibetan plateau in China were sequenced for the whole mitochondrial nad1 (894 bp) and atp6 (513 bp) genes. The vast majority were classified as G1 genotype (n=82), and two samples from human patients in Sichuan province were identified as G3 genotype. Based on the concatenated sequences of nad1+atp6, 28 different haplotypes (NA1-NA28) were identified. A parsimonious network of the concatenated sequence haplotypes showed star-like features in the overall population, with NA1 as the major haplotype in the population networks. By AMOVA it was shown that variation of E. granulosus within the overall population was the main pattern of the total genetic variability. Neutrality indexes of the concatenated sequence (nad1+atp6) were computed by Tajima's D and Fu's Fs tests and showed high negative values for E. granulosus, indicating significant deviations from neutrality. FST and Nm values suggested that the populations were not genetically differentiated. Copyright © 2013 Elsevier B.V. All rights reserved.
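
    Tajima's D, one of the two neutrality statistics named in this record, compares the mean number of pairwise differences with the number of segregating sites; strongly negative values reflect an excess of rare variants. The sketch below implements the standard published formula (Tajima, 1989) for a gap-free alignment. It is illustrative only and is not the software the authors used; the function name `tajimas_d` is hypothetical, and Fu's Fs is not covered.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs: list[str]) -> float:
    """Tajima's D for equal-length aligned sequences (gap-free alignment assumed)."""
    n = len(seqs)
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites.
    s_sites = sum(1 for col in zip(*seqs) if len(set(col)) > 1)
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


if __name__ == "__main__":
    # Every variant is a singleton, so D should come out negative.
    print(tajimas_d(["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]))
```

    The number of distinct haplotypes in the same input can be obtained with `len(set(seqs))`, which corresponds to the haplotype count reported in the record.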

  12. Proteomic Identification of Monoclonal Antibodies from Serum

    PubMed Central

    2015-01-01

    Characterizing the in vivo dynamics of the polyclonal antibody repertoire in serum, such as that which might arise in response to stimulation with an antigen, is difficult due to the presence of many highly similar immunoglobulin proteins, each specified by distinct B lymphocytes. These challenges have precluded the use of conventional mass spectrometry for antibody identification based on peptide mass spectral matches to a genomic reference database. Recently, progress has been made using bottom-up analysis of serum antibodies by nanoflow liquid chromatography/high-resolution tandem mass spectrometry combined with a sample-specific antibody sequence database generated by high-throughput sequencing of individual B cell immunoglobulin variable domains (V genes). Here, we describe how intrinsic features of antibody primary structure, most notably the interspersed segments of variable and conserved amino acid sequences, generate recurring patterns in the corresponding peptide mass spectra of V gene peptides, greatly complicating the assignment of correct sequences to mass spectral data. We show that the standard method of decoy-based error modeling fails to account for the error introduced by these highly similar sequences, leading to a significant underestimation of the false discovery rate. Because of these effects, antibody-derived peptide mass spectra require increased stringency in their interpretation. The use of filters based on the mean precursor ion mass accuracy of peptide-spectrum matches is shown to be particularly effective in distinguishing between “true” and “false” identifications. These findings highlight important caveats associated with the use of standard database search and error-modeling methods with nonstandard data sets and custom sequence databases. PMID:24684310
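
    Two quantities in this record lend themselves to a short worked example: the decoy-based false discovery rate and the precursor ion mass error used for filtering. The sketch below shows the conventional target-decoy estimate (decoy hits divided by target hits above a score threshold) and the mean mass error in parts per million over a set of peptide-spectrum matches. It is a generic illustration rather than the authors' implementation; the function names and the input layout are assumptions. Note that the record's main point is that this conventional estimate underestimates the true error for highly similar antibody sequences.

```python
def decoy_fdr(psms: list[tuple[float, bool]], threshold: float) -> float:
    """Conventional target-decoy FDR estimate at a score threshold.

    `psms` holds (score, is_decoy) pairs; higher scores are assumed better.
    Returns decoy hits / target hits among matches scoring >= threshold.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        raise ValueError("no target matches pass the threshold")
    return decoys / targets


def ppm_error(observed_mass: float, theoretical_mass: float) -> float:
    """Signed precursor mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def mean_ppm_error(masses: list[tuple[float, float]]) -> float:
    """Mean signed ppm error over (observed, theoretical) mass pairs."""
    if not masses:
        raise ValueError("no mass pairs supplied")
    return sum(ppm_error(obs, theo) for obs, theo in masses) / len(masses)


if __name__ == "__main__":
    example = [(10.0, False), (9.0, False), (8.0, True), (7.0, False), (3.0, True)]
    print(decoy_fdr(example, threshold=7.0))
    print(mean_ppm_error([(1000.001, 1000.0), (2000.0, 2000.0)]))
```

    A filter of the kind described would keep only peptides whose mean ppm error falls within a chosen tolerance; the tolerance itself is not given in the abstract and would have to be set from the instrument's calibration.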

  13. Full Genome Sequencing Reveals New Southern African Territories Genotypes Bringing Us Closer to Understanding True Variability of Foot-and-Mouth Disease Virus in Africa

    PubMed Central

    Lasecka-Dykes, Lidia; Wright, Caroline F.; Di Nardo, Antonello; Logan, Grace; Mioulet, Valerie; Jackson, Terry; Tuthill, Tobias J.; Knowles, Nick J.; King, Donald P.

    2018-01-01

    Foot-and-mouth disease virus (FMDV) causes a highly contagious disease of cloven-hooved animals that poses a constant burden on farmers in endemic regions and threatens the livestock industries in disease-free countries. Despite the increased number of publicly available whole genome sequences, FMDV data are biased by the opportunistic nature of sampling. Since whole genomic sequences of Southern African Territories (SAT) are particularly underrepresented, this study sequenced 34 isolates from eastern and southern Africa. Phylogenetic analyses revealed two novel genotypes (that comprised 8/34 of these SAT isolates) which contained unusual 5′ untranslated and non-structural encoding regions. While recombination has occurred between these sequences, phylogeny violation analyses indicated that the high degree of sequence diversity for the novel SAT genotypes has not solely arisen from recombination events. Based on estimates of the timing of ancestral divergence, these data are interpreted as being representative of un-sampled FMDV isolates that have been subjected to geographical isolation within Africa by the effects of the Great African Rinderpest Pandemic (1887–1897), which caused a mass die-out of FMDV-susceptible hosts. These findings demonstrate that further sequencing of African FMDV isolates is likely to reveal more unusual genotypes and will allow for better understanding of natural variability and evolution of FMDV. PMID:29652800

  14. Gene sequence variability of the three surface proteins of human respiratory syncytial virus (HRSV) in Texas.

    PubMed

    Tapia, Lorena I; Shaw, Chad A; Aideyan, Letisha O; Jewell, Alan M; Dawson, Brian C; Haq, Taha R; Piedra, Pedro A

    2014-01-01

    Human respiratory syncytial virus (HRSV) has three surface glycoproteins: small hydrophobic (SH), attachment (G) and fusion (F), encoded by three consecutive genes (SH-G-F). A 270-nt fragment of the G gene is used to genotype HRSV isolates. This study genotyped and investigated the variability of the gene and amino acid sequences of the three surface proteins of HRSV strains collected from 1987 to 2005 from one center. Sixty original clinical isolates and 5 prototype strains were analyzed. Sequences containing SH, F and G genes were generated, and multiple alignments and phylogenetic trees were analyzed. Genetic variability by protein domains comparing virus genotypes was assessed. Complete sequences of the SH-G-F genes were obtained for all 65 samples: HRSV-A = 35; HRSV-B = 30. In group A strains, genotypes GA5 and GA2 were predominant. For HRSV-B strains, the genotype GB4 was predominant from 1992 to 1994 and only genotype BA viruses were detected in 2004-2005. Different genetic variability at nucleotide level was detected between the genes, with G gene being the most variable and the highest variability detected in the 270-nt G fragment that is frequently used to genotype the virus. High variability (>10%) was also detected in the signal peptide and transmembrane domains of the F gene of HRSV A strains. Variability among the HRSV strains resulting in non-synonymous changes was detected in hypervariable domains of G protein, the signal peptide of the F protein, a not previously defined domain in the F protein, and the antigenic site Ø in the pre-fusion F. Divergent trends were observed between HRSV -A and -B groups for some functional domains. A diverse population of HRSV -A and -B genotypes circulated in Houston during an 18 year period. We hypothesize that diverse sequence variation of the surface protein genes provide HRSV strains a survival advantage in a partially immune-protected community.

  15. Gene Sequence Variability of the Three Surface Proteins of Human Respiratory Syncytial Virus (HRSV) in Texas

    PubMed Central

    Tapia, Lorena I.; Shaw, Chad A.; Aideyan, Letisha O.; Jewell, Alan M.; Dawson, Brian C.; Haq, Taha R.; Piedra, Pedro A.

    2014-01-01

    Human respiratory syncytial virus (HRSV) has three surface glycoproteins: small hydrophobic (SH), attachment (G) and fusion (F), encoded by three consecutive genes (SH-G-F). A 270-nt fragment of the G gene is used to genotype HRSV isolates. This study genotyped and investigated the variability of the gene and amino acid sequences of the three surface proteins of HRSV strains collected from 1987 to 2005 from one center. Sixty original clinical isolates and 5 prototype strains were analyzed. Sequences containing SH, F and G genes were generated, and multiple alignments and phylogenetic trees were analyzed. Genetic variability by protein domains comparing virus genotypes was assessed. Complete sequences of the SH-G-F genes were obtained for all 65 samples: HRSV-A = 35; HRSV-B = 30. In group A strains, genotypes GA5 and GA2 were predominant. For HRSV-B strains, the genotype GB4 was predominant from 1992 to 1994 and only genotype BA viruses were detected in 2004–2005. Different genetic variability at nucleotide level was detected between the genes, with G gene being the most variable and the highest variability detected in the 270-nt G fragment that is frequently used to genotype the virus. High variability (>10%) was also detected in the signal peptide and transmembrane domains of the F gene of HRSV A strains. Variability among the HRSV strains resulting in non-synonymous changes was detected in hypervariable domains of G protein, the signal peptide of the F protein, a not previously defined domain in the F protein, and the antigenic site Ø in the pre-fusion F. Divergent trends were observed between HRSV -A and -B groups for some functional domains. A diverse population of HRSV -A and -B genotypes circulated in Houston during an 18 year period. We hypothesize that diverse sequence variation of the surface protein genes provide HRSV strains a survival advantage in a partially immune-protected community. PMID:24625544

  16. B-Bolivia, an Allele of the Maize b1 Gene with Variable Expression, Contains a High Copy Retrotransposon-Related Sequence Immediately Upstream

    PubMed Central

    Selinger, David A.; Chandler, Vicki L.

    2001-01-01

    The maize (Zea mays) b1 gene encodes a transcription factor that regulates the anthocyanin pigment pathway. Of the b1 alleles with distinct tissue-specific expression, B-Peru and B-Bolivia are the only alleles that confer seed pigmentation. B-Bolivia produces variable and weaker seed expression but darker, more regular plant expression relative to B-Peru. Our experiments demonstrated that B-Bolivia is not expressed in the seed when transmitted through the male. When transmitted through the female the proportion of kernels pigmented and the intensity of pigment varied. Molecular characterization of B-Bolivia demonstrated that it shares the first 530 bp of the upstream region with B-Peru, a region sufficient for seed expression. Immediately upstream of 530 bp, B-Bolivia is completely divergent from B-Peru. These sequences share sequence similarity to retrotransposons. Transient expression assays of various promoter constructs identified a 33-bp region in B-Bolivia that can account for the reduced aleurone pigment amounts (40%) observed with B-Bolivia relative to B-Peru. Transgenic plants carrying the B-Bolivia promoter proximal region produced pigmented seeds. Similar to native B-Bolivia, some transgene loci are variably expressed in seeds. In contrast to native B-Bolivia, the transgene loci are expressed in seeds when transmitted through both the male and female. Some transgenic lines produced pigment in vegetative tissues, but the tissue-specificity was different from B-Bolivia, suggesting the introduced sequences do not contain the B-Bolivia plant-specific regulatory sequences. We hypothesize that the chromatin context of the B-Bolivia allele controls its epigenetic seed expression properties, which could be influenced by the adjacent highly repeated retrotransposon sequence. PMID:11244116

  17. Assessing the intra-species genetic variability in the clonal pathogen Campylobacter fetus: CRISPRs are highly polymorphic DNA markers.

    PubMed

    Calleros, Lucía; Betancor, Laura; Iraola, Gregorio; Méndez, Alejandra; Morsella, Claudia; Paolicchi, Fernando; Silveyra, Silvia; Velilla, Alejandra; Pérez, Ruben

    2017-01-01

    Campylobacter fetus is a Gram-negative, microaerophilic bacterium that infects animals and humans. The subspecies Campylobacter fetus subsp. fetus (Cff) affects a broad range of vertebrate hosts and induces abortion in cows and sheep. Campylobacter fetus subsp. venerealis (Cfv) is restricted to cattle and causes the endemic disease bovine genital campylobacteriosis, which triggers reproductive problems and is responsible for major economic losses. Campylobacter fetus subsp. testudinum (Cft) has been isolated mostly from apparently healthy reptiles belonging to different species but also from ill snakes and humans. Genotypic differentiation of Cff and Cfv is difficult, and epidemiological information is scarce because there are few methods to study the genetic diversity of the strains. We analyze the efficacy of MLST, ribosomal sequences (23S gene and internal spacer region), and CRISPRs to assess the genetic variability of C. fetus in bovine and human isolates. Sequences retrieved from complete genomes were included in the analysis for comparative purposes. MLST and ribosomal sequences had scarce or null variability, while the CRISPR-cas system structure and the sequence of CRISPR1 locus showed remarkable diversity. None of the sequences here analyzed provided evidence of a genetic differentiation of Cff and Cfv in bovine isolates. Comparison of bovine and human isolates with Cft strains showed a striking divergence. Inter-host differences raise the possibility of determining the original host of human infections using CRISPR sequences. CRISPRs are the most variable sequences analyzed in C. fetus so far, and constitute excellent representatives of a dynamic fraction of the genome. CRISPR typing is a promising tool to characterize isolates and to track the source and transmission route of C. fetus infections. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. Origin and distribution of Sporothrix globosa causing sapronoses in Asia.

    PubMed

    Moussa, Tarek A A; Kadasa, Naif M S; Al Zahrani, Hassan S; Ahmed, Sarah Abdallah; Feng, Peiying; Gerrits van den Ende, Albertus H G; Zhang, Yu; Kano, Rui; Li, Fuqiu; Li, Shanshan; Song, Yang; Dong, Bilin; Rossato, Luana; Dolatabadi, Somayeh; Hoog, Sybren de

    2017-05-01

    The aim of the study was to evaluate the main sources and epidemiological patterns and speculate on the evolutionary origin of Sporothrix globosa in Asia. Case and case series literature on sporotrichosis in Asia from January 2007 onwards were reviewed using meta-analysis. Phylogenetic analysis of relevant S. globosa was carried out on the basis of concatenated sequences of ITS, TEF3 and CAL. A haplotype network of CAL sequences of 281 Sporothrix isolates was analysed to determine the population structure of S. globosa. Nearly all cases of sporotrichosis caused by S. globosa in Asia were human. In contrast to the remaining pathogenic Sporothrix species, feline transmission was exceptional; nearly all regional cat-associated cases were caused by Sporothrix schenckii. While the latter species was highly variable and showed recombination, S. globosa seemed to be a clonal offshoot, as was Sporothrix brasiliensis. The origin of the segregants was located in an area of high variability in S. schenckii with a relatively high frequency of Asian strains. In Asia, S. globosa was the prevalent species. The low diversity of S. globosa suggested a recent divergence with a founder effect of low variability from the variable ancestral species, S. schenckii.

  19. Mitochondrial sequences of Seriatopora corals show little agreement with morphology and reveal the duplication of a tRNA gene near the control region

    NASA Astrophysics Data System (ADS)

    Flot, J.-F.; Licuanan, W. Y.; Nakano, Y.; Payri, C.; Cruaud, C.; Tillier, S.

    2008-12-01

    The taxonomy of corals of the genus Seriatopora has not previously been studied using molecular sequence markers. As a first step toward a re-evaluation of species boundaries in this genus, mitochondrial sequence variability was analyzed in 51 samples collected from Okinawa, New Caledonia, and the Philippines. Four clusters of sequences were detected that showed little concordance with species currently recognized on a morphological basis. The most likely explanation is that the skeletal characters used for species identification are highly variable (polymorphic or phenotypically plastic); alternative explanations include introgression/hybridization, or deep coalescence and the retention of ancestral mitochondrial polymorphisms. In all individuals sequenced, two copies of trnW were found on either side of the atp8 gene near the putative D-loop, a novel mitochondrial gene arrangement that may have arisen from a duplication of the trnW-atp8 region followed by a deletion of one atp8.

  20. Mechanisms controlling the complete accretionary beach state sequence

    NASA Astrophysics Data System (ADS)

    Dubarbier, Benjamin; Castelle, Bruno; Ruessink, Gerben; Marieu, Vincent

    2017-06-01

    Accretionary downstate beach sequence is a key element of observed nearshore morphological variability along sandy coasts. We present and analyze the first numerical simulation of such a sequence using a process-based morphodynamic model that solves the coupling between waves, depth-integrated currents, and sediment transport. The simulation evolves from an alongshore uniform barred beach (storm profile) to an almost featureless shore-welded terrace (summer profile) through the highly alongshore variable detached crescentic bar and transverse bar/rip system states. A global analysis of the full sequence allows determining the varying contributions of the different hydro-sedimentary processes. Sediment transport driven by orbital velocity skewness is critical to the overall onshore sandbar migration, while gravitational downslope sediment transport acts as a damping term inhibiting further channel growth enforced by rip flow circulation. Accurate morphological diffusivity and inclusion of orbital velocity skewness open new perspectives in terms of morphodynamic modeling of real beaches.

  1. High levels of diversity characterize mandrill (Mandrillus sphinx) Mhc-DRB sequences.

    PubMed

    Abbott, Kristin M; Wickings, E Jean; Knapp, Leslie A

    2006-08-01

    The major histocompatibility complex (MHC) is highly polymorphic in most primate species studied thus far. The rhesus macaque (Macaca mulatta) has been studied extensively and the Mhc-DRB region demonstrates variability similar to humans. The extent of MHC diversity is relatively unknown for other Old World monkeys (OWM), especially among genera other than Macaca. A molecular survey of the Mhc-DRB region in mandrills (Mandrillus sphinx) revealed extensive variability, suggesting that other OWMs may also possess high levels of Mhc-DRB polymorphism. In the present study, 33 Mhc-DRB loci were identified from only 13 animals. Eleven were wild-born and presumed to be unrelated and two were captive-born twins. Two to seven different sequences were identified for each individual, suggesting that some mandrills may have as many as four Mhc-DRB loci on a single haplotype. From these sequences, representatives of at least six Mhc-DRB loci or lineages were identified. As observed in other primates, some new lineages may have arisen through the process of gene conversion. These findings indicate that mandrills have Mhc-DRB diversity not unlike rhesus macaques and humans.
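
    The step from "up to seven sequences per individual" to "as many as four loci on a single haplotype" is a counting argument: a diploid animal carries two haplotypes, and each locus contributes at most one allele per haplotype. A minimal sketch of that arithmetic follows; the function name is hypothetical, and it assumes every recovered sequence is a genuine allele rather than an amplification artifact.

```python
import math


def min_loci_on_one_haplotype(n_sequences, ploidy=2):
    """Lower bound on the loci carried by at least one haplotype.

    Assumes each locus contributes at most one distinct sequence per
    haplotype, so n_sequences must be shared among `ploidy` haplotypes.
    """
    if n_sequences < 0 or ploidy < 1:
        raise ValueError("n_sequences must be >= 0 and ploidy >= 1")
    return math.ceil(n_sequences / ploidy)


# Range reported in the abstract: two to seven sequences per individual.
low = min_loci_on_one_haplotype(2)
high = min_loci_on_one_haplotype(7)
print(low, high)
```

    This gives only a lower bound; homozygous loci or alleles shared between loci would hide additional copies.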

  2. Variability among the Most Rapidly Evolving Plastid Genomic Regions is Lineage-Specific: Implications of Pairwise Genome Comparisons in Pyrus (Rosaceae) and Other Angiosperms for Marker Choice

    PubMed Central

    Ter-Voskanyan, Hasmik; Allgaier, Martin; Borsch, Thomas

    2014-01-01

    Plastid genomes exhibit different levels of variability in their sequences, depending on the respective kinds of genomic regions. Genes are usually more conserved while noncoding introns and spacers evolve at a faster pace. While a set of about thirty maximum variable noncoding genomic regions has been suggested to provide universally promising phylogenetic markers throughout angiosperms, applications often require several regions to be sequenced for many individuals. Our project aims to illuminate evolutionary relationships and species-limits in the genus Pyrus (Rosaceae), a typical case with very low genetic distances between taxa. In this study, we have sequenced the plastid genome of Pyrus spinosa and aligned it to the already available P. pyrifolia sequence. The overall p-distance of the two Pyrus genomes was 0.00145. The intergenic spacers between ndhC–trnV, trnR–atpA, ndhF–rpl32, psbM–trnD, and trnQ–rps16 were the most variable regions, also comprising the highest total numbers of substitutions, indels and inversions (potentially informative characters). Our comparative analysis of further plastid genome pairs with similar low p-distances from Oenothera (representing another rosid), Olea (asterids) and Cymbidium (monocots) showed in each case a different ranking of genomic regions in terms of variability and potentially informative characters. Only two intergenic spacers (ndhF–rpl32 and trnK–rps16) were consistently found among the 30 top-ranked regions. We have mapped the occurrence of substitutions and microstructural mutations in the four genome pairs. High AT content in specific sequence elements seems to foster frequent mutations. We conclude that the variability among the fastest evolving plastid genomic regions is lineage-specific and thus cannot be precisely predicted across angiosperms. The often lineage-specific occurrence of stem-loop elements in the sequences of introns and spacers also governs lineage-specific mutations. Sequencing whole plastid genomes to find markers for evolutionary analyses is therefore particularly useful when overall genetic distances are low. PMID:25405773
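
    The p-distance quoted above is the proportion of aligned sites at which two sequences differ. The sketch below shows one common way to compute it from a pairwise alignment. The function name and the choice to skip gapped columns are assumptions for illustration, not the authors' pipeline, and the toy sequences are invented.

```python
def p_distance(seq_a, seq_b, gap_chars="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption;
    other gap treatments are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Toy alignment (invented): one mismatch in eight ungapped columns.
print(p_distance("ACGTACGT", "ACGTACGA"))
```

    On this scale, a genome-wide value of 0.00145 corresponds to roughly 1.45 differences per 1000 compared sites.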

  3. Sequential associative memory with nonuniformity of the layer sizes.

    PubMed

    Teramae, Jun-Nosuke; Fukai, Tomoki

    2007-01-01

    Sequence retrieval has a fundamental importance in information processing by the brain, and has extensively been studied in neural network models. Most previous sequential associative memory models embedded sequences of memory patterns that have nearly equal sizes. It was recently shown that local cortical networks display many diverse yet repeatable precise temporal sequences of neuronal activities, termed "neuronal avalanches." Interestingly, these avalanches displayed size and lifetime distributions that obey power laws. Inspired by these experimental findings, here we consider an associative memory model of binary neurons that stores sequences of memory patterns with highly variable sizes. Our analysis includes the case where the statistics of these size variations obey the above-mentioned power laws. We study the retrieval dynamics of such memory systems by analytically deriving the equations that govern the time evolution of macroscopic order parameters. We calculate the critical sequence length beyond which the network cannot retrieve memory sequences correctly. As an application of the analysis, we show how the present variability in sequential memory patterns degrades the power-law lifetime distribution of retrieved neural activities.

  4. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential

    PubMed Central

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela AE; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential. PMID:24256717

  5. Generation of “LYmph Node Derived Antibody Libraries” (LYNDAL) for selecting fully human antibody fragments with therapeutic potential.

    PubMed

    Diebolder, Philipp; Keller, Armin; Haase, Stephanie; Schlegelmilch, Anne; Kiefer, Jonathan D; Karimi, Tamana; Weber, Tobias; Moldenhauer, Gerhard; Kehm, Roland; Eis-Hübinger, Anna M; Jäger, Dirk; Federspil, Philippe A; Herold-Mende, Christel; Dyckhoff, Gerhard; Kontermann, Roland E; Arndt, Michaela A E; Krauss, Jürgen

    2014-01-01

    The development of efficient strategies for generating fully human monoclonal antibodies with unique functional properties that are exploitable for tailored therapeutic interventions remains a major challenge in the antibody technology field. Here, we present a methodology for recovering such antibodies from antigen-encountered human B cell repertoires. As the source for variable antibody genes, we cloned immunoglobulin G (IgG)-derived B cell repertoires from lymph nodes of 20 individuals undergoing surgery for head and neck cancer. Sequence analysis of unselected “LYmph Node Derived Antibody Libraries” (LYNDAL) revealed a naturally occurring distribution pattern of rearranged antibody sequences, representing all known variable gene families and most functional germline sequences. To demonstrate the feasibility for selecting antibodies with therapeutic potential from these repertoires, seven LYNDAL from donors with high serum titers against herpes simplex virus (HSV) were panned on recombinant glycoprotein B of HSV-1. Screening for specific binders delivered 34 single-chain variable fragments (scFvs) with unique sequences. Sequence analysis revealed extensive somatic hypermutation of enriched clones as a result of affinity maturation. Binding of scFvs to common glycoprotein B variants from HSV-1 and HSV-2 strains was highly specific, and the majority of analyzed antibody fragments bound to the target antigen with nanomolar affinity. From eight scFvs with HSV-neutralizing capacity in vitro, the most potent antibody neutralized 50% HSV-2 at 4.5 nM as a dimeric (scFv)2. We anticipate our approach to be useful for recovering fully human antibodies with therapeutic potential.

  6. Hubby and Lewontin on Protein Variation in Natural Populations: When Molecular Genetics Came to the Rescue of Population Genetics.

    PubMed

    Charlesworth, Brian; Charlesworth, Deborah; Coyne, Jerry A; Langley, Charles H

    2016-08-01

    The 1966 GENETICS papers by John Hubby and Richard Lewontin were a landmark in the study of genome-wide levels of variability. They used the technique of gel electrophoresis of enzymes and proteins to study variation in natural populations of Drosophila pseudoobscura, at a set of loci that had been chosen purely for technical convenience, without prior knowledge of their levels of variability. Together with the independent study of human populations by Harry Harris, this seminal study provided the first relatively unbiased picture of the extent of genetic variability in protein sequences within populations, revealing that many genes had surprisingly high levels of diversity. These papers stimulated a large research program that found similarly high electrophoretic variability in many different species and led to statistical tools for interpreting the data in terms of population genetics processes such as genetic drift, balancing and purifying selection, and the effects of selection on linked variants. The current use of whole-genome sequences in studies of variation is the direct descendant of this pioneering work. Copyright © 2016 by the Genetics Society of America.

  7. TUMOR HAPLOTYPE ASSEMBLY ALGORITHMS FOR CANCER GENOMICS

    PubMed Central

    AGUIAR, DEREK; WONG, WENDY S.W.; ISTRAIL, SORIN

    2014-01-01

    The growing availability of inexpensive high-throughput sequence data is enabling researchers to sequence tumor populations within a single individual at high coverage. But, cancer genome sequence evolution and mutational phenomena like driver mutations and gene fusions are difficult to investigate without first reconstructing tumor haplotype sequences. Haplotype assembly of single individual tumor populations is an exceedingly difficult task complicated by tumor haplotype heterogeneity, tumor or normal cell sequence contamination, polyploidy, and complex patterns of variation. While computational and experimental haplotype phasing of diploid genomes has seen much progress in recent years, haplotype assembly in cancer genomes remains uncharted territory. In this work, we describe HapCompass-Tumor, a computational modeling and algorithmic framework for haplotype assembly of copy number variable cancer genomes containing haplotypes at different frequencies and complex variation. We extend our polyploid haplotype assembly model and present novel algorithms for (1) complex variations, including copy number changes, as varying numbers of disjoint paths in an associated graph, (2) variable haplotype frequencies and contamination, and (3) computation of tumor haplotypes using simple cycles of the compass graph which constrain the space of haplotype assembly solutions. The model and algorithm are implemented in the software package HapCompass-Tumor, which is available for download from http://www.brown.edu/Research/Istrail_Lab/. PMID:24297529

  8. The Winds of Main Sequence B Stars in NGC 6231, Evidence for Shocks in Weak Winds.

    NASA Astrophysics Data System (ADS)

    Massa, Derck

    1996-07-01

    Because the main sequence B stars in NGC 6231 have abnormally strong C iv wind lines, they are the only main sequence B stars with distinct edge velocities. Although the underlying cause for the strong lines remains unknown, these stars do provide an opportunity to test two important ideas concerning B star winds: 1) that the driving ions in the winds of stars with low mass loss rates decouple from the general flow; and 2) that shocks deep in the winds of main sequence B stars are responsible for their observed X-rays. In both of these models, the wind accelerates toward a terminal velocity, v_infty, far greater than the observed value, shocking or decoupling well before it can attain the high v_infty. As a result, the observable wind accelerates very rapidly, leading to wind flushing times less than 30 minutes. If these conjectures are correct, then the winds of main sequence B stars should be highly variable on time scales of minutes. Model fitting of available IUE data is consistent with the general notion of a rapidly accelerating wind, shocking well before its actual v_infty. However, these are 5 hour exposures, so the fits are to ill-defined mean wind flows. The new GHRS observations will provide adequate spectral and temporal resolution to observe the expected variability and, thereby, verify the existence of two important astrophysical processes.

  9. Sequence stratigraphy of upper Paleogene to Neogene carbonates exposed from Guánica bay to Guayanilla, Southern Puerto Rico.

    NASA Astrophysics Data System (ADS)

    Flores Hots, V. E.; Santos, H.

    2016-12-01

    Detailed stratigraphic columns were measured and microfacies analysis was performed in southwestern Puerto Rico to conduct a sequence stratigraphic analysis of Paleogene to Neogene strata. Two of the best exposed outcrops include the Guánica Bay and outcrops along Highway PR-132 in Guayanilla. Three depositional sequences, separated by two major sequence boundaries, were found. The lower sequence occurs within the Juana Díaz Formation and is an open shelf to reef facies indicative of a Transgressive System Tract (TST) that is overlain by a High Stand System Tract (HST) marked by reef progradation. The HST in both Guánica Bay and Guayanilla is characterized by coral-rhodolith cyclicity; however, sections in Guánica Bay show pervasive recrystallization due to diagenetic alteration as a result of long periods of exposure. This first sequence is Oligocene in age. The middle sequence, exposed at the eastern section of the Guánica Bay, is also part of the Juana Díaz Formation and includes a turbiditic Lowstand System Tract (LST) of slope-like deposits flow; a TST constituted by coral rubble and skeletal grainstones belonging to a shallow island slope environment; and a HST that consists of an island slope chalk facies intercalated with turbidite grainstones derived from storm events at the Guayanilla location. During the deposition of the middle sequence the Guánica Bay west section was topographically higher and exposed. The upper depositional sequence is Miocene in age and is composed of a TST with the transgression starting distally in the Guánica area and transgressing northward toward the Guayanilla area. These were correlated using high resolution 87Sr/86Sr isotope concentrations of shallow marine mollusks Kuphus incrassatus in the Ponce Formation at the Guánica Bay and Guayanilla locations. Facies patterns like the ones in the studied outcrops of southwestern Puerto Rico provide an exemplary environmental model of variability of paleodepositional relief, tectonic setting, variability in depositional setting of reef sediment accumulations, the influence of storm events, and variability in rock porosity by diagenetic processes, yielding valuable models that may apply to potential Oligocene-Miocene hydrocarbon reservoirs.

  10. Breast MRI at 7 Tesla with a bilateral coil and robust fat suppression.

    PubMed

    Brown, Ryan; Storey, Pippa; Geppert, Christian; McGorty, KellyAnne; Klautau Leite, Ana Paula; Babb, James; Sodickson, Daniel K; Wiggins, Graham C; Moy, Linda

    2014-03-01

    To develop a bilateral coil and fat suppressed T1-weighted sequence for 7 Tesla (T) breast MRI. A dual-solenoid coil and three-dimensional (3D) T1w gradient echo sequence with B1+ insensitive fat suppression (FS) were developed. T1w FS image quality was characterized through image uniformity and fat-water contrast measurements in 11 subjects. Signal-to-noise ratio (SNR) and flip angle maps were acquired to assess the coil performance. Bilateral contrast-enhanced and unilateral high resolution (0.6 mm isotropic, 6.5 min acquisition time) imaging highlighted the 7T SNR advantage. Reliable and effective FS and high image quality was observed in all subjects at 7T, indicating that the custom coil and pulse sequence were insensitive to high-field obstacles such as variable tissue loading. 7T and 3T image uniformity was similar (P=0.24), indicating adequate 7T B1+ uniformity. High 7T SNR and fat-water contrast enabled 0.6 mm isotropic imaging and visualization of a high level of fibroglandular tissue detail. 7T T1w FS bilateral breast imaging is feasible with a custom radiofrequency (RF) coil and pulse sequence. Similar image uniformity was achieved at 7T and 3T, despite different RF field behavior and variable coil-tissue interaction due to anatomic differences that might be expected to alter magnetic field patterns. Copyright © 2013 Wiley Periodicals, Inc.

  11. Breast MRI at 7 Tesla with a Bilateral Coil and Robust Fat Suppression

    PubMed Central

    Brown, Ryan; Storey, Pippa; Geppert, Christian; McGorty, KellyAnne; Leite, Ana Paula Klautau; Babb, James; Sodickson, Daniel K.; Wiggins, Graham C.; Moy, Linda

    2013-01-01

    Purpose To develop a bilateral coil and optimized fat suppressed T1-weighted sequence for 7T breast MRI. Materials and Methods A dual-solenoid coil and 3D T1w gradient echo sequence with B1+ insensitive fat suppression (FS) were developed for 7T. T1w FS image quality was characterized through image uniformity and fat/water contrast measurements in 11 subjects. Signal-to-noise ratio (SNR) and flip angle maps were acquired to assess the coil performance. Bilateral contrast-enhanced and unilateral high resolution (0.6 mm isotropic, 6.5 min acquisition time) imaging highlighted the 7 T SNR advantage. Results Reliable and effective FS and high image quality was observed in all subjects at 7T, indicating that the custom coil and pulse sequence were insensitive to high-field obstacles such as variable tissue loading. 7T and 3T T1w FS image uniformity was similar (P=0.24), indicating adequate 7T B1+ uniformity. High 7T SNR and fat/water contrast enabled 0.6 mm isotropic imaging and visualization of a high level of fibroglandular tissue detail. Conclusion 7T T1w FS bilateral breast imaging is feasible with a custom RF coil and pulse sequence. Similar image uniformity was achieved at 7T and 3T, despite different RF field behavior and variable coil-tissue interaction due to anatomic differences that might be expected to alter magnetic field patterns. PMID:24123517

  12. Natural selection of the major histocompatibility complex (Mhc) in Hawaiian honeycreepers (Drepanidinae)

    USGS Publications Warehouse

    Jarvi, S.I.; Tarr, C.L.; Mcintosh, C.E.; Atkinson, C.T.; Fleischer, R.C.

    2004-01-01

    The native Hawaiian honeycreepers represent a classic example of adaptive radiation and speciation, but currently face one of the highest extinction rates in the world. Although multiple factors have likely influenced the fate of Hawaiian birds, the relatively recent introduction of avian malaria is thought to be a major factor limiting honeycreeper distribution and abundance. We have initiated genetic analyses of class II β chain Mhc genes in four species of honeycreepers using methods that eliminate the possibility of sequencing mosaic variants formed by cloning heteroduplexed polymerase chain reaction products. Phylogenetic analyses group the honeycreeper Mhc sequences into two distinct clusters. Variation within one cluster is high, with dN > dS and levels of diversity similar to other studies of Mhc (B system) genes in birds. The second cluster is nearly invariant and includes sequences from honeycreepers (Fringillidae), a sparrow (Emberizidae) and a blackbird (Emberizidae). This highly conserved cluster appears reminiscent of the independently segregating Rfp-Y system of genes defined in chickens. The notion that balancing selection operates at the Mhc in the honeycreepers is supported by trans-species polymorphism and strikingly high dN/dS ratios at codons putatively involved in peptide interaction. Mitochondrial DNA control region sequences were invariant in the i'iwi, but were highly variable in the 'amakihi. By contrast, levels of variability of class II β chain Mhc sequence codons that are hypothesized to be directly involved in peptide interactions appear comparable between i'iwi and 'amakihi. In the i'iwi, natural selection may have maintained variation within the Mhc, even in the face of what appears to be a genetic bottleneck.

  13. Meta-analysis of RNA-Seq data across cohorts in a multi-season feed efficiency study of crossbred beef steers accounts for biological and technical variability within season

    USDA-ARS's Scientific Manuscript database

    High-throughput sequencing is often used for studies of the transcriptome, particularly for comparisons between experimental conditions. Due to sequencing costs, a limited number of biological replicates are typically considered in such experiments, leading to low detection power for differential ex...

  14. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs

    PubMed Central

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G.; Rigoutsos, Isidore

    2017-01-01

    Abstract Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. PMID:28108659

  15. Bi-exponential T2 analysis of healthy and diseased Achilles tendons: an in vivo preliminary magnetic resonance study and correlation with clinical score.

    PubMed

    Juras, Vladimir; Apprich, Sebastian; Szomolanyi, Pavol; Bieri, Oliver; Deligianni, Xeni; Trattnig, Siegfried

    2013-10-01

    To compare mono- and bi-exponential T2 analysis in healthy and degenerated Achilles tendons using a recently introduced magnetic resonance variable-echo-time sequence (vTE) for T2 mapping. Ten volunteers and ten patients were included in the study. A variable-echo-time sequence was used with 20 echo times. Images were post-processed with both techniques, mono- and bi-exponential [T2 m, short T2 component (T2 s) and long T2 component (T2 l)]. The number of mono- and bi-exponentially decaying pixels in each region of interest was expressed as a ratio (B/M). Patients were clinically assessed with the Achilles Tendon Rupture Score (ATRS), and these values were correlated with the T2 values. The means for both T2 m and T2 s were statistically significantly different between patients and volunteers; however, for T2 s, the P value was lower. In patients, the Pearson correlation coefficient between ATRS and T2 s was -0.816 (P = 0.007). The proposed variable-echo-time sequence can be successfully used as an alternative method to UTE sequences with some added benefits, such as a short imaging time along with relatively high resolution and minimised blurring artefacts, and minimised susceptibility artefacts and chemical shift artefacts. Bi-exponential T2 calculation is superior to mono-exponential in terms of statistical significance for the diagnosis of Achilles tendinopathy. • Magnetic resonance imaging offers new insight into healthy and diseased Achilles tendons • Bi-exponential T2 calculation in Achilles tendons is more beneficial than mono-exponential • A short T2 component correlates strongly with clinical score • Variable echo time sequences successfully used instead of ultrashort echo time sequences.
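
    The mono- versus bi-exponential comparison above rests on two signal models: a single decay, S(TE) = A exp(-TE/T2), and a sum of a short and a long component. The sketch below writes both down and fits the mono-exponential model by a log-linear least-squares line. All names, amplitudes, echo times and T2 values are invented for illustration; the study's actual fitting procedure is not described in the abstract.

```python
import math


def biexp_signal(te, a_short, t2_short, a_long, t2_long):
    """Bi-exponential decay: short plus long T2 component."""
    return a_short * math.exp(-te / t2_short) + a_long * math.exp(-te / t2_long)


def fit_monoexp_t2(echo_times, signals):
    """Fit ln(S) = ln(A) - TE/T2 by ordinary least squares; return T2."""
    n = len(echo_times)
    log_s = [math.log(s) for s in signals]
    mean_x = sum(echo_times) / n
    mean_y = sum(log_s) / n
    sxx = sum((x - mean_x) ** 2 for x in echo_times)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(echo_times, log_s))
    slope = sxy / sxx          # slope equals -1/T2 for a pure single decay
    return -1.0 / slope


# 20 hypothetical echo times (ms), mirroring the 20 echoes in the abstract.
tes = [0.5 * (i + 1) for i in range(20)]
# Hypothetical tissue: 70% short component (1 ms), 30% long component (20 ms).
signal = [biexp_signal(te, 0.7, 1.0, 0.3, 20.0) for te in tes]
# The single fitted T2 falls between the two true components.
print(fit_monoexp_t2(tes, signal))
```

    A single fitted T2 blends the two components, which is one reason separating the short component can be more informative than a mono-exponential fit.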

  16. Genetic variability in isolates of Chromobacterium violaceum from pulmonary secretion, water, and soil.

    PubMed

    Santini, A C; Magalhães, J T; Cascardo, J C M; Corrêa, R X

    2016-04-28

    Chromobacterium violaceum is a free-living Gram-negative bacillus usually found in the water and soil in tropical regions, which causes infections in humans. Chromobacteriosis is characterized by rapid dissemination and high mortality. The aim of this study was to detect the genetic variability among C. violaceum type strain ATCC 12472, and seven isolates from the environment and one from a pulmonary secretion from a chromobacteriosis patient from Ilhéus, Bahia. The molecular characterization of all samples was performed by polymerase chain reaction (PCR) sequencing and 16S rDNA analysis. Primers specific for two ATCC 12472 pathogenicity genes, hilA and yscD, as well as random amplified polymorphic DNA (RAPD), were used for PCR amplification and comparative sequencing of the products. For a more specific approach, the PCR products of 16S rDNA were digested with restriction enzymes. Seven of the samples, including type-strain ATCC 12472, were amplified by the hilA primers; these were subsequently sequenced. Gene yscD was amplified only in type-strain ATCC 12472. MspI and AluI digestion revealed 16S rDNA polymorphisms. These data allowed the generation of a dendrogram for each analysis. The isolates of C. violaceum have variability in random genomic regions demonstrated by RAPD. Also, these isolates have variability in pathogenicity genes, as demonstrated by sequencing and restriction enzyme digestion.

  17. DNA fingerprints in physical anthropology.

    PubMed

    Weiss, Mark L

    1989-01-01

    Hypervariable minisatellite DNA is a recently described class of nuclear sequences with no known biological function. The minisatellites form a subtype of restriction fragment length polymorphisms possessing several characteristics particularly intriguing to anthropologists interested in forensics, sociobiology, primate conservation, genetic variability, and molecular evolution. The sequences occupy at least five dozen loci scattered throughout the human genome. Unlike many polymorphisms, many of the loci have numerous alleles, each present at similar frequencies. Such a genetic structure produces exceptionally high levels of heterozygosity and thus provides a tool for the individualization of tissue samples. Additionally, as the alleles are inherited in a Mendelian fashion, the minisatellites provide a superb tool for the identification of paternity (or maternity). Unlike standard blood groups, levels of variability are so high in populations studied to date that parentage can be established by inclusion rather than exclusion. Homologous sequences are shown to exist in a variety of Old World primates. Visualization of genetic fingerprints in nonhumans may allow for determination of paternity where the pool of potential sires is available, while also providing information on levels of genetic variability. These capabilities will ultimately provide for better management of primate colonies. Used in concert with behavioral data, a number of sociobiological questions will also become more amenable to investigation. Copyright © 1989 Wiley-Liss, Inc., A Wiley Company.

  18. Structural and functional partitioning of bread wheat chromosome 3B.

    PubMed

    Choulet, Frédéric; Alberti, Adriana; Theil, Sébastien; Glover, Natasha; Barbe, Valérie; Daron, Josquin; Pingault, Lise; Sourdille, Pierre; Couloux, Arnaud; Paux, Etienne; Leroy, Philippe; Mangenot, Sophie; Guilhot, Nicolas; Le Gouis, Jacques; Balfourier, Francois; Alaux, Michael; Jamilloux, Véronique; Poulain, Julie; Durand, Céline; Bellec, Arnaud; Gaspin, Christine; Safar, Jan; Dolezel, Jaroslav; Rogers, Jane; Vandepoele, Klaas; Aury, Jean-Marc; Mayer, Klaus; Berges, Hélène; Quesneville, Hadi; Wincker, Patrick; Feuillet, Catherine

    2014-07-18

    We produced a reference sequence of the 1-gigabase chromosome 3B of hexaploid bread wheat. By sequencing 8452 bacterial artificial chromosomes in pools, we assembled a sequence of 774 megabases carrying 5326 protein-coding genes, 1938 pseudogenes, and 85% of transposable elements. The distribution of structural and functional features along the chromosome revealed partitioning correlated with meiotic recombination. Comparative analyses indicated high wheat-specific inter- and intrachromosomal gene duplication activities that are potential sources of variability for adaption. In addition to providing a better understanding of the organization, function, and evolution of a large and polyploid genome, the availability of a high-quality sequence anchored to genetic maps will accelerate the identification of genes underlying important agronomic traits. Copyright © 2014, American Association for the Advancement of Science.

  19. A comparative analysis of exome capture.

    PubMed

    Parla, Jennifer S; Iossifov, Ivan; Grabill, Ian; Spector, Mona S; Kramer, Melissa; McCombie, W Richard

    2011-09-29

    Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data. Each exome kit performed well at capturing the targets they were designed to capture, which mainly corresponds to the consensus coding sequences (CCDS) annotations of the human genome. In addition, based on their respective targets, each capture kit coupled with high coverage Illumina sequencing produced highly accurate nucleotide calls. However, other databases, such as the Reference Sequence collection (RefSeq), define the exome more broadly, and so not surprisingly, the exome kits did not capture these additional regions. Commercial exome capture kits provide a very efficient way to sequence select areas of the genome at very high accuracy. Here we provide the data to help guide critical analyses of sequencing data derived from these products.

  20. Pyrosequencing the Canine Faecal Microbiota: Breadth and Depth of Biodiversity

    PubMed Central

    Hand, Daniel; Wallis, Corrin; Colyer, Alison; Penn, Charles W.

    2013-01-01

    Mammalian intestinal microbiota remain poorly understood despite decades of interest and investigation by culture-based and other long-established methodologies. Using high-throughput sequencing technology we now report a detailed analysis of canine faecal microbiota. The study group of animals comprised eleven healthy adult miniature Schnauzer dogs of mixed sex and age, some closely related and all housed in kennel and pen accommodation on the same premises with similar feeding and exercise regimes. DNA was extracted from faecal specimens and subjected to PCR amplification of 16S rDNA, followed by sequencing of the 5′ region that included variable regions V1 and V2. Barcoded amplicons were sequenced by Roche-454 FLX high-throughput pyrosequencing. Sequences were assigned to taxa using the Ribosomal Database Project Bayesian classifier and revealed dominance of Fusobacterium and Bacteroidetes phyla. Differences between animals in the proportions of different taxa, among 10,000 reads per animal, were clear and not supportive of the concept of a “core microbiota”. Despite this variability in prominent genera, littermates were shown to have a more similar faecal microbial composition than unrelated dogs. Diversity of the microbiota was also assessed by assignment of sequence reads into operational taxonomic units (OTUs) at the level of 97% sequence identity. The OTU data were then subjected to rarefaction analysis and determination of Chao1 richness estimates. The data indicated that faecal microbiota comprised possibly as many as 500 to 1500 OTUs. PMID:23382835
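
    The Chao1 richness estimate mentioned above can be illustrated with a minimal sketch. It assumes the classic form S_obs + F1^2 / (2 * F2), where F1 and F2 are the numbers of OTUs seen exactly once and exactly twice, with the bias-corrected form as a fallback when there are no doubletons. The abstract does not say which variant the authors used, and the function name and toy counts below are hypothetical.

```python
def chao1(otu_counts):
    """Classic Chao1 richness estimate from per-OTU read counts (illustrative)."""
    observed = [c for c in otu_counts if c > 0]
    s_obs = len(observed)                        # OTUs actually observed
    f1 = sum(1 for c in observed if c == 1)      # singletons
    f2 = sum(1 for c in observed if c == 2)      # doubletons
    if f2 == 0:
        # Bias-corrected fallback, avoids division by zero
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + (f1 * f1) / (2.0 * f2)


# Toy read counts for eight OTUs (made-up numbers, not study data)
toy_counts = [10, 5, 2, 2, 1, 1, 1, 1]
print(chao1(toy_counts))
```

    Many singletons relative to doubletons push the estimate well above the observed OTU count, which is one reason a richness range rather than a single figure is often reported.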

  1. Fast T1 and T2 mapping methods: the zoomed U-FLARE sequence compared with EPI and snapshot-FLASH for abdominal imaging at 11.7 Tesla.

    PubMed

    Pastor, Géraldine; Jiménez-González, María; Plaza-García, Sandra; Beraza, Marta; Reese, Torsten

    2017-06-01

    A newly adapted zoomed ultrafast low-angle RARE (U-FLARE) sequence is described for abdominal imaging applications at 11.7 Tesla and compared with the standard echo-planar imaging (EPI) and snapshot fast low angle shot (FLASH) methods. Ultrafast EPI and snapshot-FLASH protocols were evaluated to determine relaxation times in phantoms and in the mouse kidney in vivo. Owing to their apparent shortcomings, imaging artefacts, signal-to-noise ratio (SNR), and variability in the determination of relaxation times, these methods are compared with the newly implemented zoomed U-FLARE sequence. Snapshot-FLASH has a lower SNR when compared with the zoomed U-FLARE sequence and EPI. The variability in the measurement of relaxation times is higher in the Look-Locker sequences than in inversion recovery experiments. Respectively, the average T1 and T2 values at 11.7 Tesla are as follows: kidney cortex, 1810 and 29 ms; kidney medulla, 2100 and 25 ms; subcutaneous tumour, 2365 and 28 ms. This study demonstrates that the zoomed U-FLARE sequence yields single-shot single-slice images with good anatomical resolution and high SNR at 11.7 Tesla. Thus, it offers a viable alternative to standard protocols for mapping very fast parameters, such as T1 and T2, or dynamic processes in vivo at high field.

  2. Case Study Projects for College Mathematics Courses Based on a Particular Function of Two Variables

    ERIC Educational Resources Information Center

    Shi, Y.

    2007-01-01

    Based on a sequence of number pairs, a recent paper (Mauch, E. and Shi, Y., 2005, Using a sequence of number pairs as an example in teaching mathematics, "Mathematics and Computer Education," 39(3), 198-205) presented some interesting examples that can be used in teaching high school and college mathematics classes such as algebra, geometry,…

  3. Simian immunodeficiency viruses from African green monkeys display unusual genetic diversity.

    PubMed Central

    Johnson, P R; Fomsgaard, A; Allan, J; Gravell, M; London, W T; Olmsted, R A; Hirsch, V M

    1990-01-01

    African green monkeys are asymptomatic carriers of simian immunodeficiency viruses (SIV), commonly called SIVagm. As many as 50% of African green monkeys in the wild may be SIV seropositive. This high seroprevalence rate and the potential for genetic variation of lentiviruses suggested to us that African green monkeys may harbor widely differing genotypes of SIVagm. To investigate this hypothesis, we determined the entire nucleotide sequence of an infectious proviral molecular clone of SIVagm (155-4) and partial sequences (long terminal repeat and Gag) of three other distinct SIVagm isolates (90, gri-1, and ver-1). Comparisons among the SIVagm isolates revealed extreme diversity at the nucleotide and amino acid levels. Long terminal repeat nucleotide sequences varied up to 35% and Gag protein sequences varied up to 30%. The variability among SIVagm isolates exceeded the variability among any other group of primate lentiviruses. Our data suggest that SIVagm has been in the African green monkey population for a long time and may be the oldest primate lentivirus group in existence. PMID:2304139

  4. Thought Speed, Mood, and the Experience of Mental Motion.

    PubMed

    Pronin, Emily; Jacobs, Elana

    2008-11-01

    This article presents a theoretical account relating thought speed to mood and psychological experience. Thought sequences that occur at a fast speed generally induce more positive affect than do those that occur slowly. Thought speed constitutes one aspect of mental motion. Another aspect involves thought variability, or the degree to which thoughts in a sequence either vary widely from or revolve closely around a theme. Thought sequences possessing more motion (occurring fast and varying widely) generally produce more positive affect than do sequences possessing little motion (occurring slowly and repetitively). When speed and variability oppose each other, such that one is low and the other is high, predictable psychological states also emerge. For example, whereas slow, repetitive thinking can prompt dejection, fast, repetitive thinking can prompt anxiety. This distinction is related to the fact that fast thinking involves greater actual and felt energy than slow thinking does. Effects of mental motion occur independent of the specific content of thought. Their consequences for mood and energy hold psychotherapeutic relevance. © 2008 Association for Psychological Science.

  5. Sequence stratigraphy and high-frequency cycles: New aspects for a quantitative evaluation of the Gulf of Suez basin, Egypt

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nio, S.D.; Yang, C.S.; Tewfik, N.

    1993-09-01

    A new development in the application of sequence stratigraphic concepts in marine as well as continental basins is the recognition of high-frequency cyclic patterns in rock successions in the subsurface. Studies of six wells from the northern, central, and southern parts of the Gulf of Suez show the presence of well-preserved, high-frequency cycles with periodicities similar to the orbitally forced Milankovitch parameters. Subsurface rock successions, third-order sequences, and high-frequency cycles were compared with outcrops. After establishing the biostratigraphic framework for the above-mentioned wells, a sequence analysis was performed. Sequence boundaries and maximum flooding positions in each well were calibrated with the occurrences and evaluation of the high-frequency cycles. It became obvious that there is an intimate relationship between these high-frequency Milankovitch cycles and sequence organization. In addition, a close relationship can be observed in the subsurface as well as in outcrops between high-frequency climatic changes (connected to the Milankovitch cycles) and (litho)facies variability. Quantitative evaluations of each sequence and/or systems tract can be computed with the International Geoservices' cyclicity analysis tool (MILABAR). The results are summarized in a well composite chart, rate (NAR), and ratio of preserved time. In correlations between the wells, an accuracy of 500-100 Ka can be obtained. The quantitative evaluation of the sequence and high-frequency cycle analysis gave some new aspects concerning the (litho)facies and geodynamic development during the pre- as well as the synrift stages of the Gulf of Suez Basin.

  6. Assessment of sequence variability in a p23 gene region within and among three genotypes of the Theileria orientalis complex from south-eastern Australia.

    PubMed

    Perera, Piyumali K; Gasser, Robin B; Jabbar, Abdul

    2015-03-01

    Oriental theileriosis is a tick-borne, protozoan disease of cattle caused by one or more genotypes of Theileria orientalis complex. In this study, we assessed sequence variability in a region of the 23kDa piroplasm membrane protein (p23) gene within and among three T. orientalis genotypes (designated buffeli, chitose and ikeda) in south-eastern Australia. Genomic DNA (n=100) was extracted from blood of infected cattle from various locations endemic for oriental theileriosis and tested by polymerase chain reaction (PCR)-coupled mutation scanning (single-strand conformation polymorphism (SSCP)) and targeted sequencing analysis. Eight distinct sequences represented all DNA samples, and three genotypes were found: buffeli (n=3), chitose (3) and ikeda (2). Nucleotide pairwise comparisons among these eight sequences revealed considerably higher variability among the genotypes (6.6-11.7%) than within them (0-1.9%), indicating that the p23 gene region allows the accurate identification of T. orientalis genotypes. In the future, we will combine this gene with other molecular markers to study the genetic structure of T. orientalis populations in Australasia, which will pave the way to establish a highly sensitive and specific PCR-based assay for genotypic diagnosis of infection and for assessing levels of parasitaemia in cattle. Copyright © 2014 Elsevier GmbH. All rights reserved.
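
    The nucleotide pairwise comparisons described above can be sketched as follows. This is only an illustration, assuming pre-aligned sequences of equal length and assuming that columns with a gap in either sequence are skipped; the abstract does not describe the software or the gap handling used, and the sequence names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_difference(seq_a, seq_b):
    """Percentage of differing positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def pairwise_table(aligned):
    """Percent difference for every unordered pair of named sequences."""
    return {
        (n1, n2): percent_difference(aligned[n1], aligned[n2])
        for n1, n2 in combinations(sorted(aligned), 2)
    }


# Toy alignment (made-up sequences, not p23 data)
toy = {
    "ikeda_1": "ACGTACGTAC",
    "ikeda_2": "ACGTACGTAT",
    "chitose_1": "ACCTAGGTTC",
}
for pair, diff in pairwise_table(toy).items():
    print(pair, diff)
```

    Grouping the resulting values by whether the two sequences share a genotype label gives the within-genotype and among-genotype ranges that the abstract compares.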

  7. Evaluation of composition and individual variability of rumen microbiota in yaks by 16S rRNA high-throughput sequencing technology.

    PubMed

    Guo, Wei; Li, Ying; Wang, Lizhi; Wang, Jiwen; Xu, Qin; Yan, Tianhai; Xue, Bai

    2015-08-01

    The Yak (Bos grunniens) is a unique species of ruminant animals that is important to agriculture of the Tibetan plateau, and has a complex intestinal microbial community. The objective of the present study was to characterize the composition and individual variability of microbiota in the rumen of yaks using 16S rRNA gene high-throughput sequencing technique. Rumen samples used in the present study were obtained from grazing adult male yaks (n = 6) in a commercial farm in Ganzi Autonomous Prefecture of Sichuan Province, China. Universal prokaryote primers were used to target the V4-V5 hypervariable region of 16S rRNA gene. A total of 7200 operational taxonomic units (OTUs) were obtained after sequence filtering and chimera removal. Within these OTUs, 0.56% belonged to Archaea (40 OTUs), 7.19% to unassigned species (518 OTUs), and the remaining OTUs (6642) in all samples were of bacterial origin. When examining the community structure of bacteria, we identified 23 phyla within 159 families after taxonomic summarization. Bacteroidetes and Firmicutes were the predominant phyla accounting for 39.68% (SD = 0.05) and 45.90% (SD = 0.06), respectively. Moreover, 3764 OTUs were identified as shared OTUs (i.e. represented in all yaks) and belonged to 35 genera, exhibiting highly variable abundance across individual samples. Phylogenetic placement of these genera across individual samples was examined. In addition, we evaluated the distance among the 6 rumen samples by adding taxon phylogeny using UniFrac, representing 24.1% of average distance. In summary, the current study reveals a shared rumen microbiome and phylogenetic lineage and presents novel information on composition and individual variability of the bacterial community in the rumen of yaks. Copyright © 2015. Published by Elsevier Ltd.
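
    As a quick consistency check, the OTU counts quoted above can be converted back into percentages. The dictionary keys are labels chosen here for illustration; the counts are taken from the abstract.

```python
# OTU counts as reported in the abstract
otu_counts = {"Archaea": 40, "unassigned": 518, "Bacteria": 6642}

total = sum(otu_counts.values())

# Share of each group as a percentage of all OTUs, rounded to two decimals
shares = {group: round(100 * n / total, 2) for group, n in otu_counts.items()}

print(total)
print(shares)
```

    The three counts sum to the reported 7200 OTUs, and the computed Archaea and unassigned shares match the 0.56% and 7.19% given in the text.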

  8. An RNAi in silico approach to find an optimal shRNA cocktail against HIV-1

    PubMed Central

    2010-01-01

    Background HIV-1 can be inhibited by RNA interference in vitro through the expression of short hairpin RNAs (shRNAs) that target conserved genome sequences. In silico shRNA design for HIV has lacked a detailed study of virus variability constituting a possible breaking point in a clinical setting. We designed shRNAs against HIV-1 considering the variability observed in naïve and drug-resistant isolates available at public databases. Methods A Bioperl-based algorithm was developed to automatically scan multiple sequence alignments of HIV, while evaluating the possibility of identifying dominant and subdominant viral variants that could be used as efficient silencing molecules. Student t-test and Bonferroni Dunn correction test were used to assess statistical significance of our findings. Results Our in silico approach identified the most common viral variants within highly conserved genome regions, with a calculated free energy of ≥ -6.6 kcal/mol. This is crucial for strand loading to RISC complex and for a predicted silencing efficiency score, which could be used in combination for achieving over 90% silencing. Resistant and naïve isolate variability revealed that the most frequent shRNA per region targets a maximum of 85% of viral sequences. Adding more divergent sequences maintained this percentage. Specific sequence features that have been found to be related with higher silencing efficiency were hardly accomplished in conserved regions, even when lower entropy values correlated with better scores. We identified a conserved region among most HIV-1 genomes, which meets as many sequence features for efficient silencing. Conclusions HIV-1 variability is an obstacle to achieving absolute silencing using shRNAs designed against a consensus sequence, mainly because there are many functional viral variants. Our shRNA cocktail could be truly effective at silencing dominant and subdominant naïve viral variants. 
    Additionally, resistant isolates might be targeted under specific antiretroviral selective pressure, but in both cases these should be tested exhaustively prior to clinical use. PMID:21172023
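
    The entropy values mentioned in this abstract are commonly computed per alignment column as Shannon entropy. The sketch below is a hedged Python illustration of that idea, not the authors' Bioperl algorithm; the function names, the choice to ignore gaps, and the toy alignment are assumptions.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # sum of p * log2(1/p); 0.0 for a fully conserved column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Per-column entropy for a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy("".join(col)) for col in zip(*aligned_seqs)]


# Toy alignment (made-up sequences, not HIV-1 data)
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTC"]
print(alignment_entropy(toy_alignment))
```

    Windows of consecutive low-entropy columns would be candidate conserved target regions; any threshold for "low" would have to be chosen by the user, since the abstract does not give one.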

  9. Remarkable sequence conservation of the last intron in the PKD1 gene.

    PubMed

    Rodova, Marianna; Islam, M Rafiq; Peterson, Kenneth R; Calvet, James P

    2003-10-01

    The last intron of the PKD1 gene (intron 45) was found to have exceptionally high sequence conservation across four mammalian species: human, mouse, rat, and dog. This conservation did not extend to the comparable intron in pufferfish. Pairwise comparisons for intron 45 showed 91% identity (human vs. dog) to 100% identity (mouse vs. rat) for an average for all four species of 94% identity. In contrast, introns 43 and 44 of the PKD1 gene had average pairwise identities of 57% and 54%, and exons 43, 44, and 45 and the coding region of exon 46 had average pairwise identities of 80%, 84%, 82%, and 80%. Intron 45 is 90 to 95 bp in length, with the major region of sequence divergence being in a central 4-bp to 9-bp variable region. RNA secondary structure analysis of intron 45 predicts a branching stem-loop structure in which the central variable region lies in one loop and the putative branch point sequence lies in another loop, suggesting that the intron adopts a specific stem-loop structure that may be important for its removal. Although intron 45 appears to conform to the class of small, G-triplet-containing introns that are spliced by a mechanism utilizing intron definition, its high sequence conservation may be a reflection of constraints imposed by a unique mechanism that coordinates splicing of this last PKD1 intron with polyadenylation.

  10. Comparison of taxon-specific versus general locus sets for targeted sequence capture in plant phylogenomics.

    PubMed

    Chau, John H; Rahfeldt, Wolfgang A; Olmstead, Richard G

    2018-03-01

    Targeted sequence capture can be used to efficiently gather sequence data for large numbers of loci, such as single-copy nuclear loci. Most published studies in plants have used taxon-specific locus sets developed individually for a clade using multiple genomic and transcriptomic resources. General locus sets can also be developed from loci that have been identified as single-copy and have orthologs in large clades of plants. We identify and compare a taxon-specific locus set and three general locus sets (conserved ortholog set [COSII], shared single-copy nuclear [APVO SSC] genes, and pentatricopeptide repeat [PPR] genes) for targeted sequence capture in Buddleja (Scrophulariaceae) and outgroups. We evaluate their performance in terms of assembly success, sequence variability, and resolution and support of inferred phylogenetic trees. The taxon-specific locus set had the most target loci. Assembly success was high for all locus sets in Buddleja samples. For outgroups, general locus sets had greater assembly success. Taxon-specific and PPR loci had the highest average variability. The taxon-specific data set produced the best-supported tree, but all data sets showed improved resolution over previous non-sequence capture data sets. General locus sets can be a useful source of sequence capture targets, especially if multiple genomic resources are not available for a taxon.

  11. [Study on ITS sequences of Aconitum vilmorinianum and its medicinal adulterant].

    PubMed

    Zhang, Xiao-nan; Du, Chun-hua; Fu, De-huan; Gao, Li; Zhou, Pei-jun; Wang, Li

    2012-09-01

    To analyze and compare the ITS sequences of Aconitum vilmorinianum and its medicinal adulterant Aconitum austroyunnanense. Total genomic DNA was extracted from sample materials by an improved CTAB method; ITS sequences were amplified by PCR, directly sequenced, and analyzed using the software DNAStar, ClustalX1.81 and MEGA 4.0. In the ITS1 sequences, 299 consistent sites, 19 variable sites and 13 informative sites were found; in the 5.8S sequences, 162 consistent sites, 2 variable sites and 1 informative site; and in the ITS2 sequences, 217 consistent sites, 3 variable sites and 1 informative site. A comparison of the ITS sequence data matrix showed that base transitions and transversions were absent only in the 5.8S sequences, that 2 transitions and 1 transversion occurred in the ITS1 sequences, and that only 1 transversion occurred in the ITS2 sequences. By analyzing the ITS sequence data matrix from 2 populations of Aconitum vilmorinianum and 3 populations of Aconitum austroyunnanense, we found a stable informative site at the 596th base in the ITS2 sequences: in all samples of Aconitum vilmorinianum the base was C, and in all samples of Aconitum austroyunnanense the base was A. Aconitum vilmorinianum and Aconitum austroyunnanense can be identified by the characters of their ITS sequences, and there are more variable sites in the ITS1 sequences than in the ITS2 sequences.
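
    The site categories counted above can be illustrated with a short sketch. It assumes that "consistent" means a column with a single base, "variable" means a column with two or more bases, and "informative" means parsimony-informative (at least two bases that each occur in at least two sequences), which is the usual definition in MEGA; the abstract itself does not define the terms. The function name and toy alignment are hypothetical.

```python
from collections import Counter


def classify_sites(aligned_seqs):
    """Count (consistent, variable, informative) columns in an alignment."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consistent = variable = informative = 0
    for col in zip(*aligned_seqs):
        counts = Counter(c.upper() for c in col if c != "-")  # gaps ignored
        if len(counts) <= 1:
            consistent += 1
            continue
        variable += 1
        # parsimony-informative: at least two states seen two or more times
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return consistent, variable, informative


# Toy alignment (made-up sequences, not Aconitum ITS data)
toy_its = ["ACGTA", "ACGTA", "ATGAA", "ATGAC"]
print(classify_sites(toy_its))
```

    In this scheme informative sites are a subset of variable sites, which is consistent with the way the abstract reports, for example, 19 variable and 13 informative sites for ITS1.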

  12. Sequence and Secondary Structure of the Mitochondrial Small-Subunit rRNA V4, V6, and V9 Domains Reveal Highly Species-Specific Variations within the Genus Agrocybe

    PubMed Central

    Gonzalez, Patrice; Labarère, Jacques

    1998-01-01

    A comparative study of variable domains V4, V6, and V9 of the mitochondrial small-subunit (SSU) rRNA was carried out with the genus Agrocybe by PCR amplification of 42 wild isolates belonging to 10 species, Agrocybe aegerita, Agrocybe dura, Agrocybe chaxingu, Agrocybe erebia, Agrocybe firma, Agrocybe praecox, Agrocybe paludosa, Agrocybe pediades, Agrocybe alnetorum, and Agrocybe vervacti. Sequencing of the PCR products showed that the three domains in the isolates belonging to the same species were the same length and had the same sequence, while variations were found among the 10 species. Alignment of the sequences showed that nucleotide motifs encountered in the smallest sequence of each variable domain were also found in the largest sequence, indicating that the sequences evolved by insertion-deletion events. Determination of the secondary structure of each domain revealed that the insertion-deletion events commonly occurred in regions not directly involved in the secondary structure (i.e., the loops). Moreover, conserved sequences ranging from 4 to 25 nucleotides long were found at the beginning and end of each domain and could constitute genus-specific sequences. Comparisons of the V4, V6, and V9 secondary structures resulted in identification of the following four groups: (i) group I, which was characterized by the presence of additional P23-1 and P23-3 helices in the V4 domain and the lack of the P49-1 helix in V9 and included A. aegerita, A. chaxingu, and A. erebia; (ii) group II, which had the P23-3 helix in V4 and the P49-1 helix in V9 and included A. pediades; (iii) group III, which did not have additional helices in V4, had the P49-1 helix in V9 and included A. paludosa, A. firma, A. alnetorum, and A. praecox; and (iv) group IV, which lacked both the V4 additional helices and the P49-1 helix in V9 and included A. vervacti and A. dura. This grouping of species was supported by the structure of a consensus tree based on the variable domain sequences. 
    The conservation of the sequences of the V4, V6, and V9 domains of the mitochondrial SSU rRNA within species and the high degree of interspecific variation found in the Agrocybe species studied open the way for these sequences to be used as specific molecular markers of the Basidiomycota. PMID:9797259

  13. Sequence and secondary structure of the mitochondrial small-subunit rRNA V4, V6, and V9 domains reveal highly species-specific variations within the genus Agrocybe.

    PubMed

    Gonzalez, P; Labarère, J

    1998-11-01

    A comparative study of variable domains V4, V6, and V9 of the mitochondrial small-subunit (SSU) rRNA was carried out with the genus Agrocybe by PCR amplification of 42 wild isolates belonging to 10 species, Agrocybe aegerita, Agrocybe dura, Agrocybe chaxingu, Agrocybe erebia, Agrocybe firma, Agrocybe praecox, Agrocybe paludosa, Agrocybe pediades, Agrocybe alnetorum, and Agrocybe vervacti. Sequencing of the PCR products showed that the three domains in the isolates belonging to the same species were the same length and had the same sequence, while variations were found among the 10 species. Alignment of the sequences showed that nucleotide motifs encountered in the smallest sequence of each variable domain were also found in the largest sequence, indicating that the sequences evolved by insertion-deletion events. Determination of the secondary structure of each domain revealed that the insertion-deletion events commonly occurred in regions not directly involved in the secondary structure (i.e., the loops). Moreover, conserved sequences ranging from 4 to 25 nucleotides long were found at the beginning and end of each domain and could constitute genus-specific sequences. Comparisons of the V4, V6, and V9 secondary structures resulted in identification of the following four groups: (i) group I, which was characterized by the presence of additional P23-1 and P23-3 helices in the V4 domain and the lack of the P49-1 helix in V9 and included A. aegerita, A. chaxingu, and A. erebia; (ii) group II, which had the P23-3 helix in V4 and the P49-1 helix in V9 and included A. pediades; (iii) group III, which did not have additional helices in V4, had the P49-1 helix in V9 and included A. paludosa, A. firma, A. alnetorum, and A. praecox; and (iv) group IV, which lacked both the V4 additional helices and the P49-1 helix in V9 and included A. vervacti and A. dura. This grouping of species was supported by the structure of a consensus tree based on the variable domain sequences. 
    The conservation of the sequences of the V4, V6, and V9 domains of the mitochondrial SSU rRNA within species and the high degree of interspecific variation found in the Agrocybe species studied open the way for these sequences to be used as specific molecular markers of the Basidiomycota.

  14. YAMAT-seq: an efficient method for high-throughput sequencing of mature transfer RNAs.

    PubMed

    Shigematsu, Megumi; Honda, Shozo; Loher, Phillipe; Telonis, Aristeidis G; Rigoutsos, Isidore; Kirino, Yohei

    2017-05-19

    Besides translation, transfer RNAs (tRNAs) play many non-canonical roles in various biological pathways and exhibit highly variable expression profiles. To unravel the emerging complexities of tRNA biology and molecular mechanisms underlying them, an efficient tRNA sequencing method is required. However, the rigid structure of tRNA has been presenting a challenge to the development of such methods. We report the development of Y-shaped Adapter-ligated MAture TRNA sequencing (YAMAT-seq), an efficient and convenient method for high-throughput sequencing of mature tRNAs. YAMAT-seq circumvents the issue of inefficient adapter ligation, a characteristic of conventional RNA sequencing methods for mature tRNAs, by employing the efficient and specific ligation of Y-shaped adapter to mature tRNAs using T4 RNA Ligase 2. Subsequent cDNA amplification and next-generation sequencing successfully yield numerous mature tRNA sequences. YAMAT-seq has high specificity for mature tRNAs and high sensitivity to detect most isoacceptors from minute amount of total RNA. Moreover, YAMAT-seq shows quantitative capability to estimate expression levels of mature tRNAs, and has high reproducibility and broad applicability for various cell lines. YAMAT-seq thus provides high-throughput technique for identifying tRNA profiles and their regulations in various transcriptomes, which could play important regulatory roles in translation and other biological processes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Characterization of Satellite DNA Sequences from the Commercially Important Marine Rotifers Brachionus rotundiformis and Brachionus plicatilis.

    PubMed

    Boehm; Gibson; Lubzens

    2000-01-01

    This study was initiated to search for species-specific and strain-specific satellite DNA sequences for which oligonucleotide primers could be designed to differentiate between various commercially important strains of the marine monogonont rotifers Brachionus rotundiformis and Brachionus plicatilis. Two unrelated, highly reiterated satellite sequences were cloned and characterized. The eight sequenced monomers from B. rotundiformis and six from B. plicatilis had low intrarepeat variability and were similar in their overall lengths, A + T compositions, and high degrees of repeated motif substructure. However, hybridizations to 19 representative strains, sequence characterizations, and GenBank searches indicated that these two satellites are morphotype-specific and population-specific, respectively, and share little homology to each other or to other characterized sequences in the database. Primer pairs designed for the B. rotundiformis satellite confirmed hybridization specificities on polymerase chain reaction and could serve as a useful molecular diagnostic tool to identify strains belonging to the SS morphotype, which are gaining widespread usage as first feeds for marine fish in commercial production.

  16. Analysis of hepatitis B virus preS1 variability and prevalence of the rs2296651 polymorphism in a Spanish population

    PubMed Central

    Casillas, Rosario; Tabernero, David; Gregori, Josep; Belmonte, Irene; Cortese, Maria Francesca; González, Carolina; Riveiro-Barciela, Mar; López, Rosa Maria; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco

    2018-01-01

    AIM To determine the variability/conservation of the domain of hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. METHODS Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain, (aa 2-48 in viral genotype D) and a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. RESULTS The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (limited finding, only 2 genotype E included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. 
    The rs2296651 polymorphism was not detected in any participant. CONCLUSION In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and essential amino acids related to the NTCP interaction, and the prevalence of rs2296651 was low/null. PMID:29456407

  17. Development of a real-time PCR method for the differential detection and quantification of four solanaceae in GMO analysis: potato (Solanum tuberosum), tomato (Solanum lycopersicum), eggplant (Solanum melongena), and pepper (Capsicum annuum).

    PubMed

    Chaouachi, Maher; El Malki, Redouane; Berard, Aurélie; Romaniuk, Marcel; Laval, Valérie; Brunel, Dominique; Bertheau, Yves

    2008-03-26

    The labeling of products containing genetically modified organisms (GMO) is linked to their quantification since a threshold for the presence of fortuitous GMOs in food has been established. This threshold is calculated from a combination of two absolute quantification values: one for the specific GMO target and the second for an endogenous reference gene specific to the taxon. Thus, the development of reliable methods to quantify GMOs using endogenous reference genes in complex matrixes such as food and feed is needed. Plant identification can be difficult in the case of closely related taxa, which moreover are subject to introgression events. Based on the homology of beta-fructosidase sequences obtained from public databases, two couples of consensus primers were designed for the detection, quantification, and differentiation of four Solanaceae: potato (Solanum tuberosum), tomato (Solanum lycopersicum), pepper (Capsicum annuum), and eggplant (Solanum melongena). Sequence variability was studied first using lines and cultivars (intraspecies sequence variability), then using taxa involved in gene introgressions, and finally, using taxonomically close taxa (interspecies sequence variability). This study allowed us to design four highly specific TaqMan-MGB probes. A duplex real time PCR assay was developed for simultaneous quantification of tomato and potato. For eggplant and pepper, only simplex real time PCR tests were developed. The results demonstrated the high specificity and sensitivity of the assays. We therefore conclude that beta-fructosidase can be used as an endogenous reference gene for GMO analysis.

  18. Genetic structure of Plasmodium vivax using the merozoite surface protein 1 icb5-6 fragment reveals new hybrid haplotypes in southern Mexico

    PubMed Central

    2014-01-01

    Background Plasmodium vivax is a protozoan parasite with an extensive worldwide distribution, being highly prevalent in Asia as well as in Mesoamerica and South America. In southern Mexico, P. vivax transmission has been endemic and recent studies suggest that these parasites have unique biological and genetic features. The msp1 gene has shown a high rate of nucleotide substitutions, deletions, and insertions, and its mosaic structure reveals frequent recombination events, possibly between highly divergent parasite isolates. Methods The nucleotide sequence variation in the polymorphic icb5-6 fragment of the msp1 gene of Mexican and worldwide isolates was analysed. To understand how genotype diversity arises, disperses and persists in Mexico, the genetic structure and genealogical relationships of local isolates were examined. To identify new sequence hybrids and their evolutionary relationships with other P. vivax isolates circulating worldwide, two haplotype networks were constructed to test whether two portions of icb5-6 have different evolutionary histories. Results Twelve new msp1 icb5-6 haplotypes of P. vivax from Mexico were identified. These nucleotide sequences show a mosaic structure comprising three partially conserved and two variable subfragments and resulted in five different sequence types. The variable subfragment sV1 has undergone recombination events that resulted in hybrid sequences, and the haplotype network allocated the Mexican haplotypes to three lineages, corresponding to the Sal I and Belem types and another, more divergent group. In contrast, the network built from the icb5-6 fragment without sV1 revealed that the Mexican haplotypes belong to two separate lineages, neither of which is closely related to Sal I or Belem sequences. Conclusions These results suggest that the new hybrid haplotypes from southern Mexico were the result of at least three different recombination events. These rearrangements likely resulted from recombination between haplotypes of highly divergent lineages that are frequently distributed in South America and Asia and diversified rapidly. PMID:24472213

  19. Self-Exciting Point Process Modeling of Conversation Event Sequences

    NASA Astrophysics Data System (ADS)

    Masuda, Naoki; Takaguchi, Taro; Sato, Nobuo; Yano, Kazuo

    Self-exciting processes of Hawkes type have been used to model various phenomena including earthquakes, neural activities, and views of online videos. Studies of temporal networks have revealed that sequences of social interevent times for individuals are highly bursty. We examine some basic properties of event sequences generated by the Hawkes self-exciting process to show that it generates bursty interevent times for a wide parameter range. Then, we fit the model to the data of conversation sequences recorded in company offices in Japan. In this way, we can estimate the relative magnitudes of the self-excitation, its temporal decay, and the base event rate independent of the self-excitation. These variables depend strongly on the individual. We also point out that the Hawkes model has an important limitation: the correlation in the interevent times and the burstiness cannot be modulated independently.
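
    As a rough illustration of the kind of model described in this record, the sketch below simulates a Hawkes process by Ogata-style thinning and measures how bursty the resulting interevent times are. The exponential memory kernel, the parameter names (mu, alpha, beta), and the burstiness coefficient B = (sigma - m) / (sigma + m) are assumptions chosen for illustration; the abstract does not state which kernel or burstiness measure the authors used.

```python
import math
import random
import statistics


def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate event times on (0, t_max] for the assumed intensity
    lambda(t) = mu + alpha * sum_i exp(-beta * (t - t_i)),
    using thinning. Requires mu > 0; alpha / beta < 1 keeps it stationary."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # current value of the self-exciting term
    while True:
        # Between events the intensity only decays, so its current
        # value is a valid upper bound until the next accepted event.
        bound = mu + excitation
        wait = rng.expovariate(bound)
        t += wait
        if t > t_max:
            return events
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability lambda(t) / bound.
        if rng.random() * bound <= mu + excitation:
            events.append(t)
            excitation += alpha  # each event raises the intensity


def burstiness(events):
    """Burstiness of interevent times: about 0 for Poisson, > 0 if bursty."""
    gaps = [b - a for a, b in zip(events, events[1:])]
    m = statistics.fmean(gaps)
    s = statistics.pstdev(gaps)
    return (s - m) / (s + m)


if __name__ == "__main__":
    poisson_like = simulate_hawkes(mu=1.0, alpha=0.0, beta=1.0, t_max=2000)
    self_exciting = simulate_hawkes(mu=1.0, alpha=0.8, beta=1.0, t_max=2000)
    print("B without self-excitation:", round(burstiness(poisson_like), 3))
    print("B with self-excitation:   ", round(burstiness(self_exciting), 3))
```

    Setting alpha to zero reduces the model to a Poisson process with rate mu, which gives a convenient baseline against which the effect of self-excitation can be compared.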

  20. Variation of 45S rDNA intergenic spacers in Arabidopsis thaliana.

    PubMed

    Havlová, Kateřina; Dvořáčková, Martina; Peiro, Ramon; Abia, David; Mozgová, Iva; Vansáčová, Lenka; Gutierrez, Crisanto; Fajkus, Jiří

    2016-11-01

    Approximately seven hundred 45S rRNA genes (rDNA) in the Arabidopsis thaliana genome are organised in two 4 Mbp-long arrays of tandem repeats arranged in head-to-tail fashion and separated by an intergenic spacer (IGS). These arrays make up 5 % of the A. thaliana genome. IGS are rapidly evolving sequences, and frequent rearrangements inside the rDNA loci have generated considerable interspecific and even intra-individual variability, which makes it possible to distinguish among otherwise highly conserved rRNA genes. The IGS has not been comprehensively described despite its potential importance in the regulation of rDNA transcription and replication. Here we describe the detailed sequence variation in the complete IGS of A. thaliana WT plants and provide the reference/consensus IGS sequence, as well as genomic DNA analysis. We further investigate mutants dysfunctional in chromatin assembly factor-1 (CAF-1) (fas1 and fas2 mutants), which are known to have a reduced number of rDNA copies, and plant lines with restored CAF-1 function (segregated from a fas1xfas2 genetic background) showing major rDNA rearrangements. The systematic rDNA loss in CAF-1 mutants leads to decreased variability of the IGS and to the occurrence of distinct IGS variants. We present for the first time a comprehensive and representative set of complete IGS sequences, obtained by conventional cloning and by Pacific Biosciences sequencing. Our data expand the knowledge of the A. thaliana IGS sequence arrangement and variability, which has not been available in full and in detail until now. This is also the first study combining IGS sequencing data with RFLP analysis of genomic DNA.

  1. Recombination events and variability among full-length genomes of co-circulating molluscum contagiosum virus subtypes 1 and 2.

    PubMed

    López-Bueno, Alberto; Parras-Moltó, Marcos; López-Barrantes, Olivia; Belda, Sylvia; Alejo, Alí

    2017-05-01

    Molluscum contagiosum virus (MCV) is the sole member of the Molluscipoxvirus genus and causes a highly prevalent human disease of the skin characterized by the formation of a variable number of lesions that can persist for prolonged periods of time. Two major genotypes, subtype 1 and subtype 2, are recognized, although currently only a single complete genomic sequence corresponding to MCV subtype 1 is available. Using next-generation sequencing techniques, we report the complete genomic sequence of four new MCV isolates, including the first one derived from a subtype 2. Comparisons suggest a relatively distant evolutionary split between both MCV subtypes. Further, our data illustrate concurrent circulation of distinct viruses within a population and reveal the existence of recombination events among them. These results help identify a set of MCV genes with potentially relevant roles in molluscum contagiosum epidemiology and pathogenesis.

  2. Biosystematics and Conservation: A Case Study with Two Enigmatic and Uncommon Species of Crassula from New Zealand

    PubMed Central

    De Lange, P. J.; Heenan, P. B.; Keeling, D. J.; Murray, B. G.; Smissen, R.; Sykes, W. R.

    2008-01-01

    Background and Aims Crassula hunua and C. ruamahanga have been taxonomically controversial. Here their distinctiveness is assessed so that their taxonomic and conservation status can be clarified. Methods Populations of these two species were analysed using morphological, chromosomal and DNA sequence data. Key Results It proved impossible to differentiate between these two species using 12 key morphological characters. Populations were found to be chromosomally variable with 11 different chromosome numbers ranging from 2n = 42 to 2n = 100. Meiotic behaviour and levels of pollen stainability were both variable. Phylogenetic analyses showed that differences exist in both nuclear and plastid DNA sequences between individual plants, sometimes from the same population. Conclusions The results suggest that these plants are a species complex that has evolved through interspecific hybridization and polyploidy. Their high levels of chromosomal and DNA sequence variation present a problem for their conservation. PMID:18055560

  3. Whole genome sequencing of the monomorphic pathogen Mycobacterium bovis reveals local differentiation of cattle clinical isolates.

    PubMed

    Lasserre, Moira; Fresia, Pablo; Greif, Gonzalo; Iraola, Gregorio; Castro-Ramos, Miguel; Juambeltz, Arturo; Nuñez, Álvaro; Naya, Hugo; Robello, Carlos; Berná, Luisa

    2018-01-02

    Bovine tuberculosis (bTB) poses serious risks to animal welfare and economy, as well as to public health as a zoonosis. Its etiological agent, Mycobacterium bovis, belongs to the Mycobacterium tuberculosis complex (MTBC), a group of genetically monomorphic organisms characterized by a remarkably high overall nucleotide identity (99.9%). Indeed, this characteristic is of major concern for correct typing and determination of strain-specific traits based on sequence diversity. Due to its historical economic dependence on cattle production, Uruguay is deeply affected by the prevailing incidence of Mycobacterium bovis. With the world's highest number of cattle per human, and its intensive cattle production, Uruguay represents a particularly well-suited setting to evaluate genomic variability among isolates, and the diversity traits associated with this pathogen. We compared 186 genomes from MTBC strains isolated worldwide, and found a highly structured population in M. bovis. The analysis of 23 new M. bovis genomes, belonging to strains isolated in Uruguay, revealed three groups present in the country. Despite presenting an expected highly conserved genomic structure and sequence, these strains segregate in a clustered manner within the worldwide phylogeny. Analysis of the non-pe/ppe differential areas against a reference genome defined four main sources of variability, namely: regions of difference (RD), variable genes, duplications and novel genes. RDs and variant analysis segregated the strains into clusters that are concordant with their spoligotype identities. Due to its high homoplasy rate, spoligotyping failed to reflect the true genomic diversity among worldwide representative strains; however, it remains a good indicator for closely related populations. This study introduces a comprehensive population structure analysis of worldwide M. bovis isolates. The incorporation and analysis of 23 novel Uruguayan M. bovis genomes sheds light on the genomic diversity of this pathogen, evidencing the existence of greater genetic variability among strains than previously contemplated.

  4. Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites

    PubMed Central

    Meinicke, Peter; Tech, Maike; Morgenstern, Burkhard; Merkl, Rainer

    2004-01-01

    Background Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations in terms of positional and compositional variability of the underlying biological signals. Results We propose a kernel-based approach to datamining on biological sequences. With our method it is possible to model and analyze positional variability of oligomers of any length in a natural way. On one hand this is achieved by mapping the sequences to an intuitive but high-dimensional feature space, well-suited for interpretation of the learnt models. On the other hand, by means of the kernel trick we can provide a general learning algorithm for that high-dimensional representation because all required statistics can be computed without performing an explicit feature space mapping of the sequences. By introducing a kernel parameter that controls the degree of position-dependency, our feature space representation can be tailored to the characteristics of the biological problem at hand. A regularized learning scheme enables application even to biological problems for which only small sets of example sequences are available. Our approach includes a visualization method for transparent representation of characteristic sequence features. Thereby importance of features can be measured in terms of discriminative strength with respect to classification of the underlying sequences. To demonstrate and validate our concept on a biochemically well-defined case, we analyze E. coli translation initiation sites in order to show that we can find biologically relevant signals. For that case, our results clearly show that the Shine-Dalgarno sequence is the most important signal upstream a start codon. 
    The variability in position and composition we found for that signal is in accordance with previous biological knowledge. We also find evidence for signals downstream of the start codon, previously introduced as transcriptional enhancers. These signals are mainly characterized by occurrences of adenine in a region of about 4 nucleotides next to the start codon. Conclusions We showed that the oligo kernel can provide a valuable tool for the analysis of relevant signals in biological sequences. In the case of translation initiation sites we could clearly deduce the most discriminative motifs and their positional variation from example sequences. Attractive features of our approach are its flexibility with respect to oligomer length and position conservation. By means of these two parameters oligo kernels can easily be adapted to different biological problems. PMID:15511290

  5. Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies.

    PubMed

    Rickert, Keith W; Grinberg, Luba; Woods, Robert M; Wilson, Susan; Bowen, Michael A; Baca, Manuel

    2016-01-01

    The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3-5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material.

  6. Combining phage display with de novo protein sequencing for reverse engineering of monoclonal antibodies

    PubMed Central

    Rickert, Keith W.; Grinberg, Luba; Woods, Robert M.; Wilson, Susan; Bowen, Michael A.; Baca, Manuel

    2016-01-01

    The enormous diversity created by gene recombination and somatic hypermutation makes de novo protein sequencing of monoclonal antibodies a uniquely challenging problem. Modern mass spectrometry-based sequencing will rarely, if ever, provide a single unambiguous sequence for the variable domains. A more likely outcome is computation of an ensemble of highly similar sequences that can satisfy the experimental data. This outcome can result in the need for empirical testing of many candidate sequences, sometimes iteratively, to identify one which can replicate the activity of the parental antibody. Here we describe an improved approach to antibody protein sequencing by using phage display technology to generate a combinatorial library of sequences that satisfy the mass spectrometry data, and selecting for functional candidates that bind antigen. This approach was used to reverse engineer 2 commercially-obtained monoclonal antibodies against murine CD137. Proteomic data enabled us to assign the majority of the variable domain sequences, with the exception of 3–5% of the sequence located within or adjacent to complementarity-determining regions. To efficiently resolve the sequence in these regions, small phage-displayed libraries were generated and subjected to antigen binding selection. Following enrichment of antigen-binding clones, 2 clones were selected for each antibody and recombinantly expressed as antigen-binding fragments (Fabs). In both cases, the reverse-engineered Fabs exhibited identical antigen binding affinity, within error, as Fabs produced from the commercial IgGs. This combination of proteomic and protein engineering techniques provides a useful approach to simplifying the technically challenging process of reverse engineering monoclonal antibodies from protein material. PMID:26852694

  7. Genetic Variability of Beauveria bassiana and a DNA Marker for Environmental Monitoring of a Highly Virulent Isolate Against Cosmopolites sordidus.

    PubMed

    Ferri, D V; Munhoz, C F; Neves, P M O; Ferracin, L M; Sartori, D; Vieira, M L C; Fungaro, M H P

    2012-12-01

    The banana weevil Cosmopolites sordidus (Germar) is one of a number of pests that attack banana crops. The use of the entomopathogenic fungus Beauveria bassiana as a biological control agent for this pest may contribute towards reducing the application of chemical insecticides on banana crops. In this study, the genetic variability of a collection of Brazilian isolates of B. bassiana was evaluated. Samples were obtained from various geographic regions of Brazil, and from different hosts of the Curculionidae family. Based on the DNA fingerprints generated by RAPD and AFLP, we found that 92 and 88 % of the loci were polymorphic, respectively. The B. bassiana isolates were attributed to two genotypic clusters based on the RAPD data, and to three genotypic clusters, when analyzed with AFLP. The nucleotide sequences of nuclear ribosomal DNA intergenic spacers confirmed that all isolates are in fact B. bassiana. Analysis of molecular variance showed that variability among the isolates was not correlated with geographic origin or hosts. A RAPD-specific marker for isolate CG 1024, which is highly virulent to C. sordidus, was cloned and sequenced. Based on the sequences obtained, specific PCR primers BbasCG1024F (5'-TGC GGC TGA GGA GGA CT-3') and BbasCG1024R (5'-TGC GGC TGA GTG TAG AAC-3') were designed for detecting and monitoring this isolate in the field.

  8. Scaling exponents for ordered maxima

    DOE PAGES

    Ben-Naim, E.; Krapivsky, P. L.; Lemons, N. W.

    2015-12-22

    We study extreme value statistics of multiple sequences of random variables. For each sequence with $N$ variables, independently drawn from the same distribution, the running maximum is defined as the largest variable to date. We compare the running maxima of $m$ independent sequences and investigate the probability $S_N$ that the maxima are perfectly ordered, that is, the running maximum of the first sequence is always larger than that of the second sequence, which is always larger than the running maximum of the third sequence, and so on. The probability $S_N$ is universal: it does not depend on the distribution from which the random variables are drawn. For two sequences, $S_N \sim N^{-1/2}$, and in general, the decay is algebraic, $S_N \sim N^{-\sigma_m}$, for large $N$. We analytically obtain the exponent $\sigma_3 \cong 1.302931$ as the root of a transcendental equation. Moreover, the exponents $\sigma_m$ grow with $m$, and we show that $\sigma_m \sim m$ for large $m$.
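
    The ordering probability defined in this record can be estimated numerically. The following is a minimal Monte Carlo sketch, not the authors' analytical method: it draws m independent sequences, tracks their running maxima, and counts the fraction of trials in which the maxima stay strictly ordered at every step. The function name, the use of uniform random variables (justified by the universality stated in the abstract), and the trial counts are assumptions for illustration.

```python
import random


def ordered_probability(n_steps, m=2, trials=20000, seed=0):
    """Estimate the probability that the running maxima of m independent
    sequences stay perfectly ordered (first > second > ...) for n_steps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        maxima = [float("-inf")] * m
        ordered = True
        for _ in range(n_steps):
            # Draw one new variable per sequence and update its maximum.
            for k in range(m):
                x = rng.random()
                if x > maxima[k]:
                    maxima[k] = x
            # The ordering must hold after every step, not only at the end.
            if any(maxima[k] <= maxima[k + 1] for k in range(m - 1)):
                ordered = False
                break
        hits += ordered
    return hits / trials


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, ordered_probability(n, m=2))
```

    For two sequences, the estimates should fall off roughly as the inverse square root of N, so quadrupling N should roughly halve the estimate once N is large. Estimating the exponent for three sequences this way needs many more trials, because the probability decays faster.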

  9. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data.

    PubMed

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths.

  10. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data

    PubMed Central

    Krøigård, Anne Bruun; Thomassen, Mads; Lænkholm, Anne-Vibeke; Kruse, Torben A.; Larsen, Martin Jakob

    2016-01-01

    Next generation sequencing is extensively applied to catalogue somatic mutations in cancer, in research settings and increasingly in clinical settings for molecular diagnostics, guiding therapy decisions. Somatic variant callers perform paired comparisons of sequencing data from cancer tissue and matched normal tissue in order to detect somatic mutations. The advent of many new somatic variant callers creates a need for comparison and validation of the tools, as no de facto standard for detection of somatic mutations exists and only limited comparisons have been reported. We have performed a comprehensive evaluation using exome sequencing and targeted deep sequencing data of paired tumor-normal samples from five breast cancer patients to evaluate the performance of nine publicly available somatic variant callers: EBCall, Mutect, Seurat, Shimmer, Indelocator, Somatic Sniper, Strelka, VarScan 2 and Virmid for the detection of single nucleotide mutations and small deletions and insertions. We report a large variation in the number of calls from the nine somatic variant callers on the same sequencing data and highly variable agreement. Sequencing depth had markedly diverse impact on individual callers, as for some callers, increased sequencing depth highly improved sensitivity. For SNV calling, we report EBCall, Mutect, Virmid and Strelka to be the most reliable somatic variant callers for both exome sequencing and targeted deep sequencing. For indel calling, EBCall is superior due to high sensitivity and robustness to changes in sequencing depths. PMID:27002637

  11. Effects of weather conditions on emergency ambulance calls for acute coronary syndromes

    NASA Astrophysics Data System (ADS)

    Vencloviene, Jone; Babarskiene, Ruta; Dobozinskas, Paulius; Siurkaite, Viktorija

    2015-08-01

    The aim of this study was to evaluate the relationship between weather conditions and daily emergency ambulance calls for acute coronary syndromes (ACS). The study included data on 3631 patients who called the ambulance for chest pain and were admitted to the department of cardiology as patients with ACS. We investigated the effect of daily air temperature (T), barometric pressure (BP), relative humidity, and wind speed (WS) to detect the risk areas for low and high daily volume (DV) of emergency calls. We used the classification and regression tree method as well as cluster analysis. The clusters were created by applying the k-means cluster algorithm using the standardized daily weather variables. The analysis was performed separately during cold (October-April) and warm (May-September) seasons. During the cold period, the greatest DV was observed on days of low T during the 3-day sequence, on cold and windy days, and on days of low BP and high WS during the 3-day sequence; low DV was associated with high BP and decreased WS on the previous day. During June-September, a lower DV was associated with low BP, windless days, and high BP and low WS during the 3-day sequence. During the warm period, the greatest DV was associated with increased BP and changing WS during the 3-day sequence. These results suggest that daily T, BP, and WS on the day of the ambulance call and on the two previous days may be prognostic variables for the risk of ACS.

  12. Application of the High Resolution Melting analysis for genetic mapping of Sequence Tagged Site markers in narrow-leafed lupin (Lupinus angustifolius L.).

    PubMed

    Kamel, Katarzyna A; Kroc, Magdalena; Święcicki, Wojciech

    2015-01-01

    Sequence tagged site (STS) markers are valuable tools for genetic and physical mapping that can be successfully used in comparative analyses among related species. Current challenges for molecular marker genotyping in plants include the lack of fast, sensitive and inexpensive methods suitable for sequence variant detection. In contrast, high resolution melting (HRM) is a simple and high-throughput assay, which has been widely applied in sequence polymorphism identification as well as in studies of genetic variability and genotyping. The present study is the first attempt to use HRM analysis to genotype STS markers in narrow-leafed lupin (Lupinus angustifolius L.). The sensitivity and utility of this method were confirmed by sequence polymorphism detection based on melting curve profiles in the parental genotypes and progeny of the narrow-leafed lupin mapping population. Application of different approaches, including amplicon size and a simulated heterozygote analysis, has allowed for successful genetic mapping of 16 new STS markers in the narrow-leafed lupin genome.

  13. The influence of phonological priming on variability in articulation

    NASA Astrophysics Data System (ADS)

    Babel, Molly E.; Munson, Benjamin

    2004-05-01

    Previous research [Sevald and Dell, Cognition 53, 91-127 (1994)] has found that reiterant sequences of CVC words are produced more quickly when the prime word and target word share VC sequences (i.e., sequences like sit sick) than when they are identical (sequences like sick sick). Even slower production rates are found when primes and targets share a CV sequence (sequences like kick sick). These data have been used to support a model of speech production in which lexical items and their constituent phonemes are activated sequentially. The current experiment investigated whether phonological priming also influences variability in the acoustic characteristics of words. Specifically, we examined whether greater variability in the acoustic characteristics of target words was noted in the CV-related prime context than in the identical-prime context, and whether less variability was noted in the VC-related context. Thirty adult subjects with typical speech, language, and hearing ability produced reiterant two-word sequences that varied in their phonological similarity. The duration, first, and second formant frequencies of the target-words' vowels were measured. Preliminary analyses indicate that phonological priming does not have a systematic effect on variability in these acoustic parameters.

  14. Characterization of the variable-number tandem repeats in vrrA from different Bacillus anthracis isolates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jackson, P.J.; Walthers, E.A.; Richmond, K.L.

    1997-04-01

    PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5'-CAATATCAACAA-3'. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
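
    The copy-number idea in this record can be illustrated with a minimal Python sketch. It counts the longest run of back-to-back copies of the 12-bp unit quoted in the abstract and translates that unit in frame. The flanking sequences and the function names are hypothetical, chosen only for illustration; the authors' actual PCR sizing workflow is not described here.

```python
import re

REPEAT = "CAATATCAACAA"  # 12-bp repeat unit quoted in the abstract

# Only the two codons that occur in the repeat unit (standard genetic code).
CODON_TABLE = {"CAA": "Q", "TAT": "Y"}


def max_tandem_copies(seq: str, unit: str = REPEAT) -> int:
    """Return the length, in copies, of the longest tandem run of `unit` in `seq`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def translate_unit(unit: str = REPEAT) -> str:
    """Translate the repeat unit codon by codon, assuming it is read in frame."""
    return "".join(CODON_TABLE[unit[i:i + 3]] for i in range(0, len(unit), 3))


# Hypothetical amplicon: made-up flanks around four copies of the repeat.
flank5, flank3 = "GGATCC", "TTAGCA"
toy_amplicon = flank5 + REPEAT * 4 + flank3

print(max_tandem_copies(toy_amplicon))  # 4
print(translate_unit())                 # QYQQ
```

    Because the unit is 12 bp long, each added copy inserts exactly four codons, which is consistent with the abstract's statement that length variation does not change the reading frame.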

  15. Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment.

    PubMed

    Zhang, Bochao; Meng, Wenzhao; Prak, Eline T Luning; Hershberg, Uri

    2015-12-01

    Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), (Diversity (D) in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity using different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences. Copyright © 2015 Elsevier B.V. All rights reserved.
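
    A much-simplified Python sketch of the grouping step follows. It bundles germline genes that become indistinguishable when only a read of a given length is available. It assumes reads cover the 3' end of the V gene and that "indistinguishable" means an exact match over that window; both are assumptions for illustration, as are the toy gene names and sequences. The published method also accounts for sequencing error and somatic hypermutation, which this sketch does not model.

```python
from collections import defaultdict


def ambiguous_groups(germlines: dict[str, str], read_length: int) -> list[list[str]]:
    """Group gene names whose last `read_length` bases are identical.

    `read_length` is assumed to be a positive integer. Genes that stay
    unique at this read length are omitted from the result.
    """
    buckets = defaultdict(list)
    for name, seq in germlines.items():
        buckets[seq[-read_length:]].append(name)
    return sorted(sorted(names) for names in buckets.values() if len(names) > 1)


# Hypothetical toy germline set (not real V gene sequences).
toy_germlines = {
    "V1": "AAACCCGGGTTT",
    "V2": "ATACCCGGGTTT",  # differs from V1 only near the 5' end
    "V3": "AAACCCGGGTAT",  # differs from V1 near the 3' end
}

print(ambiguous_groups(toy_germlines, 6))   # [['V1', 'V2']]
print(ambiguous_groups(toy_germlines, 12))  # []
```

    In this toy case the short read cannot separate V1 from V2, while the full-length read resolves all three, which mirrors the abstract's point that discriminability depends on read length.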

  16. Distribution of Helicobacter pylori virulence markers in patients with gastroduodenal diseases in a region at high risk of gastric cancer.

    PubMed

    Wang, Ming-yi; Chen, Cheng; Gao, Xiao-zhong; Li, Jie; Yue, Jing; Ling, Feng; Wang, Xiao-chun; Shao, Shi-he

    2013-01-01

    Helicobacter pylori (H. pylori) is a major human pathogen that is responsible for various gastroduodenal diseases. We investigated the prevalence of H. pylori virulence markers in a region at high risk of gastric cancer. One hundred and sixteen H. pylori strains were isolated from patients with gastroduodenal diseases. cagA, the cagA 3' variable region, cagPAI genes, vacA, and dupA genotypes were determined by PCR, and some amplicons of the cagA 3' variable region, cagPAI genes and dupA were sequenced. cagA was detected in all strains. The cagA 3' variable region of 85 strains (73.3%) was amplified, and the sequences of 24 strains were obtained including 22 strains possessing the East Asian-type. The partial cagPAI presented at a higher frequency in chronic gastritis (44.4%) than that of the severe clinical outcomes (9.7%, p < 0.001). The most prevalent vacA genotypes were s1a/m2 (48.3%) and s1c/m2 (13.8%). Thirty-six strains (31.0%) possessed dupA and sequencing of dupA revealed an ORF of 2449-bp. The prevalence of dupA was significantly higher in strains from patients with the severe clinical outcomes (40.3%) than that from chronic gastritis (20.4%, p = 0.02). The high rate of East Asian-type cagA, intact cagPAI, virulent vacA genotypes, and the intact long-type dupA may underlie the high risk of gastric cancer in the region. Copyright © 2013 Elsevier Ltd. All rights reserved.

  17. Exome sequence analysis suggests genetic burden contributes to phenotypic variability and complex neuropathy

    PubMed Central

    Gonzaga-Jauregui, Claudia; Harel, Tamar; Gambin, Tomasz; Kousi, Maria; Griffin, Laurie B.; Francescatto, Ludmila; Ozes, Burcak; Karaca, Ender; Jhangiani, Shalini; Bainbridge, Matthew N.; Lawson, Kim S.; Pehlivan, Davut; Okamoto, Yuji; Withers, Marjorie; Mancias, Pedro; Slavotinek, Anne; Reitnauer, Pamela J; Goksungur, Meryem T.; Shy, Michael; Crawford, Thomas O.; Koenig, Michel; Willer, Jason; Flores, Brittany N.; Pediaditrakis, Igor; Us, Onder; Wiszniewski, Wojciech; Parman, Yesim; Antonellis, Anthony; Muzny, Donna M.; Katsanis, Nicholas; Battaloglu, Esra; Boerwinkle, Eric; Gibbs, Richard A.; Lupski, James R.

    2015-01-01

    Charcot-Marie-Tooth (CMT) disease is a clinically and genetically heterogeneous distal symmetric polyneuropathy. Whole-exome sequencing (WES) of 40 individuals from 37 unrelated families with CMT-like peripheral neuropathy refractory to molecular diagnosis identified apparent causal mutations in ~45% (17/37) of families. Three candidate disease genes are proposed, supported by a combination of genetic and in vivo studies. Aggregate analysis of mutation data revealed a significantly increased number of rare variants across 58 neuropathy associated genes in subjects versus controls; confirmed in a second ethnically discrete neuropathy cohort, suggesting mutation burden potentially contributes to phenotypic variability. Neuropathy genes shown to have highly penetrant Mendelizing variants (HMPVs) and implicated by burden in families were shown to interact genetically in a zebrafish assay exacerbating the phenotype established by the suppression of single genes. Our findings suggest that the combinatorial effect of rare variants contributes to disease burden and variable expressivity. PMID:26257172

  18. Improved first-pass spiral myocardial perfusion imaging with variable density trajectories.

    PubMed

    Salerno, Michael; Sica, Christopher; Kramer, Christopher M; Meyer, Craig H

    2013-11-01

    To develop and evaluate variable-density spiral first-pass perfusion pulse sequences for improved efficiency and off-resonance performance and to demonstrate the utility of an apodizing density compensation function (DCF) to improve signal-to-noise ratio (SNR) and reduce dark-rim artifact caused by cardiac motion and Gibbs Ringing. Three variable density spiral trajectories were designed, simulated, and evaluated in 18 normal subjects, and in eight patients with cardiac pathology on a 1.5T scanner. By using a DCF, which intentionally apodizes the k-space data, the sidelobe amplitude of the theoretical point spread function (PSF) is reduced by 68%, with only a 13% increase in the full-width at half-maximum of the main-lobe when compared with the same data corrected with a conventional variable-density DCF, and has an 8% higher resolution than a uniform density spiral with the same number of interleaves and readout duration. Furthermore, this strategy results in a greater than 60% increase in measured SNR when compared with the same variable-density spiral data corrected with a conventional DCF (P < 0.01). Perfusion defects could be clearly visualized with minimal off-resonance and dark-rim artifacts. Variable-density spiral pulse sequences using an apodized DCF produce high-quality first-pass perfusion images with minimal dark-rim and off-resonance artifacts, high SNR and contrast-to-noise ratio, and good delineation of resting perfusion abnormalities. Copyright © 2012 Wiley Periodicals, Inc.

  19. The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses.

    PubMed

    Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin

    2012-01-01

    Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55,000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52-61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38-D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov.

  20. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment

    PubMed Central

    2013-01-01

    Background Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. Results In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Conclusion Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
    Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA. PMID:24564200
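
    The core idea of comparing segments by p-mer frequency counts, rather than by alignment, can be sketched in Python as follows. The choice of Manhattan distance between count profiles, the default p = 3, and the function names are assumptions made for illustration; the abstract does not specify them, and the data stream clustering step is not shown.

```python
from collections import Counter


def pmer_counts(segment: str, p: int = 3) -> Counter:
    """Count every overlapping substring of length p in the segment."""
    return Counter(segment[i:i + p] for i in range(len(segment) - p + 1))


def pmer_distance(a: str, b: str, p: int = 3) -> int:
    """Manhattan distance between the p-mer count profiles of two segments.

    Used here as a cheap stand-in for edit distance (an assumption of
    this sketch); each call is linear in the segment lengths.
    """
    ca, cb = pmer_counts(a, p), pmer_counts(b, p)
    return sum(abs(ca[k] - cb[k]) for k in ca.keys() | cb.keys())


def split_segments(seq: str, length: int) -> list[str]:
    """Cut a sequence into consecutive, non-overlapping segments."""
    return [seq[i:i + length] for i in range(0, len(seq) - length + 1, length)]


reference = "ACGTACGT"
variant = "ACGAACGT"  # one substitution relative to the reference

print(pmer_distance(reference, reference))      # 0
print(pmer_distance(reference, variant) > 0)    # True
print(split_segments("ACGTACGTAC", 4))          # ['ACGT', 'ACGT']
```

    Segments with a small profile distance would then be candidates for the same quasi-alignment cluster.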

  1. Fast discovery and visualization of conserved regions in DNA sequences using quasi-alignment.

    PubMed

    Nagar, Anurag; Hahsler, Michael

    2013-01-01

    Next Generation Sequencing techniques are producing enormous amounts of biological sequence data and analysis becomes a major computational problem. Currently, most analysis, especially the identification of conserved regions, relies heavily on Multiple Sequence Alignment and its various heuristics such as progressive alignment, whose run time grows with the square of the number and the length of the aligned sequences and requires significant computational resources. In this work, we present a method to efficiently discover regions of high similarity across multiple sequences without performing expensive sequence alignment. The method is based on approximating edit distance between segments of sequences using p-mer frequency counts. Then, efficient high-throughput data stream clustering is used to group highly similar segments into so called quasi-alignments. Quasi-alignments have numerous applications such as identifying species and their taxonomic class from sequences, comparing sequences for similarities, and, as in this paper, discovering conserved regions across related sequences. In this paper, we show that quasi-alignments can be used to discover highly similar segments across multiple sequences from related or different genomes efficiently and accurately. Experiments on a large number of unaligned 16S rRNA sequences obtained from the Greengenes database show that the method is able to identify conserved regions which agree with known hypervariable regions in 16S rRNA. Furthermore, the experiments show that the proposed method scales well for large data sets with a run time that grows only linearly with the number and length of sequences, whereas for existing multiple sequence alignment heuristics the run time grows super-linearly. Quasi-alignment-based algorithms can detect highly similar regions and conserved areas across multiple sequences. 
    Since the run time is linear and the sequences are converted into a compact clustering model, we are able to identify conserved regions fast or even interactively using a standard PC. Our method has many potential applications such as finding characteristic signature sequences for families of organisms and studying conserved and variable regions in, for example, 16S rRNA.

  2. Global Sea-level Changes Revealed in the Sediments of the Canterbury Basin, New Zealand: IODP Expedition 317

    NASA Astrophysics Data System (ADS)

    McHugh, C. M.; Fulthorpe, C.; Blum, P.; Rios, J.; Chow, Y.; Mishkin, K.

    2012-12-01

    Continental margins are composed of thick sedimentary sections that preserve the record of local processes modulated by global sea-level (eustatic) changes and climate. Understanding this regional variability permits us to extract the eustatic record. Integrated Ocean Drilling Program Expedition 317 drilled four sites in the offshore Canterbury Basin, eastern South Island of New Zealand, in water depths of 85 m to 320 m. One of the objectives of the expedition was to understand the influence of eustasy on continental margins sedimentation and to test the concepts of sequence stratigraphy. A high-resolution multiproxy approach that involves geochemical elemental analyses, lithostratigraphy and biostratigraphy is applied to understand the margin's sedimentation for the past ~5 million years. Multichannel seismic data (EW00-01 survey) provide a seismic sequence stratigraphic framework against which to interpret the multiproxy data. The mid- to late Pleistocene sedimentation is characterized by variable lithologies and changing facies. However, elemental compositions and facies follow predictable patterns within seismic sequences. Oxygen isotope measurements for the latest Pleistocene indicate that 100 ky Milankovich astronomical forcing controlled this variability. In contrast, Pliocene and early Pleistocene sediments are composed of repetitive siliciclastic and carbonate mud lithologies with less facies variability. Results of our analyses suggest that repetitive alternations of green and gray mud were deposited during warmer and cooler periods, respectively. Oxygen isotopes suggest that this cyclicity may reflect 40 ky Milankovich forcing. Ocean Drilling Program Legs 150 and 174A drilled on the New Jersey continental margin with similar objectives to those of Expedition 317. 
    Results from this northern and southern hemisphere drilling reveal that eustasy, controlled by Milankovich forcing, strongly influences margin sedimentation and the formation of basin-wide unconformities. However, the correlation between eustasy and seismic sequence formation is not always one to one. High sedimentation rates in the Pleistocene offshore Canterbury Basin record a one-to-one correlation between glacioeustasy and seismic sequences, and in some sequences possibly a higher order frequency. But this is not the case for offshore New Jersey, where accumulation rates were lower and only the uppermost seismic sequences represent 100 ky cycles. Furthermore, Pliocene sedimentation in the Canterbury Basin was also controlled by eustasy, but does not show a one-to-one correlation between Milankovich cycles and seismic stratigraphy. Northern and southern hemisphere comparisons provide a powerful tool to better understand controls on regional sedimentation and extract a global signal.

  3. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution.

    PubMed

    Melters, Daniël P; Bradnam, Keith R; Young, Hugh A; Telis, Natalie; May, Michael R; Ruby, J Graham; Sebra, Robert; Peluso, Paul; Eid, John; Rank, David; Garcia, José Fernando; DeRisi, Joseph L; Smith, Timothy; Tobias, Christian; Ross-Ibarra, Jeffrey; Korf, Ian; Chan, Simon W L

    2013-01-30

    Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes.

  4. Comparative analysis of tandem repeats from hundreds of species reveals unique insights into centromere evolution

    PubMed Central

    2013-01-01

    Background Centromeres are essential for chromosome segregation, yet their DNA sequences evolve rapidly. In most animals and plants that have been studied, centromeres contain megabase-scale arrays of tandem repeats. Despite their importance, very little is known about the degree to which centromere tandem repeats share common properties between different species across different phyla. We used bioinformatic methods to identify high-copy tandem repeats from 282 species using publicly available genomic sequence and our own data. Results Our methods are compatible with all current sequencing technologies. Long Pacific Biosciences sequence reads allowed us to find tandem repeat monomers up to 1,419 bp. We assumed that the most abundant tandem repeat is the centromere DNA, which was true for most species whose centromeres have been previously characterized, suggesting this is a general property of genomes. High-copy centromere tandem repeats were found in almost all animal and plant genomes, but repeat monomers were highly variable in sequence composition and length. Furthermore, phylogenetic analysis of sequence homology showed little evidence of sequence conservation beyond approximately 50 million years of divergence. We find that despite an overall lack of sequence conservation, centromere tandem repeats from diverse species showed similar modes of evolution. Conclusions While centromere position in most eukaryotes is epigenetically determined, our results indicate that tandem repeats are highly prevalent at centromeres of both animal and plant genomes. This suggests a functional role for such repeats, perhaps in promoting concerted evolution of centromere DNA across chromosomes. PMID:23363705

  5. Whole-genome sequencing of a Plasmodium vivax clinical isolate exhibits geographical characteristics and high genetic variation in China-Myanmar border area.

    PubMed

    Chen, Shen-Bo; Wang, Yue; Kassegne, Kokouvi; Xu, Bin; Shen, Hai-Mo; Chen, Jun-Hu

    2017-02-06

    Currently in China, the number of Plasmodium vivax cases imported from Southeast Asia is increasing, especially in the China-Myanmar border area. Driven by the increase in P. vivax cases and a stronger need for vaccine and drug development, several P. vivax isolate genome sequencing projects are underway. However, little is known about the genetic variability in this area until now. The sequencing of the first P. vivax isolate from the China-Myanmar border area (CMB-1) generated 120 million paired-end reads. Of the quality-evaluated reads, 10.6% were aligned onto 99.9% of the reference strain Sal I genome at 62-fold coverage with an average of 4.8 SNPs per kb. We present a 539-SNP marker data set for P. vivax that can identify different parasites from different geographic origins with high sensitivity. We also identified exceptionally high levels of genetic variability in members of multigene families such as RBP, SERA, vir, MSP3 and AP2. The de novo assembly yielded a database composed of 8,409 contigs with an N50 length of 6.6 kb and revealed 661 novel predicted genes including 78 vir genes, suggesting a greater functional variation in P. vivax from this area. Our results contribute to a better understanding of P. vivax genetic variation, and provide a fundamental basis for the geographic differentiation of vivax malaria from the China-Myanmar border area using a direct sequencing approach without leukocyte depletion. This novel sequencing method can be used as an essential tool for the genomic research of P. vivax in the near future.
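
    For readers unfamiliar with the N50 statistic quoted in this record, a minimal Python sketch of the usual definition follows: the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The contig lengths below are toy values, not data from the study.

```python
def n50(lengths: list[int]) -> int:
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total reaches half of the sum.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() needs at least one contig length")


# Toy contig lengths (hypothetical), total = 20.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # 5, because 6 + 5 = 11 is the first running total >= 10
```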

  6. Genetic diversity based on 28S rDNA sequences among populations of Culex quinquefasciatus collected at different locations in Tamil Nadu, India.

    PubMed

    Sakthivelkumar, S; Ramaraj, P; Veeramani, V; Janarthanan, S

    2015-09-01

    The basis of the present study was to distinguish the existence of any genetic variability among populations of Culex quinquefasciatus which would be a valuable tool in the management of mosquito control programmes. In the present study, populations of Cx. quinquefasciatus collected at different locations in Tamil Nadu were analyzed for their genetic variation based on 28S rDNA D2 region nucleotide sequences. A high degree of genetic polymorphism was detected in the sequences of the D2 region of 28S rDNA on the predicted secondary structures in spite of high nucleotide sequence similarity. The findings based on secondary structure using rDNA sequences suggested the existence of a complex genotypic diversity of Cx. quinquefasciatus populations collected at different locations of Tamil Nadu, India. This complexity in genetic diversity in a single mosquito population collected at different locations is considered an important issue towards their influence and nature of vector potential of these mosquitoes.

  7. Friis Hills Drilling Project - Coring an Early to mid-Miocene terrestrial sequence in the Transantarctic Mountains to examine climate gradients and ice sheet variability along an inland-to-offshore transect

    NASA Astrophysics Data System (ADS)

    Lewis, A. R.; Levy, R. H.; Naish, T.; Gorman, A. R.; Golledge, N.; Dickinson, W. W.; Kraus, C.; Florindo, F.; Ashworth, A. C.; Pyne, A.; Kingan, T.

    2015-12-01

    The Early to mid-Miocene is a compelling interval to study Antarctic ice sheet (AIS) sensitivity. Circulation patterns in the southern hemisphere were broadly similar to present and reconstructed atmospheric CO2 concentrations were analogous to those projected for the next several decades. Geologic records from locations proximal to the AIS are required to examine ice sheet response to climate variability during this time. Coastal and offshore drill core records recovered by ANDRILL and IODP provide information regarding ice sheet variability along and beyond the coastal margin but they cannot constrain the extent of inland retreat. Additional environmental data from the continental interior is required to constrain the magnitude of ice sheet variability and inform numerical ice sheet models. The only well-dated terrestrial deposits that register early to mid-Miocene interior ice extent and climate are in the Friis Hills, 80 km inland. The deposits record multiple glacial-interglacial cycles and fossiliferous non-glacial beds show that interglacial climate was warm enough for a diverse biota. Drifts are preserved in a shallow valley with the oldest beds exposed along the edges where they terminate at sharp erosional margins. These margins reveal drifts in short stratigraphic sections but none is more than 13 m thick. A 34 m-thick composite stratigraphic sequence has been produced from exposed drift sequences but correlating beds in scattered exposures is problematic. Moreover, much of the sequence is buried and inaccessible in the basin center. New seismic data collected during 2014 reveal a sequence of sediments at least 50 m thick. This stratigraphic package likely preserves a detailed and more complete sedimentary sequence for the Friis Hills that can be used to refine and augment the outcrop-based composite stratigraphy. We aim to drill through this sequence using a helicopter-transportable diamond coring system. 
    These new cores will allow us to obtain continuous measurements on unweathered material through the terrestrial sequence. Beds of tephra are exposed in outcrop and we expect to encounter these key age markers in the cored sequence. These new high quality, well-dated terrestrial data will be directly compared to marine cores to provide environmental data across a broad onshore-offshore transect.

  8. Spatiotemporal attention operator using isotropic contrast and regional homogeneity

    NASA Astrophysics Data System (ADS)

    Palenichka, Roman; Lakhssassi, Ahmed; Zaremba, Marek

    2011-04-01

    A multiscale operator for spatiotemporal isotropic attention is proposed to reliably extract attention points during image sequence analysis. Its consecutive local maxima indicate attention points as the centers of image fragments of variable size with high intensity contrast, region homogeneity, regional shape saliency, and temporal change presence. The scale-adaptive estimation of temporal change (motion) and its aggregation with the regional shape saliency contribute to the accurate determination of attention points in image sequences. Multilocation descriptors of an image sequence are extracted at the attention points in the form of a set of multidimensional descriptor vectors. A fast recursive implementation is also proposed to make the operator's computational complexity independent from the spatial scale size, which is the window size in the spatial averaging filter. Experiments on the accuracy of attention-point detection have proved the operator consistency and its high potential for multiscale feature extraction from image sequences.

  9. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Poyer, D.A.

    In this report, tests of statistical significance of five sets of variables with household energy consumption (at the point of end-use) are described. Five models, in sequence, were empirically estimated and tested for statistical significance by using the Residential Energy Consumption Survey of the US Department of Energy, Energy Information Administration. Each model incorporated additional information, embodied in a set of variables not previously specified in the energy demand system. The variable sets were generally labeled as economic variables, weather variables, household-structure variables, end-use variables, and housing-type variables. The tests of statistical significance showed each of the variable sets to be highly significant in explaining the overall variance in energy consumption. The findings imply that the contemporaneous interaction of different types of variables, and not just one exclusive set of variables, determines the level of household energy consumption.

  10. Evolutionary and biophysical relationships among the papillomavirus E2 proteins.

    PubMed

    Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael

    2009-01-01

    Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications to cancer biology are discussed.

  11. The genetic basis of adaptive pigmentation variation in Drosophila melanogaster.

    PubMed

    Pool, John E; Aquadro, Charles F

    2007-07-01

    In a broad survey of Drosophila melanogaster population samples, levels of abdominal pigmentation were found to be highly variable and geographically differentiated. A strong positive correlation was found between dark pigmentation and high altitude, suggesting adaptation to specific environments. DNA sequence polymorphism at the candidate gene ebony revealed a clear association with the pigmentation of homozygous third chromosome lines. The darkest lines sequenced had nearly identical haplotypes spanning 14.5 kb upstream of the protein-coding exons of ebony. Thus, natural selection may have elevated the frequency of an allele that confers dark abdominal pigmentation by influencing the regulation of ebony.

  12. [Polymorphism of KPI-A genes from plants of the subgenus Potatoe (sect. Petota, Estolonifera and Lycopersicum) and subgenus Solanum].

    PubMed

    Krinitsyna, A A; Mel'nikova, N V; Belenikin, M S; Poltronieri, P; Santino, A; Kudriavtseva, A V; Savilova, A M; Speranskaia, A S

    2013-01-01

    Kunitz-type proteinase inhibitor proteins of group A (KPI-A) are involved in the protection of potato plants from pathogens and pests. Although sequences of a large number of KPI-A genes from different species of cultivated potato (Solanum tuberosum subsp. tuberosum) and a few genes from tomato (Solanum lycopersicum) are known to date, information about the allelic diversity of these genes in other species of the genus Solanum is lacking. In our work, the consensus sequences of the KPI-A genes were established in two species of subgenus Potatoe sect. Petota (Solanum tuberosum subsp. andigenum--5 genes and Solanum stoloniferum--2 genes) and in the subgenus Solanum (Solanum nigrum--5 genes) by amplification, cloning, sequencing and subsequent analysis. The determined sequences of KPI-A genes were 97-100% identical to known sequences of the cultivated potato of sect. Petota (cultivated potato Solanum tuberosum subsp. tuberosum) and sect. Etuberosum (S. palustre). The interspecific variability of these genes did not exceed the intraspecific variability for all studied species except Solanum lycopersicum. The distribution of highly variable and conserved sequences in the mature protein-encoding regions was uniform for all investigated KPI-A genes. However, our attempts to amplify the homologous genes using the same primers and the genomes of Solanum dulcamarum, Solanum lycopersicum and Mandragora officinarum resulted in no product formation. Phylogenetic analysis of KPI-A diversity showed that the sequences of S. lycopersicum form an independent cluster, whereas KPI-A of S. nigrum and species of sect. Etuberosum and sect. Petota are closely related and do not form species-specific subclusters. Although Solanum nigrum is resistant to all known races of the oomycete Phytophthora infestans, which causes one of the most economically important diseases of solanaceous plants, the amino acid sequences encoded by the KPI-A genes of its genome show few or no differences from those of cultivated potatoes affected by P. infestans.

  13. Sequence information signal processor for local and global string comparisons

    DOEpatents

    Peterson, John C.; Chow, Edward T.; Waterman, Michael S.; Hunkapillar, Timothy J.

    1997-01-01

    A sequence information signal processing integrated circuit chip designed to perform high speed calculation of a dynamic programming algorithm based upon the algorithm defined by Waterman and Smith. The signal processing chip of the present invention is designed to be a building block of a linear systolic array, the performance of which can be increased by connecting additional sequence information signal processing chips to the array. The chip provides a high speed, low cost linear array processor that can locate highly similar global sequences or segments thereof such as contiguous subsequences from two different DNA or protein sequences. The chip is implemented in a preferred embodiment using CMOS VLSI technology to provide the equivalent of about 400,000 transistors or 100,000 gates. Each chip provides 16 processing elements, and is designed to provide 16 bit, two's complement operation for maximum score precision of between -32,768 and +32,767. It is designed to provide a comparison between sequences as long as 4,194,304 elements without external software and between sequences of unlimited numbers of elements with the aid of external software. Each sequence can be assigned different deletion and insertion weight functions. Each processor is provided with a similarity measure device which is independently variable. Thus, each processor can contribute to maximum value score calculation using a different similarity measure.
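    For readers unfamiliar with the dynamic programming recurrence that the record above refers to, the following is a minimal software sketch of local alignment scoring in the style of Smith and Waterman. It is not the patented hardware design: the function name, the scoring values (match, mismatch, gap) and the single linear gap penalty are assumptions chosen for illustration, whereas the record describes per-sequence insertion and deletion weight functions and a systolic array of processing elements.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score between strings a and b.

    Illustrative scoring only: the match/mismatch/gap values are
    assumptions, not parameters taken from the patent record.
    """
    # Only the previous row of the score matrix is needed to fill the next one.
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)  # a local alignment never goes below 0
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

    Keeping a single row at a time mirrors, loosely, why the recurrence suits a linear array: each cell depends only on its left, upper and upper-left neighbours.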

  14. Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server.

    PubMed

    Abriata, Luciano A; Bovigny, Christophe; Dal Peraro, Matteo

    2016-06-17

    Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc. Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html ) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants. We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution. We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design.

  15. Adenosine stress cardiovascular magnetic resonance with variable-density spiral pulse sequences accurately detects coronary artery disease: initial clinical evaluation.

    PubMed

    Salerno, Michael; Taylor, Angela; Yang, Yang; Kuruvilla, Sujith; Ragosta, Michael; Meyer, Craig H; Kramer, Christopher M

    2014-07-01

    Adenosine stress cardiovascular magnetic resonance perfusion imaging can be limited by motion-induced dark-rim artifacts, which may be mistaken for true perfusion abnormalities. A high-resolution variable-density spiral pulse sequence with a novel density compensation strategy has been shown to reduce dark-rim artifacts in first-pass perfusion imaging. We aimed to assess the clinical performance of adenosine stress cardiovascular magnetic resonance using this new perfusion sequence to detect obstructive coronary artery disease. Cardiovascular magnetic resonance perfusion imaging was performed during adenosine stress (140 μg/kg per minute) and at rest on a Siemens 1.5-T Avanto scanner in 41 subjects with chest pain scheduled for coronary angiography. Perfusion images were acquired during injection of 0.1 mmol/kg Gadolinium-diethylenetriaminepentacetate at 3 short-axis locations using a saturation recovery interleaved variable-density spiral pulse sequence. Significant stenosis was defined as >50% by quantitative coronary angiography. Two blinded reviewers evaluated the perfusion images for the presence of adenosine-induced perfusion abnormalities and assessed image quality using a 5-point scale (1 [poor] to 5 [excellent]). The prevalence of obstructive coronary artery disease by quantitative coronary angiography was 68%. The average sensitivity, specificity, and accuracy were 89%, 85%, and 88%, respectively, with a positive predictive value and negative predictive value of 93% and 79%, respectively. The average image quality score was 4.4±0.7, with only 1 study with more than mild dark-rim artifacts. There was good inter-reader reliability with a κ statistic of 0.67. Spiral adenosine stress cardiovascular magnetic resonance results in high diagnostic accuracy for the detection of obstructive coronary artery disease with excellent image quality and minimal dark-rim artifacts. © 2014 American Heart Association, Inc.
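    The relationship between the accuracy figures reported above can be sketched from a 2x2 confusion table. The counts used below (25 true positives, 2 false positives, 11 true negatives, 3 false negatives) are hypothetical: they were chosen only because they sum to 41 subjects and reproduce the rounded percentages in the abstract, which reports averages over two readers and does not give the underlying table. The function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from confusion counts."""
    total = tp + fp + tn + fn
    return {
        "prevalence": (tp + fn) / total,   # fraction with disease
        "sensitivity": tp / (tp + fn),     # diseased correctly flagged
        "specificity": tn / (tn + fp),     # non-diseased correctly cleared
        "accuracy": (tp + tn) / total,
        "ppv": tp / (tp + fp),             # positive predictive value
        "npv": tn / (tn + fn),             # negative predictive value
    }


# Hypothetical counts, consistent with (but not taken from) the abstract.
metrics = diagnostic_metrics(tp=25, fp=2, tn=11, fn=3)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```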

  16. Structural diversity of domain superfamilies in the CATH database.

    PubMed

    Reeves, Gabrielle A; Dallman, Timothy J; Redfern, Oliver C; Akpor, Adrian; Orengo, Christine A

    2006-07-14

    The CATH database of domain structures has been used to explore the structural variation of homologous domains in 294 well populated domain structure superfamilies, each containing at least three sequence diverse relatives. Our analyses confirm some previously detected trends relating sequence divergence to structural variation but for a much larger dataset and in some superfamilies the new data reveal exceptional structural variation. Use of a new algorithm (2DSEC) to analyse variability in secondary structure compositions across a superfamily sheds new light on how structures evolve. 2DSEC detects inserted secondary structures that embellish the core of conserved secondary structures found throughout the superfamily. Analysis showed that for 56% of highly populated superfamilies (>9 sequence diverse relatives), there are twofold or more increases in the numbers of secondary structures in some relatives. In some families fivefold increases occur, sometimes modifying the fold of the domain. Manual inspection of secondary structure insertions or embellishments in 48 particularly variable superfamilies revealed that although these insertions were usually discontiguous in the sequence they were often co-located in 3D resulting in a larger structural motif that often modified the geometry of the active site or the surface conformation promoting diverse domain partnerships and protein interactions. These observations, supported by automatic analysis of all well populated CATH families, suggest that accretion of small secondary structure insertions may provide a simple mechanism for evolving new functions in diverse relatives. Some layered domain architectures (e.g. mainly-beta and alpha-beta sandwiches) that recur highly in the genomes more frequently exploit these types of embellishments to modify function. In these architectures, aggregation occurs most often at the edges, top or bottom of the beta-sheets. Information on structural variability across domain superfamilies has been made available through the CATH Dictionary of Homologous Structures (DHS).

  17. Novel chytrid lineages dominate fungal sequences in diverse marine and freshwater habitats

    NASA Astrophysics Data System (ADS)

    Comeau, André M.; Vincent, Warwick F.; Bernier, Louis; Lovejoy, Connie

    2016-07-01

    In aquatic environments, fungal communities remain little studied despite their taxonomic and functional diversity. To extend the ecological coverage of this group, we conducted an in-depth analysis of fungal sequences within our collection of 3.6 million V4 18S rRNA pyrosequences originating from 319 individual marine (including sea-ice) and freshwater samples from libraries generated within diverse projects studying Arctic and temperate biomes in the past decade. Among the ~1.7 million post-filtered reads of highest taxonomic and phylogenetic quality, 23,263 fungal sequences were identified. The overall mean proportion was 1.35%, but with large variability; for example, from 0.01 to 59% of total sequences for Arctic seawater samples. Almost all sample types were dominated by Chytridiomycota-like sequences, followed by moderate-to-minor contributions of Ascomycota, Cryptomycota and Basidiomycota. Species and/or strain richness was high, with many novel sequences and high niche separation. The affinity of the most common reads to phytoplankton parasites suggests that aquatic fungi deserve renewed attention for their role in algal succession and carbon cycling.

  18. Analysis of ChimeriVax Japanese Encephalitis Virus envelope for T-cell epitopes and comparison to circulating strain sequences.

    PubMed

    De Groot, Anne S; Martin, William; Moise, Leonard; Guirakhoo, Farshad; Monath, Thomas

    2007-11-19

    T-cell epitope variability is associated with viral immune escape and may influence the outcome of vaccination against the highly variable Japanese Encephalitis Virus (JEV). We computationally analyzed the ChimeriVax-JEV vaccine envelope sequence for T helper epitopes that are conserved in 12 circulating JEV strains and discovered 75% conservation among putative epitopes. Among non-identical epitopes, only minor amino acid changes that would not significantly affect HLA-binding were present. Therefore, in most cases, circulating strain epitopes could be restricted by the same HLA and are likely to stimulate a cross-reactive T-cell response. Based on this analysis, we predict no significant abrogation of ChimeriVax-JEV-conferred protection against circulating JEV strains.

  19. A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics

    PubMed Central

    House, John S.; Grimm, Fabian A.; Jima, Dereje D.; Zhou, Yi-Hui; Rusyn, Ivan; Wright, Fred A.

    2017-01-01

    Cell-based assays are an attractive option to measure gene expression response to exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing adds variability due to variable transcript length and amplification. Targeted probe-sequencing technologies such as TempO-Seq, with transcriptomic representation that can vary from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose–response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe here and make available a pipeline for handling expression data generated by TempO-Seq to align reads, clean and normalize raw count data, identify differentially expressed genes, and calculate transcriptomic concentration–response points of departure. The methods are extensible to other forms of concentration–response gene-expression data, and we discuss the utility of the methods for assessing variation in susceptibility and the diseased cellular state. PMID:29163636

  20. Using variable rate models to identify genes under selection in sequence pairs: their validity and limitations for EST sequences.

    PubMed

    Church, Sheri A; Livingstone, Kevin; Lai, Zhao; Kozik, Alexander; Knapp, Steven J; Michelmore, Richard W; Rieseberg, Loren H

    2007-02-01

    Using likelihood-based variable selection models, we determined if positive selection was acting on 523 EST sequence pairs from two lineages of sunflower and lettuce. Variable rate models are generally not used for comparisons of sequence pairs due to the limited information and the inaccuracy of estimates of specific substitution rates. However, previous studies have shown that the likelihood ratio test (LRT) is reliable for detecting positive selection, even with low numbers of sequences. These analyses identified 56 genes that show a signature of selection, of which 75% were not identified by simpler models that average selection across codons. Subsequent mapping studies in sunflower show four of five of the positively selected genes identified by these methods mapped to domestication QTLs. We discuss the validity and limitations of using variable rate models for comparisons of sequence pairs, as well as the limitations of using ESTs for identification of positively selected genes.

  1. Complete mitochondrial genome sequences from five Eimeria species (Apicomplexa; Coccidia; Eimeriidae) infecting domestic turkeys

    PubMed Central

    2014-01-01

    Background Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Methods Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Results Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp. The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common). No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Conclusions Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology. PMID:25034633
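    Pairwise sequence identity, as quoted in the record above, can be computed from two rows of an alignment. The sketch below is one common convention and is an assumption rather than the authors' stated method: columns where both sequences have a gap are skipped, and a gap opposite a residue counts as a mismatch. The function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Convention assumed here: gap-gap columns are ignored, and a gap
    opposite a residue is counted as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap and y == gap:
            continue  # column carries no information for this pair
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments with a single indel.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

    Other conventions (for example, excluding every gapped column from the denominator) give different percentages for indel-rich regions, so reported identities are only comparable when the convention is known.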

  2. Complete mitochondrial genome sequences from five Eimeria species (Apicomplexa; Coccidia; Eimeriidae) infecting domestic turkeys.

    PubMed

    Ogedengbe, Mosun E; El-Sherry, Shiem; Whale, Julia; Barta, John R

    2014-07-17

    Clinical and subclinical coccidiosis is cosmopolitan and inflicts significant losses to the poultry industry globally. Seven named Eimeria species are responsible for coccidiosis in turkeys: Eimeria dispersa; Eimeria meleagrimitis; Eimeria gallopavonis; Eimeria meleagridis; Eimeria adenoeides; Eimeria innocua; and, Eimeria subrotunda. Although attempts have been made to characterize these parasites molecularly at the nuclear 18S rDNA and ITS loci, the maternally-derived and mitotically replicating mitochondrial genome may be more suited for species level molecular work; however, only limited sequence data are available for Eimeria spp. infecting turkeys. The purpose of this study was to sequence and annotate the complete mitochondrial genomes from 5 Eimeria species that commonly infect the domestic turkey (Meleagris gallopavo). Six single-oocyst derived cultures of five Eimeria species infecting turkeys were PCR-amplified and sequenced completely prior to detailed annotation. Resulting sequences were aligned and used in phylogenetic analyses (BI, ML, and MP) that included complete mitochondrial genomes from 16 Eimeria species or concatenated CDS sequences from each genome. Complete mitochondrial genome sequences were obtained for Eimeria adenoeides Guelph, 6211 bp; Eimeria dispersa Briston, 6238 bp; Eimeria meleagridis USAR97-01, 6212 bp; Eimeria meleagrimitis USMN08-01, 6165 bp; Eimeria gallopavonis Weybridge, 6215 bp; and Eimeria gallopavonis USKS06-01, 6215 bp. The order, orientation and CDS lengths of the three protein coding genes (COI, COIII and CytB) as well as rDNA fragments encoding ribosomal large and small subunit rRNA were conserved among all sequences. Pairwise sequence identities between species ranged from 88.1% to 98.2%; sequence variability was concentrated within CDS or between rDNA fragments (where indels were common). No phylogenetic reconstruction supported monophyly of Eimeria species infecting turkeys; Eimeria dispersa may have arisen via host switching from another avian host. Phylogenetic analyses suggest E. necatrix and E. tenella are related distantly to other Eimeria of chickens. Mitochondrial genomes of Eimeria species sequenced to date are highly conserved with regard to gene content and structure. Nonetheless, complete mitochondrial genome sequences and, particularly the three CDS, possess sufficient sequence variability for differentiating Eimeria species of poultry. The mitochondrial genome sequences are highly suited for molecular diagnostics and phylogenetics of coccidia and, potentially, genetic markers for molecular epidemiology.

  3. Species Identification of Bovine, Ovine and Porcine Type 1 Collagen; Comparing Peptide Mass Fingerprinting and LC-Based Proteomics Methods.

    PubMed

    Buckley, Mike

    2016-03-24

    Collagen is one of the most ubiquitous proteins in the animal kingdom and the dominant protein in extracellular tissues such as bone, skin and other connective tissues in which it acts primarily as a supporting scaffold. It has been widely investigated scientifically, not only as a biomedical material for regenerative medicine, but also for its role as a food source for both humans and livestock. Due to the long-term stability of collagen, as well as its abundance in bone, it has been proposed as a source of biomarkers for species identification not only for heat- and pressure-rendered animal feed but also in ancient archaeological and palaeontological specimens, typically carried out by peptide mass fingerprinting (PMF) as well as in-depth liquid chromatography (LC)-based tandem mass spectrometric methods. Through the analysis of the three most common domesticate species, cow, sheep, and pig, this research investigates the advantages of each approach over the other, investigating sites of sequence variation with known functional properties of the collagen molecule. Results indicate that the previously identified species biomarkers through PMF analysis are not among the most variable type 1 collagen peptides present in these tissues, the latter of which can be detected by LC-based methods. However, it is clear that the highly repetitive sequence motif of collagen throughout the molecule, combined with the variability of the sites and relative abundance levels of hydroxylation, can result in high scoring false positive peptide matches using these LC-based methods. Additionally, the greater alpha 2(I) chain sequence variation, in comparison to the alpha 1(I) chain, did not appear to be specific to any particular functional properties, implying that intra-chain functional constraints on sequence variation are not as great as inter-chain constraints. However, although some of the most variable peptides were only observed in LC-based methods, until the range of publicly available collagen sequences improves, the simplicity of the PMF approach and suitable range of peptide sequence variation observed makes it the ideal method for initial taxonomic identification prior to further analysis by LC-based methods only when required.

  4. Structure of Infaunal Communities on the Beaufort Sea Shelf and Slope: Insights from Morphological and Environmental DNA Sequencing Approaches

    NASA Astrophysics Data System (ADS)

    Hardy, S. M.; Bik, H.; Walker, A.; Sharma, J.; Blanchard, A.

    2016-02-01

    Rapid change is occurring in the Arctic concurrently with increased human activity, yet our knowledge of the structure and function of high-Arctic sediment communities is still rudimentary. The Beaufort Sea is particularly poorly sampled, and largely unexplored at slope depths, providing little information with which to assess the impacts of petroleum exploration activities now beginning in this area. We are investigating diversity and community structure of meio- and macrobenthic infauna on the continental shelf and slope of the Beaufort Sea across a range of depths (50 to 1000 m) using traditional taxonomic and environmental DNA sequencing approaches, and comparing results to additional sites in the adjacent NE Chukchi Sea petroleum lease-sale area. The Beaufort slope is topographically complex and characterized by an east-west gradient in benthic habitat characteristics, with heavy input of terrestrial organic matter particularly in the region of the Mackenzie River delta. Warmer, saltier subsurface Atlantic water masses impact benthic communities at mid-slope depths, likely influencing turnover in community structure observed with depth. Food resources are variable across the region, with very high sediment chlorophyll concentrations at 350 m depth in some areas. Differences in nematode assemblages were detected across the Beaufort Sea shelf/slope, across depths within the Beaufort Sea, and between the Beaufort and adjacent NE Chukchi Sea. These differences were apparent in both morphological and environmental sequencing data. Macrofaunal communities showed variable community structure among transects, with high abundance and high dominance in polychaete assemblages coincident with the chlorophyll maximum. Sequencing data also revealed an abundance of protists in sediments which have been mostly ignored in studies of ecosystem dynamics in this region, and may represent an important component of the food web.

  5. Transcriptome sequencing of diverse peanut (arachis) wild species and the cultivated species reveals a wealth of untapped genetic variability

    USDA-ARS's Scientific Manuscript database

    Next generation sequencing technologies and improved bioinformatics methods have provided opportunities to study sequence variability in complex polyploid transcriptomes. In this study, we used a diverse panel of twenty-two Arachis accessions representing seven Arachis hypogaea market classes, A-, B...

  6. Salmonella enterica Prophage Sequence Profiles Reflect Genome Diversity and Can Be Used for High Discrimination Subtyping.

    PubMed

    Mottawea, Walid; Duceppe, Marc-Olivier; Dupras, Andrée A; Usongo, Valentine; Jeukens, Julie; Freschi, Luca; Emond-Rheault, Jean-Guillaume; Hamel, Jeremie; Kukavica-Ibrulj, Irena; Boyle, Brian; Gill, Alexander; Burnett, Elton; Franz, Eelco; Arya, Gitanjali; Weadge, Joel T; Gruenheid, Samantha; Wiedmann, Martin; Huang, Hongsheng; Daigle, France; Moineau, Sylvain; Bekal, Sadjia; Levesque, Roger C; Goodridge, Lawrence D; Ogunremi, Dele

    2018-01-01

    Non-typhoidal Salmonella is a leading cause of foodborne illness worldwide. Prompt and accurate identification of the sources of Salmonella responsible for disease outbreaks is crucial to minimize infections and eliminate ongoing sources of contamination. Current subtyping tools including single nucleotide polymorphism (SNP) typing may be inadequate, in some instances, to provide the required discrimination among epidemiologically unrelated Salmonella strains. Prophage genes represent the majority of the accessory genes in bacteria genomes and have potential to be used as high discrimination markers in Salmonella . In this study, the prophage sequence diversity in different Salmonella serovars and genetically related strains was investigated. Using whole genome sequences of 1,760 isolates of S. enterica representing 151 Salmonella serovars and 66 closely related bacteria, prophage sequences were identified from assembled contigs using PHASTER. We detected 154 different prophages in S. enterica genomes. Prophage sequences were highly variable among S. enterica serovars with a median ± interquartile range (IQR) of 5 ± 3 prophage regions per genome. While some prophage sequences were highly conserved among the strains of specific serovars, few regions were lineage specific. Therefore, strains belonging to each serovar could be clustered separately based on their prophage content. Analysis of S . Enteritidis isolates from seven outbreaks generated distinct prophage profiles for each outbreak. Taken altogether, the diversity of the prophage sequences correlates with genome diversity. Prophage repertoires provide an additional marker for differentiating S. enterica subtypes during foodborne outbreaks.

  7. Common 5S rRNA variants are likely to be accepted in many sequence contexts

    NASA Technical Reports Server (NTRS)

    Zhang, Zhengdong; D'Souza, Lisa M.; Lee, Youn-Hyung; Fox, George E.

    2003-01-01

    Over evolutionary time RNA sequences which are successfully fixed in a population are selected from among those that satisfy the structural and chemical requirements imposed by the function of the RNA. These sequences together comprise the structure space of the RNA. In principle, a comprehensive understanding of RNA structure and function would make it possible to enumerate which specific RNA sequences belong to a particular structure space and which do not. We are using bacterial 5S rRNA as a model system to attempt to identify principles that can be used to predict which sequences do or do not belong to the 5S rRNA structure space. One promising idea is the very intuitive notion that frequently seen sequence changes in an aligned data set of naturally occurring 5S rRNAs would be widely accepted in many other 5S rRNA sequence contexts. To test this hypothesis, we first developed well-defined operational definitions for a Vibrio region of the 5S rRNA structure space and what is meant by a highly variable position. Fourteen sequence variants (10 point changes and 4 base-pair changes) were identified in this way, which, by the hypothesis, would be expected to incorporate successfully in any of the known sequences in the Vibrio region. All 14 of these changes were constructed and separately introduced into the Vibrio proteolyticus 5S rRNA sequence where they are not normally found. Each variant was evaluated for its ability to function as a valid 5S rRNA in an E. coli cellular context. It was found that 93% (13/14) of the variants tested are likely valid 5S rRNAs in this context. In addition, seven variants were constructed that, although present in the Vibrio region, did not meet the stringent criteria for a highly variable position. In this case, 86% (6/7) are likely valid. As a control we also examined seven variants that are seldom or never seen in the Vibrio region of 5S rRNA sequence space. In this case only two of seven were found to be potentially valid. 
    The results demonstrate that changes that occur multiple times in a local region of RNA sequence space in fact usually will be accepted in any sequence context in that same local region.

  8. Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

    PubMed

    Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li

    2015-10-16

    The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by difficulties. One is ambiguous mapping of reads to reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models by all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives. 
    Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as freely available software, which can be found at https://github.com/PUGEA/NLDMseq.
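
    As an illustration of the "direct RPKM calculation" that the abstract above contrasts with NLDMseq, the following is a minimal sketch of the standard RPKM formula (reads per kilobase of transcript per million mapped reads). It is not taken from NLDMseq; the function name and the example numbers are hypothetical. The formula treats reads as uniformly distributed along the transcript, which is the assumption the abstract says is violated in practice.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of transcript per million mapped reads.

    Illustrative helper (hypothetical name), not part of NLDMseq.
    Assumes reads are spread uniformly along the transcript.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
print(example_value)
```

    Because the value depends only on a read count, a length, and a library size, it cannot absorb isoform- or exon-specific biases; that is the gap the latent variable model described above is meant to address.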

  9. Haplotype block structure study of the CFTR gene. Most variants are associated with the M470 allele in several European populations.

    PubMed

    Pompei, Fiorenza; Ciminelli, Bianca Maria; Bombieri, Cristina; Ciccacci, Cinzia; Koudova, Monika; Giorgi, Silvia; Belpinati, Francesca; Begnini, Angela; Cerny, Milos; Des Georges, Marie; Claustres, Mireille; Ferec, Claude; Macek, Milan; Modiano, Guido; Pignatti, Pier Franco

    2006-01-01

    An average of about 1700 CFTR (cystic fibrosis transmembrane conductance regulator) alleles from normal individuals from different European populations were extensively screened for DNA sequence variation. A total of 80 variants were observed: 61 coding SNSs (results already published), 13 noncoding SNSs, three STRs, two short deletions, and one nucleotide insertion. Eight DNA variants were classified as non-CF causing due to their high frequency of occurrence. Through this survey CFTR has become the most exhaustively studied gene for its coding sequence variability and, though to a lesser extent, for its noncoding sequence variability as well. Interestingly, most variation was associated with the M470 allele, while the V470 allele showed an 'extended haplotype homozygosity' (EHH). These findings lead us to suggest a role for selection acting either on M470V itself or through a hitchhiking mechanism involving a second site. The possible ancient origin of the V allele in an 'out of Africa' time frame is discussed.

  10. Selection and Validation of a Multilocus Variable-Number Tandem-Repeat Analysis Panel for Typing Shigella spp.

    PubMed Central

    Gorgé, Olivier; Lopez, Stéphanie; Hilaire, Valérie; Lisanti, Olivier; Ramisse, Vincent; Vergnaud, Gilles

    2008-01-01

    The Shigella genus has historically been separated into four species, based on biochemical assays. The classification within each species relies on serotyping. Recently, genome sequencing and DNA assays, in particular the multilocus sequence typing (MLST) approach, greatly improved the current knowledge of the origin and phylogenetic evolution of Shigella spp. The Shigella and Escherichia genera are now considered to belong to a unique genomospecies. Multilocus variable-number tandem-repeat (VNTR) analysis (MLVA) provides valuable polymorphic markers for genotyping and performing phylogenetic analyses of highly homogeneous bacterial pathogens. Here, we assess the capability of MLVA for Shigella typing. Thirty-two potentially polymorphic VNTRs were selected by analyzing in silico five Shigella genomic sequences and subsequently evaluated. Eventually, a panel of 15 VNTRs was selected (i.e., MLVA15 analysis). MLVA15 analysis of 78 strains or genome sequences of Shigella spp. and 11 strains or genome sequences of Escherichia coli distinguished 83 genotypes. Shigella population cluster analysis gave consistent results compared to MLST. MLVA15 analysis showed capabilities for E. coli typing, providing classification among pathogenic and nonpathogenic E. coli strains included in the study. The resulting data can be queried on our genotyping webpage (http://mlva.u-psud.fr). The MLVA15 assay is rapid, highly discriminatory, and reproducible for Shigella and Escherichia strains, suggesting that it could significantly contribute to epidemiological trace-back analysis of Shigella infections and pathogenic Escherichia outbreaks. Typing was performed on strains obtained mostly from collections. Further studies should include strains of much more diverse origins, including all pathogenic E. coli types. PMID:18216214

  11. Equine infectious anemia virus in naturally infected horses from the Brazilian Pantanal.

    PubMed

    Cursino, Andreia Elisa; Vilela, Ana Paula Pessoa; Franco-Luiz, Ana Paula Moreira; de Oliveira, Jaquelline Germano; Nogueira, Márcia Furlan; Júnior, João Pessoa Araújo; de Aguiar, Daniel Moura; Kroon, Erna Geessien

    2018-05-11

    Equine infectious anemia (EIA) has a worldwide distribution and is widespread in Brazil. The Brazilian Pantanal has a high prevalence, compromising equine performance and, indirectly, the livestock industry, since horses are used for cattle management. Although EIA is routinely diagnosed by the agar gel immunodiffusion test (AGID), this serological assay has some limitations, so PCR-based detection methods have the potential to overcome these limitations and act as complementary tests to those currently used. Considering the limited number of equine infectious anemia virus (EIAV) sequences available in public databases and the great genome variability, studies of EIAV detection and molecular characterization remain important. In this study we detected EIAV proviral DNA from 23 peripheral blood mononuclear cell (PBMC) samples of naturally infected horses from the Brazilian Pantanal using a semi-nested PCR (sn-PCR). The serological profile of the animals was also evaluated by AGID and ELISA for gp90 and p26. Furthermore, the EIAV PCR-amplified DNA was sequenced and phylogenetically analyzed. Here we describe the first EIAV sequences of the 5' LTR of the tat gene in naturally infected horses from Brazil, which presented 91% similarity to EIAV reference sequences. The Brazilian EIAV sequences also presented variable nucleotide similarities among themselves, ranging from 93.5% to 100%. Phylogenetic analysis showed that Brazilian EIAV sequences grouped in a separate clade relative to other reference sequences. Thus this molecular detection and characterization may provide information about EIAV circulation in Brazilian territories and improve phylogenetic inferences.

  12. Advanced colorectal adenoma related gene expression signature may predict prognostic for colorectal cancer patients with adenoma-carcinoma sequence.

    PubMed

    Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun

    2015-01-01

    There are still no absolute parameters predicting progression of adenoma into cancer. The present study aimed to characterize functional differences on the multistep carcinogenetic process from the adenoma-carcinoma sequence. All samples were collected and mRNA expression profiling was performed by using Agilent Microarray high-throughput gene-chip technology. Then, the characteristics of mRNA expression profiles of adenoma-carcinoma sequence were described with bioinformatics software, and we analyzed the relationship between gene expression profiles of adenoma-adenocarcinoma sequence and clinical prognosis of colorectal cancer. The mRNA expressions of adenoma-carcinoma sequence were significantly different between high-grade intraepithelial neoplasia group and adenocarcinoma group. The biological process of gene ontology function enrichment analysis on differentially expressed genes between high-grade intraepithelial neoplasia group and adenocarcinoma group showed that genes enriched in the extracellular structure organization, skeletal system development, biological adhesion and itself regulated growth regulation, with the P value after FDR correction of less than 0.05. In addition, IPR-related protein mainly focused on the insulin-like growth factor binding proteins. The variable trends of gene expression profiles for adenoma-carcinoma sequence were mainly concentrated in high-grade intraepithelial neoplasia and adenocarcinoma. The differentially expressed genes are significantly correlated between high-grade intraepithelial neoplasia group and adenocarcinoma group. Bioinformatics analysis is an effective way to study the gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool to involve colorectal cancer research strategy into colorectal adenoma or advanced adenoma.

  13. The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses

    PubMed Central

    Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin

    2012-01-01

    Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55 000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52–61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38–D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov. PMID:22064861

  14. Gate sequence for continuous variable one-way quantum computation

    PubMed Central

    Su, Xiaolong; Hao, Shuhong; Deng, Xiaowei; Ma, Lingyu; Wang, Meihong; Jia, Xiaojun; Xie, Changde; Peng, Kunchi

    2013-01-01

    Measurement-based one-way quantum computation using cluster states as resources provides an efficient model to perform computation and information processing of quantum codes. Arbitrary Gaussian quantum computation can be implemented sufficiently by long single-mode and two-mode gate sequences. However, continuous variable gate sequences have not been realized so far due to an absence of cluster states larger than four submodes. Here we present the first continuous variable gate sequence consisting of a single-mode squeezing gate and a two-mode controlled-phase gate based on a six-mode cluster state. The quantum property of this gate sequence is confirmed by the fidelities and the quantum entanglement of two output modes, which depend on both the squeezing and controlled-phase gates. The experiment demonstrates the feasibility of implementing Gaussian quantum computation by means of accessible gate sequences.

  15. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project.

    PubMed

    Yebra, Gonzalo; Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R Bridget; Waters, Laura; Tong, C Y William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J

    2018-01-01

    The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
    The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters.

  16. Intraspecific ITS Variability in the Kingdom Fungi as Expressed in the International Sequence Databases and Its Implications for Molecular Species Identification

    PubMed Central

    Nilsson, R. Henrik; Kristiansson, Erik; Ryberg, Martin; Hallenberg, Nils; Larsson, Karl-Henrik

    2008-01-01

    The internal transcribed spacer (ITS) region of the nuclear ribosomal repeat unit is the most popular locus for species identification and subgeneric phylogenetic inference in sequence-based mycological research. The region is known to show certain variability even within species, although its intraspecific variability is often held to be limited and clearly separated from interspecific variability. The existence of such a divide between intra- and interspecific variability is implicitly assumed by automated approaches to species identification, but whether intraspecific variability indeed is negligible within the fungal kingdom remains contentious. The present study estimates the intraspecific ITS variability in all fungi presently available to the mycological community through the international sequence databases. Substantial differences were found within the kingdom, and the results are not easily correlated to the taxonomic affiliation or nutritional mode of the taxa considered. No single unifying yet stringent upper limit for intraspecific variability, such as the canonical 3% threshold, appears to be applicable with the desired outcome throughout the fungi. Our results caution against simplified approaches to automated ITS-based species delimitation and reiterate the need for taxonomic expertise in the translation of sequence data into species names. PMID:19204817

  17. Transcript expression and genetic variability analysis of caspases in breast carcinomas suggests CASP9 as the most interesting target.

    PubMed

    Brynychova, Veronika; Hlavac, Viktor; Ehrlichova, Marie; Vaclavikova, Radka; Nemcova-Furstova, Vlasta; Pecha, Vaclav; Trnkova, Marketa; Mrhalova, Marcela; Kodet, Roman; Vrana, David; Gatek, Jiri; Bendova, Marie; Vernerova, Zdenka; Kovar, Jan; Soucek, Pavel

    2017-01-01

    Apoptosis plays a critical role in cancer cell survival and tumor development. We provide a hypothesis-generating screen for further research by exploring the expression profile and genetic variability of caspases (2, 3, 7, 8, 9, and 10) in breast carcinoma patients. This study addressed isoform-specific caspase transcript expression and genetic variability in regulatory sequences of caspases 2 and 9. Gene expression profiling was performed by quantitative real-time PCR in tumor and paired non-malignant tissues of two independent groups of patients. Genetic variability was determined by high resolution melting, allelic discrimination, and sequencing analysis in tumor and peripheral blood lymphocyte DNA of the patients. CASP3 A+B and S isoforms were over-expressed in tumors of both patient groups. The CASP9 transcript was down-regulated in tumors of both groups of patients and significantly associated with expression of hormonal receptors and with the presence of rs4645978-rs2020903-rs4646034 haplotype in the CASP9 gene. Patients with a low intratumoral CASP9A/B isoform expression ratio (predicted to shift equilibrium towards anti-apoptotic isoform) subsequently treated with adjuvant chemotherapy had a significantly shorter disease-free survival than those with the high ratio (p=0.04). Inheritance of CC genotype of rs2020903 in CASP9 was associated with progesterone receptor expression in tumors (p=0.003). Genetic variability in CASP9 and expression of its splicing variants present targets for further study.

  18. Analysis of variable sites between two complete South China tiger (Panthera tigris amoyensis) mitochondrial genomes.

    PubMed

    Zhang, Wenping; Yue, Bisong; Wang, Xiaofang; Zhang, Xiuyue; Xie, Zhong; Liu, Nonglin; Fu, Wenyuan; Yuan, Yaohua; Chen, Daqing; Fu, Danghua; Zhao, Bo; Yin, Yuzhong; Yan, Xiahui; Wang, Xinjing; Zhang, Rongying; Liu, Jie; Li, Maoping; Tang, Yao; Hou, Rong; Zhang, Zhihe

    2011-10-01

    In order to investigate the mitochondrial genome of Panthera tigris amoyensis, two South China tigers (P25 and P27) were analyzed following 15 cymt-specific primer sets. The entire mtDNA sequence was found to be 16,957 bp and 17,001 bp long for P25 and P27 respectively, and this difference in length between P25 and P27 occurred in the number of tandem repeats in the RS-3 segment of the control region. The structural characteristics of complete P. t. amoyensis mitochondrial genomes were also highly similar to those of P. uncia. Additionally, the rate of point mutation was only 0.3% and a total of 59 variable sites between P25 and P27 were found. Out of the 59 variable sites, 6 were located in 6 different tRNA genes, 6 in the 2 rRNA genes, 7 in non-coding regions (one located between tRNA-Asn and tRNA-Tyr and six in the D-loop), and 40 in 10 protein-coding genes. COI held the largest amount of variable sites (9 sites) and Cytb contained the highest variable rate (0.7%) in the complete sequences. Moreover, out of the 40 variable sites located in 10 protein-coding genes, 12 sites were nonsynonymous.

  19. Managing the genomic revolution in cancer diagnostics.

    PubMed

    Nguyen, Doreen; Gocke, Christopher D

    2017-08-01

    Molecular tumor profiling is now a routine part of patient care, revealing targetable genomic alterations and molecularly distinct tumor subtypes with therapeutic and prognostic implications. The widespread adoption of next-generation sequencing technologies has greatly facilitated clinical implementation of genomic data and opened the door for high-throughput multigene-targeted sequencing. Herein, we discuss the variability of cancer genetic profiling currently offered by clinical laboratories, the challenges of applying rapidly evolving medical knowledge to individual patients, and the need for more standardized population-based molecular profiling.

  20. Three years of ULTRASPEC at the Thai 2.4-m telescope: Capabilities and scientific highlights

    NASA Astrophysics Data System (ADS)

    Yadav, Ram Kesh; Richichi, Andrea; Irawati, Puji; Dhillon, Vikram Singh; Marsh, Thomas R.; Soonthornthum, Boonrucksar

    2018-04-01

    High temporal resolution observations enable the study of rapid phenomena such as the flux variations in binary system objects, e.g. cataclysmic variables, compact binary systems, the flux variations in young star clusters, stellar occultations and more. The 2.4-m Thai National Telescope (TNT) is ideally suited for this niche research, being the largest facility in Southeast Asia and being equipped with ULTRASPEC, a high-speed imager based on a low-noise frame transfer electron-multiplying CCD. In the sub-window mode, ULTRASPEC can record uninterrupted sequences with frame times as short as a few milliseconds. We present some of the key results obtained in the area of high time resolution with ULTRASPEC. We also present the results of a recent worldwide campaign to observe the current series of lunar occultations of Aldebaran (α Tauri) carried out in close collaboration with the Devasthal facilities, the out-of-eclipse variations on the post common-envelope system J1021+1744, and pre-main-sequence variables in the young open cluster Stock 8.

  1. The actin multigene family and livestock speciation using the polymerase chain reaction.

    PubMed

    Fairbrother, K S; Hopwood, A J; Lockley, A K; Bardsley, R G

    1998-01-01

    Actins constitute a family of highly-conserved multifunctional intracellular proteins, best known as myofibrillar components in striated muscle fibres. Most vertebrate genomes contain numerous actin genes with high sequence homology in protein coding regions but considerable variability in intron number and sizes. This genetic diversity can be utilised for livestock speciation purposes. The high sequence conservation has enabled a single pair of oligonucleotides to be used to prime the polymerase chain reaction (PCR) with DNA extracted from all animals so far studied. Multiple amplification products were obtained which on gel electrophoresis constituted characteristic species-specific 'fingerprints'. The patterns were reproducible, did not vary between individuals of the same breed or between different breeds within a species, and could be generated even from heat-processed muscle held at 120 degrees C for one hour. Given the capacity of PCR to amplify relatively short sequences in highly-degraded DNA, this approach may be suitable for authentication of processed meat products.

  2. The nuclear 18S ribosomal RNA gene as a source of phylogenetic information in the genus Taenia.

    PubMed

    Yan, Hongbin; Lou, Zhongzi; Li, Li; Ni, Xingwei; Guo, Aijiang; Li, Hongmin; Zheng, Yadong; Dyachenko, Viktor; Jia, Wanzhong

    2013-03-01

    Most species of the genus Taenia are of considerable medical and veterinary significance. In this study, complete nuclear 18S rRNA gene sequences were obtained from seven members of genus Taenia [Taenia multiceps, Taenia saginata, Taenia asiatica, Taenia solium, Taenia pisiformis, Taenia hydatigena, and Taenia taeniaeformis] and a phylogeny inferred using these sequences. Most of the variable sites fall within the variable regions, V1-V5. We show that sequences from the nuclear 18S ribosomal RNA gene have considerable promise as sources of phylogenetic information within the genus Taenia. Furthermore, given that almost all the variable sites lie within defined variable portions of that gene, it will be appropriate and economical to sequence only those regions for additional species of Taenia.

  3. Characteristics, stratigraphic architecture, and time framework of multi-order mixed siliciclastic and carbonate depositional sequences, outcropping Cisco Group (Late Pennsylvanian and Early Permian), Eastern Shelf, north-central Texas, USA

    NASA Astrophysics Data System (ADS)

    Yang, Wan; Kominz, Michelle A.

    2003-01-01

    The Cisco Group on the Eastern Shelf of the Midland Basin is composed of fluvial, deltaic, shelf, shelf-margin, and slope-to-basin carbonate and siliciclastic rocks. Sedimentologic and stratigraphic analyses of 181 meter-to-decimeter-scale depositional sequences exposed in the up-dip shelf indicated that the siliciclastic and carbonate parasequences in the transgressive systems tracts (TST) are thin and upward deepening, whereas those in highstand systems tracts (HST) are thick and upward shallowing. The sequences can be subdivided into five types on the basis of principal lithofacies, and exhibit variable magnitude of facies shift corresponding to variable extents of marine transgression and regression on the shelf. The sequence stacking patterns and their regional persistence suggest a three-level sequence hierarchy controlled by eustasy, whereas local and regional changes in lithology, thickness, and sequence type, magnitude, and absence were controlled by interplay of eustasy, differential shelf subsidence, depositional topography, and pattern of siliciclastic supply. The outcropping Cisco Group is highly incomplete with an estimated 6-11% stratigraphic completeness. The average duration of deposition of the major (third-order) sequences is estimated as 67-102 ka on the up-dip shelf and increases down dip, while the average duration of the major sequence boundaries (SB) is estimated as 831-1066 ka and decreases down dip. The nondepositional and erosional hiatus on the up-dip shelf was represented by lowstand deltaic systems in the basin and slope.

  4. The genetic basis of adaptive pigmentation variation in Drosophila melanogaster

    PubMed Central

    Pool, John E.; Aquadro, Charles F.

    2009-01-01

    In a broad survey of Drosophila melanogaster population samples, levels of abdominal pigmentation were found to be highly variable and geographically differentiated. A strong positive correlation was found between dark pigmentation and high altitude, suggesting adaptation to specific environments. DNA sequence polymorphism at the candidate gene ebony revealed a clear association with the pigmentation of homozygous third chromosome lines. The darkest lines sequenced had nearly identical haplotypes spanning 14.5 kilobases upstream of the protein-coding exons of ebony. Thus, natural selection may have elevated the frequency of an allele that confers dark abdominal pigmentation by influencing the regulation of ebony. PMID:17614900

  5. Dissociable effects of practice variability on learning motor and timing skills.

    PubMed

    Caramiaux, Baptiste; Bevilacqua, Frédéric; Wanderley, Marcelo M; Palmer, Caroline

    2018-01-01

    Motor skill acquisition inherently depends on the way one practices the motor task. The amount of motor task variability during practice has been shown to foster transfer of the learned skill to other similar motor tasks. In addition, variability in a learning schedule, in which a task and its variations are interweaved during practice, has been shown to help the transfer of learning in motor skill acquisition. However, there is little evidence on how motor task variations and variability schedules during practice act on the acquisition of complex motor skills such as music performance, in which a performer learns both the right movements (motor skill) and the right time to perform them (timing skill). This study investigated the impact of rate (tempo) variability and the schedule of tempo change during practice on timing and motor skill acquisition. Complete novices, with no musical training, practiced a simple musical sequence on a piano keyboard at different rates. Each novice was assigned to one of four learning conditions designed to manipulate the amount of tempo variability across trials (large or small tempo set) and the schedule of tempo change (randomized or non-randomized order) during practice. At test, the novices performed the same musical sequence at a familiar tempo and at novel tempi (testing tempo transfer), as well as two novel (but related) sequences at a familiar tempo (testing spatial transfer). We found that practice conditions had little effect on learning and transfer performance of timing skill. Interestingly, practice conditions influenced motor skill learning (reduction of movement variability): lower temporal variability during practice facilitated transfer to new tempi and new sequences; non-randomized learning schedule improved transfer to new tempi and new sequences. Tempo (rate) and the sequence difficulty (spatial manipulation) affected performance variability in both timing and movement. 
    These findings suggest that there is a dissociable effect of practice variability on learning complex skills that involve both motor and timing constraints.

  6. DNA Extraction Protocols for Whole-Genome Sequencing in Marine Organisms.

    PubMed

    Panova, Marina; Aronsson, Henrik; Cameron, R Andrew; Dahl, Peter; Godhe, Anna; Lind, Ulrika; Ortega-Martinez, Olga; Pereyra, Ricardo; Tesson, Sylvie V M; Wrange, Anna-Lisa; Blomberg, Anders; Johannesson, Kerstin

    2016-01-01

    The marine environment harbors a large proportion of the total biodiversity on this planet, including the majority of the Earth's different phyla and classes. Studying the genomes of marine organisms can bring interesting insights into genome evolution. Today, almost all marine organismal groups are understudied with respect to their genomes. One potential reason is that extraction of high-quality DNA in sufficient amounts is challenging for many marine species. This is due to high polysaccharide content, polyphenols and other secondary metabolites that will inhibit downstream DNA library preparations. Consequently, protocols developed for vertebrates and plants do not always perform well for invertebrates and algae. In addition, many marine species have large population sizes and, as a consequence, highly variable genomes. Thus, to facilitate the sequence read assembly process during genome sequencing, it is desirable to obtain enough DNA from a single individual, which is a challenge in many species of invertebrates and algae. Here, we present DNA extraction protocols for seven marine species (four invertebrates, two algae, and a marine yeast), optimized to provide sufficient DNA quality and yield for de novo genome sequencing projects.

  7. A map of human microRNA variation uncovers unexpectedly high levels of variability

    PubMed Central

    2012-01-01

    Background MicroRNAs (miRNAs) are key components of the gene regulatory network in many species. During the past few years, these regulatory elements have been shown to be involved in an increasing number and range of diseases. Consequently, the compilation of a comprehensive map of natural variability in a healthy population seems an obvious requirement for future research on miRNA-related pathologies. Methods Data on 14 populations from the 1000 Genomes Project were analyzed, along with new data extracted from 60 exomes of healthy individuals from a population from southern Spain, sequenced in the context of the Medical Genome Project, to derive an accurate map of miRNA variability. Results Despite the common belief that miRNAs are highly conserved elements, analysis of the sequences of the 1,152 individuals indicated that the observed level of variability is double what was expected. A total of 527 variants were found. Among these, 45 variants affected the recognition region of the corresponding miRNA and were found in 43 different miRNAs, 26 of which are known to be involved in 57 diseases. Different parts of the mature structure of the miRNA were affected to different degrees by variants, which suggests the existence of a selective pressure related to the relative functional impact of the change. Moreover, 41 variants showed a significant deviation from the Hardy-Weinberg equilibrium, which supports the existence of a selective process against some alleles. The average number of variants per individual in miRNAs was 28. Conclusions Despite an expectation that miRNAs would be highly conserved genomic elements, our study reports a level of variability comparable to that observed for coding genes. PMID:22906193
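
    The abstract above reports variants that deviate from Hardy-Weinberg equilibrium but does not say which test was used. As a minimal sketch, assuming a biallelic variant and a plain chi-square goodness-of-fit test (one common choice, not necessarily the authors'), the deviation can be scored from genotype counts. The function name and the example counts are hypothetical.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic (1 degree of freedom) for deviation from
    Hardy-Weinberg proportions at a biallelic site.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical inputs).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    # Allele frequency of A estimated from the genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    # Expected genotype counts under Hardy-Weinberg equilibrium.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sites).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Illustrative counts only: the first set matches HWE exactly,
# the second has no heterozygotes at all.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

    A statistic above 3.84 corresponds to p < 0.05 at one degree of freedom; for rare variants an exact test is usually preferred over this approximation.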

  8. Characterizing the replicability of cell types defined by single cell RNA-sequencing data using MetaNeighbor.

    PubMed

    Crow, Megan; Paul, Anirban; Ballouz, Sara; Huang, Z Josh; Gillis, Jesse

    2018-02-28

    Single-cell RNA-sequencing (scRNA-seq) technology provides a new avenue to discover and characterize cell types; however, the experiment-specific technical biases and analytic variability inherent to current pipelines may undermine its replicability. Meta-analysis is further hampered by the use of ad hoc naming conventions. Here we demonstrate our replication framework, MetaNeighbor, that quantifies the degree to which cell types replicate across datasets, and enables rapid identification of clusters with high similarity. We first measure the replicability of neuronal identity, comparing results across eight technically and biologically diverse datasets to define best practices for more complex assessments. We then apply this to novel interneuron subtypes, finding that 24/45 subtypes have evidence of replication, which enables the identification of robust candidate marker genes. Across tasks we find that large sets of variably expressed genes can identify replicable cell types with high accuracy, suggesting a general route forward for large-scale evaluation of scRNA-seq data.

  9. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses

    PubMed Central

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-01-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. PMID:24462600

  10. Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses.

    PubMed

    Liu, Bo; Madduri, Ravi K; Sotomayor, Borja; Chard, Kyle; Lacinski, Lukasz; Dave, Utpal J; Li, Jianqiang; Liu, Chunchen; Foster, Ian T

    2014-06-01

    Due to the upcoming data deluge of genome data, the need for storing and processing large-scale genome data, easy access to biomedical analyses tools, efficient data sharing and retrieval has presented significant challenges. The variability in data volume results in variable computing and storage requirements, therefore biomedical researchers are pursuing more reliable, dynamic and convenient methods for conducting sequencing analyses. This paper proposes a Cloud-based bioinformatics workflow platform for large-scale next-generation sequencing analyses, which enables reliable and highly scalable execution of sequencing analyses workflows in a fully automated manner. Our platform extends the existing Galaxy workflow system by adding data management capabilities for transferring large quantities of data efficiently and reliably (via Globus Transfer), domain-specific analyses tools preconfigured for immediate use by researchers (via user-specific tools integration), automatic deployment on Cloud for on-demand resource allocation and pay-as-you-go pricing (via Globus Provision), a Cloud provisioning tool for auto-scaling (via HTCondor scheduler), and the support for validating the correctness of workflows (via semantic verification tools). Two bioinformatics workflow use cases as well as performance evaluation are presented to validate the feasibility of the proposed approach. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Early-onset neonatal sepsis is associated with a high heart rate during automatically selected stationary periods.

    PubMed

    Nguyen, Nga; Vandenbroucke, Laurent; Hernández, Alfredo; Pham, Tu; Beuchée, Alain; Pladys, Patrick

    2017-05-01

    This study examined the heart rate variability characteristics associated with early-onset neonatal sepsis in a prospective, observational controlled study. Eligible patients were full-term neonates hospitalised with clinical signs that suggested early-onset sepsis and a C-reactive protein of >10 mg/L. Sepsis was considered proven in cases of symptomatic septicaemia, meningitis, pneumonia or enterocolitis. Heart rate variability parameters (n = 16) were assessed from five-, 15- and 30-minute stationary sequences automatically selected from electrocardiographic recordings performed at admission and compared with a control group using the U-test with post hoc Benjamini-Yekutieli correction. Stationary sequences corresponded to the periods with the lowest changes of heart rate variability over time. A total of 40 full-term infants were enrolled, including 14 with proven sepsis. The mean duration of the cardiac cycle length was lower in the proven sepsis group than in the control group (n = 11), without other significant changes in heart rate variability parameters. These durations, measured in five-minute stationary periods, were 406 (367-433) ms in proven sepsis group versus 507 (463-522) ms in the control group (p < 0.05). Early-onset neonatal sepsis was associated with a high mean heart rate measured during automatically selected stationary periods. ©2017 Foundation Acta Paediatrica. Published by John Wiley & Sons Ltd.
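
    The reported cardiac cycle lengths can be read as approximate heart rates with a one-line conversion (60,000 ms per minute divided by the cycle length). The sketch below only restates the two central values quoted in the abstract; the function name is an assumption made for illustration.

```python
def heart_rate_bpm(cycle_length_ms: float) -> float:
    """Convert a cardiac cycle length in milliseconds to beats per minute."""
    if cycle_length_ms <= 0:
        raise ValueError("cycle length must be positive")
    return 60_000.0 / cycle_length_ms


# Central values quoted in the abstract for five-minute stationary periods.
for label, ms in (("proven sepsis", 406), ("control", 507)):
    print(f"{label}: {heart_rate_bpm(ms):.0f} beats/min")
```

    Because the conversion is nonlinear, the rate computed from an average cycle length is only an approximation of the average heart rate.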

  12. Systematics of Cladophora spp. (Chlorophyta) from North Carolina, USA, based upon morphology and DNA sequence data with a description of Cladophora subtilissima sp. nov.

    PubMed

    Taylor, Robin L; Bailey, Jeffrey Craig; Freshwater, David Wilson

    2017-06-01

    Identification of Cladophora species is challenging due to conservation of gross morphology, few discrete autapomorphies, and environmental influences on morphology. Twelve species of marine Cladophora were reported from North Carolina waters. Cladophora specimens were collected from inshore and offshore marine waters for DNA sequence and morphological analyses. The nuclear-encoded rRNA internal transcribed spacer regions (ITS) were sequenced for 105 specimens and used in molecular assisted identification. The ITS1 and ITS2 region was highly variable, and sequences were sorted into ITS Sets of Alignable Sequences (SASs). Sequencing of short hyper-variable ITS1 sections from Cladophora type specimens was used to positively identify species represented by SASs when the types were made available. Secondary structures for the ITS1 locus were also predicted for each specimen and compared to predicted structures from Cladophora sequences available in GenBank. Nine ITS SASs were identified and representative specimens chosen for phylogenetic analyses of 18S and 28S rRNA gene sequences to reveal relationships with other Cladophora species. Phylogenetic analyses indicated that marine Cladophorales were polyphyletic and separated into two clades, the Cladophora clade and the "Siphonocladales" clade. Morphological analyses were performed to assess the consistency of character states within species, and complement the DNA sequence analyses. These analyses revealed intra- and interspecific character state variation, and that combined molecular and morphological analyses were required for the identification of species. One new report, Cladophora dotyana, and one new species Cladophora subtilissima sp. nov., were revealed, and increased the biodiversity of North Carolina marine Cladophora to 14 species. © 2017 Phycological Society of America.

  13. Accuracy and variability of tumor burden measurement on multi-parametric MRI

    NASA Astrophysics Data System (ADS)

    Salarian, Mehrnoush; Gibson, Eli; Shahedi, Maysam; Gaed, Mena; Gómez, José A.; Moussa, Madeleine; Romagnoli, Cesare; Cool, Derek W.; Bastian-Jordan, Matthew; Chin, Joseph L.; Pautler, Stephen; Bauman, Glenn S.; Ward, Aaron D.

    2014-03-01

    Measurement of prostate tumour volume can inform prognosis and treatment selection, including an assessment of the suitability and feasibility of focal therapy, which can potentially spare patients the deleterious side effects of radical treatment. Prostate biopsy is the clinical standard for diagnosis but provides limited information regarding tumour volume due to sparse tissue sampling. A non-invasive means for accurate determination of tumour burden could be of clinical value and an important step toward reduction of overtreatment. Multi-parametric magnetic resonance imaging (MPMRI) is showing promise for prostate cancer diagnosis. However, the accuracy and inter-observer variability of prostate tumour volume estimation based on separate expert contouring of T2-weighted (T2W), dynamic contrast-enhanced (DCE), and diffusion-weighted (DW) MRI sequences acquired using an endorectal coil at 3T is currently unknown. We investigated this question using a histologic reference standard based on a highly accurate MPMRI-histology image registration and a smooth interpolation of planimetric tumour measurements on histology. Our results showed that prostate tumour volumes estimated based on MPMRI consistently overestimated histological reference tumour volumes. The variability of tumour volume estimates across the different pulse sequences exceeded inter-observer variability within any sequence. Tumour volume estimates on DCE MRI provided the lowest inter-observer variability and the highest correlation with histology tumour volumes, whereas the apparent diffusion coefficient (ADC) maps provided the lowest volume estimation error. If validated on a larger data set, the observed correlations could support the development of automated prostate tumour volume segmentation algorithms as well as correction schemes for tumour burden estimation on MPMRI.

  14. The complementarity-determining region sequences in IgY antivenom hypervariable regions.

    PubMed

    da Rocha, David Gitirana; Fernandez, Jorge Hernandez; de Almeida, Claudia Maria Costa; da Silva, Claudia Letícia; Magnoli, Fabio Carlos; da Silva, Osmair Élder; da Silva, Wilmar Dias

    2017-08-01

    The data presented in this article are related to the research article entitled "Development of IgY antibodies against anti-snake toxins endowed with highly lethal neutralizing activity" (da Rocha et al., 2017) [1]. Complementarity-determining region (CDR) sequences are variable antibody (Ab) sequences that respond with specificity, duration and strength to identify and bind to antigen (Ag) epitopes. B lymphocytes isolated from hens immunized with Bitis arietans (Ba) and anti-Crotalus durissus terrificus (Cdt) venoms and expressing high specificity, affinity and toxicity neutralizing antibody titers were used as DNA sources. The VLF1, CDR1, CDR2, VLR1 and CDR3 sequences were validated by BLASTp, and values corresponding to IgY VL and VH anti-Ba or anti-Cdt venoms were identified, registered [Gallus gallus IgY Fv Light chain (GU815099)/Gallus gallus IgY Fv Heavy chain (GU815098)] and used for molecular modeling of IgY scFv anti-Ba. The resulting CDR1, CDR2 and CDR3 sequences were combined to construct the three-dimensional structure of the Ab paratope.

  15. Identification of antigen-specific human monoclonal antibodies using high-throughput sequencing of the antibody repertoire.

    PubMed

    Liu, Ju; Li, Ruihua; Liu, Kun; Li, Liangliang; Zai, Xiaodong; Chi, Xiangyang; Fu, Ling; Xu, Junjie; Chen, Wei

    2016-04-22

    High-throughput sequencing of the antibody repertoire provides a large number of antibody variable region sequences that can be used to generate human monoclonal antibodies. However, current screening methods for identifying antigen-specific antibodies are inefficient. In the present study, we developed an antibody clone screening strategy based on clone dynamics and relative frequency, and used it to identify antigen-specific human monoclonal antibodies. Enzyme-linked immunosorbent assay showed that at least 52% of putative positive immunoglobulin heavy chains composed antigen-specific antibodies. Combining information on dynamics and relative frequency improved identification of positive clones and elimination of negative clones, and increased the credibility of putative positive clones. Therefore, the screening strategy could simplify the subsequent experimental screening and may facilitate the generation of antigen-specific antibodies. Copyright © 2016 Elsevier Inc. All rights reserved.

  16. Development of phylogenetic markers for Sebacina (Sebacinaceae) mycorrhizal fungi associated with Australian orchids.

    PubMed

    Ruibal, Monica P; Peakall, Rod; Foret, Sylvain; Linde, Celeste C

    2014-06-01

    To investigate fungal species identity and diversity in mycorrhizal fungi of order Sebacinales, we developed phylogenetic markers. These new markers will enable future studies investigating species delineation and phylogenetic relationships of the fungal symbionts and facilitate investigations into evolutionary interactions among Sebacina species and their orchid hosts. • We generated partial genome sequences for a Sebacina symbiont originating from Caladenia huegelii with 454 genome sequencing and from three symbionts from Eriochilus dilatatus and one from E. pulchellus using Illumina sequencing. Six nuclear and two mitochondrial loci showed high variability (10-31% parsimony informative sites) for Sebacinales mycorrhizal fungi across four genera of Australian orchids (Caladenia, Eriochilus, Elythranthera, and Glossodia). • We obtained highly informative DNA markers that will allow investigation of mycorrhizal diversity of Sebacinaceae fungi associated with terrestrial orchids in Australia and worldwide.
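
    The "parsimony informative sites" percentage quoted above follows the standard definition: an alignment column is informative when at least two different character states each occur in at least two sequences. A minimal sketch of that count is shown below; the toy alignment and function names are hypothetical, and gaps or ambiguity codes are simply ignored.

```python
from collections import Counter


def is_parsimony_informative(column: str) -> bool:
    """True if at least two nucleotide states each occur in two or more sequences."""
    counts = Counter(base for base in column.upper() if base in "ACGT")
    return sum(1 for c in counts.values() if c >= 2) >= 2


def percent_informative(alignment: list[str]) -> float:
    """Percentage of alignment columns that are parsimony informative."""
    if not alignment or not alignment[0]:
        raise ValueError("alignment must contain at least one non-empty sequence")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    informative = sum(
        is_parsimony_informative("".join(col)) for col in zip(*alignment)
    )
    return 100.0 * informative / length


# Toy alignment (illustrative only): columns 2, 4 and 5 are informative.
toy = ["ACGTAC", "ACGTTC", "ATGATC", "ATGAAC"]
print(percent_informative(toy))
```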

  17. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  18. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    PubMed Central

    Le Coq, Johanne; Ghosh, Partho

    2011-01-01

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation. PMID:21873231

  19. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement.

    PubMed

    Le Coq, Johanne; Ghosh, Partho

    2011-08-30

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  20. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE PAGES

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.; ...

    2017-07-18

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  1. Diatom centromeres suggest a mechanism for nuclear DNA acquisition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Diner, Rachel E.; Noddings, Chari M.; Lian, Nathan C.

    Centromeres are essential for cell division and growth in all eukaryotes, and knowledge of their sequence and structure guides the development of artificial chromosomes for functional cellular biology studies. Centromeric proteins are conserved among eukaryotes; however, centromeric DNA sequences are highly variable. We combined forward and reverse genetic approaches with chromatin immunoprecipitation to identify centromeres of the model diatom Phaeodactylum tricornutum. We observed 25 unique centromere sequences typically occurring once per chromosome, a finding that helps to resolve nuclear genome organization and indicates monocentric regional centromeres. Diatom centromere sequences contain low-GC content regions but lack repeats or other conserved sequence features. Native and foreign sequences with similar GC content to P. tricornutum centromeres can maintain episomes and recruit the diatom centromeric histone protein CENH3, suggesting nonnative sequences can also function as diatom centromeres. Thus, simple sequence requirements may enable DNA from foreign sources to persist in the nucleus as extrachromosomal episomes, revealing a potential mechanism for organellar and foreign DNA acquisition.

  2. Automated sequence-specific protein NMR assignment using the memetic algorithm MATCH.

    PubMed

    Volk, Jochen; Herrmann, Torsten; Wüthrich, Kurt

    2008-07-01

    MATCH (Memetic Algorithm and Combinatorial Optimization Heuristics) is a new memetic algorithm for automated sequence-specific polypeptide backbone NMR assignment of proteins. MATCH employs local optimization for tracing partial sequence-specific assignments within a global, population-based search environment, where the simultaneous application of local and global optimization heuristics guarantees high efficiency and robustness. MATCH thus makes combined use of the two predominant concepts in use for automated NMR assignment of proteins. Dynamic transition and inherent mutation are new techniques that enable automatic adaptation to variable quality of the experimental input data. The concept of dynamic transition is incorporated in all major building blocks of the algorithm, where it enables switching between local and global optimization heuristics at any time during the assignment process. Inherent mutation restricts the intrinsically required randomness of the evolutionary algorithm to those regions of the conformation space that are compatible with the experimental input data. Using intact and artificially deteriorated APSY-NMR input data of proteins, MATCH performed sequence-specific resonance assignment with high efficiency and robustness.

  3. [Hydrologic variability and sensitivity based on Hurst coefficient and Bartels statistic].

    PubMed

    Lei, Xu; Xie, Ping; Wu, Zi Yi; Sang, Yan Fang; Zhao, Jiang Yan; Li, Bin Bin

    2018-04-01

    Due to global climate change and frequent human activities in recent years, the purely stochastic components of hydrological sequences are mixed with one or several variation components, including jump, trend, period and dependency. It is urgently needed to clarify which indices should be used to quantify the degree of their variability. In this study, we defined the hydrological variability based on the Hurst coefficient and the Bartels statistic, and used Monte Carlo statistical tests to test and analyze their sensitivity to different variants. When the hydrological sequence had jump or trend variation, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the Hurst coefficient being more sensitive to weak jump or trend variation. When the sequence had a periodic component, only the Bartels statistic could detect the variation of the sequence. When the sequence had a dependency, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the latter able to detect weaker dependent variations. For the four variations, both the Hurst variability and the Bartels variability increased with increasing variation range. Thus, they could be used to measure the variation intensity of the hydrological sequence. We analyzed the temperature series of different weather stations in the Lancang River basin. Results showed that the temperature at all stations showed an upward trend or jump, indicating that the entire basin had experienced warming in recent years and that the temperature variability in the upper and lower reaches was much higher. This case study showed the practicability of the proposed method.
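
    The abstract does not define its "Bartels variability" index, so the sketch below shows only the standard Bartels statistic, the rank version of von Neumann's ratio: the sum of squared differences between successive ranks divided by the sum of squared deviations of the ranks from their mean. Values near 2 are expected for a random series, and values well below 2 point to trend, jump or positive dependence. Function names and the two example series are assumptions made for illustration.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def bartels_rvn(values):
    """Rank von Neumann ratio of a series (the Bartels test statistic)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    r = average_ranks(values)
    mean_rank = sum(r) / n
    denominator = sum((x - mean_rank) ** 2 for x in r)
    if denominator == 0:
        raise ValueError("all observations are tied")
    numerator = sum((r[i] - r[i + 1]) ** 2 for i in range(n - 1))
    return numerator / denominator


# Illustrative series: a monotone trend and a strongly alternating pattern.
trend = list(range(1, 11))
alternating = [1, 10, 2, 9, 3, 8, 4, 7, 5, 6]
print(bartels_rvn(trend), bartels_rvn(alternating))
```

    The monotone series gives a ratio far below 2 and the alternating series a ratio above 2, which is the behaviour a trend or jump check relies on.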

  4. Separating the wheat from the chaff: mitigating the effects of noise in a plastome phylogenomic data set from Pinus L. (Pinaceae)

    PubMed Central

    2012-01-01

    Background Through next-generation sequencing, the amount of sequence data potentially available for phylogenetic analyses has increased exponentially in recent years. Simultaneously, the risk of incorporating ‘noisy’ data with misleading phylogenetic signal has also increased, and may disproportionately influence the topology of weakly supported nodes and lineages featuring rapid radiations and/or elevated rates of evolution. Results We investigated the influence of phylogenetic noise in large data sets by applying two fundamental strategies, variable site removal and long-branch exclusion, to the phylogenetic analysis of a full plastome alignment of 107 species of Pinus and six Pinaceae outgroups. While high overall phylogenetic resolution resulted from inclusion of all data, three historically recalcitrant nodes remained conflicted with previous analyses. Close investigation of these nodes revealed dramatically different responses to data removal. Whereas topological resolution and bootstrap support for two clades peaked with removal of highly variable sites, the third clade resolved most strongly when all sites were included. Similar trends were observed using long-branch exclusion, but patterns were neither as strong nor as clear. When compared to previous phylogenetic analyses of nuclear loci and morphological data, the most highly supported topologies seen in Pinus plastome analysis are congruent for the two clades gaining support from variable site removal and long-branch exclusion, but in conflict for the clade with highest support from the full data set. Conclusions These results suggest that removal of misleading signal in phylogenomic datasets can result not only in increased resolution for poorly supported nodes, but may serve as a tool for identifying erroneous yet highly supported topologies. For Pinus chloroplast genomes, removal of variable sites appears to be more effective than long-branch exclusion for clarifying phylogenetic hypotheses. PMID:22731878

  5. The nature and use of prediction skills in a biological computer simulation

    NASA Astrophysics Data System (ADS)

    Lavoie, Derrick R.; Good, Ron

    The primary goal of this study was to examine the science process skill of prediction using qualitative research methodology. The think-aloud interview, modeled after Ericsson and Simon (1984), led to the identification of 63 program exploration and prediction behaviors. The performance of seven formal and seven concrete operational high-school biology students was videotaped during a three-phase learning sequence on water pollution. Subjects explored the effects of five independent variables on two dependent variables over time using a computer-simulation program. Predictions were made concerning the effect of the independent variables upon dependent variables through time. Subjects were identified according to initial knowledge of the subject matter and success at solving three selected prediction problems. Successful predictors generally had high initial knowledge of the subject matter and were formal operational. Unsuccessful predictors generally had low initial knowledge and were concrete operational. High initial knowledge seemed to be more important to predictive success than stage of Piagetian cognitive development. Successful prediction behaviors involved systematic manipulation of the independent variables, note taking, identification and use of appropriate independent-dependent variable relationships, high interest and motivation, and in general, higher-level thinking skills. Behaviors characteristic of unsuccessful predictors were nonsystematic manipulation of independent variables, lack of motivation and persistence, misconceptions, and the identification and use of inappropriate independent-dependent variable relationships.

  6. Barcoding Fauna Bavarica: Myriapoda – a contribution to DNA sequence-based identifications of centipedes and millipedes (Chilopoda, Diplopoda)

    PubMed Central

    Spelda, Jörg; Reip, Hans S.; Oliveira–Biener, Ulla; Melzer, Roland R.

    2011-01-01

    Abstract We give a first account of our ongoing barcoding activities on Bavarian myriapods in the framework of the Barcoding Fauna Bavarica project and IBOL, the International Barcode of Life. Having analyzed 126 taxa (including 122 species) belonging to all major German chilopod and diplopod lineages, often using four or more specimens each, at the moment our species stock includes 82% of the diplopods and 65% of the chilopods found in Bavaria, southern Germany. The partial COI sequences allow correct identification of more than 95% of the current set of Bavarian species. Moreover, most of the myriapod orders and families appear as distinct clades in neighbour-joining trees, although the phylogenetic relationships between them are not always depicted correctly. We give examples of (1) high interspecific sequence variability among closely related species; (2) low interspecific variability in some chordeumatidan genera, indicating that recent speciations cannot be resolved with certainty using COI DNA barcodes; (3) high intraspecific variation in some genera, suggesting the existence of cryptic lineages; and (4) the possible polyphyly of some taxa, i.e. the chordeumatidan genus Ochogona. This shows that, in addition to species identification, our data may be useful in various ways in the context of species delimitations, taxonomic revisions and analyses of ongoing speciation processes. PMID:22303099
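
    Inter- and intraspecific variability of barcode sequences is commonly summarised with pairwise distances. As a minimal sketch, assuming pre-aligned sequences and the uncorrected p-distance (the abstract does not state which distance the authors used), the proportion of differing sites can be computed as follows. The function name and the short sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: fraction of compared sites that differ.

    Sites where either sequence has a gap or an ambiguity code are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Illustrative fragments only, not real COI data.
print(p_distance("ACGTACGT", "ACGTACGA"))
```

    Low values between different species correspond to the weakly resolved cases mentioned in the abstract, and high values within one species to possible cryptic lineages.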

  7. Identification of species by multiplex analysis of variable-length sequences

    PubMed Central

    Pereira, Filipe; Carneiro, João; Matthiesen, Rune; van Asch, Barbara; Pinto, Nádia; Gusmão, Leonor; Amorim, António

    2010-01-01

    The quest for a universal and efficient method of identifying species has been a longstanding challenge in biology. Here, we show that accurate identification of species in all domains of life can be accomplished by multiplex analysis of variable-length sequences containing multiple insertion/deletion variants. The new method, called SPInDel, is able to discriminate 93.3% of eukaryotic species from 18 taxonomic groups. We also demonstrate that the identification of prokaryotic and viral species with numeric profiles of fragment lengths is generally straightforward. A computational platform is presented to facilitate the planning of projects and includes a large data set with nearly 1800 numeric profiles for species in all domains of life (1556 for eukaryotes, 105 for prokaryotes and 130 for viruses). Finally, a SPInDel profiling kit for discrimination of 10 mammalian species was successfully validated on highly processed food products with species mixtures and proved to be easily adaptable to multiple screening procedures routinely used in molecular biology laboratories. These results suggest that SPInDel is a reliable and cost-effective method for broad-spectrum species identification that is appropriate for use in suboptimal samples and is amenable to different high-throughput genotyping platforms without the need for DNA sequencing. PMID:20923781

  8. Sequence, distribution and chromosomal context of class I and class II pilin genes of Neisseria meningitidis identified in whole genome sequences

    PubMed Central

    2014-01-01

    Background Neisseria meningitidis expresses type four pili (Tfp) which are important for colonisation and virulence. Tfp have been considered as one of the most variable structures on the bacterial surface due to high frequency gene conversion, resulting in amino acid sequence variation of the major pilin subunit (PilE). Meningococci express either a class I or a class II pilE gene and recent work has indicated that class II pilins do not undergo antigenic variation, as class II pilE genes encode conserved pilin subunits. The purpose of this work was to use whole genome sequences to further investigate the frequency and variability of the class II pilE genes in meningococcal isolate collections. Results We analysed over 600 publicly available whole genome sequences of N. meningitidis isolates to determine the sequence and genomic organization of pilE. We confirmed that meningococcal strains belonging to a limited number of clonal complexes (ccs, namely cc1, cc5, cc8, cc11 and cc174) harbour a class II pilE gene which is conserved in terms of sequence and chromosomal context. We also identified pilS cassettes in all isolates with class II pilE; however, our analysis indicates that these do not serve as donor sequences for pilE/pilS recombination. Furthermore, our work reveals that the class II pilE locus lacks the DNA sequence motifs that enable (G4) or enhance (Sma/Cla repeat) pilin antigenic variation. Finally, through analysis of pilin genes in commensal Neisseria species we found that meningococcal class II pilE genes are closely related to pilE from Neisseria lactamica and Neisseria polysaccharea, suggesting horizontal transfer among these species. Conclusions Class II pilins can be defined by their amino acid sequence and genomic context and are present in meningococcal isolates which have persisted and spread globally. The absence of G4 and Sma/Cla sequences adjacent to the class II pilE genes is consistent with the lack of pilin subunit variation in these isolates, although horizontal transfer may generate class II pilin diversity. This study supports the suggestion that high frequency antigenic variation of pilin is not universal in pathogenic Neisseria. PMID:24690385

  9. Differentiation of Xylella fastidiosa Strains via Multilocus Sequence Analysis of Environmentally Mediated Genes (MLSA-E)

    PubMed Central

    Parker, Jennifer K.; Havird, Justin C.

    2012-01-01

    Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops. PMID:22194287

  10. Differentiation of Xylella fastidiosa strains via multilocus sequence analysis of environmentally mediated genes (MLSA-E).

    PubMed

    Parker, Jennifer K; Havird, Justin C; De La Fuente, Leonardo

    2012-03-01

    Isolates of the plant pathogen Xylella fastidiosa are genetically very similar, but studies on their biological traits have indicated differences in virulence and infection symptomatology. Taxonomic analyses have identified several subspecies, and phylogenetic analyses of housekeeping genes have shown broad host-based genetic differences; however, results are still inconclusive for genetic differentiation of isolates within subspecies. This study employs multilocus sequence analysis of environmentally mediated genes (MLSA-E; genes influenced by environmental factors) to investigate X. fastidiosa relationships and differentiate isolates with low genetic variability. Potential environmentally mediated genes, including host colonization and survival genes related to infection establishment, were identified a priori. The ratio of the rate of nonsynonymous substitutions to the rate of synonymous substitutions (dN/dS) was calculated to select genes that may be under increased positive selection compared to previously studied housekeeping genes. Nine genes were sequenced from 54 X. fastidiosa isolates infecting different host plants across the United States. Results of maximum likelihood (ML) and Bayesian phylogenetic (BP) analyses are in agreement with known X. fastidiosa subspecies clades but show novel within-subspecies differentiation, including geographic differentiation, and provide additional information regarding host-based isolate variation and specificity. dN/dS ratios of environmentally mediated genes, though <1 due to high sequence similarity, are significantly greater than housekeeping gene dN/dS ratios and correlate with increased sequence variability. MLSA-E can more precisely resolve relationships between closely related bacterial strains with low genetic variability, such as X. fastidiosa isolates. Discovering the genetic relationships between X. fastidiosa isolates will provide new insights into the epidemiology of populations of X. fastidiosa, allowing improved disease management in economically important crops.

  11. Automatic Cloud Classification from Multi-Spectral Satellite Data Over Oceanic Regions

    DTIC Science & Technology

    1992-01-14

    parameters the first two colors used are, blue for low values and dark green for high parameter values. If a third class is identified, the intermediate...intermediate yellow and high dark green classes. The color sequence blue-yellow-light green- dark green, then characterizes the low to high parameter value...to light green then to dark green correspond to superpixels of increasing (from low to high) variability in their altitude, (see Table V.3). When the

  12. Genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) evaluated using ISSR markers.

    PubMed

    Vidal, Á M; Vieira, L J; Ferreira, C F; Souza, F V D; Souza, A S; Ledo, C A S

    2015-07-14

    Molecular markers are efficient for assessing the genetic fidelity of various species of plants after in vitro culture. In this study, we evaluated the genetic fidelity and variability of micropropagated cassava plants (Manihot esculenta Crantz) using inter-simple sequence repeat markers. Twenty-two cassava accessions from the Embrapa Cassava & Fruits Germplasm Bank were used. For each accession, DNA was extracted from a plant maintained in the field and from 3 plants grown in vitro. For DNA amplification, 27 inter-simple sequence repeat primers were used, of which 24 generated 175 bands; 100 of those bands were polymorphic and were used to study genetic variability among accessions of cassava plants maintained in the field. Based on the genetic distance matrix calculated using the arithmetic complement of the Jaccard's index, genotypes were clustered using the unweighted pair group method using arithmetic averages. The number of bands per primer was 2-13, with an average of 7.3. For most micropropagated accessions, the fidelity study showed no genetic variation between plants of the same accessions maintained in the field and those maintained in vitro, confirming the high genetic fidelity of the micropropagated plants. However, genetic variability was observed among different accessions grown in the field, and clustering based on the dissimilarity matrix revealed 7 groups. Inter-simple sequence repeat markers were efficient for detecting the genetic homogeneity of cassava plants derived from meristem culture, demonstrating the reliability of this propagation system.
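
    The distance step described in the abstract above (the arithmetic complement of the Jaccard index computed on band presence/absence) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the band profiles and sample names are hypothetical, and the subsequent UPGMA clustering is not shown.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Return 1 - Jaccard index for two equal-length 0/1 band profiles."""
    if len(a) != len(b):
        raise ValueError("band profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)   # band present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # band present in at least one
    if either == 0:
        return 0.0  # two empty profiles are treated as identical
    return 1.0 - shared / either


def distance_matrix(profiles):
    """Pairwise dissimilarities keyed by (name_a, name_b), names in sorted order."""
    return {
        (p, q): jaccard_dissimilarity(profiles[p], profiles[q])
        for p, q in combinations(sorted(profiles), 2)
    }


# Hypothetical presence/absence scores for six ISSR bands (assumed data).
profiles = {
    "field": [1, 1, 0, 1, 0, 1],
    "in_vitro_1": [1, 1, 0, 1, 0, 1],
    "other_accession": [1, 0, 1, 1, 0, 0],
}

distances = distance_matrix(profiles)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

    A dissimilarity of zero between a field plant and its in vitro counterpart corresponds to the "no genetic variation" outcome the abstract reports for most accessions.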

  13. Insertion sequence ISRP10 inactivation of the oprD gene in imipenem-resistant Pseudomonas aeruginosa clinical isolates.

    PubMed

    Sun, Qinghui; Ba, Zhaofen; Wu, Guoying; Wang, Wei; Lin, Shuxiang; Yang, Hongjiang

    2016-05-01

    Carbapenem resistance mechanisms were investigated in 32 imipenem-resistant Pseudomonas aeruginosa clinical isolates recovered from hospitalised children. Sequence analysis revealed that 31 of the isolates had an insertion sequence element ISRP10 disrupting the porin gene oprD, demonstrating that ISRP10 inactivation of oprD conferred imipenem resistance in the majority of the isolates. Multilocus sequence typing (MLST) was used to discriminate the isolates. In total, 11 sequence types (STs) were identified including 3 novel STs, and 68.3% (28/41) of the tested strains were characterised as clone ST253. In combination with random amplified polymorphic DNA (RAPD) analysis, the imipenem-resistant isolates displayed a relatively high degree of genetic variability and were unlikely associated with nosocomial infections. Copyright © 2016 Elsevier B.V. and the International Society of Chemotherapy. All rights reserved.

  14. Conservation of CD44 exon v3 functional elements in mammals

    PubMed Central

    Vela, Elena; Hilari, Josep M; Delclaux, María; Fernández-Bellon, Hugo; Isamat, Marcos

    2008-01-01

    Background The human CD44 gene contains 10 variable exons (v1 to v10) that can be alternatively spliced to generate hundreds of different CD44 protein isoforms. Human CD44 variable exon v3 inclusion in the final mRNA depends on a multisite bipartite splicing enhancer located within the exon itself, which we have recently described, and provides the protein domain responsible for growth factor binding to CD44. Findings We have analyzed the sequence of CD44v3 in 95 mammalian species to report high conservation levels for both its splicing regulatory elements (the 3' splice site and the exonic splicing enhancer), and the functional glycosaminoglycan binding site coded by v3. We also report the functional expression of CD44v3 isoforms in peripheral blood cells of different mammalian taxa with both consensus and variant v3 sequences. Conclusion CD44v3 mammalian sequences maintain all functional splicing regulatory elements as well as the GAG binding site with the same relative positions and sequence identity previously described during alternative splicing of human CD44. The sequence within the GAG attachment site, which in turn contains the Y motif of the exonic splicing enhancer, is more conserved relative to the rest of the exon. Amplification of CD44v3 sequence from mammalian species but not from birds, fish or reptiles may lead to the classification of CD44v3 as an exclusively mammalian gene trait. PMID:18710510

  15. Russell body inducing threshold depends on the variable domain sequences of individual human IgG clones and the cellular protein homeostasis.

    PubMed

    Stoops, Janelle; Byrd, Samantha; Hasegawa, Haruki

    2012-10-01

    Russell bodies are intracellular aggregates of immunoglobulins. Although the mechanism of Russell body biogenesis has been extensively studied by using truncated mutant heavy chains, the importance of the variable domain sequences in this process and in immunoglobulin biosynthesis remains largely unknown. Using a panel of structurally and functionally normal human immunoglobulin Gs, we show that individual immunoglobulin G clones possess distinctive Russell body inducing propensities that can surface differently under normal and abnormal cellular conditions. Russell body inducing predisposition unique to each immunoglobulin G clone was corroborated by the intrinsic physicochemical properties encoded in the heavy chain variable domain/light chain variable domain sequence combinations that define each immunoglobulin G clone. While the sequence based intrinsic factors predispose certain immunoglobulin G clones to be more prone to induce Russell bodies, extrinsic factors such as stressful cell culture conditions also play roles in unmasking Russell body propensity from immunoglobulin G clones that are normally refractory to developing Russell bodies. By taking advantage of heterologous expression systems, we dissected the roles of individual subunit chains in Russell body formation and examined the effect of non-cognate subunit chain pair co-expression on Russell body forming propensity. The results suggest that the properties embedded in the variable domain of individual light chain clones and their compatibility with the partnering heavy chain variable domain sequences underscore the efficiency of immunoglobulin G biosynthesis, the threshold for Russell body induction, and the level of immunoglobulin G secretion. We propose that an interplay between the unique properties encoded in variable domain sequences and the state of protein homeostasis determines whether an immunoglobulin G expressing cell will develop the Russell body phenotype in a dynamic cellular setting. Copyright © 2012 Elsevier B.V. All rights reserved.

  16. ITS all right mama: investigating the formation of chimeric sequences in the ITS2 region by DNA metabarcoding analyses of fungal mock communities of different complexities.

    PubMed

    Bjørnsgaard Aas, Anders; Davey, Marie Louise; Kauserud, Håvard

    2017-07-01

    The formation of chimeric sequences can create significant methodological bias in PCR-based DNA metabarcoding analyses. During mixed-template amplification of barcoding regions, chimera formation is frequent and well documented. However, profiling of fungal communities typically uses the more variable rDNA region ITS. Due to a larger research community, tools for chimera detection have been developed mainly for the 16S/18S markers. However, these tools are widely applied to the ITS region without verification of their performance. We examined the rate of chimera formation during amplification and 454 sequencing of the ITS2 region from fungal mock communities of different complexities. We evaluated the chimera detecting ability of two common chimera-checking algorithms: perseus and uchime. Large proportions of the chimeras reported were false positives. No false negatives were found in the data set. Verified chimeras accounted for only 0.2% of the total ITS2 reads, which is considerably less than what is typically reported in 16S and 18S metabarcoding analyses. Verified chimeric 'parent sequences' had significantly higher per cent identity to one another than to random members of the mock communities. Community complexity increased the rate of chimera formation. GC content was higher around the verified chimeric break points, potentially facilitating chimera formation through base pair mismatching in the neighbouring regions of high similarity in the chimeric region. We conclude that the hypervariable nature of the ITS region seems to buffer the rate of chimera formation in comparison with other, less variable barcoding regions, due to shorter regions of high sequence similarity. © 2016 John Wiley & Sons Ltd.

  17. Unifying measures of gene function and evolution.

    PubMed

    Wolf, Yuri I; Carmel, Liran; Koonin, Eugene V

    2006-06-22

    Recent genome analyses revealed intriguing correlations between variables characterizing the functioning of a gene, such as expression level (EL), connectivity of genetic and protein-protein interaction networks, and knockout effect, and variables describing gene evolution, such as sequence evolution rate (ER) and propensity for gene loss. Typically, variables within each of these classes are positively correlated, e.g. products of highly expressed genes also have a propensity to be involved in many protein-protein interactions, whereas variables between classes are negatively correlated, e.g. highly expressed genes, on average, evolve slower than weakly expressed genes. Here, we describe principal component (PC) analysis of seven genome-related variables and propose biological interpretations for the first three PCs. The first PC reflects a gene's 'importance', or the 'status' of a gene in the genomic community, with positive contributions from knockout lethality, EL, number of protein-protein interaction partners and the number of paralogues, and negative contributions from sequence ER and gene loss propensity. The next two PCs define a plane that seems to reflect the functional and evolutionary plasticity of a gene. Specifically, PC2 can be interpreted as a gene's 'adaptability' whereby genes with high adaptability readily duplicate, have many genetic interaction partners and tend to be non-essential. PC3 also might reflect the role of a gene in organismal adaptation albeit with a negative rather than a positive contribution of genetic interactions; we provisionally designate this PC 'reactivity'. The interpretation of PC2 and PC3 as measures of a gene's plasticity is compatible with the observation that genes with high values of these PCs tend to be expressed in a condition- or tissue-specific manner. Functional classes of genes substantially vary in status, adaptability and reactivity, with the highest status characteristic of the translation system and cytoskeletal proteins, highest adaptability seen in cellular processes and signalling genes, and top reactivity characteristic of metabolic enzymes.

  18. Variable Number of Tandem Repeat Markers in the Genome Sequence of Mycosphaerella Fijiensis, the Causal Agent of Black Leaf Streak Disease of Banana (Musa spp.)

    USDA-ARS's Scientific Manuscript database

    Mycosphaerella fijiensis, the causal agent of banana leaf streak disease (commonly known as black Sigatoka), is the most devastating pathogen attacking bananas (Musa spp). Recently the whole genome sequence of M. fijiensis became available. This sequence was screened for the presence of Variable Num...

  19. Effects of "D"-Amphetamine and Ethanol on Variable and Repetitive Key-Peck Sequences in Pigeons

    ERIC Educational Resources Information Center

    Ward, Ryan D.; Bailey, Ericka M.; Odum, Amy L.

    2006-01-01

    This experiment assessed the effects of "d"-Amphetamine and ethanol on reinforced variable and repetitive key-peck sequences in pigeons. Pigeons responded on two keys under a multiple schedule of Repeat and Vary components. In the Repeat component, completion of a target sequence of right, right, left, left resulted in food. In the Vary component,…

  20. Next-Generation Sequencing: a Diagnostic One-Stop Shop for Hepatitis C?

    PubMed

    Poljak, Mario

    2016-10-01

    Before starting chronic hepatitis C treatment, the viral genotype/subtype has to be accurately determined and potentially coupled with drug resistance testing. Due to the high genetic variability of the hepatitis C virus, this can be a demanding task that can potentially be streamlined by viral whole-genome sequencing using next-generation sequencing as demonstrated by an article in this issue of the Journal of Clinical Microbiology by E. Thomson, C. L. C. Ip, A. Badhan, M. T. Christiansen, W. Adamson, et al. (J Clin Microbiol. 54:2455-2469, 2016, http://dx.doi.org/10.1128/JCM.00330-16). Copyright © 2016, American Society for Microbiology. All Rights Reserved.

  1. Sequential Effects of High and Low Instructional Guidance on Children's Acquisition of Experimentation Skills: Is It All in the Timing?

    ERIC Educational Resources Information Center

    Matlen, Bryan J.; Klahr, David

    2013-01-01

    We report the effect of different sequences of high vs low levels of instructional guidance on children's immediate learning and long-term transfer of simple experimental design procedures and concepts, often called "CVS" (Control of Variables Strategy). Third-grade children (N = 57) received instruction in CVS via one of four possible orderings…

  2. Differentiation of “Candidatus Liberibacter asiaticus” Isolates by Variable-Number Tandem-Repeat Analysis

    PubMed Central

    Katoh, Hiroshi; Subandiyah, Siti; Tomimura, Kenta; Okuda, Mitsuru; Su, Hong-Ji; Iwanami, Toru

    2011-01-01

    Four highly polymorphic simple sequence repeat (SSR) loci were selected and used to differentiate 84 Japanese isolates of “Candidatus Liberibacter asiaticus.” The Nei's measure of genetic diversity values for these four SSRs ranged from 0.60 to 0.86. The four SSR loci were also highly polymorphic in four isolates from Taiwan and 12 isolates from Indonesia. PMID:21239554
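
    For readers unfamiliar with the statistic quoted above, Nei's gene diversity at a locus is commonly written as H = 1 - sum(p_i^2), where p_i is the frequency of the i-th allele. The sketch below assumes that plain (uncorrected) form; the abstract does not state which variant the authors used, and the repeat counts are hypothetical.

```python
from collections import Counter


def nei_gene_diversity(alleles):
    """Nei's gene diversity H = 1 - sum(p_i**2) over observed allele frequencies."""
    if not alleles:
        raise ValueError("need at least one allele call")
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical repeat counts at one SSR locus across ten isolates (assumed data).
locus_calls = [7, 7, 8, 8, 8, 9, 10, 10, 11, 12]
diversity = nei_gene_diversity(locus_calls)
print(round(diversity, 2))
```

    A locus where every isolate carries the same allele gives H = 0, and H approaches 1 as more alleles occur at similar frequencies, which is why high values indicate markers that are useful for telling isolates apart.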

  3. Nested Machine Learning Facilitates Increased Sequence Content for Large-Scale Automated High Resolution Melt Genotyping

    PubMed Central

    Fraley, Stephanie I.; Athamanolap, Pornpat; Masek, Billie J.; Hardick, Justin; Carroll, Karen C.; Hsieh, Yu-Hsiang; Rothman, Richard E.; Gaydos, Charlotte A.; Wang, Tza-Huei; Yang, Samuel

    2016-01-01

    High Resolution Melt (HRM) is a versatile and rapid post-PCR DNA analysis technique primarily used to differentiate sequence variants among only a few short amplicons. We recently developed a one-vs-one support vector machine algorithm (OVO SVM) that enables the use of HRM for identifying numerous short amplicon sequences automatically and reliably. Herein, we set out to maximize the discriminating power of HRM + SVM for a single genetic locus by testing longer amplicons harboring significantly more sequence information. Using universal primers that amplify the hypervariable bacterial 16S rRNA gene as a model system, we found that long amplicons yield more complex HRM curve shapes. We developed a novel nested OVO SVM approach to take advantage of this feature and achieved 100% accuracy in the identification of 37 clinically relevant bacteria in Leave-One-Out-Cross-Validation. A subset of organisms was independently tested. Those from pure culture were identified with high accuracy, while those tested directly from clinical blood bottles displayed more technical variability and reduced accuracy. Our findings demonstrate that long sequences can be accurately and automatically profiled by HRM with a novel nested SVM approach and suggest that clinical sample testing is feasible with further optimization. PMID:26778280

  4. Effects of B1 inhomogeneity correction for three-dimensional variable flip angle T1 measurements in hip dGEMRIC at 3 T and 1.5 T.

    PubMed

    Siversson, Carl; Chan, Jenny; Tiderius, Carl-Johan; Mamisch, Tallal Charles; Jellus, Vladimir; Svensson, Jonas; Kim, Young-Jo

    2012-06-01

    Delayed gadolinium-enhanced MRI of cartilage is a technique for studying the development of osteoarthritis using quantitative T1 measurements. Three-dimensional variable flip angle is a promising method for performing such measurements rapidly, by using two successive spoiled gradient echo sequences with different excitation pulse flip angles. However, the three-dimensional variable flip angle method is very sensitive to inhomogeneities in the transmitted B1 field in vivo. In this study, a method for correcting for such inhomogeneities, using an additional B1 mapping spin-echo sequence, was evaluated. Phantom studies concluded that three-dimensional variable flip angle with B1 correction calculates accurate T1 values also in areas with high B1 deviation. Retrospective analysis of in vivo hip delayed gadolinium-enhanced MRI of cartilage data from 40 subjects showed the difference between three-dimensional variable flip angle with and without B1 correction to be generally two to three times higher at 3 T than at 1.5 T. In conclusion, the B1 variations should always be taken into account, both at 1.5 T and at 3 T. Copyright © 2011 Wiley-Liss, Inc.
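
    The sensitivity to the transmitted field described above can be illustrated with the standard spoiled gradient echo signal model, S = M0 sin(a) (1 - E1) / (1 - E1 cos(a)) with E1 = exp(-TR/T1). The sketch below is an assumption-laden illustration rather than the authors' method: it uses the common two-point linearised fit, models the B1 error as a single scale factor on the nominal flip angles, and all numerical values (TR, flip angles, T1, scale factor) are made up.

```python
import math


def spgr_signal(m0, t1_ms, tr_ms, flip_deg):
    """Steady-state spoiled gradient echo signal for a given actual flip angle."""
    e1 = math.exp(-tr_ms / t1_ms)
    a = math.radians(flip_deg)
    return m0 * math.sin(a) * (1.0 - e1) / (1.0 - e1 * math.cos(a))


def vfa_t1(s1, s2, flip1_deg, flip2_deg, tr_ms, b1_scale=1.0):
    """Two-angle T1 estimate; b1_scale is the assumed actual/nominal flip angle ratio."""
    a1 = math.radians(flip1_deg * b1_scale)
    a2 = math.radians(flip2_deg * b1_scale)
    # Linearised model: S/sin(a) = E1 * S/tan(a) + M0 * (1 - E1), so the slope is E1.
    y1, y2 = s1 / math.sin(a1), s2 / math.sin(a2)
    x1, x2 = s1 / math.tan(a1), s2 / math.tan(a2)
    e1 = (y2 - y1) / (x2 - x1)
    if not 0.0 < e1 < 1.0:
        raise ValueError("slope outside (0, 1); T1 is undefined")
    return -tr_ms / math.log(e1)


# Hypothetical acquisition: the scanner delivers only 85% of the nominal angles.
TRUE_T1_MS, TR_MS, B1_SCALE = 600.0, 15.0, 0.85
NOMINAL_1, NOMINAL_2 = 5.0, 25.0
s1 = spgr_signal(1.0, TRUE_T1_MS, TR_MS, NOMINAL_1 * B1_SCALE)
s2 = spgr_signal(1.0, TRUE_T1_MS, TR_MS, NOMINAL_2 * B1_SCALE)

uncorrected = vfa_t1(s1, s2, NOMINAL_1, NOMINAL_2, TR_MS)
corrected = vfa_t1(s1, s2, NOMINAL_1, NOMINAL_2, TR_MS, b1_scale=B1_SCALE)
print(round(uncorrected, 1), round(corrected, 1))
```

    In this noise-free toy case the corrected estimate recovers the simulated T1, while ignoring the flip angle error underestimates it substantially, which is the kind of bias a separate B1 map is meant to remove.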

  5. DNA copy number changes define spatial patterns of heterogeneity in colorectal cancer

    PubMed Central

    Mamlouk, Soulafa; Childs, Liam Harold; Aust, Daniela; Heim, Daniel; Melching, Friederike; Oliveira, Cristiano; Wolf, Thomas; Durek, Pawel; Schumacher, Dirk; Bläker, Hendrik; von Winterfeld, Moritz; Gastl, Bastian; Möhr, Kerstin; Menne, Andrea; Zeugner, Silke; Redmer, Torben; Lenze, Dido; Tierling, Sascha; Möbs, Markus; Weichert, Wilko; Folprecht, Gunnar; Blanc, Eric; Beule, Dieter; Schäfer, Reinhold; Morkel, Markus; Klauschen, Frederick; Leser, Ulf; Sers, Christine

    2017-01-01

    Genetic heterogeneity between and within tumours is a major factor determining cancer progression and therapy response. Here we examined DNA sequence and DNA copy-number heterogeneity in colorectal cancer (CRC) by targeted high-depth sequencing of 100 most frequently altered genes. In 97 samples, with primary tumours and matched metastases from 27 patients, we observe inter-tumour concordance for coding mutations; in contrast, gene copy numbers are highly discordant between primary tumours and metastases as validated by fluorescent in situ hybridization. To further investigate intra-tumour heterogeneity, we dissected a single tumour into 68 spatially defined samples and sequenced them separately. We identify evenly distributed coding mutations in APC and TP53 in all tumour areas, yet highly variable gene copy numbers in numerous genes. 3D morpho-molecular reconstruction reveals two clusters with divergent copy number aberrations along the proximal–distal axis indicating that DNA copy number variations are a major source of tumour heterogeneity in CRC. PMID:28120820

  6. Prevalence of human pegivirus-1 and sequence variability of its E2 glycoprotein estimated from screening donors of fetal stem cell-containing material.

    PubMed

    Vitrenko, Yakov; Kostenko, Iryna; Kulebyakina, Kateryna; Sorochynska, Khrystyna

    2017-08-31

    Human pegivirus-1 (HPgV-1) is a member of the Flaviviridae family whose genomic organization and mode of cellular entry is similar to that of hepatitis C virus (HCV). The E2 glycoprotein of HPgV-1 is the principal mediator in the virus-cell interaction and as such harbors most of HPgV-1's antigenic determinants. HPgV-1 persists in blood cell precursors which are increasingly used for cell therapy. We studied HPgV-1 prevalence in a large cohort of females donating fetal tissues for clinical use. PCR was used for screening and estimation of viral load in viremic plasma and fetal samples. Sequence analysis was performed for portions of the 5'-untranslated and E2 regions of HPgV-1 purified from donor plasmas. Sequencing was followed by phylogenetic analysis. HPgV-1 was detected in 13.7% of plasmas, 5.0% of fetal tissues and 5.4% of chorions, exceeding the prevalence of HCV in these types of samples. Transmission of HPgV-1 occurred in 25.8% of traceable mother-chorion-fetal tissue triads. For HPgV-1-positive donors, a high viral load in plasma appears to be a prerequisite for transmission. However, about one third of fetal samples acquired infection from non-viremic individuals. Sequencing of the 5'-untranslated region placed most HPgV-1 samples in genotype 2a. At the same time, a portion of the E2 sequence provided much weaker support for this grouping, apparently due to higher variability. Polymorphisms were detected in important structural and antigenic motifs of E2. HPgV-1 is efficiently transmitted to the fetus at early embryonic stages. High variability in E2 may pose a risk of generation of pathogenic subtypes. Although HPgV-1 is considered benign and is no longer tested mandatorily in blood banks, the virus may have adverse effects at target niches if delivered with an infected graft upon cell transplantation. This argues for the necessity of HPgV-1 testing of cell samples intended for clinical use.

  7. Development of TaqMan probes targeting the four major celiac disease epitopes found in α-gliadin sequences of spelt (Triticum aestivum ssp. spelta) and bread wheat (Triticum aestivum ssp. aestivum).

    PubMed

    Dubois, Benjamin; Bertin, Pierre; Muhovski, Yordan; Escarnot, Emmanuelle; Mingeot, Dominique

    2017-01-01

    Celiac disease (CD) is caused by specific sequences of gluten proteins found in cereals such as bread wheat (Triticum aestivum ssp. aestivum) and spelt (T. aestivum ssp. spelta). Among them, the α-gliadins display the highest immunogenicity, with four T-cell stimulatory epitopes. The toxicity of each epitope sequence can be reduced or even suppressed according to the allelic form of each sequence. One way to address the CD problem would be to make use of this allelic variability in breeding programs to develop safe varieties, but tools to track the presence of toxic epitopes are required. The objective of this study was to develop a tool to accurately detect and quantify the immunogenic content of expressed α-gliadins of spelt and bread wheat. Four TaqMan probes that only hybridize to the canonical (i.e. toxic) form of each of the four epitopes were developed and their specificity was demonstrated. Six TaqMan probes targeting stable reference genes were also developed and constitute a tool to normalize qPCR data. The probes were used to measure the epitope expression levels of 11 contrasted spelt accessions and three ancestral diploid accessions of bread wheat and spelt. A high expression variability was highlighted among epitopes and among accessions, especially in Asian spelts, which showed lower epitope expression levels than the other spelts. Some discrepancies were identified between the canonical epitope expression level and the global amount of expressed α-gliadins, which makes the designed TaqMan probes a useful tool to quantify the immunogenic potential independently of the global amount of expressed α-gliadins. The results obtained in this study provide useful tools to study the immunogenic potential of expressed α-gliadin sequences from Triticeae accessions such as spelt and bread wheat. The application of the designed probes to contrasted spelt accessions revealed a high variability and interesting low canonical epitope expression levels in the Asian spelt accessions studied.

  8. Unified Deep Learning Architecture for Modeling Biology Sequence.

    PubMed

    Wu, Hongjie; Cao, Chengyuan; Xia, Xiaoyan; Lu, Qiang

    2017-10-09

    Prediction of the spatial structure or function of biological macromolecules based on their sequence remains an important challenge in bioinformatics. When modeling biological sequences using traditional sequencing models, characteristics, such as long-range interactions between basic units, the complicated and variable output of labeled structures, and the variable length of biological sequences, usually lead to different solutions on a case-by-case basis. This study proposed the use of bidirectional recurrent neural networks based on long short-term memory or a gated recurrent unit to capture long-range interactions by designing the optional reshape operator to adapt to the diversity of the output labels and implementing a training algorithm to support the training of sequence models capable of processing variable-length sequences. Additionally, the merge and pooling operators enhanced the ability to capture short-range interactions between basic units of biological sequences. The proposed deep-learning model and its training algorithm might be capable of solving currently known biological sequence-modeling problems through the use of a unified framework. We validated our model on one of the most difficult biological sequence-modeling problems currently known, with our results indicating the ability of the model to obtain predictions of protein residue interactions that exceeded the accuracy of current popular approaches by 10% based on multiple benchmarks.

  9. Molecular Characterization of Geographically Different Banana bunchy top virus Isolates in India.

    PubMed

    Selvarajan, R; Mary Sheeba, M; Balasubramanian, V; Rajmohan, R; Dhevi, N Lakshmi; Sasireka, T

    2010-10-01

    Banana bunchy top disease (BBTD) caused by Banana bunchy top virus (BBTV) is one of the most devastating diseases of banana and poses a serious threat for cultivars like Hill Banana (Syn: Virupakshi) and Grand Naine in India. In this study, we have cloned and sequenced the complete genome comprised of six DNA components of BBTV infecting Hill Banana grown in lower Pulney hills, Tamil Nadu State, India. The complete genome sequence of this hill banana isolate showed high degree of similarity with the corresponding sequences of BBTV isolates originating from Lucknow, Uttar Pradesh State, India, and from Fiji, Egypt, Pakistan, and Australia. In addition, sixteen coat protein (CP) and thirteen replicase genes (Rep) sequences of BBTV isolates collected from different banana growing states of India were cloned and sequenced. The replicase sequences of 13 isolates showed high degree of similarity with that of South Pacific group of BBTV isolates. However, the CP gene of BBTV isolates from Shervroy and Kodaikanal hills of Tamil Nadu showed higher amino acid sequence variability compared to other isolates. Another hill banana isolate from Meghalaya state had 23 nucleotide substitutions in the CP gene but the amino acid sequence was conserved. This is the first report of the characterization of a complete genome of BBTV occurring in the high altitudes of India. Our study revealed that the Indian BBTV isolates with distinct geographical origins belongs to the South Pacific group, except Shervroy and Kodaikanal hill isolates which neither belong to the South Pacific nor the Asian group.

  10. The 3of5 web application for complex and comprehensive pattern matching in protein sequences.

    PubMed

    Seiler, Markus; Mehrle, Alexander; Poustka, Annemarie; Wiemann, Stefan

    2006-03-16

    The identification of patterns in biological sequences is a key challenge in genome analysis and in proteomics. Frequently such patterns are complex and highly variable, especially in protein sequences. They are frequently described using terms of regular expressions (RegEx) because of the user-friendly terminology. Limitations arise for queries with the increasing complexity of patterns and are accompanied by requirements for enhanced capabilities. This is especially true for patterns containing ambiguous characters and positions and/or length ambiguities. We have implemented the 3of5 web application in order to enable complex pattern matching in protein sequences. 3of5 is named after a special use of its main feature, the novel n-of-m pattern type. This feature allows for an extensive specification of variable patterns where the individual elements may vary in their position, order, and content within a defined stretch of sequence. The number of distinct elements can be constrained by operators, and individual characters may be excluded. The n-of-m pattern type can be combined with common regular expression terms and thus also allows for a comprehensive description of complex patterns. 3of5 increases the fidelity of pattern matching and finds ALL possible solutions in protein sequences in cases of length-ambiguous patterns instead of simply reporting the longest or shortest hits. Grouping and combined search for patterns provides a hierarchical arrangement of larger pattern sets. The algorithm is implemented as an internet application and is freely accessible. The application is available at http://dkfz.de/mga2/3of5/3of5.html. The 3of5 application offers an extended vocabulary for the definition of search patterns and thus allows the user to comprehensively specify and identify peptide patterns with variable elements. The n-of-m pattern type offers an improved accuracy for pattern matching in combination with the ability to find all solutions, without compromising the user friendliness of regular expression terms.
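
    The n-of-m idea described in this abstract can be illustrated with a deliberately simplified sketch. It assumes the simplest reading of "n of m": report every window of m residues in which at least n positions hold a residue from a given set. The function name, its parameters, and the example sequence are hypothetical; this is not the 3of5 query syntax or its algorithm, which support much richer constraints.

```python
def n_of_m_hits(sequence, residues, n, m):
    """Return (start, window) for every length-m window of `sequence`
    in which at least n positions hold a residue from `residues`.

    Simplified illustration only; not the 3of5 implementation.
    """
    allowed = set(residues)
    hits = []
    # Slide a window of length m over the sequence and report all hits,
    # not just the first or the longest one.
    for start in range(len(sequence) - m + 1):
        window = sequence[start:start + m]
        count = sum(1 for aa in window if aa in allowed)
        if count >= n:
            hits.append((start, window))
    return hits


if __name__ == "__main__":
    # Hypothetical peptide: look for 3 basic residues (K or R) within 5.
    for start, window in n_of_m_hits("AKRKAAKA", "KR", 3, 5):
        print(start, window)
```

    Returning every overlapping window, instead of only one hit, mirrors the abstract's point about reporting all solutions.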

  11. Plant centromere organization: a dynamic structure with conserved functions.

    PubMed

    Ma, Jianxin; Wing, Rod A; Bennetzen, Jeffrey L; Jackson, Scott A

    2007-03-01

    Although the structural features of centromeres from most multicellular eukaryotes remain to be characterized, recent analyses of the complete sequences of two centromeric regions of rice, together with data from Arabidopsis thaliana and maize, have illuminated the considerable size variation and sequence divergence of plant centromeres. Despite the severe suppression of meiotic chromosomal exchange in centromeric and pericentromeric regions of rice, the centromere core shows high rates of unequal homologous recombination in the absence of chromosomal exchange, resulting in frequent and extensive DNA rearrangement. Not only is the sequence of centromeric tandem and non-tandem repeats highly variable but also the copy number, spacing, order and orientation, providing ample natural variation as the basis for selection of superior centromere performance. This review article focuses on the structural and evolutionary dynamics of plant centromere organization and the potential molecular mechanisms responsible for the rapid changes of centromeric components.

  12. Genetic variability in sunflower (Helianthus annuus L.) and in the Helianthus genus as assessed by retrotransposon-based molecular markers.

    PubMed

    Vukich, M; Schulman, A H; Giordani, T; Natali, L; Kalendar, R; Cavallini, A

    2009-10-01

    The inter-retrotransposon amplified polymorphism (IRAP) protocol was applied for the first time within the genus Helianthus to assess intraspecific variability based on retrotransposon sequences among 36 wild accessions and 26 cultivars of Helianthus annuus L., and interspecific variability among 39 species of Helianthus. Two groups of LTRs, one belonging to a Copia-like retroelement and the other to a putative retrotransposon of unknown nature (SURE) have been isolated, sequenced and primers were designed to obtain IRAP fingerprints. The number of polymorphic bands in H. annuus wild accessions is as high as in Helianthus species. If we assume that a polymorphic band can be related to a retrotransposon insertion, this result suggests that retrotransposon activity continued after Helianthus speciation. Calculation of similarity indices from binary matrices (Shannon's and Jaccard's indices) show that variability is reduced among domesticated H. annuus. On the contrary, similarity indices among Helianthus species were as large as those observed among wild H. annuus accessions, probably related to their scattered geographic distribution. Principal component analysis of IRAP fingerprints allows the distinction between perennial and annual Helianthus species especially when the SURE element is concerned.
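
    As a rough illustration of how a similarity index is computed from binary band matrices, the sketch below implements the Jaccard index for two presence/absence fingerprints. The band vectors are made up for the example and the function name is hypothetical; the abstract does not describe the software the authors used.

```python
def jaccard_similarity(bands_a, bands_b):
    """Jaccard index for two equal-length presence/absence (1/0) vectors.

    Shared bands divided by bands present in at least one sample.
    """
    if len(bands_a) != len(bands_b):
        raise ValueError("fingerprints must score the same band positions")
    shared = sum(1 for a, b in zip(bands_a, bands_b) if a and b)
    present = sum(1 for a, b in zip(bands_a, bands_b) if a or b)
    # Convention chosen here: two empty fingerprints get similarity 0.0.
    return shared / present if present else 0.0


if __name__ == "__main__":
    # Hypothetical IRAP fingerprints scored at five band positions.
    accession_1 = [1, 1, 0, 1, 0]
    accession_2 = [1, 0, 0, 1, 1]
    print(jaccard_similarity(accession_1, accession_2))
```

    Positions where both samples lack a band are ignored, so shared absences do not inflate the similarity.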

  13. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from human and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler), has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0, has now been completed and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences is also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  14. Method to amplify variable sequences without imposing primer sequences

    DOEpatents

    Bradbury, Andrew M.; Zeytun, Ahmet

    2006-11-14

    The present invention provides methods of amplifying target sequences without including regions flanking the target sequence in the amplified product or imposing amplification primer sequences on the amplified product. Also provided are methods of preparing a library from such amplified target sequences.

  15. The gut mycobiome of the Human Microbiome Project healthy cohort.

    PubMed

    Nash, Andrea K; Auchtung, Thomas A; Wong, Matthew C; Smith, Daniel P; Gesell, Jonathan R; Ross, Matthew C; Stewart, Christopher J; Metcalf, Ginger A; Muzny, Donna M; Gibbs, Richard A; Ajami, Nadim J; Petrosino, Joseph F

    2017-11-25

    Most studies describing the human gut microbiome in healthy and diseased states have emphasized the bacterial component, but the fungal microbiome (i.e., the mycobiome) is beginning to gain recognition as a fundamental part of our microbiome. To date, human gut mycobiome studies have primarily been disease centric or in small cohorts of healthy individuals. To contribute to existing knowledge of the human mycobiome, we investigated the gut mycobiome of the Human Microbiome Project (HMP) cohort by sequencing the Internal Transcribed Spacer 2 (ITS2) region as well as the 18S rRNA gene. Three hundred seventeen HMP stool samples were analyzed by ITS2 sequencing. Fecal fungal diversity was significantly lower in comparison to bacterial diversity. Yeast dominated the samples, comprising eight of the top 15 most abundant genera. Specifically, fungal communities were characterized by a high prevalence of Saccharomyces, Malassezia, and Candida, with S. cerevisiae, M. restricta, and C. albicans operational taxonomic units (OTUs) present in 96.8, 88.3, and 80.8% of samples, respectively. There was a high degree of inter- and intra-volunteer variability in fungal communities. However, S. cerevisiae, M. restricta, and C. albicans OTUs were found in 92.2, 78.3, and 63.6% of volunteers, respectively, in all samples donated over an approximately 1-year period. Metagenomic and 18S rRNA gene sequencing data agreed with ITS2 results; however, ITS2 sequencing provided greater resolution of the relatively low abundance mycobiome constituents. Compared to bacterial communities, the human gut mycobiome is low in diversity and dominated by yeast including Saccharomyces, Malassezia, and Candida. Both inter- and intra-volunteer variability in the HMP cohort were high, revealing that unlike bacterial communities, an individual's mycobiome is no more similar to itself over time than to another person's. Nonetheless, several fungal species persisted across a majority of samples, evidence that a core gut mycobiome may exist. ITS2 sequencing data provided greater resolution of the mycobiome membership compared to metagenomic and 18S rRNA gene sequencing data, suggesting that it is a more sensitive method for studying the mycobiome of stool samples.

  16. Variation, Repetition, And Choice

    PubMed Central

    Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

    2005-01-01

    Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
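
    The lag criterion of the VARY alternative can be written as a short rule: a four-response sequence is reinforced only if it differs from each of the n previous sequences. The sketch below is an illustration under stated assumptions. It assumes that every emitted sequence enters the comparison history whether or not it was reinforced, and the left/right key coding ("L", "R") and the function name are hypothetical.

```python
from collections import deque


def make_lag_checker(lag):
    """Return a function that decides reinforcement under a lag-n rule."""
    # Only the `lag` most recent sequences are kept for comparison.
    history = deque(maxlen=lag)

    def check(sequence):
        reinforced = sequence not in history
        # Assumption: every sequence is recorded, reinforced or not.
        history.append(sequence)
        return reinforced

    return check


if __name__ == "__main__":
    check = make_lag_checker(lag=2)
    for seq in ["LRLR", "LRLR", "RRLL", "LLRR", "LRLR"]:
        print(seq, check(seq))
```

    Raising `lag` makes immediate repetition less likely to pay off, which is the manipulation the abstract relates to preference for the REPEAT alternative.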

  17. Species-specific identification of Dekkera/Brettanomyces yeasts by fluorescently labeled DNA probes targeting the 26S rRNA.

    PubMed

    Röder, Christoph; König, Helmut; Fröhlich, Jürgen

    2007-09-01

    Sequencing of the complete 26S rRNA genes of all Dekkera/Brettanomyces species colonizing different beverages revealed the potential for a specific primer and probe design to support diagnostic PCR approaches and FISH. By analysis of the complete 26S rRNA genes of all five currently known Dekkera/Brettanomyces species (Dekkera bruxellensis, D. anomala, Brettanomyces custersianus, B. nanus and B. naardenensis), several regions with high nucleotide sequence variability yet distinct from the D1/D2 domains were identified. FISH species-specific probes targeting the 26S rRNA gene's most variable regions were designed. Accessibility of probe targets for hybridization was facilitated by the construction of partially complementary 'side'-labeled probes, based on secondary structure models of the rRNA sequences. The specificity and routine applicability of the FISH-based method for yeast identification were tested by analyzing different wine isolates. Investigation of the prevalence of Dekkera/Brettanomyces yeasts in the German viticultural regions Wonnegau, Nierstein and Bingen (Rhinehesse, Rhineland-Palatinate) resulted in the isolation of 37 D. bruxellensis strains from 291 wine samples.

  18. Ramped-Amplitude Cross Polarization in Magic-Angle-Spinning NMR

    NASA Astrophysics Data System (ADS)

    Metz, G.; Wu, X. L.; Smith, S. O.

    The Hartmann-Hahn matching profile in CP-MAS NMR shows a strong mismatch dependence if the MAS frequency is on the order of the dipolar couplings in the sample. Under these conditions, the profile breaks down into a series of narrow matching bands separated by the spinning speed, and it becomes difficult to establish and maintain an efficient matching condition. Variable-amplitude CP (VACP), as introduced previously (Peersen et al., J. Magn. Reson. A104, 334, 1993), has been proven to be effective for restoring flat profiles at high spinning speeds. Here, a refined implementation of VACP using a ramped-amplitude cross-polarization sequence (RAMP-CP) is described. The order of the amplitude modulation is shown to be of importance for the cross-polarization process. The new pulse sequence with a linear amplitude ramp is not only easier to set up but also improves the performance of the variable-amplitude experiment in that it produces flat profiles over a wider range of matching conditions even with short total contact times. An increase in signal intensity is obtained compared to both conventional CP and the originally proposed VACP sequence.

  19. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

    PubMed Central

    Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

    2018-01-01

    Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. Conclusions The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters. PMID:29389981

  20. Short-read, high-throughput sequencing technology for STR genotyping

    PubMed Central

    Bornman, Daniel M.; Hester, Mark E.; Schuetter, Jared M.; Kasoji, Manjula D.; Minard-Smith, Angela; Barden, Curt A.; Nelson, Scott C.; Godbold, Gene D.; Baker, Christine H.; Yang, Boyu; Walther, Jacquelyn E.; Tornes, Ivan E.; Yan, Pearlly S.; Rodriguez, Benjamin; Bundschuh, Ralf; Dickens, Michael L.; Young, Brian A.; Faith, Seth A.

    2013-01-01

    DNA-based methods for human identification principally rely upon genotyping of short tandem repeat (STR) loci. Electrophoretic-based techniques for variable-length classification of STRs are universally utilized, but are limited in that they have relatively low throughput and do not yield nucleotide sequence information. High-throughput sequencing technology may provide a more powerful instrument for human identification, but is not currently validated for forensic casework. Here, we present a systematic method to perform high-throughput genotyping analysis of the Combined DNA Index System (CODIS) STR loci using short-read (150 bp) massively parallel sequencing technology. Open source reference alignment tools were optimized to evaluate PCR-amplified STR loci using a custom designed STR genome reference. Evaluation of this approach demonstrated that the 13 CODIS STR loci and amelogenin (AMEL) locus could be accurately called from individual and mixture samples. Sensitivity analysis showed that as few as 18,500 reads, aligned to an in silico referenced genome, were required to genotype an individual (>99% confidence) for the CODIS loci. The power of this technology was further demonstrated by identification of variant alleles containing single nucleotide polymorphisms (SNPs) and the development of quantitative measurements (reads) for resolving mixed samples. PMID:25621315

  1. Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

    PubMed

    Desai, Aarti; Marwah, Veer Singh; Yadav, Akshay; Jha, Vineet; Dhaygude, Kishor; Bangar, Ujwala; Kulkarni, Vivek; Jere, Abhay

    2013-01-01

    Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing has encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads and de novo genome assembly using these short reads is computationally very intensive. Due to lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, currently no report is available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms, however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage which are known to impact genome assembly are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly for different sized genomes using graph based assembly algorithms and real datasets. Illumina reads for E.coli (4.6 MB) S.kudriavzevii (11.18 MB) and C.elegans (100 MB) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing which will enable optimum utilization of sequencing as well as analysis resources.
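
    The depth figures in this abstract follow from the usual relation depth = (number of reads × read length) / genome size. The sketch below applies that relation to the three genome sizes quoted above, reading "MB" as megabases. The 100 bp read length is an assumption made for the example (the abstract does not state it), and the function names are hypothetical.

```python
import math


def mean_depth(num_reads, read_length_bp, genome_size_bp):
    """Average sequencing depth (coverage) across the genome."""
    return num_reads * read_length_bp / genome_size_bp


def reads_for_depth(target_depth, read_length_bp, genome_size_bp):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_depth * genome_size_bp / read_length_bp)


# Genome sizes as quoted in the abstract, interpreted as base pairs.
GENOME_SIZES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}

if __name__ == "__main__":
    read_length = 100  # assumed read length in bp, not from the abstract
    for organism, size in GENOME_SIZES_BP.items():
        print(organism, reads_for_depth(50, read_length, size))
```

    This gives only the average depth; real coverage is uneven, which is one reason the study evaluated assemblies on real rather than simulated reads.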

  2. Fusarium proliferatum - Causal agent of garlic bulb rot in Spain: Genetic variability and mycotoxin production.

    PubMed

    Gálvez, Laura; Urbaniak, Monika; Waśkiewicz, Agnieszka; Stępień, Łukasz; Palmero, Daniel

    2017-10-01

    Fusarium proliferatum is a fungal pathogen that occurs worldwide and affects several crops, including garlic. In Spain, this is the most frequent pathogenic fungus associated with garlic rot during storage. Moreover, F. proliferatum is an important mycotoxigenic species, producing a broad range of toxins, which may pose a risk for food safety. The aim of this study is to assess the intraspecific variability of the garlic pathogen in Spain implied by analyses of translation elongation factor (tef-1α) and FUM1 gene sequences as well as the differences in growth rates. Phylogenetic characterization has been complemented with the characterization of mating type alleles as well as the species potential as a toxin producer. Phylogenetic trees based on the sequence of the translation elongation factor and FUM1 genes from seventy-nine isolates from garlic revealed a considerable intraspecific variability as well as high level of diversity in growth speed. Based on the MAT alleles amplified by PCR, F. proliferatum isolates were separated into different groups on both trees. All isolates collected from garlic in Spain proved to be fumonisin B1, B2, and B3 producers. Quantitative analyses of fumonisins, beauvericin and moniliformin (common secondary metabolites of F. proliferatum) showed no correlation with either the phylogenetic analysis or mycelial growth. This pathogen presents a high intraspecific variability within the same geographical region and host, which must be considered in the management of the disease.

  3. Implementing targeted region capture sequencing for the clinical detection of Alagille syndrome: An efficient and cost‑effective method.

    PubMed

    Huang, Tianhong; Yang, Guilin; Dang, Xiao; Ao, Feijian; Li, Jiankang; He, Yizhou; Tang, Qiyuan; He, Qing

    2017-11-01

    Alagille syndrome (AGS) is a highly variable, autosomal dominant disease that affects multiple structures including the liver, heart, eyes, bones and face. Targeted region capture sequencing focuses on a panel of known pathogenic genes and provides a rapid, cost‑effective and accurate method for molecular diagnosis. In a Chinese family, this method was used on the proband and Sanger sequencing was applied to validate the candidate mutation. A de novo heterozygous mutation (c.3254_3255insT p.Leu1085PhefsX24) of the jagged 1 gene was identified as the potential disease‑causing gene mutation. In conclusion, the present study suggested that target region capture sequencing is an efficient, reliable and accurate approach for the clinical diagnosis of AGS. Furthermore, these results expand on the understanding of the pathogenesis of AGS.

  4. High-Throughput rRNA Gene Sequencing Reveals High and Complex Bacterial Diversity Associated with Brazilian Coffee Bean Fermentation

    PubMed Central

    Vinícius de Melo, Gilberto

    2018-01-01

    Summary Coffee bean fermentation is a spontaneous, on-farm process involving the action of different microbial groups, including bacteria and fungi. In this study, high-throughput sequencing approach was employed to study the diversity and dynamics of bacteria associated with Brazilian coffee bean fermentation. The total DNA from fermenting coffee samples was extracted at different time points, and the 16S rRNA gene with segments around the V4 variable region was sequenced by Illumina high-throughput platform. Using this approach, the presence of over eighty bacterial genera was determined, many of which have been detected for the first time during coffee bean fermentation, including Fructobacillus, Pseudonocardia, Pedobacter, Sphingomonas and Hymenobacter. The presence of Fructobacillus suggests an influence of these bacteria on fructose metabolism during coffee fermentation. Temporal analysis showed a strong dominance of lactic acid bacteria with over 97% of read sequences at the end of fermentation, mainly represented by the Leuconostoc and Lactococcus. Metabolism of lactic acid bacteria was associated with the high formation of lactic acid during fermentation, as determined by HPLC analysis. The results reported in this study confirm the underestimation of bacterial diversity associated with coffee fermentation. New microbial groups reported in this study may be explored as functional starter cultures for on-farm coffee processing.

  5. Microbial ecology in the age of genomics and metagenomics: concepts, tools, and recent advances.

    PubMed

    Xu, Jianping

    2006-06-01

    Microbial ecology examines the diversity and activity of micro-organisms in Earth's biosphere. In the last 20 years, the application of genomics tools has revolutionized microbial ecological studies and drastically expanded our view on the previously underappreciated microbial world. This review first introduces the basic concepts in microbial ecology and the main genomics methods that have been used to examine natural microbial populations and communities. In the ensuing three specific sections, the applications of genomics in microbial ecological research are highlighted. The first describes the widespread application of multilocus sequence typing and representational difference analysis in studying genetic variation within microbial species. Such investigations have identified that migration, horizontal gene transfer and recombination are common in natural microbial populations and that microbial strains can be highly variable in genome size and gene content. The second section highlights and summarizes the use of four specific genomics methods (phylogenetic analysis of ribosomal RNA, DNA-DNA re-association kinetics, metagenomics, and micro-arrays) in analysing the diversity and potential activity of microbial populations and communities from a variety of terrestrial and aquatic environments. Such analyses have identified many unexpected phylogenetic lineages in viruses, bacteria, archaea, and microbial eukaryotes. Functional analyses of environmental DNA also revealed highly prevalent, but previously unknown, metabolic processes in natural microbial communities. In the third section, the ecological implications of sequenced microbial genomes are briefly discussed. Comparative analyses of prokaryotic genomic sequences suggest the importance of ecology in determining microbial genome size and gene content. The significant variability in genome size and gene content among strains and species of prokaryotes indicates the highly fluid nature of prokaryotic genomes, a result consistent with those from multilocus sequence typing and representational difference analyses. The integration of various levels of ecological analyses coupled to the application and further development of high throughput technologies is accelerating the pace of discovery in microbial ecology.

  6. Differential recognition of the ORF2 region in a complete genome sequence of porcine circovirus type 2 (PCV2) isolated from boar bone marrow in Korea.

    PubMed

    Kweon, Chang-Hee; Nguyen, Lien Thi Kim; Yoo, Mi-Sun; Kang, Seung-Won

    2015-09-15

    Porcine circovirus type 2 (PCV2) is the causative agent of post-weaning multisystemic wasting syndrome (PMWS) in swine. Here, a phylogenetic tree was constructed using PCV2 nucleotide sequences derived from the bone marrow of Korean boar and previously reported PCV2 sequences isolated from various countries. PCV2 from Korean boar bone marrow (KC188796) was classified into the group containing PCV2a-Canada and other PCV2 strains from Korea. While the ORF1 region of the PCV2 genome was highly conserved, ORF2 (the capsid protein coding region) was relatively variable. The nucleotide sequences for bone marrow-derived PCV2 were 93.4-99.0% homologous to the other reference sequences. The deduced amino acid sequences for the ORF1 and ORF2 coding regions were 97.4-99.3% and 84.5-97.4% homologous with the other reference strains, respectively, indicating that KC188796 did not differ markedly from the other PCV2 strains. Phylogenetic analysis demonstrated that bone marrow-derived PCV2 was highly similar to PCV2a from Canada and may be related to persistent PCV2 infections in swine.
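
    Percentages such as those reported for ORF1 and ORF2 are typically pairwise identities taken over aligned sequences. The sketch below shows one common way to compute such a value, assuming the two sequences are already aligned to equal length and that positions with a gap in either sequence are skipped. The sequences are invented for the example, the function name is hypothetical, and the abstract does not say which tool or gap convention the authors used.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gap columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Invented 8-nt aligned fragments differing at one position.
    print(percent_identity("ATGCATGC", "ATGAATGC"))
```

    Other gap conventions (for example, counting gaps as mismatches) give different numbers, so reported identities are only comparable when the convention is the same.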

  7. Alterations of microbiota in urine from women with interstitial cystitis

    PubMed Central

    2012-01-01

    Background Interstitial Cystitis (IC) is a chronic inflammatory condition of the bladder with unknown etiology. The aim of this study was to characterize the microbial community present in the urine from IC female patients by 454 high throughput sequencing of the 16S variable regions V1V2 and V6. The taxonomical composition, richness and diversity of the IC microbiota were determined and compared to the microbial profile of asymptomatic healthy female (HF) urine. Results The composition and distribution of bacterial sequences differed between the urine microbiota of IC patients and HFs. Reduced sequence richness and diversity were found in IC patient urine, and a significant difference in the community structure of IC urine in relation to HF urine was observed. More than 90% of the IC sequence reads were identified as belonging to the bacterial genus Lactobacillus, a marked increase compared to 60% in HF urine. Conclusion The 16S rDNA sequence data demonstrates a shift in the composition of the bacterial community in IC urine. The reduced microbial diversity and richness is accompanied by a higher abundance of the bacterial genus Lactobacillus, compared to HF urine. This study demonstrates that high throughput sequencing analysis of urine microbiota in IC patients is a powerful tool towards a better understanding of this enigmatic disease. PMID:22974186
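
    The abstract above compares richness and diversity between two groups of samples without naming the indices used. As a minimal sketch, assuming richness means the number of observed taxa and diversity means the Shannon index (both are assumptions, as are the function names and the made-up read counts), the comparison could look like this:

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts.values() if c > 0)


def shannon_diversity(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical read counts per genus, invented for illustration only.
# They are NOT data from the study; they only mimic one dominant genus
# versus a more even community.
dominated = {"Lactobacillus": 92, "Gardnerella": 5, "Corynebacterium": 3}
more_even = {
    "Lactobacillus": 60,
    "Prevotella": 15,
    "Gardnerella": 10,
    "Corynebacterium": 8,
    "Staphylococcus": 7,
}

for name, sample in (("dominated", dominated), ("more_even", more_even)):
    print(name, richness(sample), round(shannon_diversity(sample), 3))
```

    A community dominated by a single genus scores lower on both measures than a more even one, which is the pattern the abstract describes.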

  8. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  9. DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

    DOE PAGES

    Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

    2016-03-09

    The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.

  10. Haplotype Detection from Next-Generation Sequencing in High-Ploidy-Level Species: 45S rDNA Gene Copies in the Hexaploid Spartina maritima

    PubMed Central

    Boutte, Julien; Aliaga, Benoît; Lima, Oscar; Ferreira de Carvalho, Julie; Ainouche, Abdelkader; Macas, Jiri; Rousseau-Gueutin, Mathieu; Coriton, Olivier; Ainouche, Malika; Salmon, Armel

    2015-01-01

    Gene and whole-genome duplications are widespread in plant nuclear genomes, resulting in sequence heterogeneity. Identification of duplicated genes may be particularly challenging in highly redundant genomes, especially when there are no diploid parents as a reference. Here, we developed a pipeline to detect the different copies in the ribosomal RNA gene family in the hexaploid grass Spartina maritima from next-generation sequencing (Roche-454) reads. The heterogeneity of the different domains of the highly repeated 45S unit was explored by identifying single nucleotide polymorphisms (SNPs) and assembling reads based on shared polymorphisms. SNPs were validated using comparisons with Illumina sequence data sets and by cloning and Sanger (re)sequencing. Using this approach, 29 validated polymorphisms and 11 validated haplotypes were reported (out of 34 and 20, respectively, that were initially predicted by our program). The rDNA domains of S. maritima have similar lengths as those found in other Poaceae, apart from the 5′-ETS, which is approximately two-times longer in S. maritima. Sequence homogeneity was encountered in coding regions and both internal transcribed spacers (ITS), whereas high intragenomic variability was detected in the intergenic spacer (IGS) and the external transcribed spacer (ETS). Molecular cytogenetic analysis by fluorescent in situ hybridization (FISH) revealed the presence of one pair of 45S rDNA signals on the chromosomes of S. maritima instead of three expected pairs for a hexaploid genome, indicating loss of duplicated homeologous loci through the diploidization process. The procedure developed here may be used at any ploidy level and using different sequencing technologies. PMID:26530424

  11. RoboOligo: software for mass spectrometry data to support manual and de novo sequencing of post-transcriptionally modified ribonucleic acids

    PubMed Central

    Sample, Paul J.; Gaston, Kirk W.; Alfonzo, Juan D.; Limbach, Patrick A.

    2015-01-01

    Ribosomal ribonucleic acid (RNA), transfer RNA and other biological or synthetic RNA polymers can contain nucleotides that have been modified by the addition of chemical groups. Traditional Sanger sequencing methods cannot establish the chemical nature and sequence of these modified-nucleotide containing oligomers. Mass spectrometry (MS) has become the conventional approach for determining the nucleotide composition, modification status and sequence of modified RNAs. Modified RNAs are analyzed by MS using collision-induced dissociation tandem mass spectrometry (CID MS/MS), which produces a complex dataset of oligomeric fragments that must be interpreted to identify and place modified nucleosides within the RNA sequence. Here we report the development of RoboOligo, an interactive software program for the robust analysis of data generated by CID MS/MS of RNA oligomers. There are three main functions of RoboOligo: (i) automated de novo sequencing via the local search paradigm. (ii) Manual sequencing with real-time spectrum labeling and cumulative intensity scoring. (iii) A hybrid approach, coined ‘variable sequencing’, which combines the user intuition of manual sequencing with the high-throughput sampling of automated de novo sequencing. PMID:25820423

  12. Sequence variation of the glycoprotein gene identifies three distinct lineages within field isolates of viral hemorrhagic septicemia virus, a fish rhabdovirus

    USGS Publications Warehouse

    Benmansour, A.; Bascuro, B.; Monnier, A.F.; Vende, P.; Winton, J.R.; de Kinkelin, P.

    1997-01-01

    To evaluate the genetic diversity of viral haemorrhagic septicaemia virus (VHSV), the sequences of the glycoprotein (G) genes of 11 North American and European isolates were determined. Comparison with the G protein of representative members of the family Rhabdoviridae suggested that VHSV was a different virus species from infectious haemorrhagic necrosis virus (IHNV) and Hirame rhabdovirus (HIRRV). At a higher taxonomic level, VHSV, IHNV and HIRRV formed a group which was genetically closest to the genus Lyssavirus. Compared with each other, the G genes of VHSV displayed a dissimilar overall genetic diversity which correlated with differences in geographical origin. The multiple sequence alignment of the complete G protein showed that the divergent positions were not uniformly distributed along the sequence. A central region (amino acid position 245-300) accumulated substitutions and appeared to be highly variable. The genetic heterogeneity within a single isolate was high, with an apparent internal mutation frequency of 1.2 × 10^-3 per nucleotide site, attesting the quasispecies nature of the viral population. The phylogeny separated VHSV strains according to the major geographical area of isolation: genotype I for continental Europe, genotype II for the British Isles, and genotype III for North America. Isolates from continental Europe exhibited the highest genetic variability, with sub-groups correlated partially with the serological classification. Neither neutralizing polyclonal sera, nor monoclonal antibodies, were able to discriminate between the genotypes. The overall structure of the phylogenetic tree suggests that VHSV genetic diversity and evolution fit within the model of random change and positive selection operating on quasispecies.

  13. Bacterial community composition in different sediments from the Eastern Mediterranean Sea: a comparison of four 16S ribosomal DNA clone libraries.

    PubMed

    Polymenakou, Paraskevi N; Bertilsson, Stefan; Tselepides, Anastasios; Stephanou, Euripides G

    2005-10-01

    The regional variability of sediment bacterial community composition and diversity was studied by comparative analysis of four large 16S ribosomal DNA (rDNA) clone libraries from sediments in different regions of the Eastern Mediterranean Sea (Thermaikos Gulf, Cretan Sea, and South Ionian Sea). Amplified rDNA restriction analysis of 664 clones from the libraries indicates that the rDNA richness and evenness were high: for example, a near-1:1 relationship among screened clones and number of unique restriction patterns when up to 190 clones were screened for each library. Phylogenetic analysis of 207 bacterial 16S rDNA sequences from the sediment libraries demonstrated that Gamma-, Delta-, and Alphaproteobacteria, Holophaga/Acidobacteria, Planctomycetales, Actinobacteria, Bacteroidetes, and Verrucomicrobia were represented in all four libraries. A few clones also grouped with the Betaproteobacteria, Nitrospirae, Spirochaetales, Chlamydiae, Firmicutes, and candidate division OP11. The abundance of sequences affiliated with Gammaproteobacteria was higher in libraries from shallow sediments in the Thermaikos Gulf (30 m) and the Cretan Sea (100 m) compared to the deeper South Ionian station (2790 m). Most sequences in the four sediment libraries clustered with uncultured 16S rDNA phylotypes from marine habitats, and many of the closest matches were clones from hydrocarbon seeps, benzene-mineralizing consortia, sulfate reducers, sulk oxidizers, and ammonia oxidizers. LIBSHUFF statistics of 16S rDNA gene sequences from the four libraries revealed major differences, indicating either a very high richness in the sediment bacterial communities or considerable variability in bacterial community composition among regions, or both.

  14. LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites

    PubMed Central

    Grievink, Liat Shavit; Penny, David; Hendy, Mike D; Holland, Barbara R

    2009-01-01

    Correction to Shavit Grievink L, Penny D, Hendy MD, Holland BR: LineageSpecificSeqgen: generating sequence data with lineage-specific variation in the proportion of variable sites. BMC Evol Biol 2008, 8(1):317.

  15. Detecting exact breakpoints of deletions with diversity in hepatitis B viral genomic DNA from next-generation sequencing data.

    PubMed

    Cheng, Ji-Hong; Liu, Wen-Chun; Chang, Ting-Tsung; Hsieh, Sun-Yuan; Tseng, Vincent S

    2017-10-01

    Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, even ultimately resulting in hepatocellular carcinoma (HCC). Among the methods for detecting deletions from next-generation sequencing (NGS) data, few consider the characteristics of the virus, such as high evolution rates and high divergence among the different HBV genomes. Sequencing highly divergent HBV genome sequences using NGS technology outputs millions of reads. Thus, detecting exact breakpoints of deletions from these large and complex data incurs very high computational cost. We proposed a novel analytical method named VirDelect (Virus Deletion Detect), which uses a split-read alignment base to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, such that the computational cost can be reduced without losing accuracy. We use four simulated read datasets and two real paired-end read datasets of HBV genome sequence to verify VirDelect accuracy by score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in terms of accuracy score for all simulated datasets, and VirDelect had only two base errors even in real datasets. VirDelect is also shown to deliver high accuracy in analyzing single-end read data as well as paired-end data. VirDelect can serve as an effective and efficient bioinformatics tool for physiologists, with high accuracy and efficient performance, and is applicable to further analysis of genomes with characteristics similar to HBV in genome length and high divergence. The software program of VirDelect can be downloaded at https://sourceforge.net/projects/virdelect/. Copyright © 2017. Published by Elsevier Inc.

  16. A novel ultra high-throughput 16S rRNA gene amplicon sequencing library preparation method for the Illumina HiSeq platform.

    PubMed

    de Muinck, Eric J; Trosvik, Pål; Gilfillan, Gregor D; Hov, Johannes R; Sundaram, Arvind Y M

    2017-07-06

    Advances in sequencing technologies and bioinformatics have made the analysis of microbial communities almost routine. Nonetheless, the need remains to improve on the techniques used for gathering such data, including increasing throughput while lowering cost and benchmarking the techniques so that potential sources of bias can be better characterized. We present a triple-index amplicon sequencing strategy to sequence large numbers of samples at significantly lower cost and in a shorter timeframe compared to existing methods. The design employs a two-stage PCR protocol, incorporating three barcodes to each sample, with the possibility to add a fourth index. It also includes heterogeneity spacers to overcome low complexity issues faced when sequencing amplicons on Illumina platforms. The library preparation method was extensively benchmarked through analysis of a mock community in order to assess biases introduced by sample indexing, number of PCR cycles, and template concentration. We further evaluated the method through re-sequencing of a standardized environmental sample. Finally, we evaluated our protocol on a set of fecal samples from a small cohort of healthy adults, demonstrating good performance in a realistic experimental setting. Between-sample variation was mainly related to batch effects, such as DNA extraction, while sample indexing was also a significant source of bias. PCR cycle number strongly influenced chimera formation and affected relative abundance estimates of species with high GC content. Libraries were sequenced using the Illumina HiSeq and MiSeq platforms to demonstrate that this protocol is highly scalable to sequence thousands of samples at a very low cost. Here, we provide the most comprehensive study of performance and bias inherent to a 16S rRNA gene amplicon sequencing method to date. Triple-indexing greatly reduces the number of long custom DNA oligos required for library preparation, while the inclusion of variable length heterogeneity spacers minimizes the need for PhiX spike-in. This design results in a significant cost reduction of highly multiplexed amplicon sequencing. The biases we characterize highlight the need for highly standardized protocols. Reassuringly, we find that the biological signal is a far stronger structuring factor than the various sources of bias.
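
    The claim that combinatorial indexing reduces the number of barcoded oligos can be illustrated with simple arithmetic. The sketch below is not the authors' design: it assumes that each barcode sits on its own oligo, that every combination of barcodes across index positions is usable, and the barcode counts (8, 12, 10) are hypothetical. Under those assumptions the number of distinguishable samples is the product of the barcode counts, while the number of barcoded oligos to order is only their sum.

```python
import math


def indexing_budget(barcodes_per_position):
    """Return (distinguishable samples, barcoded oligos needed).

    Assumes every combination of barcodes is valid and that each barcode
    is carried by exactly one oligo (an illustrative simplification).
    """
    samples = math.prod(barcodes_per_position)
    oligos = sum(barcodes_per_position)
    return samples, oligos


# Hypothetical triple-index layout: 8 x 12 x 10 barcodes.
samples, oligos = indexing_budget([8, 12, 10])
print(samples, oligos)

# Single-index comparison: one unique barcoded oligo per sample.
print(indexing_budget([samples]))
```

    With these made-up numbers, three index positions label 960 samples using 30 barcoded oligos, whereas a single index would need 960.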

  17. Intra- to Multi-Decadal Temperature Variability over the Continental United States: 1896-2012

    USDA-ARS's Scientific Manuscript database

    The Optimal Ranking Regime (ORR) method was used to identify intra- to multi-decadal (IMD) time windows containing significant ranking sequences in U.S. climate division temperature data. The simplicity of the ORR procedure’s output – a time series’ most significant non-overlapping periods of high o...

  18. Optimal ranking regime analysis of intra- to multidecadal U.S. climate variability. Part I: Temperature

    USDA-ARS's Scientific Manuscript database

    The Optimal Ranking Regime (ORR) method was used to identify intra- to multi-decadal (IMD) time windows containing significant ranking sequences in U.S. climate division temperature data. The simplicity of the ORR procedure’s output – a time series’ most significant non-overlapping periods of high o...

  19. Immunoglobulin kappa light chain gene promoter and enhancer are not responsible for B-cell restricted gene rearrangement.

    PubMed Central

    Goodhardt, M; Babinet, C; Lutfalla, G; Kallenbach, S; Cavelier, P; Rougeon, F

    1989-01-01

    We have produced transgenic mice which synthesize chimeric mouse-rabbit immunoglobulin (Ig) kappa light chains following in vivo recombination of an injected unrearranged kappa gene. The exogenous gene construct contained a mouse germ-line kappa variable (V kappa) gene segment, the mouse germ-line joining (J kappa) locus including the enhancer, and the rabbit b9 constant (C kappa) region. A high level of V-J recombination of the kappa transgene was observed in spleen of the transgenic mice. Surprisingly, a particularly high degree of variability in the exact site of recombination and the presence of non germ-line encoded nucleotides (N-regions) were found at the V-J junction of the rearranged kappa transgene. Furthermore, unlike endogenous kappa genes, rearrangement of the exogenous gene occurred in T-cells of the transgenic mice. These results show that additional sequences, other than the heptamer-nonamer signal sequences and the promoter and enhancer elements, are required to obtain stage- and lineage-specific regulation of Ig kappa light chain gene rearrangement in vivo. PMID:2508061

  20. Genetic polymorphisms in the amino acid transporters LAT1 and LAT2 in relation to the pharmacokinetics and side effects of melphalan.

    PubMed

    Kühne, Annett; Kaiser, Rolf; Schirmer, Markus; Heider, Ulrike; Muhlke, Sabine; Niere, Wiebke; Overbeck, Tobias; Hohloch, Karin; Trümper, Lorenz; Sezer, Orhan; Brockmöller, Jürgen

    2007-07-01

    Melphalan is widely used in the treatment of multiple myeloma. Pharmacokinetics of this alkylating drug shows high inter-individual variability. As melphalan is a phenylalanine derivative, the pharmacokinetic variability may be determined by genetic polymorphisms in the L-type amino acid transporters LAT1 (SLC7A5) and LAT2 (SLC7A8). Pharmacokinetics were analysed in 64 patients after first administration of intravenous melphalan. Severity of side effects was documented according to WHO criteria. Genomic DNA was analysed for polymorphisms in LAT1 and LAT2 by sequencing of the entire coding region, intron-exon boundaries and 2 kb upstream promoter region. Selected polymorphisms in the common heavy chain of both transporters, the protein 4F2hc (SLC3A2), were analysed by single nucleotide primer extension. Melphalan pharmacokinetics was highly variable with up to 6.2-fold differences in total clearance. A total of 44 polymorphisms were identified in LAT1 and 21 polymorphisms in LAT2. From all variants, only five were in the coding region and only one heterozygous non-synonymous polymorphism (Ala94Thr) was found in LAT2. Numerous polymorphisms were found in the LAT1 and LAT2 5'-flanking regions but did not correlate with expression of the respective genes. No significant correlations could be observed between the polymorphisms in 4F2hc, LAT1, and LAT2 with melphalan pharmacokinetics or with melphalan side effects. The study confirmed that these transporter genes are highly conserved, particularly in the coding sequences. Genetic variation in 4F2hc, LAT1, and LAT2 does not appear to be a major cause of inter-individual variability in pharmacokinetics and of adverse reactions to melphalan.

  1. Network Analysis of Sequence-Function Relationships and Exploration of Sequence Space of TEM β-Lactamases.

    PubMed

    Zeil, Catharina; Widmann, Michael; Fademrecht, Silvia; Vogel, Constantin; Pleiss, Jürgen

    2016-05-01

    The Lactamase Engineering Database (www.LacED.uni-stuttgart.de) was developed to facilitate the classification and analysis of TEM β-lactamases. The current version contains 474 TEM variants. Two hundred fifty-nine variants form a large scale-free network of highly connected point mutants. The network was divided into three subnetworks which were enriched by single phenotypes: one network with predominantly 2be and two networks with 2br phenotypes. Fifteen positions were found to be highly variable, contributing to the majority of the observed variants. Since it is expected that a considerable fraction of the theoretical sequence space is functional, the currently sequenced 474 variants represent only the tip of the iceberg of functional TEM β-lactamase variants which form a huge natural reservoir of highly interconnected variants. Almost 50% of the variants are part of a quartet. Thus, two single mutations that result in functional enzymes can be combined into a functional protein. Most of these quartets consist of the same phenotype, or the mutations are additive with respect to the phenotype. By predicting quartets from triplets, 3,916 unknown variants were constructed. Eighty-seven variants complement multiple quartets and therefore have a high probability of being functional. The construction of a TEM β-lactamase network and subsequent analyses by clustering and quartet prediction are valuable tools to gain new insights into the viable sequence space of TEM β-lactamases and to predict their phenotype. The highly connected sequence space of TEM β-lactamases is ideally suited to network analysis and demonstrates the strengths of network analysis over tree reconstruction methods. Copyright © 2016, American Society for Microbiology. All Rights Reserved.
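
    The quartet idea in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the LacED implementation: each variant is modelled as a set of point-mutation labels relative to a reference, a quartet is taken to be {A, A+m1, A+m2, A+m1+m2}, and a missing fourth member is predicted whenever the other three are known. The function names and the mutation labels in the example are placeholders, not entries taken from the database.

```python
from itertools import combinations


def position(mutation):
    """Residue position from a label such as 'E104K' (assumed label format)."""
    return int("".join(ch for ch in mutation if ch.isdigit()))


def predict_from_triplets(variants):
    """Map each predicted variant to the number of triplets that support it."""
    known = {frozenset(v) for v in variants}
    predicted = {}
    for base in known:
        # Known variants carrying exactly one mutation more than `base`.
        singles = [v for v in known if len(v) == len(base) + 1 and base < v]
        for a, b in combinations(singles, 2):
            (m1,) = a - base
            (m2,) = b - base
            if position(m1) == position(m2):
                continue  # two substitutions at one site cannot be combined
            fourth = a | b
            if fourth not in known:
                predicted[fourth] = predicted.get(fourth, 0) + 1
    return predicted


# Placeholder variant set: reference, three single mutants, one double mutant.
known_variants = [set(), {"E104K"}, {"G238S"}, {"M182T"}, {"E104K", "G238S"}]
for variant, support in predict_from_triplets(known_variants).items():
    print(sorted(variant), support)
```

    A predicted variant supported by more than one triplet corresponds to what the abstract calls a variant that complements multiple quartets.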

  2. Analysis of whole genome sequences of 16 strains of rubella virus from the United States, 1961-2009.

    PubMed

    Abernathy, Emily; Chen, Min-hsin; Bera, Jayati; Shrivastava, Susmita; Kirkness, Ewen; Zheng, Qi; Bellini, William; Icenogle, Joseph

    2013-01-25

    Rubella virus is the causative agent of rubella, a mild rash illness, and a potent teratogenic agent when contracted by a pregnant woman. Global rubella control programs target the reduction and elimination of congenital rubella syndrome. Phylogenetic analysis of partial sequences of rubella viruses has contributed to virus surveillance efforts and played an important role in demonstrating that indigenous rubella viruses have been eliminated in the United States. Sixteen wild-type rubella viruses were chosen for whole genome sequencing. All 16 viruses were collected in the United States from 1961 to 2009 and are from 8 of the 13 known rubella genotypes. Phylogenetic analysis of 30 whole genome sequences produced a maximum likelihood tree giving high bootstrap values for all genotypes except provisional genotype 1a. Comparison of the 16 new complete sequences and 14 previously sequenced wild-type viruses found regions with clusters of variable amino acids. The 5' 250 nucleotides of the genome are more conserved than any other part of the genome. Genotype specific deletions in the untranslated region between the non-structural and structural open reading frames were observed for genotypes 2B and genotype 1G. No evidence was seen for recombination events among the 30 viruses. The analysis presented here is consistent with previous reports on the genetic characterization of rubella virus genomes. Conserved and variable regions were identified and additional evidence for genotype specific nucleotide deletions in the intergenic region was found. Phylogenetic analysis confirmed genotype groupings originally based on structural protein coding region sequences, which provides support for the WHO nomenclature for genetic characterization of wild-type rubella viruses.

  3. Alternating high and low climate variability: The context of natural selection and speciation in Plio-Pleistocene hominin evolution.

    PubMed

    Potts, Richard; Faith, J Tyler

    2015-10-01

    Interaction of orbital insolation cycles defines a predictive model of alternating phases of high- and low-climate variability for tropical East Africa over the past 5 million years. This model, which is described in terms of climate variability stages, implies repeated increases in landscape/resource instability and intervening periods of stability in East Africa. It predicts eight prolonged (>192 kyr) eras of intensified habitat instability (high variability stages) in which hominin evolutionary innovations are likely to have occurred, potentially by variability selection. The prediction that repeated shifts toward high climate variability affected paleoenvironments and evolution is tested in three ways. In the first test, deep-sea records of northeast African terrigenous dust flux (Sites 721/722) and eastern Mediterranean sapropels (Site 967A) show increased and decreased variability in concert with predicted shifts in climate variability. These regional measurements of climate dynamics are complemented by stratigraphic observations in five basins with lengthy stratigraphic and paleoenvironmental records: the mid-Pleistocene Olorgesailie Basin, the Plio-Pleistocene Turkana and Olduvai Basins, and the Pliocene Tugen Hills sequence and Hadar Basin--all of which show that highly variable landscapes inhabited by hominin populations were indeed concentrated in predicted stages of prolonged high climate variability. Second, stringent null-model tests demonstrate a significant association of currently known first and last appearance datums (FADs and LADs) of the major hominin lineages, suites of technological behaviors, and dispersal events with the predicted intervals of prolonged high climate variability. Palynological study in the Nihewan Basin, China, provides a third test, which shows the occupation of highly diverse habitats in eastern Asia, consistent with the predicted increase in adaptability in dispersing Oldowan hominins. 
Integration of fossil, archeological, sedimentary, and paleolandscape evidence illustrates the potential influence of prolonged high variability on the origin and spread of critical adaptations and lineages in the evolution of Homo. The growing body of data concerning environmental dynamics supports the idea that the evolution of adaptability in response to climate and overall ecological instability represents a unifying theme in hominin evolutionary history. Published by Elsevier Ltd.

  4. Sequence intrinsic somatic mutation mechanisms contribute to affinity maturation of VRC01-class HIV-1 broadly neutralizing antibodies

    PubMed Central

    Hwang, Joyce K.; Wang, Chong; Du, Zhou; Meyers, Robin M.; Kepler, Thomas B.; Neuberg, Donna; Kwong, Peter D.; Mascola, John R.; Joyce, M. Gordon; Bonsignori, Mattia; Haynes, Barton F.; Yeap, Leng-Siew; Alt, Frederick W.

    2017-01-01

    Variable regions of Ig chains provide the antigen recognition portion of B-cell receptors and derivative antibodies. Ig heavy-chain variable region exons are assembled developmentally from V, D, J gene segments. Each variable region contains three antigen-contacting complementarity-determining regions (CDRs), with CDR1 and CDR2 encoded by the V segment and CDR3 encoded by the V(D)J junction region. Antigen-stimulated germinal center (GC) B cells undergo somatic hypermutation (SHM) of V(D)J exons followed by selection for SHMs that increase antigen-binding affinity. Some HIV-1–infected human subjects develop broadly neutralizing antibodies (bnAbs), such as the potent VRC01-class bnAbs, that neutralize diverse HIV-1 strains. Mature VRC01-class bnAbs, including VRC-PG04, accumulate very high SHM levels, a property that hinders development of vaccine strategies to elicit them. Because many VRC01-class bnAb SHMs are not required for broad neutralization, high overall SHM may be required to achieve certain functional SHMs. To elucidate such requirements, we used a V(D)J passenger allele system to assay, in mouse GC B cells, sequence-intrinsic SHM-targeting rates of nucleotides across substrates representing maturation stages of human VRC-PG04. We identify rate-limiting SHM positions for VRC-PG04 maturation, as well as SHM hotspots and intrinsically frequent deletions associated with SHM. We find that mature VRC-PG04 has low SHM capability due to hotspot saturation but also demonstrate that generation of new SHM hotspots and saturation of existing hotspot regions (e.g., CDR3) does not majorly influence intrinsic SHM in unmutated portions of VRC-PG04 progenitor sequences. We discuss implications of our findings for bnAb affinity maturation mechanisms. PMID:28747530

  5. [Identification of new conserved and variable regions in the 16S rRNA gene of acetic acid bacteria and acetobacteraceae family].

    PubMed

    Chakravorty, S; Sarkar, S; Gachhui, R

    2015-01-01

    The Acetobacteraceae family of the class Alphaproteobacteria is comprised of high sugar and acid tolerant bacteria. The Acetic Acid Bacteria are the economically most significant group of this family because of their association with food products such as vinegar and wine. Acetobacteraceae are often hard to culture in laboratory conditions and they also maintain very low abundances in their natural habitats. Thus identification of the organisms in such environments is greatly dependent on modern tools of molecular biology which require a thorough knowledge of specific conserved gene sequences that may act as primers and/or probes. Moreover, unconserved domains in genes also become markers for differentiating closely related genera. In bacteria, the 16S rRNA gene is an ideal candidate for such conserved and variable domains. In order to study the conserved and variable domains of the 16S rRNA gene of Acetic Acid Bacteria and the Acetobacteraceae family, sequences from publicly available databases were aligned and compared. Near complete sequences of the gene were also obtained from Kombucha tea biofilm, a known Acetobacteraceae family habitat, in order to corroborate the domains obtained from the alignment studies. The study indicated that the degree of conservation in the gene is significantly higher among the Acetic Acid Bacteria than the whole Acetobacteraceae family. Moreover, it was also observed that the previously described hypervariable regions V1, V3, V5, V6 and V7 were more or less conserved in the family and the spans of the variable regions are quite distinct as well.

  6. Random and externally controlled occurrences of Dansgaard-Oeschger events

    NASA Astrophysics Data System (ADS)

    Lohmann, Johannes; Ditlevsen, Peter D.

    2018-05-01

    Dansgaard-Oeschger (DO) events constitute the most pronounced mode of centennial to millennial climate variability of the last glacial period. Since their discovery, many decades of research have been devoted to understanding the origin and nature of these rapid climate shifts. In recent years, a number of studies have appeared that report emergence of DO-type variability in fully coupled general circulation models via different mechanisms. These mechanisms result in the occurrence of DO events at varying degrees of regularity, ranging from periodic to random. When examining the full sequence of DO events as captured in the North Greenland Ice Core Project (NGRIP) ice core record, one can observe high irregularity in the timing of individual events at any stage within the last glacial period. In addition to the prevailing irregularity, certain properties of the DO event sequence, such as the average event frequency or the relative distribution of cold versus warm periods, appear to be changing throughout the glacial. By using statistical hypothesis tests on simple event models, we investigate whether the observed event sequence may have been generated by stationary random processes or rather was strongly modulated by external factors. We find that the sequence of DO warming events is consistent with a stationary random process, whereas dividing the event sequence into warming and cooling events leads to inconsistency with two independent event processes. As we include external forcing, we find a particularly good fit to the observed DO sequence in a model where the average residence time in warm periods is controlled by global ice volume and that in cold periods by boreal summer insolation.
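
    The abstract does not say which hypothesis tests were used, so the following is only a minimal sketch of one way to ask whether a series of event times looks like a stationary random (homogeneous Poisson) process. It compares the coefficient of variation of the waiting times against Monte Carlo simulations with exponential waiting times. The function names, the choice of statistic, and the event times are illustrative assumptions; the data are synthetic and are not the NGRIP record.

```python
import random
import statistics


def waiting_time_cv(event_times):
    """Coefficient of variation of the waiting times between successive events."""
    waits = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    return statistics.pstdev(waits) / statistics.mean(waits)


def poisson_cv_pvalue(event_times, n_sim=999, seed=0):
    """One-sided Monte Carlo p-value for 'more regular than a Poisson process'.

    Null model (an assumption of this sketch): independent exponential waiting
    times. The coefficient of variation is scale invariant, so a unit rate is
    used for every simulated sequence.
    """
    rng = random.Random(seed)
    observed = waiting_time_cv(event_times)
    n_waits = len(event_times) - 1
    hits = 0
    for _ in range(n_sim):
        t, simulated = 0.0, [0.0]
        for _ in range(n_waits):
            t += rng.expovariate(1.0)
            simulated.append(t)
        if waiting_time_cv(simulated) <= observed:
            hits += 1
    # The +1 correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_sim + 1)


# Synthetic, perfectly periodic events (hypothetical spacing of 1500 time units).
periodic_events = [1500.0 * k for k in range(20)]
p_regular = poisson_cv_pvalue(periodic_events)
```

    A small p-value here only indicates that the events are more evenly spaced than the Poisson null would suggest; detecting clustering or slow modulation by external forcing would need a different statistic or a two-sided test.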

  7. Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology.

    PubMed

    Robarts, Daniel W H; Wolfe, Andrea D

    2014-07-01

    In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance.

  8. Sequence-related amplified polymorphism (SRAP) markers: A potential resource for studies in plant molecular biology

    PubMed Central

    Robarts, Daniel W. H.; Wolfe, Andrea D.

    2014-01-01

    In the past few decades, many investigations in the field of plant biology have employed selectively neutral, multilocus, dominant markers such as inter-simple sequence repeat (ISSR), random-amplified polymorphic DNA (RAPD), and amplified fragment length polymorphism (AFLP) to address hypotheses at lower taxonomic levels. More recently, sequence-related amplified polymorphism (SRAP) markers have been developed, which are used to amplify coding regions of DNA with primers targeting open reading frames. These markers have proven to be robust and highly variable, on par with AFLP, and are attained through a significantly less technically demanding process. SRAP markers have been used primarily for agronomic and horticultural purposes, developing quantitative trait loci in advanced hybrids and assessing genetic diversity of large germplasm collections. Here, we suggest that SRAP markers should be employed for research addressing hypotheses in plant systematics, biogeography, conservation, ecology, and beyond. We provide an overview of the SRAP literature to date, review descriptive statistics of SRAP markers in a subset of 171 publications, and present relevant case studies to demonstrate the applicability of SRAP markers to the diverse field of plant biology. Results of these selected works indicate that SRAP markers have the potential to enhance the current suite of molecular tools in a diversity of fields by providing an easy-to-use, highly variable marker with inherent biological significance. PMID:25202637

  9. Pattern of eyelid motion predictive of decision errors during drowsiness: oculomotor indices of altered states.

    PubMed

    Lobb, M L; Stern, J A

    1986-08-01

    Sequential patterns of eye and eyelid motion were identified in seven subjects performing a modified serial probe recognition task under drowsy conditions. Using simultaneous EOG and video recordings, eyelid motion was divided into components above, within, and below the pupil and the durations in sequence were recorded. A serial probe recognition task was modified to allow for distinguishing decision errors from attention errors. Decision errors were found to be more frequent following a downward shift in the gaze angle in which the eyelid closing sequence was reduced from a five element to a three element sequence. The velocity of the eyelid moving over the pupil during decision errors was slow in the closing and fast in the reopening phase, while on decision correct trials it was fast in closing and slower in reopening. Due to the high variability of eyelid motion under drowsy conditions these findings were only marginally significant. When a five element blink occurred, the velocity of the lid over pupil motion component of these endogenous eye blinks was significantly faster on decision correct than on decision error trials. Furthermore, the highly variable, long duration closings associated with the decision response produced slow eye movements in the horizontal plane (SEM) which were more frequent and significantly longer in duration on decision error versus decision correct responses.

  10. Analysis of the entire genomes of fifteen torque teno midi virus variants classifiable into a third group of genus Anellovirus.

    PubMed

    Ninomiya, M; Takahashi, M; Shimosegawa, T; Okamoto, H

    2007-01-01

    Recently, we identified a novel human virus with a circular DNA genome of 3.2 kb, tentatively designated as torque teno midi virus (TTMDV), with a genomic organization resembling those of torque teno virus (TTV) of 3.8-3.9 kb and torque teno mini virus (TTMV) of 2.8-2.9 kb. To investigate the extent of genomic variability of TTMDV genomes, the full-length sequence was determined for 15 TTMDV isolates obtained from viremic individuals in Japan. The 15 TTMDV isolates comprised 3175-3230 bases and shared 67.0-90.3% identities with each other, and were only 68.4-73.0% identical to the 3 reported TTMDV isolates over the entire genome. TTMDV possessed a genomic organization with four open reading frames (ORF1-ORF4) with characteristic sequence motifs and stem and loop structures with high GC content, similar to TTV and TTMV. The total of 18 TTMDV genomes differed by up to 60.7% from each other in the amino acid sequence of ORF1 (658-677 amino acids), but segregated phylogenetically into the same cluster, which was distantly related to the TTVs and TTMVs. These results indicate that TTMDV, with a circular DNA genome of 3.2 kb, has an extremely high degree of genomic variability and is classifiable into a third group in the genus Anellovirus.

  11. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing.

    PubMed

    Kanda, Kojun; Pflug, James M; Sproul, John S; Dasenko, Mark A; Maddison, David R

    2015-01-01

    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
    Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced.

  12. Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing

    PubMed Central

    Dasenko, Mark A.

    2015-01-01

    In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
    Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced. PMID:26716693

  13. An analysis of rotor blade twist variables associated with different Euler sequences and pretwist treatments

    NASA Technical Reports Server (NTRS)

    Alkire, K.

    1984-01-01

    A nonlinear analysis which is necessary to adequately model elastic helicopter rotor blades experiencing moderately large deformations was examined. The analysis must be based on an appropriate description of the blade's deformation geometry including elastic bending and twist. Built-in pretwist angles complicate the deformation process and its definition. Relationships between the twist variables associated with different rotation sequences and corresponding forms of the transformation matrix are listed. Relationships between the twist variables associated with first, the pretwist combined with the deformation twist are included. Many of the corresponding forms of the transformation matrix for the two cases are listed. It is shown that twist variables connected with the combined twist treatment are related to those where the pretwist is applied initially. A method to determine the relationships and some results are outlined. A procedure to evaluate the transformation matrix that eliminates the Euler-like sequence altogether is demonstrated. The resulting form of the transformation matrix is unaffected by rotation sequence or pretwist treatment.

  14. Pre-main sequence variables in young cluster Stock 18

    NASA Astrophysics Data System (ADS)

    Sinha, Tirthendu; Sharma, Saurabh; Pandey, Rakesh; Pandey, Anil Kumar

    2018-04-01

    We have carried out multi-epoch deep I band photometry of the open cluster Stock 18 to search for variable stars in star forming regions. In the present study, we identified 65 periodic and 217 non-periodic variable stars. The periods of most of the periodic variables are between 2 hours and 15 days and their magnitudes vary between 0.05 and 0.6 mag. We have derived spectral energy distributions for 48 probable pre-main sequence variables. Their average age and mass are 2.7 ± 0.3 Myr and 2.7 ± 0.2 M☉, respectively.

  15. Influence of Flow Sequencing Attributed to Climate Change and Climate Variability on the Assessment of Water-dependent Ecosystem Outcomes

    NASA Astrophysics Data System (ADS)

    Wang, J.; Nathan, R.; Horne, A.

    2017-12-01

    Traditional approaches to characterize water-dependent ecosystem outcomes in response to flow have been based on time-averaged hydrological indicators; however, there is increasing recognition of the need to characterize ecological processes that are highly dependent on the sequencing of flow conditions (i.e. floods and droughts). This study considers the representation of flow regimes when considering assessment of ecological outcomes, and in particular, the need to account for sequencing and variability of flow. We conducted two case studies - one in the largely unregulated Ovens River catchment and one in the highly regulated Murray River catchment (both located in south-eastern Australia) - to explore the importance of flow sequencing to the condition of a typical long-lived ecological asset in Australia, the River Red Gum forests. In the first, the Ovens River case study, the implications of representing climate change using different downscaling methods (annual scaling, monthly scaling, quantile mapping, and weather generator method) on the sequencing of flows and resulting ecological outcomes were considered. In the second, the Murray River catchment, sequencing within a historic drought period was considered by systematically making modest adjustments on an annual basis to the hydrological records. In both cases, the condition of River Red Gum forests was assessed using an ecological model that incorporates transitions between ecological conditions in response to sequences of required flow components. The results of both studies show the importance of considering how hydrological alterations are represented when assessing ecological outcomes. The Ovens case study showed that there is significant variation in the predicted ecological outcomes when different downscaling techniques are applied. 
    Similarly, the analysis in the Murray case study showed that the drought as it historically occurred provided one of the best possible outcomes for River Red Gum forests when compared to other re-arrangements of flow within the same drought. These results have implications for the way we represent climate change impacts and drought risk assessments where ecological outcomes are a key management objective.

  16. A population study of the minicircles in Trypanosoma cruzi: predicting guide RNAs in the absence of empirical RNA editing.

    PubMed

    Thomas, Sean; Martinez, L L Isadora Trejo; Westenberger, Scott J; Sturm, Nancy R

    2007-05-24

    The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease; however, the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised. Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences are estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found. The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles. 
    Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.

  17. rpoB-Based Identification of Nonpigmented and Late-Pigmenting Rapidly Growing Mycobacteria

    PubMed Central

    Adékambi, Toïdi; Colson, Philippe; Drancourt, Michel

    2003-01-01

    Nonpigmented and late-pigmenting rapidly growing mycobacteria (RGM) are increasingly isolated in clinical microbiology laboratories. Their accurate identification remains problematic because classification is labor intensive work and because new taxa are not often incorporated into classification databases. Also, 16S rRNA gene sequence analysis underestimates RGM diversity and does not distinguish between all taxa. We determined the complete nucleotide sequence of the rpoB gene, which encodes the bacterial β subunit of the RNA polymerase, for 20 RGM type strains. After using in-house software which analyzes and graphically represents variability stretches of 60 bp along the nucleotide sequence, our analysis focused on a 723-bp variable region exhibiting 83.9 to 97% interspecies similarity and 0 to 1.7% intraspecific divergence. Primer pair Myco-F-Myco-R was designed as a tool for both PCR amplification and sequencing of this region for molecular identification of RGM. This tool was used for identification of 63 RGM clinical isolates previously identified at the species level on the basis of phenotypic characteristics and by 16S rRNA gene sequence analysis. Of 63 clinical isolates, 59 (94%) exhibited <2% partial rpoB gene sequence divergence from 1 of 20 species under study and were regarded as correctly identified at the species level. Mycobacterium abscessus and Mycobacterium mucogenicum isolates were clearly distinguished from Mycobacterium chelonae; Mycobacterium mageritense isolates were clearly distinguished from “Mycobacterium houstonense.” Four isolates were not identified at the species level because they exhibited >3% partial rpoB gene sequence divergence from the corresponding type strain; they belonged to three taxa related to M. mucogenicum, Mycobacterium smegmatis, and Mycobacterium porcinum. For M. abscessus and M. mucogenicum, this partial sequence yielded a high genetic heterogeneity within the clinical isolates. 
    We conclude that molecular identification by analysis of the 723-bp rpoB sequence is a rapid and accurate tool for identification of RGM. PMID:14662964
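
    The in-house software mentioned in this record is not described in the abstract, so the following is only an illustrative sketch of the two quantities it refers to: variability scored in fixed-length windows along an alignment, and percent divergence between two aligned sequences. It assumes the sequences are already aligned to equal length; the function names and the toy sequences are hypothetical, and a short window is used in place of 60 bp so the example stays readable.

```python
def window_variability(aligned, window=60, step=1):
    """Fraction of variable columns in each sliding window of an alignment.

    `aligned` is a list of equal-length strings (pre-aligned sequences).
    Returns a list of (window_start, fraction_variable) tuples.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")
    # A column counts as variable if it contains more than one distinct character.
    variable = [len(set(column)) > 1 for column in zip(*aligned)]
    return [
        (start, sum(variable[start:start + window]) / window)
        for start in range(0, length - window + 1, step)
    ]


def percent_divergence(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same aligned length")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * differences / len(seq_a)


# Toy alignment (not real rpoB data): columns 4 and 7 are variable.
toy = ["ACGTACGTAC", "ACGTTCGTAC", "ACGTACGAAC"]
profile = window_variability(toy, window=4)
print(profile[4])                          # (4, 0.5)
print(percent_divergence(toy[0], toy[1]))  # 10.0
```

    Under the thresholds reported in the abstract, an isolate showing less than 2% divergence from a type strain over the variable region would be assigned to that species; how gaps and ambiguous bases were treated is not stated, and this sketch simply counts them as ordinary characters.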

  18. Comprehensive Interrogation of Natural TALE DNA Binding Modules and Transcriptional Repressor Domains

    PubMed Central

    Cong, Le; Zhou, Ruhong; Kuo, Yu-chi; Cunniff, Margaret; Zhang, Feng

    2012-01-01

    Transcription activator-like effectors (TALE) are sequence-specific DNA binding proteins that harbor modular, repetitive DNA binding domains. TALEs have enabled the creation of customizable designer transcriptional factors and sequence-specific nucleases for genome engineering. Here we report two improvements of the TALE toolbox for achieving efficient activation and repression of endogenous gene expression in mammalian cells. We show that the naturally occurring repeat variable diresidue (RVD) Asn-His (NH) has high biological activity and specificity for guanine, a highly prevalent base in mammalian genomes. We also report an effective TALE transcriptional repressor architecture for targeted inhibition of transcription in mammalian cells. These findings will improve the precision and effectiveness of genome engineering that can be achieved using TALEs. PMID:22828628

  19. Cultural conservatism and variability in the Acheulian sequence of Gesher Benot Ya'aqov.

    PubMed

    Sharon, Gonen; Alperson-Afil, Nira; Goren-Inbar, Naama

    2011-04-01

    The Acheulian Technocomplex exhibits two phenomena: variability and conservatism. Variability is expressed in the composition and frequencies of tool types, particularly in the varying frequencies of bifaces (handaxes and cleavers). Conservatism is expressed in the continuous presence of bifaces along an immense time trajectory. The site of Gesher Benot Ya'aqov (GBY) offers a unique opportunity to study aspects of variability and conservatism as a result of its long cultural-stratigraphic sequence containing superimposed lithic assemblages. This study explores aspects of variability and conservatism within the Acheulian lithic assemblages of GBY, with emphasis placed on the bifacial tools. While variability has been studied through a comparison of typological frequencies in a series of assemblages from the site, evidence for conservatism was examined in the production modes expressed by the reduction sequence of the bifaces. We demonstrate that while pronounced typological variability is observed among the GBY assemblages, they were all manufactured by the same technology. The technology, size, and morphology of the bifaces throughout the entire stratigraphic sequence of GBY reflect the strong conservatism of their makers. We conclude that the biface frequency cannot be considered as a chrono/cultural marker that might otherwise allow us to distinguish between different phases within the Acheulian. The variability observed within the assemblages is explained as a result of different activities, tasks, and functions, which were carried out at specific localities along the shores of the paleo-Hula Lake in the early Middle Pleistocene. Copyright © 2010 Elsevier Ltd. All rights reserved.

  20. Improper trunk rotation sequence is associated with increased maximal shoulder external rotation angle and shoulder joint force in high school baseball pitchers.

    PubMed

    Oyama, Sakiko; Yu, Bing; Blackburn, J Troy; Padua, Darin A; Li, Li; Myers, Joseph B

    2014-09-01

    In a properly coordinated throwing motion, peak pelvic rotation velocity is reached before peak upper torso rotation velocity, so that angular momentum can be transferred effectively from the proximal (pelvis) to distal (upper torso) segment. However, the effects of trunk rotation sequence on pitching biomechanics and performance have not been investigated. The aim of this study was to investigate the effects of trunk rotation sequence on ball speed and on upper extremity biomechanics that are linked to injuries in high school baseball pitchers. The hypothesis was that pitchers with improper trunk rotation sequence would demonstrate lower ball velocity and greater stress to the joint. Descriptive laboratory study. Three-dimensional pitching kinematics data were captured from 72 high school pitchers. Subjects were considered to have proper or improper trunk rotation sequences when the peak pelvic rotation velocity was reached either before or after the peak upper torso rotation velocity beyond the margin of error (±3.7% of the time from stride-foot contact to ball release). Maximal shoulder external rotation angle, elbow extension angle at ball release, peak shoulder proximal force, shoulder internal rotation moment, and elbow varus moment were compared between groups using independent t tests (α < 0.05). Pitchers with improper trunk rotation sequences (n = 33) demonstrated greater maximal shoulder external rotation angle (mean difference, 7.2° ± 2.9°, P = .016) and greater shoulder proximal force (mean difference, 9.2% ± 3.9% body weight, P = .021) compared with those with proper trunk rotation sequences (n = 22). No other variables differed significantly between groups. High school baseball pitchers who demonstrated improper trunk rotation sequences demonstrated greater maximal shoulder external rotation angle and shoulder proximal force compared with pitchers with proper trunk rotation sequences. 
    Improper sequencing of the trunk and torso alters upper extremity joint loading in ways that may influence injury risk. As such, exercises that reinforce the use of a proper trunk rotation sequence during the pitching motion may reduce the stress placed on the structures around the shoulder joint and lead to the prevention of injuries. © 2014 The Author(s).
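
    The grouping rule described in this abstract can be sketched as a small classifier. This is an illustration under stated assumptions, not the authors' code: the function name, argument names, and the "indeterminate" label for pitchers whose timing difference falls inside the margin of error are all hypothetical, and times are taken to be in seconds on a common clock.

```python
def classify_trunk_sequence(t_pelvis_peak, t_torso_peak,
                            t_foot_contact, t_ball_release,
                            margin_pct=3.7):
    """Label a pitch as 'proper', 'improper', or 'indeterminate'.

    The lag between peak pelvic and peak upper torso rotation velocity is
    expressed as a percentage of the time from stride-foot contact to ball
    release, then compared with the +/- margin of error.
    """
    duration = t_ball_release - t_foot_contact
    if duration <= 0:
        raise ValueError("ball release must come after stride-foot contact")
    lag_pct = 100.0 * (t_torso_peak - t_pelvis_peak) / duration
    if lag_pct > margin_pct:
        return "proper"        # pelvis peaks clearly before the upper torso
    if lag_pct < -margin_pct:
        return "improper"      # pelvis peaks clearly after the upper torso
    return "indeterminate"     # difference lies within the margin of error


# Hypothetical timings, in seconds after stride-foot contact.
print(classify_trunk_sequence(0.050, 0.100, 0.0, 0.150))  # proper
print(classify_trunk_sequence(0.100, 0.050, 0.0, 0.150))  # improper
print(classify_trunk_sequence(0.100, 0.102, 0.0, 0.150))  # indeterminate
```

    A third category is consistent with the reported group sizes (33 improper and 22 proper out of 72 pitchers), although the abstract does not say how the remaining pitchers were handled.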

  1. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, Eduard

    1998-01-01

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.
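
    The separation of the two current components described in this abstract rests on a standard property of three phase signals: a balanced positive sequence set sums to zero at every instant, while identical in-phase (zero sequence) components add. The sketch below illustrates that property numerically. The function names, amplitudes, and the 37.5 Hz excitation frequency are arbitrary assumptions for illustration and are not taken from the patent.

```python
import math


def phase_currents(t, f_pos, i_pos, i_zero, f_zero=60.0):
    """Instantaneous currents in the three phases at time t (seconds).

    Each phase carries a positive sequence term (variable frequency, shifted
    by 120 degrees per phase) plus the same zero sequence term at f_zero.
    """
    zero_term = i_zero * math.sin(2.0 * math.pi * f_zero * t)
    return [
        i_pos * math.sin(2.0 * math.pi * f_pos * t - k * 2.0 * math.pi / 3.0)
        + zero_term
        for k in range(3)
    ]


def zero_sequence(currents):
    """Zero sequence component: the mean of the three phase currents."""
    return sum(currents) / 3.0


t = 0.0123
currents = phase_currents(t, f_pos=37.5, i_pos=10.0, i_zero=4.0)
recovered = zero_sequence(currents)
expected = 4.0 * math.sin(2.0 * math.pi * 60.0 * t)
```

    Here `recovered` matches `expected` to floating-point precision whatever excitation frequency is chosen, which is why a filter that passes only the zero sequence component can deliver the 60 Hz power to the utility while the balanced excitation currents stay in the generator windings.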

  2. Variable Speed Wind Turbine Generator with Zero-sequence Filter

    DOEpatents

    Muljadi, Eduard

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility.

  3. Variable speed wind turbine generator with zero-sequence filter

    DOEpatents

    Muljadi, E.

    1998-08-25

    A variable speed wind turbine generator system to convert mechanical power into electrical power or energy and to recover the electrical power or energy in the form of three phase alternating current and return the power or energy to a utility or other load with single phase sinusoidal waveform at sixty (60) hertz and unity power factor includes an excitation controller for generating three phase commanded current, a generator, and a zero sequence filter. Each commanded current signal includes two components: a positive sequence variable frequency current signal to provide the balanced three phase excitation currents required in the stator windings of the generator to generate the rotating magnetic field needed to recover an optimum level of real power from the generator; and a zero sequence sixty (60) hertz current signal to allow the real power generated by the generator to be supplied to the utility. The positive sequence current signals are balanced three phase signals and are prevented from entering the utility by the zero sequence filter. The zero sequence current signals have zero phase displacement from each other and are prevented from entering the generator by the star connected stator windings. The zero sequence filter allows the zero sequence current signals to pass through to deliver power to the utility. 14 figs.

  4. Variability and transmission by Aphis glycines of North American and Asian Soybean mosaic virus isolates.

    PubMed

    Domier, L L; Latorre, I J; Steinlage, T A; McCoppin, N; Hartman, G L

    2003-10-01

    The variability of North American and Asian strains and isolates of Soybean mosaic virus was investigated. First, polymerase chain reaction (PCR) products representing the coat protein (CP)-coding regions of 38 SMVs were analyzed for restriction fragment length polymorphisms (RFLP). Second, the nucleotide and predicted amino acid sequence variability of the P1-coding region of 18 SMVs and the helper component/protease (HC/Pro) and CP-coding regions of 25 SMVs were assessed. The CP nucleotide and predicted amino acid sequences were the most similar and predicted phylogenetic relationships similar to those obtained from RFLP analysis. Neither RFLP nor sequence analyses of the CP-coding regions grouped the SMVs by geographical origin. The P1 and HC/Pro sequences were more variable and separated the North American and Asian SMV isolates into two groups similar to previously reported differences in pathogenic diversity of the two sets of SMV isolates. The P1 region was the most informative of the three regions analyzed. To assess the biological relevance of the sequence differences in the HC/Pro and CP coding regions, the transmissibility of 14 SMV isolates by Aphis glycines was tested. All field isolates of SMV were transmitted efficiently by A. glycines, but the laboratory isolates analyzed were transmitted poorly. The amino acid sequences from most, but not all, of the poorly transmitted isolates contained mutations in the aphid transmission-associated DAG and/or KLSC amino acid sequence motifs of CP and HC/Pro, respectively.

  5. Optimization of parameter values for complex pulse sequences by simulated annealing: application to 3D MP-RAGE imaging of the brain.

    PubMed

    Epstein, F H; Mugler, J P; Brookeman, J R

    1994-02-01

    A number of pulse sequence techniques, including magnetization-prepared gradient echo (MP-GRE), segmented GRE, and hybrid RARE, employ a relatively large number of variable pulse sequence parameters and acquire the image data during a transient signal evolution. These sequences have recently been proposed and/or used for clinical applications in the brain, spine, liver, and coronary arteries. Thus, the need for a method of deriving optimal pulse sequence parameter values for this class of sequences now exists. Due to the complexity of these sequences, conventional optimization approaches, such as applying differential calculus to signal difference equations, are inadequate. We have developed a general framework for adapting the simulated annealing algorithm to pulse sequence parameter value optimization, and applied this framework to the specific case of optimizing the white matter-gray matter signal difference for a T1-weighted variable flip angle 3D MP-RAGE sequence. Using our algorithm, the values of 35 sequence parameters, including the magnetization-preparation RF pulse flip angle and delay time, 32 flip angles in the variable flip angle gradient-echo acquisition sequence, and the magnetization recovery time, were derived. Optimized 3D MP-RAGE achieved up to a 130% increase in white matter-gray matter signal difference compared with optimized 3D RF-spoiled FLASH with the same total acquisition time. The simulated annealing approach was effective at deriving optimal parameter values for a specific 3D MP-RAGE imaging objective, and may be useful for other imaging objectives and sequences in this general class.
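
    The general idea of simulated annealing over a vector of parameter values can be sketched as follows. This is a minimal generic version, not the authors' framework: the objective function, step size, and cooling schedule are placeholder assumptions, and a real pulse sequence application would replace `toy_objective` with a signal simulation (negated, if the goal is to maximize a signal difference).

```python
import math
import random


def simulated_annealing(objective, start, step=0.5, t_start=1.0,
                        t_end=1e-3, cooling=0.995, seed=0):
    """Minimize `objective` over a list of floats; returns (best, best_score)."""
    rng = random.Random(seed)
    current = list(start)
    current_score = objective(current)
    best, best_score = list(current), current_score
    t = t_start
    while t > t_end:
        # Propose a random perturbation of every parameter.
        candidate = [x + rng.uniform(-step, step) for x in current]
        score = objective(candidate)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if score < current_score or rng.random() < math.exp((current_score - score) / t):
            current, current_score = candidate, score
            if score < best_score:
                best, best_score = list(candidate), score
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_score


def toy_objective(params):
    """Hypothetical stand-in objective: squared distance from (1, 2, 3)."""
    return sum((p - target) ** 2 for p, target in zip(params, (1.0, 2.0, 3.0)))


start = [0.0, 0.0, 0.0]
best, best_score = simulated_annealing(toy_objective, start)
```

    Accepting occasional worse moves at high temperature is what lets the search escape local optima, which is the reason the abstract gives for preferring this approach over calculus-based optimization for complex sequences.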

  6. C-Terminal Domain of Hemocyanin, a Major Antimicrobial Protein from Litopenaeus vannamei: Structural Homology with Immunoglobulins and Molecular Diversity

    PubMed Central

    Zhang, Yue-Ling; Peng, Bo; Li, Hui; Yan, Fang; Wu, Hong-Kai; Zhao, Xian-Liang; Lin, Xiang-Min; Min, Shao-Ying; Gao, Yuan-Yuan; Wang, San-Ying; Li, Yuan-You; Peng, Xuan-Xian

    2017-01-01

    Invertebrates rely heavily on immune-like molecules with highly diversified variability so as to counteract infections. However, the mechanisms and the relationship between this variability and functionalities are not well understood. Here, we showed that the C-terminal domain of hemocyanin (HMC) from shrimp Litopenaeus vannamei contained an evolutionarily conserved domain with highly variable genetic sequence, which is structurally homologous to immunoglobulin (Ig). This domain is responsible for recognizing and binding to bacteria or red blood cells, initiating agglutination and hemolysis. Furthermore, when HMC is separated into three fractions using anti-human IgM, IgG, or IgA, the subpopulation, which reacted with anti-human IgM (HMC-M), showed the most significant antimicrobial activity. The high potency of HMC-M is a consequence of glycosylation, as it contains a high abundance of α-d-mannose relative to α-d-glucose and N-acetyl-d-galactosamine. Thus, the removal of these glycans abolished the antimicrobial activity of HMC-M. Our results present a comprehensive investigation of the role of HMC in fighting against infections through genetic variability and epigenetic modification. PMID:28659912

  7. Sequencing of hepatitis C virus for detection of resistance to direct-acting antiviral therapy: A systematic review.

    PubMed

    Bartlett, Sofia R; Grebely, Jason; Eltahla, Auda A; Reeves, Jacqueline D; Howe, Anita Y M; Miller, Veronica; Ceccherini-Silberstein, Francesca; Bull, Rowena A; Douglas, Mark W; Dore, Gregory J; Harrington, Patrick; Lloyd, Andrew R; Jacka, Brendan; Matthews, Gail V; Wang, Gary P; Pawlotsky, Jean-Michel; Feld, Jordan J; Schinkel, Janke; Garcia, Federico; Lennerstrand, Johan; Applegate, Tanya L

    2017-07-01

    The significance of the clinical impact of direct-acting antiviral (DAA) resistance-associated substitutions (RASs) in hepatitis C virus (HCV) on treatment failure is unclear. No standardized methods or guidelines for detection of DAA RASs in HCV exist. To facilitate further evaluations of the impact of DAA RASs in HCV, we conducted a systematic review of RAS sequencing protocols, compiled a comprehensive public library of sequencing primers, and provided expert guidance on the most appropriate methods to screen and identify RASs. The development of standardized RAS sequencing protocols is complicated due to a high genetic variability and the need for genotype- and subtype-specific protocols for multiple regions. We have identified several limitations of the available methods and have highlighted areas requiring further research and development. The development, validation, and sharing of standardized methods for all genotypes and subtypes should be a priority. (Hepatology Communications 2017;1:379-390).

  8. Automated design evolution of stereochemically randomized protein foldamers

    NASA Astrophysics Data System (ADS)

    Ranbhor, Ranjit; Kumar, Anil; Patel, Kirti; Ramakrishnan, Vibin; Durani, Susheel

    2018-05-01

    Diversification of chain stereochemistry opens up the possibilities of an ‘in principle’ increase in the design space of proteins. This huge increase in the sequence and consequent structural variation is aimed at the generation of smart materials. To diversify protein structure stereochemically, we introduced L- and D-α-amino acids as the design alphabet. With a sequence design algorithm, we explored the usage of specific variables such as chirality and the sequence of this alphabet in independent steps. With molecular dynamics, we folded stereochemically diverse homopolypeptides and evaluated their ‘fitness’ for possible design as protein-like foldamers. We propose a fitness function to prune the most optimal fold among 1000 structures simulated with an automated repetitive simulated annealing molecular dynamics (AR-SAMD) approach. The highly scored poly-leucine folds with sequence lengths of 24 and 30 amino acids were later sequence-optimized using a Dead End Elimination cum Monte Carlo based optimization tool. This paper demonstrates a novel approach for the de novo design of protein-like foldamers.

  9. Multiple Use One-Sided Hypotheses Testing in Univariate Linear Calibration

    NASA Technical Reports Server (NTRS)

    Krishnamoorthy, K.; Kulkarni, Pandurang M.; Mathew, Thomas

    1996-01-01

    Consider a normally distributed response variable, related to an explanatory variable through the simple linear regression model. Data obtained on the response variable, corresponding to known values of the explanatory variable (i.e., calibration data), are to be used for testing hypotheses concerning unknown values of the explanatory variable. We consider the problem of testing an unlimited sequence of one sided hypotheses concerning the explanatory variable, using the corresponding sequence of values of the response variable and the same set of calibration data. This is the situation of multiple use of the calibration data. The tests derived in this context are characterized by two types of uncertainties: one uncertainty associated with the sequence of values of the response variable, and a second uncertainty associated with the calibration data. We derive tests based on a condition that incorporates both of these uncertainties. The solution has practical applications in the decision limit problem. We illustrate our results using an example dealing with the estimation of blood alcohol concentration based on breath estimates of the alcohol concentration. In the example, the problem is to test if the unknown blood alcohol concentration of an individual exceeds a threshold that is safe for driving.

  10. Analysis of chaos in high-dimensional wind power system.

    PubMed

    Wang, Cong; Zhang, Hongli; Fan, Wenhui; Ma, Ping

    2018-01-01

    A comprehensive analysis of chaos in a high-dimensional wind power system is performed in this study. A high-dimensional wind power system is more complex than most power systems. An 11-dimensional wind power system proposed by Huang, which has not been analyzed in previous studies, is investigated. The chaotic dynamics of the wind power system are analyzed, and the parameter ranges that produce chaos are obtained, for cases in which the system is affected by external disturbances (including single-parameter and periodic disturbances) or its parameters change. The existence of chaos is confirmed by calculation and analysis of the Lyapunov exponents of all state variables and of the state variable sequence diagram. Theoretical analysis and numerical simulations show that chaos will occur in the wind power system when parameter variations and external disturbances change to a certain degree.

  11. Variable-Interval Sequenced-Action Camera (VINSAC). Dissemination Document No. 1.

    ERIC Educational Resources Information Center

    Ward, Ted

    The 16 millimeter (mm) Variable-Interval Sequenced-Action Camera (VINSAC) is designed for inexpensive photographic recording of effective teacher instruction and use of instructional materials for teacher education and research purposes. The camera photographs single frames at preselected time intervals (.5 second to 20 seconds) which are…

  12. Segmenting words from natural speech: subsegmental variation in segmental cues.

    PubMed

    Rytting, C Anton; Brew, Chris; Fosler-Lussier, Eric

    2010-06-01

    Most computational models of word segmentation are trained and tested on transcripts of speech, rather than the speech itself, and assume that speech is converted into a sequence of symbols prior to word segmentation. We present a way of representing speech corpora that avoids this assumption, and preserves acoustic variation present in speech. We use this new representation to re-evaluate a key computational model of word segmentation. One finding is that high levels of phonetic variability degrade the model's performance. While robustness to phonetic variability may be intrinsically valuable, this finding needs to be complemented by parallel studies of the actual abilities of children to segment phonetically variable speech.

  13. Computational study of β-N-acetylhexosaminidase from Talaromyces flavus, a glycosidase with high substrate flexibility.

    PubMed

    Kulik, Natallia; Slámová, Kristýna; Ettrich, Rüdiger; Křen, Vladimír

    2015-01-28

    β-N-Acetylhexosaminidase (GH20) from the filamentous fungus Talaromyces flavus, previously identified as a prominent enzyme in the biosynthesis of modified glycosides, has so far lacked a high-resolution three-dimensional structure. Despite high sequence identity to the previously reported Aspergillus oryzae and Penicillium oxalicum β-N-acetylhexosaminidases, this enzyme tolerates substrate modification significantly better. Understanding of key structural features, prediction of effective mutants and potential substrate characteristics prior to their synthesis are of general interest. Computational methods including homology modeling and molecular dynamics simulations were applied to shed light on the structure-activity relationship in the enzyme. Primary sequence analysis revealed some variable regions able to influence differences in substrate affinity of hexosaminidases. Moreover, docking in combination with subsequent molecular dynamics simulations of C-6 modified glycosides enabled us to identify the structural features required for accommodation and processing of these bulky substrates in the active site of hexosaminidase from T. flavus. To assess the reliability of predictions based on the reported model, all results were compared with available experimental data, which demonstrated the principal correctness of the predictions as well as the model. The main variable regions in β-N-acetylhexosaminidases determining differences in modified substrate affinity are located close to the active site entrance and engage two loops. Differences in primary sequence and the spatial arrangement of these loops and their interplay with active site amino acids, reflected by interaction energies and dynamics, account for the different catalytic activity and substrate specificity of the various fungal and bacterial β-N-acetylhexosaminidases.

  14. Newly developed primers for complete YCF1 amplification in Pinus (Pinaceae) chloroplasts with possible family-wide utility

    Treesearch

    Matthew Parks; Aaron Liston; Rich Cronn

    2011-01-01

    Primers were designed to amplify the highly variable locus ycf1 from all 11 subsections of Pinus to facilitate plastome assemblies based on short sequence reads as well as future phylogenetic and population genetic analyses. Primer design was based on alignment of 33 Pinus and four Pinaceae plastomes with...

  15. Temporal Dynamics of In-Field Bioreactor Populations Reflect the Groundwater System and Respond Predictably to Perturbation.

    PubMed

    King, Andrew J; Preheim, Sarah P; Bailey, Kathryn L; Robeson, Michael S; Roy Chowdhury, Taniya; Crable, Bryan R; Hurt, Richard A; Mehlhorn, Tonia; Lowe, Kenneth A; Phelps, Tommy J; Palumbo, Anthony V; Brandt, Craig C; Brown, Steven D; Podar, Mircea; Zhang, Ping; Lancaster, W Andrew; Poole, Farris; Watson, David B; W Fields, Matthew; Chandonia, John-Marc; Alm, Eric J; Zhou, Jizhong; Adams, Michael W W; Hazen, Terry C; Arkin, Adam P; Elias, Dwayne A

    2017-03-07

    Temporal variability complicates testing the influences of environmental variability on microbial community structure and thus function. An in-field bioreactor system was developed to assess oxic versus anoxic manipulations on in situ groundwater communities. Each sample was sequenced (16S SSU rRNA genes, average 10,000 reads), and biogeochemical parameters were monitored by quantifying 53 metals, 12 organic acids, 14 anions, and 3 sugars. Changes in dissolved oxygen (DO), pH, and other variables were similar across bioreactors. Sequencing revealed a complex community that fluctuated in-step with the groundwater community and responded to DO. This also directly influenced the pH, and so the biotic impacts of DO and pH shifts are correlated. A null model demonstrated that bioreactor communities were driven in part by experimental conditions but also by stochastic variability and did not accurately capture alterations in diversity during perturbations. We identified two groups of abundant OTUs important to this system; one was abundant in high DO and pH and contained heterotrophs and oxidizers of iron, nitrite, and ammonium, whereas the other was abundant in low DO with the capability to reduce nitrate. In-field bioreactors are a powerful tool for capturing natural microbial community responses to alterations in geochemical factors beyond the bulk phase.

  16. Temporal Dynamics of In-Field Bioreactor Populations Reflect the Groundwater System and Respond Predictably to Perturbation

    DOE PAGES

    King, Andrew J.; Preheim, Sarah P.; Bailey, Kathryn L.; ...

    2017-01-23

    Temporal variability complicates testing the influences of environmental variability on microbial community structure and thus function. An in-field bioreactor system was developed to assess oxic versus anoxic manipulations on in-situ groundwater communities. Each sample was sequenced (16S SSU rRNA genes, average 10,000 reads) and biogeochemical parameters monitored by quantifying 53 metals, 12 organic acids, 14 anions and 3 sugars. Changes in dissolved oxygen (DO), pH, and other variables were similar across bioreactors. Sequencing revealed a complex community that fluctuated in-step with the groundwater community, and responded to DO. This also directly influenced the pH and so the biotic impacts of DO and pH shifts are correlated. A null model demonstrated that bioreactor communities were driven in part by experimental conditions but also by stochastic variability and did not accurately capture alterations in diversity during perturbations. We identified two groups of abundant OTUs important to this system; one was abundant in high DO and pH and contained heterotrophs and oxidizers of iron, nitrite, and ammonium, whereas the other was abundant in low DO with the capability to reduce nitrate. In-field bioreactors are a powerful tool for capturing natural microbial community responses to alterations in geochemical factors beyond the bulk phase.

  17. Temporal Dynamics of In-Field Bioreactor Populations Reflect the Groundwater System and Respond Predictably to Perturbation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    King, Andrew J.; Preheim, Sarah P.; Bailey, Kathryn L.

    Temporal variability complicates testing the influences of environmental variability on microbial community structure and thus function. An in-field bioreactor system was developed to assess oxic versus anoxic manipulations on in-situ groundwater communities. Each sample was sequenced (16S SSU rRNA genes, average 10,000 reads) and biogeochemical parameters monitored by quantifying 53 metals, 12 organic acids, 14 anions and 3 sugars. Changes in dissolved oxygen (DO), pH, and other variables were similar across bioreactors. Sequencing revealed a complex community that fluctuated in-step with the groundwater community, and responded to DO. This also directly influenced the pH and so the biotic impacts of DO and pH shifts are correlated. A null model demonstrated that bioreactor communities were driven in part by experimental conditions but also by stochastic variability and did not accurately capture alterations in diversity during perturbations. We identified two groups of abundant OTUs important to this system; one was abundant in high DO and pH and contained heterotrophs and oxidizers of iron, nitrite, and ammonium, whereas the other was abundant in low DO with the capability to reduce nitrate. In-field bioreactors are a powerful tool for capturing natural microbial community responses to alterations in geochemical factors beyond the bulk phase.

  18. Improved First Pass Spiral Myocardial Perfusion Imaging with Variable Density Trajectories

    PubMed Central

    Salerno, Michael; Sica, Christopher; Kramer, Christopher M.; Meyer, Craig H.

    2013-01-01

    Purpose To develop and evaluate variable-density (VD) spiral first-pass perfusion pulse sequences for improved efficiency and off-resonance performance and to demonstrate the utility of an apodizing density compensation function (DCF) to improve SNR and reduce dark-rim artifact caused by cardiac motion and Gibbs Ringing. Methods Three variable density spiral trajectories were designed, simulated, and evaluated in 18 normal subjects, and in 8 patients with cardiac pathology on a 1.5T scanner. Results By utilizing a density compensation function (DCF) which intentionally apodizes the k-space data, the side-lobe amplitude of the theoretical PSF is reduced by 68%, with only a 13% increase in the FWHM of the main-lobe as compared to the same data corrected with a conventional VD DCF, and has an 8% higher resolution than a uniform density spiral with the same number of interleaves and readout duration. Furthermore, this strategy results in a greater than 60% increase in measured SNR as compared to the same VD spiral data corrected with a conventional DCF (p<0.01). Perfusion defects could be clearly visualized with minimal off-resonance and dark-rim artifacts. Conclusion VD spiral pulse sequences using an apodized DCF produce high-quality first-pass perfusion images with minimal dark-rim and off-resonance artifacts, high SNR and CNR and good delineation of resting perfusion abnormalities. PMID:23280884

  19. Increasing Clinical Severity during a Dengue Virus Type 3 Cuban Epidemic: Deep Sequencing of Evolving Viral Populations

    PubMed Central

    Blanc, Hervé; Bordería, Antonio V.; Díaz, Gisell; Henningsson, Rasmus; Gonzalez, Daniel; Santana, Emidalys; Alvarez, Mayling; Castro, Osvaldo; Fontes, Magnus; Vignuzzi, Marco; Guzman, Maria G.

    2016-01-01

    ABSTRACT During the dengue virus type 3 (DENV-3) epidemic that occurred in Havana in 2001 to 2002, severe disease was associated with the infection sequence DENV-1 followed by DENV-3 (DENV-1/DENV-3), while the sequence DENV-2/DENV-3 was associated with mild/asymptomatic infections. To determine the role of the virus in the increasing severity demonstrated during the epidemic, serum samples collected at different time points were studied. A total of 22 full-length sequences were obtained using a deep-sequencing approach. Bayesian phylogenetic analysis of consensus sequences revealed that two DENV-3 lineages were circulating in Havana at that time, both grouped within genotype III. The predominant lineage is closely related to Peruvian and Ecuadorian strains, while the minor lineage is related to Venezuelan strains. According to consensus sequences, relatively few nonsynonymous mutations were observed; only one was fixed during the epidemic at position 4380 in the NS2B gene. Intrahost genetic analysis indicated that a significant minor population was selected and became predominant toward the end of the epidemic. In conclusion, greater variability was detected during the epidemic's progression in terms of significant minority variants, particularly in the nonstructural genes. An increasing trend of genetic diversity toward the end of the epidemic was observed only for synonymous variant allele rates, with higher variability in secondary cases. Remarkably, significant intrahost genetic variation was demonstrated within the same patient during the course of secondary infection with DENV-1/DENV-3, including changes in the structural proteins premembrane (PrM) and envelope (E). Therefore, the dynamic of evolving viral populations in the context of heterotypic antibodies could be related to the increasing clinical severity observed during the epidemic. 
IMPORTANCE Based on the evidence that DENV fitness is context dependent, our research has focused on the study of viral factors associated with intraepidemic increasing severity in a unique epidemiological setting. Here, we investigated the intrahost genetic diversity in acute human samples collected at different time points during the DENV-3 epidemic that occurred in Cuba in 2001 to 2002 using a deep-sequencing approach. We concluded that greater variability in significant minor populations occurred as the epidemic progressed, particularly in the nonstructural genes, with higher variability observed in secondary infection cases. Remarkably, for the first time significant intrahost genetic variation was demonstrated within the same patient during the course of secondary infection with DENV-1/DENV-3, including changes in structural proteins. These findings indicate that high-resolution approaches are needed to unravel molecular mechanisms involved in dengue pathogenesis. PMID:26889031

  20. A novel program to design siRNAs simultaneously effective to highly variable virus genomes.

    PubMed

    Lee, Hui Sun; Ahn, Jeonghyun; Jun, Eun Jung; Yang, Sanghwa; Joo, Chul Hyun; Kim, Yoo Kyum; Lee, Heuiran

    2009-07-10

    A major concern of antiviral therapy using small interfering RNAs (siRNAs) targeting RNA viral genomes is the high sequence diversity and mutation rate that result from genetic instability. To overcome this problem, it is indispensable to design siRNAs targeting highly conserved regions. We thus designed CAPSID (Convenient Application Program for siRNA Design), a novel bioinformatics program to identify siRNAs targeting highly conserved regions within RNA viral genomes. From a set of input RNAs of diverse sequences, CAPSID rapidly searches for conserved patterns and suggests highly potent siRNA candidates in a hierarchical manner. To validate the usefulness of this novel program, we investigated the antiviral potency of universal siRNA for various Human enterovirus B (HEB) serotypes. Assessment of antiviral efficacy using HeLa cells clearly demonstrates that HEB-specific siRNAs exhibit protective effects against all HEBs examined. These findings strongly indicate that CAPSID can be applied to select universal antiviral siRNAs against highly divergent viral genomes.
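
    The search for conserved regions across diverse input sequences can be sketched in a few lines. This is not CAPSID's algorithm, which the abstract does not describe in detail; it is a minimal illustration that assumes the genomes have already been aligned to equal length, and the function name `conserved_windows` and the toy sequences are hypothetical.

```python
def conserved_windows(aligned_seqs, width):
    """Return start positions of windows identical across all aligned sequences."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    # True at each column where every sequence carries the same character.
    same = [len({seq[i] for seq in aligned_seqs}) == 1 for i in range(length)]
    return [start for start in range(length - width + 1)
            if all(same[start:start + width])]


# Toy alignment: only column 5 varies (T in two sequences, A in one).
toy_alignment = ["AACGTTGCA", "AACGTAGCA", "AACGTTGCA"]
starts_w4 = conserved_windows(toy_alignment, 4)  # -> [0, 1]
starts_w3 = conserved_windows(toy_alignment, 3)  # -> [0, 1, 2, 6]
```

    A practical tool would use a window of siRNA length (around 19 to 21 nucleotides), tolerate a small number of mismatches, and then rank the candidates by predicted potency; none of those steps are shown here.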

  1. A comprehensive analysis of three Asiatic black bear mitochondrial genomes (subspecies ussuricus, formosanus and mupinensis), with emphasis on the complete mtDNA sequence of Ursus thibetanus ussuricus (Ursidae).

    PubMed

    Hwang, Dae-Sik; Ki, Jang-Seu; Jeong, Dong-Hyuk; Kim, Bo-Hyun; Lee, Bae-Keun; Han, Sang-Hoon; Lee, Jae-Seong

    2008-08-01

    In the present paper, we describe the mitochondrial genome sequence of the Asiatic black bear (Ursus thibetanus ussuricus) with particular emphasis on the control region (CR), and compare it with other mitochondrial genomes to examine molecular relationships among the bears. The mitochondrial genome sequence of U. thibetanus ussuricus was 16,700 bp in size with mostly conserved structures (e.g. 13 protein-coding genes, two rRNA genes, and 22 tRNA genes). The CR consisted of several typical conserved domains such as F, E, D, and C boxes, and a conserved sequence block. Nucleotide sequences and the repeated motifs in the CR were different among the bear species, and their copy numbers were also variable according to populations, even within F1 generations of U. thibetanus ussuricus. Comparative analyses showed that the CR D1 region was highly informative for the discrimination of the bear family. These findings suggest that nucleotide sequences of both repeated motifs and CR D1 in the bear family are good markers for species discrimination.

  2. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed Central

    Schnare, M N; Gray, M W

    1982-01-01

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  3. Sequence diversity among badnavirus isolates infecting black pepper and related species in India.

    PubMed

    Bhat, A I; Sasi, Shina; Revathy, K A; Deeshma, K P; Saji, K V

    2014-01-01

    The badnavirus piper yellow mottle virus (PYMoV) is known to infect black pepper (Piper nigrum), betelvine (P. betle) and Indian long pepper (P. longum) in India and other parts of the world. The occurrence of PYMoV or other badnaviruses in other species of Piper, and their variability, has not been reported so far. We have analysed sequence variability in the conserved putative reverse transcriptase (RT)/ribonuclease H (RNase H) coding region of the virus using specific badnavirus primers from 13 virus isolates of black pepper collected from different cultivars and regions and one isolate each from 23 other species of Piper. Of these, four species failed to produce the expected amplicon, while amplicons from four other species showed more similarity to plant sequences than to badnaviruses. Of the remaining, isolates from black pepper, P. argyrophyllum, P. attenuatum, P. barberi, P. betle, P. colubrinum, P. galeatum, P. longum, P. ornatum, P. sarmentosum and P. trichostachyon showed an identity of >85 % at the nucleotide and >90 % at the amino acid level with PYMoV, indicating that they are isolates of PYMoV. On the other hand, high sequence variability of 21-43 % at the nucleotide and 17-46 % at the amino acid level compared to PYMoV was found among isolates infecting P. bababudani, P. chaba, P. peepuloides, P. mullesua and P. thomsonii, suggesting the presence of new badnaviruses. Phylogenetic analyses showed close clustering of all PYMoV isolates that were well separated from other known badnaviruses. This is the first report of the occurrence of PYMoV in eight Piper spp and the likely occurrence of four new species in five Piper spp.
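
    Percent identity figures such as those quoted above are computed from pairwise alignments. The sketch below shows one common convention, assumed here for illustration only: identity is counted over aligned columns in which neither sequence has a gap. The abstract does not say which convention or software the authors used, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity of two aligned sequences, ignoring gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: three of four aligned positions match.
identity = percent_identity("ACGT", "ACGA")  # -> 75.0
```

    With a function like this, the abstract's working criterion could be expressed as a comparison of `percent_identity(isolate, reference)` against 85 at the nucleotide level; that threshold is the one reported in this study and should not be read as a general rule.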

  4. Analysis of mitochondrial DNA in Bolivian llama, alpaca and vicuna populations: a contribution to the phylogeny of the South American camelids.

    PubMed

    Barreta, J; Gutiérrez-Gil, B; Iñiguez, V; Saavedra, V; Chiri, R; Latorre, E; Arranz, J J

    2013-04-01

    The objectives of this work were to assess the mtDNA diversity of Bolivian South American camelid (SAC) populations and to shed light on the evolutionary relationships between the Bolivian camelids and other populations of SACs. We have analysed two different mtDNA regions: the complete coding region of the MT-CYB gene and 513 bp of the D-loop region. The populations sampled included Bolivian llamas, alpacas and vicunas, and Chilean guanacos. High levels of genetic diversity were observed in the studied populations. In general, MT-CYB was more variable than D-loop. On a species level, the vicunas showed the lowest genetic variability, followed by the guanacos, alpacas and llamas. Phylogenetic analyses performed by including additional available mtDNA sequences from the studied species confirmed the existence of the two monophyletic clades previously described by other authors for guanacos (G) and vicunas (V). Significant levels of mtDNA hybridization were found in the domestic species. Our sequence analyses revealed significant sequence divergence within clade G, and some of the Bolivian llamas grouped with the majority of the southern guanacos. This finding supports the existence of more than the one llama domestication centre in South America previously suggested on the basis of archaeozoological evidence. Additionally, analysis of D-loop sequences revealed two new matrilineal lineages that are distinct from the previously reported G and V clades. The results presented here represent the first report on the population structure and genetic variability of Bolivian camelids and may help to elucidate the complex and dynamic domestication process of SAC populations. © 2012 The Authors, Animal Genetics © 2012 Stichting International Foundation for Animal Genetics.

  5. In-Field Spatial Variability in the Degradation of the Phenyl-Urea Herbicide Isoproturon Is the Result of Interactions between Degradative Sphingomonas spp. and Soil pH

    PubMed Central

    Bending, Gary D.; Lincoln, Suzanne D.; Sørensen, Sebastian R.; Morgan, J. Alun W.; Aamand, Jens; Walker, Allan

    2003-01-01

    Substantial spatial variability in the degradation rate of the phenyl-urea herbicide isoproturon (IPU) [3-(4-isopropylphenyl)-1,1-dimethylurea] has been shown to occur within agricultural fields, with implications for the longevity of the compound in the soil, and its movement to ground- and surface water. The microbial mechanisms underlying such spatial variability in degradation rate were investigated at Deep Slade field in Warwickshire, United Kingdom. Most-probable-number analysis showed that rapid degradation of IPU was associated with proliferation of IPU-degrading organisms. Slow degradation of IPU was linked to either a delay in the proliferation of IPU-degrading organisms or apparent cometabolic degradation. Using enrichment techniques, an IPU-degrading bacterial culture (designated strain F35) was isolated from fast-degrading soil, and partial 16S rRNA sequencing placed it within the Sphingomonas group. Denaturing gradient gel electrophoresis (DGGE) of PCR-amplified bacterial community 16S rRNA revealed two bands that increased in intensity in soil during growth-linked metabolism of IPU, and sequencing of the excised bands showed high sequence homology to the Sphingomonas group. However, while F35 was not closely related to either DGGE band, one of the DGGE bands showed 100% partial 16S rRNA sequence homology to an IPU-degrading Sphingomonas sp. (strain SRS2) isolated from Deep Slade field in an earlier study. Experiments with strains SRS2 and F35 in soil and liquid culture showed that the isolates had a narrow pH optimum (7 to 7.5) for metabolism of IPU. The pH requirements of IPU-degrading strains of Sphingomonas spp. could largely account for the spatial variation of IPU degradation rates across the field. PMID:12571001

  6. Implementation and utilization of genetic testing in personalized medicine

    PubMed Central

    Abul-Husn, Noura S; Owusu Obeng, Aniwaa; Sanderson, Saskia C; Gottesman, Omri; Scott, Stuart A

    2014-01-01

    Clinical genetic testing began over 30 years ago with the availability of mutation detection for sickle cell disease diagnosis. Since then, the field has dramatically transformed to include gene sequencing, high-throughput targeted genotyping, prenatal mutation detection, preimplantation genetic diagnosis, population-based carrier screening, and now genome-wide analyses using microarrays and next-generation sequencing. Despite these significant advances in molecular technologies and testing capabilities, clinical genetics laboratories historically have been centered on mutation detection for Mendelian disorders. However, the ongoing identification of deoxyribonucleic acid (DNA) sequence variants associated with common diseases prompted the availability of testing for personal disease risk estimation, and created commercial opportunities for direct-to-consumer genetic testing companies that assay these variants. This germline genetic risk and other clinical, family, and demographic variables are the key components of the personalized medicine paradigm, which aims to incorporate personal genomic and other relevant data into a patient’s clinical assessment to more precisely guide medical management. However, genetic testing for disease risk estimation is an ongoing topic of debate, largely due to inconsistencies in the results, concerns over clinical validity and utility, and the variable mode of delivery when returning genetic results to patients in the absence of traditional counseling. A related class of genetic testing with analogous issues of clinical utility and acceptance is pharmacogenetic testing, which interrogates sequence variants implicated in interindividual drug response variability. Although clinical pharmacogenetic testing has not previously been widely adopted, advances in rapid turnaround time genetic testing technology and the recent implementation of preemptive genotyping programs at selected medical centers suggest that personalized medicine through pharmacogenetics is now a reality. This review aims to summarize the current state of implementing genetic testing for personalized medicine, with an emphasis on clinical pharmacogenetic testing. PMID:25206309

  7. Mining the oral mycobiome: Methods, components, and meaning

    PubMed Central

    Diaz, Patricia I.; Hong, Bo-Young; Dupuy, Amanda K.; Strausbaugh, Linda D.

    2017-01-01

    ABSTRACT Research on oral fungi has centered on Candida. However, recent internal transcribed spacer (ITS)-based studies revealed a vast number of fungal taxa as potential oral residents. We review DNA-based studies of the oral mycobiome and contrast them with cultivation-based surveys, showing that most genera encountered by cultivation have also been detected molecularly. Some taxa such as Malassezia, however, appear in high prevalence and abundance in molecular studies but have not been cultivated. Important technical and bioinformatic challenges to ITS-based oral mycobiome studies are discussed. These include optimization of sample lysis, variability in length of ITS amplicons, high intra-species ITS sequence variability, high inter-species variability in ITS copy number and challenges in nomenclature and maintenance of curated reference databases. Molecular surveys are powerful first steps to characterize the oral mycobiome but further research is needed to unravel which fungi detected by DNA are true oral residents and what role they play in oral homeostasis. PMID:27791473

  8. Worldwide genetic variability of the Duffy binding protein: insights into Plasmodium vivax vaccine development.

    PubMed

    Nóbrega de Sousa, Taís; Carvalho, Luzia Helena; Alves de Brito, Cristiana Ferreira

    2011-01-01

    The dependence of Plasmodium vivax on invasion mediated by Duffy binding protein (DBP) makes this protein a prime candidate for development of a vaccine. However, the development of a DBP-based vaccine might be hampered by the high variability of the protein ligand (DBP(II)), known to bias the immune response toward a specific DBP variant. Here, the hypothesis being investigated is that the analysis of the worldwide DBP(II) sequences will allow us to determine the minimum number of haplotypes (MNH) to be included in a DBP-based vaccine of broad coverage. For that, all DBP(II) sequences available were compiled and MNH was based on the most frequent nonsynonymous single nucleotide polymorphisms, the majority mapped on B and T cell epitopes. A preliminary analysis of DBP(II) genetic diversity from eight malaria-endemic countries estimated that between two and six DBP haplotypes (17 in total) would target at least 50% of the parasite population circulating in each endemic region. Aiming to avoid region-specific haplotypes, we next analyzed the MNH that broadly covers the worldwide parasite population. The results demonstrated that seven haplotypes would be required to cover around 60% of DBP(II) sequences available. Trying to validate these selected haplotypes per country, we found that five out of the eight countries will be covered by the MNH (67% of parasite populations, range 48-84%). In addition, to identify related subgroups of DBP(II) sequences we used a Bayesian clustering algorithm. The algorithm grouped all DBP(II) sequences in six populations that were independent of geographic origin, with ancestral populations present in different proportions in each country. In conclusion, in this first attempt to undertake a global analysis of DBP(II) variability, the results suggest that the development of a DBP-based vaccine should consider multi-haplotype strategies; otherwise a putative P. vivax vaccine may not target some parasite populations.
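
    The coverage idea in this abstract (how many of the most frequent haplotypes are needed to reach a given share of sampled sequences) can be illustrated with a minimal sketch. This is not the authors' procedure: the function name, the greedy most-frequent-first rule, and the toy sample are assumptions made only for illustration.

```python
from collections import Counter


def haplotypes_for_coverage(haplotypes, target):
    """Pick the most frequent haplotypes until `target` coverage is reached.

    haplotypes: one haplotype label per sampled sequence (hypothetical input).
    target: fraction of sequences to cover, between 0 and 1.
    Returns (chosen labels, fraction of sequences they cover).
    """
    counts = Counter(haplotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences supplied")
    chosen, covered = [], 0
    # Walk haplotypes from most to least frequent, stopping once the
    # already-chosen set meets the target.
    for hap, n in counts.most_common():
        if covered / total >= target:
            break
        chosen.append(hap)
        covered += n
    return chosen, covered / total


# Toy sample invented for this sketch; not data from the study.
sample = ["H1"] * 5 + ["H2"] * 3 + ["H3"] * 1 + ["H4"] * 1
selected, coverage = haplotypes_for_coverage(sample, 0.6)
print(selected, coverage)
```

    Running the same function separately on each country's sequences and on the pooled worldwide set would give the per-region and global counts that the abstract contrasts.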

  9. Genetic variability of Echinococcus granulosus complex in various geographical populations of Iran inferred by mitochondrial DNA sequences.

    PubMed

    Spotin, Adel; Mahami-Oskouei, Mahmoud; Harandi, Majid Fasihi; Baratchian, Mehdi; Bordbar, Ali; Ahmadpour, Ehsan; Ebrahimi, Sahar

    2017-01-01

    To investigate the genetic variability and population structure of the Echinococcus granulosus complex, 79 isolates were sequenced from different host species (human, dog, camel, goat, sheep and cattle) from various geographical sub-populations of Iran (Northwestern, Northern, and Southeastern). In addition, 36 sequences of the mitochondrial cytochrome c oxidase subunit 1 (cox1) gene from other geographical populations (Western, Southeastern and Central Iran) were retrieved directly from the GenBank database. The confirmed isolates were grouped as G1 genotype (n=92), G6 genotype (n=14), G3 genotype (n=8) and G2 genotype (n=1). Fifty unique haplotypes were identified based on the analyzed cox1 sequences. A parsimonious network of the sequence haplotypes displayed star-like features in the overall population, with IR23 (22; 19.1%) as the most common haplotype. According to the analysis of molecular variance (AMOVA), haplotype diversity of the E. granulosus complex was high and the genetic variability lay within populations, whereas nucleotide diversity was low in all populations. Neutrality indices of cox1 (Tajima's D and Fu's Fs tests) were negative in the Western-Northwestern, Northern and Southeastern populations, indicating significant divergence from neutrality, and positive but not significant in Central isolates. The pairwise fixation index (Fst), used as an indicator of gene flow, was generally low for all populations (0.00647-0.15198). The Fst statistics indicate that Echinococcus sensu stricto (genotypes G1-G3) populations are not genetically well differentiated among the geographical regions of Iran. To appraise the hypothetical evolutionary scenario, further studies should analyze concatenated mitogenomes as well as a panel of single-locus nuclear markers across wider areas of Iran and neighboring countries.
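
    The contrast this abstract reports between high haplotype diversity and low nucleotide diversity can be made concrete with a small sketch. It uses the standard definitions (Nei's haplotype diversity and the mean proportion of pairwise differences per site); the function names and the four toy sequences are assumptions for illustration and are unrelated to the study's cox1 data or software.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Toy alignment invented for this sketch.
toy = ["ACGT", "ACGT", "ACGA", "TCGT"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(h, pi)
```

    Many distinct haplotypes that each differ from the others at only a site or two push the first statistic up while keeping the second low, which is the pattern typically associated with star-like haplotype networks.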

  10. The bias associated with amplicon sequencing does not affect the quantitative assessment of bacterial community dynamics.

    PubMed

    Ibarbalz, Federico M; Pérez, María Victoria; Figuerola, Eva L M; Erijman, Leonardo

    2014-01-01

    The performance of two sets of primers targeting variable regions of the 16S rRNA gene V1-V3 and V4 was compared in their ability to describe changes of bacterial diversity and temporal turnover in full-scale activated sludge. Duplicate sets of high-throughput amplicon sequencing data of the two 16S rRNA regions shared a collection of core taxa that were observed across a series of twelve monthly samples, although the relative abundance of each taxon was substantially different between regions. A case in point was the changes in the relative abundance of filamentous bacteria Thiothrix, which caused a large effect on diversity indices, but only in the V1-V3 data set. Yet the relative abundance of Thiothrix in the amplicon sequencing data from both regions correlated with the estimation of its abundance determined using fluorescence in situ hybridization. In nonmetric multidimensional analysis samples were distributed along the first ordination axis according to the sequenced region rather than according to sample identities. The dynamics of microbial communities indicated that V1-V3 and the V4 regions of the 16S rRNA gene yielded comparable patterns of: 1) the changes occurring within the communities along fixed time intervals, 2) the slow turnover of activated sludge communities and 3) the rate of species replacement calculated from the taxa-time relationships. The temperature was the only operational variable that showed significant correlation with the composition of bacterial communities over time for the sets of data obtained with both pairs of primers. In conclusion, we show that despite the bias introduced by amplicon sequencing, the variable regions V1-V3 and V4 can be confidently used for the quantitative assessment of bacterial community dynamics, and provide a proper qualitative account of general taxa in the community, especially when the data are obtained over a convenient time window rather than at a single time point.
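
    The abstract notes that shifts in the relative abundance of a single taxon (Thiothrix) had a large effect on diversity indices. It does not say which indices were used; as an assumption, the sketch below uses the Shannon index to show how one dominant taxon lowers the value. The counts are invented for illustration.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical read counts for four taxa in two samples.
even_sample = [25, 25, 25, 25]     # all taxa equally abundant
dominated_sample = [85, 5, 5, 5]   # one taxon dominates the community

h_even = shannon_index(even_sample)
h_dominated = shannon_index(dominated_sample)
print(h_even, h_dominated)
```

    Because the index depends only on relative abundances, a primer set that over-amplifies one taxon can change the index even when the list of detected taxa is identical.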

  11. Unusual Variability of the Drosophila Melanogaster Ref(2)p Protein Which Controls the Multiplication of Sigma Rhabdovirus

    PubMed Central

    Dru, P.; Bras, F.; Dezelee, S.; Gay, P.; Petitjean, A. M.; Pierre-Deneubourg, A.; Teninges, D.; Contamine, D.

    1993-01-01

    The ref(2)P gene of Drosophila melanogaster was identified by the discovery of two alleles, P(o) and P(p), respectively, permissive and restrictive for sigma rhabdovirus multiplication. A surprising variability of this gene was first noticed by the observation of size differences between the transcripts of permissive and restrictive alleles. In this paper, another restrictive allele, P(n), clearly distinct from P(p), is described: it exhibits a weaker antiviral effect than P(p) and differs from P(p) by its molecular structure. Five types of alleles were distinguished on the basis of their molecular structure, as revealed by S1 nuclease analysis of 17 D. melanogaster strains; three alleles were permissive and two restrictive. Comparison of the sequences of four haplotypes revealed numerous point mutations, two deletions (21 and 24 bp) and a complex event involving a 3-bp deletion, all affecting the coding region. The unusual variability of the ref(2)P locus was confirmed by the high ratio of amino acid replacements to synonymous mutations (7:1), as compared to that of other genes, such as Adh (2:42). Nevertheless, nucleotide sequence comparison with the Drosophila erecta ref(2)P gene shows that selective pressures are exerted to maintain the existence of a functional protein. The effects of this high variability on the ref(2)P protein are discussed in relation to its specific antiviral properties and to its function in D. melanogaster, where it is required for male fertility. PMID:8462852

  12. Changes in the neural control of a complex motor sequence during learning

    PubMed Central

    Otchy, Timothy M.; Goldberg, Jesse H.; Aronov, Dmitriy; Fee, Michale S.

    2011-01-01

    The acquisition of complex motor sequences often proceeds through trial-and-error learning, requiring the deliberate exploration of motor actions and the concomitant evaluation of the resulting performance. Songbirds learn their song in this manner, producing highly variable vocalizations as juveniles. As the song improves, vocal variability is gradually reduced until it is all but eliminated in adult birds. In the present study we examine how the motor program underlying such a complex motor behavior evolves during learning by recording from the robust nucleus of the arcopallium (RA), a motor cortex analog brain region. In young birds, neurons in RA exhibited highly variable firing patterns that throughout development became more precise, sparse, and bursty. We further explored how the developing motor program in RA is shaped by its two main inputs: LMAN, the output nucleus of a basal ganglia-forebrain circuit, and HVC, a premotor nucleus. Pharmacological inactivation of LMAN during singing made the song-aligned firing patterns of RA neurons adultlike in their stereotypy without dramatically affecting the spike statistics or the overall firing patterns. Removing the input from HVC, on the other hand, resulted in a complete loss of stereotypy of both the song and the underlying motor program. Thus our results show that a basal ganglia-forebrain circuit drives motor exploration required for trial-and-error learning by adding variability to the developing motor program. As learning proceeds and the motor circuits mature, the relative contribution of LMAN is reduced, allowing the premotor input from HVC to drive an increasingly stereotyped song. PMID:21543758

  13. Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles.

    PubMed

    Gadala-Maria, Daniel; Yaari, Gur; Uduman, Mohamed; Kleinstein, Steven H

    2015-02-24

    Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large-scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.

  14. Tufted capuchin monkeys (Sapajus sp) learning how to crack nuts: does variability decline throughout development?

    PubMed

    Resende, Briseida Dogo; Nagy-Reis, Mariana Baldy; Lacerda, Fernanda Neves; Pagnotta, Murillo; Savalli, Carine

    2014-11-01

    We investigated the process of nut-cracking acquisition in a semi-free population of tufted capuchin monkeys (Sapajus sp) in São Paulo, Brazil. We analyzed the cracking episodes from monkeys of different ages and found that variability of actions related to cracking declined. Inept movements were more frequent in juveniles, which also showed an improvement in efficient striking. The most effective behavioral sequence for cracking was more frequently used by the most experienced monkeys, which also used non-optimal sequences. Variability in behavior sequences and actions may allow adaptive changes to behavior under changing environmental conditions.

  15. Cloning and sequence analysis of complementary DNA encoding an aberrantly rearranged human T-cell gamma chain.

    PubMed Central

    Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L

    1986-01-01

    Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. PMID:3458221

  16. First insight into dead wood protistan diversity: a molecular sampling of bright-spored Myxomycetes (Amoebozoa, slime-moulds) in decaying beech logs.

    PubMed

    Clissmann, Fionn; Fiore-Donno, Anna Maria; Hoppe, Björn; Krüger, Dirk; Kahl, Tiemo; Unterseher, Martin; Schnittler, Martin

    2015-06-01

    Decaying wood hosts a large diversity of seldom investigated protists. Environmental sequencing offers novel insights into communities, but has rarely been applied to saproxylic protists. We investigated the diversity of bright-spored wood-inhabiting Myxomycetes by environmental sequencing. Myxomycetes have a complex life cycle culminating in the formation of mainly macroscopic fruiting bodies, highly variable in shape and colour that are often found on decaying logs. Our hypothesis was that diversity of bright-spored Myxomycetes would increase with decay. DNA was extracted from wood chips collected from 17 beech logs of varying decay stages from the Hainich-Dün region in Central Germany. We obtained 260 partial small subunit ribosomal RNA gene sequences of bright-spored Myxomycetes that were assembled into 29 OTUs, of which 65% were less than 98% similar to those in the existing database. The OTU richness revealed by molecular analysis surpassed that of a parallel inventory of fruiting bodies. We tested several environmental variables and identified pH, rather than decay stage, as the main structuring factor of myxomycete distribution.

  17. Genomic analysis of NAC transcription factors in banana (Musa acuminata) and definition of NAC orthologous groups for monocots and dicots.

    PubMed

    Cenci, Albero; Guignon, Valentin; Roux, Nicolas; Rouard, Mathieu

    2014-05-01

    Identifying the molecular mechanisms underlying tolerance to abiotic stresses is important in crop breeding. A comprehensive understanding of the gene families associated with drought tolerance is therefore highly relevant. NAC transcription factors form a large plant-specific gene family involved in the regulation of tissue development and responses to biotic and abiotic stresses. The main goal of this study was to set up a framework of orthologous groups determined by an expert sequence comparison of NAC genes from both monocots and dicots. In order to clarify the orthologous relationships among NAC genes of different species, we performed an in-depth comparative study of four divergent taxa, in dicots and monocots, whose genomes have already been completely sequenced: Arabidopsis thaliana, Vitis vinifera, Musa acuminata and Oryza sativa. Due to independent evolution, NAC copy number is highly variable in these plant genomes. Based on an expert NAC sequence comparison, we propose forty orthologous groups of NAC sequences that were probably derived from an ancestor gene present in the most recent common ancestor of dicots and monocots. These orthologous groups provide a curated resource for large-scale protein sequence annotation of NAC transcription factors. The established orthology relationships also provide a useful reference for NAC function studies in newly sequenced genomes such as M. acuminata and other plant species.

  18. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

    PubMed Central

    2013-01-01

    Background The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another. PMID:23870653

  19. A new hybrid approach for MHC genotyping: high-throughput NGS and long read MinION nanopore sequencing, with application to the non-model vertebrate Alpine chamois (Rupicapra rupicapra).

    PubMed

    Fuselli, S; Baptista, R P; Panziera, A; Magi, A; Guglielmi, S; Tonin, R; Benazzo, A; Bauzer, L G; Mazzoni, C J; Bertorelle, G

    2018-03-24

    The major histocompatibility complex (MHC) acts as an interface between the immune system and infectious diseases. Accurate characterization and genotyping of the extremely variable MHC loci are challenging especially without a reference sequence. We designed a combination of long-range PCR, Illumina short-reads, and Oxford Nanopore MinION long-reads approaches to capture the genetic variation of the MHC II DRB locus in an Italian population of the Alpine chamois (Rupicapra rupicapra). We utilized long-range PCR to generate a 9 Kb fragment of the DRB locus. Amplicons from six different individuals were fragmented, tagged, and simultaneously sequenced with Illumina MiSeq. One of these amplicons was sequenced with the MinION device, which produced long reads covering the entire amplified fragment. A pipeline that combines short and long reads resolved several short tandem repeats and homopolymers and produced a de novo reference, which was then used to map and genotype the short reads from all individuals. The assembled DRB locus showed a high level of polymorphism and the presence of a recombination breakpoint. Our results suggest that an amplicon-based NGS approach coupled with single-molecule MinION nanopore sequencing can efficiently achieve both the assembly and the genotyping of complex genomic regions in multiple individuals in the absence of a reference sequence.

  20. ImmuneDB: a system for the analysis and exploration of high-throughput adaptive immune receptor sequencing data.

    PubMed

    Rosenfeld, Aaron M; Meng, Wenzhao; Luning Prak, Eline T; Hershberg, Uri

    2017-01-15

    As high-throughput sequencing of B cells becomes more common, the need for tools to analyze the large quantity of data also increases. This article introduces ImmuneDB, a system for analyzing vast amounts of heavy chain variable region sequences and exploring the resulting data. It can take as input raw FASTA/FASTQ data, identify genes, determine clones, construct lineages, as well as provide information such as selection pressure and mutation analysis. It uses an industry-leading database, MySQL, to provide fast analysis and avoid the complexities of using error-prone flat files. ImmuneDB is freely available at http://immunedb.com. A demo of the ImmuneDB web interface is available at http://immunedb.com/demo. Contact: Uh25@drexel.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

  1. Molecular Cloning and Expression of Three Polygalacturonase cDNAs from the Tarnished Plant Bug, Lygus lineolaris

    PubMed Central

    Allen, Margaret L.; Mertens, Jeffrey A.

    2008-01-01

    Three unique cDNAs encoding putative polygalacturonase enzymes were isolated from the tarnished plant bug, Lygus lineolaris (Palisot de Beauvois) (Hemiptera: Miridae). The three nucleotide sequences were dissimilar to one another, but the deduced amino acid sequences were similar to each other and to other polygalacturonases from insects, fungi, plants, and bacteria. Four conserved segments characteristic of polygalacturonases were present, but with some notable semiconservative substitutions. Two of four expected disulfide bridge-forming cysteine pairs were present. All three inferred protein translations included predicted signal sequences of 17 to 20 amino acids. Amplification of genomic DNA identified an intron in one of the genes, Llpg1, in the 5′ untranslated region. Semiquantitative RT-PCR revealed expression in all stages of the insect except the eggs. Expression in adults, male and female, was highly variable, indicating a family of highly inducible and diverse enzymes adapted to the generalist polyphagous nature of this important pest. PMID:20233096

  2. The Dallas-Fort Worth Airport Earthquake Sequence: Seismicity Beyond Injection Period

    NASA Astrophysics Data System (ADS)

    Ogwari, Paul O.; DeShon, Heather R.; Hornbach, Matthew J.

    2018-01-01

    The 2008 Dallas-Fort Worth Airport earthquakes mark the beginning of seismicity rate changes linked to oil and gas operations in the central United States. We assess the spatial and temporal evolution of the sequence through December 2015 using template-based waveform correlation and relative location methods. We locate 400 earthquakes spanning 2008-2015 along a basement fault mapped as the Airport fault. The sequence exhibits temporally variable b values, and small-magnitude (m < 3.4) earthquakes spread northeast along strike over time. Pore pressure diffusion models indicate that the high-volume brine injection well located within 1 km of the 2008 earthquakes, although only operating from September 2008 to August 2009, contributes most significantly to long-term pressure perturbations, and hence stress changes, along the fault; a second long-operating, low-volume injector located 10 km north causes insufficient pressure changes. High-volume injection for a short time period near a critically stressed fault can induce long-lasting seismicity.
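
    The abstract mentions temporally variable b values but not how they were estimated. As an assumption, the sketch below uses the widely used Aki maximum-likelihood estimator, b = log10(e) / (mean(M) - (Mc - dM/2)), where Mc is the magnitude of completeness and dM the magnitude bin width. The function name and the toy catalogue are hypothetical.

```python
import math


def b_value(magnitudes, completeness_mag, bin_width=0.0):
    """Aki maximum-likelihood b value for events at or above completeness_mag.

    bin_width applies the usual half-bin correction for binned magnitudes;
    leave it at 0.0 for continuous magnitudes.
    """
    used = [m for m in magnitudes if m >= completeness_mag]
    if not used:
        raise ValueError("no events at or above the completeness magnitude")
    mean_excess = sum(used) / len(used) - (completeness_mag - bin_width / 2)
    if mean_excess <= 0:
        raise ValueError("mean magnitude must exceed the corrected threshold")
    return math.log10(math.e) / mean_excess


# Toy catalogue invented for this sketch; not the Airport fault data.
toy_magnitudes = [1.0, 1.2, 1.4, 1.6, 1.8]
b = b_value(toy_magnitudes, completeness_mag=1.0)
print(b)
```

    Applying the estimator to successive time windows of a catalogue is one simple way to look for temporal changes in b, provided each window holds enough events for a stable estimate.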

  3. BRILIA: Integrated Tool for High-Throughput Annotation and Lineage Tree Assembly of B-Cell Repertoires.

    PubMed

    Lee, Donald W; Khavrutskii, Ilja V; Wallqvist, Anders; Bavari, Sina; Cooper, Christopher L; Chaudhury, Sidhartha

    2016-01-01

    The somatic diversity of antigen-recognizing B-cell receptors (BCRs) arises from Variable (V), Diversity (D), and Joining (J) (VDJ) recombination and somatic hypermutation (SHM) during B-cell development and affinity maturation. The VDJ junction of the BCR heavy chain forms the highly variable complementarity determining region 3 (CDR3), which plays a critical role in antigen specificity and binding affinity. Tracking the selection and mutation of the CDR3 can be useful in characterizing humoral responses to infection and vaccination. Although tens to hundreds of thousands of unique BCR genes within an expressed B-cell repertoire can now be resolved with high-throughput sequencing, tracking SHMs is still challenging because existing annotation methods are often limited by poor annotation coverage, inconsistent SHM identification across the VDJ junction, or lack of B-cell lineage data. Here, we present B-cell repertoire inductive lineage and immunosequence annotator (BRILIA), an algorithm that leverages repertoire-wide sequencing data to globally improve the VDJ annotation coverage, lineage tree assembly, and SHM identification. On benchmark tests against simulated human and mouse BCR repertoires, BRILIA correctly annotated germline and clonally expanded sequences with 94 and 70% accuracy, respectively, and it has a 90% SHM-positive prediction rate in the CDR3 of heavily mutated sequences; these are substantial improvements over existing methods. We used BRILIA to process BCR sequences obtained from splenic germinal center B cells extracted from C57BL/6 mice. BRILIA returned robust B-cell lineage trees and yielded SHM patterns that are consistent across the VDJ junction and agree with known biological mechanisms of SHM. By contrast, existing BCR annotation tools, which do not account for repertoire-wide clonal relationships, systematically underestimated the size of clonally related B-cell clusters and yielded inconsistent SHM frequencies. We demonstrate BRILIA's utility in B-cell repertoire studies related to VDJ gene usage, mechanisms for adenosine mutations, and SHM hot spot motifs. Furthermore, we show that the complete gene usage annotation and SHM identification across the entire CDR3 are essential for studying the B-cell affinity maturation process through immunosequencing methods.

  4. Species identification of mutans streptococci by groESL gene sequence.

    PubMed

    Hung, Wei-Chung; Tsai, Jui-Chang; Hsueh, Po-Ren; Chia, Jean-San; Teng, Lee-Jene

    2005-09-01

    The near full-length sequences of the groESL genes were determined and analysed among eight reference strains (serotypes a to h) representing five species of mutans group streptococci. The groES sequences from these reference strains revealed that there are two lengths (285 and 288 bp) in the five species. The intergenic spacer between groES and groEL appears to be a unique marker for species, with a variable size (ranging from 111 to 310 bp) and sequence. Phylogenetic analysis of groES and groEL separated the eight serotypes into two major clusters. Strains of serotypes b, c, e and f were highly related and had groES gene sequences of the same length, 288 bp, while strains of serotypes a, d, g and h were also closely related and their groES gene sequence lengths were 285 bp. The groESL sequences in clinical isolates of three serotypes of S. mutans were analysed for intraspecies polymorphism. The results showed that the groESL sequences could provide information for differentiation among species, but were unable to distinguish serotypes of the same species. Based on the determined sequences, a PCR assay was developed that could differentiate members of the mutans streptococci by amplicon size and provide an alternative way for distinguishing mutans streptococci from other viridans streptococci.
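
    The length-based grouping reported above can be sketched as a simple lookup. This is only an illustration built from the two groES lengths quoted in the abstract (288 bp for serotypes b, c, e and f; 285 bp for serotypes a, d, g and h). The function and table names are hypothetical, and it is not the PCR assay the authors describe, which works on amplicon size.

```python
# Hypothetical lookup table; the lengths and serotype groups are the ones
# quoted in the abstract, nothing else is taken from the paper.
GROES_LENGTH_TO_SEROTYPES = {
    288: ("b", "c", "e", "f"),
    285: ("a", "d", "g", "h"),
}


def serotype_cluster(groes_sequence: str) -> tuple:
    """Return the serotype cluster implied by the groES length.

    Whitespace and alignment gaps ('-') are removed before measuring.
    An empty tuple means the length matches neither reported value.
    """
    ungapped = "".join(groes_sequence.split()).replace("-", "")
    return GROES_LENGTH_TO_SEROTYPES.get(len(ungapped), ())
```

    A sequence of any other length returns an empty tuple, so an unexpected length is reported as unknown and not forced into either cluster.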

  5. Transposable elements in cancer.

    PubMed

    Burns, Kathleen H

    2017-07-01

    Transposable elements give rise to interspersed repeats, sequences that comprise most of our genomes. These mobile DNAs have been historically underappreciated - both because they have been presumed to be unimportant, and because their high copy number and variability pose unique technical challenges. Neither impediment now seems steadfast. Interest in the human mobilome has never been greater, and methods enabling its study are maturing at a fast pace. This Review describes the activity of transposable elements in human cancers, particularly long interspersed element-1 (LINE-1). LINE-1 sequences are self-propagating, protein-coding retrotransposons, and their activity results in somatically acquired insertions in cancer genomes. Altered expression of transposable elements and animation of genomic LINE-1 sequences appear to be hallmarks of cancer, and can be responsible for driving mutations in tumorigenesis.

  6. StralSV: assessment of sequence variability within similar 3D structures and application to polio RNA-dependent RNA polymerase.

    PubMed

    Zemla, Adam T; Lang, Dorothy M; Kostova, Tanya; Andino, Raul; Ecale Zhou, Carol L

    2011-06-02

    Most of the currently used methods for protein function prediction rely on sequence-based comparisons between a query protein and those for which a functional annotation is provided. A serious limitation of sequence similarity-based approaches for identifying residue conservation among proteins is the low confidence in assigning residue-residue correspondences among proteins when the level of sequence identity between the compared proteins is poor. Multiple sequence alignment methods are more satisfactory--still, they cannot provide reliable results at low levels of sequence identity. Our goal in the current work was to develop an algorithm that could help overcome these difficulties by facilitating the identification of structurally (and possibly functionally) relevant residue-residue correspondences between compared protein structures. Here we present StralSV (structure-alignment sequence variability), a new algorithm for detecting closely related structure fragments and quantifying residue frequency from tight local structure alignments. We apply StralSV in a study of the RNA-dependent RNA polymerase of poliovirus, and we demonstrate that the algorithm can be used to determine regions of the protein that are relatively unique, or that share structural similarity with proteins that would be considered distantly related. By quantifying residue frequencies among many residue-residue pairs extracted from local structural alignments, one can infer potential structural or functional importance of specific residues that are determined to be highly conserved or that deviate from a consensus. We further demonstrate that considerable detailed structural and phylogenetic information can be derived from StralSV analyses. StralSV is a new structure-based algorithm for identifying and aligning structure fragments that have similarity to a reference protein. 
StralSV analysis can be used to quantify residue-residue correspondences and identify residues that may be of particular structural or functional importance, as well as unusual or unexpected residues at a given sequence position. StralSV is provided as a web service at http://proteinmodel.org/AS2TS/STRALSV/.
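
    The idea of quantifying residue frequencies at corresponding positions can be sketched as follows. This is a minimal illustration and not the StralSV algorithm itself: it assumes the local structure alignment has already been done and that the fragments arrive as equal-length strings with '-' for gaps. The function names and the 0.5 threshold are hypothetical.

```python
from collections import Counter


def residue_frequencies(aligned_fragments):
    """Per-position residue frequencies for equal-length aligned fragments.

    Gap characters ('-') are ignored; a column containing only gaps
    yields an empty dict.
    """
    if not aligned_fragments:
        return []
    length = len(aligned_fragments[0])
    if any(len(frag) != length for frag in aligned_fragments):
        raise ValueError("all fragments must have the same aligned length")
    profile = []
    for column in zip(*aligned_fragments):
        counts = Counter(res for res in column if res != "-")
        total = sum(counts.values())
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


def deviations_from_consensus(query, profile, threshold=0.5):
    """List (position, query_residue, consensus_residue) where the query
    differs from a consensus residue whose frequency is >= threshold."""
    flagged = []
    for i, (res, freqs) in enumerate(zip(query, profile)):
        if not freqs:
            continue
        consensus, freq = max(freqs.items(), key=lambda kv: kv[1])
        if freq >= threshold and res != consensus:
            flagged.append((i, res, consensus))
    return flagged
```

    Positions with a strongly dominant residue correspond to the "highly conserved" case in the abstract, and flagged positions to residues that "deviate from a consensus".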

  7. Community composition of lacustrine small eukaryotes in hyper-eutrophic conditions in relation to top-down and bottom-up factors.

    PubMed

    Lepère, Cécile; Domaizon, Isabelle; Debroas, Didier

    2007-09-01

    Small eukaryotes (0.2-5 µm) in hyper-eutrophic conditions were described using terminal restriction fragment length polymorphism and cloning-sequencing, and were related to environmental variables both by an experimental approach and by a temporal field study. In situ analysis showed marked temporal variations in the dominant terminal restriction fragments (T-RFs), which were related to environmental variables such as nutrient concentrations and metazooplankton composition. To monitor the responses of the small-eukaryote community to top-down (absence or presence of planktivorous fish) and bottom-up (low or high nitrogen and phosphorus addition) effects, a cross-classified design mesocosm experiment was used. Depending on the type of treatment, we recorded changes in the diversity of T-RFs, as well as modifications in phylogenetic composition. Centroheliozoa and Cryptophyta were found in all types of treatment, whereas Chlorophyta were specific to enclosures receiving high nutrient loadings, and were associated either with LKM11 and 'environmental sequences'. Cercozoa and Fungi were not detected in enclosures receiving high nutrient loadings and fishes. Our results showed that resources and top-down factors are both clearly involved in shaping the structure of small eukaryotes, not only autotrophs but also heterotrophs, via complex interactions and trophic cascades within a microbial loop, notably in response to nutrient loading.

  8. Genetic variability and evolutionary dynamics of viruses of the family Closteroviridae

    PubMed Central

    Rubio, Luis; Guerri, José; Moreno, Pedro

    2013-01-01

    RNA viruses have a great potential for genetic variation, rapid evolution and adaptation. Characterization of the genetic variation of viral populations provides relevant information on the processes involved in virus evolution and epidemiology, and it is crucial for designing reliable diagnostic tools and developing efficient and durable disease control strategies. Here we performed an updated analysis of sequences available in GenBank and reviewed present knowledge on the genetic variability and evolutionary processes of viruses of the family Closteroviridae. Several factors have shaped the genetic structure and diversity of closteroviruses. (1) Strong negative selection seems to be responsible for the high genetic stability in space and time of some viruses. (2) Long-distance migration, probably by human transport of infected propagative plant material, has resulted in genetically similar virus isolates being found in distant geographical regions. (3) Recombination between divergent sequence variants has generated new genotypes and plays an important role in the evolution of some viruses of the family Closteroviridae. (4) Interaction between virus strains or between different viruses in mixed infections may alter accumulation of certain strains. (5) Host change or virus transmission by insect vectors induced changes in the viral population structure due to positive selection of sequence variants with higher fitness for host-virus or vector-virus interaction (adaptation) or by genetic drift due to random selection of sequence variants during the population bottleneck associated with the transmission process. PMID:23805130

  9. Targeted next-generation sequencing in steroid-resistant nephrotic syndrome: mutations in multiple glomerular genes may influence disease severity.

    PubMed

    Bullich, Gemma; Trujillano, Daniel; Santín, Sheila; Ossowski, Stephan; Mendizábal, Santiago; Fraga, Gloria; Madrid, Álvaro; Ariceta, Gema; Ballarín, José; Torra, Roser; Estivill, Xavier; Ars, Elisabet

    2015-09-01

    Genetic diagnosis of steroid-resistant nephrotic syndrome (SRNS) using Sanger sequencing is complicated by the high genetic heterogeneity and phenotypic variability of this disease. We aimed to improve the genetic diagnosis of SRNS by simultaneously sequencing 26 glomerular genes using massive parallel sequencing and to study whether mutations in multiple genes increase disease severity. High-throughput mutation analysis was performed in 50 SRNS and/or focal segmental glomerulosclerosis (FSGS) patients, a validation cohort of 25 patients with known pathogenic mutations, and a discovery cohort of 25 uncharacterized patients with probable genetic etiology. In the validation cohort, we identified the 42 previously known pathogenic mutations across NPHS1, NPHS2, WT1, TRPC6, and INF2 genes. In the discovery cohort, disease-causing mutations in SRNS/FSGS genes were found in nine patients. We detected three patients with mutations in an SRNS/FSGS gene and COL4A3. Two of them were familial cases and presented a more severe phenotype than family members with mutation in only one gene. In conclusion, our results show that massive parallel sequencing is feasible and robust for genetic diagnosis of SRNS/FSGS. Our results indicate that patients carrying mutations in an SRNS/FSGS gene and also in COL4A3 gene have increased disease severity.

  10. Ecological niche modelling and nDNA sequencing support a new, morphologically cryptic beetle species unveiled by DNA barcoding.

    PubMed

    Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael

    2011-02-09

    DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species.

  11. Ecological Niche Modelling and nDNA Sequencing Support a New, Morphologically Cryptic Beetle Species Unveiled by DNA Barcoding

    PubMed Central

    Hawlitschek, Oliver; Porch, Nick; Hendrich, Lars; Balke, Michael

    2011-01-01

    Background DNA sequencing techniques used to estimate biodiversity, such as DNA barcoding, may reveal cryptic species. However, disagreements between barcoding and morphological data have already led to controversy. Species delimitation should therefore not be based on mtDNA alone. Here, we explore the use of nDNA and bioclimatic modelling in a new species of aquatic beetle revealed by mtDNA sequence data. Methodology/Principal Findings The aquatic beetle fauna of Australia is characterised by high degrees of endemism, including local radiations such as the genus Antiporus. Antiporus femoralis was previously considered to exist in two disjunct, but morphologically indistinguishable populations in south-western and south-eastern Australia. We constructed a phylogeny of Antiporus and detected a deep split between these populations. Diagnostic characters from the highly variable nuclear protein encoding arginine kinase gene confirmed the presence of two isolated populations. We then used ecological niche modelling to examine the climatic niche characteristics of the two populations. All results support the status of the two populations as distinct species. We describe the south-western species as Antiporus occidentalis sp.n. Conclusion/Significance In addition to nDNA sequence data and extended use of mitochondrial sequences, ecological niche modelling has great potential for delineating morphologically cryptic species. PMID:21347370

  12. The sequence of camelpox virus shows it is most closely related to variola virus, the cause of smallpox.

    PubMed

    Gubser, Caroline; Smith, Geoffrey L

    2002-04-01

    Camelpox virus (CMPV) and variola virus (VAR) are orthopoxviruses (OPVs) that share several biological features and cause high mortality and morbidity in their single host species. The sequence of a virulent CMPV strain was determined; it is 202182 bp long, with inverted terminal repeats (ITRs) of 6045 bp and has 206 predicted open reading frames (ORFs). As for other poxviruses, the genes are tightly packed with little non-coding sequence. Most genes within 25 kb of each terminus are transcribed outwards towards the terminus, whereas genes within the centre of the genome are transcribed from either DNA strand. The central region of the genome contains genes that are highly conserved in other OPVs and 87 of these are conserved in all sequenced chordopoxviruses. In contrast, genes towards either terminus are more variable and encode proteins involved in host range, virulence or immunomodulation. In some cases, these are broken versions of genes found in other OPVs. The relationship of CMPV to other OPVs was analysed by comparisons of DNA and predicted protein sequences, repeats within the ITRs and arrangement of ORFs within the terminal regions. Each comparison gave the same conclusion: CMPV is the closest known virus to variola virus, the cause of smallpox.

  13. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing

    PubMed Central

    Tourlousse, Dieter M.; Yoshiike, Satowa; Ohashi, Akiko; Matsukura, Satoko; Noda, Naohiro

    2017-01-01

    High-throughput sequencing of 16S rRNA gene amplicons (16S-seq) has become a widely deployed method for profiling complex microbial communities, but technical pitfalls related to data reliability and quantification remain to be fully addressed. In this work, we have developed and implemented a set of synthetic 16S rRNA genes to serve as universal spike-in standards for 16S-seq experiments. The spike-ins represent full-length 16S rRNA genes containing artificial variable regions with negligible identity to known nucleotide sequences, permitting unambiguous identification of spike-in sequences in 16S-seq read data from any microbiome sample. Using defined mock communities and environmental microbiota, we characterized the performance of the spike-in standards and demonstrated their utility for evaluating data quality on a per-sample basis. Further, we showed that staggered spike-in mixtures added at the point of DNA extraction enable concurrent estimation of absolute microbial abundances suitable for comparative analysis. Results also underscored that template-specific Illumina sequencing artifacts may lead to biases in the perceived abundance of certain taxa. Taken together, the spike-in standards represent a novel bioanalytical tool that can substantially improve 16S-seq-based microbiome studies by enabling comprehensive quality control along with absolute quantification. PMID:27980100
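
    The principle behind spike-in based absolute quantification can be sketched with a single-point scaling. This is a deliberately simplified illustration and not the authors' procedure, which uses staggered mixtures of several spike-ins. It assumes one spike-in added at a known copy number and equal recovery and amplification efficiency for every template. All names are hypothetical.

```python
def absolute_abundances(read_counts, spike_in_id, spike_in_copies_added):
    """Scale read counts to estimated copy numbers using one spike-in.

    read_counts: dict mapping taxon or spike-in identifier -> read count.
    spike_in_id: key in read_counts that holds the spike-in reads.
    spike_in_copies_added: known number of spike-in copies added to the sample.

    Assumption (illustrative only): each read represents the same number of
    template copies regardless of which template it came from.
    """
    spike_reads = read_counts.get(spike_in_id, 0)
    if spike_reads <= 0:
        raise ValueError("no spike-in reads recovered; cannot scale")
    copies_per_read = spike_in_copies_added / spike_reads
    return {
        taxon: count * copies_per_read
        for taxon, count in read_counts.items()
        if taxon != spike_in_id
    }
```

    A sample with no spike-in reads raises an error instead of returning estimates, which mirrors the per-sample quality-control use described in the abstract.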

  14. Identification and verification of hybridoma-derived monoclonal antibody variable region sequences using recombinant DNA technology and mass spectrometry

    USDA-ARS's Scientific Manuscript database

    Antibody engineering requires the identification of antigen binding domains or variable regions (VR) unique to each antibody. It is the VR that define the unique antigen binding properties and proper sequence identification is essential for functional evaluation and performance of recombinant antibo...

  15. Database-independent Protein Sequencing (DiPS) Enables Full-length de Novo Protein and Antibody Sequence Determination.

    PubMed

    Savidor, Alon; Barzilay, Rotem; Elinger, Dalia; Yarden, Yosef; Lindzen, Moshit; Gabashvili, Alexandra; Adiv Tal, Ophir; Levin, Yishai

    2017-06-01

    Traditional "bottom-up" proteomic approaches use proteolytic digestion, LC-MS/MS, and database searching to elucidate peptide identities and their parent proteins. Protein sequences absent from the database cannot be identified, and even if present in the database, complete sequence coverage is rarely achieved even for the most abundant proteins in the sample. Thus, sequencing of unknown proteins such as antibodies or constituents of metaproteomes remains a challenging problem. To date, there is no available method for full-length protein sequencing, independent of a reference database, in high throughput. Here, we present Database-independent Protein Sequencing, a method for unambiguous, rapid, database-independent, full-length protein sequencing. The method is a novel combination of non-enzymatic, semi-random cleavage of the protein, LC-MS/MS analysis, peptide de novo sequencing, extraction of peptide tags, and their assembly into a consensus sequence using an algorithm named "Peptide Tag Assembler." As proof-of-concept, the method was applied to samples of three known proteins representing three size classes and to a previously un-sequenced, clinically relevant monoclonal antibody. Excluding leucine/isoleucine and glutamic acid/deamidated glutamine ambiguities, end-to-end full-length de novo sequencing was achieved with 99-100% accuracy for all benchmarking proteins and the antibody light chain. Accuracy of the sequenced antibody heavy chain, including the entire variable region, was also 100%, but there was a 23-residue gap in the constant region sequence. © 2017 by The American Society for Biochemistry and Molecular Biology, Inc.

  16. Hidden Markov models incorporating fuzzy measures and integrals for protein sequence identification and alignment.

    PubMed

    Bidargaddi, Niranjan P; Chetty, Madhu; Kamruzzaman, Joarder

    2008-06-01

    Profile hidden Markov models (HMMs) based on classical HMMs have been widely applied for protein sequence identification. The formulation of the forward and backward variables in profile HMMs is made under the statistical independence assumption of probability theory. We propose a fuzzy profile HMM to overcome the limitations of that assumption and to achieve an improved alignment for protein sequences belonging to a given family. The proposed model fuzzifies the forward and backward variables by incorporating Sugeno fuzzy measures and Choquet integrals, thus further extending the generalized HMM. Based on the fuzzified forward and backward variables, we propose a fuzzy Baum-Welch parameter estimation algorithm for profiles. The strong correlations and the sequence preference involved in protein structures make this fuzzy-architecture-based model a suitable candidate for building profiles of a given family, since the fuzzy set can handle uncertainties better than classical methods.
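
    For orientation, the classical forward variable that the abstract proposes to fuzzify can be sketched as below. This is the standard probabilistic forward algorithm for a generic discrete HMM, not the fuzzy variant with Sugeno measures and Choquet integrals and not a full profile HMM. The parameter layout (nested dicts) is an assumption made for the example.

```python
def forward(observations, states, start_p, trans_p, emit_p):
    """Classical forward algorithm: likelihood of an observation sequence.

    start_p[s]      : probability of starting in state s
    trans_p[r][s]   : probability of moving from state r to state s
    emit_p[s][obs]  : probability that state s emits symbol obs
    """
    if len(observations) == 0:
        raise ValueError("observation sequence must not be empty")
    # Initialisation: alpha_1(s) = pi(s) * b_s(o_1)
    alpha = {s: start_p[s] * emit_p[s][observations[0]] for s in states}
    # Recursion: alpha_t(s) = b_s(o_t) * sum_r alpha_{t-1}(r) * a_{rs}
    for obs in observations[1:]:
        alpha = {
            s: emit_p[s][obs] * sum(alpha[r] * trans_p[r][s] for r in states)
            for s in states
        }
    # Termination: sum over the final forward variables
    return sum(alpha.values())
```

    The plain sum over predecessor states in the recursion step is where the independence assumption enters. According to the abstract, the fuzzy model changes the formulation of the forward and backward variables at this point.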

  17. Mind the gap! The mitochondrial control region and its power as a phylogenetic marker in echinoids.

    PubMed

    Bronstein, Omri; Kroh, Andreas; Haring, Elisabeth

    2018-05-30

    In Metazoa, mitochondrial markers are the most commonly used targets for inferring species-level molecular phylogenies due to their extremely low rate of recombination, maternal inheritance, ease of use and fast substitution rate in comparison to nuclear DNA. The mitochondrial control region (CR) is the main non-coding area of the mitochondrial genome and contains the mitochondrial origin of replication and transcription. While sequences of the cytochrome oxidase subunit 1 (COI) and 16S rRNA genes are the prime mitochondrial markers in phylogenetic studies, the highly variable CR is typically ignored and not targeted in such analyses. However, the higher substitution rate of the CR can be harnessed to infer the phylogeny of closely related species, and the use of a non-coding region alleviates biases resulting from both directional and purifying selection. Additionally, complete mitochondrial genome assemblies utilizing next generation sequencing (NGS) data often show exceptionally low coverage at specific regions, including the CR. This can only be resolved by targeted sequencing of this region. Here we provide novel sequence data for the echinoid mitochondrial control region in over 40 species across the echinoid phylogenetic tree. We demonstrate the advantages of directly targeting the CR and adjacent tRNAs to facilitate complementing low coverage NGS data from complete mitochondrial genome assemblies. Finally, we test the performance of this region as a phylogenetic marker both in the lab and in phylogenetic analyses, and demonstrate its superior performance over the other available mitochondrial markers in echinoids. 
Our target region of the mitochondrial CR (1) facilitates the first thorough investigation of this region across a wide range of echinoid taxa, (2) provides a tool for complementing missing data in NGS experiments, and (3) identifies the CR as a powerful, novel marker for phylogenetic inference in echinoids due to its high variability, lack of selection, and high compatibility across the entire class, outperforming conventional mitochondrial markers.

  18. Spatiotemporal Patterns of Contact Across the Rat Vibrissal Array During Exploratory Behavior

    PubMed Central

    Hobbs, Jennifer A.; Towal, R. Blythe; Hartmann, Mitra J. Z.

    2016-01-01

    The rat vibrissal system is an important model for the study of somatosensation, but the small size and rapid speed of the vibrissae have precluded measuring precise vibrissal-object contact sequences during behavior. We used a laser light sheet to quantify, with 1 ms resolution, the spatiotemporal structure of whisker-surface contact as five naïve rats freely explored a flat, vertical glass wall. Consistent with previous work, we show that the whisk cycle cannot be uniquely defined because different whiskers often move asynchronously, but that quasi-periodic (~8 Hz) variations in head velocity represent a distinct temporal feature on which to lock analysis. Around times of minimum head velocity, whiskers protract to make contact with the surface, and then sustain contact with the surface for extended durations (~25–60 ms) before detaching. This behavior results in discrete temporal windows in which large numbers of whiskers are in contact with the surface. These “sustained collective contact intervals” (SCCIs) were observed on 100% of whisks for all five rats. The overall spatiotemporal structure of the SCCIs can be qualitatively predicted based on information about head pose and the average whisk cycle. In contrast, precise sequences of whisker-surface contact depend on detailed head and whisker kinematics. Sequences of vibrissal contact were highly variable, equally likely to propagate in all directions across the array. Somewhat more structure was found when sequences of contacts were examined on a row-wise basis. In striking contrast to the high variability associated with contact sequences, a consistent feature of each SCCI was that the contact locations of the whiskers on the glass converged and moved more slowly on the sheet. 
Together, these findings lead us to propose that the rat uses a strategy of “windowed sampling” to extract an object's spatial features: specifically, the rat spatially integrates quasi-static mechanical signals across whiskers during the period of sustained contact, resembling an “enclosing” haptic procedure. PMID:26778990

  19. Pseudomonas aeruginosa clinical and environmental isolates constitute a single population with high phenotypic diversity

    PubMed Central

    2014-01-01

    Background Pseudomonas aeruginosa is an opportunistic pathogen with a high incidence of hospital infections that represents a threat to immune compromised patients. Genomic studies have shown that, in contrast to other pathogenic bacteria, clinical and environmental isolates do not show particular genomic differences. In addition, genetic variability of all the P. aeruginosa strains whose genomes have been sequenced is extremely low. This low genomic variability might be explained if clinical strains constitute a subpopulation of this bacterial species present in environments that are close to human populations, which preferentially produce virulence associated traits. Results In this work, we sequenced the genomes and performed phenotypic descriptions for four non-human P. aeruginosa isolates collected from a plant, the ocean, a water-spring, and from dolphin stomach. We show that the four strains are phenotypically diverse and that this is not reflected in genomic variability, since their genomes are almost identical. Furthermore, we performed a detailed comparative genomic analysis of the four strains studied in this work with the thirteen previously reported P. aeruginosa genomes by means of describing their core and pan-genomes. Conclusions Contrary to what has been described for other bacteria we have found that the P. aeruginosa core genome is constituted by a high proportion of genes and that its pan-genome is thus relatively small. Considering the high degree of genomic conservation between isolates of P. aeruginosa from diverse environments, including human tissues, some implications for the treatment of infections are discussed. This work also represents a methodological contribution for the genomic study of P. aeruginosa, since we provide a database of the comparison of all the proteins encoded by the seventeen strains analyzed. PMID:24773920
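
    The core and pan-genome comparison mentioned above reduces, in its simplest form, to set operations over per-strain gene inventories. The sketch below assumes genes have already been grouped into shared family identifiers, which is the hard part in practice and is not shown. The strain and gene names in the usage note are hypothetical.

```python
def core_and_pan_genome(gene_sets):
    """Return (core, pan) for a mapping strain -> iterable of gene-family IDs.

    core: families present in every strain (set intersection)
    pan : families present in at least one strain (set union)
    """
    sets = [set(genes) for genes in gene_sets.values()]
    if not sets:
        return set(), set()
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan


def core_fraction(gene_sets):
    """Fraction of the pan-genome that belongs to the core genome."""
    core, pan = core_and_pan_genome(gene_sets)
    return len(core) / len(pan) if pan else 0.0
```

    For example, `core_fraction({"s1": {"g1", "g2", "g3"}, "s2": {"g1", "g2", "g4"}})` gives 0.5. A value close to 1 would correspond to the situation the abstract reports, a large core genome and a relatively small pan-genome.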

  20. Species composition of the genus Saprolegnia in fin fish aquaculture environments, as determined by nucleotide sequence analysis of the nuclear rDNA ITS regions.

    PubMed

    de la Bastide, Paul Y; Leung, Wai Lam; Hintz, William E

    2015-01-01

    The ITS region of the rDNA gene was compared for Saprolegnia spp. in order to improve our understanding of nucleotide sequence variability within and between species of this genus, determine species composition in Canadian fin fish aquaculture facilities, and to assess the utility of ITS sequence variability in genetic marker development. From a collection of more than 400 field isolates, ITS region nucleotide sequences were studied and it was determined that there was sufficient consistent inter-specific variation to support the designation of species identity based on ITS sequence data. This non-subjective approach to species identification does not rely upon transient morphological features. Phylogenetic analyses comparing our ITS sequences and species designations with data from previous studies generally supported the clade scheme of Diéguez-Uribeondo et al. (2007) and found agreement with the molecular taxonomic cluster system of Sandoval-Sierra et al. (2014). Our Canadian ITS sequence collection will thus contribute to the public database and assist the clarification of Saprolegnia spp. taxonomy. The analysis of ITS region sequence variability facilitated genus- and species-level identification of unknown samples from aquaculture facilities and provided useful information on species composition. A unique ITS-RFLP for the identification of S. parasitica was also described. Copyright © 2014 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

  1. Generation and Characterization of HIV-1 Transmitted and Founder Virus Consensus Sequence from Intravenous Drug Users in Xinjiang, China.

    PubMed

    Li, Fan; Ma, Liying; Feng, Yi; Hu, Jing; Ni, Na; Ruan, Yuhua; Shao, Yiming

    2017-06-01

    HIV-1 transmission in intravenous drug users (IDUs) has been characterized by high genetic multiplicity and suggests a greater challenge for HIV-1 infection blocking. We investigated a total of 749 full-length gp160 gene sequences obtained by single genome sequencing (SGS) from 22 IDUs with early HIV-1 infection in Xinjiang province, northwest China, and generated a transmitted and founder virus (T/F virus) consensus sequence (IDU.CON). The T/F virus was classified as subtype CRF07_BC and predicted to be CCR5-tropic virus. The variable region (V1, V2, and V4 loop) of IDU.CON showed length variation compared with the heterosexual T/F virus consensus sequence (HSX.CON) and homosexual T/F virus consensus sequence (MSM.CON). A total of 26 N-linked glycosylation sites were discovered in the IDU.CON sequence, which is fewer than in MSM.CON and HSX.CON. Characterization of T/F virus from IDUs highlights the genetic make-up and complexity of virus near the moment of transmission or in early infection preceding systemic dissemination and is important for the development of effective HIV-1 preventive methods, including vaccines.
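
    Counting potential N-linked glycosylation sites in a protein sequence is commonly done by scanning for the sequon N-X-S/T, where X is any residue except proline. The sketch below uses that common definition. The abstract does not state which tool or exact rule the authors applied, so this is an assumption, and the input is assumed to be an ungapped amino acid string.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_pngs(protein: str) -> list:
    """Return 0-based start positions of N-X-[S/T] sequons with X != P."""
    return [match.start() for match in SEQUON.finditer(protein.upper())]


def count_pngs(protein: str) -> int:
    """Number of potential N-linked glycosylation sites in the sequence."""
    return len(find_pngs(protein))
```

    Applied to two consensus sequences, `count_pngs` gives the kind of per-sequence site count that the abstract compares between IDU.CON, MSM.CON and HSX.CON.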

  2. Length-independent structural similarities enrich the antibody CDR canonical class model.

    PubMed

    Nowak, Jaroslaw; Baker, Terry; Georges, Guy; Kelm, Sebastian; Klostermann, Stefan; Shi, Jiye; Sridharan, Sudharsan; Deane, Charlotte M

    2016-01-01

    Complementarity-determining regions (CDRs) are antibody loops that make up the antigen binding site. Here, we show that all CDR types have structurally similar loops of different lengths. Based on these findings, we created length-independent canonical classes for the non-H3 CDRs. Our length variable structural clusters show strong sequence patterns suggesting either that they evolved from the same original structure or result from some form of convergence. We find that our length-independent method not only clusters a larger number of CDRs, but also predicts canonical class from sequence better than the standard length-dependent approach. To demonstrate the usefulness of our findings, we predicted cluster membership of CDR-L3 sequences from 3 next-generation sequencing datasets of the antibody repertoire (over 1,000,000 sequences). Using the length-independent clusters, we can structurally classify an additional 135,000 sequences, which represents a ∼20% improvement over the standard approach. This suggests that our length-independent canonical classes might be a highly prevalent feature of antibody space, and could substantially improve our ability to accurately predict the structure of novel CDRs identified by next-generation sequencing.

  3. Sequence diversity of wheat mosaic virus isolates.

    PubMed

    Stewart, Lucy R

    2016-02-02

    Wheat mosaic virus (WMoV), transmitted by eriophyid wheat curl mites (Aceria tosichella) is the causal agent of High Plains disease in wheat and maize. WMoV and other members of the genus Emaravirus evaded thorough molecular characterization for many years due to the experimental challenges of mite transmission and manipulating multisegmented negative sense RNA genomes. Recently, the complete genome sequence of a Nebraska isolate of WMoV revealed eight segments, plus a variant sequence of the nucleocapsid protein-encoding segment. Here, near-complete and partial consensus sequences of five more WMoV isolates are reported and compared to the Nebraska isolate: an Ohio maize isolate (GG1), a Kansas barley isolate (KS7), and three Ohio wheat isolates (H1, K1, W1). Results show two distinct groups of WMoV isolates: Ohio wheat isolate RNA segments had 84% or lower nucleotide sequence identity to the NE isolate, whereas GG1 and KS7 had 98% or higher nucleotide sequence identity to the NE isolate. Knowledge of the sequence variability of WMoV isolates is a step toward understanding virus biology, and potentially explaining observed biological variation. Published by Elsevier B.V.

  4. The parasite that causes whirling disease, Myxobolus cerebralis, is genetically variable within and across spatial scales.

    PubMed

    Lodh, Nilanjan; Kerans, Billie L; Stevens, Lori

    2012-01-01

    Understanding the genetic structure of parasite populations on the natural landscape can reveal important aspects of disease ecology and epidemiology and can indicate parasite dispersal across the landscape. Myxobolus cerebralis (Myxozoa: Myxosporea), the causative agent of whirling disease in the definitive host Tubifex tubifex, is native to Eurasia and has spread to more than 25 states in the USA. The small amounts of data available to date suggest that M. cerebralis has little genetic variability. We examined the genetic variability of parasites infecting the definitive host T. tubifex in the Madison River, MT, and also from other parts of North America and Europe. We cloned and sequenced 18S ribosomal DNA and the internal transcribed spacer-1 (ITS-1) gene. Five oligochaetes were examined for 18S and five for ITS-1; only one individual was examined for both genes. We found two different 18S rRNA haplotypes of M. cerebralis from five worms and both intra- and interworm genetic variation for ITS-1, which showed 16 different haplotypes from among 20 clones. Comparison of our sequences with those from other studies revealed that M. cerebralis from MT was similar to the parasite collected from Alaska, Oregon, California, and Virginia in the USA and from Munich, Germany, based on 18S, whereas parasite sequences from West Virginia were very different. Combined with the high haplotype diversity of ITS-1 and uniqueness of ITS-1 haplotypes, our results show that M. cerebralis is more variable than previously thought and raise the possibility of multiple introductions of the parasite into North America. © 2011 The Author(s) Journal of Eukaryotic Microbiology © 2011 International Society of Protistologists.

  5. The Optical Gravitational Lensing Experiment. Ellipsoidal Variability of Red Giants in the Large Magellanic Cloud

    NASA Astrophysics Data System (ADS)

    Soszynski, I.; Udalski, A.; Kubiak, M.; Szymanski, M. K.; Pietrzynski, G.; Zebrun, K.; Szewczyk, O.; Wyrzykowski, L.; Dziembowski, W. A.

    2004-12-01

    We used the OGLE-II and OGLE-III photometry of red giants in the Large Magellanic Cloud to select and study objects revealing ellipsoidal variability. We detected 1546 candidates for long period ellipsoidal variables and 121 eclipsing binary systems with clear ellipsoidal modulation. The ellipsoidal red giants follow a period-luminosity (PL) relationship (sequence E), and the scatter of the relation is correlated with the amplitude of variability: the larger the amplitude, the smaller the scatter. We note that some of the ellipsoidal candidates simultaneously exhibit OGLE Small Amplitude Red Giants pulsations. Thus, in some cases the Long Secondary Period (LSP) phenomenon can be explained by the ellipsoidal modulation. We also select about 1600 red giants with distinct LSP, which are not ellipsoidal variables. We discover that besides the previously known sequence D in the PL diagram, the LSP giants form an additional, less numerous sequence for longer periods. We notice that the PL sequence of the ellipsoidal candidates is a direct continuation of the LSP sequence toward fainter stars, which might suggest that the LSP phenomenon is related to binarity, but there are strong arguments against such a possibility. About 10% of the presented light curves reveal clear deformation by the eccentricity of the system orbits. The largest estimated eccentricity in our sample is about 0.4. All presented data, including individual BVI observations and finding charts, are available from the OGLE Internet archive.

  6. SeqFIRE: a web application for automated extraction of indel regions and conserved blocks from protein multiple sequence alignments.

    PubMed

    Ajawatanawong, Pravech; Atkinson, Gemma C; Watson-Haigh, Nathan S; Mackenzie, Bryony; Baldauf, Sandra L

    2012-07-01

    Analyses of multiple sequence alignments generally focus on well-defined conserved sequence blocks, while the rest of the alignment is largely ignored or discarded. This is especially true in phylogenomics, where large multigene datasets are produced through automated pipelines. However, some of the most powerful phylogenetic markers have been found in the variable length regions of multiple alignments, particularly insertions/deletions (indels) in protein sequences. We have developed Sequence Feature and Indel Region Extractor (SeqFIRE) to enable the automated identification and extraction of indels from protein sequence alignments. The program can also extract conserved blocks and identify fast evolving sites using a combination of conservation and entropy. All major variables can be adjusted by the user, allowing them to identify the sets of variables most suited to a particular analysis or dataset. Thus, all major tasks in preparing an alignment for further analysis are combined in a single flexible and user-friendly program. The output includes a numbered list of indels, alignments in NEXUS format with indels annotated or removed and indel-only matrices. SeqFIRE is a user-friendly web application, freely available online at www.seqfire.org/.
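
    The two ideas named in this abstract, locating indel regions and scoring sites by entropy, can be illustrated with a short Python sketch. This is not SeqFIRE's implementation: the function names, the gap character "-", and the toy alignment are assumptions made only for illustration, and the abstract does not specify how the program combines conservation and entropy.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column, ignoring gaps."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def indel_regions(alignment):
    """Return half-open (start, end) column ranges in which at least one sequence has a gap."""
    length = len(alignment[0])
    gapped = [any(seq[i] == "-" for seq in alignment) for i in range(length)]
    regions, start = [], None
    for i, has_gap in enumerate(gapped):
        if has_gap and start is None:
            start = i                      # a gapped run begins here
        elif not has_gap and start is not None:
            regions.append((start, i))     # the run ended at the previous column
            start = None
    if start is not None:                  # run extends to the end of the alignment
        regions.append((start, length))
    return regions


# Hypothetical three-sequence protein alignment (rows must have equal length).
alignment = ["MKT-AYL", "MKTGAYL", "MRT-AFL"]
print(indel_regions(alignment))
print([round(column_entropy(col), 3) for col in zip(*alignment)])
```

    Columns with entropy 0 are fully conserved in this toy input; higher values mark more variable sites.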

  7. Protein consensus-based surface engineering (ProCoS): a computer-assisted method for directed protein evolution.

    PubMed

    Shivange, Amol V; Hoeffken, Hans Wolfgang; Haefner, Stefan; Schwaneberg, Ulrich

    2016-12-01

    Protein consensus-based surface engineering (ProCoS) is a simple and efficient method for directed protein evolution combining computational analysis and molecular biology tools to engineer protein surfaces. ProCoS is based on the hypothesis that conserved residues originated from a common ancestor and that these residues are crucial for the function of a protein, whereas highly variable regions (situated on the surface of a protein) can be targeted for surface engineering to maximize performance. ProCoS comprises four main steps: (i) identification of conserved and highly variable regions; (ii) protein sequence design by substituting residues in the highly variable regions, and gene synthesis; (iii) in vitro DNA recombination of synthetic genes; and (iv) screening for active variants. ProCoS is a simple method for surface mutagenesis in which multiple sequence alignment is used for selection of surface residues based on a structural model. To demonstrate the technique's utility for directed evolution, the surface of a phytase enzyme from Yersinia mollaretii (Ymphytase) was subjected to ProCoS. Screening just 1050 clones from ProCoS engineering-guided mutant libraries yielded an enzyme with 34 amino acid substitutions. The surface-engineered Ymphytase exhibited 3.8-fold higher pH stability (at pH 2.8 for 3 h) and retained 40% of the enzyme's specific activity (400 U/mg) compared with the wild-type Ymphytase. The pH stability might be attributed to a significantly increased (20 percentage points; from 9% to 29%) number of negatively charged amino acids on the surface of the engineered phytase.

  8. Report for the NGFA-5 project.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jaing, C; Jackson, P; Thissen, J

    The objective of this project is to provide DHS with a comprehensive evaluation of the current genomic technologies including genotyping, TaqMan PCR, multiple locus variable tandem repeat analysis (MLVA), microarray and high-throughput DNA sequencing in the analysis of biothreat agents from complex environmental samples. To effectively compare the sensitivity and specificity of the different genomic technologies, we used SNP TaqMan PCR, MLVA, microarray and high-throughput Illumina and 454 sequencing to test various strains from B. anthracis, B. thuringiensis, BioWatch aerosol filter extracts or soil samples that were spiked with B. anthracis, and samples that were previously collected during DHS and EPA environmental release exercises that were known to contain B. thuringiensis spores. The results of all the samples against the various assays are discussed in this report.

  9. Centromere and telomere sequence alterations reflect the rapid genome evolution within the carnivorous plant genus Genlisea.

    PubMed

    Tran, Trung D; Cao, Hieu X; Jovtchev, Gabriele; Neumann, Pavel; Novák, Petr; Fojtová, Miloslava; Vu, Giang T H; Macas, Jiří; Fajkus, Jiří; Schubert, Ingo; Fuchs, Joerg

    2015-12-01

    Linear chromosomes of eukaryotic organisms invariably possess centromeres and telomeres to ensure proper chromosome segregation during nuclear divisions and to protect the chromosome ends from deterioration and fusion, respectively. While centromeric sequences may differ between species, with arrays of tandemly repeated sequences and retrotransposons being the most abundant sequence types in plant centromeres, telomeric sequences are usually highly conserved among plants and other organisms. The genome size of the carnivorous genus Genlisea (Lentibulariaceae) is highly variable. Here we study evolutionary sequence plasticity of these chromosomal domains at an intrageneric level. We show that Genlisea nigrocaulis (1C = 86 Mbp; 2n = 40) and G. hispidula (1C = 1550 Mbp; 2n = 40) differ as to their DNA composition at centromeres and telomeres. G. nigrocaulis and its close relative G. pygmaea revealed mainly 161 bp tandem repeats, while G. hispidula and its close relative G. subglabra displayed a combination of four retroelements at centromeric positions. G. nigrocaulis and G. pygmaea chromosome ends are characterized by the Arabidopsis-type telomeric repeats (TTTAGGG); G. hispidula and G. subglabra instead revealed two intermingled sequence variants (TTCAGG and TTTCAGG). These differences in centromeric and, surprisingly, also in telomeric DNA sequences, uncovered between groups with on average a > 9-fold genome size difference, emphasize the fast genome evolution within this genus. Such intrageneric evolutionary alteration of telomeric repeats with cytosine in the guanine-rich strand, not yet known for plants, might impact the epigenetic telomere chromatin modification. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.

  10. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India.

    PubMed

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-03-01

    Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.

  11. Genetic characterization of the non-structural protein-3 gene of bluetongue virus serotype-2 isolate from India

    PubMed Central

    Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu

    2017-01-01

    Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199

  12. Evaluating multiplexed next-generation sequencing as a method in palynology for mixed pollen samples.

    PubMed

    Keller, A; Danner, N; Grimmer, G; Ankenbrand, M; von der Ohe, K; von der Ohe, W; Rost, S; Härtel, S; Steffan-Dewenter, I

    2015-03-01

    The identification of pollen plays an important role in ecology, palaeo-climatology, honey quality control and other areas. Currently, expert knowledge and reference collections are essential to identify pollen origin through light microscopy. Pollen identification through molecular sequencing and DNA barcoding has been proposed as an alternative approach, but the assessment of mixed pollen samples originating from multiple plant species is still a tedious and error-prone task. Next-generation sequencing has been proposed to avoid this hindrance. In this study we assessed mixed pollen probes through next-generation sequencing of amplicons from the highly variable, species-specific internal transcribed spacer 2 region of nuclear ribosomal DNA. Further, we developed a bioinformatic workflow to analyse these high-throughput data with a newly created reference database. To evaluate the feasibility, we compared results from classical identification based on light microscopy from the same samples with our sequencing results. We assessed in total 16 mixed pollen samples, 14 originated from honeybee colonies and two from solitary bee nests. The sequencing technique resulted in higher taxon richness (deeper assignments and more identified taxa) compared to light microscopy. Abundance estimations from sequencing data were significantly correlated with counted abundances through light microscopy. Simulation analyses of taxon specificity and sensitivity indicate that 96% of taxa present in the database are correctly identifiable at the genus level and 70% at the species level. Next-generation sequencing thus presents a useful and efficient workflow to identify pollen at the genus and species level without requiring specialised palynological expert knowledge. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.

  13. Genome Sequences of Populus tremula Chloroplast and Mitochondrion: Implications for Holistic Poplar Breeding

    PubMed Central

    Mader, Malte; Le Paslier, Marie-Christine; Bounon, Rémi; Berard, Aurélie; Vettori, Cristina; Schroeder, Hilke; Leplé, Jean-Charles; Fladung, Matthias

    2016-01-01

    Complete Populus genome sequences are available for the nucleus (P. trichocarpa; section Tacamahaca) and for chloroplasts (seven species), but not for mitochondria. Here, we provide the complete genome sequences of the chloroplast and the mitochondrion for the clones P. tremula W52 and P. tremula x P. alba 717-1B4 (section Populus). The organization of the chloroplast genomes of both Populus clones is described. A phylogenetic tree constructed from all available complete chloroplast DNA sequences of Populus was not congruent with the assignment of the related species to different Populus sections. In total, 3,024 variable nucleotide positions were identified among all compared Populus chloroplast DNA sequences. The 5-prime part of the LSC from trnH to atpA showed the highest frequency of variations. The variable positions included 163 positions with SNPs allowing for differentiating the two clones with P. tremula chloroplast genomes (W52, 717-1B4) from the other seven Populus individuals. These potential P. tremula-specific SNPs were displayed as a whole-plastome barcode on the P. tremula W52 chloroplast DNA sequence. Three of these SNPs and one InDel in the trnH-psbA linker were successfully validated by Sanger sequencing in an extended set of Populus individuals. The complete mitochondrial genome sequence of P. tremula is the first in the family of Salicaceae. The mitochondrial genomes of the two clones are 783,442 bp (W52) and 783,513 bp (717-1B4) in size, structurally very similar and organized as single circles. DNA sequence regions with high similarity to the W52 chloroplast sequence account for about 2% of the W52 mitochondrial genome. The mean SNP frequency was found to be nearly six fold higher in the chloroplast than in the mitochondrial genome when comparing 717-1B4 with W52. The availability of the genomic information of all three DNA-containing cell organelles will allow a holistic approach in poplar molecular breeding in the future. PMID:26800039

  14. HLA genotyping by next-generation sequencing of complementary DNA.

    PubMed

    Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya

    2017-11-28

    Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. 
    Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
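
    The stated principle of the analysis (candidate sequences built from all combinations of variable bases, ordered by decreasing read count) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes reads are already aligned and of equal length, that the variable positions are known, and the name `rank_candidates` is hypothetical.

```python
from collections import Counter
from itertools import product


def rank_candidates(reads, variable_positions):
    """Rank every combination of bases seen at the variable positions by supporting reads."""
    # Count the combination actually carried by each read.
    observed = Counter(tuple(read[p] for p in variable_positions) for read in reads)
    # Collect the bases seen at each variable position across all reads.
    bases_per_site = [sorted({read[p] for read in reads}) for p in variable_positions]
    # Enumerate all combinations, including those with no supporting read.
    candidates = [("".join(combo), observed[combo]) for combo in product(*bases_per_site)]
    # Arrange candidates in decreasing order of read count.
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Hypothetical aligned reads covering one target region; positions 0 and 3 vary.
reads = ["ACGT", "ACGT", "TCGA", "ACGT", "TCGA", "ACGA"]
for candidate, n_reads in rank_candidates(reads, [0, 3]):
    print(candidate, n_reads)
```

    In a heterozygous sample the two best-supported candidates would be the natural haplotype pair, while low-count combinations are likely artifacts; the thresholds used for that decision are not given in the abstract.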

  15. High Efficiency, Low Distortion 3D Diffusion Tensor Imaging with Variable Density Spiral Fast Spin Echoes (3D DW VDS RARE)

    PubMed Central

    Frank, Lawrence R.; Jung, Youngkyoo; Inati, Souheil; Tyszka, J. Michael; Wong, Eric C.

    2009-01-01

    We present an acquisition and reconstruction method designed to acquire high resolution 3D fast spin echo diffusion tensor images while mitigating the major sources of artifacts in DTI - field distortions, eddy currents and motion. The resulting images, being 3D, are of high SNR, and being fast spin echoes, exhibit greatly reduced field distortions. This sequence utilizes variable density spiral acquisition gradients, which allow for the implementation of a self-navigation scheme by which both eddy current and motion artifacts are removed. The result is that high resolution 3D DTI images are produced without the need for eddy current compensating gradients or B0 field correction. In addition, a novel method for fast and accurate reconstruction of the non-Cartesian data is employed. Results are demonstrated in the brains of normal human volunteers. PMID:19778618

  16. Research in Stochastic Processes.

    DTIC Science & Technology

    1983-10-01

    increases. A more detailed investigation for the exceedances themselves (rather than just the cluster centers) was undertaken, together with J. Hüsler and...J. Hüsler and M.R. Leadbetter, Compound Poisson limit theorems for high level exceedances by stationary sequences, Center for Stochastic Processes...stability by a random linear operator. C.D. Hardin, General (asymmetric) stable variables and processes. T. Hsing, J. Hüsler and M.R. Leadbetter, Compound

  17. High taxonomic variability despite stable functional structure across microbial communities.

    PubMed

    Louca, Stilianos; Jacques, Saulo M S; Pires, Aliny P F; Leal, Juliana S; Srivastava, Diane S; Parfrey, Laura Wegener; Farjalla, Vinicius F; Doebeli, Michael

    2016-12-05

    Understanding the processes that are driving variation of natural microbial communities across space or time is a major challenge for ecologists. Environmental conditions strongly shape the metabolic function of microbial communities; however, other processes such as biotic interactions, random demographic drift or dispersal limitation may also influence community dynamics. The relative importance of these processes and their effects on community function remain largely unknown. To address this uncertainty, here we examined bacterial and archaeal communities in replicate 'miniature' aquatic ecosystems contained within the foliage of wild bromeliads. We used marker gene sequencing to infer the taxonomic composition within nine metabolic functional groups, and shotgun environmental DNA sequencing to estimate the relative abundances of these groups. We found that all of the bromeliads exhibited remarkably similar functional community structures, but that the taxonomic composition within individual functional groups was highly variable. Furthermore, using statistical analyses, we found that non-neutral processes, including environmental filtering and potentially biotic interactions, at least partly shaped the composition within functional groups and were more important than spatial dispersal limitation and demographic drift. Hence both the functional structure and taxonomic composition within functional groups of natural microbial communities may be shaped by non-neutral and roughly separate processes.

  18. Production, characteristics and applications of the cell-bound phytase of Pichia anomala.

    PubMed

    Vohra, Ashima; Kaur, Parvinder; Satyanarayana, T

    2011-01-01

    Among several yeasts isolated from dried flowers of Woodfordia fruticosa, Pichia anomala produced a high titre of cell-bound phytase. The optimization of fermentation variables led to formulation of media and selection of cultural variables that supported enhanced phytase production. The enzyme productivity was very high in fed batch fermentation in air-lift fermentor as compared to that in stirred tank fermentor. Amelioration in the cell-bound phytase activity was observed when yeast cells were permeabilized with Triton-X-100. The enzyme is thermostable and acid stable with broad substrate specificity, the characteristics that are desirable for enzymes to be used in the animal feed industry. The phytase-encoding gene was cloned and sequenced. The 3D structure of the enzyme was proposed by comparative modeling using phytase of Debaryomyces occidentalis (50% sequence identity) as template. When broiler chicks, and fresh water and marine fishes were fed with the feed supplemented with yeast biomass containing phytase, improvement in growth and phosphorus retention, and decrease in the excretion of phosphorus in the faeces were recorded. The cell-bound phytase of P. anomala could effectively dephytinize wheat flour and soymilk.

  19. An Integrated Tool to Study MHC Region: Accurate SNV Detection and HLA Genes Typing in Human MHC Region Using Targeted High-Throughput Sequencing

    PubMed Central

    Liu, Xiao; Xu, Yinyin; Liang, Dequan; Gao, Peng; Sun, Yepeng; Gifford, Benjamin; D’Ascenzo, Mark; Liu, Xiaomin; Tellier, Laurent C. A. M.; Yang, Fang; Tong, Xin; Chen, Dan; Zheng, Jing; Li, Weiyang; Richmond, Todd; Xu, Xun; Wang, Jun; Li, Yingrui

    2013-01-01

    The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into studies of immune-mediated diseases, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community. PMID:23894464

  20. Molecular analysis of immunoglobulin variable genes supports a germinal center experienced normal counterpart in primary cutaneous diffuse large B-cell lymphoma, leg-type.

    PubMed

    Pham-Ledard, Anne; Prochazkova-Carlotti, Martina; Deveza, Mélanie; Laforet, Marie-Pierre; Beylot-Barry, Marie; Vergier, Béatrice; Parrens, Marie; Feuillard, Jean; Merlio, Jean-Philippe; Gachard, Nathalie

    2017-11-01

    Immunophenotype of primary cutaneous diffuse large B-cell lymphoma, leg-type (PCLBCL-LT) suggests a germinal center-experienced B lymphocyte (BCL2+ MUM1+ BCL6+/-). As maturation history of B-cell is "imprinted" during B-cell development on the immunoglobulin gene sequence, we studied the structure and sequence of the variable part of the genes (IGHV, IGLV, IGKV), immunoglobulin surface expression and features of class switching in order to determine the PCLBCL-LT cell of origin. Clonality analysis with BIOMED2 protocol and VH leader primers was done on DNA extracted from frozen skin biopsies on retrospective samples from 14 patients. The clonal DNA IGHV sequence of the tumor was aligned and compared with the closest germline sequence and homology percentage was calculated. Superantigen binding sites were studied. Features of selection pressure were evaluated with the multinomial Lossos model. A functional monoclonal sequence was observed in 14 cases as determined for IGHV (10), IGLV (2) or IGKV (3). IGV mutation rates were high (>5%) in all cases but one (median:15.5%), with superantigen binding sites conservation. Features of selection pressure were identified in 11/12 interpretable cases, more frequently negative (75%) than positive (25%). Intraclonal variation was detected in 3 of 8 tumor specimens with a low rate of mutations. Surface immunoglobulin was an IgM in 12/12 cases. FISH analysis of IGHM locus, deleted during class switching, showed heterozygous IGHM gene deletion in half of cases. The genomic PCR analysis confirmed the deletions within the switch μ region. IGV sequences were highly mutated but functional, with negative features of selection pressure suggesting one or more germinal center passage(s) with somatic hypermutation, but superantigen (SpA) binding sites conservation. Genetic features of class switch were observed, but on the non functional allele and co-existing with primary isotype IgM expression. 
    These data suggest that the cell of origin is a germinal center-experienced, superantigen-driven selected B-cell, at a stage between germinal center B-cell and plasma cell. Copyright © 2017 Japanese Society for Investigative Dermatology. Published by Elsevier B.V. All rights reserved.
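
    The homology percentage and the mutation rate against the closest germline sequence, as used in this abstract, amount to a position-by-position comparison of two aligned sequences. The Python sketch below shows that arithmetic only. It assumes the tumor and germline sequences are already aligned to equal length with "-" as the gap character, skips gapped positions, and uses hypothetical function names and toy sequences; the abstract does not describe the authors' actual alignment tool or gap handling.

```python
def mutation_rate(tumor, germline):
    """Percentage of aligned, ungapped positions at which tumor differs from germline."""
    if len(tumor) != len(germline):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only positions where neither sequence has a gap.
    compared = [(t, g) for t, g in zip(tumor, germline) if t != "-" and g != "-"]
    if not compared:
        raise ValueError("no ungapped positions to compare")
    mismatches = sum(t != g for t, g in compared)
    return 100.0 * mismatches / len(compared)


def homology_percentage(tumor, germline):
    """Percent identity, the complement of the mutation rate."""
    return 100.0 - mutation_rate(tumor, germline)


# Hypothetical 10-nucleotide fragments differing at two positions.
germline = "ACGTACGTAC"
tumor = "ACGTTCGTAA"
print(mutation_rate(tumor, germline), homology_percentage(tumor, germline))
```

    Under this definition, a sequence would count as highly mutated in the sense of the abstract when `mutation_rate` exceeds 5.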

  1. Two distinct origins for Archean greenstone belts

    NASA Astrophysics Data System (ADS)

    Smithies, R. Hugh; Ivanic, Tim J.; Lowrey, Jack R.; Morris, Paul A.; Barnes, Stephen J.; Wyche, Stephen; Lu, Yong-Jun

    2018-04-01

    Applying the Th/Yb-Nb/Yb plot of Pearce (2008) to the well-studied Archean greenstone sequences of Western Australia shows that individual volcanic sequences evolved through one of two distinct processes reflecting different modes of crust-mantle interaction. In the Yilgarn Craton, the volcanic stratigraphy of the 2.99-2.71 Ga Youanmi Terrane mainly evolved through processes leading to Th/Yb-Nb/Yb trends with a narrow range of Th/Nb ('constant-Th/Nb' greenstones). In contrast, the 2.71-2.66 Ga volcanic stratigraphy of the Eastern Goldfields Superterrane evolved through processes leading to Th/Yb-Nb/Yb trends showing a continuous range in Th/Nb ('variable-Th/Nb' greenstones). Greenstone sequences of the Pilbara Craton show a similar evolution, with constant-Th/Nb greenstone evolution between 3.13 and 2.95 Ga and variable-Th/Nb greenstone evolution between 3.49 and 3.23 Ga and between 2.77 and 2.68 Ga. The variable-Th/Nb trends dominate greenstone sequences in Australia and worldwide, and are temporally associated with peaks in granite magmatism, which promoted crustal preservation. The increasing Th/Nb in basalts correlates with decreasing εNd, reflecting variable amounts of crustal assimilation during emplacement of mantle-derived magmas. These greenstones are typically accompanied in the early stages by komatiite, and can probably be linked to mantle plume activity. Thus, regions such as the Eastern Goldfields Superterrane simply developed as plume-related rifts over existing granite-greenstone crust - in this case the Youanmi Terrane. Their Th/Nb trends are difficult to reconcile with modern-style subduction processes. The constant-Th/Nb trends may reflect derivation from a mantle source already with a high and constant Th/Nb ratio. This, and a lithological association including boninite-like lavas, basalts, and calc-alkaline andesites, all within a narrow Th/Nb range, resembles compositions typical of modern-style subduction settings. 
These greenstones are very rare, and were probably only preserved when fortuitously stabilised by granitic magmatism related to the evolution of later variable-Th/Nb greenstones. The rarity of constant-Th/Nb trends suggests that either processes forming them never dominated Archean greenstone evolution, or that such greenstones simply were rarely preserved. Metamorphic mobility of Th renders the Th/Yb-Nb/Yb plot inappropriate for interpreting Eoarchean greenstone units worldwide. Nevertheless, such sequences appear dominated by volcanic rocks that, in modern settings, reflect only the embryonic or initiation stages of subduction. They probably record subduction failure rather than anything resembling modern-style subduction.

  2. Sequence-Based Discovery Demonstrates That Fixed Light Chain Human Transgenic Rats Produce a Diverse Repertoire of Antigen-Specific Antibodies.

    PubMed

    Harris, Katherine E; Aldred, Shelley Force; Davison, Laura M; Ogana, Heather Anne N; Boudreau, Andrew; Brüggemann, Marianne; Osborn, Michael; Ma, Biao; Buelow, Benjamin; Clarke, Starlynn C; Dang, Kevin H; Iyer, Suhasini; Jorgensen, Brett; Pham, Duy T; Pratap, Payal P; Rangaswamy, Udaya S; Schellenberger, Ute; van Schooten, Wim C; Ugamraj, Harshad S; Vafa, Omid; Buelow, Roland; Trinklein, Nathan D

    2018-01-01

    We created a novel transgenic rat that expresses human antibodies comprising a diverse repertoire of heavy chains with a single common rearranged kappa light chain (IgKV3-15-JK1). This fixed light chain animal, called OmniFlic, presents a unique system for human therapeutic antibody discovery and a model to study heavy chain repertoire diversity in the context of a constant light chain. The purpose of this study was to analyze heavy chain variable gene usage, clonotype diversity, and to describe the sequence characteristics of antigen-specific monoclonal antibodies (mAbs) isolated from immunized OmniFlic animals. Using next-generation sequencing antibody repertoire analysis, we measured heavy chain variable gene usage and the diversity of clonotypes present in the lymph node germinal centers of 75 OmniFlic rats immunized with 9 different protein antigens. Furthermore, we expressed 2,560 unique heavy chain sequences sampled from a diverse set of clonotypes as fixed light chain antibody proteins and measured their binding to antigen by ELISA. Finally, we measured patterns and overall levels of somatic hypermutation in the full B-cell repertoire and in the 2,560 mAbs tested for binding. The results demonstrate that OmniFlic animals produce an abundance of antigen-specific antibodies with heavy chain clonotype diversity that is similar to what has been described with unrestricted light chain use in mammals. In addition, we show that sequence-based discovery is a highly effective and efficient way to identify a large number of diverse monoclonal antibodies to a protein target of interest.

  3. The complete chloroplast genome sequence of Actinidia arguta using the PacBio RS II platform

    PubMed Central

    Lin, Miaomiao; Qi, Xiujuan; Chen, Jinyong; Sun, Leiming; Zhong, Yunpeng; Fang, Jinbao; Hu, Chungen

    2018-01-01

    Actinidia arguta is the most basal species in a phylogenetically and economically important genus in the family Actinidiaceae. To better understand the molecular basis of the Actinidia arguta chloroplast (cp), we sequenced the complete cp genome from A. arguta using Illumina and PacBio RS II sequencing technologies. The cp genome from A. arguta was 157,611 bp in length and composed of a pair of 24,232 bp inverted repeats (IRs) separated by a 20,463 bp small single copy region (SSC) and an 88,684 bp large single copy region (LSC). Overall, the cp genome contained 113 unique genes. The cp genomes from A. arguta and three other Actinidia species from GenBank were subjected to a comparative analysis. Indel mutation events and high frequencies of base substitution were identified, and the accD and ycf2 genes showed a high degree of variation within Actinidia. Forty-seven simple sequence repeats (SSRs) and 155 repetitive structures were identified, further demonstrating the rapid evolution in Actinidia. The cp genome analysis and the identification of variable loci provide vital information for understanding the evolution and function of the chloroplast and for characterizing Actinidia population genetics. PMID:29795601
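
    The region lengths reported above can be checked against the stated genome size. The following is a minimal sketch, assuming the usual quadripartite layout (one LSC, one SSC and two IR copies); the function name is hypothetical and the numbers are taken directly from the abstract.

```python
# Region lengths (bp) as reported in the abstract above.
IR_LENGTH = 24_232       # each of the two inverted repeats
SSC_LENGTH = 20_463      # small single copy region
LSC_LENGTH = 88_684      # large single copy region
REPORTED_TOTAL = 157_611


def quadripartite_total(lsc, ssc, ir):
    """Total length of a plastome with one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_total(LSC_LENGTH, SSC_LENGTH, IR_LENGTH)
ir_fraction = 2 * IR_LENGTH / total  # share of the genome covered by the IR pair

print(total, total == REPORTED_TOTAL)
print(f"IR pair covers {ir_fraction:.1%} of the genome")
```

    The four regions sum exactly to the reported 157,611 bp, so the figures in the abstract are internally consistent.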

  4. Degenerate Pax2 and Senseless binding motifs improve detection of low-affinity sites required for enhancer specificity

    PubMed Central

    Zandvakili, Arya; Campbell, Ian; Weirauch, Matthew T.

    2018-01-01

    Cells use thousands of regulatory sequences to recruit transcription factors (TFs) and produce specific transcriptional outcomes. Since TFs bind degenerate DNA sequences, discriminating functional TF binding sites (TFBSs) from background sequences represents a significant challenge. Here, we show that a Drosophila regulatory element that activates Epidermal Growth Factor signaling requires overlapping, low-affinity TFBSs for competing TFs (Pax2 and Senseless) to ensure cell- and segment-specific activity. Testing available TF binding models for Pax2 and Senseless, however, revealed variable accuracy in predicting such low-affinity TFBSs. To better define parameters that increase accuracy, we developed a method that systematically selects subsets of TFBSs based on predicted affinity to generate hundreds of position-weight matrices (PWMs). Counterintuitively, we found that degenerate PWMs produced from datasets depleted of high-affinity sequences were more accurate in identifying both low- and high-affinity TFBSs for the Pax2 and Senseless TFs. Taken together, these findings reveal how TFBS arrangement can be constrained by competition rather than cooperativity and that degenerate models of TF binding preferences can improve identification of biologically relevant low affinity TFBSs. PMID:29617378
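
    The idea of building PWMs from affinity-selected subsets of binding sites can be illustrated with a small sketch. This is not the authors' implementation: the site sequences, affinity values, pseudocount and uniform background below are all hypothetical, and the functions only show the general technique of deriving a log-odds matrix from a chosen subset of sites and scanning a sequence with it.

```python
import math


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds position-weight matrix from equal-length DNA sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in "ACGT"})
    return pwm


def score(pwm, seq):
    """Sum of per-position log-odds for a sequence as long as the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))


def best_hit(pwm, sequence):
    """Return (score, offset) of the best-scoring window in `sequence`."""
    width = len(pwm)
    return max(
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


def sites_in_affinity_window(scored_sites, low, high):
    """Keep sites whose (hypothetical) predicted affinity lies in [low, high]."""
    return [site for site, affinity in scored_sites if low <= affinity <= high]


# Hypothetical sites with made-up predicted affinities.
scored_sites = [("TGACTC", 0.95), ("TGACTA", 0.60), ("TGAGTC", 0.55), ("TTACTC", 0.40)]

full_pwm = build_pwm([site for site, _ in scored_sites])
# A "degenerate" PWM built only from the lower-affinity sites.
degenerate_pwm = build_pwm(sites_in_affinity_window(scored_sites, 0.0, 0.7))

print(best_hit(full_pwm, "AAATGACTCAAA"))
print(best_hit(degenerate_pwm, "AAATGACTCAAA"))
```

    Varying the affinity window and rebuilding the matrix each time is one simple way to generate many candidate PWMs for comparison.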

  5. Single nucleotide polymorphism analysis reveals heterogeneity within a seedling tree population of a polyembryonic mango cultivar.

    PubMed

    Winterhagen, Patrick; Wünsche, Jens-Norbert

    2016-05-01

    Within a polyembryonic mango seedling tree population, the genetic background of individuals should be identical because vigorous plants for cultivation are expected to develop from nucellar embryos representing maternal clones. Because the mango cultivar 'Hôi' is assigned to the polyembryonic ecotype, intra-cultivar variability of ethylene receptor genes was unexpected. Ethylene receptors in plants are conserved, but the number of receptors or receptor isoforms varies among plant species. However, it is shown here that the ethylene receptor MiETR1 is present in various isoforms within the mango cultivar 'Hôi'. The investigation of single nucleotide polymorphisms revealed that different MiETR1 isoforms cannot be discriminated simply by individual single nucleotide exchanges, but only by the specific arrangement of single nucleotide polymorphisms at certain positions in the exons of MiETR1. Furthermore, an MiETR1 isoform devoid of introns in the genomic sequence was identified. The investigation demonstrates some limitations of high resolution melting and ScreenClust analysis and points out the necessity of sequencing to identify individual isoforms and to determine the variability within the tree population.

  6. BAUM: improving genome assembly by adaptive unique mapping and local overlap-layout-consensus approach.

    PubMed

    Wang, Anqi; Wang, Zhanyu; Li, Zheng; Li, Lei M

    2018-06-15

    It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the single-molecule real-time sequencing technology is very promising to overcome the uncertainty issue, its relatively high cost and error rate add burden on budget or computation. Many long-read assemblers take the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach. Aiming at minimizing uncertainty, the proposed method, BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a closely related species or (ii) improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. Besides, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM. http://www.zhanyuwang.xin/wordpress/index.php/2017/07/21/baum. Supplementary data are available at Bioinformatics online.

  7. Microbial community analysis of a coastal hot spring in Kagoshima, Japan, using molecular- and culture-based approaches.

    PubMed

    Nishiyama, Minako; Yamamoto, Shuichi; Kurosawa, Norio

    2013-08-01

    Ibusuki hot spring is located on the coastline of Kagoshima Bay, Japan. The hot spring water is characterized by high salinity, high temperature, and neutral pH. The hot spring is covered by the sea during high tide, which leads to severe fluctuations in several environmental variables. A combination of molecular- and culture-based techniques was used to determine the bacterial and archaeal diversity of the hot spring. A total of 48 thermophilic bacterial strains were isolated from two sites (Site 1: 55.6°C; Site 2: 83.1°C) and they were categorized into six groups based on their 16S rRNA gene sequence similarity. Two groups (including 32 isolates) demonstrated low sequence similarity with published species, suggesting that they might represent novel taxa. The 148 clones from the Site 1 bacterial library included 76 operational taxonomic units (OTUs; 97% threshold), while 132 clones from the Site 2 bacterial library included 31 OTUs. Proteobacteria, Bacteroidetes, and Firmicutes were frequently detected in both clone libraries. The clones were related to thermophilic, mesophilic and psychrophilic bacteria. Approximately half of the sequences in bacterial clone libraries shared <92% sequence similarity with their closest sequences in a public database, suggesting that the Ibusuki hot spring may harbor a unique and novel bacterial community. By contrast, 77 clones from the Site 2 archaeal library contained only three OTUs, most of which were affiliated with Thaumarchaeota.
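
    Grouping clones into OTUs at a 97% threshold can be sketched as a greedy clustering. This is a toy illustration, not the procedure used in the study: it assumes the sequences are already aligned to equal length, measures identity as the fraction of matching positions, and uses made-up sequences.

```python
def identity(a, b):
    """Fraction of identical positions between two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(seqs, threshold=0.97):
    """Assign each sequence to the first OTU whose seed it matches at >= threshold."""
    seeds = []     # one representative sequence per OTU
    members = []   # sequences assigned to each OTU, parallel to `seeds`
    for seq in seqs:
        for idx, seed in enumerate(seeds):
            if identity(seq, seed) >= threshold:
                members[idx].append(seq)
                break
        else:
            # No existing seed is close enough, so this sequence founds a new OTU.
            seeds.append(seq)
            members.append([seq])
    return members


# Made-up 100-position sequences for illustration only.
ref = "ACGT" * 25
near = "TT" + ref[2:]        # 2 mismatches to ref, 98% identity
far = "T" * 10 + ref[10:]    # 8 mismatches to ref, 92% identity

otus = greedy_otus([ref, near, far])
print(len(otus), [len(group) for group in otus])
```

    Greedy clustering depends on input order, which is one reason published pipelines usually rely on dedicated tools rather than a loop like this.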

  8. A Wide-field Survey for Transiting Hot Jupiters and Eclipsing Pre-main-sequence Binaries in Young Stellar Associations

    NASA Astrophysics Data System (ADS)

    Oelkers, Ryan J.; Macri, Lucas M.; Marshall, Jennifer L.; DePoy, Darren L.; Lambas, Diego G.; Colazo, Carlos; Stringer, Katelyn

    2016-09-01

    The past two decades have seen a significant advancement in the detection, classification, and understanding of exoplanets and binaries. This is due, in large part, to the increase in use of small-aperture telescopes (<20 cm) to survey large areas of the sky to milli-mag precision with rapid cadence. The vast majority of the planetary and binary systems studied to date consists of main-sequence or evolved objects, leading to a dearth of knowledge of properties at early times (<50 Myr). Only a dozen binaries and one candidate transiting Hot Jupiter are known among pre-main-sequence objects, yet these are the systems that can provide the best constraints on stellar formation and planetary migration models. The deficiency in the number of well characterized systems is driven by the inherent and aperiodic variability found in pre-main-sequence objects, which can mask and mimic eclipse signals. Hence, a dramatic increase in the number of young systems with high-quality observations is highly desirable to guide further theoretical developments. We have recently completed a photometric survey of three nearby (<150 pc) and young (<50 Myr) moving groups with a small-aperture telescope. While our survey reached the requisite photometric precision, the temporal coverage was insufficient to detect Hot Jupiters. Nevertheless, we discovered 346 pre-main-sequence binary candidates, including 74 high-priority objects for further study. This paper includes data taken at The McDonald Observatory of The University of Texas at Austin.

  9. Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the influenza A virus subtypes responsible for the 20th-century pandemics.

    PubMed

    Pasricha, Gunisha; Mishra, Akhilesh C; Chakrabarti, Alok K

    2013-07-01

    PB1F2 is the 11th protein of influenza A virus translated from +1 alternate reading frame of PB1 gene. Since the discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of PB1 gene segment in the pandemics, variable size and pleiotropic effect of PB1F2 intrigued us to analyze amino acid sequences of this protein in various influenza A viruses. Amino acid sequences for PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size, variable and conserved domains and to perform mutational analysis. Analysis showed that 96·4% of the H5N1 influenza viruses harbored full-length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th-century pandemic influenza viruses contained full-length PB1F2 protein. Through the years, PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human- and avian host-specific conserved domains. Global database of PB1F2 protein revealed that N66S mutation was present only in 3·8% of the H5N1 strains. We found a novel mutation, N84S in the PB1F2 protein of 9·35% of the highly pathogenic avian influenza H5N1 influenza viruses. Varying sizes and mutations of the PB1F2 protein in different influenza A virus subtypes with pandemic potential were obtained. There was genetic divergence of the protein in various hosts which highlighted the host-specific evolution of the virus. However, studies are required to correlate this sequence variability with the virulence and pathogenicity. © 2012 John Wiley & Sons Ltd.

  10. High-resolution typing of Chlamydia trachomatis: epidemiological and clinical uses.

    PubMed

    de Vries, Henry J C; Schim van der Loeff, Maarten F; Bruisten, Sylvia M

    2015-02-01

    A state-of-the-art overview of molecular Chlamydia trachomatis typing methods that are used for routine diagnostics and scientific studies. Molecular epidemiology uses high-resolution typing techniques such as multilocus sequence typing, multilocus variable number of tandem repeats analysis, and whole-genome sequencing to identify strains based on their DNA sequence. These data can be used for cluster, network and phylogenetic analyses, and are used to unveil transmission networks, risk groups, and evolutionary pathways. High-resolution typing of C. trachomatis strains is applied to monitor treatment efficacy and re-infections, and to study the recent emergence of lymphogranuloma venereum (LGV) amongst men who have sex with men in high-income countries. Chlamydia strain typing has clinical relevance in disease management, as LGV needs longer treatment than non-LGV C. trachomatis. It has also led to the discovery of a new variant Chlamydia strain in Sweden, which was not detected by some commercial C. trachomatis diagnostic platforms. After a brief history and comparison of the various Chlamydia typing methods, the applications of the current techniques are described and future endeavors to extend scientific understanding are formulated. High-resolution typing will likely help to further unravel the pathophysiological mechanisms behind the wide clinical spectrum of chlamydial disease.

  11. Trypanosoma cruzi Clone Dm28c Draft Genome Sequence

    PubMed Central

    Grisard, Edmundo Carlos; Teixeira, Santuza Maria Ribeiro; de Almeida, Luiz Gonzaga Paula; Stoco, Patricia Hermes; Gerber, Alexandra Lehmkuhl; Talavera-López, Carlos; Lima, Oberdan Cunha; Andersson, Björn

    2014-01-01

    Trypanosoma cruzi affects millions of people worldwide. Clinical variability of Chagas disease can be due to the genetic variability of this parasite, requiring further genome studies. Here we report the genome sequence of the T. cruzi Dm28c clone (TcI), a strain related to the sylvatic cycle of the parasite. PMID:24482508

  12. High Genetic Diversity Revealed by Variable-Number Tandem Repeat Genotyping and Analysis of hsp65 Gene Polymorphism in a Large Collection of “Mycobacterium canettii” Strains Indicates that the M. tuberculosis Complex Is a Recently Emerged Clone of “M. canettii”

    PubMed Central

    Fabre, Michel; Koeck, Jean-Louis; Le Flèche, Philippe; Simon, Fabrice; Hervé, Vincent; Vergnaud, Gilles; Pourcel, Christine

    2004-01-01

    We have analyzed, using complementary molecular methods, the diversity of 43 strains of “Mycobacterium canettii” originating from the Republic of Djibouti, on the Horn of Africa, from 1998 to 2003. Genotyping by multiple-locus variable-number tandem repeat analysis shows that all the strains belong to a single but very distant group when compared to strains of the Mycobacterium tuberculosis complex (MTBC). Thirty-one strains cluster into one large group with little variability and five strains form another group, whereas the other seven are more diverged. In total, 14 genotypes are observed. The DR locus analysis reveals additional variability, some strains being devoid of a direct repeat locus and others having unique spacers. The hsp65 gene polymorphism was investigated by restriction enzyme analysis and sequencing of PCR amplicons. Four new single nucleotide polymorphisms were discovered. One strain was characterized by three nucleotide changes in 441 bp, creating new restriction enzyme polymorphisms. As no sequence variability was found for hsp65 in the whole MTBC, and as a single point mutation separates M. tuberculosis from the closest “M. canettii” strains, this diversity within “M. canettii” subspecies strongly suggests that it is the most probable source species of the MTBC rather than just another branch of the MTBC. PMID:15243089

  13. Variability and population genetic structure in Achyrocline flaccida (Weinm.) DC., a species with high value in folk medicine in South America.

    PubMed

    Rosa, Juliana da; Weber, Gabriela Gomes; Cardoso, Rafaela; Górski, Felipe; Da-Silva, Paulo Roberto

    2017-01-01

    Better knowledge of medicinal plant species and their conservation is an urgent need worldwide. Decision making for conservation strategies can be based on the knowledge of the variability and population genetic structure of the species and on the events that may influence these genetic parameters. Achyrocline flaccida (Weinm.) DC. is a native plant from the grassy fields of South America with high value in folk medicine. In spite of its importance, no genetic and conservation studies are available for the species. In this work, microsatellite and ISSR (inter-simple sequence repeat) markers were used to estimate the genetic variability and structure of seven populations of A. flaccida from southern Brazil. The microsatellite markers were inefficient in A. flaccida owing to a high number of null alleles. After the evaluation of 42 ISSR primers on one population, 10 were selected for further analysis of seven A. flaccida populations. The results of ISSR showed that the high number of exclusive absence of loci might contribute to the inter-population differentiation. Genetic variability of the species was high (Nei's diversity of 0.23 and Shannon diversity of 0.37). AMOVA indicated higher genetic variability within (64.7%) than among (33.96%) populations, and the variability was unevenly distributed (FST 0.33). Gene flow among populations ranged from 1.68 to 5.2 migrants per generation, with an average of 1.39. The results of PCoA and Bayesian analyses corroborated and indicated that the populations are structured. The observed genetic variability and population structure of A. flaccida are discussed in the context of the vegetation formation history in southern Brazil, as well as the possible anthropogenic effects. Additionally, we discuss the implications of the results in the conservation of the species.

  14. The sequence of cortical activity inferred by response latency variability in the human ventral pathway of face processing.

    PubMed

    Lin, Jo-Fu Lotus; Silva-Pereyra, Juan; Chou, Chih-Che; Lin, Fa-Hsuan

    2018-04-11

    Variability in neuronal response latency has been typically considered caused by random noise. Previous studies of single cells and large neuronal populations have shown that the temporal variability tends to increase along the visual pathway. Inspired by these previous studies, we hypothesized that functional areas at later stages in the visual pathway of face processing would have larger variability in the response latency. To test this hypothesis, we used magnetoencephalographic data collected when subjects were presented with images of human faces. Faces are known to elicit a sequence of activity from the primary visual cortex to the fusiform gyrus. Our results revealed that the fusiform gyrus showed larger variability in the response latency compared to the calcarine fissure. Dynamic and spectral analyses of the latency variability indicated that the response latency in the fusiform gyrus was more variable than in the calcarine fissure between 70 ms and 200 ms after the stimulus onset and between 4 Hz and 40 Hz, respectively. The sequential processing of face information from the calcarine sulcus to the fusiform sulcus was more reliably detected based on sizes of the response variability than instants of the maximal response peaks. With two areas in the ventral visual pathway, we show that the variability in response latency across brain areas can be used to infer the sequence of cortical activity.

  15. Multilocus variable-number tandem repeat analysis for molecular typing and phylogenetic analysis of Shigella flexneri

    PubMed Central

    2009-01-01

    Background Shigella flexneri is one of the causative agents of shigellosis, a major cause of childhood mortality in developing countries. Multilocus variable-number tandem repeat (VNTR) analysis (MLVA) is a prominent subtyping method to resolve closely related bacterial isolates for investigation of disease outbreaks and provide information for establishing phylogenetic patterns among isolates. The present study aimed to develop an MLVA method for S. flexneri and the VNTR loci identified were tested on 242 S. flexneri isolates to evaluate their variability in various serotypes. The isolates were also analyzed by pulsed-field gel electrophoresis (PFGE) to compare the discriminatory power and to evaluate the usefulness of MLVA as a tool for phylogenetic analysis of S. flexneri. Results Thirty-six VNTR loci were identified by exploring the repeat sequence loci in genomic sequences of Shigella species and by testing the loci on nine isolates of different subserotypes. The VNTR loci in different serotype groups differed greatly in their variability. The discriminatory power of an MLVA assay based on four most variable VNTR loci was higher, though not significantly, than PFGE for the total isolates, a panel of 2a isolates, which were relatively diverse, and a panel of 4a/Y isolates, which were closely-related. Phylogenetic groupings based on PFGE patterns and MLVA profiles were considerably concordant. The genetic relationships among the isolates were correlated with serotypes. The phylogenetic trees constructed using PFGE patterns and MLVA profiles presented two distinct clusters for the isolates of serotype 3 and one distinct cluster for each of the serotype groups, 1a/1b/NT, 2a/2b/X/NT, 4a/Y, and 6. Isolates that had different serotypes but had closer genetic relatedness than those with the same serotype were observed between serotype Y and subserotype 4a, serotype X and subserotype 2b, subserotype 1a and 1b, and subserotype 3a and 3b. 
Conclusions The 36 VNTR loci identified exhibited considerably different degrees of variability among S. flexneri serotype groups. VNTR locus could be highly variable in a serotype but invariable in others. MLVA assay based on four highly variable loci could display a comparable resolving power to PFGE in discriminating isolates. MLVA is also a prominent molecular tool for phylogenetic analysis of S. flexneri; the resulting data are beneficial to establish clear clonal patterns among different serotype groups and to discern clonal groups among isolates within the same serotype. As highly variable VNTR loci could be serotype-specific, a common MLVA protocol that consists of only a small set of loci, for example four to eight loci, and that provides high resolving power to all S. flexneri serotypes may not be obtainable. PMID:20042119
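
    The discriminatory power of a typing scheme is often summarized with a Simpson-type index such as the Hunter-Gaston index, D = 1 - [1/(N(N-1))] Σ n_j(n_j-1), where n_j is the number of isolates of type j among N isolates. The abstract does not state which measure was used, so the sketch below is an assumption for illustration, and the MLVA profiles in it are invented.

```python
from collections import Counter


def hunter_gaston(type_labels):
    """Hunter-Gaston discriminatory index for a list of type assignments."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1 - same_type_pairs / (n * (n - 1))


# Invented MLVA profiles: repeat counts at four hypothetical VNTR loci per isolate.
profiles = [(5, 3, 7, 2), (5, 3, 7, 2), (5, 4, 7, 2), (6, 3, 9, 2)]

print(round(hunter_gaston(profiles), 4))
```

    A value of 1 means every isolate has a distinct type and 0 means all isolates share one type, so the same function can be applied to PFGE patterns and MLVA profiles of the same isolates to compare the two methods.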

  16. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims at resolving these issues, allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations, and different logotype representations, each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
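
    The information content that a Shannon-type sequence logo displays at each alignment position can be sketched as follows. This is a simplified illustration and not Seq2Logo's own procedure: it uses a flat additive pseudo count, a uniform 20-letter background and no sequence weighting, and the toy alignment is invented.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column, pseudocount=0.0):
    """Information content (bits) of one alignment column: log2(20) - entropy."""
    counts = Counter(c for c in column if c in AMINO_ACIDS)  # gaps are ignored
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        return 0.0
    entropy = 0.0
    for aa in AMINO_ACIDS:
        p = (counts[aa] + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(msa, pseudocount=0.0):
    """Per-column information content for a list of equal-length aligned sequences."""
    return [column_information(col, pseudocount) for col in zip(*msa)]


# Invented toy alignment: column 1 is fully conserved, column 3 is fully variable.
msa = ["ACD", "ACE", "ACF", "AGG"]
info = alignment_information(msa)
print([round(bits, 3) for bits in info])
```

    Adding a pseudo count lowers the information assigned to columns supported by few observations, which is the effect the abstract describes for alignments with a low number of sequences.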

  17. Interspecific and intraspecific gene variability in a 1-Mb region containing the highest density of NBS-LRR genes found in the melon genome.

    PubMed

    González, Víctor M; Aventín, Núria; Centeno, Emilio; Puigdomènech, Pere

    2014-12-17

    Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher than average errors. A 1-Mb sequence that contains the largest NBS-LRR gene cluster found in melon was improved using a strategy that combines Illumina paired-end mapping and PCR-based gap closing. Unknown sequence was decreased by 70% while about 3,000 SNPs and small indels were corrected. As a result, the annotations of 18 of a total of 23 NBS-LRR genes found in this region were modified, including additional coding sequences, amino acid changes, correction of splicing boundaries, or fusion of ORFs in common transcription units. A phylogeny analysis of the R-genes and their comparison with syntenic sequences in other cucurbits point to a pattern of local gene amplifications since the diversification of cucurbits from other families, and through speciation within the family. A candidate Vat gene is proposed based on the sequence similarity between a reported Vat gene from a Korean melon cultivar and a sequence fragment previously absent in the unrefined sequence. A sequence refinement strategy allowed substantial improvement of a 1 Mb fragment of the melon genome and the re-annotation of the largest cluster of NBS-LRR gene homologues found in melon. 
Analysis of the cluster revealed that resistance genes have been produced by sequence duplication in adjacent genome locations since the divergence of cucurbits from other close families, and through the process of speciation within the family. A candidate Vat gene was also identified using sequence previously unavailable, which demonstrates the advantages of genome assembly refinements when analyzing complex regions such as those containing clusters of highly similar genes.

  18. Using msa-2b as a molecular marker for genotyping Mexican isolates of Babesia bovis.

    PubMed

    Genis, Alma D; Perez, Jocelin; Mosqueda, Juan J; Alvarez, Antonio; Camacho, Minerva; Muñoz, Maria de Lourdes; Rojas, Carmen; Figueroa, Julio V

    2009-12-01

    Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2) and msa-2b). To determine the sequence variation among B. bovis Mexican isolates using msa-2b as a genetic marker, PCR amplicons corresponding to msa-2b were cloned and plasmids carrying the corresponding inserts were purified and sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 16 geographically different Mexican B. bovis isolates and a reference strain. Clustal-W multiple alignments of the deduced MSA-2b amino acid sequences from the 17 B. bovis Mexican isolates revealed three genotypes, each with a distinct set of amino acid residues in the variable region: Genotype I, represented by the MO7 strain (in vitro culture-derived from the Mexico isolate) as well as the RAD, Chiapas-1, Tabasco and Veracruz-3 isolates; Genotype II, represented by the Jalisco, Mexico and Veracruz-2 isolates; and Genotype III, comprising the sequences from most of the isolates studied: Tamaulipas-1, Chiapas-2, Guerrero-1, Nayarit, Quintana Roo, Nuevo Leon, Tamaulipas-2, Yucatan and Guerrero-2. Moreover, these three genotypes could be discriminated from one another by using a PCR-RFLP approach. The results suggest that indels within the variable region of msa-2b sequences can be useful markers for identifying a particular genotype present in field populations of B. bovis isolated from infected cattle in Mexico.

  19. Paleobiogeography, high-resolution stratigraphy, and the future of Paleozoic biostratigraphy: Fine-scale diachroneity of the Wenlock (Silurian) conodont Kockelella walliseri

    USGS Publications Warehouse

    Cramer, Bradley D.; Kleffner, Mark A.; Brett, Carlton E.; McLaughlin, P.I.; Jeppsson, Lennart; Munnecke, Axel; Samtleben, Christian

    2010-01-01

    The Wenlock Epoch of the Silurian Period has become one of the chronostratigraphically best-constrained intervals of the Paleozoic. The integration of multiple chronostratigraphic tools, such as conodont and graptolite biostratigraphy, sequence stratigraphy, and δ13Ccarb chemostratigraphy, has greatly improved global chronostratigraphic correlation and portions of the Wenlock can now be correlated with precision better than ±100 kyr. Additionally, such detailed and integrated chronostratigraphy provides an opportunity to evaluate the fidelity of individual chronostratigraphic tools. Here, we use conodont biostratigraphy, sequence stratigraphy and carbon isotope (δ13Ccarb) chemostratigraphy to demonstrate that the conodont Kockelella walliseri, an important guide fossil for middle and upper Sheinwoodian strata (lower stage of the Wenlock Series), first appears at least one full stratigraphic sequence lower in Laurentia than in Baltica. Rather than serving as a demonstration of the unreliability of conodont biostratigraphy, this example serves to demonstrate the promise of high-resolution Paleozoic stratigraphy. The temporal difference between the two first occurrences was likely less than 1 million years, and although it is conceptually understood that speciation and colonization must have been non-instantaneous events, Paleozoic paleobiogeographic variability on such short timescales (tens to hundreds of kyr) traditionally has been ignored or considered to be of little practical importance. The expansion of high-resolution Paleozoic stratigraphy in the future will require robust biostratigraphic zonations that embrace the integration of multiple chronostratigraphic tools as well as the paleobiogeographic variability in ranges that they will inevitably demonstrate. In addition, a better understanding of the paleobiogeographic migration histories of marine organisms will provide a unique tool for future Paleozoic paleoceanography and paleobiology research. © 2010 Elsevier B.V.

  20. Search for Variables in the Kepler Field on DASCH Plates

    NASA Astrophysics Data System (ADS)

    Tang, Sumin; Grindlay, J.; Los, E.; Servillat, M.

    2011-01-01

    The Digital Access to a Sky Century @ Harvard (DASCH) is a project to digitize the half a million glass photographic plates over the period 1880s-1980s. This 100 year coverage is a unique resource for studying temporal variations in the universe. Here we present our variable search algorithms and variable catalog in the Kepler fields based on 3000 scanned plates. We use the KIC spectral classifications to search for long-term variability of any main sequence stars, particularly M dwarfs. We apply a variability search technique developed for DASCH and set limits on the fraction of main sequence stars, by spectral type, which show detectable (>0.2mag) variability on timescales 10-100y. Such limits are of particular interest for M dwarfs given the recent discoveries of their planet systems.

  1. Variable Selection through Correlation Sifting

    NASA Astrophysics Data System (ADS)

    Huang, Jim C.; Jojic, Nebojsa

    Many applications of computational biology require a variable selection procedure to sift through a large number of input variables and select some smaller number that influence a target variable of interest. For example, in virology, only some small number of viral protein fragments influence the nature of the immune response during viral infection. Due to the large number of variables to be considered, a brute-force search for the subset of variables is in general intractable. To approximate this, methods based on ℓ1-regularized linear regression have been proposed and have been found to be particularly successful. It is well understood however that such methods fail to choose the correct subset of variables if these are highly correlated with other "decoy" variables. We present a method for sifting through sets of highly correlated variables which leads to higher accuracy in selecting the correct variables. The main innovation is a filtering step that reduces correlations among variables to be selected, making the ℓ1-regularization effective for datasets on which many methods for variable selection fail. The filtering step changes both the values of the predictor variables and output values by projections onto components obtained through a computationally-inexpensive principal components analysis. In this paper we demonstrate the usefulness of our method on synthetic datasets and on novel applications in virology. These include HIV viral load analysis based on patients' HIV sequences and immune types, as well as the analysis of seasonal variation in influenza death rates based on the regions of the influenza genome that undergo diversifying selection in the previous season.

  2. DNA extraction for streamlined metagenomics of diverse environmental samples.

    PubMed

    Marotz, Clarisse; Amir, Amnon; Humphrey, Greg; Gaffney, James; Gogul, Grant; Knight, Rob

    2017-06-01

    A major bottleneck for metagenomic sequencing is rapid and efficient DNA extraction. Here, we compare the extraction efficiencies of three magnetic bead-based platforms (KingFisher, epMotion, and Tecan) to a standardized column-based extraction platform across a variety of sample types, including feces, oral, skin, soil, and water. Replicate sample plates were extracted and prepared for 16S rRNA gene amplicon sequencing in parallel to assess extraction bias and DNA quality. The data demonstrate that any effect of extraction method on sequencing results was small compared with the variability across samples; however, the KingFisher platform produced the largest number of high-quality reads in the shortest amount of time. Based on these results, we have identified an extraction pipeline that dramatically reduces sample processing time without sacrificing bacterial taxonomic or abundance information.

  3. III. NIH Toolbox Cognition Battery (CB): measuring episodic memory.

    PubMed

    Bauer, Patricia J; Dikmen, Sureyya S; Heaton, Robert K; Mungas, Dan; Slotkin, Jerry; Beaumont, Jennifer L

    2013-08-01

    One of the most significant domains of cognition is episodic memory, which allows for rapid acquisition and long-term storage of new information. For purposes of the NIH Toolbox, we devised a new test of episodic memory. The nonverbal NIH Toolbox Picture Sequence Memory Test (TPSMT) requires participants to reproduce the order of an arbitrarily ordered sequence of pictures presented on a computer. To adjust for ability, sequence length varies from 6 to 15 pictures. Multiple trials are administered to increase reliability. Pediatric data from the validation study revealed the TPSMT to be sensitive to age-related changes. The task also has high test-retest reliability and promising construct validity. Steps to further increase the sensitivity of the instrument to individual and age-related variability are described. © 2013 The Society for Research in Child Development, Inc.

  4. Full-genome sequences of hepatitis B virus subgenotype D3 isolates from the Brazilian Amazon Region.

    PubMed

    Spitz, Natália; Mello, Francisco C A; Araujo, Natalia Motta

    2015-02-01

    The Brazilian Amazon Region is a highly endemic area for hepatitis B virus (HBV). However, little is known regarding the genetic variability of the strains circulating in this geographical region. Here, we describe the first full-length genomes of HBV isolated in the Brazilian Amazon Region; these genomes are also the first complete HBV subgenotype D3 genomes reported for Brazil. The genomes of the five Brazilian isolates were all 3,182 base pairs in length and the isolates were classified as belonging to subgenotype D3, subtypes ayw2 (n = 3) and ayw3 (n = 2). Phylogenetic analysis suggested that the Brazilian sequences are not likely to be closely related to European D3 sequences. Such results will contribute to further epidemiological and evolutionary studies of HBV.

  5. Molecular variability analysis of five new complete cacao swollen shoot virus genomic sequences.

    PubMed

    Muller, E; Sackey, S

    2005-01-01

    Cacao swollen shoot virus (CSSV), a member of the family Caulimoviridae, genus Badnavirus, occurs in all the main cacao-growing areas of West Africa. We amplified, cloned and sequenced complete genomes of five new isolates, two originating from Togo and three originating from Ghana. The genomes of these five newly sequenced isolates all contain the five putative open reading frames I, II, III, X and Y described for the first sequenced CSSV isolate, Agou1, originating from Togo. Their genomes have been aligned with the genome of Agou1. The nucleotide and amino acid sequence identities between isolates have been calculated and a phylogenetic analysis has been made including other pararetroviruses. Maximum nucleotide sequence variability between complete genomes of CSSV isolates was 29.4%. Geographical differentiation between isolates appears more important than differentiation between mild and severe isolates. ORF X differs greatly in size and sequence between the Togolese isolates Nyongbo2 and Agou1, and the four other isolates; its functional role is therefore clearly questionable.

  6. Sequence variation and phylogenetic analysis of envelope glycoprotein of hepatitis G virus.

    PubMed

    Lim, M Y; Fry, K; Yun, A; Chong, S; Linnen, J; Fung, K; Kim, J P

    1997-11-01

    A transfusion-transmissible agent provisionally designated hepatitis G virus (HGV) was recently identified. In this study, we examined the variability of the HGV genome by analysing sequences in the putative envelope region from 72 isolates obtained from diverse geographical sources. The 1561 nucleotide sequence of the E1/E2/NS2a region of HGV was determined from 12 isolates, and compared with three published sequences. The most variability was observed in 400 nucleotides at the N terminus of E2. We next analysed this 400 nucleotide envelope variable region (EV) from an additional 60 HGV isolates. This sequence varied considerably among the 75 isolates, with overall identity ranging from 79.3% to 99.5% at the nucleotide level, and from 83.5% to 100% at the amino acid level. However, hypervariable regions were not identified. Phylogenetic analyses indicated that the 75 HGV isolates belong to a single genotype. A single-tier distribution of evolutionary distances was observed among the 15 E1/E2/NS2a sequences and the 75 EV sequences. In contrast, 11 isolates of HCV were analysed and showed a three-tiered distribution, representing genotypes, subtypes, and isolates. The 75 isolates of HGV fell into four clusters on the phylogenetic tree. Tight geographical clustering was observed among the HGV isolates from Japan and Korea.
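
    The identity ranges quoted in this abstract (for example 79.3% to 99.5% at the nucleotide level) are the kind of figure obtained by comparing every pair of aligned sequences. The following is a minimal sketch, not the authors' procedure: it assumes the sequences are already aligned to equal length, skips gap columns (one common convention among several), and uses made-up toy sequences. The names percent_identity and identity_range are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: columns in which either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Lowest and highest pairwise percent identity in a set of sequences."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return min(values), max(values)


# Toy alignment (invented data, for illustration only).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTTT"]
low, high = identity_range(toy)
print(f"pairwise identity: {low:.1f}% to {high:.1f}%")
```

    How gaps and ambiguous bases are counted changes the reported percentages, so the convention used should be stated alongside any identity range.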

  7. Analysis of simian immunodeficiency virus sequence variation in tissues of rhesus macaques with simian AIDS.

    PubMed Central

    Kodama, T; Mori, K; Kawahara, T; Ringler, D J; Desrosiers, R C

    1993-01-01

    One rhesus macaque displayed severe encephalomyelitis and another displayed severe enterocolitis following infection with molecularly cloned simian immunodeficiency virus (SIV) strain SIVmac239. Little or no free anti-SIV antibody developed in these two macaques, and they died relatively quickly (4 to 6 months) after infection. Manifestation of the tissue-specific disease in these macaques was associated with the emergence of variants with high replicative capacity for macrophages and primary infection of tissue macrophages. The nature of sequence variation in the central region (vif, vpr, and vpx), the env gene, and the nef long terminal repeat (LTR) region in brain, colon, and other tissues was examined to see whether specific genetic changes were associated with SIV replication in brain or gut. Sequence analysis revealed strong conservation of the intergenic central region, nef, and the LTR. However, analysis of env sequences in these two macaques and one other revealed significant, interesting patterns of sequence variation. (i) Changes in env that were found previously to contribute to the replicative ability of SIVmac for macrophages in culture were present in the tissues of these animals. (ii) The greatest variability was located in the regions between V1 and V2 and from "V3" through C3 in gp120, which are different in location from the variable regions observed previously in animals with strong antibody responses and long-term persistent infection. (iii) The predominant sequence change of D-->N at position 385 in C3 is most surprising, since this change in both SIV and human immunodeficiency virus type 1 has been associated with dramatically diminished affinity for CD4 and replication in vitro. 
(iv) The nature of sequence changes at some positions (146, 178, 345, 385, and "V3") suggests that viral replication in brain and gut may be facilitated by specific sequence changes in env in addition to those that impart a general ability to replicate well in macrophages. These results demonstrate that complex selective pressures, including immune responses and varying cell and tissue specificity, can influence the nature of sequence changes in env. PMID:8411355

  8. The nonlinear, complex sequential organization of behavior in schizophrenic patients: neurocognitive strategies and clinical correlations.

    PubMed

    Paulus, M P; Perry, W; Braff, D L

    1999-09-01

    Thought disorder is a hallmark of schizophrenia and can be inferred from disorganized behavior. Measures of the sequential organization of behavior are important because they reflect the cognitive processes of the selection and sequencing of behavioral elements, which generate observable and analyzable behavioral patterns. In this context, sequences of choices generated by schizophrenic patients in a two-choice guessing task fluctuate significantly, which reflects an "oscillating dysregulation" between highly predictable and highly unpredictable subsequences within a single test session. In this study, we aimed to clarify the significance of dysregulation by seeing whether demographic, clinical, neuropsychological, and psychological measures predict the degree of dysregulation observed on this two-choice task. Thirty schizophrenic patients repeatedly performed a LEFT or RIGHT key press that was followed by a stimulus, which occurred randomly on the left or right side of the computer screen. Thus, the stimulus location had nothing to do with the key press behavior. The range of key press sequence predictabilities as measured by the dynamical entropy was used to quantify the dysregulation of response sequences and reflects the range of fixity and randomness of the responses. A factor analysis was performed and step-wise multiple regression analyses were used to relate the factor scores to demographic, clinical, symptomatic, Wisconsin Card Sorting Test (WCST), and Rorschach variables. The LEFT/RIGHT key press sequences were determined by three factors: 1) the degree of win-stay/lose-shift strategy; 2) the degree of contextual influence on the current choice; and 3) the degree of dysregulation on the choice task. Demographic and clinical variables did not predict any of the three response patterns on the choice task. In contrast, the WCST and Rorschach test predicted performance on various factors of choice task response patterns. 
Schizophrenic patients employ several rules, i.e., "win-stay/lose-shift" and "decide according to the previous choice," that fluctuate significantly when generating sequences on this task, confirming that a basic behavioral dysregulation occurs in a single schizophrenic subject across a single test session. The organization or the "temporal architecture" of the behavioral sequences is not related to symptoms per se, but is related to deficits in executive functioning, problem solving, and perceptual organizational abilities.
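
    The idea of measuring how predictable a LEFT/RIGHT key press sequence is, and how far that predictability swings within one session, can be illustrated with a much simpler quantity than the dynamical entropy used in the study. The sketch below is an assumption-laden stand-in: it computes the Shannon entropy of overlapping words of length k, then the spread of that entropy across non-overlapping windows. The names word_entropy and entropy_range, the window size and the example sequence are all hypothetical.

```python
import math
from collections import Counter


def word_entropy(choices, k=1):
    """Shannon entropy (bits) of overlapping length-k words in a choice string.

    0 bits means fully predictable; k bits is the maximum for binary choices.
    """
    words = [choices[i:i + k] for i in range(len(choices) - k + 1)]
    if not words:
        raise ValueError("sequence shorter than word length")
    total = len(words)
    counts = Counter(words)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_range(choices, window, k=1):
    """Spread (max minus min) of word entropy over non-overlapping windows.

    A large spread means the sequence alternates between rigid and
    random-looking stretches within the same session.
    """
    starts = range(0, len(choices) - window + 1, window)
    values = [word_entropy(choices[s:s + window], k) for s in starts]
    return max(values) - min(values)


# Invented example: a rigid stretch followed by an alternating stretch.
session = "LLLLLRLR"
print(entropy_range(session, window=4, k=1))
```

    Real analyses would use far longer sequences and a principled choice of k and window length; this sketch only shows the shape of the computation.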

  9. Effects of Physiochemical Factors on Prokaryotic Biodiversity in Malaysian Circumneutral Hot Springs.

    PubMed

    Chan, Chia S; Chan, Kok-Gan; Ee, Robson; Hong, Kar-Wai; Urbieta, María S; Donati, Edgardo R; Shamsir, Mohd S; Goh, Kian M

    2017-01-01

    Malaysia has a great number of hot springs, especially along the flank of the Banjaran Titiwangsa mountain range. Biological studies of the Malaysian hot springs are rare because of the lack of comprehensive information on their microbial communities. In this study, we report a cultivation-independent census to describe microbial communities in six hot springs. The Ulu Slim (US), Sungai Klah (SK), Dusun Tua (DT), Sungai Serai (SS), Semenyih (SE), and Ayer Hangat (AH) hot springs exhibit circumneutral pH with temperatures ranging from 43°C to 90°C. Genomic DNA was extracted from environmental samples and the V3-V4 hypervariable regions of 16S rRNA genes were amplified, sequenced, and analyzed. High-throughput sequencing analysis showed that microbial richness was high in all samples as indicated by the detection of 6,334-26,244 operational taxonomy units. In total, 59, 61, 72, 73, 65, and 52 bacterial phyla were identified in the US, SK, DT, SS, SE, and AH hot springs, respectively. Generally, Firmicutes and Proteobacteria dominated the bacterial communities in all hot springs. Archaeal communities mainly consisted of Crenarchaeota, Euryarchaeota, and Parvarchaeota. In beta diversity analysis, the hot spring microbial memberships were clustered primarily on the basis of temperature and salinity. Canonical correlation analysis to assess the relationship between the microbial communities and physicochemical variables revealed that diversity patterns were best explained by a combination of physicochemical variables, rather than by individual abiotic variables such as temperature and salinity.

  10. Effects of Physiochemical Factors on Prokaryotic Biodiversity in Malaysian Circumneutral Hot Springs

    PubMed Central

    Chan, Chia S.; Chan, Kok-Gan; Ee, Robson; Hong, Kar-Wai; Urbieta, María S.; Donati, Edgardo R.; Shamsir, Mohd S.; Goh, Kian M.

    2017-01-01

    Malaysia has a great number of hot springs, especially along the flank of the Banjaran Titiwangsa mountain range. Biological studies of the Malaysian hot springs are rare because of the lack of comprehensive information on their microbial communities. In this study, we report a cultivation-independent census to describe microbial communities in six hot springs. The Ulu Slim (US), Sungai Klah (SK), Dusun Tua (DT), Sungai Serai (SS), Semenyih (SE), and Ayer Hangat (AH) hot springs exhibit circumneutral pH with temperatures ranging from 43°C to 90°C. Genomic DNA was extracted from environmental samples and the V3–V4 hypervariable regions of 16S rRNA genes were amplified, sequenced, and analyzed. High-throughput sequencing analysis showed that microbial richness was high in all samples as indicated by the detection of 6,334–26,244 operational taxonomy units. In total, 59, 61, 72, 73, 65, and 52 bacterial phyla were identified in the US, SK, DT, SS, SE, and AH hot springs, respectively. Generally, Firmicutes and Proteobacteria dominated the bacterial communities in all hot springs. Archaeal communities mainly consisted of Crenarchaeota, Euryarchaeota, and Parvarchaeota. In beta diversity analysis, the hot spring microbial memberships were clustered primarily on the basis of temperature and salinity. Canonical correlation analysis to assess the relationship between the microbial communities and physicochemical variables revealed that diversity patterns were best explained by a combination of physicochemical variables, rather than by individual abiotic variables such as temperature and salinity. PMID:28729863

  11. An Evaluation of an Intervention Sequence Outline in Positive Behaviour Support for People with Autism and Severe Escape-Motivated Challenging Behaviour

    ERIC Educational Resources Information Center

    McClean, Brian; Grey, Ian

    2012-01-01

    Background: Positive behaviour support emphasises the impact of contextual variables to enhance participation, choice, and quality of life. This study evaluates a sequence for implementing changes to key contextual variables for 4 individuals. Interventions were maintained and data collection continued over a 3-year period. Method: Functional…

  12. MHC class I and MHC class II DRB gene variability in wild and captive Bengal tigers (Panthera tigris tigris).

    PubMed

    Pokorny, Ina; Sharma, Reeta; Goyal, Surendra Prakash; Mishra, Sudanshu; Tiedemann, Ralph

    2010-10-01

    Bengal tigers are highly endangered and knowledge on adaptive genetic variation can be essential for efficient conservation and management. Here we present the first assessment of allelic variation in major histocompatibility complex (MHC) class I and MHC class II DRB genes for wild and captive tigers from India. We amplified, cloned, and sequenced alpha-1 and alpha-2 domain of MHC class I and beta-1 domain of MHC class II DRB genes in 16 tiger specimens of different geographic origin. We detected high variability in peptide-binding sites, presumably resulting from positive selection. Tigers exhibit a low number of MHC DRB alleles, similar to other endangered big cats. Our initial assessment (admittedly with limited geographic coverage and sample size) did not reveal significant differences between captive and wild tigers with regard to MHC variability. In addition, we successfully amplified MHC DRB alleles from scat samples. Our characterization of tiger MHC alleles forms a basis for further in-depth analyses of MHC variability in this illustrative threatened mammal.

  13. Comparative Molecular and Morphological Variation Analysis of Siderastrea (Anthozoa, Scleractinia) Reveals the Presence of Siderastrea stellata in the Gulf of Mexico.

    PubMed

    García, Norberto A Colín; Campos, Jorge E; Musi, José L Tello; Forsman, Zac H; Muñoz, Jorge L Montero; Reyes, Alejandro Monsalvo; González, Jesús E Arias

    2017-02-01

    The genus Siderastrea exhibits high levels of morphological variability. Some of its species share similar morphological characteristics with congeners, making their identification difficult. Siderastrea stellata has been reported as an intermediary of S. siderea and S. radians in the Brazilian reef ecosystem. In an earlier study conducted in Mexico, we detected Siderastrea colonies with morphological features that were not consistent with some siderastreid species previously reported in the Gulf of Mexico. Thus, we performed a combined morphological and molecular analysis to identify Siderastrea species boundaries from the Gulf of Mexico. Some colonies presented high morphologic variability, with characteristics that corresponded to Siderastrea stellata. Molecular analysis, using the nuclear ITS and ITS2 region, corroborated the morphological results, revealing low genetic variability between S. radians and S. stellata. Since the ITS sequences did not distinguish between Siderastrea species, we used the ITS2 region to differentiate S. stellata from S. radians. This is the first report of Siderastrea stellata and its variability in the Gulf of Mexico that is supported by morphological and molecular analyses.

  14. Modeling bias and variation in the stochastic processes of small RNA sequencing

    PubMed Central

    Etheridge, Alton; Sakhanenko, Nikita; Galas, David

    2017-01-01

    The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additive models for location, scale and shape (GAMLSS) distributional regression framework to calculate and apply empirical correction factors for ligase bias. Bias correction could remove more than 40% of the bias for miRNAs. Empirical bias correction factors appear to be nearly constant over at least one and up to four orders of magnitude of total RNA input and independent of sample composition. Using synthetic mixes of known composition, we show that the GAMLSS approach can analyze differential expression with greater accuracy, higher sensitivity and specificity than six existing algorithms (DESeq2, edgeR, EBSeq, limma, DSS, voom) for the analysis of small RNA-seq data. PMID:28369495
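
    A linear quadratic mean-variance relation can be checked against replicate counts with an ordinary least-squares fit. The sketch below assumes the form variance = a * mean + b * mean^2 with no intercept, which is an illustrative guess at the functional form rather than the model fitted in the paper, and it does not use GAMLSS. The function name fit_linear_quadratic and the numbers are hypothetical.

```python
def fit_linear_quadratic(means, variances):
    """Least-squares fit of variance = a * mean + b * mean**2 (no intercept).

    Solves the 2x2 normal equations directly, so no external library is needed.
    """
    if len(means) != len(variances) or len(means) < 2:
        raise ValueError("need at least two (mean, variance) pairs")
    s11 = sum(m ** 2 for m in means)                        # sum of m^2
    s12 = sum(m ** 3 for m in means)                        # sum of m^3
    s22 = sum(m ** 4 for m in means)                        # sum of m^4
    t1 = sum(m * v for m, v in zip(means, variances))       # sum of m * v
    t2 = sum(m ** 2 * v for m, v in zip(means, variances))  # sum of m^2 * v
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("means do not determine both coefficients")
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b


# Invented per-sequence means and variances generated with a = 1.5, b = 0.25.
means = [1.0, 2.0, 4.0, 8.0]
variances = [1.5 * m + 0.25 * m ** 2 for m in means]
a, b = fit_linear_quadratic(means, variances)
print(f"a = {a:.3f}, b = {b:.3f}")
```

    With a = 1 and b = 0 this reduces to Poisson-like variance, while b > 0 adds overdispersion that grows with the square of the mean, as in a negative binomial model.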

  15. Ultra-barcoding in cacao (Theobroma spp.; Malvaceae) using whole chloroplast genomes and nuclear ribosomal DNA.

    PubMed

    Kane, Nolan; Sveinsson, Saemundur; Dempewolf, Hannes; Yang, Ji Yong; Zhang, Dapeng; Engels, Johannes M M; Cronk, Quentin

    2012-02-01

    To reliably identify lineages below the species level such as subspecies or varieties, we propose an extension to DNA-barcoding using next-generation sequencing to produce whole organellar genomes and substantial nuclear ribosomal sequence. Because this method uses much longer versions of the traditional DNA-barcoding loci in the plastid and ribosomal DNA, we call our approach ultra-barcoding (UBC). We used high-throughput next-generation sequencing to scan the genome and generate reliable sequence of high copy number regions. Using this method, we examined whole plastid genomes as well as nearly 6000 bases of nuclear ribosomal DNA sequences for nine genotypes of Theobroma cacao and an individual of the related species T. grandiflorum, as well as an additional publicly available whole plastid genome of T. cacao. All individuals of T. cacao examined were uniquely distinguished, and evidence of reticulation and gene flow was observed. Sequence variation was observed in some of the canonical barcoding regions between species, but other regions of the chloroplast were more variable both within species and between species, as were ribosomal spacers. Furthermore, no single region provides the level of data available using the complete plastid genome and rDNA. Our data demonstrate that UBC is a viable, increasingly cost-effective approach for reliably distinguishing varieties and even individual genotypes of T. cacao. This approach shows great promise for applications where very closely related or interbreeding taxa must be distinguished.

  16. Training the max-margin sequence model with the relaxed slack variables.

    PubMed

    Niu, Lingfeng; Wu, Jianmin; Shi, Yong

    2012-09-01

    Sequence models are widely used in many applications such as natural language processing, information extraction and optical character recognition. In this paper we propose a new approach to training the max-margin based sequence model by relaxing the slack variables. With the canonical feature mapping definition, the relaxed problem is solved by training a multiclass Support Vector Machine (SVM). Compared with the state-of-the-art solutions for sequence learning, the new method has the following advantages: firstly, the sequence training problem is transformed into a multiclassification problem, which is more widely studied and already has quite a few off-the-shelf training packages; secondly, this new approach reduces the complexity of training significantly and achieves prediction performance comparable with that of the existing sequence models; thirdly, when the size of training data is limited, by assigning different slack variables to different microlabel pairs, the new method can use the discriminative information more frugally and produce a more reliable model; last but not least, by employing kernels in the intermediate multiclass SVM, nonlinear feature space can be easily explored. Experimental results on the tasks of named entity recognition, information extraction and handwritten letter recognition with public datasets illustrate the efficiency and effectiveness of our method. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Sequence Variation of the tRNALeu Intron as a Marker for Genetic Diversity and Specificity of Symbiotic Cyanobacteria in Some Lichens

    PubMed Central

    Paulsrud, Per; Lindblad, Peter

    1998-01-01

    We examined the genetic diversity of Nostoc symbionts in some lichens by using the tRNALeu (UAA) intron as a genetic marker. The nucleotide sequence was analyzed in the context of the secondary structure of the transcribed intron. Cyanobacterial tRNALeu (UAA) introns were specifically amplified from freshly collected lichen samples without previous DNA extraction. The lichen species used in the present study were Nephroma arcticum, Peltigera aphthosa, P. membranacea, and P. canina. Introns with different sizes around 300 bp were consistently obtained. Multiple clones from single PCRs were screened by using their single-stranded conformational polymorphism pattern, and the nucleotide sequence was determined. No evidence for sample heterogeneity was found. This implies that the symbiont in situ is not a diverse community of cyanobionts but, rather, one Nostoc strain. Furthermore, each lichen thallus contained only one intron type, indicating that each thallus is colonized only once or that there is a high degree of specificity. The same cyanobacterial intron sequence was also found in samples of one lichen species from different localities. In a phylogenetic analysis, the cyanobacterial lichen sequences grouped together with the sequences from two free-living Nostoc strains. The size differences in the intron were due to insertions and deletions in highly variable regions. The sequence data were used in discussions concerning specificity and biology of the lichen symbiosis. It is concluded that the tRNALeu (UAA) intron can be of great value when examining cyanobacterial diversity. PMID:9435083

  18. Microbial Diversity of Acidic Hot Spring (Kawah Hujan B) in Geothermal Field of Kamojang Area, West Java-Indonesia

    PubMed Central

    Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

    2009-01-01

    Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed a characteristic of acidic-sulfate geothermal activity that contained high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the members of Crenarchaeota showed some sequence variation compared with the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to a particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and Gammaproteobacteria. PMID:19440252

  19. Evolution and molecular epidemiology of classical swine fever virus during a multi-annual outbreak amongst European wild boar.

    PubMed

    Goller, Katja V; Gabriel, Claudia; Dimna, Mireille Le; Le Potier, Marie-Frédérique; Rossi, Sophie; Staubach, Christoph; Merboth, Matthias; Beer, Martin; Blome, Sandra

    2016-03-01

    Classical swine fever is a viral disease of pigs that carries tremendous socio-economic impact. In outbreak situations, genetic typing is carried out for the purpose of molecular epidemiology in both domestic pigs and wild boar. These analyses are usually based on harmonized partial sequences. However, for high-resolution analyses towards the understanding of genetic variability and virus evolution, full-genome sequences are more appropriate. In this study, a unique set of representative virus strains was investigated that was collected during an outbreak in French free-ranging wild boar in the Vosges-du-Nord mountains between 2003 and 2007. Comparative sequence and evolutionary analyses of the nearly full-length sequences showed only slow evolution of classical swine fever virus strains over the years and no impact of vaccination on mutation rates. However, substitution rates varied amongst protein genes; furthermore, a spatial and temporal pattern could be observed whereby two separate clusters were formed that coincided with physical barriers.

  20. Microbial diversity of acidic hot spring (kawah hujan B) in geothermal field of kamojang area, west java-indonesia.

    PubMed

    Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

    2009-01-01

    Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed a characteristic of acidic-sulfate geothermal activity that contained high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the members of Crenarchaeota showed some sequence variation compared with the published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to a particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and Gammaproteobacteria.

  1. Phylogenetic relationships of Paradiclybothrium pacificum and Diclybothrium armatum (Monogenoidea: Diclybothriidae) inferred from 18S rDNA sequence data.

    PubMed

    Rozhkovan, Konstantin V; Shedko, Marina B

    2015-10-01

    The Diclybothriidae (Monogenoidea: Oligonchoinea) includes specific parasites of fishes assigned to the ancient order Acipenseriformes. Phylogeny of the Diclybothriidae is still unclear despite several systematic studies based on morphological characters. Together with the closely related Hexabothriidae represented by parasites of sharks and ray-fishes, the position of Diclybothriidae in different taxonomical systems has been a matter of discussion. Here, we present the first molecular data on Diclybothriidae. The SSU rRNA gene was used to investigate the phylogenetic position of Paradiclybothrium pacificum and Diclybothrium armatum among the other Oligonchoinea. Complete nucleotide sequences of P. pacificum and D. armatum demonstrated high identity (98.53%) with no intraspecific sequence variability. Specimens of D. armatum were obtained from different hosts (Acipenser schrenckii and Huso dauricus); however, variation by host was not detected. The sequence divergence and phylogenetic tree data show that Diclybothriidae and Hexabothriidae are more closely related to each other than to other representatives of Oligonchoinea. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
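
    The identity figure quoted above (98.53%) is a pairwise percent identity. The abstract does not say which convention was used to compute it, so the following is only a minimal sketch of one common convention: matching columns divided by the columns in which neither aligned sequence has a gap. The function name percent_identity and the gap symbol are assumptions made for illustration.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumes the two sequences are already aligned (same length, gaps marked
    with `gap`). Other conventions (e.g. counting gap columns) give other values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, not data from the study.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

    Excluding gapped columns keeps insertions and deletions from being counted as mismatches; whether that matches the study's own calculation is not stated in the record.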

  2. Microbial ecological succession during municipal solid waste decomposition.

    PubMed

    Staley, Bryan F; de Los Reyes, Francis L; Wang, Ling; Barlaz, Morton A

    2018-04-28

    The decomposition of landfilled refuse proceeds through distinct phases, each defined by varying environmental factors such as volatile fatty acid concentration, pH, and substrate quality. The succession of microbial communities in response to these changing conditions was monitored in a laboratory-scale simulated landfill to minimize measurement difficulties experienced at field scale. 16S rRNA gene sequences retrieved at separate stages of decomposition showed significant succession in both Bacteria and methanogenic Archaea. A majority of Bacteria sequences in landfilled refuse belong to members of the phylum Firmicutes, while Proteobacteria levels fluctuated and Bacteroidetes levels increased as decomposition proceeded. Roughly 44% of archaeal sequences retrieved under conditions of low pH and high acetate were strictly hydrogenotrophic (Methanomicrobiales, Methanobacteriales). Methanosarcina was present at all stages of decomposition. Correspondence analysis showed bacterial population shifts were attributed to carboxylic acid concentration and solids hydrolysis, while archaeal populations were affected to a higher degree by pH. T-RFLP analysis showed specific taxonomic groups responded differently and exhibited unique responses during decomposition, suggesting that species composition and abundance within Bacteria and Archaea are highly dynamic. This study shows landfill microbial demographics are highly variable across both spatial and temporal transects.

  3. Rapid Evolution of Virulence and Drug Resistance in the Emerging Zoonotic Pathogen Streptococcus suis

    PubMed Central

    Holden, Matthew T. G.; Hauser, Heidi; Sanders, Mandy; Ngo, Thi Hoa; Cherevach, Inna; Cronin, Ann; Goodhead, Ian; Mungall, Karen; Quail, Michael A.; Price, Claire; Rabbinowitsch, Ester; Sharp, Sarah; Croucher, Nicholas J.; Chieu, Tran Bich; Thi Hoang Mai, Nguyen; Diep, To Song; Chinh, Nguyen Tran; Kehoe, Michael; Leigh, James A.; Ward, Philip N.; Dowson, Christopher G.; Whatmore, Adrian M.; Chanter, Neil; Iversen, Pernille; Gottschalk, Marcelo; Slater, Josh D.; Smith, Hilde E.; Spratt, Brian G.; Xu, Jianguo; Ye, Changyun; Bentley, Stephen; Barrell, Barclay G.; Schultsz, Constance; Maskell, Duncan J.; Parkhill, Julian

    2009-01-01

    Background Streptococcus suis is a zoonotic pathogen that infects pigs and can occasionally cause serious infections in humans. S. suis infections occur sporadically in humans in Europe and North America, but a recent major outbreak has been described in China with high levels of mortality. The mechanisms of S. suis pathogenesis in humans and pigs are poorly understood. Methodology/Principal Findings The sequencing of whole genomes of S. suis isolates provides opportunities to investigate the genetic basis of infection. Here we describe whole genome sequences of three S. suis strains from the same lineage: one from European pigs, and two from human cases from China and Vietnam. Comparative genomic analysis was used to investigate the variability of these strains. S. suis is phylogenetically distinct from other Streptococcus species for which genome sequences are currently available. Accordingly, ∼40% of the ∼2 Mb genome is unique in comparison to other Streptococcus species. Finer genomic comparisons within the species showed a high level of sequence conservation; virtually all of the genome is common to the S. suis strains. The only exceptions are three ∼90 kb regions, present in the two isolates from humans, composed of integrative conjugative elements and transposons. Carried in these regions are coding sequences associated with drug resistance. In addition, small-scale sequence variation has generated pseudogenes in putative virulence and colonization factors. Conclusions/Significance The genomic inventories of genetically related S. suis strains, isolated from distinct hosts and diseases, exhibit high levels of conservation. However, the genomes provide evidence that horizontal gene transfer has contributed to the evolution of drug resistance. PMID:19603075

  4. Myxobolus cerebralis internal transcribed spacer 1 (ITS-1) sequences support recent spread of the parasite to North America and within Europe

    USGS Publications Warehouse

    Whipps, Christopher M.; El-Matbouli, M.; Hedrick, R.P.; Blazer, V.; Kent, M.L.

    2004-01-01

    Molecular approaches for resolving relationships among the Myxozoa have relied mainly on small subunit (SSU) ribosomal DNA (rDNA) sequence analysis. This region of the gene is generally used for higher phylogenetic studies, and the conservative nature of this gene may make it inadequate for intraspecific comparisons. Previous intraspecific studies of Myxobolus cerebralis based on molecular analyses reported that the sequence of SSU rDNA and the internal transcribed spacer (ITS) were highly conserved in representatives of the parasite from North America and Europe. Considering that the ITS is usually a more variable region than the SSU, we reanalyzed available sequences on GenBank and obtained sequences from other M. cerebralis representatives from the states of California and West Virginia in the USA and from Germany and Russia. With the exception of 7 base pairs, most of the sequence designated as ITS-1 in GenBank was a highly conserved portion of the rDNA near the 3-prime end of the SSU region. Nonetheless, the additional ITS-1 sequences obtained from the available geographic representatives were well conserved. It is unlikely that we would have observed virtually identical ITS-1 sequences between European and American M. cerebralis samples had it spread naturally over time, particularly when compared to the variation seen between isolates of another myxozoan (Kudoa thyrsites) that has most likely spread naturally. These data further support the hypothesis that the current distribution of M. cerebralis in North America is a result of recent introductions followed by dispersal via anthropogenic means, largely through the stocking of infected trout for sport fishing.

  5. Photometric search for variable stars in the young open cluster Berkeley 59

    NASA Astrophysics Data System (ADS)

    Lata, Sneh; Pandey, A. K.; Maheswar, G.; Mondal, Soumen; Kumar, Brijesh

    2011-12-01

    We present the time series photometry of stars located in the extremely young open cluster Berkeley 59. Using the 1.04-m telescope at Aryabhatta Research Institute of Observational Sciences (ARIES), Nainital, we have identified 42 variables in a field of ~13 × 13 arcmin² around the cluster. The probable members of the cluster have been identified using a (V, V-I) colour-magnitude diagram and a (J-H, H-K) colour-colour diagram. 31 variables have been found to be pre-main-sequence stars associated with the cluster. The ages and masses of the pre-main-sequence stars have been derived from the colour-magnitude diagram by fitting theoretical models to the observed data points. The ages of the majority of the probable pre-main-sequence variable candidates range from 1 to 5 Myr. The masses of these pre-main-sequence variable stars have been found to be in the range of ~0.3 to ~3.5 M⊙, and these could be T Tauri stars. The present statistics reveal that about 90 per cent of the T Tauri stars have periods <15 d. The classical T Tauri stars are found to have a larger amplitude than the weak-line T Tauri stars. There is an indication that the amplitude decreases with an increase in mass, which could be due to the dispersal of the discs of relatively massive stars.

  6. Genotypic and phenotypic diversity of Alicyclobacillus acidocaldarius isolates.

    PubMed

    Félix-Valenzuela, L; Guardiola-Avila, I; Burgara-Estrella, A; Ibarra-Zavala, M; Mata-Haro, V

    2015-10-01

    The fruit juice industry recognizes Alicyclobacillus as a major quality control target micro-organism. In this study, we analysed 19 bacterial isolates to identify Alicyclobacillus species by polymerase chain reaction (PCR) and sequencing analyses. Phenotypic and genomic diversity among isolates were investigated by the API 50CHB system and ERIC-PCR (enterobacterial repetitive intergenic consensus-PCR), respectively. All bacterial isolates were identified as Alicyclobacillus acidocaldarius, and almost all showed identical DNA sequences according to their 16S rRNA (rDNA) gene partial sequences. Only a few carbohydrates were fermented by A. acidocaldarius isolates, and there was little variability in the biochemical profile. Genotypic fingerprinting of the A. acidocaldarius isolates showed high diversity, and clusters by ERIC-PCR were distinct from those obtained from the 16S rRNA gene phylogenetic tree. There was no correlation between phenotypic and genotypic variability in the A. acidocaldarius isolates analysed in this study. Detection of Alicyclobacillus strains is imperative in fruit concentrates and juices due to the production of guaiacol. Identification of the genus leads to rejection of the product by the processing industry. However, not all Alicyclobacillus species cause spoilage, hence the importance of differentiating among them. In this study, partial 16S ribosomal RNA sequence alignment allowed the differentiation of species. In addition, ERIC-PCR was introduced for the genotypic characterization of Alicyclobacillus, as an alternative for differentiation among isolates from the same species. © 2015 The Society for Applied Microbiology.

  7. Phylogenetic utility, and variability in structure and content, of complete mitochondrial genomes among genetic lineages of the Hawaiian anchialine shrimp Halocaridina rubra Holthuis 1963 (Atyidae:Decapoda).

    PubMed

    Justice, Joshua L; Weese, David A; Santos, Scott Ross

    2016-07-01

    The Atyidae are caridean shrimp possessing hair-like setae on their claws and are important contributors to ecological services in tropical and temperate fresh and brackish water ecosystems. Complete mitochondrial genomes have only been reported from five of the 449 species in the family, thus limiting understanding of mitochondrial genome evolution and the phylogenetic utility of complete mitochondrial sequences in the Atyidae. Here, comparative analyses of complete mitochondrial genomes from eight genetic lineages of Halocaridina rubra, an atyid endemic to the anchialine ecosystem of the Hawaiian Archipelago, are presented. Although gene number, order, and orientation were syntenic among genomes, three regions were identified and further quantified where conservation was substantially lower: (1) high length and sequence variability in the tRNA-Lys and tRNA-Asp intergenic region; (2) a 317-bp insertion between the NAD6 and CytB genes confined to a single lineage and representing a partial duplication of CytB; and (3) the putative control region. Phylogenetic analyses utilizing complete mitochondrial sequences provided new insights into relationships among the H. rubra genetic lineages, with the topology of one clade correlating to the geologic sequence of the islands. However, deeper nodes in the phylogeny lacked bootstrap support. Overall, our results from H. rubra suggest intra-specific mitochondrial genomic diversity could be underestimated across the Metazoa since the vast majority of complete genomes are from just a single individual of a species.

  8. Failure to produce response variability with reinforcement

    PubMed Central

    Schwartz, Barry

    1982-01-01

    Two experiments attempted to train pigeons to produce variable response sequences. In the first, naive pigeons were exposed to a procedure requiring four pecks on each of two keys in any order, with a reinforcer delivered only if a given sequence was different from the preceding one. In the second experiment, the same pigeons were exposed to this procedure after having been trained successfully to alternate between two specific response sequences. In neither case did any pigeon produce more than a few different sequences or obtain more than 50% of the possible reinforcers. Stereotyped sequences developed even though stereotypy was not reinforced. It is suggested that reinforcers have both hedonic and informative properties and that the hedonic properties are responsible for stereotyped repetition of reinforced responses, even when stereotypy is negatively related to reinforcer delivery. PMID:16812263
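
    The contingency described above (four pecks on each of two keys in any order, reinforced only when the sequence differs from the preceding one) can be sketched in a few lines. With eight pecks, four per key, there are C(8,4) = 70 distinct sequences. The key labels "L" and "R", the function names, and the choice to leave the first trial unscored are assumptions made for illustration; the abstract does not describe how the first trial was handled.

```python
from itertools import combinations
from math import comb


def all_sequences(pecks_per_key: int = 4) -> list[str]:
    """Enumerate every order of `pecks_per_key` pecks on each of two keys."""
    n = 2 * pecks_per_key
    sequences = []
    for left_positions in combinations(range(n), pecks_per_key):
        left = set(left_positions)
        sequences.append("".join("L" if i in left else "R" for i in range(n)))
    return sequences


def reinforced_fraction(trials: list[str]) -> float:
    """Fraction of scored trials whose sequence differs from the preceding one.

    The first trial has no predecessor and is not scored (an assumption).
    """
    if len(trials) < 2:
        raise ValueError("need at least two trials")
    rewarded = sum(cur != prev for prev, cur in zip(trials, trials[1:]))
    return rewarded / (len(trials) - 1)


if __name__ == "__main__":
    print(len(all_sequences()), comb(8, 4))
    # Pure repetition of one sequence is never reinforced under this rule.
    print(reinforced_fraction(["LLLLRRRR"] * 10))
```

    Under this rule, repeating a single sequence earns nothing, which is why the stereotyped responding reported in the abstract is described as negatively related to reinforcer delivery.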

  9. Synthetic spike-in standards for high-throughput 16S rRNA gene amplicon sequencing.

    PubMed

    Tourlousse, Dieter M; Yoshiike, Satowa; Ohashi, Akiko; Matsukura, Satoko; Noda, Naohiro; Sekiguchi, Yuji

    2017-02-28

    High-throughput sequencing of 16S rRNA gene amplicons (16S-seq) has become a widely deployed method for profiling complex microbial communities but technical pitfalls related to data reliability and quantification remain to be fully addressed. In this work, we have developed and implemented a set of synthetic 16S rRNA genes to serve as universal spike-in standards for 16S-seq experiments. The spike-ins represent full-length 16S rRNA genes containing artificial variable regions with negligible identity to known nucleotide sequences, permitting unambiguous identification of spike-in sequences in 16S-seq read data from any microbiome sample. Using defined mock communities and environmental microbiota, we characterized the performance of the spike-in standards and demonstrated their utility for evaluating data quality on a per-sample basis. Further, we showed that staggered spike-in mixtures added at the point of DNA extraction enable concurrent estimation of absolute microbial abundances suitable for comparative analysis. Results also underscored that template-specific Illumina sequencing artifacts may lead to biases in the perceived abundance of certain taxa. Taken together, the spike-in standards represent a novel bioanalytical tool that can substantially improve 16S-seq-based microbiome studies by enabling comprehensive quality control along with absolute quantification. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
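
    The abstract states that spike-ins added at the point of DNA extraction enable estimation of absolute abundances, but does not give the calculation. The sketch below assumes the simplest possible model, a single proportional scaling factor (spike-in copies added per spike-in read observed), and is not necessarily the authors' procedure. All names and numbers are hypothetical.

```python
def copies_per_read(spike_reads: dict[str, int],
                    spike_copies: dict[str, float]) -> float:
    """Scaling factor: total spike-in copies added per spike-in read observed."""
    total_reads = sum(spike_reads.values())
    if total_reads == 0:
        raise ValueError("no spike-in reads observed")
    total_copies = sum(spike_copies[name] for name in spike_reads)
    return total_copies / total_reads


def absolute_abundance(taxon_reads: dict[str, int],
                       spike_reads: dict[str, int],
                       spike_copies: dict[str, float]) -> dict[str, float]:
    """Estimate 16S copies per taxon by proportional scaling (an assumption)."""
    factor = copies_per_read(spike_reads, spike_copies)
    return {taxon: reads * factor for taxon, reads in taxon_reads.items()}


if __name__ == "__main__":
    # Hypothetical staggered spike-in mixture and read counts for one sample.
    spikes_added = {"spike_1": 1e3, "spike_2": 1e4}
    spikes_seen = {"spike_1": 100, "spike_2": 1000}
    sample = {"taxon_A": 50, "taxon_B": 2000}
    print(absolute_abundance(sample, spikes_seen, spikes_added))
```

    A single pooled factor ignores the template-specific sequencing biases the abstract mentions; a per-spike-in comparison of observed and expected reads would be one way to check for them.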

  10. MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments

    PubMed Central

    Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin

    2017-01-01

    Motivation: With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. Results: We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. Availability and implementation: MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27605100
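
    The negative-control operations attributed to the Transmutation tool (scrambling, reversing, complementing, introducing random mutations) are simple string transformations. The sketch below illustrates them in Python; it is not MPRAnator's code or API, and the function names, the seeded random generator, and the DNA alphabet handling are assumptions made for illustration.

```python
import random

# Translation table for complementing DNA, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq: str) -> str:
    """Return the sequence read backwards."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the base-wise complement (A<->T, C<->G)."""
    return seq.translate(_COMPLEMENT)


def scramble(seq: str, rng: random.Random) -> str:
    """Shuffle base order while preserving base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def mutate(seq: str, n: int, rng: random.Random, alphabet: str = "ACGT") -> str:
    """Substitute n distinct positions with a different base chosen at random."""
    if n > len(seq):
        raise ValueError("cannot mutate more positions than the sequence has")
    bases = list(seq)
    for pos in rng.sample(range(len(seq)), n):
        choices = [b for b in alphabet if b != bases[pos].upper()]
        bases[pos] = rng.choice(choices)
    return "".join(bases)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the controls are reproducible
    motif = "TGACGTCA"      # hypothetical input motif
    print(reverse(motif), complement(motif), scramble(motif, rng), mutate(motif, 2, rng))
```

    Scrambling preserves base composition while destroying the motif, whereas mutation changes a controlled number of positions; which control is appropriate depends on the experiment.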

  11. MPRAnator: a web-based tool for the design of massively parallel reporter assay experiments.

    PubMed

    Georgakopoulos-Soares, Ilias; Jain, Naman; Gray, Jesse M; Hemberg, Martin

    2017-01-01

    With the rapid advances in DNA synthesis and sequencing technologies and the continuing decline in the associated costs, high-throughput experiments can be performed to investigate the regulatory role of thousands of oligonucleotide sequences simultaneously. Nevertheless, designing high-throughput reporter assay experiments such as massively parallel reporter assays (MPRAs) and similar methods remains challenging. We introduce MPRAnator, a set of tools that facilitate rapid design of MPRA experiments. With MPRA Motif design, a set of variables provides fine control of how motifs are placed into sequences, thereby allowing the investigation of the rules that govern transcription factor (TF) occupancy. MPRA single-nucleotide polymorphism design can be used to systematically examine the functional effects of single or combinations of single-nucleotide polymorphisms at regulatory sequences. Finally, the Transmutation tool allows for the design of negative controls by permitting scrambling, reversing, complementing or introducing multiple random mutations in the input sequences or motifs. MPRAnator tool set is implemented in Python, Perl and Javascript and is freely available at www.genomegeek.com and www.sanger.ac.uk/science/tools/mpranator. The source code is available on www.github.com/hemberg-lab/MPRAnator/ under the MIT license. The REST API allows programmatic access to MPRAnator using simple URLs. Contact: igs@sanger.ac.uk or mh26@sanger.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  12. Chloroplast chlB gene is required for light-independent chlorophyll accumulation in Chlamydomonas reinhardtii.

    PubMed

    Liu, X Q; Xu, H; Huang, C

    1993-10-01

    Light-independent chlorophyll synthesis occurs in some algae, lower plants, and gymnosperms, but not in angiosperms. We have identified a new chloroplast gene, chlB, that is required for the light-independent accumulation of chlorophyll in the green alga Chlamydomonas reinhardtii. The chlB gene was cloned, sequenced, and then disrupted by performing particle gun-mediated chloroplast transformation. The resulting homoplasmic mutant was unable to accumulate chlorophyll in the dark and thus exhibited a 'yellow-in-the-dark' phenotype. The chlB gene encodes a polypeptide of 688 amino acid residues, and is distinct from two previously characterized chloroplast genes (chlN and chlL) also required for light-independent chlorophyll accumulation in C. reinhardtii. Three unidentified open reading frames in chloroplast genomes of liverwort, black pine, and Chlamydomonas moewusii were also identified as chlB genes, based on their striking sequence similarities to the C. reinhardtii chlB gene. A chlB-like gene is absent in chloroplast genomes of tobacco and rice, consistent with the lack of light-independent chlorophyll synthesis in these plants. Polypeptides encoded by the chloroplast chlB genes also show significant sequence similarities with the bchB gene product of Rhodobacter capsulatus. Comparisons among the chloroplast chlB and the bacterial bchB gene products revealed five highly conserved sequence areas that are interspersed by four stretches of highly variable and probably insertional sequences.

  13. Satellite DNA and cytogenetic evolution: molecular aspects and implications for man. [Kangaroo rats

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hatch, F.T.; Mazrimas, J.

    1977-02-28

    Simple, highly reiterated DNA sequences, often observed in density gradients as satellite DNAs, exist in condensed heterochromatin. This material is predominantly located at chromosomal centromeres, occasionally at telomeres, or intercalated within arms; in a few species it occupies entire chromosome arms. Satellite DNAs are a highly variable component of the genome of most higher eukaryotes, but their functions have remained speculative. The genus of kangaroo rats (Dipodomys) exhibits remarkable interspecies variations in content of three satellite DNAs, consisting of simple sequences 3 to 10 base pairs long, and in species karyotypes. A broad range of diploid-DNA content is correlated with satellite-DNA content. The latter is correlated positively with predominance of biarmed over uniarmed chromosomes (high fundamental number FN) and inversely with two anatomical indices (leg-bone-length ratios) of specialization for the jumping gait. Karyotypic variation is achieved via chromosomal rearrangements, e.g., Robertsonian fusion, C-band heteromorphism, and pericentric inversion. Environmental adaptation is achieved, in part, by reassortment of gene-linkage groups and regulatory controls as a result of the chromosomal rearrangements. The foregoing relationships led to the postulation that highly reiterated DNA sequences play a supragenic, global role in environmental adaptation and the evolution of new species.

  14. Intracranial cerebrospinal fluid spaces imaging using a pulse-triggered three-dimensional turbo spin echo MR sequence with variable flip-angle distribution.

    PubMed

    Hodel, Jérôme; Silvera, Jonathan; Bekaert, Olivier; Rahmouni, Alain; Bastuji-Garin, Sylvie; Vignaud, Alexandre; Petit, Eric; Durning, Bruno; Decq, Philippe

    2011-02-01

    To assess the three-dimensional turbo spin echo with variable flip-angle distribution magnetic resonance sequence (SPACE: Sampling Perfection with Application optimised Contrast using different flip-angle Evolution) for the imaging of intracranial cerebrospinal fluid (CSF) spaces. We prospectively investigated 18 healthy volunteers and 25 patients, 20 with communicating hydrocephalus (CH), five with non-communicating hydrocephalus (NCH), using the SPACE sequence at 1.5T. Volume rendering views of both intracranial and ventricular CSF were obtained for all patients and volunteers. The subarachnoid CSF distribution was qualitatively evaluated on volume rendering views using a four-point scale. The CSF volumes within total, ventricular and subarachnoid spaces were calculated as well as the ratio between ventricular and subarachnoid CSF volumes. Three different patterns of subarachnoid CSF distribution were observed. In healthy volunteers we found narrowed CSF spaces within the occipital area. A diffuse narrowing of the subarachnoid CSF spaces was observed in patients with NCH whereas patients with CH exhibited narrowed CSF spaces within the high midline convexity. The ratios between ventricular and subarachnoid CSF volumes were significantly different among the volunteers, patients with CH and patients with NCH. The assessment of CSF spaces volume and distribution may help to characterise hydrocephalus.

  15. Tomato (Solanum lycopersicum) variety discrimination and hybridization analysis based on the 5S rRNA region.

    PubMed

    Sun, Yan-Lin; Kang, Ho-Min; Kim, Young-Sik; Baek, Jun-Pill; Zheng, Shi-Lin; Xiang, Jin-Jun; Hong, Soon-Kwan

    2014-05-04

    The tomato (Solanum lycopersicum) is a major vegetable crop worldwide. To satisfy popular demand, more than 500 tomato varieties have been bred. However, a clear variety identification has not been found. Thorough understanding of the phylogenetic relationship and hybridization information of tomato varieties is very important for further variety breeding. Thus, in this study, we collected 26 tomato varieties and attempted to distinguish them based on the 5S rRNA region, which is widely used in the determination of phylogenetic relations. Sequence analysis of the 5S rRNA region suggested that a large number of nucleotide variations exist among tomato varieties. These variable nucleotide sites were also informative regarding hybridization. Chromas sequencing of Yellow Mountain View and Seuwiteuking varieties indicated three and one variable nucleotide sites in the non-transcribed spacer (NTS) of the 5S rRNA region showing hybridization, respectively. Based on a phylogenetic tree constructed using the 5S rRNA sequences, we observed that 16 tomato varieties were divided into three groups at 95% similarity. Rubiking and Sseommeoking, Lang Selection Procedure and Seuwiteuking, and Acorn Gold and Yellow Mountain View exhibited very high identity with their partners. This work will aid variety authentication and provides a basis for further tomato variety breeding.

  16. High Genetic Diversity and Novelty in Eukaryotic Plankton Assemblages Inhabiting Saline Lakes in the Qaidam Basin

    PubMed Central

    Wang, Jiali; Wang, Fang; Chu, Limin; Wang, Hao; Zhong, Zhiping; Liu, Zhipei; Gao, Jianyong; Duan, Hairong

    2014-01-01

    Saline lakes are intriguing ecosystems harboring extremely productive microbial communities in spite of their extreme environmental conditions. We performed a comprehensive analysis of the genetic diversity (18S rRNA gene) of the planktonic microbial eukaryotes (nano- and picoeukaryotes) in six different inland saline lakes located in the Qaidam Basin. The novelty level is high, with about 11.23% of the whole dataset showing <90% identity to any previously reported sequence in GenBank. At least 4 operational taxonomic units (OTUs) in mesosaline lakes and up to eighteen OTUs in hypersaline lakes show very low CCM and CEM scores, indicating that these sequences are only distantly related to any existing sequence. Most of the 18S rRNA gene sequence reads obtained in the investigated mesosaline lakes are closely related to the Holozoa group (48.13%), whereas Stramenopiles (26.65%) and Alveolates (10.84%) are the next most common groups. Hypersaline lakes in the Qaidam Basin are also dominated by the Holozoa group, accounting for 26.65% of the total number of sequence reads. Notably, the Chlorophyta group is found in high abundance only in Lake Gasikule (28.00%), whereas it is less represented in other hypersaline lakes such as Gahai (0.50%) and Xiaochaidan (1.15%). Further analysis shows that the compositions of planktonic eukaryotic assemblages are also most variable between different sampling sites in the same lake. Out of the parameters, four show significant correlation to this CCA: altitude, calcium, sodium and potassium concentrations. Overall, this study shows important gaps in the current knowledge about planktonic microbial eukaryotes inhabiting Qaidam Basin (hyper)saline water bodies. The identified diversity and novelty patterns among eukaryotic plankton assemblages in saline lakes are of great importance for understanding and interpreting their ecology and evolution. PMID:25401703

  17. Laying-sequence-specific variation in yolk oestrogen levels, and relationship to plasma oestrogen in female zebra finches (Taeniopygia guttata)

    PubMed Central

    Williams, Tony D.; Ames, Caroline E.; Kiparissis, Yiannis; Wynne-Edwards, Katherine E.

    2005-01-01

    We investigated the relationship between plasma and yolk oestrogens in laying female zebra finches (Taeniopygia guttata) by manipulating plasma oestradiol (E2) levels, via injection of oestradiol-17β, in a sequence-specific manner to maintain chronically high plasma levels for later-developing eggs (contrasting with the endogenous pattern of decreasing plasma E2 concentrations during laying). We report systematic variation in yolk oestrogen concentrations, in relation to laying sequence, similar to that widely reported for androgenic steroids. In sham-manipulated females, yolk E2 concentrations decreased with laying sequence. However, in E2-treated females plasma E2 levels were higher during the period of rapid yolk development of later-laid eggs, compared with control females. As a consequence, we reversed the laying-sequence-specific pattern of yolk E2: in E2-treated females, yolk E2 concentrations increased with laying-sequence. In general therefore, yolk E2 levels were a direct reflection of plasma E2 levels. However, in control females there was some inter-individual variability in the endogenous pattern of plasma E2 levels through the laying cycle which could generate variation in sequence-specific patterns of yolk hormone levels even if these primarily reflect circulating steroid levels. PMID:15695208

  18. Comparative profiling of microbial community of three economically important fishes reared in sea cages under tropical offshore environment.

    PubMed

    Rasheeda, M K; Rangamaran, Vijaya Raghavan; Srinivasan, Senthilkumar; Ramaiah, Sendhil Kumar; Gunasekaran, Rajaprabhu; Jaypal, Santhanakumar; Gopal, Dharani; Ramalingam, Kirubagaran

    2017-08-01

    The present study was undertaken to evaluate the microbial composition of farmed cobia, pompano and milkfish, reared in sea-cages, by culture-independent methods. This study would serve as a basis for assessing the general health of fish, identifying the dominant bacterial species present in the gut for future probiotic work and in early detection of potential pathogens. High-throughput sequencing of V3-V4 hypervariable regions of 16S rDNA on Illumina MiSeq platform facilitated unravelling of composite bacterial population. Analysis of 1.3 million quality-filtered sequences revealed high microbial diversity. Characteristic marine fish gut microbes: Vibrio and Photobacterium spp. showed prevalence in cobia and pompano whereas Pelomonas and Fusobacterium spp. dominated the gut of milkfish. Pompano hindgut with 10,537 operational taxonomic units (OTUs) exhibited the highest alpha-diversity index followed by cobia (10,435) and milkfish (2799). Additionally, unique and shared OTUs in each gut type were identified. Gammaproteobacteria dominated in cobia and pompano while Betaproteobacteria showed prevalence in milkfish. We obtained 96 shared OTUs among the three species though the numbers of reads were highly variable. These differences in microbiota of farmed fish reared in the same environment were presumably due to differences in the gut morphology, physiological behavior and host specificity. Copyright © 2017 Elsevier B.V. All rights reserved.

  19. Change in IgHV Mutational Status of CLL Suggests Origin From Multiple Clones.

    PubMed

    Osman, Afaf; Gocke, Christopher D; Gladstone, Douglas E

    2017-02-01

    Fluorescence in situ hybridization and immunoglobulin (Ig) heavy-chain variable-region (IgHV) mutational status are used to predict outcome in chronic lymphocytic leukemia (CLL). Although DNA aberrations change over time, IgHV sequences and mutational status are considered stable. In a retrospective review, 409 CLL patients, between 2008 and 2015, had IgHV analysis: 56 patients had multiple analyses performed. Seven patients' IgHV results changed: 2 from unmutated to mutated and 5 from mutated to unmutated IgHV sequence. Three concurrently changed their variable heavy-chain sequence. Secondary to allelic exclusion, 2 of the new variable heavy chains produced were biologically nonplausible. The existence of these new nonplausible heavy-chain variable regions suggests either the CLL cancer stem-cell maintains the ability to rearrange a previously silenced IgH allele or more likely that the cancer stem-cell produced at least 2 subclones, suggesting that the CLL cancer stem cell exists before the process of allelic exclusion occurs. Copyright © 2016 Elsevier Inc. All rights reserved.

  20. Variability Studies of Two Prunus-Infecting Fabaviruses with the Aid of High-Throughput Sequencing

    PubMed Central

    Sarkisova, Tatiana; Lenz, Ondřej; Přibylová, Jaroslava; Špak, Josef; Lotos, Leonidas; Beta, Christina; Katsiani, Asimina; Candresse, Thierry

    2018-01-01

    During their lifetime, perennial woody plants are expected to face multiple infection events. Furthermore, multiple genotypes of individual virus species may co-infect the same host. This may eventually lead to a situation where plants harbor complex communities of viral species/strains. Using high-throughput sequencing, we describe co-infection of sweet and sour cherry trees with diverse genomic variants of two closely related viruses, namely prunus virus F (PrVF) and cherry virus F (CVF). Both viruses are most homologous to members of the Fabavirus genus (Secoviridae family). The comparison of CVF and PrVF RNA2 genomic sequences suggests that the two viruses may significantly differ in their expression strategy. Indeed, similar to comoviruses, the smaller genomic segment of PrVF, RNA2, may be translated in two collinear proteins while CVF likely expresses only the shorter of these two proteins. Linked with the observation that identity levels between the coat proteins of these two viruses are significantly below the family species demarcation cut-off, these findings support the idea that CVF and PrVF represent two separate Fabavirus species. PMID:29670059

  1. Task planning with uncertainty for robotic systems. Thesis

    NASA Technical Reports Server (NTRS)

    Cao, Tiehua

    1993-01-01

    In a practical robotic system, it is important to represent and plan sequences of operations and to be able to choose an efficient sequence from them for a specific task. During the generation and execution of task plans, different kinds of uncertainty may occur and erroneous states need to be handled to ensure the efficiency and reliability of the system. An approach to task representation, planning, and error recovery for robotic systems is demonstrated. Our approach to task planning is based on an AND/OR net representation, which is then mapped to a Petri net representation of all feasible geometric states and associated feasibility criteria for net transitions. Task decomposition of robotic assembly plans based on this representation is performed on the Petri net for robotic assembly tasks, and the inheritance of properties of liveness, safeness, and reversibility at all levels of decomposition is explored. This approach provides a framework for robust execution of tasks through the properties of traceability and viability. Uncertainty in robotic systems is modeled by local fuzzy variables, fuzzy marking variables, and global fuzzy variables which are incorporated in fuzzy Petri nets. Analysis of properties and reasoning about uncertainty are investigated using fuzzy reasoning structures built into the net. Two applications of fuzzy Petri nets, robot task sequence planning and sensor-based error recovery, are explored. In the first application, the search space for feasible and complete task sequences with correct precedence relationships is reduced via the use of global fuzzy variables in reasoning about subgoals. In the second application, sensory verification operations are modeled by mutually exclusive transitions to reason about local and global fuzzy variables on-line and automatically select a retry or an alternative error recovery sequence when errors occur. Task sequencing and task execution with error recovery capability for one and multiple soft components in robotic systems are investigated.
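
    The Petri net machinery this record refers to can be illustrated with a minimal sketch. It models an ordinary (crisp) Petri net only, not the fuzzy Petri nets of the thesis, and every name in it (the Transition class, the place names, the example transitions) is a hypothetical choice made for illustration. It shows two mutually exclusive transitions competing for the same token, with a retry transition acting as a simple error-recovery path.

```python
from typing import Dict

Marking = Dict[str, int]  # place name -> number of tokens


class Transition:
    """A Petri net transition with weighted input and output arcs (hypothetical helper)."""

    def __init__(self, name: str, inputs: Marking, outputs: Marking) -> None:
        self.name = name
        self.inputs = inputs    # tokens consumed from each input place
        self.outputs = outputs  # tokens produced in each output place

    def enabled(self, marking: Marking) -> bool:
        # Enabled when every input place holds at least the required tokens.
        return all(marking.get(p, 0) >= n for p, n in self.inputs.items())

    def fire(self, marking: Marking) -> Marking:
        # Return the new marking; the input marking is left unchanged.
        if not self.enabled(marking):
            raise ValueError(f"transition {self.name!r} is not enabled")
        new = dict(marking)
        for p, n in self.inputs.items():
            new[p] = new.get(p, 0) - n
        for p, n in self.outputs.items():
            new[p] = new.get(p, 0) + n
        return new


# Hypothetical assembly step: a successful insertion and a failed one are
# mutually exclusive because both need the single token in "part_grasped".
insert_ok = Transition("insert_ok", {"part_grasped": 1}, {"part_inserted": 1})
insert_fail = Transition("insert_fail", {"part_grasped": 1}, {"error_detected": 1})
retry = Transition("retry", {"error_detected": 1}, {"part_grasped": 1})
```

    Firing insert_fail from a marking with one token in part_grasped disables insert_ok until retry returns the token, which is the sense in which the two outcomes exclude each other.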

  2. DNAAlignEditor: DNA alignment editor tool

    PubMed Central

    Sanchez-Villeda, Hector; Schroeder, Steven; Flint-Garcia, Sherry; Guill, Katherine E; Yamasaki, Masanori; McMullen, Michael D

    2008-01-01

    Background With advances in DNA re-sequencing methods and Next-Generation parallel sequencing approaches, there has been a large increase in genomic efforts to define and analyze the sequence variability present among individuals within a species. For very polymorphic species such as maize, this has led to a need for intuitive, user-friendly software that aids the biologist, often with naïve programming capability, in tracking, editing, displaying, and exporting multiple individual sequence alignments. To fill this need we have developed a novel DNA alignment editor. Results We have generated a nucleotide sequence alignment editor (DNAAlignEditor) that provides an intuitive, user-friendly interface for manual editing of multiple sequence alignments with functions for input, editing, and output of sequence alignments. The color-coding of nucleotide identity and the display of associated quality scores aid in the manual alignment editing process. DNAAlignEditor works as a client/server tool having two main components: a relational database that collects the processed alignments and a user interface connected to the database through universal data access connectivity drivers. DNAAlignEditor can be used either as a stand-alone application or as a network application with multiple users concurrently connected. Conclusion We anticipate that this software will be of general interest to biologists and population geneticists in editing DNA sequence alignments and analyzing natural sequence variation regardless of species, and will be particularly useful for manual alignment editing of sequences in species with high levels of polymorphism. PMID:18366684

  3. Delta Scuti Variables

    NASA Astrophysics Data System (ADS)

    Handler, Gerald

    2009-09-01

    We review recent research on Delta Scuti stars from an observer's viewpoint. First, some signposts helping to find the way through the Delta Scuti jungle are placed. Then, some problems in studying individual pulsators in the framework of asteroseismology are given before a view on how the study of these variables has benefited (or not) from past and present high-precision asteroseismic space missions is presented. Some possible pitfalls in the analysis of data with a large dynamical range in pulsational amplitudes are pointed out, and a strategy to optimize the outcome of asteroseismic studies of Delta Scuti stars is suggested. We continue with some views on "hybrid" pulsators and interesting individual High Amplitude Delta Scuti stars, and then take a look at Delta Scuti stars in stellar systems of several different kinds. Recent results on pre-main sequence Delta Scuti stars are discussed as are questions related to the instability strip of these variables. Finally, some remarkable new theoretical results are highlighted before, instead of a set of classical conclusions, questions to be solved in the future are raised.

  4. Expansion of the Preimmune Antibody Repertoire by Junctional Diversity in Bos taurus

    PubMed Central

    Liljavirta, Jenni; Niku, Mikael; Pessa-Morikawa, Tiina; Ekman, Anna; Iivanainen, Antti

    2014-01-01

    Cattle have a limited range of immunoglobulin genes which are further diversified by antigen independent somatic hypermutation in fetuses. Junctional diversity generated during somatic recombination contributes to antibody diversity but its relative significance has not been comprehensively studied. We have investigated the importance of terminal deoxynucleotidyl transferase (TdT) -mediated junctional diversity to the bovine immunoglobulin repertoire. We also searched for new bovine heavy chain diversity (IGHD) genes as the information of the germline sequences is essential to define the junctional boundaries between gene segments. New heavy chain variable genes (IGHV) were explored to address the gene usage in the fetal recombinations. Our bioinformatics search revealed five new IGHD genes, which included the longest IGHD reported so far, 154 bp. By genomic sequencing we found 26 new IGHV sequences that represent potentially new IGHV genes or allelic variants. Sequence analysis of immunoglobulin heavy chain cDNA libraries of fetal bone marrow, ileum and spleen showed 0 to 36 nontemplated N-nucleotide additions between variable, diversity and joining genes. A maximum of 8 N nucleotides were also identified in the light chains. The junctional base profile was biased towards A and T nucleotide additions (64% in heavy chain VD, 52% in heavy chain DJ and 61% in light chain VJ junctions) in contrast to the high G/C content which is usually observed in mice. Sequence analysis also revealed extensive exonuclease activity, providing additional diversity. B-lymphocyte specific TdT expression was detected in bovine fetal bone marrow by reverse transcription-qPCR and immunofluorescence. These results suggest that TdT-mediated junctional diversity and exonuclease activity contribute significantly to the size of the cattle preimmune antibody repertoire already in the fetal period. PMID:24926997

  5. Metagenomic Analysis of Milk of Healthy and Mastitis-Suffering Women.

    PubMed

    Jiménez, Esther; de Andrés, Javier; Manrique, Marina; Pareja-Tobes, Pablo; Tobes, Raquel; Martínez-Blanch, Juan F; Codoñer, Francisco M; Ramón, Daniel; Fernández, Leónides; Rodríguez, Juan M

    2015-08-01

    Some studies have been conducted to assess the composition of the bacterial communities inhabiting human milk, but they did not evaluate the presence of other microorganisms, such as fungi, archaea, protozoa, or viruses. This study aimed to compare the metagenome of human milk samples provided by healthy and mastitis-suffering women. DNA was isolated from human milk samples collected from 10 healthy women and 10 women with symptoms of lactational mastitis. Shotgun libraries from total extracted DNA were constructed and the libraries were sequenced by 454 pyrosequencing. The amount of human DNA sequences was ≥ 90% in all the samples. Among the bacterial sequences, the predominant phyla were Proteobacteria, Firmicutes, and Bacteroidetes. The healthy core microbiome included the genera Staphylococcus, Streptococcus, Bacteroides, Faecalibacterium, Ruminococcus, Lactobacillus, and Propionibacterium. At the species level, a high degree of inter-individual variability was observed among healthy women. In contrast, Staphylococcus aureus clearly dominated the microbiome in the samples from the women with acute mastitis whereas high increases in Staphylococcus epidermidis-related reads were observed in the milk of those suffering from subacute mastitis. Fungal and protozoa-related reads were identified in most of the samples, whereas Archaea reads were absent in samples from women with mastitis. Some viral-related sequence reads were also detected. Human milk contains a complex microbial metagenome constituted by the genomes of bacteria, archaea, viruses, fungi, and protozoa. In mastitis cases, the milk microbiome reflects a loss of bacterial diversity and a high increase of the sequences related to the presumptive etiological agents. © The Author(s) 2015.

  6. A Statistical Test of Walrasian Equilibrium by Means of Complex Networks Theory

    NASA Astrophysics Data System (ADS)

    Bargigli, Leonardo; Viaggiu, Stefano; Lionetto, Andrea

    2016-10-01

    We represent an exchange economy in terms of statistical ensembles for complex networks by introducing the concept of market configuration. This is defined as a sequence of nonnegative discrete random variables {w_{ij}} describing the flow of a given commodity from agent i to agent j. This sequence can be arranged in a nonnegative matrix W which we can regard as the representation of a weighted and directed network or digraph G. Our main result consists in showing that general equilibrium theory imposes highly restrictive conditions upon market configurations, which are in most cases not fulfilled by real markets. An explicit example with reference to the e-MID interbank credit market is provided.
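
    The correspondence between a market configuration and a weighted digraph described in this record can be sketched as follows. This is only an illustration under stated assumptions: the function names and the 3-agent example matrix are hypothetical, and the statistical-ensemble test of the paper is not reproduced. The sketch checks that W is a square matrix of nonnegative integers, lists the directed edges (i, j, w_ij) with positive flow, and computes each agent's total outflow and inflow.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def validate_configuration(W: Matrix) -> None:
    """Check that W is square with nonnegative integer entries."""
    n = len(W)
    for row in W:
        if len(row) != n:
            raise ValueError("W must be a square matrix")
        for w in row:
            if not isinstance(w, int) or w < 0:
                raise ValueError("entries of W must be nonnegative integers")


def edges(W: Matrix) -> List[Tuple[int, int, int]]:
    """Directed weighted edges (i, j, w_ij) for every positive flow from i to j."""
    validate_configuration(W)
    return [(i, j, w) for i, row in enumerate(W) for j, w in enumerate(row) if w > 0]


def out_strength(W: Matrix) -> List[int]:
    """Total flow leaving each agent (row sums of W)."""
    validate_configuration(W)
    return [sum(row) for row in W]


def in_strength(W: Matrix) -> List[int]:
    """Total flow received by each agent (column sums of W)."""
    validate_configuration(W)
    return [sum(col) for col in zip(*W)]


# Hypothetical 3-agent configuration: agent 0 sends 2 units to agent 1, etc.
W_example = [
    [0, 2, 0],
    [1, 0, 3],
    [0, 0, 0],
]
```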

  7. The Structure of Branching Onsets and Rising Diphthongs: Evidence from the Acquisition of French and Spanish

    ERIC Educational Resources Information Center

    Kehoe, Margaret; Hilaire-Debove, Geraldine; Demuth, Katherine; Lleo, Conxita

    2008-01-01

    Consonant-glide-vowel (CGV) sequences are represented differently across languages. In some languages, the CG sequence is represented as a branching onset; in other languages, the GV sequence is represented as a rising diphthong. Given variable syllabification across languages, this study examines how young children represent CGV sequences. In…

  8. Sequence Comparisons of Odorant Receptors among Tortricid Moths Reveal Different Rates of Molecular Evolution among Family Members

    PubMed Central

    Carraher, Colm; Authier, Astrid; Steinwender, Bernd; Newcomb, Richard D.

    2012-01-01

    In insects, odorant receptors detect volatile cues involved in behaviours such as mate recognition, food location and oviposition. We have investigated the evolution of three odorant receptors from five species within the moth genera Ctenopseustis and Planotortrix, family Tortricidae, which fall into distinct clades within the odorant receptor multigene family. One receptor is the orthologue of the co-receptor Or83b, now known as Orco (OR2), and encodes the obligate ion channel subunit of the receptor complex. In comparison, the other two receptors, OR1 and OR3, are ligand-binding receptor subunits, activated by volatile compounds produced by plants - methyl salicylate and citral, respectively. Rates of sequence evolution at non-synonymous sites were significantly higher in OR1 compared with OR2 and OR3. Within the dataset OR1 contains 109 variable amino acid positions that are distributed evenly across the entire protein including transmembrane helices, loop regions and termini, while OR2 and OR3 contain 18 and 16 variable sites, respectively. OR2 shows a high level of amino acid conservation as expected due to its essential role in odour detection; however we found unexpected differences in the rate of evolution between two ligand-binding odorant receptors, OR1 and OR3. OR3 shows high sequence conservation suggestive of a conserved role in odour reception, whereas the higher rate of evolution observed in OR1, particularly at non-synonymous sites, may be suggestive of relaxed constraint, perhaps associated with the loss of an ancestral role in sex pheromone reception. PMID:22701634

  9. Characterization and transferability of microsatellite markers of the cultivated peanut (Arachis hypogaea)

    PubMed Central

    Gimenes, Marcos A; Hoshino, Andrea A; Barbosa, Andrea VG; Palmieri, Dario A; Lopes, Catalina R

    2007-01-01

    Background The genus Arachis includes Arachis hypogaea (cultivated peanut) and wild species that are used in peanut breeding or as forage. Molecular markers have been employed in several studies of this genus, but microsatellite markers have only been used in few investigations. Microsatellites are very informative and are useful to assess genetic variability, analyze mating systems and in genetic mapping. The objectives of this study were to develop A. hypogaea microsatellite loci and to evaluate the transferability of these markers to other Arachis species. Results Thirteen loci were isolated and characterized using 16 accessions of A. hypogaea. The level of variation found in A. hypogaea using microsatellites was higher than with other markers. Cross-transferability of the markers was also high. Sequencing of the fragments amplified using the primer pair Ah11 from 17 wild Arachis species showed that almost all wild species had similar repeated sequence to the one observed in A. hypogaea. Sequence data suggested that there is no correlation between taxonomic relationship of a wild species to A. hypogaea and the number of repeats found in its microsatellite loci. Conclusion These results show that microsatellite primer pairs from A. hypogaea have multiple uses. A higher level of variation among A. hypogaea accessions can be detected using microsatellite markers in comparison to other markers, such as RFLP, RAPD and AFLP. The microsatellite primers of A. hypogaea showed a very high rate of transferability to other species of the genus. These primer pairs provide important tools to evaluate the genetic variability and to assess the mating system in Arachis species. PMID:17326826

  10. Characterization of 17 chaperone-usher fimbriae encoded by Proteus mirabilis reveals strong conservation

    PubMed Central

    Kuan, Lisa; Schaffer, Jessica N.; Zouzias, Christos D.

    2014-01-01

    Proteus mirabilis is a Gram-negative enteric bacterium that causes complicated urinary tract infections, particularly in patients with indwelling catheters. Sequencing of clinical isolate P. mirabilis HI4320 revealed the presence of 17 predicted chaperone-usher fimbrial operons. We classified these fimbriae into three groups by their genetic relationship to other chaperone-usher fimbriae. Sixteen of these fimbriae are encoded by all seven currently sequenced P. mirabilis genomes. The predicted protein sequence of the major structural subunit for 14 of these fimbriae was highly conserved (≥95 % identity), whereas three other structural subunits (Fim3A, UcaA and Fim6A) were variable. Further examination of 58 clinical isolates showed that 14 of the 17 predicted major structural subunit genes of the fimbriae were present in most strains (>85 %). Transcription of the predicted major structural subunit genes for all 17 fimbriae was measured under different culture conditions designed to mimic conditions in the urinary tract. The majority of the fimbrial genes were induced during stationary phase, static culture or colony growth when compared to exponential-phase aerated culture. Major structural subunit proteins for six of these fimbriae were detected using MS of proteins sheared from the surface of broth-cultured P. mirabilis, demonstrating that this organism may produce multiple fimbriae within a single culture. The high degree of conservation of P. mirabilis fimbriae stands in contrast to uropathogenic Escherichia coli and Salmonella enterica, which exhibit greater variability in their fimbrial repertoires. These findings suggest there may be evolutionary pressure for P. mirabilis to maintain a large fimbrial arsenal. PMID:24809384

  11. The Bias Associated with Amplicon Sequencing Does Not Affect the Quantitative Assessment of Bacterial Community Dynamics

    PubMed Central

    Figuerola, Eva L. M.; Erijman, Leonardo

    2014-01-01

    The performance of two sets of primers targeting variable regions of the 16S rRNA gene V1–V3 and V4 was compared in their ability to describe changes of bacterial diversity and temporal turnover in full-scale activated sludge. Duplicate sets of high-throughput amplicon sequencing data of the two 16S rRNA regions shared a collection of core taxa that were observed across a series of twelve monthly samples, although the relative abundance of each taxon was substantially different between regions. A case in point was the changes in the relative abundance of filamentous bacteria Thiothrix, which caused a large effect on diversity indices, but only in the V1–V3 data set. Yet the relative abundance of Thiothrix in the amplicon sequencing data from both regions correlated with the estimation of its abundance determined using fluorescence in situ hybridization. In nonmetric multidimensional analysis samples were distributed along the first ordination axis according to the sequenced region rather than according to sample identities. The dynamics of microbial communities indicated that V1–V3 and the V4 regions of the 16S rRNA gene yielded comparable patterns of: 1) the changes occurring within the communities along fixed time intervals, 2) the slow turnover of activated sludge communities and 3) the rate of species replacement calculated from the taxa–time relationships. The temperature was the only operational variable that showed significant correlation with the composition of bacterial communities over time for the sets of data obtained with both pairs of primers. In conclusion, we show that despite the bias introduced by amplicon sequencing, the variable regions V1–V3 and V4 can be confidently used for the quantitative assessment of bacterial community dynamics, and provide a proper qualitative account of general taxa in the community, especially when the data are obtained over a convenient time window rather than at a single time point. PMID:24923665

  12. A study of the relationships of cultivated peanut (Arachis hypogaea) and its most closely related wild species using intron sequences and microsatellite markers

    PubMed Central

    Moretzsohn, Márcio C.; Gouvea, Ediene G.; Inglis, Peter W.; Leal-Bertioli, Soraya C. M.; Valls, José F. M.; Bertioli, David J.

    2013-01-01

    Background and Aims The genus Arachis contains 80 described species. Section Arachis is of particular interest because it includes cultivated peanut, an allotetraploid, and closely related wild species, most of which are diploids. This study aimed to analyse the genetic relationships of multiple accessions of section Arachis species using two complementary methods. Microsatellites allowed the analysis of inter- and intraspecific variability. Intron sequences from single-copy genes allowed phylogenetic analysis including the separation of the allotetraploid genome components. Methods Intron sequences and microsatellite markers were used to reconstruct phylogenetic relationships in section Arachis through maximum parsimony and genetic distance analyses. Key Results Although high intraspecific variability was evident, there was good support for most species. However, some problems were revealed, notably a probable polyphyletic origin for A. kuhlmannii. The validity of the genome groups was well supported. The F, K and D genomes grouped close to the A genome group. The 2n = 18 species grouped closer to the B genome group. The phylogenetic tree based on the intron data strongly indicated that A. duranensis and A. ipaënsis are the ancestors of A. hypogaea and A. monticola. Intron nucleotide substitutions allowed the ages of divergences of the main genome groups to be estimated at a relatively recent 2·3–2·9 million years ago. This age and the number of species described indicate a much higher speciation rate for section Arachis than for legumes in general. Conclusions The analyses revealed relationships between the species and genome groups and showed a generally high level of intraspecific genetic diversity. The improved knowledge of species relationships should facilitate the utilization of wild species for peanut improvement. The estimates of speciation rates in section Arachis are high, but not unprecedented. We suggest these high rates may be linked to the peculiar reproductive biology of Arachis. PMID:23131301

  13. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms.

    PubMed

    Zhang, Wei; Qi, Weihong; Albert, Thomas J; Motiwala, Alifiya S; Alland, David; Hyytia-Trees, Eija K; Ribot, Efrain M; Fields, Patricia I; Whittam, Thomas S; Swaminathan, Bala

    2006-06-01

    Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary beta-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens.

  14. Probing genomic diversity and evolution of Escherichia coli O157 by single nucleotide polymorphisms

    PubMed Central

    Zhang, Wei; Qi, Weihong; Albert, Thomas J.; Motiwala, Alifiya S.; Alland, David; Hyytia-Trees, Eija K.; Ribot, Efrain M.; Fields, Patricia I.; Whittam, Thomas S.; Swaminathan, Bala

    2006-01-01

    Infections by Shiga toxin-producing Escherichia coli O157:H7 (STEC O157) are the predominant cause of bloody diarrhea and hemolytic uremic syndrome in the United States. In silico comparison of the two complete STEC O157 genomes (Sakai and EDL933) revealed a strikingly high level of sequence identity in orthologous protein-coding genes, limiting the use of nucleotide sequences to study the evolution and epidemiology of this bacterial pathogen. To systematically examine single nucleotide polymorphisms (SNPs) at a genome scale, we designed comparative genome sequencing microarrays and analyzed 1199 chromosomal genes (a total of 1,167,948 bp) and 92,721 bp of the large virulence plasmid (pO157) of eleven outbreak-associated STEC O157 strains. We discovered 906 SNPs in 523 chromosomal genes and observed a high level of DNA polymorphisms among the pO157 plasmids. Based on a uniform rate of synonymous substitution for Escherichia coli and Salmonella enterica (4.7 × 10⁻⁹ per site per year), we estimate that the most recent common ancestor of the contemporary β-glucuronidase-negative, non-sorbitol-fermenting STEC O157 strains existed ca. 40 thousand years ago. The phylogeny of the STEC O157 strains based on the informative synonymous SNPs was compared to the maximum parsimony trees inferred from pulsed-field gel electrophoresis and multilocus variable numbers of tandem repeats analysis. The topological discrepancies indicate that, in contrast to the synonymous mutations, parts of STEC O157 genomes have evolved through different mechanisms with highly variable divergence rates. The SNP loci reported here will provide useful genetic markers for developing high-throughput methods for fine-resolution genotyping of STEC O157. Functional characterization of nucleotide polymorphisms should shed new insights on the evolution, epidemiology, and pathogenesis of STEC O157 and related pathogens. PMID:16606700
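
    The dating step mentioned in this record rests on a molecular-clock calculation, which can be sketched in a few lines. The abstract gives only the rate (4.7 × 10⁻⁹ synonymous substitutions per site per year) and the resulting estimate; it does not state the divergence value or the exact formula used, so the sketch below assumes the common pairwise form t = d / (2r), where d is the synonymous divergence per site between two lineages and the factor 2 reflects substitutions accumulating along both. The function name and the divergence value in the usage line are hypothetical.

```python
def divergence_time_years(d_syn: float, rate_per_site_per_year: float) -> float:
    """Time to the common ancestor under a strict molecular clock.

    Assumes t = d / (2 * r): divergence d accumulates along both lineages
    at the same rate r. This is a generic textbook form, not necessarily
    the procedure used in the study.
    """
    if d_syn < 0:
        raise ValueError("divergence must be nonnegative")
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return d_syn / (2.0 * rate_per_site_per_year)


# Rate quoted in the abstract; the divergence value is purely illustrative.
RATE = 4.7e-9
HYPOTHETICAL_D_SYN = 1.0e-4
t_example = divergence_time_years(HYPOTHETICAL_D_SYN, RATE)
```

    Because the estimate scales linearly with d and inversely with r, any uncertainty in the assumed rate carries over proportionally to the inferred age.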

  15. Immunoglobulin from Antarctic fish species of Rajidae family.

    PubMed

    Coscia, Maria Rosaria; Cocca, Ennio; Giacomelli, Stefano; Cuccaro, Fausta; Oreste, Umberto

    2012-03-01

    Immunoglobulins (Ig) of Chondroichthyes have been extensively studied in sharks; in contrast, in skates investigations on Ig remain scarce and fragmentary despite the high occurrence of skates in all of the major oceans of the world. To focus on Rajidae Igμ, the most abundant heavy chain isotype, we have chosen the Antarctic species Bathyraja eatonii, Bathyraja albomaculata, Bathyraja brachyurops, and Amblyraja georgiana which live at high latitudes in the Southern Ocean, and at very low temperatures. We prepared mRNA from the spleen of individuals of each species and performed RT-PCR experiments using two oligonucleotides designed on the alignment of various elasmobranch Igμ heavy chain sequences available in GenBank. The PCR products, about 1400-nt long, were cloned and sequenced. Nucleotide sequence identities calculated for the constant region domains ranged from 88.5% to 97.5% between species, and from 91.1% to 99.7% within species. In a distance tree, including also Raja erinacea sequences, two major branches were obtained, one containing Arhynchobatinae sequences, the other one Rajinae sequences. Four presumptive D gene segments were identified in the region of the VH/D/JH recombination; two different D segments were often found in the same sequence. Moreover, 5-15 genomic fragments of different lengths, carrying the gene locus encoding Igμ chain were revealed by Southern blotting analysis. B. eatonii amino acid sequences were analyzed for the positional diversity by Shannon entropy analysis, showing CH4 as the most conserved domain, and CH3 as the most variable one. B. eatonii CDR3 region length varied between 11 and 15 amino acid residues; the mean length (13.4 aa) was greater than that of Leucoraja eglanteria sequences (7.7 aa). An alignment of representative sequences of Antarctic species and R. erinacea showed that more cysteine residues not involved in the intradomain disulfide bridges were present in Antarctic species. 
Copyright © 2011 Elsevier B.V. All rights reserved.
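
    The positional diversity analysis mentioned above relies on Shannon entropy computed per alignment column. The following is a minimal sketch of that calculation, assuming equal-length aligned sequences and treating every character (including any gap symbol) as an ordinary state. The function names and the toy alignment are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    total = len(column)
    counts = Counter(column)
    # H = sum over residues of p * log2(1 / p)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def positional_entropy(alignment):
    """Entropy for every column of a list of equal-length aligned sequences."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [shannon_entropy(col) for col in zip(*alignment)]


if __name__ == "__main__":
    # Hypothetical four-sequence alignment, for illustration only.
    toy_alignment = ["ACDK", "ACEK", "ACFR", "ACGR"]
    for position, h in enumerate(positional_entropy(toy_alignment), start=1):
        print(f"position {position}: {h:.2f} bits")
```

    A fully conserved column has entropy 0, and higher values indicate more variable positions; averaging the per-column values over a domain gives one way to rank domains from most conserved to most variable.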

  16. Mutational landscape of antibody variable domains reveals a switch modulating the interdomain conformational dynamics and antigen binding

    PubMed Central

    Koenig, Patrick; Lee, Chingwei V.; Walters, Benjamin T.; Janakiraman, Vasantharajan; Stinson, Jeremy; Patapoff, Thomas W.; Fuh, Germaine

    2017-01-01

    Somatic mutations within the antibody variable domains are critical to the immense capacity of the immune repertoire. Here, via a deep mutational scan, we dissect how mutations at all positions of the variable domains of a high-affinity anti-VEGF antibody G6.31 impact its antigen-binding function. The resulting mutational landscape demonstrates that large portions of antibody variable domain positions are open to mutation, and that beneficial mutations can be found throughout the variable domains. We determine the role of one antigen-distal light chain position 83, demonstrating that mutation at this site optimizes both antigen affinity and thermostability by modulating the interdomain conformational dynamics of the antigen-binding fragment. Furthermore, by analyzing a large number of human antibody sequences and structures, we demonstrate that somatic mutations occur frequently at position 83, with corresponding domain conformations observed for G6.31. Therefore, the modulation of interdomain dynamics represents an important mechanism during antibody maturation in vivo. PMID:28057863

  17. Major histocompatibility complex variation in the endangered Przewalski's horse.

    PubMed Central

    Hedrick, P W; Parker, K M; Miller, E L; Miller, P S

    1999-01-01

    The major histocompatibility complex (MHC) is a fundamental part of the vertebrate immune system, and the high variability in many MHC genes is thought to play an essential role in recognition of parasites. The Przewalski's horse is extinct in the wild and all the living individuals descend from 13 founders, most of whom were captured around the turn of the century. One of the primary genetic concerns in endangered species is whether they have ample adaptive variation to respond to novel selective factors. In examining 14 Przewalski's horses that are broadly representative of the living animals, we found six different class II DRB major histocompatibility sequences. The sequences showed extensive nonsynonymous variation, concentrated in the putative antigen-binding sites, and little synonymous variation. Individuals had from two to four sequences as determined by single-stranded conformation polymorphism (SSCP) analysis. On the basis of the SSCP data, phylogenetic analysis of the nucleotide sequences, and segregation in a family group, we conclude that four of these sequences are from one gene (although one sequence codes for a nonfunctional allele because it contains a stop codon) and two other sequences are from another gene. The position of the stop codon is at the same amino-acid position as in a closely related sequence from the domestic horse. Because other organisms have extensive variation at homologous loci, the Przewalski's horse may have quite low variation in this important adaptive region. PMID:10430594

  18. Method for altering antibody light chain interactions

    DOEpatents

    Stevens, Fred J.; Stevens, Priscilla Wilkins; Raffen, Rosemarie; Schiffer, Marianne

    2002-01-01

    A method for recombinant antibody subunit dimerization including modifying at least one codon of a nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in the interface segment of the light polypeptide variable region, the charged amino acid having a first polarity; and modifying at least one codon of the nucleic acid sequence to replace an amino acid occurring naturally in the antibody with a charged amino acid at a position in an interface segment of the heavy polypeptide variable region corresponding to a position in the light polypeptide variable region, the charged amino acid having a second polarity opposite the first polarity. Nucleic acid sequences which code for novel light chain proteins, the latter of which are used in conjunction with the inventive method, are also provided.

  19. Isotropic 3-D T2-weighted spin-echo for abdominal and pelvic MRI in children.

    PubMed

    Dias, Sílvia Costa; Ølsen, Oystein E

    2012-11-01

    MRI has a fundamental role in paediatric imaging. The T2-weighted fast/turbo spin-echo sequence is important because it has high signal-to-noise ratio compared to gradient-echo sequences. It is usually acquired as 2-D sections in one or more planes. Volumetric spin-echo has until recently only been possible with very long echo times due to blurring of the soft-tissue contrast with long echo trains. A new 3-D spin-echo sequence uses variable flip angles to overcome this problem. It may reproduce useful soft-tissue contrast, with improved spatial resolution. Its isotropic capability allows subsequent reconstruction in standard, curved or arbitrary planes. It may be particularly useful for visualisation of small lesions, or if large lesions distort the usual anatomical relations. We present clinical examples, describe the technical parameters and discuss some potential artefacts and optimisation of image quality.

  20. Mutation of domain III and domain VI in L gene conserved domain of Nipah virus

    NASA Astrophysics Data System (ADS)

    Jalani, Siti Aishah; Ibrahim, Nazlina

    2016-11-01

    Nipah virus (NiV) is the etiologic agent of a respiratory illness and causes fatal encephalitis in humans. The NiV L protein subunit is thought to be responsible for the majority of enzymatic activities involved in viral transcription and replication. The L protein, which is the viral RNA-dependent RNA polymerase, has high sequence homology among negative-sense RNA viruses. In negative-stranded RNA viruses, six conserved domains (domains I-VI) have been determined on the basis of sequence alignment. The domains are separated by variable regions, which suggests that the structure consists of concatenated functional domains. To directly address the roles of domains III and VI, site-directed mutations were constructed by the substitution of bases at positions 2497, 2500, 5528 and 5532. Each mutated L gene can be used in future studies to test its expression by in vitro translation.

  1. Sequence Variability and Geographic Distribution of Lassa Virus, Sierra Leone

    PubMed Central

    Stockelman, Michael G.; Moses, Lina M.; Park, Matthew; Stenger, David A.; Ansumana, Rashid; Bausch, Daniel G.; Lin, Baochuan

    2015-01-01

    Lassa virus (LASV) is endemic to parts of West Africa and causes highly fatal hemorrhagic fever. The multimammate rat (Mastomys natalensis) is the only known reservoir of LASV. Most human infections result from zoonotic transmission. The very diverse LASV genome has 4 major lineages associated with different geographic locations. We used reverse transcription PCR and resequencing microarrays to detect LASV in 41 of 214 samples from rodents captured at 8 locations in Sierra Leone. Phylogenetic analysis of partial sequences of nucleoprotein (NP), glycoprotein precursor (GPC), and polymerase (L) genes showed 5 separate clades within lineage IV of LASV in this country. The sequence diversity was higher than previously observed; mean diversity was 7.01% for nucleoprotein gene at the nucleotide level. These results may have major implications for designing diagnostic tests and therapeutic agents for LASV infections in Sierra Leone. PMID:25811712

  2. Development of single-copy nuclear intron markers for species-level phylogenetics: Case study with Paullinieae (Sapindaceae).

    PubMed

    Chery, Joyce G; Sass, Chodon; Specht, Chelsea D

    2017-09-01

    We developed a bioinformatic pipeline that leverages a publicly available genome and published transcriptomes to design primers in conserved coding sequences flanking targeted introns of single-copy nuclear loci. Paullinieae (Sapindaceae) is used to demonstrate the pipeline. Transcriptome reads phylogenetically closer to the lineage of interest are aligned to the closest genome. Single-nucleotide polymorphisms are called, generating a "pseudoreference" closer to the lineage of interest. Several filters are applied to meet the criteria of single-copy nuclear loci with introns of a desired size. Primers are designed in conserved coding sequences flanking introns. Using this pipeline, we developed nine single-copy nuclear intron markers for Paullinieae. This pipeline is highly flexible and can be used for any group with available genomic and transcriptomic resources. This pipeline led to the development of nine variable markers for phylogenetic study without generating sequence data de novo.

  3. Structural determinants of nuclear export signal orientation in binding to exportin CRM1

    DOE PAGES

    Fung, Ho Yee Joyce; Fu, Szu-Chin; Brautigam, Chad A.; ...

    2015-09-08

    The Chromosome Region of Maintenance 1 (CRM1) protein mediates nuclear export of hundreds of proteins through recognition of their nuclear export signals (NESs), which are highly variable in sequence and structure. The plasticity of the CRM1-NES interaction is not well understood, as there are many NES sequences that seem incompatible with structures of the NES-bound CRM1 groove. Crystal structures of CRM1 bound to two different NESs with unusual sequences showed the NES peptides binding the CRM1 groove in the opposite orientation (minus) to that of previously studied NESs (plus). A comparison of minus and plus NESs identified structural and sequence determinants for NES orientation. The binding of NESs to CRM1 in both orientations results in a large expansion in NES consensus patterns and therefore a corresponding expansion of potential NESs in the proteome.

  4. The convergence of the order sequence and the solution function sequence on fractional partial differential equation

    NASA Astrophysics Data System (ADS)

    Rusyaman, E.; Parmikanti, K.; Chaerani, D.; Asefan; Irianingsih, I.

    2018-03-01

    One application of fractional ordinary differential equations is related to viscoelasticity, i.e., a correlation between the viscosity of fluids and the elasticity of solids. If the solution function becomes a function of two or more variables, then the differential equation must be changed into a fractional partial differential equation. As a preliminary study for the two-variable viscoelasticity problem, this paper discusses the convergence analysis of the function sequence that is the solution of the homogeneous fractional partial differential equation. The method used to solve the problem is the Homotopy Analysis Method. The results show that, given two real number sequences (αn) and (βn) which converge to α and β respectively, the solution function sequence of the fractional partial differential equation with order (αn, βn) also converges to the solution function of the fractional partial differential equation with order (α, β).
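
    The main result stated above can be written compactly. The notation u_{α,β} for the solution of the equation with order (α, β) is an assumption introduced here for illustration; the abstract does not give a symbol for the solution or specify the mode of convergence.

```latex
\[
\lim_{n\to\infty} \alpha_n = \alpha
\quad\text{and}\quad
\lim_{n\to\infty} \beta_n = \beta
\;\Longrightarrow\;
\lim_{n\to\infty} u_{\alpha_n,\beta_n} = u_{\alpha,\beta}
\]
```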

  5. Mitochondrial DNA variation of indigenous goats in Narok and Isiolo counties of Kenya.

    PubMed

    Kibegwa, F M; Githui, K E; Jung'a, J O; Badamana, M S; Nyamu, M N

    2016-06-01

    Phylogenetic relationships among and genetic variability within 60 goats from two different indigenous breeds in Narok and Isiolo counties in Kenya and 22 published goat samples were analysed using mitochondrial control region sequences. The results showed that there were 54 polymorphic sites in a 481-bp sequence and 29 haplotypes were determined. The mean haplotype diversity and nucleotide diversity were 0.981 ± 0.006 and 0.019 ± 0.001, respectively. The phylogenetic analysis in combination with goat haplogroup reference sequences from GenBank showed that all goat sequences were clustered into two haplogroups (A and G), of which haplogroup A was the commonest in the two populations. A very high percentage (99.90%) of the genetic variation was distributed within the regions, and a smaller percentage (0.10%) was distributed among regions, as revealed by the analysis of molecular variance (AMOVA). The AMOVA results showed that the divergence between regions was not statistically significant. We concluded that the high levels of intrapopulation diversity in Isiolo and Narok goats and the weak phylogeographic structuring suggested that there existed strong gene flow among goat populations, probably caused by extensive transportation of goats in history. © 2015 Blackwell Verlag GmbH.
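
    The two summary statistics reported above can be sketched as follows. This is a minimal illustration, assuming Nei's unbiased haplotype diversity, h = n/(n-1) × (1 - Σ p_i²), and nucleotide diversity as the mean number of pairwise differences per site. The function names and the toy sequences are hypothetical; the study's own software and settings are not described in the abstract.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i^2)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    freqs = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    pairs = list(combinations(sequences, 2))
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


if __name__ == "__main__":
    # Hypothetical aligned sequences, for illustration only.
    toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACGAACGA"]
    print(f"haplotype diversity:  {haplotype_diversity(toy):.4f}")
    print(f"nucleotide diversity: {nucleotide_diversity(toy):.4f}")
```

    Haplotype diversity close to 1, as reported for these goats, means that almost every pair of sampled animals carries a different haplotype, while the much smaller nucleotide diversity reflects that those haplotypes differ at only a few sites.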

  6. Structural analysis of the α subunit of Na(+)/K(+) ATPase genes in invertebrates.

    PubMed

    Thabet, Rahma; Rouault, J-D; Ayadi, Habib; Leignel, Vincent

    2016-01-01

    The Na(+)/K(+) ATPase is a ubiquitous pump coordinating the transport of Na(+) and K(+) across the membrane of cells and its role is fundamental to cellular functions. In eukaryotes it is a heteromer comprising two or three subunits (α, β and γ, the last of which is specific to the vertebrates). The catalytic functions of the enzyme have been attributed to the α subunit. Several complete α protein sequences are available, but only a few gene structures have been characterized. We identified the genomic sequences coding the α-subunit of the Na(+)/K(+) ATPase, from the whole-genome shotgun contigs (WGS), NCBI Genomes (chromosome), Genomic Survey Sequences (GSS) and High Throughput Genomic Sequences (HTGS) databases across distinct phyla. One copy of the α subunit gene was found in Annelida, Arthropoda, Cnidaria, Echinodermata, Hemichordata, Mollusca, Placozoa, Porifera, Platyhelminthes, Urochordata, but the nematodes seem to possess 2 to 4 copies. The number of introns varied from 0 (Platyhelminthes) to 26 (Porifera); and their localization and length are also highly variable. Molecular phylogenies (Maximum Likelihood and Maximum Parsimony methods) showed some clusters constituted by (Chordata/(Echinodermata/Hemichordata)) or (Platyhelminthes/(Annelida/Mollusca)) and a basal position for Porifera. These structural analyses increase our knowledge about the evolutionary events of the α subunit genes in the invertebrates. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. Pyrosequencing of Bacterial Symbionts within Axinella corrugata Sponges: Diversity and Seasonal Variability

    PubMed Central

    White, James R.; Patel, Jignasa; Ottesen, Andrea; Arce, Gabriela; Blackwelder, Patricia; Lopez, Jose V.

    2012-01-01

    Background Marine sponge species are of significant interest to many scientific fields including marine ecology, conservation biology, genetics, host-microbe symbiosis and pharmacology. One of the most intriguing aspects of the sponge “holobiont” system is the unique physiology, interaction with microbes from the marine environment and the development of a complex commensal microbial community. However, intraspecific variability and temporal stability of sponge-associated bacterial symbionts remain relatively unknown. Methodology/Principal Findings We have characterized the bacterial symbiont community biodiversity of seven different individuals of the Caribbean reef sponge Axinella corrugata, from two different Florida reef locations during variable seasons using multiplex 454 pyrosequencing of 16S rRNA amplicons. Over 265,512 high-quality 16S rRNA sequences were generated and analyzed. Utilizing versatile bioinformatics methods and analytical software such as the QIIME and CloVR packages, we have identified 9,444 distinct bacterial operational taxonomic units (OTUs). Approximately 65,550 rRNA sequences (24%) could not be matched to bacteria at the class level, and may therefore represent novel taxa. Differentially abundant classes between seasonal Axinella communities included Gammaproteobacteria, Flavobacteria, Alphaproteobacteria, Cyanobacteria, Acidobacter and Nitrospira. Comparisons with a proximal outgroup sponge species (Amphimedon compressa), and the growing sponge symbiont literature, indicate that this study has identified approximately 330 A. corrugata-specific symbiotic OTUs, many of which are related to the sulfur-oxidizing Ectothiorhodospiraceae. This family appeared exclusively within A. corrugata, comprising >34.5% of all sequenced amplicons. Other A. corrugata symbionts such as Deltaproteobacteria, Bdellovibrio, and Thiocystis among many others are described.
Conclusions/Significance Slight shifts in several bacterial taxa were observed between communities sampled during spring and fall seasons. New 16S rDNA sequences and concomitant identifications greatly expand the microbial community profile for this model reef sponge, and will likely be useful as a baseline for any future comparisons regarding sponge microbial community dynamics. PMID:22701613

  8. Automated two-point dixon screening for the evaluation of hepatic steatosis and siderosis: comparison with R2-relaxometry and chemical shift-based sequences.

    PubMed

    Henninger, B; Zoller, H; Rauch, S; Schocke, M; Kannengiesser, S; Zhong, X; Reiter, G; Jaschke, W; Kremser, C

    2015-05-01

    To evaluate the automated two-point Dixon screening sequence for the detection and estimated quantification of hepatic iron and fat compared with standard sequences as a reference. One hundred and two patients with suspected diffuse liver disease were included in this prospective study. The following MRI protocol was used: 3D-T1-weighted opposed- and in-phase gradient echo with two-point Dixon reconstruction and dual-ratio signal discrimination algorithm ("screening" sequence); fat-saturated, multi-gradient-echo sequence with 12 echoes; gradient-echo T1 FLASH opposed- and in-phase. Bland-Altman plots were generated and correlation coefficients were calculated to compare the sequences. The screening sequence diagnosed fat in 33, iron in 35 and a combination of both in 4 patients. Correlation between R2* values of the screening sequence and the standard relaxometry was excellent (r = 0.988). A slightly lower correlation (r = 0.978) was found between the fat fraction of the screening sequence and the standard sequence. Bland-Altman revealed systematically lower R2* values obtained from the screening sequence and higher fat fraction values obtained with the standard sequence with a rather high variability in agreement. The screening sequence is a promising method with fast diagnosis of the predominant liver disease. It is capable of estimating the amount of hepatic fat and iron comparable to standard methods.
    • MRI plays a major role in the clarification of diffuse liver disease.
    • The screening sequence was introduced for the assessment of diffuse liver disease.
    • It is a fast and automated algorithm for the evaluation of hepatic iron and fat.
    • It is capable of estimating the amount of hepatic fat and iron.
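
    The Bland-Altman comparison used in this study can be sketched as follows. This is a minimal illustration of the conventional calculation (mean difference as bias, with limits of agreement at bias ± 1.96 standard deviations of the differences). The function name and the paired measurements are hypothetical and are not values from the paper.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return bias and 95% limits of agreement for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    differences = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2.0 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(differences)
    spread = statistics.stdev(differences)  # sample standard deviation
    return {
        "bias": bias,
        "lower_limit": bias - 1.96 * spread,
        "upper_limit": bias + 1.96 * spread,
        "means": means,              # x-axis of a Bland-Altman plot
        "differences": differences,  # y-axis of a Bland-Altman plot
    }


if __name__ == "__main__":
    # Hypothetical paired R2* readings (screening vs. reference), illustration only.
    screening = [48.0, 95.0, 140.0, 210.0]
    reference = [50.0, 100.0, 150.0, 220.0]
    result = bland_altman(screening, reference)
    print(f"bias: {result['bias']:.2f}")
    print(f"limits of agreement: {result['lower_limit']:.2f} "
          f"to {result['upper_limit']:.2f}")
```

    A negative bias in this arrangement means the first method reads systematically lower than the second, which is the pattern the abstract reports for R2* from the screening sequence. A high correlation coefficient does not rule out such a bias, which is why both analyses are reported.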

  9. Differential sequence diversity at merozoite surface protein-1 locus of Plasmodium knowlesi from humans and macaques in Thailand.

    PubMed

    Putaporntip, Chaturong; Thongaree, Siriporn; Jongwutiwes, Somchai

    2013-08-01

    To determine the genetic diversity and potential transmission routes of Plasmodium knowlesi, we analyzed the complete nucleotide sequence of the gene encoding the merozoite surface protein-1 of this simian malaria (Pkmsp-1), an asexual blood-stage vaccine candidate, from naturally infected humans and macaques in Thailand. Analysis of Pkmsp-1 sequences from humans (n=12) and monkeys (n=12) reveals five conserved and four variable domains. Most nucleotide substitutions in conserved domains were dimorphic whereas three of four variable domains contained complex repeats with extensive sequence and size variation. Besides purifying selection in conserved domains, evidence of intragenic recombination scattering across Pkmsp-1 was detected. The number of haplotypes, haplotype diversity, nucleotide diversity and recombination sites of human-derived sequences exceeded that of monkey-derived sequences. Phylogenetic networks based on concatenated conserved sequences of Pkmsp-1 displayed a character pattern that could have arisen from the sampling process or the presence of two independent routes of P. knowlesi transmission, i.e. from macaques to humans and from human to human in Thailand. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

    PubMed

    Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

    2010-07-16

    Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.

  11. Blade and bladelet production at Hohle Fels Cave, AH IV in the Swabian Jura and its importance for characterizing the technological variability of the Aurignacian in Central Europe

    PubMed Central

    Conard, Nicholas J.

    2018-01-01

    Hohle Fels Cave in the Ach Valley of Southwestern Germany exhibits an Aurignacian sequence of 1 m thickness within geological horizons (GH) 6–8. The deposition of the layers took place during mild and cold phases between at least 42 ka (GI 10) and 36 ka calBP (GI 7). We present below a technological study of blade and bladelet production from AH IV (GH 7) at Hohle Fels. Our analyses show that blade manufacture is relatively constant, while bladelet production displays a high degree of variability in order to obtain different blanks. Knappers used a variety of burins as cores to produce fine bladelets. The results reveal a new variant of the Aurignacian in the Swabian Jura primarily characterized by the production of bladelets and microliths from burin-cores. The artefacts from the Swabian Aurignacian are technologically and functionally more diverse than earlier studies of the Geißenklösterle and Vogelherd sequences have suggested. The technological analyses presented here challenge the claim that the typo-chronological system from Southwestern Europe can be applied to the Central European Aurignacian. Instead, we emphasize the impact of technological and functional variables within the Aurignacian of the Swabian Jura. PMID:29630601

  12. The utility of mtDNA and rDNA for barcoding and phylogeny of plant-parasitic nematodes from Longidoridae (Nematoda, Enoplea).

    PubMed

    Palomares-Rius, J E; Cantalapiedra-Navarrete, C; Archidona-Yuste, A; Subbotin, S A; Castillo, P

    2017-09-07

    The traditional identification of plant-parasitic nematode species by morphology and morphometric studies is very difficult because of high morphological variability that can lead to considerable overlap of many characteristics and their ambiguous interpretation. For this reason, it is essential to implement approaches to ensure accurate species identification. DNA barcoding aids in identification and advances species discovery. This study sought to unravel the use of the mitochondrial marker cytochrome c oxidase subunit 1 (coxI) as barcode for Longidoridae species identification, and as a phylogenetic marker. The results showed that mitochondrial and ribosomal markers could be used as barcoding markers, except for some species from the Xiphinema americanum group. The ITS1 region showed a promising role in barcoding for species identification because of the clear molecular variability among species. Some species presented important molecular variability in coxI. The analysis of the newly provided sequences and the sequences deposited in GenBank showed plausible misidentifications, and the use of voucher species and topotype specimens is a priority for this group of nematodes. The use of coxI and D2 and D3 expansion segments of the 28S rRNA gene did not clarify the phylogeny at the genus level.

  13. Chromosomal Mapping of Repetitive DNA Sequences in the Genus Bryconamericus (Characidae) and DNA Barcoding to Differentiate Populations.

    PubMed

    Santos, Angélica Rossotti Dos; Usso, Mariana Campaner; Gouveia, Juceli Gonzalez; Araya-Jaime, Cristian; Frantine-Silva, Wilson; Giuliano-Caetano, Lucia; Foresti, Fausto; Dias, Ana Lúcia

    2017-06-01

    The mapping of repetitive DNA sites by fluorescence in situ hybridization has been widely used for karyotype studies in different species of fish, especially when dealing with related species or even genera presenting high chromosome variability. This study analyzed three populations of Bryconamericus, with diploid number preserved, but with different karyotype formulae. Bryconamericus ecai, from the Forquetinha river/RS, presented three new cytotypes, increasing the number of karyotype forms to seven in this population. Two other populations of Bryconamericus sp., from the Vermelho stream/PR and Cambuta river/PR, exhibited interpopulation variation. The chromosome mapping of rDNA sites revealed unique markings among the three populations, showing inter- and intrapopulation variability located in the terminal region. The molecular analysis using DNA barcoding, complementing the cytogenetic analysis, also showed differentiation among the three populations. The U2 small nuclear DNA repetitive sequence exhibited conserved features, being located in the interstitial region of a single chromosome pair. This is the first report on its occurrence in the genus Bryconamericus. The data obtained revealed a karyotype variability already assigned to the genus, along with polymorphism of ribosomal sites, demonstrating that this group of fish may be undergoing a divergent evolutionary process, constituting a substantive model for studies of chromosomal evolution.

  14. Toward a Theory of Sequencing: Study 1-7: An Exploration of the Effect of Instructional Sequences Involving Enactive and Iconic Embodiments on the Ability to Generalize.

    ERIC Educational Resources Information Center

    Beardslee, Edward Clarke

    The purpose of the study was to examine the effect of instruction using Dienes' perceptual variability principles on primitive generalization and mathematical generalization. The following was studied: the effect of achievement-to-criterion on one, two, or three non-symbolic embodiments of an objective using a selected class of variables on the…

  15. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    PubMed

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of two treatment-naïve patients of the same molecular subtype are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  16. Biological and serological variability, evolution and molecular epidemiology of Zucchini yellow mosaic virus (ZYMV, Potyvirus) with special reference to Caribbean islands.

    PubMed

    Desbiez, C; Wipf-Scheibel, C; Lecoq, H

    2002-04-23

    Zucchini yellow mosaic virus (ZYMV, Potyvirus) emerged as an important pathogen of cucurbits within the last 20 years. Its origins and mechanisms for evolution and worldwide spread represent important questions to understand plant virus emergence. Sequence analysis on a 250 nucleotide fragment including the N-terminal part of the coat protein coding region, revealed one major group of strains, and some highly divergent isolates from distinct origins. Within the major group, three subsets of strains were defined without correlation with geographic origin, year of collection or biological properties. ZYMV was first observed in Martinique and Guadeloupe in 1992 and 1994, respectively. We studied the evolution of ZYMV variability on both islands in the few years following the putative virus introduction. In Martinique, molecular divergence remained low even after 6 years, suggesting a lack of new introductions. Interactions between strains resulted in a stability of the high biological variability, while the serological diversity decreased and molecular divergence remained low. In Guadeloupe, as in Martinique in 1993, serological variability was high shortly after virus introduction. While the first introduction in Guadeloupe was independent from Martinique, the 'Martinique' type was detected in 1998, suggesting further introductions, maybe through viruliferous aphids or imported plant material.

  17. Ancient diversity and geographical sub-structuring in African buffalo Theileria parva populations revealed through metagenetic analysis of antigen-encoding loci.

    PubMed

    Hemmink, Johanneke D; Sitt, Tatjana; Pelle, Roger; de Klerk-Lorist, Lin-Mari; Shiels, Brian; Toye, Philip G; Morrison, W Ivan; Weir, William

    2018-03-01

    An infection and treatment protocol involving infection with a mixture of three parasite isolates and simultaneous treatment with oxytetracycline is currently used to vaccinate cattle against Theileria parva. While vaccination results in high levels of protection in some regions, little or no protection is observed in areas where animals are challenged predominantly by parasites of buffalo origin. A previous study involving sequencing of two antigen-encoding genes from a series of parasite isolates indicated that this is associated with greater antigenic diversity in buffalo-derived T. parva. The current study set out to extend these analyses by applying high-throughput sequencing to ex vivo samples from naturally infected buffalo to determine the extent of diversity in a set of antigen-encoding genes. Samples from two populations of buffalo, one in Kenya and the other in South Africa, were examined to investigate the effect of geographical distance on the nature of sequence diversity. The results revealed a number of significant findings. First, there was a variable degree of nucleotide sequence diversity in all gene segments examined, with the percentage of polymorphic nucleotides ranging from 10% to 69%. Second, large numbers of allelic variants of each gene were found in individual animals, indicating multiple infection events. Third, despite the observed diversity in nucleotide sequences, several of the gene products had highly conserved amino acid sequences, and thus represent potential candidates for vaccine development. Fourth, although compelling evidence for population differentiation between the Kenyan and South African T. parva parasites was identified, analysis of molecular variance for each gene revealed that the majority of the underlying nucleotide sequence polymorphism was common to both areas, indicating that much of this aspect of genetic variation in the parasite population arose prior to geographic separation. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  18. Fatigue testing of weldable high strength steels under simulated service conditions

    NASA Astrophysics Data System (ADS)

    Tantbirojn, Natee

    There have been concerns over the effect of Cathodic Protection (CP) on weldable high strength steels employed in Jack-up production platforms. The guidance provided by the Department of Energy HSE on higher strength steels, based on previous work, was to avoid overprotection as this could cause hydrogen embrittlement. However, the tests conducted so far at UCL for the SE702 type high strength steels (yield strength around 690 MPa) have shown that the effect of overprotection on high strength steels may not be as severe as previously thought. For this thesis, SE702 high strength steels have been investigated in more detail. Thick (85mm) parent and ground welded plates were tested under constant amplitude in air and seawater with CP. Tests were also conducted on thick (40mm) T-butt welded plates under variable amplitude loading in air and seawater with two CP levels (-800mV and -1050mV). Different backing materials (ceramic and metallic) for the welding process of the T-butt plates were also investigated. The variable amplitude sequences employed were generated using the Jack-up Offshore Standard load History (JOSH). The fatigue results are presented as crack growth and S/N curves. They were compared to the conventional offshore steel (BS 4360 50D). The results suggested that the fatigue life of the high strength steels was comparable to the BS 4360 50D steels. The effect of increasing the CP was found to be detrimental to the fatigue life but the effect was not large. The effect of CP was less noticeable in T-butt welded plates. However, in general, the effect of overprotection is not as detrimental to the Jack-up steels as previously thought. The load histories generated by JOSH were found to have some unfavourable characteristics. The framework is based on a Markov chain method and a pseudo-random number generator for selecting sea-states. A study was carried out on the sequences generated by JOSH. The generated sequences were analysed for their validity for fatigue testing. This has resulted in recommendations on the methods for generating standard load histories.

  19. An evaluation of the accuracy and speed of metagenome analysis tools

    PubMed Central

    Lindgreen, Stinus; Adair, Karen L.; Gardner, Paul P.

    2016-01-01

    Metagenome studies are becoming increasingly widespread, yielding important insights into microbial communities covering diverse environments from terrestrial and aquatic ecosystems to human skin and gut. With the advent of high-throughput sequencing platforms, the use of large scale shotgun sequencing approaches is now commonplace. However, a thorough independent benchmark comparing state-of-the-art metagenome analysis tools is lacking. Here, we present a benchmark where the most widely used tools are tested on complex, realistic data sets. Our results clearly show that the most widely used tools are not necessarily the most accurate, that the most accurate tool is not necessarily the most time consuming, and that there is a high degree of variability between available tools. These findings are important as the conclusions of any metagenomics study are affected by errors in the predicted community composition and functional capacity. Data sets and results are freely available from http://www.ucbioinformatics.org/metabenchmark.html PMID:26778510

  20. Cytochrome b sequences in black-crowned night-herons (Nycticorax nycticorax) from heronries exposed to genotoxic contaminants

    USGS Publications Warehouse

    Dahl, Christopher R.; Bickham, John W.; Wickliffe, Jeffery K.; Custer, Thomas W.

    2001-01-01

    DNA sequence analysis of a 215 base-pair region of the mitochondrial cytochrome b gene was used to examine genetic variation and search for evidence of an increased mutation rate in black-crowned night-herons. We examined five populations exposed to environmental contamination (primarily PAHs and PCBs) and one reference population from the eastern U.S. There was no evidence of a high mutation rate even within populations previously shown to exhibit increased variation in DNA content among somatic cells as a result of petroleum exposure. Three haplotypes were observed among 99 individuals. The low level of variability could be evidence for a genetic bottleneck, or that cytochrome b is too conservative for use in population genetic studies of this species. With the exception of one population from Louisiana, pair-wise Phist estimates were very low, indicative of little population structure and potentially high rates of effective migration among populations.

  1. Pattern statistics on Markov chains and sensitivity to parameter estimation

    PubMed Central

    Nuel, Grégory

    2006-01-01

    Background: In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences,...). Results: In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations we use the delta-method to give an explicit expression of σ, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. Conclusion: We establish that the use of a high-order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation. PMID:17044916
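
To make the role of the estimated parameters concrete, here is a minimal Python sketch that fits a first-order Markov model to a sequence by counting and then uses it to compute the expected number of (overlapping) occurrences of a word. It is an illustration only and not the paper's delta-method: the function names `fit_markov1` and `expected_count` are hypothetical, and the empirical letter frequency is assumed as a stand-in for the chain's stationary distribution.

```python
from collections import Counter


def fit_markov1(seq):
    """Maximum-likelihood estimate of first-order transition probabilities."""
    pair_counts = Counter(zip(seq, seq[1:]))   # counts of (a, b) transitions
    from_counts = Counter(seq[:-1])            # how often each letter starts a transition
    return {(a, b): c / from_counts[a] for (a, b), c in pair_counts.items()}


def expected_count(word, seq):
    """Expected number of overlapping occurrences of `word` under the fitted model.

    Assumption: the first letter's probability is approximated by its
    empirical frequency in `seq` rather than the exact stationary law.
    """
    trans = fit_markov1(seq)
    prob = Counter(seq)[word[0]] / len(seq)
    for a, b in zip(word, word[1:]):
        prob *= trans.get((a, b), 0.0)         # unseen transitions get probability 0
    return (len(seq) - len(word) + 1) * prob


if __name__ == "__main__":
    genome = "ACGTACGTACGT"
    print(expected_count("ACG", genome))
```

Because every factor in the product is itself an estimate, small changes in the transition counts propagate multiplicatively into the expected count, which is the kind of sensitivity the abstract refers to.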

  2. Pattern statistics on Markov chains and sensitivity to parameter estimation.

    PubMed

    Nuel, Grégory

    2006-10-17

    In order to compute pattern statistics in computational biology a Markov model is commonly used to take into account the sequence composition. Usually its parameters must be estimated. The aim of this paper is to determine how sensitive these statistics are to parameter estimation, and what the consequences of this variability are for pattern studies (finding the most over-represented words in a genome, the most significant common words to a set of sequences,...). In the particular case where pattern statistics (overlap counting only) are computed through binomial approximations we use the delta-method to give an explicit expression of sigma, the standard deviation of a pattern statistic. This result is validated using simulations and a simple pattern study is also considered. We establish that the use of a high-order Markov model could easily lead to major mistakes due to the high sensitivity of pattern statistics to parameter estimation.

  3. Image correlation method for DNA sequence alignment.

    PubMed

    Curilem Saldías, Millaray; Villarroel Sassarini, Felipe; Muñoz Poblete, Carlos; Vargas Vásquez, Asticio; Maureira Butler, Iván

    2012-01-01

    The complexity of searches and the volume of genomic data make sequence alignment one of bioinformatics' most active research areas. New alignment approaches have incorporated digital signal processing techniques. Among these, correlation methods are highly sensitive. This paper proposes a novel sequence alignment method based on 2-dimensional images, where each nucleic acid base is represented as a fixed gray intensity pixel. Query and known database sequences are coded to their pixel representation and sequence alignment is handled as an object-recognition-in-a-scene problem. Query and database become object and scene, respectively. An image correlation process is carried out in order to search for the best match between them. Given that this procedure can be implemented in an optical correlator, the correlation could eventually be accomplished at light speed. This paper shows an initial research stage where results were "digitally" obtained by simulating an optical correlation of DNA sequences represented as images. A total of 303 queries (variable lengths from 50 to 4500 base pairs) and 100 scenes represented by 100 x 100 images each (in total, a one million base pair database) were considered for the image correlation analysis. The results showed that correlations reached very high sensitivity (99.01%) and specificity (98.99%) and outperformed BLAST when mutation numbers increased. However, digital correlation processes were a hundred times slower than BLAST. We are currently starting an initiative to evaluate the correlation speed of a real experimental optical correlator. By doing this, we expect to fully exploit the light properties of optical correlation. As the optical correlator works jointly with the computer, digital algorithms should also be optimized. The results presented in this paper are encouraging and support the study of image correlation methods for sequence alignment.
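
The idea of treating alignment as correlation between pixel-coded sequences can be sketched in a few lines of Python. This is a simplified one-dimensional illustration, not the authors' 2-D optical pipeline: the gray levels in `GRAY`, the use of a zero-mean normalized (Pearson) correlation, and the function names are all assumptions made for the example.

```python
from math import sqrt

# Assumed gray levels; the abstract only says each base maps to a fixed intensity.
GRAY = {"A": 0, "C": 85, "G": 170, "T": 255}


def pearson(x, y):
    """Zero-mean normalized correlation of two equal-length intensity lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:          # flat window: correlation undefined, score it 0
        return 0.0
    return cov / sqrt(vx * vy)


def best_match(query, database):
    """Slide the query (object) along the database (scene); return (offset, score)."""
    q = [GRAY[b] for b in query]
    d = [GRAY[b] for b in database]
    scores = [pearson(q, d[i:i + len(q)]) for i in range(len(d) - len(q) + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    print(best_match("CAGTA", "TTGACCAGTACGGA"))
```

One design caveat of any intensity coding: a normalized correlation cannot distinguish windows that are affine rescalings of the query's intensities, so a score near 1 is a candidate match rather than proof of identity.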

  4. Environmental genomics of "Haloquadratum walsbyi" in a saltern crystallizer indicates a large pool of accessory genes in an otherwise coherent species

    PubMed Central

    Legault, Boris A; Lopez-Lopez, Arantxa; Alba-Casado, Jose Carlos; Doolittle, W Ford; Bolhuis, Henk; Rodriguez-Valera, Francisco; Papke, R Thane

    2006-01-01

    Background Mature saturated brine (crystallizers) communities are largely dominated (>80% of cells) by the square halophilic archaeon "Haloquadratum walsbyi". The recent cultivation of the strain HBSQ001 and the sequencing of its genome allow comparison with the metagenome of this taxonomically simplified environment. Similar studies carried out in other extreme environments have revealed very little diversity in gene content among the cell lineages present. Results The metagenome of the microbial community of a crystallizer pond has been analyzed by end sequencing a 2000 clone fosmid library and comparing the sequences obtained with the genome sequence of "Haloquadratum walsbyi". The genome of the sequenced strain was retrieved nearly complete within this environmental DNA library. However, many ORFs that could be ascribed to the "Haloquadratum" metapopulation by common genome characteristics or scaffolding to the strain genome were not present in the specific sequenced isolate. Particularly, three regions of the sequenced genome were associated with multiple rearrangements and the presence of different genes from the metapopulation. Many transposition and phage related genes were found within this pool which, together with the associated atypical GC content in these areas, supports lateral gene transfer mediated by these elements as the most probable genetic cause of this variability. Additionally, these sequences were highly enriched in putative regulatory and signal transduction functions. Conclusion These results point to a large pan-genome (total gene repertoire of the genus/species) even in this highly specialized extremophile and at a single geographic location. The extensive gene repertoire is what might be expected of a population that exploits a diverse nutrient pool, resulting from the degradation of biomass produced at lower salinities. PMID:16820057

  5. Coevolutionary modeling of protein sequences: Predicting structure, function, and mutational landscapes

    NASA Astrophysics Data System (ADS)

    Weigt, Martin

    Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing request for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show, how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt ''Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1'', Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt ''Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction'', Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, ''Direct-coupling analysis of residue co-evolution captures native contacts across many protein families'', Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).

  6. Novel variable number of tandem repeats of gibbon MAOA gene and its evolutionary significance.

    PubMed

    Choi, Yuri; Jung, Yi-Deun; Ayarpadikannan, Selvam; Koga, Akihiko; Imai, Hiroo; Hirai, Hirohisa; Roos, Christian; Kim, Heui-Soo

    2014-08-01

    Variable number of tandem repeats (VNTRs) are scattered throughout the primate genome, and genetic variation in these VNTRs has accumulated during primate radiation. Here, we analyzed VNTRs upstream of the monoamine oxidase A (MAOA) gene in 11 different gibbon species. An abundance of truncated VNTR sequences and copy number differences were observed compared to those of human VNTR sequences. To better understand the biological role of these VNTRs, a luciferase activity assay was conducted and results indicated that selected VNTR sequences of the MAOA gene from human and three different gibbon species (Hylobates klossii, Hylobates lar, and Nomascus concolor) showed silencing ability. Together, these data could be useful for understanding the evolutionary history and functional significance of MAOA VNTR sequences in gibbon species.

  7. Sequence typing of human adenoviruses isolated from Polish patients subjected to allogeneic hematopoietic stem cell transplantation - a single center experience.

    PubMed

    Przybylski, Maciej; Rynans, Sylwia; Waszczuk-Gajda, Anna; Bilinski, Jarosław; Basak, Grzegorz W; Jędrzejczak, Wiesław W; Wróblewska, Marta; Młynarczyk, Grażyna; Dzieciątkowski, Tomasz

    2018-03-28

    Human adenoviruses (HAdV) from species A, B and C are commonly recognized as pathogens causing severe morbidity and mortality in hematopoietic stem cell transplant (HSCT) recipients. The purpose of the present study was to determine HAdV types responsible for viremia in HSCT recipients at a large tertiary hospital in Poland. Analysis of partial nucleotide sequences of HAdV hexon gene was used to type 40 clinical isolates of HAdV obtained from 40 HSCT recipients. We identified six different HAdV serotypes belonging to species B, C and E. We demonstrated high variability in sequences of detected HAdV types, and patients infected with the same HAdV types were not hospitalized at the same time, which suggests a low possibility of cross-infection. In almost all patients, anti-HAdV antibodies in IgG class were detected, which indicates a history of HAdV infection in the past. Clinical symptoms accompanying HAdV viremia were present in 89% of patients, and in 61.5% of individuals HAdV was the sole pathogen detected. There were no cases with high-level HAdV viremia and severe systemic or organ infections. Graft-versus-host disease (GvHD) was present in patients infected with species B and C, but grade II of GvHD was observed only in patients infected with HAdV-B. The predominance of HAdV-C and common presence of anti-HAdV antibodies in IgG class may strongly suggest that most infections in the present study were reactivations of HAdV persisting in the patients' mucosa-associated lymphoid tissues. Variability of HAdV sequences suggests that cross-infections between patients were very rare. GvHD: graft-versus-host disease; HAdV: human adenoviruses; HSCT: hematopoietic stem cell transplantation.

  8. Comparative Genome Analysis of Ciprofloxacin-Resistant Pseudomonas aeruginosa Reveals Genes Within Newly Identified High Variability Regions Associated With Drug Resistance Development

    PubMed Central

    Su, Hsun-Cheng; Khatun, Jainab; Kanavy, Dona M.

    2013-01-01

    The alarming rise of ciprofloxacin-resistant Pseudomonas aeruginosa has been reported in several clinical studies. Though the mutation of resistance genes and their role in drug resistance has been researched, the process by which the bacterium acquires high-level resistance is still not well understood. How does the genomic evolution of P. aeruginosa affect resistance development? Could the exposure of antibiotics to the bacteria enrich genomic variants that lead to the development of resistance, and if so, how are these variants distributed through the genome? To answer these questions, we performed 454 pyrosequencing and a whole genome analysis both before and after exposure to ciprofloxacin. The comparative sequence data revealed 93 unique resistance strain variation sites, which included a mutation in the DNA gyrase subunit A gene. We generated variation-distribution maps comparing the wild and resistant types, and isolated 19 candidates from three discrete resistance-associated high variability regions that had available transposon mutants, to perform a ciprofloxacin exposure assay. Of these region candidates with transposon disruptions, 79% (15/19) showed a reduction in the ability to gain high-level resistance, suggesting that genes within these high variability regions might enrich for certain functions associated with resistance development. PMID:23808957

  9. Genomics of crop wild relatives: expanding the gene pool for crop improvement.

    PubMed

    Brozynska, Marta; Furtado, Agnelo; Henry, Robert J

    2016-04-01

    Plant breeders require access to new genetic diversity to satisfy the demands of a growing human population for more food that can be produced in a variable or changing climate and to deliver the high-quality food with nutritional and health benefits demanded by consumers. The close relatives of domesticated plants, crop wild relatives (CWRs), represent a practical gene pool for use by plant breeders. Genomics of CWR generates data that support the use of CWR to expand the genetic diversity of crop plants. Advances in DNA sequencing technology are enabling the efficient sequencing of CWR and their increased use in crop improvement. As the sequencing of genomes of major crop species is completed, attention has shifted to analysis of the wider gene pool of major crops including CWR. A combination of de novo sequencing and resequencing is required to efficiently explore useful genetic variation in CWR. Analysis of the nuclear genome, transcriptome and maternal (chloroplast and mitochondrial) genome of CWR is facilitating their use in crop improvement. Genome analysis results in discovery of useful alleles in CWR and identification of regions of the genome in which diversity has been lost in domestication bottlenecks. Targeting of high priority CWR for sequencing will maximize the contribution of genome sequencing of CWR. Coordination of global efforts to apply genomics has the potential to accelerate access to and conservation of the biodiversity essential to the sustainability of agriculture and food production. © 2015 Society for Experimental Biology, Association of Applied Biologists and John Wiley & Sons Ltd.

  10. MR Imaging with Metal-suppression Sequences for Evaluation of Total Joint Arthroplasty.

    PubMed

    Talbot, Brett S; Weinberg, Eric P

    2016-01-01

    Metallic artifact at orthopedic magnetic resonance (MR) imaging continues to be an important problem, particularly in the realm of total joint arthroplasty. Complications often follow total joint arthroplasty and can be expected for a small percentage of all implanted devices. Postoperative complications involve not only osseous structures but also adjacent soft tissues-a highly problematic area at MR imaging because of artifacts from metallic prostheses. Without special considerations, susceptibility artifacts from ferromagnetic implants can unacceptably degrade image quality. Common artifacts include in-plane distortions (signal loss and signal pileup), poor or absent fat suppression, geometric distortion, and through-section distortion. Basic methods to reduce metallic artifacts include use of spin-echo or fast spin-echo sequences with long echo train lengths, short inversion time inversion-recovery (STIR) sequences for fat suppression, a high bandwidth, thin section selection, and an increased matrix. With care and attention to the alloy type (eg, titanium, cobalt-chromium, stainless steel), orientation of the implant, and magnetic field strength, as well as use of proprietary and nonproprietary metal-suppression techniques, previously nondiagnostic studies can yield key diagnostic information. Specifically, sequences such as the metal artifact reduction sequence (MARS), WARP (Siemens Healthcare, Munich, Germany), slice encoding for metal artifact correction (SEMAC), and multiacquisition with variable-resonance image combination (MAVRIC) can be optimized to reveal pathologic conditions previously hidden by periprosthetic artifacts. Complications of total joint arthroplasty that can be evaluated by using MR imaging with metal-suppression sequences include pseudotumoral conditions such as metallosis and particle disease, infection, aseptic prosthesis loosening, tendon injury, and muscle injury. ©RSNA, 2015.

  11. Phylogeny and variability of Colletotrichum truncatum associated with soybean anthracnose in Brazil.

    PubMed

    Rogério, F; Ciampi-Guillardi, M; Barbieri, M C G; Bragança, C A D; Seixas, C D S; Almeida, A M R; Massola, N S

    2017-02-01

    Fungal diseases are among the main factors limiting high yields of soybean crop. Colletotrichum isolates from soybean plants with anthracnose symptoms were studied from different regions and time periods in Brazil using molecular, morphological and pathogenic analyses. Bayesian phylogenetic inference of GAPDH, HIS3 and ITS-5.8S rDNA sequences, the morphologies of colony and conidia, and inoculation tests on seeds and seedlings were performed. All isolates clustered only with Colletotrichum truncatum species in three well-separated clusters. Intraspecific genetic diversity revealed 27 distinct haplotypes in 51 fungal isolates; some of which were identical to C. truncatum sequences from other regions around the world, while others were related to alternative hosts. Conidia were falcate, hyaline, unicellular and aseptate, formed in acervuli, with variable dimensions. Despite being pathogenic to seedlings by both inoculation methods, variation was observed in the aggressiveness of the tested isolates, which was not correlated with genetic variation. The identification of C. truncatum in the sampled isolates was evidenced as being the only causal agent of soybean anthracnose in Brazil until 2007, with relevant genetic, morphological and pathogenic variability as well as a broad geographical origin. The wide distribution of the predominant C. truncatum haplotype indicated the existence of a highly efficient mechanism of pathogen dispersal over long distances, reinforcing the role of seeds as the primary source of disease inoculum. The characterization and distribution of Colletotrichum species in soybean-producing regions in Brazil is fundamental for understanding the disease epidemiology and for ensuring effective control strategies against anthracnose. © 2016 The Society for Applied Microbiology.

  12. Bayesian Correlation Analysis for Sequence Count Data

    PubMed Central

    Lau, Nelson; Perkins, Theodore J.

    2016-01-01

    Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities’ measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low—especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities’ signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset. PMID:27701449

  13. A novel typing method for Listeria monocytogenes using high-resolution melting analysis (HRMA) of tandem repeat regions.

    PubMed

    Ohshima, Chihiro; Takahashi, Hajime; Iwakawa, Ai; Kuda, Takashi; Kimura, Bon

    2017-07-17

    Listeria monocytogenes, which is responsible for causing food poisoning known as listeriosis, infects humans and animals. Widely distributed in the environment, this bacterium is known to contaminate food products after being transmitted to factories via raw materials. To minimize the contamination of products by food pathogens, it is critical to identify and eliminate factory entry routes and pathways for the causative bacteria. High resolution melting analysis (HRMA) is a method that takes advantage of differences in DNA sequences and PCR product lengths that are reflected by the dissociation temperature. Through our research, we have developed a multiple locus variable-number tandem repeat analysis (MLVA) using HRMA as a simple and rapid method to differentiate L. monocytogenes isolates. To evaluate the method, we compared the ability of MLVA-HRMA, MLVA using capillary electrophoresis, and multilocus sequence typing (MLST) to discriminate between strains. The MLVA-HRMA method displayed greater discriminatory ability than MLST and MLVA using capillary electrophoresis, suggesting that the variation in the number of repeat units, along with mutations within the DNA sequence, was accurately reflected by the melting curve of HRMA. Rather than relying on DNA sequence analysis or high-resolution electrophoresis, the MLVA-HRMA method employs the same process as PCR until the analysis step, suggesting a combination of speed and simplicity. Results of the MLVA-HRMA method can be shared between different laboratories. There are high expectations that this method will be adopted for regular inspections at food processing facilities in the near future. Copyright © 2017. Published by Elsevier B.V.

  14. The complete mitochondrial genome sequence of the Tibetan red fox (Vulpes vulpes montana).

    PubMed

    Zhang, Jin; Zhang, Honghai; Zhao, Chao; Chen, Lei; Sha, Weilai; Liu, Guangshuai

    2015-01-01

    In this study, the complete mitochondrial genome of the Tibetan red fox (Vulpes vulpes montana) was sequenced for the first time using blood samples obtained from a wild female red fox captured in Lhasa, Tibet, China. The Qinghai-Tibet Plateau is the highest plateau in the world, with an average elevation above 3500 m. Sequence analysis showed that the genome contains a 12S rRNA gene, a 16S rRNA gene, 22 tRNA genes, 13 protein-coding genes and 1 control region (CR). The variable tandem repeats in the CR are the main reason for the length variability of the mitochondrial genome among canids.

  15. Scaling Linguistic Characterization of Precipitation Variability

    NASA Astrophysics Data System (ADS)

    Primo, C.; Gutierrez, J. M.

    2003-04-01

    Rainfall variability is influenced by changes in the aggregation of daily rainfall. This problem is of great importance for hydrological, agricultural and ecological applications. Rainfall averages, or accumulations, are widely used as standard climatic parameters. However, different aggregation schemes may lead to the same average or accumulated values. In this paper we present a fractal method to characterize different aggregation schemes. The method provides scaling exponents characterizing weekly or monthly rainfall patterns for a given station. To this end, we establish an analogy with linguistic analysis, considering precipitation as a discrete variable (e.g., rain, no rain). Each weekly, or monthly, symbolic sequence of observed precipitation is then considered as a "word" (in this case, a binary word) which defines a specific weekly rainfall pattern. Thus, each site defines a "language" characterized by the words observed at that site during a period representative of the climatology. The more variable the observed weekly precipitation sequences, the more complex the resulting language. To characterize these languages, we first applied Zipf's method, obtaining scaling histograms of rank-ordered frequencies. However, to obtain significant exponents, the scaling must be maintained over several orders of magnitude, requiring long sequences of daily precipitation which are not available at particular stations. Thus this analysis is not suitable for applications involving particular stations (such as regionalization). We then introduce an alternative fractal method applicable to data from local stations. The so-called Chaos-Game method uses Iterated Function Systems (IFS) for graphically representing rainfall languages, in a way that complex languages define complex graphical patterns. 
The box-counting dimension and the entropy of the resulting patterns are used as linguistic parameters to quantitatively characterize the complexity of the patterns. We illustrate the high climatological discrimination power of the linguistic parameters in the Iberian peninsula, when compared with other standard techniques (such as seasonal mean accumulated precipitation). As an example, standard and linguistic parameters are used as inputs for a clustering regionalization method, comparing the resulting clusters.
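
    The "word" analogy described above can be illustrated with a minimal Python sketch. It assumes a daily series already coded as 0 (no rain) and 1 (rain) and non-overlapping 7-day words; the function names are hypothetical. The Shannon entropy of the word frequencies is used here only as a simple stand-in for language complexity; it is not the box-counting dimension or the Chaos-Game pattern entropy used in the abstract.

```python
from collections import Counter
from math import log2


def weekly_words(daily_rain, word_len=7):
    """Split a 0/1 daily series into non-overlapping binary words."""
    n_words = len(daily_rain) // word_len
    return [
        "".join(str(d) for d in daily_rain[i * word_len:(i + 1) * word_len])
        for i in range(n_words)
    ]


def rank_frequencies(words):
    """Return (word, relative frequency) pairs, most frequent first (Zipf-style ranking)."""
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common()]


def shannon_entropy(words):
    """Shannon entropy (bits) of the word distribution; a stand-in complexity measure."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical four-week record: rain on the first day of three weeks, one dry week.
    daily = [1, 0, 0, 0, 0, 0, 0] * 3 + [0] * 7
    words = weekly_words(daily)
    print(rank_frequencies(words))
    print(shannon_entropy(words))
```

    A station with more varied weekly patterns yields more distinct words and a higher entropy, which is the intuition behind "the more variable the sequences, the more complex the language".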

  16. TU-H-CAMPUS-IeP2-01: Quantitative Evaluation of PROPELLER DWI Using QIBA Diffusion Phantom

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yung, J; Ai, H; Liu, H

    Purpose: The purpose of this study is to determine the quantitative variability of apparent diffusion coefficient (ADC) values when varying imaging parameters in a diffusion-weighted (DW) fast spin echo (FSE) sequence with Periodically Rotated Overlapping ParallEL Lines with Enhanced Reconstruction (PROPELLER) k-space trajectory. Methods: Using a 3T MRI scanner, a NIST traceable, quantitative magnetic resonance imaging (MRI) diffusion phantom (High Precision Devices, Inc, Boulder, Colorado) consisting of 13 vials filled with various concentrations of polymer polyvinylpyrrolidone (PVP) in aqueous solution was imaged with a standard Quantitative Imaging Biomarkers Alliance (QIBA) DWI spin echo, echo planar imaging (SE EPI) acquisition. The same phantom was then imaged with a DWI PROPELLER sequence at varying echo train lengths (ETL) of 8, 20, and 32, as well as b-values of 400, 900, and 2000. QIBA DWI phantom analysis software was used to generate ADC maps and create regions of interest (ROIs) for quantitative measurements of each vial. Mean and standard deviations of the ROIs were compared. Results: The SE EPI sequence generated ADC values that showed very good agreement with the known ADC values of the phantom (r2 = 0.9995, slope = 1.0061). The ADC values measured from the PROPELLER sequences were inflated, but were highly correlated with an r2 range from 0.8754 to 0.9880. The PROPELLER sequence with an ETL=20 and b-value of 0 and 2000 showed the closest agreement (r2 = 0.9034, slope = 0.9880). Conclusion: The DW PROPELLER sequence is promising for quantitative evaluation of ADC values. A drawback of the PROPELLER sequence is the longer acquisition time. The 180° refocusing pulses may also cause the observed increase in ADC values compared to the standard SE EPI DW sequence. However, the FSE sequence offers an advantage with in-plane motion and geometric distortion which will be investigated in future studies.
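
    The agreement statistics quoted above (r2 and slope) can be sketched in Python as follows. This is an illustration under stated assumptions, not the QIBA analysis software: it assumes the standard mono-exponential two-point model ADC = ln(S_low / S_high) / (b_high - b_low) and an ordinary least-squares fit with an intercept; the abstract does not say which fit was used, and the function names are hypothetical.

```python
from math import log


def adc_two_point(s_low, s_high, b_low, b_high):
    """ADC from two signal intensities, assuming mono-exponential decay."""
    return log(s_low / s_high) / (b_high - b_low)


def slope_and_r2(known, measured):
    """Least-squares slope and coefficient of determination of measured vs. known values."""
    n = len(known)
    mean_x = sum(known) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in known)
    syy = sum((y - mean_y) ** 2 for y in measured)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(known, measured))
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy)  # squared Pearson correlation
    return slope, r2


if __name__ == "__main__":
    # Hypothetical vial values in mm^2/s, for illustration only.
    known = [0.4e-3, 0.8e-3, 1.1e-3, 1.6e-3]
    measured = [0.45e-3, 0.86e-3, 1.18e-3, 1.70e-3]
    print(slope_and_r2(known, measured))
```

    A slope near 1 with a high r2 indicates agreement with the reference values; a high r2 with a slope away from 1 indicates a systematic bias.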

  17. Highly diverse variable number tandem repeat loci in the E. coli O157:H7 and O55:H7 genomes for high-resolution molecular typing.

    PubMed

    Keys, C; Kemper, S; Keim, P

    2005-01-01

    The Escherichia coli genome was evaluated for variable number tandem repeat (VNTR) loci in order to provide a subtyping tool with greater discrimination and more efficient capacity. Twenty-nine putative VNTR loci were identified from the E. coli genomic sequence. Their variability was validated by characterizing the number of repeats at each locus in a set of 56 E. coli O157:H7/HN and O55:H7 isolates. An optimized multiplex assay system was developed to facilitate high-capacity analysis. Locus diversity values ranged from 0.23 to 0.95 while the number of alleles ranged from two to 29. These multiple-locus VNTR analysis (MLVA) data were used to describe genetic relationships among these isolates and were compared with PFGE (pulsed-field gel electrophoresis) data from a subset of the same strains. Genetic similarity values were highly correlated between the two approaches, though MLVA was capable of discrimination amongst closely related isolates when PFGE similarity values were equal to 1.0. Highly variable VNTR loci exist in the E. coli O157:H7 genome and are excellent estimators of genetic relationships, in particular for closely related isolates. Escherichia coli O157:H7 MLVA offers a complementary analysis to the more traditional PFGE approach. Application of MLVA to an outbreak cluster could generate superior molecular epidemiology and result in a more effective public health response.
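
    A per-locus diversity value of the kind reported above can be computed from the repeat counts observed across isolates. The abstract does not state which index was used; the sketch below assumes Nei's unbiased diversity, D = n/(n-1) * (1 - sum of squared allele frequencies), and the function name and example data are hypothetical.

```python
from collections import Counter


def locus_diversity(repeat_counts):
    """Nei's unbiased diversity for one VNTR locus.

    repeat_counts: one repeat number (allele) per isolate.
    """
    n = len(repeat_counts)
    if n < 2:
        raise ValueError("at least two isolates are required")
    freqs = [count / n for count in Counter(repeat_counts).values()]
    return (n / (n - 1)) * (1.0 - sum(f * f for f in freqs))


if __name__ == "__main__":
    # Hypothetical repeat numbers at one locus in six isolates.
    print(locus_diversity([7, 7, 8, 9, 9, 12]))
```

    A locus where every isolate carries the same repeat number scores 0, and the value approaches 1 as alleles become more numerous and evenly distributed.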

  18. Molecular Strain Typing of Mycobacterium tuberculosis: a Review of Frequently Used Methods

    PubMed Central

    2016-01-01

    Tuberculosis, caused by the bacterium Mycobacterium tuberculosis, remains one of the most serious global health problems. Molecular typing of M. tuberculosis has been used for various epidemiologic purposes as well as for clinical management. Currently, many techniques are available to type M. tuberculosis. Choosing the most appropriate technique in accordance with the existing laboratory conditions and the specific features of the geographic region is important. Insertion sequence IS6110-based restriction fragment length polymorphism (RFLP) analysis is considered the gold standard for the molecular epidemiologic investigations of tuberculosis. However, other polymerase chain reaction-based methods such as spacer oligonucleotide typing (spoligotyping), which detects 43 spacer sequence-interspersing direct repeats (DRs) in the genomic DR region; mycobacterial interspersed repetitive units–variable number tandem repeats (MIRU-VNTR), which determines the number and size of tandem repetitive DNA sequences; repetitive-sequence-based PCR (rep-PCR), which provides high-throughput genotypic fingerprinting of multiple Mycobacterium species; and the recently developed genome-based whole genome sequencing methods demonstrate similar discriminatory power and greater convenience. This review focuses on techniques frequently used for the molecular typing of M. tuberculosis and discusses their general aspects and applications. PMID:27709842

  19. Molecular Strain Typing of Mycobacterium tuberculosis: a Review of Frequently Used Methods.

    PubMed

    Ei, Phyu Win; Aung, Wah Wah; Lee, Jong Seok; Choi, Go Eun; Chang, Chulhun L

    2016-11-01

    Tuberculosis, caused by the bacterium Mycobacterium tuberculosis, remains one of the most serious global health problems. Molecular typing of M. tuberculosis has been used for various epidemiologic purposes as well as for clinical management. Currently, many techniques are available to type M. tuberculosis. Choosing the most appropriate technique in accordance with the existing laboratory conditions and the specific features of the geographic region is important. Insertion sequence IS6110-based restriction fragment length polymorphism (RFLP) analysis is considered the gold standard for the molecular epidemiologic investigations of tuberculosis. However, other polymerase chain reaction-based methods such as spacer oligonucleotide typing (spoligotyping), which detects 43 spacer sequence-interspersing direct repeats (DRs) in the genomic DR region; mycobacterial interspersed repetitive units-variable number tandem repeats (MIRU-VNTR), which determines the number and size of tandem repetitive DNA sequences; repetitive-sequence-based PCR (rep-PCR), which provides high-throughput genotypic fingerprinting of multiple Mycobacterium species; and the recently developed genome-based whole genome sequencing methods demonstrate similar discriminatory power and greater convenience. This review focuses on techniques frequently used for the molecular typing of M. tuberculosis and discusses their general aspects and applications.

  20. Musculoskeletal motion flow fields using hierarchical variable-sized block matching in ultrasonographic video sequences.

    PubMed

    Revell, J D; Mirmehdi, M; McNally, D S

    2004-04-01

    We examine tissue deformations using non-invasive dynamic musculoskeletal ultrasonography, and quantify the performance of our approach on controlled in vitro gold standard (ground-truth) sequences followed by clinical in vivo data. The proposed approach employs a two-dimensional variable-sized block matching algorithm with a hierarchical full search. We extend this process by refining displacements to sub-pixel accuracy. We show by application that this technique yields quantitatively reliable results.
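
    The core of block matching with a full search can be sketched in a few lines of Python. This is a single-level, fixed-block-size illustration only: the hierarchical, variable-sized scheme and the sub-pixel refinement described in the abstract are not implemented, and the cost function (sum of absolute differences) and all names are assumptions rather than the authors' choices.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between a block in ref and a block in cur."""
    return sum(
        abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
        for i in range(size)
        for j in range(size)
    )


def match_block(ref, cur, y, x, size, radius):
    """Full search: displacement (dy, dx) of the ref block at (y, x) that minimizes SAD in cur."""
    height, width = len(cur), len(cur[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            cy, cx = y + dy, x + dx
            # Skip candidate blocks that fall outside the current frame.
            if cy < 0 or cx < 0 or cy + size > height or cx + size > width:
                continue
            cost = sad(ref, cur, y, x, cy, cx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical 6x6 frames: a 2x2 patch moves one row down and two columns right.
    ref = [[0] * 6 for _ in range(6)]
    cur = [[0] * 6 for _ in range(6)]
    patch = [[5, 6], [7, 8]]
    for i in range(2):
        for j in range(2):
            ref[1 + i][1 + j] = patch[i][j]
            cur[2 + i][3 + j] = patch[i][j]
    print(match_block(ref, cur, 1, 1, 2, 2))
```

    Repeating the search for every block of a frame gives a dense displacement field; a hierarchical variant would run the same search first on coarse blocks and then refine within smaller ones.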

  1. Influence of flanking sequences on variability in expression levels of an introduced gene in transgenic tobacco plants.

    PubMed Central

    Dean, C; Jones, J; Favreau, M; Dunsmuir, P; Bedbrook, J

    1988-01-01

    The petunia rbcS gene SSU301 was introduced into tobacco using Agrobacterium tumefaciens-mediated transformation. The time at which rbcS expression was maximal after transfer of the tobacco plants to the greenhouse was determined. The expression level of the SSU301 gene varied up to 9-fold between individual tobacco plants which had been standardized physiologically as much as possible. The presence of adjacent pUC plasmid sequences did not affect the expression of the SSU301 gene. In an attempt to reduce the between-transformant variability in expression, the SSU301 gene was introduced into tobacco surrounded by 10 kb of 5' and 13 kb of 3' DNA sequences which normally flank SSU301 in petunia. The longer flanking regions did not reduce the between-transformant variability of SSU301 gene expression. PMID:3174450

  2. A new high molecular weight immunoglobulin class from the carcharhine shark: implications for the properties of the primordial immunoglobulin.

    PubMed

    Berstein, R M; Schluter, S F; Shen, S; Marchalonis, J J

    1996-04-16

    All immunoglobulins and T-cell receptors throughout phylogeny share regions of highly conserved amino acid sequence. To identify possible primitive immunoglobulins and immunoglobulin-like molecules, we utilized 3' RACE (rapid amplification of cDNA ends) and a highly conserved constant region consensus amino acid sequence to isolate a new immunoglobulin class from the sandbar shark Carcharhinus plumbeus. The immunoglobulin, termed IgW, in its secreted form consists of 782 amino acids and is expressed in both the thymus and the spleen. The molecule overall most closely resembles mu chains of the skate and human and a new putative antigen binding molecule isolated from the nurse shark (NAR). The full-length IgW chain has a variable region resembling human and shark heavy-chain (VH) sequences and a novel joining segment containing the WGXGT motif characteristic of H chains. However, unlike any other H-chain-type molecule, it contains six constant (C) domains. The first C domain contains the cysteine residue characteristic of C mu1 that would allow dimerization with a light (L) chain. The fourth and sixth domains also contain comparable cysteines that would enable dimerization with other H chains or homodimerization. Comparison of the sequences of IgW V and C domains shows homology greater than that found in comparisons among VH and C mu or VL or CL, thereby suggesting that IgW may retain features of the primordial immunoglobulin in evolution.

  3. Expression of arginine kinase enzymatic activity and mRNA in gills of the euryhaline crabs Carcinus maenas and Callinectes sapidus.

    PubMed

    Kotlyar, S; Weihrauch, D; Paulsen, R S; Towle, D W

    2000-08-01

    Phosphagen kinases catalyze the reversible dephosphorylation of guanidino phosphagens such as phosphocreatine and phosphoarginine, contributing to the restoration of adenosine triphosphate concentrations in cells experiencing high and variable demands on their reserves of high-energy phosphates. The major invertebrate phosphagen kinase, arginine kinase, is expressed in the gills of two species of euryhaline crabs, the blue crab Callinectes sapidus and the shore crab Carcinus maenas, in which energy-requiring functions include monovalent ion transport, acid-base balance, nitrogen excretion and gas exchange. The enzymatic activity of arginine kinase approximately doubles in the ion-transporting gills of C. sapidus, a strong osmoregulator, when the crabs are transferred from high to low salinity, but does not change in C. maenas, a more modest osmoregulator. Amplification and sequencing of arginine kinase cDNA from both species, accomplished by reverse transcription of gill mRNA and the polymerase chain reaction, revealed an open reading frame coding for a 357-amino-acid protein. The predicted amino acid sequences showed a minimum of 75% identity with arginine kinase sequences of other arthropods. Ten of the 11 amino acid residues believed to participate in arginine binding are completely conserved among the arthropod sequences analyzed. An estimation of arginine kinase mRNA abundance indicated that acclimation salinity has no effect on arginine kinase gene transcription. Thus, the observed enhancement of enzyme activity in C. sapidus probably results from altered translation rates or direct activation of pre-existing enzyme protein.

  4. Discovery of magnetic A supergiants: the descendants of magnetic main-sequence B stars

    NASA Astrophysics Data System (ADS)

    Neiner, Coralie; Oksala, Mary E.; Georgy, Cyril; Przybilla, Norbert; Mathis, Stéphane; Wade, Gregg; Kondrak, Matthias; Fossati, Luca; Blazère, Aurore; Buysschaert, Bram; Grunhut, Jason

    2017-10-01

    In the context of the high resolution, high signal-to-noise ratio, high sensitivity, spectropolarimetric survey BritePol, which complements observations by the BRITE constellation of nanosatellites for asteroseismology, we are looking for and measuring the magnetic field of all stars brighter than V = 4. In this paper, we present circularly polarized spectra obtained with HarpsPol at ESO in La Silla (Chile) and ESPaDOnS at CFHT (Hawaii) for three hot evolved stars: ι Car, HR 3890 and ε CMa. We detected a magnetic field in all three stars. Each star has been observed several times to confirm the magnetic detections and check for variability. The stellar parameters of the three objects were determined and their evolutionary status was ascertained employing evolution models computed with the Geneva code. ε CMa was already known and is confirmed to be magnetic, but our modelling indicates that it is located near the end of the main sequence, i.e. it is still in a core hydrogen burning phase. ι Car and HR 3890 are the first discoveries of magnetic hot supergiants located well after the end of the main sequence on the Hertzsprung-Russell diagram. These stars are probably the descendants of main-sequence magnetic massive stars. Their current field strength (a few G) is compatible with magnetic flux conservation during stellar evolution. These results provide observational constraints for the development of future evolutionary models of hot stars including a fossil magnetic field.

  5. Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the influenza A virus subtypes responsible for the 20th‐century pandemics

    PubMed Central

    Pasricha, Gunisha; Mishra, Akhilesh C.; Chakrabarti, Alok K.

    2012-01-01

    Please cite this paper as: Pasricha et al. (2012) Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the Influenza A virus subtypes responsible for the 20th‐century pandemics. Influenza and Other Respiratory Viruses 7(4), 497–505. Background: PB1F2 is the 11th protein of influenza A virus, translated from the +1 alternate reading frame of the PB1 gene. Since its discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. The selection of the PB1 gene segment in the pandemics, together with the variable size and pleiotropic effects of PB1F2, prompted us to analyze amino acid sequences of this protein in various influenza A viruses. Methods: Amino acid sequences for PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from the Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size and the variable and conserved domains, and to perform mutational analysis. Results: Analysis showed that 96.4% of the H5N1 influenza viruses harbored full‐length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th‐century pandemic influenza viruses contained full‐length PB1F2 protein. Through the years, PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human‐ and avian host‐specific conserved domains. The global database of PB1F2 protein revealed that the N66S mutation was present in only 3.8% of the H5N1 strains. We found a novel mutation, N84S, in the PB1F2 protein of 9.35% of the highly pathogenic avian influenza H5N1 viruses. Conclusions: Varying sizes and mutations of the PB1F2 protein were observed in different influenza A virus subtypes with pandemic potential. There was genetic divergence of the protein in various hosts, which highlighted the host‐specific evolution of the virus. 
However, studies are required to correlate this sequence variability with the virulence and pathogenicity. PMID:22788742

  6. Multilocus Variable-Number Tandem Repeat Typing of Mycobacterium ulcerans

    PubMed Central

    Ablordey, Anthony; Swings, Jean; Hubans, Christine; Chemlal, Karim; Locht, Camille; Portaels, Françoise; Supply, Philip

    2005-01-01

    The apparent genetic homogeneity of Mycobacterium ulcerans contributes to the poorly understood epidemiology of M. ulcerans infection. Here, we report the identification of variable number tandem repeat (VNTR) sequences as novel polymorphic elements in the genome of this species. A total of 19 potential VNTR loci identified in the closely related M. marinum genome sequence were screened in a collection of 23 M. ulcerans isolates, one Mycobacterium species referred to here as an intermediate species, and five M. marinum strains. Nine of the 19 loci were polymorphic in the three species (including the intermediate species) and revealed eight M. ulcerans and five M. marinum genotypes. The results from the VNTR analysis corroborated the genetic relationships of M. ulcerans isolates from various geographical origins, as defined by independent molecular markers. Although these results further highlight the extremely high clonal homogeneity within certain geographic regions, we report for the first time the discrimination of the two South American strains from Surinam and French Guiana. These findings support the potential of a VNTR-based genotyping method for strain discrimination within M. ulcerans and M. marinum. PMID:15814964

  7. Biodiversity of mannose-specific adhesion in Lactobacillus plantarum revisited: strain-specific domain composition of the mannose-adhesin.

    PubMed

    Gross, G; Snel, J; Boekhorst, J; Smits, M A; Kleerebezem, M

    2010-03-01

    Recently, we identified the mannose-specific adhesin encoding gene (msa) of Lactobacillus plantarum. In the current study, the structure and function of this potentially probiotic effector gene were further investigated, exploring the genetic diversity of msa in L. plantarum in relation to mannose adhesion capacity. The results demonstrate that there is considerable variation in quantitative in vitro mannose adhesion capacity, which is paralleled by msa gene sequence variation. The msa genes of different L. plantarum strains encode proteins with variable domain composition. Construction of L. plantarum 299v mutant strains revealed that the msa gene product is the key protein for mannose adhesion, also in a strain with high mannose adhering capacity. However, no straightforward correlation between adhesion capacity and domain composition of Msa in L. plantarum could be identified. Nevertheless, differences in Msa sequences in combination with the variable genetic background of specific bacterial strains appear to determine mannose adhesion capacity and potentially affect probiotic properties. These findings exemplify the strain-specificity of probiotic characteristics and illustrate the need for careful molecular selection of new candidate probiotics.

  8. Genetic polymorphisms of LPL and HL and their association with the performance of Chinese sturgeons fed a formulated diet.

    PubMed

    He, Y; Shen, D; Liang, X F; Lu, R H; Xiao, H

    2013-10-15

    Understanding the reasons for the large individual differences in acceptance of formulated diets is very important for the successful culture of larvae and juveniles of the Chinese sturgeon Acipenser sinensis. Genetic differences in the mitochondrial control region were investigated by direct sequencing in two groups of Chinese sturgeon, which were apt to accept or refuse formulated diets. Among 968-bp sequences, 111 variable sites were identified. One variable site showed close association with the individual performance of specimens fed with formulated diets. The commercial diet for Chinese sturgeons usually contains high levels of lipids. Lipoprotein lipase (LPL) and hepatic lipase (HL) are two members of the lipase gene family, which are essential for the utilization of dietary lipid. Single nucleotide polymorphisms (SNPs) in intron 7 were detected in the two experimental groups of Chinese sturgeons. We were able to demonstrate that one SNP in the LPL gene and one SNP in the HL gene showed close association with the performance of sturgeons on the formulated diet.

  9. Upper and lower bounds of ground-motion variabilities: implication for source properties

    NASA Astrophysics Data System (ADS)

    Cotton, Fabrice; Reddy-Kotha, Sreeram; Bora, Sanjay; Bindi, Dino

    2017-04-01

    One of the key challenges of seismology is to be able to analyse the physical factors that control earthquakes and ground-motion variabilities. Such analysis is particularly important to calibrate physics-based simulations and seismic hazard estimations at high frequencies. Within the framework of ground-motion prediction equation (GMPE) development, ground-motion residuals (differences between recorded ground motions and the values predicted by a GMPE) are computed. The exponential growth of seismological near-source records and modern GMPE analysis techniques allow these residuals to be partitioned into between-event and within-event components. In particular, the between-event term quantifies all those repeatable source effects (e.g. related to stress-drop or kappa-source variability) which have not been accounted for by the magnitude-dependent term of the model. In this presentation, we first discuss the between-event variabilities computed both in the Fourier and Response Spectra domains, using recent high-quality global accelerometric datasets (e.g. NGA-west2, Resorce, Kiknet). These analyses lead to the assessment of upper bounds for the ground-motion variability. Then, we compare these upper bounds with lower bounds estimated by analysing seismic sequences which occurred on specific fault systems (e.g., located in Central Italy or in Japan). We show that the lower bounds of between-event variabilities are surprisingly large, which indicates a large variability of earthquake dynamic properties even within the same fault system. Finally, these upper and lower bounds of ground-shaking variability are discussed in terms of variability of earthquake physical properties (e.g., stress-drop and kappa_source).
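
    The partition of residuals into between-event and within-event components can be illustrated with a deliberately simplified Python sketch. It assumes residuals have already been computed against some GMPE and takes the between-event term of each earthquake as the plain mean of its residuals; published analyses typically estimate these terms with mixed-effects regression, which this sketch does not do. The function name and data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def partition_residuals(records):
    """Split residuals into between-event terms and within-event remainders.

    records: list of (event_id, residual) pairs, one per recording.
    Returns (between, within): between maps event_id to its mean residual;
    within lists (event_id, residual minus that event's mean).
    """
    by_event = defaultdict(list)
    for event_id, residual in records:
        by_event[event_id].append(residual)
    between = {event_id: mean(values) for event_id, values in by_event.items()}
    within = [(event_id, residual - between[event_id]) for event_id, residual in records]
    return between, within


if __name__ == "__main__":
    # Hypothetical log-residuals for two earthquakes recorded at two stations each.
    records = [("A", 0.5), ("A", 0.1), ("B", -0.2), ("B", -0.4)]
    between, within = partition_residuals(records)
    print(between)
    print(within)
    # Spread of the between-event terms, a crude proxy for between-event variability.
    print(pstdev(between.values()))
```

    The spread of the between-event terms is the quantity whose upper and lower bounds the abstract discusses; the within-event remainders carry path and site effects.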

  10. Effective DNA Inhibitors of Cathepsin G by In Vitro Selection

    PubMed Central

    Gatto, Barbara; Vianini, Elena; Lucatello, Lorena; Sissi, Claudia; Moltrasio, Danilo; Pescador, Rodolfo; Porta, Roberto; Palumbo, Manlio

    2008-01-01

    Cathepsin G (CatG) is a chymotrypsin-like protease released upon degranulation of neutrophils. In several inflammatory and ischaemic diseases the impaired balance between CatG and its physiological inhibitors leads to tissue destruction and platelet aggregation. Inhibitors of CatG are suitable for the treatment of inflammatory diseases and procoagulant conditions. DNA released upon the death of neutrophils at injury sites binds CatG. Moreover, short DNA fragments are more inhibitory than genomic DNA. Defibrotide, a single-stranded polydeoxyribonucleotide with antithrombotic effect, is also a potent CatG inhibitor. Given this experimental evidence, we employed a selection protocol to assess whether DNA inhibition of CatG may be ascribed to specific sequences present in defibrotide DNA. A SELEX protocol was applied to identify the single-stranded DNA sequences exhibiting the highest affinity for CatG, the diversity of a combinatorial pool of oligodeoxyribonucleotides being a good representation of the complexity found in defibrotide. Biophysical and biochemical studies confirmed that the selected sequences bind tightly to the target enzyme and also efficiently inhibit its catalytic activity. Sequence analysis carried out to unveil a motif responsible for CatG recognition showed a recurrence of alternating TG repeats in the selected CatG binders, adopting an extended conformation that grants maximal interaction with the highly charged protein surface. This unprecedented finding is validated by our results showing high affinity and inhibition of CatG by specific DNA sequences of variable length designed to maximally reduce pairing/folding interactions. PMID:19325843

  11. Compressed Sensing SEMAC: 8-fold Accelerated High Resolution Metal Artifact Reduction MRI of Cobalt-Chromium Knee Arthroplasty Implants.

    PubMed

    Fritz, Jan; Ahlawat, Shivani; Demehri, Shadpour; Thawait, Gaurav K; Raithel, Esther; Gilson, Wesley D; Nittka, Mathias

    2016-10-01

    The aim of this study was to prospectively test the hypothesis that a compressed sensing-based slice encoding for metal artifact correction (SEMAC) turbo spin echo (TSE) pulse sequence prototype facilitates high-resolution metal artifact reduction magnetic resonance imaging (MRI) of cobalt-chromium knee arthroplasty implants within acquisition times of less than 5 minutes, thereby yielding better image quality than high-bandwidth (BW) TSE of similar length and image quality similar to that of lengthier SEMAC standard of reference pulse sequences. This prospective study was approved by our institutional review board. Twenty asymptomatic subjects (12 men, 8 women; mean age, 56 years; age range, 44-82 years) with total knee arthroplasty implants underwent MRI of the knee using a commercially available, clinical 1.5 T MRI system. Two compressed sensing-accelerated SEMAC prototype pulse sequences with 8-fold undersampling and acquisition times of approximately 5 minutes each were compared with commercially available high-BW and SEMAC pulse sequences with acquisition times of approximately 5 minutes and 11 minutes, respectively. For each pulse sequence type, sagittal intermediate-weighted (TR, 3750-4120 milliseconds; TE, 26-28 milliseconds; voxel size, 0.5 × 0.5 × 3 mm) and short tau inversion recovery (TR, 4010 milliseconds; TE, 5.2-7.5 milliseconds; voxel size, 0.8 × 0.8 × 4 mm) were acquired. Outcome variables included image quality, display of the bone-implant interfaces and pertinent knee structures, artifact size, signal-to-noise ratio (SNR), and contrast-to-noise ratio (CNR). Statistical analysis included Friedman, repeated-measures analysis of variance, and Cohen weighted κ tests. Bonferroni-corrected P values of 0.005 and less were considered statistically significant. 
Image quality, bone-implant interfaces, anatomic structures, artifact size, SNR, and CNR parameters were statistically similar between the compressed sensing-accelerated SEMAC prototype and SEMAC commercial pulse sequences. There was mild blur on images of both SEMAC sequences when compared with high-BW images (P < 0.001), which however did not impair the assessment of knee structures. Metal artifact reduction and visibility of central knee structures and bone-implant interfaces were good to very good and significantly better on both types of SEMAC than on high-BW images (P < 0.004). All 3 pulse sequences showed peripheral structures similarly well. The implant artifact size was 46% to 51% larger on high-BW images when compared with both types of SEMAC images (P < 0.0001). Signal-to-noise ratios and CNRs of fat tissue, tendon tissue, muscle tissue, and fluid were statistically similar on intermediate-weighted MR images of all 3 pulse sequence types. On short tau inversion recovery images, the SNRs of tendon tissue and the CNRs of fat and fluid, fluid and muscle, as well as fluid and tendon were significantly higher on SEMAC and compressed sensing SEMAC images (P < 0.005, respectively). We accept the hypothesis that prospective compressed sensing acceleration of SEMAC is feasible for high-quality metal artifact reduction MRI of cobalt-chromium knee arthroplasty implants in less than 5 minutes and yields better quality than high-BW TSE and similarly high quality compared with lengthier SEMAC pulse sequences.

  12. Is radon emission in caves causing deletions in satellite DNA sequences of cave-dwelling crickets?

    PubMed

    Allegrucci, Giuliana; Sbordoni, Valerio; Cesaroni, Donatella

    2015-01-01

    The most stable isotope of radon, 222Rn, represents the major source of natural radioactivity in confined environments such as mines, caves and houses. In this study, we explored the possible radon-related effects on the genome of Dolichopoda cave crickets (Orthoptera, Rhaphidophoridae) sampled in caves with different concentrations of radon. We analyzed specimens from ten populations belonging to two genetically closely related species, D. geniculata and D. laetitiae, and explored the possible association between the radioactivity dose and the level of genetic polymorphism in a specific family of satellite DNA (pDo500 satDNA). Radon concentration in the analyzed caves ranged from 221 to 26,000 Bq/m3. Specimens from caves with the highest radon concentration also showed the highest variability estimates in both species, and the increased sequence heterogeneity at the pDo500 satDNA level can be explained as an effect of the mutation pressure induced by radon in caves. We discovered a specific category of nuclear DNA, the highly repetitive satellite DNA, where the effects of exposure to high levels of radon-related ionizing radiation are detectable, suggesting that satDNA sequences might be a valuable tool to disclose harmful effects also in other organisms exposed to high radon concentrations.

  13. Fluorescent signatures for variable DNA sequences

    PubMed Central

    Rice, John E.; Reis, Arthur H.; Rice, Lisa M.; Carver-Brown, Rachel K.; Wangh, Lawrence J.

    2012-01-01

    Life abounds with genetic variations writ in sequences that are often only a few hundred nucleotides long. Rapid detection of these variations for identification of genetic diseases, pathogens and organisms has become the mainstay of molecular science and medicine. This report describes a new, highly informative closed-tube polymerase chain reaction (PCR) strategy for analysis of both known and unknown sequence variations. It combines efficient quantitative amplification of single-stranded DNA targets through LATE-PCR with sets of Lights-On/Lights-Off probes that hybridize to their target sequences over a broad temperature range. Contiguous pairs of Lights-On/Lights-Off probes of the same fluorescent color are used to scan hundreds of nucleotides for the presence of mutations. Sets of probes in different colors can be combined in the same tube to analyze even longer single-stranded targets. Each set of hybridized Lights-On/Lights-Off probes generates a composite fluorescent contour, which is mathematically converted to a sequence-specific fluorescent signature. The versatility and broad utility of this new technology are illustrated in this report by characterization of variant sequences in three different DNA targets: the rpoB gene of Mycobacterium tuberculosis, a sequence in the mitochondrial cytochrome C oxidase subunit 1 gene of nematodes and the V3 hypervariable region of the bacterial 16S ribosomal RNA gene. We anticipate widespread use of these technologies for diagnostics, species identification and basic research. PMID:22879378

  14. ScaffoldSeq: Software for characterization of directed evolution populations.

    PubMed

    Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J

    2016-07-01

    ScaffoldSeq is software designed for the numerous applications, including directed evolution analysis, in which a user generates a population of DNA sequences encoding partially diverse proteins with related functions and would like to characterize the single-site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc.
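
    The idea of using conserved framework sequences to locate a diversified region, then tabulating single-site amino acid frequencies, can be illustrated with a minimal Python sketch. This is not the ScaffoldSeq code: the function names, the flank motifs "TAYG" and "CDEF", and the toy reads are all hypothetical, and exact string matching stands in for whatever matching the real software performs.

```python
from collections import Counter


def extract_region(protein, left_flank, right_flank):
    """Return the residues between two conserved framework motifs, or None."""
    start = protein.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = protein.find(right_flank, start)
    if end == -1:
        return None
    return protein[start:end]


def site_frequencies(regions):
    """Per-position amino acid frequencies for regions of one shared length."""
    length = len(regions[0])
    if any(len(r) != length for r in regions):
        raise ValueError("group regions by length before counting")
    freqs = []
    for i in range(length):
        counts = Counter(r[i] for r in regions)
        freqs.append({aa: n / len(regions) for aa, n in counts.items()})
    return freqs


# Hypothetical translated reads sharing the framework motifs TAYG ... CDEF.
reads = ["MKTAYGSWCDEFLL", "MKTAYGAWCDEFLL", "MKTAYGSYCDEFLL", "MKTAYGSWCDEFLL"]
regions = [extract_region(r, "TAYG", "CDEF") for r in reads]
freqs = site_frequencies(regions)
```

    Because the flanks anchor the region, no global alignment is needed; regions of different lengths would simply be grouped by length before calling site_frequencies.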

  15. Design, Construction and Evaluation of 1a/JFH1 HCV Chimera by Replacing the Intergenotypic Variable Region

    PubMed Central

    Ghasemi, Faezeh; Ghayour-Mobarhan, Majid; Pasdar, Alireza; Pourianfar, Hamid; Reza Aghasadeghi, Mohammad; Gouklani, Hamed; Meshkat, Zahra

    2016-01-01

    Background The E2 glycoprotein is an important encoded hepatitis C virus (HCV) protein that contains three different variable regions. Objectives The aim of the present study was to construct an HCV 1a/JFH1 chimeric virus by replacing the intergenotypic variable region (igVR) fragment of the highly variable region of the E2 gene of the Japanese Fulminant hepatitis genotype 2a JFH1 virus with a similar region of HCV genotype 1a. This chimera was produced as a model virus with the ability to be cultured. We analyzed the adapted virus and the variations of nucleic acids within it. Methods Specific primers were designed for the igVR of HCV genotype 1a followed by the overlap-PCR method for the synthesis of the desired DNA fragment. The amplified igVR-1a chimera gene and pFL-J6/JFH were digested by KpnI and BsiWI restriction enzymes, and the fragment was ligated into pFL-J6/JFH. The recombinant vector was transformed into Escherichia coli JM109 strain competent cells. All clones were confirmed by colony PCR using specific primers, and the confirmed recombinant vector was sequenced. The recombinant vector was targeted for RNA synthesis by T7 RNA polymerase enzyme. RNA transfection was performed in the Huh7.5 cell line. Virus production in several passages and the evaluated viral load were studied using quantitative real-time PCR and ELISA methods. After 30 passages, the RNA virus was extracted and cloned in pcDNA3.1 vector, and was then sequenced. Results Quantitative real-time PCR results showed 11,292,514 copies/mL of chimeric virus production in cell culture. The virus production was confirmed using ELISA, which showed a virus core production of 808.2 pg/mL. The results of cloning and sequencing showed that some of the nucleic acids in the chimera virus were changed, affecting the viral behavior in the cell culture. Conclusions Real-time PCR and ELISA showed high levels of production of 1a/JFH1 chimeric HCV in the Huh7.5 cell culture. The constructed virus can be used for future studies, including the development of new HCV drugs and vaccines. PMID:27882063

  16. Response of Late Cretaceous migrating deltaic facies systems to sea level, tectonics, and sediment supply changes, New Jersey Coastal Plain, U.S.A.

    USGS Publications Warehouse

    Kulpecz, A.A.; Miller, K.G.; Sugarman, P.J.; Browning, J.V.

    2008-01-01

    Paleogeographic, isopach, and deltaic lithofacies mapping of thirteen depositional sequences establishes a 35 Myr high resolution (> 1 Myr) record of Late Cretaceous wave- and tide-influenced deltaic sedimentation. We integrate sequences defined on the basis of lithologic, biostratigraphic, and Sr-isotope stratigraphy from cores with geophysical log data from 28 wells to further develop and extend methods and calibrations of well-log recognition of sequences and facies variations. This study reveals the northeastward migration of depocenters from the Cenomanian (ca. 98 Ma) through the earliest Danian (ca. 64 Ma) and documents five primary phases of paleodeltaic evolution in response to long-term eustatic changes, variations in sediment supply, the location of two long-lived fluvial axes, and thermoflexural basement subsidence: (1) Cenomanian-early Turonian deltaic facies exhibit marine and nonmarine facies and are concentrated in the central coastal plain; (2) high sediment rates, low sea level, and high accommodation rates in the northern coastal plain resulted in thick, marginal to nonmarine mixed-influenced deltaic facies during the Turonian-Coniacian; (3) comparatively low sediment rates and high long-term sea level in the Santonian resulted in a sediment-starved margin with low deltaic influence; (4) well-developed Campanian deltaic sequences expand to the north and exhibit wave reworking and longshore transport of sands; and (5) low sedimentation rates and high long-term sea level during the Maastrichtian resulted in the deposition of a sediment-starved glauconitic shelf. Our study illustrates the widely known variability of mixed-influence deltaic systems, but also documents the relative stability of deltaic facies systems on the 10^6-10^7 yr scale, with long periods of cyclically repeating systems tracts controlled by eustasy. Results from the Late Cretaceous further show that although eustasy provides the template for sequences globally, regional tectonics (rates of subsidence and accommodation), changes in sediment supply, proximity to sediment input, and flexural subsidence from depocenter loading determine the regional to local preservation and facies expression of sequences. Copyright © 2008, SEPM (Society for Sedimentary Geology).

  17. Molecular identification of Nocardia species using the sodA gene: Identificación molecular de especies de Nocardia utilizando el gen sodA.

    PubMed

    Sánchez-Herrera, K; Sandoval, H; Mouniee, D; Ramírez-Durán, N; Bergeron, E; Boiron, P; Sánchez-Saucedo, N; Rodríguez-Nava, V

    2017-09-01

    Currently, for bacterial identification and classification, the rrs gene encoding 16S rRNA is used as a reference method for the analysis of strains of the genus Nocardia. However, it does not have enough polymorphism to differentiate them at the species level. This fact makes it necessary to search for molecular targets that can provide better identification. The sodA gene (encoding the enzyme superoxide dismutase) has had good results in identifying species of other Actinomycetes. In this study the sodA gene is proposed for the identification and differentiation at the species level of the genus Nocardia. We used 41 type species of various collections; a 386 bp fragment of the sodA gene was amplified and sequenced, and a phylogenetic analysis was performed comparing the genes rrs (1171 bp), hsp65 (401 bp), secA1 (494 bp), gyrB (1195 bp) and rpoB (401 bp). The sequences were aligned using the Clustal X program. Evolutionary trees according to the neighbour-joining method were created with the programs Phylo_win and MEGA 6. The specific variability of the sodA gene of the genus Nocardia was analysed. A high phylogenetic resolution, significant genetic variability, and specificity and reliability were observed for the differentiation of the isolates at the species level. The polymorphism observed in the sodA gene sequence contains variable regions that allow the discrimination of closely related Nocardia species. The clear specificity, despite its small size, proves to be of great advantage for use in taxonomic studies and clinical diagnosis of the genus Nocardia.

  18. Implications of hydrologic variability on the succession of plants in Great Lakes wetlands

    USGS Publications Warehouse

    Wilcox, Douglas A.

    2004-01-01

    Primary succession of plant communities directed toward a climax is not a typical occurrence in wetlands because these ecological systems are inherently dependent on hydrology, and temporal hydrologic variability often causes reversals or setbacks in succession. Wetlands of the Great Lakes provide good examples for demonstrating the implications of hydrology in driving successional processes and for illustrating potential misinterpretations of apparent successional sequences. Most Great Lakes coastal wetlands follow cyclic patterns in which emergent communities are reduced in area or eliminated by high lake levels and then regenerated from the seed bank during low lake levels. Thus, succession never proceeds for long. Wetlands also develop in ridge and swale terrains in many large embayments of the Great Lakes. These formations contain sequences of wetlands of similar origin but different age that can be several thousand years old, with older wetlands always further from the lake. Analyses of plant communities across a sequence of wetlands at the south end of Lake Michigan showed an apparent successional pattern from submersed to floating to emergent plants as water depth decreased with wetland age. However, paleoecological analyses showed that the observed vegetation changes were driven largely by disturbances associated with increased human settlement in the area. Climate-induced hydrologic changes were also shown to have greater effects on plant-community change than autogenic processes. Other terms, such as zonation, maturation, fluctuations, continuum concept, functional guilds, centrifugal organization, pulse stability, and hump-back models provide additional means of describing organization and changes in vegetation; some of them overlap with succession in describing vegetation processes in Great Lakes wetlands, but each must be used in the proper context with regard to short- and long-term hydrologic variability.

  19. A Bayesian Framework for Generalized Linear Mixed Modeling Identifies New Candidate Loci for Late-Onset Alzheimer’s Disease

    PubMed Central

    Wang, Xulong; Philip, Vivek M.; Ananda, Guruprasad; White, Charles C.; Malhotra, Ankit; Michalski, Paul J.; Karuturi, Krishna R. Murthy; Chintalapudi, Sumana R.; Acklin, Casey; Sasner, Michael; Bennett, David A.; De Jager, Philip L.; Howell, Gareth R.; Carter, Gregory W.

    2018-01-01

    Recent technical and methodological advances have greatly enhanced genome-wide association studies (GWAS). The advent of low-cost, whole-genome sequencing facilitates high-resolution variant identification, and the development of linear mixed models (LMM) allows improved identification of putatively causal variants. While essential for correcting false positive associations due to sample relatedness and population stratification, LMMs have commonly been restricted to quantitative variables. However, phenotypic traits in association studies are often categorical, coded as binary case-control or ordered variables describing disease stages. To address these issues, we have devised a method for genomic association studies that implements a generalized LMM (GLMM) in a Bayesian framework, called Bayes-GLMM. Bayes-GLMM has four major features: (1) support of categorical, binary, and quantitative variables; (2) cohesive integration of previous GWAS results for related traits; (3) correction for sample relatedness by mixed modeling; and (4) model estimation by both Markov chain Monte Carlo sampling and maximal likelihood estimation. We applied Bayes-GLMM to the whole-genome sequencing cohort of the Alzheimer’s Disease Sequencing Project. This study contains 570 individuals from 111 families, each with Alzheimer’s disease diagnosed at one of four confidence levels. Using Bayes-GLMM we identified four variants in three loci significantly associated with Alzheimer’s disease. Two variants, rs140233081 and rs149372995, lie between PRKAR1B and PDGFA. The coded proteins are localized to the glial-vascular unit, and PDGFA transcript levels are associated with Alzheimer’s disease-related neuropathology. In summary, this work provides implementation of a flexible, generalized mixed-model approach in a Bayesian framework for association studies. PMID:29507048

  20. Identification of Variable-Number Tandem-Repeat (VNTR) Sequences in Acinetobacter baumannii and Interlaboratory Validation of an Optimized Multiple-Locus VNTR Analysis Typing Scheme

    PubMed Central

    Pourcel, Christine; Minandri, Fabrizia; Hauck, Yolande; D'Arezzo, Silvia; Imperi, Francesco; Vergnaud, Gilles; Visca, Paolo

    2011-01-01

    Acinetobacter baumannii is an important opportunistic pathogen responsible for nosocomial outbreaks, mostly occurring in intensive care units. Due to the multiplicity of infection sources, reliable molecular fingerprinting techniques are needed to establish epidemiological correlations among A. baumannii isolates. Multiple-locus variable-number tandem-repeat analysis (MLVA) has proven to be a fast, reliable, and cost-effective typing method for several bacterial species. In this study, an MLVA assay compatible with simple PCR- and agarose gel-based electrophoresis steps as well as with high-throughput automated methods was developed for A. baumannii typing. Preliminarily, 10 potential polymorphic variable-number tandem repeats (VNTRs) were identified upon bioinformatic screening of six annotated genome sequences of A. baumannii. A collection of 7 reference strains plus 18 well-characterized isolates, including unique types and representatives of the three international A. baumannii lineages, was then evaluated in a two-center study aimed at validating the MLVA assay and comparing it with other genotyping assays, namely, macrorestriction analysis with pulsed-field gel electrophoresis (PFGE) and PCR-based sequence group (SG) profiling. The results showed that MLVA can discriminate between isolates with identical PFGE types and SG profiles. A panel of eight VNTR markers was selected, all showing the ability to be amplified and good amounts of polymorphism in the majority of strains. Independently generated MLVA profiles, composed of an ordered string of allele numbers corresponding to the number of repeats at each VNTR locus, were concordant between centers. Typeability, reproducibility, stability, discriminatory power, and epidemiological concordance were excellent. A database containing information and MLVA profiles for several A. baumannii strains is available from http://mlva.u-psud.fr/. PMID:21147956
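
    The abstract describes an MLVA profile as an ordered string of allele numbers, one per VNTR locus, each giving the number of repeats. A minimal Python sketch of that data structure follows, assuming repeat counts are inferred from amplicon size as (amplicon length minus flanking length) divided by repeat unit length. The locus names, flank sizes, unit sizes and amplicon sizes are invented for illustration and are not taken from the published scheme.

```python
def repeat_count(amplicon_bp, flank_bp, unit_bp):
    """Number of tandem repeats implied by an amplicon size, rounded."""
    return round((amplicon_bp - flank_bp) / unit_bp)


def mlva_profile(amplicons, loci):
    """Ordered tuple of allele numbers, one per locus, in the panel's order."""
    return tuple(repeat_count(amplicons[name], flank, unit)
                 for name, flank, unit in loci)


def loci_differing(profile_a, profile_b):
    """Count loci at which two profiles carry different allele numbers."""
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical panel: (locus name, flanking bp, repeat unit bp).
LOCI = [("VNTR_A", 120, 9), ("VNTR_B", 200, 30), ("VNTR_C", 95, 6)]

isolate_1 = mlva_profile({"VNTR_A": 165, "VNTR_B": 320, "VNTR_C": 131}, LOCI)
isolate_2 = mlva_profile({"VNTR_A": 174, "VNTR_B": 320, "VNTR_C": 131}, LOCI)
distance = loci_differing(isolate_1, isolate_2)
```

    Keeping the loci in a fixed order is what makes profiles generated in different laboratories directly comparable, as in the two-center validation described above.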

  1. High Fidelity and Multiscale Algorithms for Collisional-radiative and Nonequilibrium Plasmas (Briefing Charts)

    DTIC Science & Technology

    2014-07-01

    of models for variable conditions: – Use implicit models to eliminate constraint of sequence of fast time scales: c, ve, – Price to pay: lack...collisions: – Elastic – Braginskii terms – Inelastic – warning! Rates depend on both T and relative velocity – Multi-fluid CR model from...merge/split for particle management, efficient sampling, inelastic collisions … – Level grouping schemes of electronic states, for dynamical coarse

  2. Characterization of genetic variability of Venezuelan equine encephalitis viruses

    DOE PAGES

    Gardner, Shea N.; McLoughlin, Kevin; Be, Nicholas A.; ...

    2016-04-07

    Venezuelan equine encephalitis virus (VEEV) is a mosquito-borne alphavirus that has caused large outbreaks of severe illness in both horses and humans. New approaches are needed to rapidly infer the origin of a newly discovered VEEV strain, estimate its equine amplification and resultant epidemic potential, and predict human virulence phenotype. We performed whole genome single nucleotide polymorphism (SNP) analysis of all available VEE antigenic complex genomes, verified that a SNP-based phylogeny accurately captured the features of a phylogenetic tree based on multiple sequence alignment, and developed a high resolution genome-wide SNP microarray. We used the microarray to analyze a broad panel of VEEV isolates, found excellent concordance between array- and sequence-based SNP calls, genotyped unsequenced isolates, and placed them on a phylogeny with sequenced genomes. The microarray successfully genotyped VEEV directly from tissue samples of an infected mouse, bypassing the need for viral isolation, culture and genomic sequencing. Lastly, we identified genomic variants associated with serotypes and host species, revealing a complex relationship between genotype and phenotype.

  3. Characterization of genic microsatellite markers derived from expressed sequence tags in Pacific abalone ( Haliotis discus hannai)

    NASA Astrophysics Data System (ADS)

    Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong

    2010-01-01

    Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone ( Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2-17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transportability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.
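
    For readers unfamiliar with the heterozygosity figures quoted above, the following Python sketch shows one common way such values are computed from diploid genotypes at a single locus: observed heterozygosity as the fraction of heterozygous individuals, and expected heterozygosity as 1 minus the sum of squared allele frequencies. The abstract does not say which estimator the authors used (small-sample corrections are often applied), and the genotypes below are invented.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """He = 1 - sum(p_i^2), with p_i the frequency of allele i."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical genotypes at one EST-SSR locus (allele labels are arbitrary).
genotypes = [("A1", "A2"), ("A1", "A1"), ("A2", "A3"), ("A1", "A2")]
he = expected_heterozygosity(genotypes)
ho = observed_heterozygosity(genotypes)
```

    A locus where observed heterozygosity falls well below the expected value is one of the usual hints of null alleles, which is why both figures are reported together.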

  4. Molecular Properties of Poliovirus Isolates: Nucleotide Sequence Analysis, Typing by PCR and Real-Time RT-PCR.

    PubMed

    Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M

    2016-01-01

    Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, pan-poliovirus (panPV) PCR primer set is used to distinguish polioviruses from non-polio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus-specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures.
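
    How mixed-base and inosine positions let one primer cover a variable target can be sketched in Python as follows. The mixed-base table uses the standard IUPAC nucleotide codes; treating inosine ("I") as pairing with any base is a simplifying assumption, and the primer and site strings are made up for illustration rather than taken from the assays described.

```python
from itertools import product

# Standard IUPAC codes: each symbol stands for a mixture of bases.
MIXED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Assumption for this sketch: inosine is a single residue that tolerates any base.
ALLOWED = dict(MIXED, I="ACGT")


def degeneracy(primer):
    """Number of distinct oligos in the synthesized mixture (inosine counts once)."""
    n = 1
    for base in primer:
        n *= len(MIXED.get(base, "I"))
    return n


def expand(primer):
    """List every concrete sequence of a primer written with IUPAC mixed bases."""
    return ["".join(p) for p in product(*(MIXED[b] for b in primer))]


def matches(primer, site):
    """True if each site base is permitted at the corresponding primer position."""
    return len(primer) == len(site) and all(
        s in ALLOWED[b] for b, s in zip(primer, site))


variants = expand("ACRTY")             # hypothetical primer with two mixed bases
covered = matches("ACITY", "ACCTT")    # inosine at position 3 tolerates C
missed = matches("ACRTY", "ACCTT")     # R (A or G) does not cover C
```

    Mixed bases multiply the number of oligos in the reaction, whereas an inosine position keeps a single oligo, which is one reason the two are combined at highly variable codon positions.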

  5. Complete genome sequence of a novel Plum pox virus strain W isolate determined by 454 pyrosequencing.

    PubMed

    Sheveleva, Anna; Kudryavtseva, Anna; Speranskaya, Anna; Belenikin, Maxim; Melnikova, Natalia; Chirkov, Sergei

    2013-10-01

    The near-complete (99.7 %) genome sequence of a novel Russian Plum pox virus (PPV) isolate Pk, belonging to the strain Winona (W), has been determined by 454 pyrosequencing with the exception of the thirty-one 5'-terminal nucleotides. This region was amplified using 5'RACE kit and sequenced by the Sanger method. Genomic RNA released from immunocaptured PPV particles was employed for generation of cDNA library using TransPlex Whole transcriptome amplification kit (WTA2, Sigma-Aldrich). The entire Pk genome has identity level of 92.8-94.5 % when compared to the complete nucleotide sequences of other PPV-W isolates (W3174, LV-141pl, LV-145bt, and UKR 44189), confirming a high degree of variability within the PPV-W strain. The isolates Pk and LV-141pl are most closely related. The Pk has been found in a wild plum (Prunus domestica) in a new region of Russia indicating widespread dissemination of the PPV-W strain in the European part of the former USSR.

  6. DNA fingerprinting of Brassica juncea cultivars using microsatellite probes.

    PubMed

    Bhatia, S; Das, S; Jain, A; Lakshmikumaran, M

    1995-09-01

    The genetic variability in the Brassica juncea cultivars was detected by employing in-gel hybridization of restricted DNA to simple repetitive sequences such as (GATA)4, (GACA)4 and (CAC)5. The most informative probe/enzyme combination was (GATA)4/EcoRI, yielding highly polymorphic fingerprint patterns for the B. juncea cultivars. This technique was found to be dependable for establishing the variety specific patterns for most of the cultivars studied, a prerequisite for germplasm preservation. The results of the present study were compared with those reported in our earlier study in which random amplification of polymorphic DNA (RAPD) was used for assessing the genetic variability in the B. juncea cultivars.

  7. Rabies in the arctic fox population, Svalbard, Norway.

    PubMed

    Mørk, Torill; Bohlin, Jon; Fuglei, Eva; Åsbakk, Kjetil; Tryland, Morten

    2011-10-01

    Arctic foxes, 620 that were trapped and 22 found dead on Svalbard, Norway (1996-2004), as well as 10 foxes trapped in Nenets, North-West Russia (1999), were tested for rabies virus antigen in brain tissue by standard direct fluorescent antibody test. Rabies antigen was found in two foxes from Svalbard and in three from Russia. Blood samples from 515 of the fox carcasses were screened for rabies antibodies with negative result. Our results, together with a previous screening (1980-1989, n=817) indicate that the prevalence of rabies in Svalbard has remained low or that the virus has not been enzootic in the arctic fox population since the first reported outbreak in 1980. Brain tissues from four arctic foxes (one from Svalbard, three from Russia) in which rabies virus antigen was detected were further analyzed by reverse-transcriptase polymerase chain reaction direct amplicon sequencing and phylogenetic analysis. Sequences were compared to corresponding sequences from rabies virus isolates from other arctic regions. The Svalbard isolate and two of the Russian isolates were identical (310 nucleotides), whereas the third Russian isolate differed in six nucleotide positions. However, when translated into amino acid sequences, none of these substitutions produced changes in the amino acid sequence. These findings suggest that the spread of rabies virus to Svalbard was likely due to migration of arctic foxes over sea ice from Russia to Svalbard. Furthermore, when compared to other Arctic rabies virus isolates, a high degree of homology was found, suggesting a high contact rate between arctic fox populations from different arctic regions. The high degree of homology also indicates that other, and more variable, regions of the genome than this part of the nucleoprotein gene should be used to distinguish Arctic rabies virus isolates for epidemiologic purposes.

  8. Genomic resources and their influence on the detection of the signal of positive selection in genome scans.

    PubMed

    Manel, S; Perrier, C; Pratlong, M; Abi-Rached, L; Paganini, J; Pontarotti, P; Aurelle, D

    2016-01-01

    Genome scans represent powerful approaches to investigate the action of natural selection on the genetic variation of natural populations and to better understand local adaptation. This is very useful, for example, in the field of conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. Besides, genome sequences are also becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. Mainly, increasing data density and having the information of physical linkage data expand genome scans by (i) improving the overall quality of the data, (ii) helping the reconstruction of demographic history for the population studied to decrease false-positive rates and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing matching the potential candidate loci to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and function and (iii) ensuring that the highly variable regions of the genomes that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended. © 2015 John Wiley & Sons Ltd.

  9. New Technologies for the Identification of Novel Genetic Markers of Disorders of Sex Development (DSD)

    PubMed Central

    Bashamboo, A.; Ledig, S.; Wieacker, P.; Achermann, J.; McElreavey, K.

    2010-01-01

    Although understanding of the genetic basis of human sexual determination and differentiation has advanced considerably in recent years, the fact remains that in most subjects with disorders of sex development (DSD) the underlying genetic cause is unknown. Where pathogenic mutations have been identified, the phenotype can be highly variable, even within families, suggesting that other genetic variants are influencing the expression of the phenotype. This situation is likely to change, as more powerful and affordable tools become widely available for detailed genetic analyses. Here, we describe recent advances in comparative genomic hybridisation, sequencing by hybridisation and next generation sequencing, and we describe how these technologies will have an impact on our understanding of the genetic causes of DSD. PMID:20820110

  10. Geographic variation in marine turtle fibropapillomatosis

    USGS Publications Warehouse

    Greenblatt, R.J.; Work, Thierry M.; Dutton, P.; Sutton, C.A.; Spraker, T.R.; Casey, R.N.; Diez, C.E.; Parker, Dana C.; St. Ledger, J.; Balazs, G.H.; Casey, J.W.

    2005-01-01

    We document three examples of fibropapillomatosis by histology, quantitative polymerase chain reaction (qPCR), and sequence analysis from three different geographic areas. Tumors compatible in morphology with fibropapillomatosis were seen in green turtles from Puerto Rico and San Diego (California) and in a hybrid loggerhead/hawksbill turtle from Florida Bay (Florida). Tumors were confirmed as fibropapillomas on histology, although severity of disease varied between cases. Polymerase chain reaction (PCR) analyses revealed infection with the fibropapilloma-associated turtle herpesvirus (FPTHV) in all cases, albeit at highly variable copy numbers per cell. Alignment of a portion of the polymerase gene from each fibropapilloma-associated turtle herpesvirus isolate demonstrated geographic variation in sequence. These cases illustrate geographic variation in both the pathology and the virology of fibropapillomatosis.

  11. Geographic variation in marine turtle fibropapillomatosis.

    PubMed

    Greenblatt, Rebecca J; Work, Thierry M; Dutton, Peter; Sutton, Claudia A; Spraker, Terry R; Casey, Rufina N; Diez, Carlos E; Parker, Denise; St Leger, Judy; Balazs, George H; Casey, James W

    2005-09-01

    We document three examples of fibropapillomatosis by histology, quantitative polymerase chain reaction (qPCR), and sequence analysis from three different geographic areas. Tumors compatible in morphology with fibropapillomatosis were seen in green turtles from Puerto Rico and San Diego (California) and in a hybrid loggerhead/hawksbill turtle from Florida Bay (Florida). Tumors were confirmed as fibropapillomas on histology, although severity of disease varied between cases. Polymerase chain reaction (PCR) analyses revealed infection with the fibropapilloma-associated turtle herpesvirus (FPTHV) in all cases, albeit at highly variable copy numbers per cell. Alignment of a portion of the polymerase gene from each fibropapilloma-associated turtle herpesvirus isolate demonstrated geographic variation in sequence. These cases illustrate geographic variation in both the pathology and the virology of fibropapillomatosis.

  12. Trends of amino acid usage in the proteins from the unicellular parasite Giardia lamblia.

    PubMed

    Garat, B; Musto, H

    2000-12-29

    Correspondence analysis of amino acid frequencies was applied to 75 complete coding sequences from the unicellular parasite Giardia lamblia, and it was found that three major factors influence the variability of the amino acid composition of proteins. The first trend strongly correlated with (a) the cysteine content and (b) the mean weight of the amino acids used in each protein. The second trend correlated with the global levels of hydropathy and aromaticity of each protein. Both axes might be related to the defense of the parasite against oxygen free radicals. Finally, the third trend correlated with the expressivity of each gene, indicating that in G. lamblia highly expressed sequences display a tendency to preferentially use a subset of the total amino acids.

  13. Whole-genome sequencing identifies recurrent mutations in chronic lymphocytic leukaemia

    PubMed Central

    Puente, Xose S.; Pinyol, Magda; Quesada, Víctor; Conde, Laura; Ordóñez, Gonzalo R.; Villamor, Neus; Escaramis, Georgia; Jares, Pedro; Beà, Sílvia; González-Díaz, Marcos; Bassaganyas, Laia; Baumann, Tycho; Juan, Manel; López-Guerra, Mónica; Colomer, Dolors; Tubío, José M. C.; López, Cristina; Navarro, Alba; Tornador, Cristian; Aymerich, Marta; Rozman, María; Hernández, Jesús M.; Puente, Diana A.; Freije, José M. P.; Velasco, Gloria; Gutiérrez-Fernández, Ana; Costa, Dolors; Carrió, Anna; Guijarro, Sara; Enjuanes, Anna; Hernández, Lluís; Yagüe, Jordi; Nicolás, Pilar; Romeo-Casabona, Carlos M.; Himmelbauer, Heinz; Castillo, Ester; Dohm, Juliane C.; de Sanjosé, Silvia; Piris, Miguel A.; de Alava, Enrique; Miguel, Jesús San; Royo, Romina; Gelpí, Josep L.; Torrents, David; Orozco, Modesto; Pisano, David G.; Valencia, Alfonso; Guigó, Roderic; Bayés, Mónica; Heath, Simon; Gut, Marta; Klatt, Peter; Marshall, John; Raine, Keiran; Stebbings, Lucy A.; Futreal, P. Andrew; Stratton, Michael R.; Campbell, Peter J.; Gut, Ivo; López-Guillermo, Armando; Estivill, Xavier; Montserrat, Emili; López-Otín, Carlos; Campo, Elías

    2012-01-01

    Chronic lymphocytic leukaemia (CLL), the most frequent leukaemia in adults in Western countries, is a heterogeneous disease with variable clinical presentation and evolution [1,2]. Two major molecular subtypes can be distinguished, characterized respectively by a high or low number of somatic hypermutations in the variable region of immunoglobulin genes [3,4]. The molecular changes leading to the pathogenesis of the disease are still poorly understood. Here we performed whole-genome sequencing of four cases of CLL and identified 46 somatic mutations that potentially affect gene function. Further analysis of these mutations in 363 patients with CLL identified four genes that are recurrently mutated: notch 1 (NOTCH1), exportin 1 (XPO1), myeloid differentiation primary response gene 88 (MYD88) and kelch-like 6 (KLHL6). Mutations in MYD88 and KLHL6 are predominant in cases of CLL with mutated immunoglobulin genes, whereas NOTCH1 and XPO1 mutations are mainly detected in patients with unmutated immunoglobulins. The patterns of somatic mutation, supported by functional and clinical analyses, strongly indicate that the recurrent NOTCH1, MYD88 and XPO1 mutations are oncogenic changes that contribute to the clinical evolution of the disease. To our knowledge, this is the first comprehensive analysis of CLL combining whole-genome sequencing with clinical characteristics and clinical outcomes. It highlights the usefulness of this approach for the identification of clinically relevant mutations in cancer. PMID:21642962

  14. Community Structures of Fecal Bacteria in Cattle from Different Animal Feeding Operations

    PubMed Central

    Shanks, Orin C.; Kelty, Catherine A.; Archibeque, Shawn; Jenkins, Michael; Newton, Ryan J.; McLellan, Sandra L.; Huse, Susan M.; Sogin, Mitchell L.

    2011-01-01

    The fecal microbiome of cattle plays a critical role not only in animal health and productivity but also in food safety, pathogen shedding, and the performance of fecal pollution detection methods. Unfortunately, most published molecular surveys fail to provide adequate detail about variability in the community structures of fecal bacteria within and across cattle populations. Using massively parallel pyrosequencing of a hypervariable region of the rRNA coding region, we profiled the fecal microbial communities of cattle from six different feeding operations where cattle were subjected to consistent management practices for a minimum of 90 days. We obtained a total of 633,877 high-quality sequences from the fecal samples of 30 adult beef cattle (5 individuals per operation). Sequence-based clustering and taxonomic analyses indicate less variability within a population than between populations. Overall, bacterial community composition correlated significantly with fecal starch concentrations, largely reflected in changes in the Bacteroidetes, Proteobacteria, and Firmicutes populations. In addition, network analysis demonstrated that annotated sequences clustered by management practice and fecal starch concentration, suggesting that the structures of bovine fecal bacterial communities can be dramatically different in different animal feeding operations, even at the phylum and family taxonomic levels, and that the feeding operation is a more important determinant of the cattle microbiome than is the geographic location of the feedlot. PMID:21378055

  15. Variability of Carotenoid Biosynthesis in Orange Colored Capsicum spp.

    PubMed Central

    Guzman, Ivette; Hamby, Shane; Romero, Joslynn; Bosland, Paul W.; O’Connell, Mary A.

    2010-01-01

    Pepper, Capsicum spp., is a worldwide crop valued for heat, nutrition, and rich pigment content. Carotenoids, the largest group of plant pigments, function as antioxidants and as vitamin A precursors. The most abundant carotenoids in ripe pepper fruits are β-carotene, capsanthin, and capsorubin. In this study, the carotenoid composition of orange fruited Capsicum lines was defined along with the allelic variability of the biosynthetic enzymes. The carotenoid chemical profiles present in seven orange pepper varieties were determined using a novel UPLC method. The orange appearance of the fruit was due either to the accumulation of β-carotene or, in two cases, only to the accumulation of red and yellow carotenoids. Four carotenoid biosynthetic genes, Psy, Lcyb, CrtZ-2, and Ccs, were cloned and sequenced from these cultivars. These data tested the hypothesis that different alleles for specific carotenoid biosynthetic enzymes are associated with specific carotenoid profiles in orange peppers. While the coding regions within Psy and CrtZ-2 did not change in any of the lines, the genomic sequence contained introns not previously reported. Lcyb and Ccs contained no introns but did exhibit polymorphisms resulting in amino acid changes; a new Ccs variant was found. When selectively breeding for high provitamin A levels, phenotypic recurrent selection based on fruit color is not sufficient; analysis of carotenoid chemical composition should also be conducted. Based on these results, specific alleles are candidate molecular markers for selection of orange pepper lines with high β-carotene and therefore high provitamin A levels. PMID:20582146

  16. Oligonucleotide indexing of DNA barcodes: identification of tuna and other scombrid species in food products.

    PubMed

    Botti, Sara; Giuffra, Elisabetta

    2010-08-23

    DNA barcodes are a global standard for species identification and have countless applications in the medical, forensic and alimentary fields, but few barcoding methods work efficiently in samples in which DNA is degraded, e.g. foods and archival specimens. This limits the choice of target regions harbouring a sufficient number of diagnostic polymorphisms. The method described here uses existing PCR and sequencing methodologies to detect mitochondrial DNA polymorphisms in complex matrices such as foods. The reported application allowed the discrimination among 17 fish species of the Scombridae family with high commercial interest, such as mackerels, bonitos and tunas, which are often present in processed seafood. The approach can be easily upgraded with the release of new genetic diversity information to increase the range of detected species. Cocktails of primers are designed for PCR using publicly available sequences of the target sequence. They are composed of a fixed 5' region and of variable 3' cocktail portions that allow amplification of any member of a group of species of interest. The population of short amplicons is directly sequenced and indexed using primers containing a longer 5' region and the non-polymorphic portion of the cocktail portion. A 226 bp region of CytB was selected as target after collection and screening of 148 online sequences; 85 SNPs were found, of which 75 were present in at least two sequences. Primers were also designed for two shorter sub-fragments that could be amplified from highly degraded samples. The test was used on 103 samples of seafood (canned tuna and scomber, tuna salad, tuna sauce) and could successfully detect the presence of different or additional species that were not identified on the labelling of canned tuna, tuna salad and sauce samples. The described method is largely independent of the degree of degradation of the DNA source and can thus be applied to processed seafood. Moreover, the method is highly flexible: publicly available sequence information on mitochondrial genomes is rapidly increasing for most species, facilitating the choice of target sequences and the improvement of resolution of the test. This is particularly important for discrimination of marine and aquaculture species for which genome information is still limited.
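
    The SNP screening step in this abstract (85 SNPs in a 226 bp CytB region, 75 of them present in at least two sequences) can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `snp_columns`, the parameter `min_minor_count`, and the reading of "present in at least two sequences" as "the second most common base occurs at least twice" are assumptions made for illustration, and the toy alignment is invented.

```python
from collections import Counter


def snp_columns(alignment, min_minor_count=1):
    """Return 0-based indices of polymorphic columns in an alignment.

    alignment: list of equal-length, pre-aligned DNA strings.
    min_minor_count: minimum number of sequences that must carry the
        second most common base (hypothetical threshold; 2 mimics
        "present in at least two sequences").
    Gaps and ambiguity codes are ignored when counting bases.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    hits = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] in "ACGT")
        if len(counts) < 2:
            continue  # monomorphic (or uninformative) column
        # Count of the second most common base in this column.
        minor = sorted(counts.values(), reverse=True)[1]
        if minor >= min_minor_count:
            hits.append(i)
    return hits


# Toy alignment, invented for illustration only.
toy = ["ACGTA", "ACGTA", "ACTTA", "GCTTA"]
all_snps = snp_columns(toy)
shared_snps = snp_columns(toy, min_minor_count=2)
```

    In this toy case column 0 is polymorphic through a single sequence, whereas column 2 carries a variant shared by two sequences, so only the latter passes the stricter filter.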

  17. Characterization of bovine MHC DRB3 diversity in Latin American Creole cattle breeds.

    PubMed

    Giovambattista, Guillermo; Takeshima, Shin-nosuke; Ripoli, Maria Veronica; Matsumoto, Yuki; Franco, Luz Angela Alvarez; Saito, Hideki; Onuma, Misao; Aida, Yoko

    2013-04-25

    In cattle, bovine leukocyte antigens (BoLAs) have been extensively used as markers for diseases and immunological traits. However, none of the highly adapted Latin American Creole breeds have been characterized for BoLA gene polymorphism by high resolution typing methods. In this work, we sequenced exon 2 of the BoLA class II DRB3 gene from 179 cattle (113 Bolivian Yacumeño cattle and 66 Colombian Hartón del Valle cattle breeds) using a polymerase chain reaction sequence-based typing (PCR-SBT) method. We identified 36 previously reported alleles and three novel alleles. Thirty-five (32 reported and three new) and 24 alleles (22 reported and two new) were detected in Yacumeño and Hartón del Valle breeds, respectively. Interestingly, Latin American Creole cattle showed a high degree of gene diversity despite their small population sizes, and 10 alleles including three new alleles were found only in these two Creole breeds. We next compared the degree of genetic variability at the population and sequence levels and the genetic distance in the two breeds with those previously reported in five other breeds: Holstein, Japanese Shorthorn, Japanese Black, Jersey, and Hanwoo. Both Creole breeds presented gene diversity higher than 0.90, a nucleotide diversity higher than 0.07, and mean number of pairwise differences higher than 19, indicating that Creole cattle had similar genetic diversity at BoLA-DRB3 to the other breeds. A neutrality test showed that the high degree of genetic variability may be maintained by balancing selection. The FST index and the exact G test showed significant differences across all cattle populations (FST=0.0478; p<0.001). Results from the principal components analysis and the phylogenetic tree showed that Yacumeño and Hartón del Valle breeds were closely related to each other. Collectively, our results suggest that the high level of genetic diversity could be explained by the multiple origins of the Creole germplasm (European, African and Indicus), and this diversity might be maintained by balancing selection. Copyright © 2013 Elsevier B.V. All rights reserved.
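
    The three diversity statistics quoted in this abstract (gene diversity, nucleotide diversity, mean number of pairwise differences) can be sketched as follows. The abstract does not say which estimators or software were used, so the formulas below are assumed textbook definitions: Nei's unbiased gene diversity, and nucleotide diversity taken as the mean pairwise difference divided by the aligned length. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_differences(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all unordered pairs."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (assumed definition of pi)."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def gene_diversity(alleles):
    """Nei's unbiased gene diversity: n/(n-1) * (1 - sum of p_i squared)."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))


# Toy data, invented for illustration only.
toy_seqs = ["ACGT", "ACGA", "TCGA"]
k = mean_pairwise_differences(toy_seqs)   # (1 + 2 + 1) / 3
pi = nucleotide_diversity(toy_seqs)       # k / 4 sites
h = gene_diversity(["a", "a", "b", "c"])  # 4/3 * (1 - 0.375)
```

    Under these definitions the reported values are mutually consistent in scale: a mean of about 19 pairwise differences with a nucleotide diversity of about 0.07 would correspond to an aligned region of roughly 270 sites, which is an inference from the assumed formulas rather than a figure stated in the abstract.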

  18. Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture.

    PubMed

    Takeda, Shuntaro; Furusawa, Akira

    2017-09-22

    We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.

  19. Universal Quantum Computing with Measurement-Induced Continuous-Variable Gate Sequence in a Loop-Based Architecture

    NASA Astrophysics Data System (ADS)

    Takeda, Shuntaro; Furusawa, Akira

    2017-09-01

    We propose a scalable scheme for optical quantum computing using measurement-induced continuous-variable quantum gates in a loop-based architecture. Here, time-bin-encoded quantum information in a single spatial mode is deterministically processed in a nested loop by an electrically programmable gate sequence. This architecture can process any input state and an arbitrary number of modes with almost minimum resources, and offers a universal gate set for both qubits and continuous variables. Furthermore, quantum computing can be performed fault tolerantly by a known scheme for encoding a qubit in an infinite-dimensional Hilbert space of a single light mode.

  20. Setting up a probe based, closed tube real-time PCR assay for focused detection of variable sequence alterations.

    PubMed

    Becságh, Péter; Szakács, Orsolya

    2014-10-01

    In a diagnostic workflow for detecting sequence alterations, it is sometimes important to design an algorithm that combines screening and direct tests. Normally the use of direct tests, mainly sequencing, is limited. There is an increased need for effective screening tests that keep the tube closed during the whole process and therefore decrease the risk of PCR product contamination. The aim of this study was to design such a closed tube, detection probe based screening assay to detect different kinds of sequence alterations in exon 11 of the human c-kit gene region. Within this region, a variety of deletions and single nucleotide changes are possible. During assay setup, several probe chemistry formats were screened and tested. After some optimization steps the TaqMan probe format was selected.
